This is a discussion on Search Algorithm within the Website Design Forum forums, part of the Website Design & Development category.
| | #1 (permalink) |
| Moderator | Hi, I am in the middle of developing a search engine. I have the spider and interface designed and built, but would like your input on the search algorithm. What would be the best way to handle this? Currently, when a search is performed, the search term is matched against the keywords stored in the database, and every page whose keywords contain the term is output in the order it appears in the database. Obviously this will not give very accurate results. How would I search the database for the term (which it does now), then search those results for the term in the description, then in the title, and then find how many backlinks point to the site in question? I would then like to output the results ranked so that the first result is the one that scores highest across all of these variables. The site is currently coded in PHP, if this helps you with your reply. Thanks for any help! Craig |
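One way to read the ranking described above is as a weighted score per page. The following is only a minimal sketch in Python (not the site's PHP), assuming each page record carries keywords, description, title and a backlink count. The `Page` fields, the weights and the logarithmic damping of backlinks are all hypothetical choices for illustration, not anything taken from Craig's database.

```python
import math
import re
from dataclasses import dataclass


@dataclass
class Page:
    # Hypothetical record layout; the real table columns are not known.
    url: str
    title: str
    description: str
    keywords: str
    backlinks: int


# Assumed weights: a hit in the title counts more than one in the keywords.
FIELD_WEIGHTS = {"keywords": 1.0, "description": 2.0, "title": 3.0}
BACKLINK_WEIGHT = 0.5


def count_term(text, term):
    """Count whole-word, case-insensitive occurrences of term in text."""
    return re.findall(r"\w+", text.lower()).count(term.lower())


def score(page, term):
    """Weighted term counts plus a damped backlink bonus."""
    text_score = sum(
        weight * count_term(getattr(page, field), term)
        for field, weight in FIELD_WEIGHTS.items()
    )
    # log1p keeps a page with thousands of backlinks from swamping the text match.
    return text_score + BACKLINK_WEIGHT * math.log1p(page.backlinks)


def search(pages, term):
    """Keep pages whose keywords contain the term, best score first."""
    candidates = [p for p in pages if count_term(p.keywords, term) > 0]
    return sorted(candidates, key=lambda p: score(p, term), reverse=True)
```

Calling `search(pages, "php")` first filters on the keywords, as the current engine does, and then orders the survivors by the combined score. In a database-backed version the same idea would normally be pushed into the query rather than done in application code.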
| | |
| | #2 (permalink) |
| Cool Newbie | Hi Craig! I've dreamed of making my own search engine one day but never had time to start work on it :P First of all, I don't think PHP would get you very far. If you really wish to make a good search engine that becomes popular, you need something like C or C++; PHP is good but slow... As for the algo, I've thought a lot about this and believe that currently the only way to make a successful search engine is to provide something different from the existing algos, which are PR based or something like that. I don't have a particular solution in mind for what should be taken into account for relevancy, but I always thought of including some human rating system (maybe a hidden one) and probably some model that would rank results based on the data from my directory, i.e. Pharos-search.com. Sorry I wasn't a big help here, but I look forward to seeing what kind of ideas other people have. I would love to see this topic running.
__________________ Denis Pharos Search - A Human Edited Directory [Forum URL is removed according to new rules] |
| | |
| | #3 (permalink) |
| Moderator | I understand that PHP is slow, but this is just a side project (a test, if you will) to see if I can do it. I would agree that C and C++ are good languages for a search engine, but I believe Perl or Java to be the best! The problem with creating a full-scale search engine is the setup cost, as you would need hardware in the region of: 1. A dedicated database server with around 1.4 terabytes of hard-disk space, a 3 GHz Pentium 4 processor and around 4 GB of RAM. 2. Around 4 query/application servers with 2 GB of RAM each. The above would probably be sufficient for an index of around 250 million pages, but the cost involved is way out of reach of most people. Building a successful search engine is also a dream of mine, and I have lost count of the number of business plans I have written that weren't quite 'there' with regard to going ahead with them. Kind Regards, Craig |
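As a rough sanity check on the figures above, the disk budget per indexed page and the total RAM can be worked out directly. This is a small sketch that assumes decimal units (1 terabyte = 10^12 bytes) and takes the estimates from the post at face value; the function names are made up for illustration.

```python
TERABYTE = 10**12  # assumption: decimal terabytes, not binary TiB


def bytes_per_page(disk_bytes, pages):
    """Average disk budget available for each indexed page."""
    return disk_bytes // pages


def total_ram_gb(db_ram_gb, app_servers, app_ram_gb):
    """RAM across the database server and all query/application servers."""
    return db_ram_gb + app_servers * app_ram_gb


# Figures quoted in the post: 1.4 TB of disk, 250 million pages,
# one 4 GB database server and four 2 GB query/application servers.
disk = 14 * TERABYTE // 10
per_page = bytes_per_page(disk, 250_000_000)
ram = total_ram_gb(4, 4, 2)
print(per_page, "bytes per page;", ram, "GB RAM in total")
```

Under those assumptions the estimate allows 5600 bytes of storage per page, so it only works if pages are stored in a compact indexed form rather than as full copies.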
| | |
| | #4 (permalink) |
| Cool Newbie | Using Perl for part of it would be good, as it has wide possibilities for dealing with text, but do you think it's a good idea to write the whole engine in it? It's slow as well (I think), and so is Java (from what I've heard, as I've never used it myself). Anyway, I wish you luck with your project and would be curious to see how it performs. |
| | |
| | #5 (permalink) |
| Senior Member | Getting a headache just thinking about this. I guess the size of this kind of project is just too mind-boggling. Not only are the hardware and software a major expense, but the people needed to supervise the data that the spiders collect would add up too. I usually go to bed when I start to think about how amazing what Google does really is. They log every phone number, every email address, every snail mail address, every word and misspelled word, every name, and how far each is from each other and in what order. Just html and web and 1 show the size of this project. You might want to consider something like having twenty children first. |
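Recording how far words are from each other, and in what order, is usually done with a positional index: for every word, the positions at which it occurs in each document. The sketch below is a toy in-memory version in Python, assuming simple word-character tokenisation. It is not a description of how Google actually stores its data, and the function names are hypothetical.

```python
import re
from collections import defaultdict


def build_positional_index(docs):
    """Map each word to {doc_id: [positions]} for a dict of doc_id -> text."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, word in enumerate(re.findall(r"\w+", text.lower())):
            index[word][doc_id].append(pos)
    return index


def positions(index, doc_id, word):
    """Positions of word in one document (empty list if absent)."""
    return index.get(word, {}).get(doc_id, [])


def min_distance(index, doc_id, first, second):
    """Smallest gap in words between two terms, or None if either is missing."""
    a = positions(index, doc_id, first)
    b = positions(index, doc_id, second)
    if not a or not b:
        return None
    return min(abs(i - j) for i in a for j in b)


def appears_before(index, doc_id, first, second):
    """True if some occurrence of first comes before some occurrence of second."""
    a = positions(index, doc_id, first)
    b = positions(index, doc_id, second)
    return bool(a and b) and min(a) < max(b)
```

With such an index a query like "search algorithm" can favour pages where the two words sit next to each other in that order, at the cost of storing one position per word occurrence, which is part of why the storage bill grows so quickly.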
| | |
| | #6 (permalink) |
| Moderator | All that RAM makes my head hurt. I only have 192 MB. |
| | |
| Tags |
| algorithm, search |