UK Webmaster Talk - Online Marketing - SEO


 

Search Algorithm

This is a discussion on Search Algorithm within the Website Design Forum forums, part of the Website Design & Development category.


23-03-2005, 17:56   #1
Wistow (Moderator)
Join Date: Mar 2005
Location: Barnsley, UK
Posts: 1,001
Search Algorithm

Hi

I am in the middle of developing a search engine. I have the spider and interface designed and built, but would like your input on the search algorithm.

What would be the best way to handle this? Currently, when a search is performed, the search term is matched against the keywords stored for each site in the database, and every site whose keywords contain the term is output in the order it appears in the database.

Obviously this will not give very accurate results. How would I search the keywords for the term (which it does now), then search those results for the term in the description, then in the title, and then find how many backlinks point to each site? The results would then be output in order, so that the first result is the one that scores highest across all of these variables.
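
One way to read the ranking described above is as a weighted score per site. The following is only a minimal sketch under stated assumptions, written in Python rather than PHP purely for illustration. The field names (`keywords`, `description`, `title`, `backlinks`), the weights, the space-separated keyword format and the logarithmic damping of backlinks are all hypothetical and not taken from the actual site.

```python
import math

# Hypothetical weights: a hit in the title counts more than one in the keywords.
WEIGHTS = {"keywords": 1.0, "description": 2.0, "title": 3.0}
BACKLINK_WEIGHT = 0.5


def score(site, term):
    """Score one site: weighted term counts per field plus damped backlinks."""
    term = term.lower()
    total = 0.0
    for field, weight in WEIGHTS.items():
        # Count whole-word occurrences of the term in this field.
        words = site.get(field, "").lower().split()
        total += weight * words.count(term)
    # log1p keeps sites with huge backlink counts from swamping text matches.
    total += BACKLINK_WEIGHT * math.log1p(site.get("backlinks", 0))
    return total


def search(sites, term):
    """Keep sites whose keywords contain the term, highest score first."""
    wanted = term.lower()
    matches = [s for s in sites if wanted in s.get("keywords", "").lower().split()]
    return sorted(matches, key=lambda s: score(s, term), reverse=True)


# Made-up rows standing in for the database table.
sites = [
    {"url": "a.example", "keywords": "php search", "description": "A PHP site",
     "title": "PHP tips", "backlinks": 3},
    {"url": "b.example", "keywords": "search engine",
     "description": "A search engine about search", "title": "Search", "backlinks": 10},
    {"url": "c.example", "keywords": "cooking", "description": "Recipes",
     "title": "Food", "backlinks": 50},
]

for site in search(sites, "search"):
    print(site["url"], round(score(site, "search"), 2))
```

With this sample data, `b.example` is listed before `a.example`, and `c.example` is left out because its keywords do not contain the term, however many backlinks it has. In a real setup the filtering would more likely be done in the database query, with only the scoring done in application code.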

The site is currently coded in PHP, if this helps you with your reply.

Thanks for any help!

Craig
23-03-2005, 19:13   #2
sensovision (Cool Newbie)
Join Date: Mar 2005
Location: Ukraine
Posts: 26

Hi Craig! I've dreamed of making my own search engine one day, but never had time to start work on it :P
First of all, I don't think PHP will get you very far. If you really want to make a good search engine that becomes popular, you need something like C or C++. PHP is good but slow...
As for the algorithm, I've thought a lot about this, and I believe that currently the only way to make a successful search engine is to offer something different from the existing algorithms, which are PR-based or similar. I don't have a particular solution in mind for what should be taken into account to determine relevancy, but I always thought of including some human rating system (maybe a hidden one) and probably some model that would rank results based on the data from my directory, i.e. Pharos-search.com.
Sorry I wasn't much help here, but I look forward to seeing what ideas other people have. I would love to see this topic keep running.
__________________
Denis
23-03-2005, 19:23   #3
Wistow (Moderator)

I understand that PHP is slow, but this is just a side project (a test, if you will) to see if I can do it.
I would agree that C and C++ are good languages for a search engine, but believe Perl or Java to be the best!

The problem with creating a full-scale search engine is the setup cost, as you would need hardware in the region of:

1. A dedicated database server with around 1.4 terabytes of hard-disk space, a 3GHz Pentium 4 processor and around 4GB of RAM
2. Around 4 query/application servers with 2GB of RAM each

The above would probably be sufficient for an index of around 250 million pages, but the cost involved is way out of reach of most people.
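
As a rough sanity check on the figures above, the disk budget per indexed page can be worked out directly. This is only back-of-the-envelope arithmetic on the numbers quoted in the post; treating a terabyte as 10^12 bytes is an assumption, and the variable names are made up for illustration.

```python
# Figures quoted in the post above.
DISK_BYTES = 1.4e12            # 1.4 terabytes, assumed decimal (10**12 bytes)
PAGES = 250_000_000            # index of around 250 million pages
QUERY_SERVERS = 4
RAM_PER_QUERY_SERVER_GB = 2

bytes_per_page = DISK_BYTES / PAGES
total_query_ram_gb = QUERY_SERVERS * RAM_PER_QUERY_SERVER_GB

print(f"Disk budget per page: {bytes_per_page / 1000:.1f} KB")
print(f"Total query-server RAM: {total_query_ram_gb} GB")
```

That works out to about 5.6 KB of disk per page, so whether the estimate holds depends on how much of each page the index actually stores.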

Building a successful search engine is also a dream of mine, and I have lost count of the number of business plans I have written that weren't quite 'there' with regards to going ahead with them.

Kind Regards,

Craig
23-03-2005, 19:27   #4
sensovision (Cool Newbie)

Using Perl for part of it would be good, as it has a wide range of text-handling capabilities, but do you think it's a good idea to write the whole engine in it? It's slow as well (I think), and so is Java (from what I've heard, as I've never used it myself).
Anyway, I wish you luck with your project. I'd be curious to see how it performs.
__________________
Denis
05-04-2005, 06:55   #5
woodflooristcom (Senior Member)
Join Date: Jan 2005
Location: Centralia, WA
Posts: 339

Getting a headache just thinking about this. I guess the size of this kind of project is just too mind-boggling. Not only are the hardware and software a major expense, but the people needed to supervise the data that the spiders collect would add up too.

I usually go to bed when I start to think about how amazing what Google does really is. They log every phone number, every email address, every snail mail address, every word and misspelled word, every name, and how far each is from the others and in what order. Just html and web and 1 show the size of this project.

You might want to consider something like having twenty children first.
05-04-2005, 09:21   #6
robertall (Moderator)
Join Date: Jan 2005
Location: post is there >>
Posts: 1,586

All that RAM makes my head hurt. I only have 192MB.