An Internet search engine grows as people use it.
The logic is pretty simple. A search engine crawls the web with a robot, looks for specific words, and stores them in its local database with an index. When a user searches for a word, it retrieves the closest matches and displays them on the screen. If the word isn't present, it gets added to the crawler robot's list.
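To make the lookup step concrete, here's a toy sketch in Python; the index is just a hand-filled dictionary and the example.com URLs are placeholders, not how a real engine stores things.

```python
# Toy sketch of the lookup step: the "index" here is a plain dictionary
# mapping each word to the pages it was found on.
index = {
    "python": ["http://example.com/a", "http://example.com/b"],
    "spider": ["http://example.com/b"],
}

def search(word):
    # Return every page that contained the word, or an empty list.
    # (A real engine would also queue unknown words/pages for the crawler.)
    return index.get(word.lower(), [])

print(search("python"))   # ['http://example.com/a', 'http://example.com/b']
print(search("unknown"))  # []
```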
Hope this answers your question.
2006-07-11 08:59:29 · answer #1 · answered by moin_anjum 5
As mentioned, if you are going to actually index all the pages on the Web, you're going to need a lot of storage space but...
If you just want to crawl the web searching for things (a friend of mine, for example, wrote a spider that searched the web for his name), then you can just write a bot or spider.
The spider program simply makes consecutive HTTP requests to every link on a page. When it starts, it goes to the first page, finds every link on that page, pulls out all the addresses, and then searches all those pages, repeating the process.
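A rough sketch of that loop in Python might look like the following; the start URL, page limit, and function names are placeholders, and a real spider would also need politeness delays, robots.txt checks, and better error handling.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(start_url, max_pages=10):
    # Breadth-first: fetch a page, collect every link on it, queue those
    # links, and repeat until the page limit is reached.
    queue, seen = [start_url], set()
    while queue and len(seen) < max_pages:
        url = queue.pop(0)
        if url in seen:
            continue
        seen.add(url)
        try:
            html = urlopen(url, timeout=5).read().decode("utf-8", "ignore")
        except Exception:
            continue  # skip pages that fail to load or decode
        collector = LinkCollector()
        collector.feed(html)
        for link in collector.links:
            queue.append(urljoin(url, link))  # resolve relative URLs
    return seen

# Example: crawl("http://example.com", max_pages=5)
```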
You'll need to know how to write data structures and do searching and sorting, as well as parse HTML, use regular expressions, etc.
This can be done in any language, but what you need is already in .NET, Java and PERL.
2006-07-11 09:00:06 · answer #2 · answered by Anonymous
You can, too, given lots of storage space and great internet connectivity.
You need to create a spider program that goes surfing the web, collecting links and downloading entire websites. You need the spider to be smart, so it respects the robots.txt guidelines and doesn't clog up the sites it visits.
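For the robots.txt part, Python's standard library already ships a parser; this is a minimal sketch of the check, with the function name and user-agent string made up for illustration.

```python
from urllib.robotparser import RobotFileParser
from urllib.parse import urljoin

def allowed_to_fetch(page_url, user_agent="MySpider"):
    # Read the site's robots.txt and ask whether this URL may be crawled.
    robots_url = urljoin(page_url, "/robots.txt")
    rp = RobotFileParser()
    rp.set_url(robots_url)
    try:
        rp.read()
    except OSError:
        return True  # robots.txt unreachable; assume crawling is allowed
    return rp.can_fetch(user_agent, page_url)

# Example: allowed_to_fetch("http://example.com/some/page")
```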
You need to create a program that gets the relevant keywords from a given webpage (*any* webpage) so you can search for them.
You need to design a database so you can optimize the way you store the addresses and relevant keywords, so the process of searching through them is not resource-expensive.
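One common way to organize that storage is an inverted index: each keyword maps to the pages it appears on, so answering a query is a single lookup instead of a scan. Here's a toy in-memory sketch combining the keyword-extraction and storage steps; the regex-based extraction and the sample pages are simplifications for illustration.

```python
import re
from collections import defaultdict

# Inverted index: keyword -> set of pages the keyword appears on.
index = defaultdict(set)

def extract_keywords(html):
    # Crude keyword extraction: strip tags, lowercase, split into words.
    text = re.sub(r"<[^>]+>", " ", html)
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def add_page(url, html):
    for word in extract_keywords(html):
        index[word].add(url)

add_page("http://example.com/a", "<h1>Build a search engine</h1>")
add_page("http://example.com/b", "<p>Search spiders crawl the web</p>")
print(index["search"])  # both example pages contain "search"
```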
You need either a big, ungodly fast computer or tons of small computers so processing everyone's queries doesn't take a century or so.
You need lots of disk space just to store a list of all the webpages of the Internet. Multiply that by the number of keywords a webpage has.
2006-07-11 08:58:04 · answer #3 · answered by Locoluis 7
Search engines index pages, so whatever a search engine indexes would have to be on its server. So you'd need lots and lots of servers to do this. Also, you'd have to program a lot. For example, you'd have to invent a method to determine which websites go first on the results page.
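As one crude example of such a method, you could score each page by how often the query words occur in it and sort by that score; real engines use far richer signals (link structure and so on). The pages and data below are made up for illustration.

```python
def rank(query_words, pages):
    # Toy ranking: score each page by how many times the query words
    # occur in its text, then list the highest-scoring pages first.
    def score(text):
        words = text.lower().split()
        return sum(words.count(w.lower()) for w in query_words)
    return sorted(pages, key=lambda p: score(pages[p]), reverse=True)

pages = {
    "http://example.com/a": "search engines index pages on many servers",
    "http://example.com/b": "a page about something else entirely",
}
print(rank(["search", "pages"], pages))  # page /a comes first
```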
2006-07-11 08:50:16 · answer #4 · answered by MattH 2
I think only big companies can.
2006-07-11 08:45:47 · answer #5 · answered by Anry 7