Searching the World Wide Web
Search Engines
A search engine is a searchable index or database of web pages and other information available on the Internet.
A typical search engine consists of three parts:
-
Spider - also known as a robot, crawler, or indexer; it builds the search engine's index by visiting web pages and following the links it finds on them
-
Index - also known as the "catalog", where the spider stores a text-only copy of the web pages that it finds
-
Search Engine Software - this is the software that searches through the index of web pages assembled by the spider; it looks for pages that contain one or more of the words that you've entered and returns a list of results, usually ranked in some way
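The three parts above can be sketched in a few lines of Python. This is only an illustrative toy, not how any real search engine is built: the `WEB` dictionary is a made-up, in-memory stand-in for the Web (no network access), the names `crawl`, `build_index`, and `search` are hypothetical, and the index here is assumed to be a simple word-to-pages lookup rather than a full text-only copy of each page.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
WEB = {
    "a.html": ("search engines index web pages", ["b.html", "c.html"]),
    "b.html": ("a spider follows links between pages", ["a.html"]),
    "c.html": ("the index stores a text copy", []),
}

def crawl(start):
    """Spider: visit pages by following links; return URL -> page text."""
    seen, queue, pages = {start}, deque([start]), {}
    while queue:
        url = queue.popleft()
        text, links = WEB[url]
        pages[url] = text
        for link in links:
            # Only follow links to known pages we have not visited yet.
            if link in WEB and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages

def build_index(pages):
    """Index: map each word to the set of URLs whose text contains it."""
    index = {}
    for url, text in pages.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(url)
    return index

def search(index, query):
    """Search software: return URLs containing one or more query words."""
    hits = set()
    for word in query.lower().split():
        hits |= index.get(word, set())
    return sorted(hits)

index = build_index(crawl("a.html"))
print(search(index, "spider"))
```

The results here are returned in alphabetical order only; ranking them is a separate step.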
How do search engines rank pages?
-
The order of a search engine's "hits" is determined largely by the location and frequency of the search terms that you enter
-
Important locations on a web page include: title; headline; first few paragraphs; meta tags
-
Frequency of the search terms on a page -- pages on which the terms appear more often will be ranked higher in the hits list
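As a rough illustration of ranking by location and frequency, the sketch below scores each page by counting how often the search terms appear in its body and adding a bonus for matches in the title. The page data, the function names `score` and `rank`, and the `title_weight` value of 3 are all assumptions chosen for the example; real search engines use their own, more elaborate formulas.

```python
# Hypothetical sample pages; only the title and body locations are modeled.
PAGES = [
    {"url": "x.html", "title": "Cooking pasta", "body": "boil pasta then drain pasta"},
    {"url": "y.html", "title": "Travel notes", "body": "we ate pasta in rome"},
    {"url": "z.html", "title": "Gardening", "body": "plant tomatoes in spring"},
]

def score(page, query, title_weight=3):
    """Body term frequency plus a weighted bonus for matches in the title."""
    title_words = page["title"].lower().split()
    body_words = page["body"].lower().split()
    total = 0
    for term in query.lower().split():
        total += body_words.count(term)                   # frequency
        total += title_weight * title_words.count(term)   # location
    return total

def rank(pages, query):
    """Return URLs of matching pages, highest score first."""
    scored = [(score(page, query), page["url"]) for page in pages]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [url for points, url in scored if points > 0]

print(rank(PAGES, "pasta"))
```

With this data, `x.html` outranks `y.html` because "pasta" appears in its title and twice in its body, while `z.html` is dropped because it never mentions the term.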
No two indexes cover the Web in the same way; they differ in the number of pages indexed, how often the index is updated, and the criteria for including pages.