Search Engine Basics
Just about every major search engine has three basic parts. The
first is the spider, also called a robot or crawler. The spider
visits a web page, reads it, and then follows links to other pages
within the site. This is what people mean when they say a site has
been "spidered" or "crawled." The spider returns to the site on a
regular basis, such as every month or two, to look for changes and
updates. (If a site is updated often and is well marketed, this can
happen much more frequently, sometimes even daily.)
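To make the crawling step concrete, here is a minimal sketch in Python using only the standard library. The names (LinkExtractor, crawl, start_url, max_pages) are illustrative, not any engine's real code, and a production spider would also honor robots.txt, throttle its requests, and schedule the return visits described above.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen

class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(start_url, max_pages=10):
    """Visit pages breadth-first, staying within the starting site."""
    site = urlparse(start_url).netloc
    queue, seen, pages = deque([start_url]), set(), {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        if url in seen:
            continue
        seen.add(url)
        try:
            html = urlopen(url, timeout=10).read().decode("utf-8", "replace")
        except (OSError, ValueError):
            continue  # skip pages that fail to load
        pages[url] = html  # keep a copy for the index to consume
        parser = LinkExtractor()
        parser.feed(html)
        for link in parser.links:
            absolute = urljoin(url, link)
            if urlparse(absolute).netloc == site:  # stay within the site
                queue.append(absolute)
    return pages
```

Calling something like crawl("https://example.com") would return the stored page copies that the indexing step, described next, consumes.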
Everything the spider finds goes into the second part of a
search engine, the index. The index, sometimes called the
database, is like a giant library containing a copy of every web
page the spider finds. If a page has changed, or appears to have
changed, since the spider's last visit, it is re-indexed and this
"book" is updated with the new information.
Sometimes it takes a while for the new pages or changes the
spider finds to be added to the index. A web page may therefore
have been "spidered" but not yet "indexed." Until the new
information is indexed, it is not available to people searching
with the search engine.
The third, and most sophisticated, part of a search engine is
the ranking software (sometimes referred to as the algo or
algorithm). This is the program that sifts through the millions
of pages recorded in the index to find matches for a search and
rank them in the order it believes is most relevant. All search
engines have the basic parts described above, but they differ in
how those parts are tuned. That is why the same search on
different search engines often produces different results.
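As a rough illustration of that sifting and ranking, here is a sketch that scores candidate pages by plain term frequency over the pages and index built in the earlier sketches. The rank function and its scoring are stand-ins; real ranking software weighs many signals (links, freshness, where the terms appear on the page), and tuning those weights differently is exactly why engines disagree.

```python
import re
from collections import Counter

def tokenize(text):
    """Same illustrative tokenizer used to build the index above."""
    return re.findall(r"[a-z0-9]+", text.lower())

def rank(query, pages, index):
    """Return URLs that match every query term, best match first."""
    terms = tokenize(query)
    # candidates: pages the index lists for every term in the query
    candidates = set(pages)
    for term in terms:
        candidates &= index.get(term, set())
    # crude relevance: total occurrences of the query terms per page
    scores = {url: sum(Counter(tokenize(pages[url]))[t] for t in terms)
              for url in candidates}
    return sorted(candidates, key=scores.get, reverse=True)
```

Swap in a different scoring rule and the same index returns the same matches in a different order, which is the small-scale version of why identical searches rank differently from one engine to the next.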