Search Engines - Inside Out by www.itsallaboutlinks.com
Paras Shah
Webmaster & SEO Expert
http://www.itsallaboutlinks.com
"search engine" term is generally used to describe search
engines based on crawling technology and human edited web
directories. Both the search engines operates differently.
Search Engines Based on Crawling Technology
Search engines like Google are crawler-based search engines:
they build their listings automatically. They "crawl" or
"spider" web pages, and people then search through what they
have found.
If you change your web pages, search engines that operate on
crawling technology eventually find those changes automatically.
They rate your website based on the changes you have made, and
that can affect how you are listed. Page titles, meta tags,
headers, body text and the other on-page elements all play a
major role in your website's listing.
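To make that concrete, here is a minimal sketch in Python
(standard library only) of how a crawler might pull those
on-page elements out of a page. The class name, the sample HTML
and the particular elements chosen are illustrative assumptions,
not a description of any real engine's parser.

    from html.parser import HTMLParser

    class PageSignals(HTMLParser):
        """Collect the on-page elements a crawler looks at:
        the <title>, the meta description and <h1> headers."""
        def __init__(self):
            super().__init__()
            self.title = ""
            self.description = ""
            self.headers = []
            self._inside = None
        def handle_starttag(self, tag, attrs):
            attrs = dict(attrs)
            if tag in ("title", "h1"):
                self._inside = tag
            elif tag == "meta":
                if (attrs.get("name") or "").lower() == "description":
                    self.description = attrs.get("content") or ""
        def handle_endtag(self, tag):
            if tag in ("title", "h1"):
                self._inside = None
        def handle_data(self, data):
            if self._inside == "title":
                self.title += data
            elif self._inside == "h1":
                self.headers.append(data.strip())

    parser = PageSignals()
    parser.feed('<html><head><title>Widgets</title>'
                '<meta name="description" content="All about widgets.">'
                '</head><body><h1>Widgets</h1></body></html>')
    print(parser.title, "|", parser.description, "|", parser.headers)
    # Widgets | All about widgets. | ['Widgets']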
Human-Edited Web Directories
A human-edited web directory, such as DMOZ, the Open Directory,
depends on humans for its listings. You submit a link with a
short description of your website to the directory, human
editors review the site, and they list you in the directory. A
search then looks for matches only in the descriptions that were
submitted.
Changing your web pages has no effect on your listing. Things
that are useful for improving a listing with a search engine
have nothing to do with improving a listing in a directory. The
only exception is that a good site, with good content, might be
more likely to get reviewed for free than a poor site.
Hybrid Search Engines
In the web's early days, a search engine presented either
crawler-based results or human-edited listings. Today, it is
extremely common for both types of results to be presented.
Usually, a hybrid search engine will favor one type of listing
over the other. For example, MSN Search is more likely to
present human-powered listings from LookSmart. However, it also
presents crawler-based results (as provided by Inktomi),
especially for more obscure queries.
The Parts Of A Crawler-Based Search Engine
Crawler-based search engines have three major elements. First is
the spider, also called the crawler. The spider visits a web
page, reads it, and then follows links to other pages within the
site. This is what it means when someone refers to a site being
"spidered" or "crawled." The spider returns to the site on a
regular basis, such as every month or two, to look for changes.
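As a rough illustration, a spider's core loop can be sketched in
a few lines of Python: fetch a page, collect its links, and
queue those links for a later visit. The starting URL, the page
limit and the simple breadth-first order are assumptions made
for the example; a real spider is far more careful about
politeness, scheduling and revisit frequency, and unlike this
sketch it would not wander off the site it is crawling.

    from html.parser import HTMLParser
    from urllib.parse import urljoin
    from urllib.request import urlopen

    class LinkExtractor(HTMLParser):
        """Collect the href target of every <a> tag on a page."""
        def __init__(self):
            super().__init__()
            self.links = []
        def handle_starttag(self, tag, attrs):
            if tag == "a":
                for name, value in attrs:
                    if name == "href" and value:
                        self.links.append(value)

    def crawl(start_url, max_pages=10):
        """Visit pages, following links as a spider does, and
        return a dict of url -> raw HTML for the indexer."""
        queue = [start_url]
        seen = set()
        pages = {}
        while queue and len(pages) < max_pages:
            url = queue.pop(0)
            if url in seen:
                continue
            seen.add(url)
            try:
                raw = urlopen(url, timeout=10).read()
            except (OSError, ValueError):
                continue  # unreachable or unfetchable URL: skip it
            html = raw.decode("utf-8", errors="replace")
            pages[url] = html
            parser = LinkExtractor()
            parser.feed(html)
            # Resolve relative links and queue them for later.
            queue.extend(urljoin(url, link) for link in parser.links)
        return pages

    if __name__ == "__main__":
        for url in crawl("https://example.com", max_pages=3):
            print("spidered:", url)

The seen set keeps the spider from fetching the same page twice
in one pass; a real engine would also return later, as described
above, to pick up changes.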
Everything the spider finds goes into the second part of the
search engine, the index. The index, sometimes called the
catalog, is like a giant book containing a copy of every web
page that the spider finds. If a web page changes, then this
book is updated with new information.
Sometimes it can take a while for new pages or changes that the
spider finds to be added to the index. Thus, a web page may have
been "spidered" but not yet "indexed." Until it is indexed --
added to the index -- it is not available to those searching
with the search engine.
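In the same toy terms, the index can be sketched as a mapping
from each word to the pages that contain it, built from the
crawler's output. A URL that appears in the crawler's pages but
not yet in this mapping is exactly the "spidered but not yet
indexed" state described above. The function name and the crude
tag-stripping are illustrative assumptions.

    import re
    from collections import defaultdict

    def build_index(pages):
        """Build a toy catalog: map each word to the set of URLs
        whose text contains it. `pages` maps url -> raw HTML, as
        returned by the crawler sketch above."""
        index = defaultdict(set)
        for url, html in pages.items():
            text = re.sub(r"<[^>]+>", " ", html)  # strip tags, crudely
            for word in re.findall(r"[a-z0-9]+", text.lower()):
                index[word].add(url)
        return index

Re-running build_index after the spider finds a changed page is
the "book is updated" step; until that happens, searches keep
seeing the old copy.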
Search engine software is the third part of a search engine.
This is the program that sifts through the millions of pages
recorded in the index to find matches to a search and rank them
in order of what it believes is most relevant. You can learn
more about how search engine software ranks web pages on the
aptly-named How Search Engines Rank Web Pages page.
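Continuing the toy example, the search step looks up every query
word in the index, keeps only the pages that match all of them,
and orders the matches by a relevance score. Counting word
occurrences, as below, is an illustrative stand-in; the actual
ranking formulas of the major engines are far more sophisticated
and are not public.

    def search(index, pages, query):
        """Return the URLs matching every query word, best first.
        `index` and `pages` come from the sketches above."""
        words = query.lower().split()
        if not words:
            return []
        # A page qualifies only if the index lists it for every word.
        matches = set.intersection(
            *(index.get(w, set()) for w in words))
        # Crude relevance: how often the query words occur on the page.
        def score(url):
            text = pages[url].lower()
            return sum(text.count(w) for w in words)
        return sorted(matches, key=score, reverse=True)

    pages = {
        "http://example.com/a": "<p>web pages link to other web pages</p>",
        "http://example.com/b": "<p>one page about web pages</p>",
    }
    index = build_index(pages)  # from the index sketch above
    print(search(index, pages, "web pages"))
    # ['http://example.com/a', 'http://example.com/b']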
Major Search Engines: The Same, But Different
All crawler-based search engines have the basic parts described
above, but there are differences in how these parts are tuned.
That is why the same search on different search engines often
produces different results. Some of the significant differences
between the major crawler-based search engines are summarized on
the Search Engine Features Page. Information on this page has
been drawn from the help pages of each search engine, along with
knowledge gained from articles, reviews, books, independent
research, tips from others and additional information received
directly from the various search engines.