Introduction and Purpose
This article presents evidence to counter the suggestion that Assertivenet may be used for malicious purposes.
Initial Research
On Saturday, March 11, 2006, I received a somewhat urgent telephone call from a client of mine, Hibiscus Florals (www.hibiscusflorals.com). The owner, Mark Morkowski, was concerned because he had been reviewing his website traffic statistics and had noticed that at numerous points throughout the day, a user or spider from "ASSERTIVENET" (IP 66.154.103.125) had visited the Hibiscus website.
Since this was rather unusual, Mark elected to investigate further by searching for more information about "Assertivenet" via the Google search engine. The first three results that he found appear below:
It is this lack of information that likely led some of the members of the PowerBASIC forums to block the IP range 66.154.* from accessing their various websites, and justifiably so. But this same lack of information led to additional questions:
At this point, I decided to look beyond what the website traffic statistics revealed, as well as the information that Mark's initial search revealed. I needed to start by answering the questions I posed earlier, and in order to do so, I needed to access the raw log files for the Hibiscus website.
I opened up the log files, searched for the particular IPs in question, and found a series of entries such as these:
2006-03-11 03:47:34 66.154.103.125 - 216.89.218.168 80 GET /robots.txt - 200 0 400 285 78 HTTP/1.0 www.hibiscusflorals.com Gigabot/2.0/gigablast.com/spider.html -
2006-03-11 03:47:34 66.154.103.119 - 216.89.218.168 80 GET /larger_image.asp PID=215 200 0 0 299 125 HTTP/1.0 www.hibiscusflorals.com Gigabot/2.0/gigablast.com/spider.html -
2006-03-11 03:50:37 66.154.103.119 - 216.89.218.168 80 GET /larger_image.asp PID=195 200 0 0 299 31 HTTP/1.0 www.hibiscusflorals.com Gigabot/2.0/gigablast.com/spider.html -
2006-03-11 07:47:05 66.154.103.125 - 216.89.218.168 80 GET /robots.txt - 200 0 400 285 78 HTTP/1.0 www.hibiscusflorals.com Gigabot/2.0/gigablast.com/spider.html -
2006-03-11 07:47:05 66.154.103.119 - 216.89.218.168 80 GET /larger_image.asp PID=219 200 0 0 299 109 HTTP/1.0 www.hibiscusflorals.com Gigabot/2.0/gigablast.com/spider.html -
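For readers who want to repeat this kind of search on their own logs, here is a minimal Python sketch. It assumes the space-separated layout seen in the entries above; the log's "#Fields:" header is not shown in this article, so the field positions and the names parse_line and requests_from are illustrative assumptions, not part of any logging tool.

```python
from collections import Counter

# Assumed 0-based field positions, inferred from the sample entries only.
# Check them against the "#Fields:" header of your own log files.
CLIENT_IP, METHOD, URI_STEM, URI_QUERY, STATUS, USER_AGENT = 2, 6, 7, 8, 9, 16


def parse_line(line):
    """Return a dict for one log entry, or None for comments and short lines."""
    parts = line.split()
    if line.startswith("#") or len(parts) <= USER_AGENT:
        return None
    return {
        "client_ip": parts[CLIENT_IP],
        "method": parts[METHOD],
        "uri_stem": parts[URI_STEM],
        "uri_query": parts[URI_QUERY],
        "status": parts[STATUS],
        "user_agent": parts[USER_AGENT],
    }


def requests_from(lines, ip_prefix):
    """Collect parsed entries whose client IP starts with ip_prefix."""
    hits = []
    for line in lines:
        entry = parse_line(line)
        if entry and entry["client_ip"].startswith(ip_prefix):
            hits.append(entry)
    return hits


# Two of the entries quoted above, used as sample input.
SAMPLE = [
    "2006-03-11 03:47:34 66.154.103.125 - 216.89.218.168 80 GET /robots.txt - 200 0 400 285 78 HTTP/1.0 www.hibiscusflorals.com Gigabot/2.0/gigablast.com/spider.html -",
    "2006-03-11 03:47:34 66.154.103.119 - 216.89.218.168 80 GET /larger_image.asp PID=215 200 0 0 299 125 HTTP/1.0 www.hibiscusflorals.com Gigabot/2.0/gigablast.com/spider.html -",
]

hits = requests_from(SAMPLE, "66.154.")
agents = Counter(entry["user_agent"] for entry in hits)
for agent, count in agents.items():
    print(count, agent)
```

Tallying the user-agent field is what points to the crawler's identity: every request from the range carries the same Gigabot string. Keep in mind that a user-agent string is supplied by the client, so on its own it is a lead rather than proof.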
The spider in this case actually belongs to a search engine called Gigablast, and is appropriately named the Gigabot. The Gigabot crawled only pages and files, as other search engine spiders do, and made no attempts whatsoever to access files and scripts of a known malicious nature.
Gigablast is a "Tier 2" search engine that has over 1,000,000,000 pages indexed as of the date of this article (March 13, 2006). While it is not on the same level in terms of popularity as the Big 3 of Yahoo!, MSN, and Google, it has indexed a significant portion of the web, and can be useful for some searches. In particular, Gigablast has implemented a "Giga bits" feature whereby alternate searches are suggested based on the user's original query in order to help narrow the query down and provide greater relevancy.
I conducted additional research and discovered that some IP addresses in the 66.154.* block do resolve to gigablast.com, e.g.:
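A reverse DNS check of this kind can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name reverse_dns_matches is made up for this example, and the lookup is passed in as a parameter so the logic can be tried without a network connection. By default it uses the standard library's socket.gethostbyaddr.

```python
import socket


def reverse_dns_matches(ip, domain, lookup=socket.gethostbyaddr):
    """Return True if the reverse DNS name for ip is domain or a subdomain of it.

    lookup must behave like socket.gethostbyaddr, returning a
    (hostname, aliases, addresses) tuple or raising OSError on failure.
    """
    try:
        hostname, _aliases, _addresses = lookup(ip)
    except OSError:
        # No PTR record, or the lookup failed: treat as "does not match".
        return False
    hostname = hostname.rstrip(".").lower()
    domain = domain.lower()
    return hostname == domain or hostname.endswith("." + domain)
```

Calling reverse_dns_matches("66.154.103.125", "gigablast.com") performs a live lookup, and the answer today may differ from what the records showed in 2006. Because reverse DNS records are controlled by whoever operates the IP block, a stricter check would also resolve the returned hostname forward and confirm that it maps back to the same IP address.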
As you may well have gathered by now, the Gigabot is a perfectly safe spider that operates in the same manner as other search engine spiders. There is no reason at this time to block the 66.154.* IP range that the bot uses; if anything, webmasters would gain from the potential free traffic that Gigablast would generate for their websites as a result of the Gigabot's efforts.
Adam Senour is a freelance web designer based out of the Greater Toronto Area. His latest project is Search Engine Friendly Layouts, a series of tableless layouts using CSS that load a website's content area first and foremost.