What The…
In just one day, I’ve gotten about 1500 hits via Google from one source which seems to be carrying out robotic searches. This did not just start now; I’ve noticed it for the past few days, but it really stands out in the search engine details of my stats now that the clean slate of March 1st has zeroed all the figures.
The searches seem to be carried out using Google as a vehicle and seem to target Movable Type blogs. The copious searches all have the same structure, looking for web pages which include a Movable Type archive address structure like this:
inurl:"archives/001026.html"
Just change the number before the “.html” extension to any number between 000100 and 001700, and you’ve got an idea of the massive barrage of auto-generated searches being carried out. (The 1700 limit is due to my blog having roughly that number of posts; the searches must certainly exceed that number elsewhere.) The robot software then apparently follows each of the links Google returns, which results in a referral hit on my site and, I suspect, on many, many others. Moreover, each search results in up to 20 hits for any one archive page. It’s as if someone is using Google as a tool in a bizarre cataloging of Movable Type blogs.
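For anyone who wants to check their own logs for the same thing, here is a rough Python sketch that picks these searches out of referrer URLs. It assumes the referrer carries the search text in Google's `q` parameter; the function name and the sample referrers are made up for illustration, and your stats package may record referrers differently.

```python
import re
from urllib.parse import urlparse, parse_qs

# Matches search text such as inurl:"archives/001026.html"
# (six digits, as in the Movable Type archive addresses above).
ARCHIVE_QUERY = re.compile(r'inurl:"?archives/(\d{6})\.html"?')

def archive_number(referrer):
    """Return the archive number searched for, or None if the referrer
    is not a Google search of the kind described above."""
    parts = urlparse(referrer)
    if "google." not in parts.netloc:
        return None
    # Assumption: the search text is carried in the "q" parameter.
    query = parse_qs(parts.query).get("q", [""])[0]
    match = ARCHIVE_QUERY.search(query)
    return int(match.group(1)) if match else None

# Hypothetical referrers, for illustration only.
sample = [
    "http://www.google.com/search?q=inurl:%22archives/001026.html%22",
    "http://www.google.com/search?q=movable+type+templates",
]
hits = [n for n in map(archive_number, sample) if n is not None]
print(hits)
```

Counting how many distinct archive numbers turn up this way gives a quick sense of how much of the 000100 to 001700 range the robot has covered.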
Anyone else out there notice this?
The searches and hits, by the way, are all coming from a block of eleven consecutive IP addresses, specifically 85.255.113.179 through 85.255.113.189.
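If you want to test whether a visitor in your own logs falls inside that block, a minimal sketch using Python's standard `ipaddress` module could look like this. The function name is my own, and the range is simply the one I observed, not anything official.

```python
import ipaddress

# First and last addresses of the range observed in my stats.
FIRST = ipaddress.ip_address("85.255.113.179")
LAST = ipaddress.ip_address("85.255.113.189")

def in_reported_range(addr):
    """True if addr (an IPv4 address as a string) lies in the observed range."""
    return FIRST <= ipaddress.ip_address(addr) <= LAST

# Both endpoints are included, so the range holds eleven addresses.
count = int(LAST) - int(FIRST) + 1
print(count, in_reported_range("85.255.113.180"))
```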
Update: I’m not the only one, as I expected. Here’s a guy who linked to this post and is having the identical problem. He’s done better than I have, though, and identified the IP address block as being from the Ukraine.
I’ve just noticed that when I go to look at my account on Gmail or search Google, the server seems to be tied up. This is highly unusual, which makes me think that their servers are under some kind of viral attack. Otherwise, they are a pretty organized bunch. I don’t expect problems from Google.
Is there a way to get any info about an IP, e.g. country of origin?
It would be interesting if you made a PDF of the stats that you look at, for one day, for example, put that on the site, and talked about it, so that folks can become familiar with what you are looking at.
If you do a whois request, either through a website that offers one (Google for one, such as SamSpade.org) or from a terminal on OS X or another Unix system, you’ll get the registrar and related details. The company that hosts these machines is called Inhoster and is, indeed, in the Ukraine. Their abuse contact address is abuse [at] inhoster.com. They own a big block of IPs that contains all the ones from this particular person, who seems to be using a peculiarly inefficient way of spidering blog entries. It’s not as if it works for all Movable Type blogs. If you have entries archived by name rather than post number, his technique won’t work.
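For those without a whois tool handy, the lookup itself is simple enough to sketch in Python: a WHOIS request is just the query text followed by a line break, sent to port 43 (RFC 3912). The choice of whois.ripe.net as the server and the helper names here are my assumptions, and the "country:" field is how RIPE-style replies are usually laid out; other registries format theirs differently.

```python
import socket

def build_query(ip):
    """A WHOIS request is the query text followed by CRLF."""
    return (ip + "\r\n").encode("ascii")

def whois(ip, server="whois.ripe.net", port=43, timeout=10):
    """Send the query and return the server's plain-text reply.
    The default server is an assumption; pick the registry for your region."""
    chunks = []
    with socket.create_connection((server, port), timeout=timeout) as conn:
        conn.sendall(build_query(ip))
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")

def country_lines(reply):
    """Pick out the values of any 'country:' fields in a reply."""
    return [line.split(":", 1)[1].strip()
            for line in reply.splitlines()
            if line.lower().startswith("country:")]

# Usage (needs network access):
#   print(country_lines(whois("85.255.113.179")))
```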
Don’t worry, it’s just “homeland security” making sure these bloggers are not encouraging “post-9/11 stuff”.
After all, everyone could be dangerous nowadays:
http://www.shns.com/shns/g_index2.cfm?action=detail&pk=RAISEALARM-02-28-06
Sounds like a harvester for a potential comment-spam wave against MT?
It’s a fishing expedition from a comment spambot; that IP address is already in the blacklist that SpamKarma checks.
Mike
This is a typical link spammer who spams blogs, wikis, guestbooks. See here:
http://cyber.law.harvard.edu:8080/globalvoices/wiki/index.php?title=Special:Contributions&target=85.255.113.180
http://cyber.law.harvard.edu:8080/globalvoices/wiki/index.php?title=Special:Contributions&target=85.255.113.189
Chris: I don’t see anything in those links that comes close to what I got.