Hit me baby! As many times as you like!
As you may remember, some time ago I added a wee daemon here at the org, a fiendish script designed to foil those crazy bots and spiders, script kiddies, et al., who think it's fine and dandy to request two, three or even more pages a second, every second, until your entire web site is mirrored elsewhere, for some reason.
The concept is simple: If a client sends two requests within a set time, their "hammer count" is increased. When it reaches a set level, instead of the page, they get a short warning message, and must wait a few seconds. The more they hammer, the longer they have to wait. Simple.
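For the curious, the idea above can be sketched in a few lines. This is only an illustration in Python, not the script's own PHP, and every name and number in it (`HAMMER_WINDOW`, `TRIGGER_LEVEL`, `WAIT_STEPS`, `HammerTracker`) is a made-up assumption rather than an Anti-Hammer setting.

```python
import time

# Hypothetical settings; the real script's values and names may well differ.
HAMMER_WINDOW = 1.0          # two requests closer together than this (seconds) count as a hammer
TRIGGER_LEVEL = 3            # hammer count at which the warning replaces the page
WAIT_STEPS = [3, 5, 10, 20]  # wait times in seconds, growing with each further hammer


class HammerTracker:
    """Tracks one client's last request time, hammer count and current ban."""

    def __init__(self):
        self.last_request = None
        self.hammer_count = 0
        self.blocked_until = 0.0

    def check(self, now=None):
        """Return 0 to serve the page, or the number of seconds still to wait."""
        now = time.time() if now is None else now
        hammered = (
            self.last_request is not None
            and now - self.last_request < HAMMER_WINDOW
        )
        self.last_request = now
        if hammered:
            self.hammer_count += 1
            if self.hammer_count >= TRIGGER_LEVEL:
                # The more they hammer, the longer the wait (capped at the last step).
                step = min(self.hammer_count - TRIGGER_LEVEL, len(WAIT_STEPS) - 1)
                self.blocked_until = now + WAIT_STEPS[step]
        if now < self.blocked_until:
            return self.blocked_until - now
        return 0
```

In this sketch the hammer count never decays; presumably a real deployment would expire it along with the client's stored data.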
This script worked well, but did suffer one major flaw; if the client didn't accept cookies, or had them disabled, they bypassed Anti-Hammer, and many did. I didn't mind too much, secretly thankful that there was no chance the protection might interfere with the Googlebot. Bots rarely accept cookies.
Cookies are great (no, really, they are), but the trouble with cookies (aside from the small problem of clients simply ignoring them) is that we need the browser to send them back. So that's two requests, and what comes back could be spoofed, generated, anything. Clearly this isn't a lot of use for a script that runs before the client has even been served one.
It's been at the back of my mind for some time to close this loophole, and also add a few other desirable features. In all my spare moments these last few days, that's exactly what I've been doing.
No way around!
Anti-Hammer now takes a different approach to client sessions. Rather than wait for some session id to come back, Anti-Hammer uses a mix of all the available client properties to create a unique client id there-and-then, and from that point, recognizes the client only by this id (which is basically an MD5 of all that data concatenated together).
Anti-Hammer stores this data just like a PHP session; it is, in fact, very like a PHP session in almost all respects. The big difference is that Anti-Hammer doesn't need the client to send the ID back. The client doesn't even know what the ID is.
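To make the "ID without a round trip" idea concrete, here is a rough sketch, in Python rather than the script's PHP. Which properties go into the mix is my assumption for illustration (the IP address plus a handful of request headers), and `client_id` is a hypothetical helper name; the only thing taken from the description above is that the ID is an MD5 of the data concatenated together.

```python
import hashlib


def client_id(remote_addr, headers):
    """Build a server-side client ID from properties sent with the very first request."""
    # Assumed property list, chosen for illustration only.
    parts = [
        remote_addr,
        headers.get("User-Agent", ""),
        headers.get("Accept", ""),
        headers.get("Accept-Language", ""),
        headers.get("Accept-Encoding", ""),
    ]
    raw = "".join(parts)  # concatenate everything into one string
    return hashlib.md5(raw.encode("utf-8")).hexdigest()
```

Because the ID is derived entirely from what the client sends anyway, the same client produces the same ID on every request, and nothing needs to be handed out or sent back. MD5 is fine here as a fingerprint; it is not doing any security work.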
In short, there's almost no way around Anti-Hammer! All bots, spiders, creepy crawlies, script-kiddies and other web-getting entities are now subject to Anti-Hammer protection, cookies or no cookies!
Of course, you may want to allow certain bots to hammer some. And so, Anti-Hammer comes with a cute way to do exactly that. You simply set up an ini file with a list of friendly bot user agent strings => "IP Mapping files":
Mozilla/5.0 (compatible; Googlebot=google.txt
"IP Mapping files" just being my fancy phrase to describe the standard Spider IP lists you can get here
Anti-Hammer examines the user agent strings of all clients, and if a bot-match is found, checks the client's IP against the associated IP list. If it's in the list, Anti-Hammer protection is bypassed, and the page continues. Simple as that. Of course, you will need to download copies of the IP lists to your server (they seldom change). I'll likely package the main ones, and an exemptions.ini, along with the next Anti-Hammer distro proper.
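The two-step check (user agent first, then IP) might look something like the sketch below. It's Python for illustration, not the script's PHP, and it assumes things the post doesn't spell out: that each ini line is `user agent fragment=list file`, that an IP list holds one address per line, and that matching is a simple substring test. The function names are hypothetical, and the addresses in the usage example are documentation-range placeholders, not real spider IPs.

```python
def parse_exemptions(ini_text):
    """Map each friendly user agent fragment to the name of its IP list file."""
    exemptions = {}
    for line in ini_text.splitlines():
        line = line.strip()
        if not line or line.startswith((";", "#", "[")):
            continue  # skip blanks, comments and section headers
        # Split on the LAST "=" so the user agent fragment is kept whole.
        agent, _, list_file = line.rpartition("=")
        if agent.strip() and list_file.strip():
            exemptions[agent.strip()] = list_file.strip()
    return exemptions


def parse_ip_list(text):
    """Read an IP list, assuming one address per line and '#' for comments."""
    ips = set()
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            ips.add(line)
    return ips


def is_exempt(user_agent, client_ip, exemptions, load_ip_list):
    """True only if the user agent matches a friendly bot AND the IP is on its list."""
    for agent, list_file in exemptions.items():
        if agent in user_agent and client_ip in load_ip_list(list_file):
            return True
    return False
```

Usage, with the example line from above:

```python
exemptions = parse_exemptions("Mozilla/5.0 (compatible; Googlebot=google.txt")
lists = {"google.txt": parse_ip_list("192.0.2.10\n192.0.2.11\n")}
ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
is_exempt(ua, "192.0.2.10", exemptions, lists.__getitem__)    # listed IP: bypass
is_exempt(ua, "203.0.113.5", exemptions, lists.__getitem__)   # spoofed agent: no bypass
```

The point of the second step is that anyone can claim to be a friendly bot in the user agent string; only the IP check makes the exemption worth having.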
There have been quite a few changes since I last blogged about Anti-Hammer (full details at the foot of the script itself), including:
- Added configuration options for loads more behaviours and messages.
- Anti-Hammer now sends a proper 503 (Service Temporarily Unavailable) response, rather than a 200 OK.
- You can now set an absolute cut-off point, beyond which the client gets nada.
- Added Garbage Collection for the "session" files. This is fully configurable, and the code easily transportable to other web apps you may have.
- Added (back) rolling ban times. Rather than a few/3/5/10/20 seconds, Anti-Hammer can increment the retry delay count with each and every hammer! I personally don't think it's as effective, and it encourages folks into a refresh-fest, but it is fun.
- The old (PHP sessions) method is still available as an option, if required.
- Admin recognition, for busy web masters (I always bypass Anti-Hammer!)
- And More!
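As for the garbage collection in that list, the general technique is simple enough to sketch. Again this is Python for illustration, not the script's code, and it assumes the "session" files sit in one directory and count as stale once they haven't been modified for a set time. `collect_garbage`, `max_age` and `probability` are hypothetical names; the probability knob just mimics the way PHP's own session GC only runs on a fraction of requests.

```python
import os
import random
import time


def collect_garbage(session_dir, max_age=3600, probability=1.0, now=None):
    """Delete files in session_dir older than max_age seconds; return how many went."""
    # Only run on a fraction of requests, so the directory scan stays cheap on average.
    if random.random() >= probability:
        return 0
    now = time.time() if now is None else now
    with os.scandir(session_dir) as it:
        entries = list(it)  # snapshot first, so we don't delete while iterating
    removed = 0
    for entry in entries:
        if entry.is_file() and now - entry.stat().st_mtime > max_age:
            os.remove(entry.path)
            removed += 1
    return removed
```

Nothing here is specific to hammering, which is presumably what makes this sort of routine easy to carry over to other web apps.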
View/Download the latest php source code, here.
[There's now also a page all about it, here.]
I've been uploading the latest versions as I go along, testing them here at the org, and with great success, I should add. In the half hour since it went up, the latest version (with bot-exemptions) has allowed hundreds of friendly bots and ordinary (by which I mean you special people) users to rifle around to their hearts' content. Simultaneously, and crucially, it has blocked almost a hundred crazy hammers, mainly automated spammers, and requests for URLs like "/serv/tricks/htaccess2.php/index.php?url=../.././../../../../../../../... etc. ../../../../../etc/resolv.conf", and so on. In other words, exactly the sorts of requests Anti-Hammer was designed to thwart. Excellent!
Someone also seems to have come up with a script to semi-automatically harass c99.php in various ways. Someone ought to tell them my copy is just a joke (and all of Morocco and the Arab world while you are at it, ta!), and anyway, had all its "?act=" switched for "?foo=" since the server boys added that wee safety gremlin a few weeks back!
Anti-Hammer has right now started the process of using this poor subject as a Guinea-pig for its new "absolute cut-off" feature! Bringing me neatly to what Anti-Hammer eventually says to those sorts..
ps. feel free to play with it, though fer gawd sake, a low-content page, eh! Maybe this