Automatically ban web site hammers!
Protect your site against Referer Spam!
Deny script-kiddie and h4x0r requests!
Send bad bots and spiders packing!
Protect your valuable server resources for genuine clients..
Anti-Hammer is a php script that runs before your pages do, watching. As requests arrive, Anti-Hammer checks how long it's been since that client's last request. If a reasonable amount of time has passed, the page is served as usual. But if not, their "Hammer Count" is increased. Oh oh!
When the hammer count reaches preset trigger levels, their hammering is suspended, and instead of the page, they get a cute message (read: warning), and must wait X amount of seconds before trying again.
The more they hammer, the longer they have to wait, incrementally. Simple.
You can even set an absolute cut-off point, beyond which they simply get a blank page, nothing (except a nice 503 response), until their ban lifts (hours later).
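To make the escalation above concrete, here is a minimal sketch of the idea, not Anti-Hammer's actual code. The function name `check_hammer`, the state keys, the trigger table and the cut-off value are all hypothetical, and the eventual lifting of a ban is left out.

```php
<?php
// Hypothetical sketch: names, trigger levels and waits are illustrative
// assumptions, not Anti-Hammer's real preferences or internals.

/**
 * Decide what to do with one request, given the client's stored state.
 *
 * @param array $state      ['last' => float, 'count' => int, 'until' => float]
 * @param float $now        current time, in seconds
 * @param float $hammerTime minimum seconds allowed between requests
 * @param array $triggers   hammer count => seconds the client must wait
 * @param int   $cutOff     hammer count at which only a bare 503 is sent
 * @return array            [action, wait in seconds, updated state]
 */
function check_hammer(array $state, float $now, float $hammerTime, array $triggers, int $cutOff): array
{
    // A request is "too soon" if the client is still suspended,
    // or if it arrives within $hammerTime of the previous request.
    $tooSoon = $now < $state['until'] || ($now - $state['last']) < $hammerTime;
    $state['last'] = $now;

    if (!$tooSoon) {
        return ['serve', 0, $state];
    }

    $state['count']++;
    if ($state['count'] >= $cutOff) {
        return ['ban', 0, $state];      // blank page plus a 503
    }

    // The more they hammer, the longer the wait: use the longest
    // wait among the trigger levels reached so far.
    $wait = 0;
    foreach ($triggers as $level => $seconds) {
        if ($state['count'] >= $level) {
            $wait = max($wait, $seconds);
        }
    }
    if ($wait > 0) {
        $state['until'] = $now + $wait; // suspended until then
        return ['warn', $wait, $state];
    }
    return ['serve', 0, $state];        // below the first trigger level
}
```

With a trigger table such as `[2 => 3, 4 => 10]`, the second hammer would earn a three second wait and the fourth a ten second wait, which is the incremental behaviour described above.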
Everything is configurable.
Now with Referer Spam and h4x0r Protection!
As well as protecting your site against hammering, Anti-Hammer can deny access to Referer Spammers, Script-Kiddies, h4x0rs and more. In addition to the traditional white-list/black-list approach, Anti-Hammer can perform dynamic interrogation of referring pages, black-listing any referers which don't actually link to your site, white-listing those that do, automatically.
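The core of that referer check can be sketched roughly like this. It is only an illustration of the idea: the function name `page_links_to` is made up, fetching the referring page is left out, and the real script may well inspect the page differently.

```php
<?php
// Hypothetical sketch: given the HTML of a referring page (already
// fetched by some other means), report whether it links to our host.
function page_links_to(string $html, string $myHost): bool
{
    // Collect every quoted href value on the page.
    if (!preg_match_all('/href\s*=\s*["\']([^"\']+)["\']/i', $html, $m)) {
        return false;
    }
    foreach ($m[1] as $href) {
        // parse_url() gives null for relative links and false for junk.
        $host = parse_url($href, PHP_URL_HOST);
        if (is_string($host) && strcasecmp($host, $myHost) === 0) {
            return true;    // a genuine link: white-list candidate
        }
    }
    return false;           // no link found: black-list candidate
}
```

A referer whose page returns `false` here would go on the black-list, and one that returns `true` on the white-list, so each referer only needs to be interrogated once.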
Anti-Hammer can also deny access to clients making requests to dubious and unimplemented resources, things like MSOffice/cltreq.asp, and so on; whatever you need.
Why waste even a 404 page on these requests? Especially if you have a clever 404 page, like mine. With Anti-Hammer, you can cut out all the noise, take back your logs and analytics data!
Send Bad Bots and Spiders packing!
Anti-Hammer can also protect your site against known Bad bots and spiders, download engines, site suckers and more. Got yourself a HUGE list of .htaccess ban rules? Or don't have access to your .htaccess? Let Anti-Hammer handle it for you, with simpler syntax and without losing all that Regular Expression magic we know and love.
No Way Around Anti-Hammer!
Anti-Hammer uses its own php-session-like-but-better client tracking mechanism..
This works very like php sessions, except it works for ALL clients, regardless of their advertised capabilities, and works regardless of whether or not they have cookies enabled. Yes! You can even Anti-Hammer the GoogleBot! Not that you would want or need to, it's a rather well-behaved bot.
Rather than wait for some session ID to come back (that would be on the second request, you see, and we haven't even sent one yet), Anti-Hammer uses a mix of available client properties to create a unique client ID there-and-then, and from that point, recognizes the client by this ID (which is an MD5 of all that data concatenated together). It's pretty similar to the way a php_session is created, except Anti-Hammer doesn't need the browser to send anything back.
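As a rough sketch of that there-and-then ID: the exact mix of client properties Anti-Hammer uses isn't listed here, so the header names below are an assumption, as is the function name `client_id`.

```php
<?php
// Hypothetical sketch: build a client ID from whatever the client
// sends with its very first request. The choice of properties is an
// assumption; only "an MD5 of the data concatenated" comes from the text.
function client_id(array $server): string
{
    $keys = [
        'REMOTE_ADDR',
        'HTTP_USER_AGENT',
        'HTTP_ACCEPT',
        'HTTP_ACCEPT_LANGUAGE',
        'HTTP_ACCEPT_ENCODING',
    ];
    $parts = [];
    foreach ($keys as $key) {
        // Missing headers simply contribute nothing.
        $parts[] = isset($server[$key]) ? $server[$key] : '';
    }
    return md5(implode('', $parts));
}

// Typical use inside a request: $id = client_id($_SERVER);
```

Nothing needs to come back from the browser; the same client sending the same headers from the same address gets the same ID on every request.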
Anti-Hammer's storage mechanism (a serialized array in a flat file) is the same as a php session, too. And like a php session, it is anonymous; aside from the hammer time info, we store no other data server-side.
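A flat-file store of that kind might look something like the following. The function names, the one-file-per-client layout and the state keys are assumptions for illustration; only "a serialized array in a flat file" comes from the description above.

```php
<?php
// Hypothetical sketch of a serialized-array flat-file store,
// one small file per client ID.
function save_state(string $dir, string $id, array $state): void
{
    // LOCK_EX stops two simultaneous requests writing over each other.
    file_put_contents($dir . '/' . $id, serialize($state), LOCK_EX);
}

function load_state(string $dir, string $id): array
{
    $fresh = ['last' => 0.0, 'count' => 0];    // a client we've never seen
    $file  = $dir . '/' . $id;
    if (!is_file($file)) {
        return $fresh;
    }
    // Only plain arrays are expected, so refuse to revive any objects.
    $data = unserialize(file_get_contents($file), ['allowed_classes' => false]);
    return is_array($data) ? $data : $fresh;
}
```

Because the file name is just the MD5 client ID and the contents are only timing data, nothing in the store identifies a person.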
Unless you want that..
Anti-Hammer also comes with a mechanism to allow certain bots and other friendly spidering entities (matching specific criteria, including a known IP address/range), usually search engine spiders, to pass clean through Anti-Hammer, if required, or alternatively, allow them a faster hammer rate.
Did I mention everything is configurable?
If you really must, you can test it here at corz.org (yes, of course it's running here!), preferably some low content page, like the..
Ensure your server is running at least PHP5.1!
Unzip the Anti-Hammer package..
Upload the anti-hammer directory into your site somewhere, maybe inside /inc/ or something like that.
Make the anti-hammer/ directory writable..
If you run php as a cgi/*suexec, you can probably get off with doing nothing, so long as the directory is owned by your user account. For everyone else, the easiest method is probably via ftp, simply set all its permissions to world-writable (777). Or else in a shell..
chmod -R 777 /path/to/anti-hammer
NOTE: There is nothing inherently insecure about having a writeable directory, even a world-writeable directory. And for the paranoid, there are plenty of .htaccess tricks to ease your mind.
Also note: Technically, you only need to make the sessions/ directories writeable, but doing the whole lot is just fine, too.
Set your Anti-Hammer preferences..
Open anti-hammer.php in a decent text editor, by which I mean one with syntax highlighting.
Setup php auto_prepend..
Anti-Hammer needs to run as a php "auto-prepend", so it runs before your pages do. To achieve this magic, add the following command to your site's main (root) .htaccess file..
php_value auto_prepend_file "/full/real/server/path/to/anti-hammer.php"
..replacing the path with the actual path, of course. If php runs as cgi/*suexec on your site, or you have global control, do this in your site's global/local php.ini, instead..
auto_prepend_file = "/full/real/server/path/to/anti-hammer.php"
NOTE: You need to use the FULL, REAL path on the server. If your site is in /var/www/vhosts/mydomain.com/httpdocs/ then you need to add ALL that. Run a phpinfo(); command on your site to discover the path to your web site.
If that sounds too complex, or you just prefer better, more interesting methods, grab (and use) debug-report.zip, from here..
Once auto_prepend is in place, before any php file on your site is served to a client (web browser, spider, bot, any client), Anti-Hammer runs, interrogating the client's hammer status, and acting accordingly, either passing control directly back to the requested page, or halting the request in its tracks, with a terse warning.
To test all this, simply install Anti-Hammer and load your front page, refresh it repeatedly, over and over like bots do, quickly. Careful now! You will get banned!
(allowing certain known clients special privileges)
The big advantage of preventing bots (and people!) from clobbering your website and overloading your server, is that you have more resources freed up for valid clients..
If you want, you can choose to allow certain clients (usually known friendly spiders and bots) to bypass Anti-Hammer altogether, or alternatively, hammer at a faster rate. If you do, you will be utilizing exemptions.ini, which lives in the exemptions/ directory (along with the IP lists). It is a standard plain text .ini file containing a list of pairs of known User Agent strings and the text file in which to find their IP/Mask information.
Here's a slightly chopped-down example version..
[exemptions]
Mozilla/5.0 (compatible; Googlebot=google.txt
Googlebot=google.txt
gsa-crawler (Enterprise; S4-E9LJ2B82FJJAA=google.txt
msnbot=msn.txt
MSNBOT=msn.txt
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT; MS Search=msn.txt
Scooter/3.3Y!CrawlX=altavista.txt
Scooter=inktomi.txt
Yahoo=inktomi.txt
slurp=inktomi.txt
Excite=excite.txt
Infoseek=infoseek.txt
Lycos_Spider=lycos.txt
NorthernLight=northernlight.txt
Mozilla/2.0 (compatible; Ask=askjeeves.txt
teoma_agent1=askjeeves.txt
On the left (of the "=" sign), is the expected User Agent string. This can be a partial match, but it must match from the very first character of the client's user agent string. Ideally, you want to roll as many variations as possible into a single line, without being so generic as to pull in every client under the Sun and create needless processing overhead (certain Yahoo! and msn bots post only "Mozilla/4.0", for example. They can meet the Anti-Hammer like everyone else!), but still retain enough information to positively identify a particular client.
For example, the string "Yahoo" will match all the following bots:
Yahoo-Blogs/v3.9 (compatible; Mozilla 4.0; MSIE 5.5; http://help.yahoo.com/help/us/ysearch/crawling/crawling-02.html )
Yahoo-MMAudVid/1.0 (mms dash mmaudvidcrawler dash support at yahoo dash inc dot com)
Yahoo-MMCrawler/3.x (mms dash mmcrawler dash support at yahoo dash inc dot com)
YahooFeedSeeker/1.0 (compatible; Mozilla 4.0; MSIE 5.5; my.yahoo.com/s/publishers.html)
YahooSeeker-Testing/v3.9 (compatible; Mozilla 4.0; MSIE 5.5; http://search.yahoo.com/)
YahooSeeker/1.1 (compatible; Mozilla 4.0; MSIE 5.5; http://help.yahoo.com/help/us/shop/merchant/)
YahooSeeker/1.2 (compatible; Mozilla 4.0; MSIE 5.5; yahooseeker at yahoo-inc dot com ; http://help.yahoo.com/help/us/shop/merchant/)
YahooSeeker/CafeKelsa-dev (compatible; Konqueror/3.2; FreeBSD ;email@example.com ) (KHTML, like Gecko)
Similarly, many Googlebots are matched against the simple word, "Googlebot". If your user agent string is a tad generic, and matches against a client that isn't the expected bot, it's not a problem; Anti-Hammer won't find them in the specified IP list and continues as normal. It's designed this way to catch clients pretending to be known bots, of which there are a surprising number.
NOTE: User agent strings are checked in order, and ini file processing halts as soon as a match is found. Note the two "Scooter" entries; if the Yahoo! version were listed before the AltaVista version, the AltaVista bot would never be allowed an exemption, as Anti-Hammer would always be looking inside inktomi.txt for its IP information.
NOTE: Matches are CaSe SeNsITiVE! If you want to match "msnbot" and "MSNBOT", you need two entries. Why? Because in tests, a case-insensitive match is at least three times slower than a Case Sensitive match. So make a second entry!
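The matching rules above (match from the first character, case-sensitive, first match wins) can be sketched as follows. This is an illustration only; the function names are hypothetical, and the lines are split by hand on the last "=" because the user agent strings contain characters that a stock ini parser may not accept in a key.

```php
<?php
// Hypothetical sketch of ordered, case-sensitive prefix matching
// against exemptions.ini-style lines.
function parse_exemptions(string $ini): array
{
    $rules = [];
    foreach (preg_split('/\R/', $ini) as $line) {
        $line = trim($line);
        $pos  = strrpos($line, '=');    // file name follows the LAST "="
        if ($line === '' || $line[0] === '[' || $pos === false || $pos === 0) {
            continue;                   // blank, section header, or not a pair
        }
        $rules[] = [substr($line, 0, $pos), substr($line, $pos + 1)];
    }
    return $rules;                      // order is preserved
}

function find_ip_list(array $rules, string $userAgent): ?string
{
    foreach ($rules as [$prefix, $file]) {
        // strncmp() is case-sensitive and compares from the first character.
        if (strncmp($userAgent, $prefix, strlen($prefix)) === 0) {
            return $file;               // processing halts at the first match
        }
    }
    return null;                        // no exemption entry for this agent
}
```

Fed the two "Scooter" entries in the order shown in the example file, a user agent beginning "Scooter/3.3Y!CrawlX" lands on altavista.txt, while any other "Scooter" falls through to inktomi.txt.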
On the right, is the text file to look at for IP Mask information; where the specified user agent is expected to be making requests FROM. It's the standard Spider IP list format, one IP/Mask per line, as found here..
I've included the most recent lists in the Anti-Hammer zip package (and have started to add to and improve them with updated information), in place and ready-to-go, along with an exemptions.ini file already set up to handle the major friendly spiders.
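For the IP side of the check, here is one way it could work. The exact line format of the IP lists isn't shown here, so this sketch assumes each line is either a single IPv4 address or an address with a CIDR-style mask (a.b.c.d/bits), with "#" comment lines; that format, the function name `ip_in_list`, and a 64-bit PHP build are all assumptions.

```php
<?php
// Hypothetical sketch: is $ip covered by any line of an IP list?
// ASSUMED line format: "a.b.c.d" or "a.b.c.d/bits"; "#" starts a comment.
function ip_in_list(string $ip, array $lines): bool
{
    $addr = ip2long($ip);
    if ($addr === false) {
        return false;                   // not a valid IPv4 address
    }
    foreach ($lines as $line) {
        $line = trim($line);
        if ($line === '' || $line[0] === '#') {
            continue;
        }
        $parts = explode('/', $line, 2);
        $net   = ip2long($parts[0]);
        if ($net === false) {
            continue;                   // skip lines we can't read
        }
        $bits = 32;                     // no mask means one exact address
        if (isset($parts[1])) {
            if (!ctype_digit($parts[1]) || (int) $parts[1] > 32) {
                continue;
            }
            $bits = (int) $parts[1];
        }
        $mask = $bits === 0 ? 0 : (~0 << (32 - $bits)) & 0xFFFFFFFF;
        if (($addr & $mask) === ($net & $mask)) {
            return true;
        }
    }
    return false;
}
```

A client that matches a user agent entry but fails this check is simply treated like everyone else, which is how pretenders get caught.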
Remember, you don't need to add all the bots, or even any bots; only bots, spiders, and other clients that you wish to give special privileges to. Even they shouldn't be hammering, really!
If you wish to set a special rate for known clients, rather than allow them to simply bypass Anti-Hammer, all you do is switch the "true" in your allow_bots preference (which can be considered "infinite hammer_time") for an integer (aka. plain number) representing 1/100ths of a second, just like the regular hammer_time preference, e.g..
$anti_hammer['allow_bots'] = 50;
A value of 50 would enable two-hits-per-second spidering, but nothing faster, which is half the normal hammer_time of one second ($anti_hammer['hammer_time'] = 100;).
Effectively we have two available hammer rates; one for known good clients, and one for everyone else.
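Put as code, choosing between the two rates might look like this. It is a sketch of the logic only: the function name is made up, and treating a `false` allow_bots as "no exemptions" is an assumption.

```php
<?php
// Hypothetical sketch: pick the hammer time (in 1/100ths of a second)
// that applies to this client. null means "bypass Anti-Hammer entirely".
function effective_hammer_time(array $prefs, bool $isKnownBot): ?int
{
    // Everyone else, or exemptions switched off (assumed meaning of false).
    if (!$isKnownBot || $prefs['allow_bots'] === false) {
        return $prefs['hammer_time'];
    }
    if ($prefs['allow_bots'] === true) {
        return null;                    // known bot, full bypass
    }
    return (int) $prefs['allow_bots'];  // known bot, special rate
}

// The two preferences as they appear in the text above.
$anti_hammer = ['hammer_time' => 100, 'allow_bots' => 50];
$minGapSeconds = effective_hammer_time($anti_hammer, true) / 100;
```

With these example preferences a known spider may request a page every half second, while everyone else is held to one request per second.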
While I'm here I should add, there's also the facility to enable one correctly configured browser to bypass Anti-Hammer at all times. This is designed for busy webmasters who sometimes, in the course of their daily activities, will need to hammer their own site. I know I do!
This setting ("admin_agent_string"), along with many other settings, can be found in the preferences section inside anti-hammer.php. Essentially, you tag a unique string onto the end of your browser's User Agent string (perhaps with user-agent-switcher), so that Anti-Hammer can recognize you as you. It's not high-security, but it is handy. I've used a similar approach to avoid logging my own hits for years.
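A check of that sort is tiny. This sketch assumes the tag is matched at the very end of the User Agent string, which may or may not be how Anti-Hammer itself does it; the function name and the example tag are made up.

```php
<?php
// Hypothetical sketch: does this User Agent end with the admin's tag?
function is_admin_agent(string $userAgent, string $adminString): bool
{
    if ($adminString === '') {
        return false;                   // feature switched off
    }
    // Case-sensitive comparison against the tail of the string.
    return substr($userAgent, -strlen($adminString)) === $adminString;
}

// Example: a browser whose User Agent has had " its-me-1234" appended.
$isMe = is_admin_agent('Mozilla/5.0 (X11; Linux) its-me-1234', ' its-me-1234');
```

Anyone who learns the tag can use it too, which is why this is handy rather than secure.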
Not requiring that the client send back the ID potentially has one undesirable side-effect..
If two clients share the same IP (perhaps a proxy) and are using perfectly identical browsers (in every way, down to the user's locale), and are browsing your site at the exact same time, and view a page within one second of each other (or whatever you set the hammer_time to), it is possible that they may unwittingly increment each other's hammer count!
Clearly this would be a rare situation, but still, good to know.
Source Code & Download..
You can view the php source code here..
And download a ready-to-go zip package, right here..
If you want to show your appreciation, you can do that here..
If you have any problems at all, installing or using Anti-Hammer, PLEASE DO leave a comment below, or contact me some other way, let me know about it, so I can fix it, ta.