Setup
Anti-Hammer
This page will (hopefully!) tell you everything you need to know to set up Anti-Hammer protection on your web site. It's usually straightforward.
If you need help with any aspect of the setup, I am an email away.
Quick-Start Guide:
Ensure your server is running at least PHP 5.1
Unzip the Anti-Hammer package
And drop the anti-hammer directory (keeping its contents together) somewhere into your site, maybe inside /includes/ or /inc/ or something like that, though the root is just fine, too.
Make the Anti-Hammer directories writable
If you run php as a cgi/*suexec, you can probably get off with doing nothing, so long as the directory is owned by your user account.
When running php as an Apache module, the easiest method is probably via ftp; simply set the permissions to world-writable (777). Or else, in a shell..
chmod -R 777 /path/to/anti-hammer/lists
chmod -R 777 /path/to/anti-hammer/sessions
NOTE: There is nothing inherently insecure about having a writeable directory, even a world-writeable directory. Anyone who tells you this is, by itself, a security issue on a modern web server, is deluded.
There are dozens of world-writeable directories here at corz.org, and there have been for many years (I even have a public upload facility!). If this was an issue, the onslaught of "hacking" attempts that followed the erroneous mention of corz.org in the Moroccan national press (I'm talking thousands of attempts per day) would have been a total disaster. As it happened, the site did not blink.
Also note: There is no lists/ directory in the FREE version.
Set your Anti-Hammer preferences
That's inside anti-hammer.php, in a decent text editor, by which I mean with syntax highlighting, like these are.
Set up php auto_prepend
Anti-Hammer needs to run as a php "auto-prepend", so it runs before your pages do. To achieve this magic, add the following command to your site's main (root) .htaccess file..
php_value auto_prepend_file "/full/real/server/path/to/anti-hammer.php"
..replacing the path with the actual path, of course.
If php runs as cgi/*suexec/FastCGI on your site (or if the .htaccess method brings up a 500 error!), or you have global control, do this in your site's global/local php.ini (or .user.ini file in a per-site configuration), instead ..
auto_prepend_file = "/full/real/server/path/to/anti-hammer.php"
If you don't have a php.ini (or .user.ini), simply create one!
NOTE: You usually need to use the FULL, REAL path on the server*. If your site is in /var/www/vhosts/mydomain.com/httpdocs/ then you need to add ALL that. Run a phpinfo(); command on your site to discover the path to your web site (aka. "DOCUMENT_ROOT").
* Some servers won't mind if you use a local path, e.g. "./path/to/anti-hammer.php", but as they say, YMMV.
If that sounds too complex, or you just prefer better, more interesting methods, grab (and use) debug-report.zip, from here..
You're done!
Once the auto_prepend is in place, before any php file on your site is served to a client (web browser, spider, bot, any client), Anti-Hammer runs, interrogating the client's hammer status, and acting accordingly, either passing control directly back to the requested page, or halting the request in its tracks, with a terse warning.
To test all this, simply install Anti-Hammer and load your front page, then refresh it repeatedly, over and over like bots do, quickly. Careful now! You will get banned!
Anti-Hammer also comes with a handy hammer-test page you can use to check everything is working as expected.
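If you'd rather check the basics before you start hammering, the Quick-Start requirements above can be sketched as a small pre-flight script. This is only a sketch, not part of the Anti-Hammer package: the function name anti_hammer_preflight() and the install path are made up for illustration, and the sessions/ and lists/ directory names are taken from the chmod lines above.

```php
<?php
// Hypothetical pre-flight check; NOT part of Anti-Hammer itself.
// $antiHammerDir is an assumption: point it at your own install.

function anti_hammer_preflight($antiHammerDir)
{
    $report = array();

    // The Quick-Start asks for PHP 5.1 or later.
    $report['php_version_ok'] = version_compare(PHP_VERSION, '5.1.0', '>=');

    // Working directories must be writable by PHP.
    // A directory that does not exist is skipped (lists/ is absent in the FREE version).
    foreach (array('sessions', 'lists') as $sub) {
        $path = rtrim($antiHammerDir, '/\\') . DIRECTORY_SEPARATOR . $sub;
        if (is_dir($path)) {
            $report[$sub . '_writable'] = is_writable($path);
        }
    }

    // The prepend file PHP is actually configured to load (empty if none).
    $report['auto_prepend_file'] = (string) ini_get('auto_prepend_file');

    // The full, real path of the web root, handy for the auto_prepend setting.
    $report['document_root'] = isset($_SERVER['DOCUMENT_ROOT']) ? $_SERVER['DOCUMENT_ROOT'] : '';

    return $report;
}

// Usage (uncomment and load the page in a browser):
// print_r(anti_hammer_preflight('/path/to/anti-hammer'));
```

If auto_prepend_file comes back empty when you load this through the web server, the .htaccess or php.ini setting has not taken effect.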
exemptions.ini
(allowing certain known clients special privileges)
The big advantage of preventing bots (and people!) from clobbering your website and overloading your server, is that you have more resources freed up for valid clients..
If you want, you can choose to allow certain clients (usually known friendly spiders and bots) to bypass Anti-Hammer altogether, or alternatively, hammer at a faster rate. If you do, you will be utilizing exemptions.ini.
exemptions.ini, which lives in the exemptions/ directory (along with the IP lists), is a standard plain text .ini file containing a list of pairs of known User Agent strings and the text file in which to find their IP/Mask information.
Here's a slightly chopped-down example version..
[exemptions]
Mozilla/5.0 (compatible; Googlebot=google.txt
Googlebot=google.txt
gsa-crawler (Enterprise; S4-E9LJ2B82FJJAA=google.txt
msnbot=msn.txt
MSNBOT=msn.txt
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT; MS Search=msn.txt
Scooter/3.3Y!CrawlX=altavista.txt
Scooter=inktomi.txt
Yahoo=inktomi.txt
slurp=inktomi.txt
Excite=excite.txt
Infoseek=infoseek.txt
Lycos_Spider=lycos.txt
NorthernLight=northernlight.txt
Mozilla/2.0 (compatible; Ask=askjeeves.txt
teoma_agent1=askjeeves.txt
How exemptions.ini works
On the left (of the "=" sign), is the expected User Agent string. This can be a partial match, but it must match from the very first character of the client's user agent string. Ideally, you want to roll as many variations as possible into a single line, without being so generic as to pull in every client under the Sun and create needless processing overhead (certain Yahoo! and msn bots post only "Mozilla/4.0", for example. They can meet the Anti-Hammer like everyone else!), but still retain enough information to positively identify a particular client.
For example, the string "Yahoo" will match all the following bots:
Yahoo! Mindset
Yahoo-Blogs/v3.9 (compatible; Mozilla 4.0; MSIE 5.5; http://help.yahoo.com/help/us/ysearch/crawling/crawling-02.html )
Yahoo-MMAudVid/1.0 (mms dash mmaudvidcrawler dash support at yahoo dash inc dot com)
Yahoo-MMCrawler/3.x (mms dash mmcrawler dash support at yahoo dash inc dot com)
YahooFeedSeeker/1.0 (compatible; Mozilla 4.0; MSIE 5.5; my.yahoo.com/s/publishers.html)
YahooSeeker-Testing/v3.9 (compatible; Mozilla 4.0; MSIE 5.5; http://search.yahoo.com/)
YahooSeeker/1.1 (compatible; Mozilla 4.0; MSIE 5.5; http://help.yahoo.com/help/us/shop/merchant/)
YahooSeeker/1.2 (compatible; Mozilla 4.0; MSIE 5.5; yahooseeker at yahoo-inc dot com ; http://help.yahoo.com/help/us/shop/merchant/)
YahooSeeker/CafeKelsa-dev (compatible; Konqueror/3.2; FreeBSD ;cafekelsa-dev-webmaster@yahoo-inc.com ) (KHTML, like Gecko)
YahooVideoSearch www.yahoo.com/
YahooYSMcm/2.0.0
Similarly, many Googlebots are matched against the simple word, "Googlebot". If your user agent string is a tad generic, and matches against a client that isn't the expected bot, it's not a problem; Anti-Hammer won't find them in the specified IP list and continues as normal. It's designed this way to catch clients pretending to be known bots, of which there are a surprising number.
NOTE: User agent strings are checked in order, and ini file processing halts as soon as a match is found. Note the two "Scooter" entries; if the Yahoo! version was before the AltaVista version, the AltaVista bot would never be allowed an exemption, as Anti-Hammer would always be looking inside inktomi.txt for its IP information.
NOTE: Matches are CaSe SeNsITiVE! If you want to match "msnbot" and "MSNBOT", you need two entries. Why? Because in tests, a case-insensitive match is at least three times slower than a Case Sensitive match. So make a second entry!
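The left-hand matching rules described so far (match from the first character, case-sensitive, first match wins) can be sketched roughly like this. It is an illustration only, not Anti-Hammer's own code: find_exemption_list() is a made-up helper, and a plain PHP array stands in for the parsed ini file, with entries kept in file order.

```php
<?php
// Sketch only: find_exemption_list() is a hypothetical helper, NOT Anti-Hammer's code.
// $exemptions maps a User Agent prefix to its IP list file, in file order.

function find_exemption_list($userAgent, $exemptions)
{
    foreach ($exemptions as $prefix => $listFile) {
        // Case-sensitive comparison, anchored at the very first character.
        if (strncmp($userAgent, $prefix, strlen($prefix)) === 0) {
            return $listFile; // first match wins; later entries are never checked
        }
    }
    return null; // no exemption entry for this client
}

// A few entries from the example above, in the same order.
$exemptions = array(
    'Scooter/3.3Y!CrawlX' => 'altavista.txt',
    'Scooter'             => 'inktomi.txt',
    'Yahoo'               => 'inktomi.txt',
    'msnbot'              => 'msn.txt',
    'MSNBOT'              => 'msn.txt',
);

var_dump(find_exemption_list('Scooter/3.3Y!CrawlX', $exemptions)); // string(13) "altavista.txt"
var_dump(find_exemption_list('Scooter/3.3', $exemptions));         // string(11) "inktomi.txt"
var_dump(find_exemption_list('Msnbot/1.0', $exemptions));          // NULL (case differs)
```

Swap the two "Scooter" entries and the first call returns inktomi.txt instead, which is exactly the ordering trap described in the note above.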
On the right, is the text file to look at for IP Mask information; where the specified user agent is expected to be making requests FROM. It's the standard Spider IP list format, one IP/Mask per line, as found here..
http://www.iplists.com/
http://www.iplists.com/nw/ <- updated, reorganised, with msnbot & more.
A blog URI is listed on that page, where updates are posted (maybe two or three times a year).
I've included the most recent lists in the Anti-Hammer zip package (and have started to add to and improve them with updated information), in place and ready-to-go, along with an exemptions.ini file already set up to handle the major friendly spiders.
Remember, you don't need to add all the bots, or even any bots; only bots, spiders, and other clients that you wish to give special privileges to. Even they shouldn't be hammering, really!
If you wish to set a special rate for known clients, rather than allow them to simply bypass Anti-Hammer, all you do is switch the "true" in your allow_bots preference (which can be considered "infinite hammer_time"), for an integer (aka. plain number) representing hundredths of a second, just like the regular hammer_time preference, e.g..
$anti_hammer['allow_bots'] = 50;
A value of 50 would enable two-hits-per-second spidering, but nothing faster, which is half the normal hammer_time of one second ($anti_hammer['hammer_time'] = 100;).
Effectively we have two available hammer rates; one for known good clients, and one for everyone else.
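To make the arithmetic concrete, here is a rough sketch of how the two rates compare for the same pair of requests. The is_hammering() function and its parameters are invented for illustration and are not how Anti-Hammer is necessarily written; the only things taken from the text are that both preferences are in hundredths of a second, and that allow_bots set to true means no limit at all.

```php
<?php
// Sketch only: is_hammering() is a hypothetical name, NOT Anti-Hammer's code.
// $limit is in hundredths of a second, like hammer_time and allow_bots.

function is_hammering($lastHit, $now, $limit)
{
    // allow_bots = true means "bypass altogether" (an infinite hammer rate).
    if ($limit === true) {
        return false;
    }
    // Convert hundredths of a second to seconds before comparing.
    return ($now - $lastHit) < ($limit / 100);
}

$anti_hammer = array('hammer_time' => 100, 'allow_bots' => 50);

// Two requests arriving 0.6 seconds apart (timestamps as from microtime(true)).
$lastHit = 1000.0;
$now     = 1000.6;

var_dump(is_hammering($lastHit, $now, $anti_hammer['hammer_time'])); // bool(true): inside the 1 second window
var_dump(is_hammering($lastHit, $now, $anti_hammer['allow_bots']));  // bool(false): outside the 0.5 second window
```

So an ordinary client making that second request gets counted as hammering, while an exempted bot at the 50 rate does not.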
I, Admin.
While I'm here I should add, there's also the facility to enable one correctly configured browser to bypass Anti-Hammer at all times. This is designed for busy webmasters who sometimes, in the course of their daily activities, will need to hammer their own site. I know I do!
This setting ("admin_agent_string"), along with many other settings, can be found in the preferences section inside anti-hammer.php. Essentially, you tag a unique string onto the end of your browser's User Agent string (perhaps with user-agent-switcher), so that Anti-Hammer can recognize you as you. It's not high-security, but it is handy. I've used a similar approach to avoid logging my own hits for years.
Caveats:
One-Way Sessions..
Not requiring that the client send back the ID potentially has one undesirable side-effect..
If two clients share the same IP (perhaps a proxy) and are using perfectly identical browsers (in every way, down to the user's locale), and are browsing your site at the exact same time, and view a page within one second of each other (or whatever you set the hammer_time to), it is possible that they may unwittingly increment each other's hammer count!
Clearly this would be a rare situation, but still, good to know.
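Here is a rough illustration of why that collision can happen. It assumes the client ID is derived purely from what the server can see in the request (IP, User Agent and language headers); the client_fingerprint() function, the choice of fields and the use of md5() are all guesses made for the example, not a description of Anti-Hammer's actual internals.

```php
<?php
// Sketch only: client_fingerprint() is a hypothetical stand-in for however
// Anti-Hammer really identifies clients; the fields and md5() are assumptions.

function client_fingerprint($server)
{
    $parts = array(
        isset($server['REMOTE_ADDR']) ? $server['REMOTE_ADDR'] : '',
        isset($server['HTTP_USER_AGENT']) ? $server['HTTP_USER_AGENT'] : '',
        isset($server['HTTP_ACCEPT_LANGUAGE']) ? $server['HTTP_ACCEPT_LANGUAGE'] : '',
    );
    // Nothing here comes back from the client as a cookie or session ID.
    return md5(implode('|', $parts));
}

// Made-up request data for three visitors behind the same proxy IP.
$alice = array(
    'REMOTE_ADDR'          => '203.0.113.7',
    'HTTP_USER_AGENT'      => 'ExampleBrowser/1.0',
    'HTTP_ACCEPT_LANGUAGE' => 'en-GB',
);
$bob   = $alice; // same IP, same browser, same locale
$carol = array_merge($alice, array('HTTP_ACCEPT_LANGUAGE' => 'fr-FR'));

var_dump(client_fingerprint($alice) === client_fingerprint($bob));   // bool(true): they share a hammer count
var_dump(client_fingerprint($alice) === client_fingerprint($carol)); // bool(false): any difference separates them
```

Any difference at all in what the two browsers send is enough to tell them apart, which is why the collision needs such an unlikely set of coincidences.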
Upgrading from Free to Pro
If you are using a recent version of Anti-Hammer FREE (0.9.3+), it's a simple drop-in replacement.
You will need to copy over your preferences from the old version, which should only take a minute or two.
If you are using an older version of Anti-Hammer FREE, you will need to check your sessions path preference, to ensure it is pointing to the correct directory. Everything else should work as expected (once you copy over your prefs).
Feedback
If you have a question, feel free to leave a comment, below. I don't expect it to get too busy; Anti-Hammer usually just works. If you think you have found a bug, please mail me about it, with full details, preferably attaching your script to the mail. Thanks!
Welcome to the comments facility!
Thanks
Tweaked my own prefs in, installed and everything works fine :-)
Lately I've been pinged a BIG time with whatnots and this kills practically all of it.
I'm a VERY happy chap :-D
A question - how big the .ht_hammer can grow? Before there are some effects, that is...
Great "add-on" even to a Joomla site, I suppose it's ok to direct people here with a link (?). I'm sure quite a few people could use this one.
A HUGE thank you, mate.
If it doesn't work for you...
And your error logs don't show anything (you have enabled error logging, right?), try using php.ini rather than .htaccess and check the logs again. Also make sure the anti-hammer directory is writeable..
Only about one other person in the world visits my site regularly, but this'll keep 'em in line for sure. Thanks cor, this is awesome
This is exactly what I'm looking for the proxylist.co proxy list site.
Many thanks!
I got anti-hammer to work just fine. Thanks for that! But I am uncomfortable setting permissions in the directory it is in to 777, which is the only way the code will run.
I therefore tried to put the directory above the root, but that didn't work.
Having a directory below the root that is set to 777 is risky. I wonder if there is a way to avoid this situation.
My PHP runs as DSO API.
ps. I'd love you to explain exactly how having a directory below the root set to 777 is risky. ;o)
Great script, but I have a slight issue with it. When the site pushes out a temporary redirect header (e.g. to proceed after submitting a form, or for example, to jump out to the admin login form), it appears to trigger immediately.
Any quick way to resolve this?
EDIT : (it just did it when submitting this comment form on your site too)
Same problem as others have reported above: php_value auto_prepend_file simply doesn't seem to work here, despite htaccess working in general.
I think we all should donate at least $1 (more if you can)
I have been coming to this site for about a year now for the best .htaccess tutorial on the web - period. Tonight I was doing the same when I came across the anti-hammer link. I have just downloaded it, so I can only anticipate the benefits to my Easy Profit Bot Review Website,
but I already owe you a debt of gratitude for the countless hours of frustration that I did NOT have to suffer once I found your .htaccess tutorial.
I am donating $1 right now. I think everyone that sees this comment should also, more if you can spare the change!
We all know that this site is worth much, much more. Collectively, lets show our appreciation thru donation, so this site can continue to provide the priceless value that it has to me, to you and to everyone of our visitors that had an enriched user experience because of the tips and hints we found on corz.org.
I guess, what I mean is, "Thanks! Here's a hundred pennies for your thoughts!"
My host deactivated my website multiple times due to spam bots. This script saved it! Very powerful and efficient. Thanks
Does the script allow Bing bots?
I have an error in my file
Warning: session_start() [function.session-start]: Cannot send session cache limiter - headers already sent (output started at X:*****\******\anti-hammer\anti-hammer.php:1) in X:*****\******\index.php on line 2
please tell me how to fix this error?
You also might want to consider using output buffering (ob_start();) at the beginning of your scripts. ;o)
I've gotten this to work with WordPress, but I'm having a problem getting it to work with Joomla. Does anyone know of any settings that need to be adjusted for this to work with Joomla? Any settings with the anti-hammer.php file?
[edit]I just installed anti-hammer at my son's Joomla site, works great.
As for yours, if something isn't working your php error log should be your first port of call.[/edit]
;o)