This page will (hopefully!) tell you everything you need to know to set up Anti-Hammer protection on your web site. It's usually straightforward.
If you need help with any aspect of the setup, I am an email away.
Ensure your server is running at least PHP5.1
Unzip the Anti-Hammer package
And drop the anti-hammer directory into your site somewhere, together, maybe inside /inc/ or something like that, though the root is just fine, too.
Make the Anti-Hammer directories writable
If you run php as a cgi/*suexec, you can probably get off with doing nothing, so long as the directory is owned by your user account.
When running php as an Apache module, the easiest method is probably via ftp; simply set the permissions to world-writable (777). Or else in a shell..
chmod -R 777 /path/to/anti-hammer/lists
chmod -R 777 /path/to/anti-hammer/sessions
NOTE: There is nothing inherently insecure about having a writeable directory, even a world-writeable directory. Anyone who tells you this is, by itself, a security issue on a modern web server, is deluded.
There are dozens of world-writeable directories here at corz.org, and there have been for many years (I even have a public upload facility!). If this was an issue, the onslaught of "hacking" attempts that followed the erroneous mention of corz.org in the Moroccan national press (I'm talking thousands of attempts per day) would have been a total disaster. As it happened, the site did not blink.
Also note: There is no lists/ directory in the FREE version.
Set your Anti-Hammer preferences
That's inside anti-hammer.php, in a decent text editor, by which I mean with syntax highlighting, like these are.
Setup php auto_prepend
Anti-Hammer needs to run as a php "auto-prepend", so it runs before your pages do. To achieve this magic, add the following command to your site's main (root) .htaccess file..
php_value auto_prepend_file "/full/real/server/path/to/anti-hammer.php"
..replacing the path with the actual path, of course.
If php runs as cgi/*suexec/FastCGI on your site (or if the .htaccess method brings up a 500 error!), or you have global control, do this in your site's global/local php.ini (or .user.ini file in a per-site configuration), instead ..
auto_prepend_file = "/full/real/server/path/to/anti-hammer.php"
If you don't have a php.ini (or .user.ini), simply create one!
NOTE: You usually need to use the FULL, REAL path on the server*. If your site is in /var/www/vhosts/mydomain.com/httpdocs/ then you need to add ALL that. Run a phpinfo(); command on your site to discover the path to your web site.
* Some servers won't mind if you use a local path, e.g. "./path/to/anti-hammer.php", but as they say, YMMV.
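If you would rather not wade through the whole phpinfo() output, a tiny throwaway page can print just the bits you need. This is only a sketch: the helper name anti_hammer_full_path() and the default 'anti-hammer' sub-directory are assumptions made for illustration, so swap in wherever you actually dropped the directory. Delete the page when you are done.

```php
<?php
// Sketch only: print the values that help you work out the FULL, REAL path.
// The helper name and the 'anti-hammer' sub-directory are assumptions;
// use whatever location you actually dropped the directory into.

/**
 * Build the absolute path to anti-hammer.php from a document root
 * and the (assumed) sub-directory that holds the script.
 */
function anti_hammer_full_path($doc_root, $sub_dir = 'anti-hammer')
{
    $base = rtrim($doc_root, '/');
    $sub  = trim($sub_dir, '/');
    return $base . ($sub === '' ? '' : '/' . $sub) . '/anti-hammer.php';
}

$doc_root = isset($_SERVER['DOCUMENT_ROOT']) ? $_SERVER['DOCUMENT_ROOT'] : '';

echo 'Document root:       ', $doc_root, "\n";
echo 'Suggested full path: ', anti_hammer_full_path($doc_root), "\n";
// An empty value here means no auto_prepend_file is active for this request.
echo 'Active auto_prepend: ', ini_get('auto_prepend_file'), "\n";
```

Once the prepend is working, the last line should show the same path you put in your .htaccess or ini file.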
If that sounds too complex, or you just prefer better, more interesting methods, grab (and use) debug-report.zip, from here..
Once auto_prepend is in place, before any php file on your site is served to a client (web browser, spider, bot, any client), Anti-Hammer runs, interrogating the client's hammer status, and acting accordingly, either passing control directly back to the requested page, or halting the request in its tracks, with a terse warning.
To test all this, simply install Anti-Hammer and load your front page, refresh it repeatedly, over and over like bots do, quickly. Careful now! You will get banned!
Anti-Hammer also comes with a handy hammer-test page you can use to check everything is working as expected.
(allowing certain known clients special privileges)
The big advantage of preventing bots (and people!) from clobbering your website and overloading your server is that you have more resources freed up for valid clients..
If you want, you can choose to allow certain clients (usually known friendly spiders and bots) to bypass Anti-Hammer altogether, or alternatively, hammer at a faster rate. If you do, you will be utilizing exemptions.ini. This file, which lives in the exemptions/ directory (along with the IP lists), is a standard plain text .ini file containing a list of pairs of known User Agent strings and the text file in which to find their IP/Mask information.
Here's a slightly chopped-down example version..
[exemptions]
Mozilla/5.0 (compatible; Googlebot=google.txt
Googlebot=google.txt
gsa-crawler (Enterprise; S4-E9LJ2B82FJJAA=google.txt
msnbot=msn.txt
MSNBOT=msn.txt
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT; MS Search=msn.txt
Scooter/3.3Y!CrawlX=altavista.txt
Scooter=inktomi.txt
Yahoo=inktomi.txt
slurp=inktomi.txt
Excite=excite.txt
Infoseek=infoseek.txt
Lycos_Spider=lycos.txt
NorthernLight=northernlight.txt
Mozilla/2.0 (compatible; Ask=askjeeves.txt
teoma_agent1=askjeeves.txt
On the left (of the "=" sign), is the expected User Agent string. This can be a partial match, but it must match from the very first character of the client's user agent string. Ideally, you want to roll as many variations as possible into a single line, without being so generic as to pull in every client under the Sun and create needless processing overhead (certain Yahoo! and msn bots post only "Mozilla/4.0", for example. They can meet the Anti-Hammer like everyone else!), but still retain enough information to positively identify a particular client.
For example, the string "Yahoo" will match all the following bots:
Yahoo-Blogs/v3.9 (compatible; Mozilla 4.0; MSIE 5.5; http://help.yahoo.com/help/us/ysearch/crawling/crawling-02.html )
Yahoo-MMAudVid/1.0 (mms dash mmaudvidcrawler dash support at yahoo dash inc dot com)
Yahoo-MMCrawler/3.x (mms dash mmcrawler dash support at yahoo dash inc dot com)
YahooFeedSeeker/1.0 (compatible; Mozilla 4.0; MSIE 5.5; my.yahoo.com/s/publishers.html)
YahooSeeker-Testing/v3.9 (compatible; Mozilla 4.0; MSIE 5.5; http://search.yahoo.com/)
YahooSeeker/1.1 (compatible; Mozilla 4.0; MSIE 5.5; http://help.yahoo.com/help/us/shop/merchant/)
YahooSeeker/1.2 (compatible; Mozilla 4.0; MSIE 5.5; yahooseeker at yahoo-inc dot com ; http://help.yahoo.com/help/us/shop/merchant/)
YahooSeeker/CafeKelsa-dev (compatible; Konqueror/3.2; FreeBSD ;email@example.com ) (KHTML, like Gecko)
Similarly, many Googlebots are matched against the simple word, "Googlebot". If your user agent string is a tad generic, and matches against a client that isn't the expected bot, it's not a problem; Anti-Hammer won't find them in the specified IP list and continues as normal. It's designed this way to catch clients pretending to be known bots, of which there are a surprising number.
NOTE: User agent strings are checked in order, and ini file processing halts as soon as a match is found. Note the two "Scooter" entries; if the Yahoo! version was before the AltaVista version, the AltaVista bot would never be allowed an exemption, as Anti-Hammer would always be looking inside inktomi.txt for its IP information.
NOTE: Matches are CaSe SeNsITiVE! If you want to match "msnbot" and "MSNBOT", you need two entries. Why? Because in tests, a case-insensitive match is at least three times slower than a Case Sensitive match. So make a second entry!
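The three matching rules above (match from the first character, case-sensitive, first match wins) can be sketched in a few lines of php. To be clear, this is not Anti-Hammer's actual code; the function name find_ip_list() and the plain array standing in for exemptions.ini are assumptions made purely to illustrate the behaviour.

```php
<?php
// Sketch of the matching rules described above; NOT Anti-Hammer's own code.
// find_ip_list() and the array layout are assumptions for illustration.

/**
 * Return the IP list file of the first entry whose key matches the START
 * of the user agent string (case-sensitive), or null if nothing matches.
 */
function find_ip_list(array $exemptions, $user_agent)
{
    foreach ($exemptions as $ua_prefix => $list_file) {
        // strncmp() is case-sensitive and compares from the first character.
        if (strncmp($user_agent, $ua_prefix, strlen($ua_prefix)) === 0) {
            return $list_file; // first match wins; later entries are skipped
        }
    }
    return null;
}

$exemptions = array(
    'Scooter/3.3Y!CrawlX' => 'altavista.txt', // the specific entry goes first
    'Scooter'             => 'inktomi.txt',
    'msnbot'              => 'msn.txt',
    'MSNBOT'              => 'msn.txt',       // second entry for the other case
);

var_dump(find_ip_list($exemptions, 'Scooter/3.3Y!CrawlX (compatible)'));
var_dump(find_ip_list($exemptions, 'Scooter/1.0'));
var_dump(find_ip_list($exemptions, 'Mozilla/5.0 Scooter'));
```

The first call finds altavista.txt, the second falls through to inktomi.txt, and the third finds nothing at all, because "Scooter" is not at the very start of the string. Swap the two Scooter entries around and the AltaVista bot would always land in inktomi.txt.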
On the right, is the text file to look at for IP Mask information; where the specified user agent is expected to be making requests FROM. It's the standard Spider IP list format, one IP/Mask per line, as found here..
http://www.iplists.com/
http://www.iplists.com/nw/ <- updated, reorganised, with msnbot & more.
A blog URI is listed on that page, where updates are posted (maybe two or three times a year).
I've included the most recent lists in the Anti-Hammer zip package (and have started to add to and improve them with updated information), in place and ready-to-go, along with an exemptions.ini file already set up to handle the major friendly spiders.
Remember, you don't need to add all the bots, or even any bots; only bots, spiders, and other clients that you wish to give special privileges to. Even they shouldn't be hammering, really!
If you wish to set a special rate for known clients, rather than allow them to simply bypass Anti-Hammer, all you do is switch the "true" in your allow_bots preference (which can be considered "infinite hammer_time"), for an integer (aka. plain number) representing 1/100ths of a second, just like the regular hammer_time preference..
$anti_hammer['allow_bots'] = 50;
A value of 50 would enable two-hits-per-second spidering, but nothing faster, which is half the normal hammer_time of one second ($anti_hammer['hammer_time'] = 100;).
Effectively we have two available hammer rates; one for known good clients, and one for everyone else.
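To make the arithmetic concrete, here is a rough sketch of how the two rates might be picked and converted into hits per second. Only the two preference keys come from the text above; the function names, and the idea that anything other than true or an integer in allow_bots means "treat bots like everyone else", are assumptions.

```php
<?php
// Sketch only: how the two rates could be chosen. The function names are
// hypothetical; only the two preference keys come from the text above.

/**
 * Return the minimum gap between hits in 1/100ths of a second,
 * or null when the client bypasses the check altogether.
 */
function effective_hammer_time(array $prefs, $is_known_bot)
{
    if ($is_known_bot && $prefs['allow_bots'] === true) {
        return null;                      // known client, full bypass
    }
    if ($is_known_bot && is_int($prefs['allow_bots'])) {
        return $prefs['allow_bots'];      // special rate for known clients
    }
    return $prefs['hammer_time'];         // everyone else (assumed fallback)
}

/** Convert 1/100ths of a second between hits into hits per second. */
function hits_per_second($hundredths)
{
    return 100 / $hundredths;
}

$anti_hammer = array('hammer_time' => 100, 'allow_bots' => 50);

echo 'Known bots: ', hits_per_second(effective_hammer_time($anti_hammer, true)), " per second\n";
echo 'Others:     ', hits_per_second(effective_hammer_time($anti_hammer, false)), " per second\n";
```

With the values shown, known bots get two hits per second and everyone else gets one.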
While I'm here I should add, there's also the facility to enable one correctly configured browser to bypass Anti-Hammer at all times. This is designed for busy webmasters who sometimes, in the course of their daily activities, will need to hammer their own site. I know I do!
This setting ("admin_agent_string"), along with many other settings, can be found in the preferences section inside anti-hammer.php. Essentially, you tag a unique string onto the end of your browser's User Agent string (perhaps with user-agent-switcher), so that Anti-Hammer can recognize you as you. It's not high-security, but it is handy. I've used a similar approach to avoid logging my own hits for years.
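The idea is simple enough to sketch. Note the assumptions: is_admin_agent() is a made-up helper, the tag value is invented, and checking only the END of the user agent string is a guess based on the description above, not a statement of how Anti-Hammer actually does the comparison.

```php
<?php
// Sketch only; is_admin_agent() is a hypothetical helper and the tag below
// is made up. Only the admin_agent_string preference name is from the text.

/**
 * True when the user agent ENDS with the configured tag (case-sensitive).
 */
function is_admin_agent($user_agent, $admin_agent_string)
{
    if ($admin_agent_string === '') {
        return false; // no tag configured, so nobody gets the bypass
    }
    $tail = substr($user_agent, -strlen($admin_agent_string));
    return $tail === $admin_agent_string;
}

$tag = 'my-secret-tag'; // invented value, pick your own

var_dump(is_admin_agent('Mozilla/5.0 (X11; Linux) Firefox ' . $tag, $tag));
var_dump(is_admin_agent('Mozilla/5.0 (X11; Linux) Firefox', $tag));
```

Anyone who learns the tag can use it, of course, which is why this counts as handy rather than secure.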
Not requiring that the client send back the ID potentially has one undesirable side-effect..
If two clients share the same IP (perhaps a proxy) and are using perfectly identical browsers (in every way, down to the user's locale), and are browsing your site at the exact same time, and view a page within one second of each other (or whatever you set the hammer_time to), it is possible that they may unwittingly increment each other's hammer count!
Clearly this would be a rare situation, but still, good to know.
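Here is a rough illustration of why that can happen when the ID is worked out from request details rather than sent back by the client. Exactly which details go into the ID is an assumption here (IP, user agent and language), and client_signature() is an invented name; the point is only that identical inputs must give an identical ID.

```php
<?php
// Sketch only: which request details go into the ID is an assumption
// (IP, user agent and language here); client_signature() is a made-up name.

function client_signature($ip, $user_agent, $accept_language)
{
    // md5() simply squashes the joined details into one fixed-length string.
    return md5($ip . '|' . $user_agent . '|' . $accept_language);
}

// Two different people behind the same proxy, with identical browsers..
$a = client_signature('203.0.113.7', 'Mozilla/5.0 (X11; Linux)', 'en-GB');
$b = client_signature('203.0.113.7', 'Mozilla/5.0 (X11; Linux)', 'en-GB');
// ..and a third whose browser differs only in its language setting.
$c = client_signature('203.0.113.7', 'Mozilla/5.0 (X11; Linux)', 'fr-FR');

var_dump($a === $b); // same ID, so they would share one hammer count
var_dump($a === $c); // any difference at all gives a separate ID
```

The first two clients are indistinguishable, so they end up sharing a count; the smallest difference in what the browser sends is enough to separate them.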
Upgrading from Free to Pro
If you are using a recent version of Anti-Hammer FREE (0.9.3+), it's a simple drop-in replacement.
You will need to copy over your preferences from the old version, which should only take a minute or two.
If you are using an older version of Anti-Hammer FREE, you will need to check your sessions path preference, to ensure it is pointing to the correct directory. Everything else should work as expected (once you copy over your prefs).
If you have a question, feel free to leave a comment, below. I don't expect it to get too busy; Anti-Hammer usually just works. If you think you have found a bug, please mail me about it, with full details, preferably attaching your script to the mail. Thanks!
Welcome to the comments facility!
Since php_value auto_prepend_file is not an option for us, i would like to ask if this script could be turned in a function(s) and called with an available "hook" that most php applications have.
Just get in touch. ;o)
Tried to run this as include in my index.php.
No errors but the blocking doesn't work well. Blocks right away in some pages (login or register for example).
Assuming you are saying you cannot run Anti-Hammer as an auto_prepend (why? you didn't say), you could try assigning a specific file extension to files you want to run with Apache as a module, something like (in .htaccess)..
AddType application/x-httpd-php .phpx
AddType php-cgi .php
Or override regular php files in a specific directory..
AddType application/x-httpd-php .php
>Assuming you say you cannot run Anti-Hammer as a auto_prepend (why? you didn't say),. you could try assigning a specific file extension to files you want to run with Apache as a module, something like (in .htaccess)..
>AddType application/x-httpd-php .phpx
>AddType php-cgi .php
>Or override regular php files in a specific directory..
>AddType application/x-httpd-php .php
My hoster doesnt allow it (HostGator).
I don't understand the two alternatives you say ? Could you explain a bit more ?
Hi! thanks for your script... I installed it and uses php.ini
; Automatically add files before any PHP document.
When I tested it with my ELGG open source software based site ... when I go to mysitenameur.com (site mentioned here is not the real site name) and hit <5> the anti-hammer works however when I navigate to other pages on the site like.... mysitenameur.com/blog/all the anti-hammer does not work.
When I came to your site and tried your extension-less url, https://corz.org/serv/tools/anti-hammer/ and hit refresh several times, the anti hammer seems to work... any idea how to use anti-hammer with extension-less urls or files?
It sounds like you have more .htaccess files inside /blog/, overriding your main .htaccess. You may want to add the anti-hammer command to that file, too.
Finally, I find a site with some useful scripts and great, easy-to-understand .htaccess info.
I find this especially useful in that I have my own site and domain for it. I do all of my own "webmastering", and this makes my "headaches" in web-administration a whole lot simpler!
After viewing the source of your Anti-Hammer code (which, by the way, is very useful and ingenious), I was thinking of integrating it with my own web-stats module I have written. I have created a PHP module for the purpose of identifying and logging unique hits to include for each website page on each day. I believe Anti-Hammer may go very well with it in that this will also allow me to better-control what legitimate hit-stats get recorded and counted.
I may also look at integrating it with my own guestbook script so as to attempt to block some of the spamming that has been going on. It is such a big shame that spamming activity has been picking up a lot over the last six months!
Great job on a very informative and useful website!
I will be sure to check back more often (hopefully we will not be having too many problems visiting, since I always use a privacy proxy, especially in light of today's "political atmosphere"! )
I also really liked your idea and implementation of a very creative way in controlling "hot-linking". Very good idea to use such attempts to actually promote your site! I am getting ready to set up a web-store, and THIS idea would go great with it!
- Jim S.
Just an update from one of the users of your Anti-Hammer:
It works great!!!
However, it took me a bit of conversation with one of the tech-support folks to find out that I needed to use the php.ini-directive to run Anti-Hammer, and NOT from the .htaccess file. This being because my hosting provider's server(s) do not have that version of mod_rewrite installed which would work with setting PHP environment variables from the .htaccess file.
So, I had to create a custom php.ini file for my site in order to use A-H. However, it is looking good! My hosting provider uses an "suPHP" subsystem, BTW.
I hope this little bit of "techie" wisdom will help some of those who are in the same predicament as I was! ;-)
I have also decided to alter where the "Hammer_ID" files and the "Counter" file are to be stored. I NEVER liked storing temporary and data files in the same folder or folder-tree as my executables!
Also, because the MD5 hashing algorithm has been compromised (IE: due to its limited 128-bit hash - it IS possible to have more than one input value produce the same hash) as was demonstrated in one of the advanced tech forums, I changed the code to use the SHA1 hashing algorithm. This gives a 160-bit hash signature, which means fewer possible "clashes" - IE: more likelihood of only ONE set of input data to result in ONE hash signature.
Great coding and great idea! I love it!
- Jim S.
Thanks for the thoughtful input! Of course you are free to alter any prefs, that's why they are there! The defaults are simply intended to make for easier installation. The new version uses a completely different structure, anyway. Besides, storing data files inside your web root is fine, so long as your permissions are set up correctly. SuExec systems are great, but never forget that now all your files are writable by the server process, not just ones we specify!
By the way, SHA1 is overkill in this situation. MD5 is simply used as a handy way to store the signature of a bunch of concatenated data, a sort of container. If you think about it, collisions would actually be a good thing. CRC16 would provide better protection!
Hi! Thanks for your input... I was able to make a plugin from your application for Elgg Software and the plugin.
If you have time you can check it at...
Where can I find "an updated version which has improved documentation, amongst many other things" ?
Let me know if you have any question.
Next version, payware.
Hi, while finishing the plugin, i run into an emergency and did not finish everything the way it was supposed to be...
On the Valid Link back to corz.org, Last time I made a plugin and then left a link to an .org website, I realized that Elgg does not allow plugin developers to have back links on the html pages. Some plugin developers were embedding back links to infected sites... So, due to those reasons they decided to stop all plugin developers from embedding back links to their personal sites or any external site except Elgg plugin Download locations
On the plugin download page, I just edited the page and I have give the credit where credit is due!
Your work can change the world... And yes it has already changed the world.
Dear Cor, (sounds Duitch to me
I'm trying to get it running on my localhost, it isn't working.
What changes should be made?
Y.T. Harry Betlem
Check your logs! ;o)
I read here, under the "Now with Referer Spam and h4x0r Protection!" section, that we can immediately ban baddies, but I don't see anything about how to turn that on, either here or in the code. Am I just missing it? If not, could you please post how to go about adding that feature?
BTW, this is a great script. I love using it on my server. When will you be releasing the new version you mentioned above?
Apologies! This page is a bit of a pre-empt, it escaped prior to my major site update (coming up in the next few days!). The page is unfinished, but Anti-Hammer Pro is working great. Everything here at the org has been updated and upgraded, so it's a big job!
Anti-Hammer Pro will be available /soon/.
If you urgently need a copy meantime, mail me.
My black-list.txt gets 100% false positives, despite level 3 checking accuracy. It would be better if all listed links were initially commented out, leaving it up to me whether I want to approve any of them.