more .htaccess tips and tricks..

<ifModule>
more clever stuff here
</ifModule>
 

redirecting and rewriting

"The great thing about mod_rewrite is it gives you all the configurability and flexibility of Sendmail. The downside to mod_rewrite is that it gives you all the configurability and flexibility of Sendmail."

- Brian Behlendorf, Apache Group
 

One of the more powerful tricks of the .htaccess hacker is the ability to rewrite URLs. This enables us to do some mighty manipulations on our links; useful stuff like transforming very long URLs into short, cute URLs, transforming dynamic ?generated=page&URLs into /friendly/flat/links, redirecting missing pages, preventing hot-linking, performing automatic language translation, and much, much more.

Make no mistake, mod_rewrite is complex. This isn't the subject for a quick bite-size tech-snack, probably not even a week-end crash-course. I've seen guys pull off some real cute stuff with mod_rewrite, but with kudos-hat tipped firmly towards that bastard operator from hell, Ralf S. Engelschall, author of the magic module itself, I have to admit that a great deal of it still seems so much voodoo to me.

The way that rules can work one minute and then seem not to the next, how browser and other in-between network caches interact with rules and testing rules is often baffling, maddening. When I feel the need to bend my mind completely out of shape, I mess around with mod_rewrite!

After all this, it does work, and while I'm not planning on taking that week-end crash-course any time soon, I have picked up a few wee tricks myself, messing around with webservers and web sites, this place..

The plan here is to just drop some neat stuff, examples, things that have proven useful, and work on a variety of server setups. There are Apaches all over my LAN, and I keep coming across old .htaccess files stuffed with past rewriting experiments that either worked, and I add them to my list, or failed dismally, and I'm surprised that more often these days, I can see exactly why!

Very little here is my own invention. Even the bits I figured out myself were already well documented; I just hadn't understood the documents, or couldn't find them. Sometimes, just looking at the same thing from a different angle can make all the difference, so perhaps this humble stab at URL Rewriting might be of some use. I'm writing it for me, of course, but I do get some credit for this..

# time to get dynamic, see..
RewriteRule ^(.*)\.htm $1.php
 

beginning rewriting..

Whenever you use mod_rewrite (the part of Apache that does all this magic), you need to do..

Options +FollowSymlinks
RewriteEngine On

..before any rewrite rules. Note: in .htaccess files, FollowSymLinks (or SymLinksIfOwnerMatch) must be enabled for any rules to work; this is a security requirement of the rewrite engine. Normally it's enabled in the root and you shouldn't have to add it, but it doesn't hurt to do so, and I'll insert it into all the examples on this page, just in case*.

The next line simply switches on the rewrite engine for that folder. If this directive is in your main .htaccess file, then the rewrite engine is theoretically enabled for your entire site, but it's wise to always add that line before you write any redirections, anywhere.

* Although highly unlikely, your host may have +FollowSymLinks enabled at the root level, yet disallow its addition in .htaccess; in which case, adding +FollowSymLinks will break your setup (probably a 500 error), so just remove it, and your rules should work fine.

Important: While some of the directives on this page may appear split onto two lines, in your .htaccess file, they must exist completely on one line. If you drag-select and copy the directives on this page, they should paste just fine into any text editor.
 

simple rewriting

Simply put, Apache scans all incoming URL requests, checks for matches in our .htaccess file and rewrites those matching URLs to whatever we specify. something like this..

all requests to whatever.htm will be sent to whatever.php:
Options +FollowSymlinks
RewriteEngine on
RewriteRule ^(.*)\.htm$ $1.php [NC]

Handy for anyone updating a site from static htm (you could use .html, or .htm(.*), .htm?, etc) to dynamic php pages; requests to the old pages are automatically rewritten to our new urls. no one notices a thing, visitors and search engines can access your content either way. leave the rule in; as an added bonus, this enables us to easily split php code and its included html structures into two separate files, a nice idea; makes editing and updating a breeze. The [NC] part at the end means "No Case", or "case-insensitive", but we'll get to the switches later.

Folks can link to whatever.htm or whatever.php, but they always get whatever.php in their browser, and this works even if whatever.htm doesn't exist! but I'm straying..
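
The optional-character variant mentioned above deserves a quick sketch. Assuming the same whatever.htm to whatever.php setup, and nothing else on the site that ends in .htm or .html, a single rule can catch both extensions..

```apache
Options +FollowSymlinks
RewriteEngine on
# "l?" makes the final "l" optional, so this matches both
# whatever.htm and whatever.html (and, with [NC], WHATEVER.HTM too)
RewriteRule ^(.*)\.html?$ $1.php [NC]
```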

As it stands, it's a bit tricky; folks will still have whatever.htm in their browser address bar, and will still keep bookmarking your old .htm URLs. Search engines, too, will keep on indexing your links as .htm; some have even argued that serving up the same content from two different places could have you penalized by the search engines. This may or may not bother you, but if it does, mod_rewrite can do some more magic..

this will do a "real" http redirection:
Options +FollowSymlinks
RewriteEngine on
RewriteRule ^(.+)\.htm$ http://corz.org/$1.php [R=301,NC]

This time we instruct mod_rewrite to send a proper HTTP "permanently moved" redirection, aka "301". Now, instead of just rewriting on-the-fly, the user's browser is physically redirected to a new URL, and whatever.php appears in their browser's address bar, and search engines and other spidering entities will automatically update their links to the .php versions; everyone wins. And you can take your time with the updating, too.

For details of the many 30* response codes you can send, see here.
 

not-so-simple rewriting ... flat links and more

You may have noticed, the above examples use regular expressions to match variables. What that simply means is.. match the part inside (.+) and use it to construct "$1" in the new URL. In other words, (.+) = $1. You could have multiple (.+) parts and for each, mod_rewrite automatically creates a matching $1, $2, $3, etc, in your target (aka. 'substitution') URL. This facility enables us to do all sorts of tricks, and the most common of those, is the creation of "flat links"..

Even a cute short link like http://mysite/grab?file=my.zip is too ugly for some people, and nothing less than a true old-school solid domain/path/flat/link will do. Fortunately, mod_rewrite makes it easy to convert URLs with query strings and multiple variables into exactly this, something like..

a more complex rewrite rule:
Options +FollowSymlinks
RewriteEngine on
RewriteRule ^files/([^/]+)/([^/]+)\.zip /download.php?section=$1&file=$2 [NC]

would allow you to present this link as..

  http://mysite/files/games/hoopy.zip

and in the background have that transparently translated, server-side, to..

  http://mysite/download.php?section=games&file=hoopy

which some script could process. You see, many search engines simply don't follow our ?generated=links, so if you create generating pages, this is useful. However, it's only the dumb search engines that can't handle these kinds of links; we have to ask ourselves.. do we really want to be listed by the dumb search engines? Google will handle a good few parameters in your URL without any problems, and the (hungry hungry) msn-bot stops at nothing to get that page, sometimes again and again and again…

I personally feel it's the search engines that should strive to keep up with modern web technologies, in other words; we shouldn't have to dumb-down for them. But that's just my opinion. Many users will prefer /files/games/hoopy.zip to /download.php?section=games&file=hoopy but I don't mind either way. As someone pointed out to me recently, presenting links as standard/flat/paths means you're less likely to get folks doing typos in typed URL's, so something like..

an even more complex rewrite rule:
Options +FollowSymlinks
RewriteEngine on
RewriteRule ^blog/([0-9]+)-([a-z]+) http://corz.org/blog/index.php?archive=$1-$2 [NC]

would be a neat trick, enabling anyone to access my blog archives by doing..

 http://corz.org/blog/2003-nov

in their browser, and have it automagically transformed server-side into..

 http://corz.org/blog/index.php?archive=2003-nov

which corzblog would understand. It's easy to see that with a little imagination, and a basic understanding of regular expressions, you can perform some highly cool URL manipulations.

Here's the very basics of regexp (roughly from the apache mod_rewrite documentation)..
 

Escaping:

\char escape that particular char

    For instance to specify special characters.. .[]() etc.

Text:

.             Any single character  (on its own, unanchored = any non-empty URI)
[chars]       Character class: One  of following chars
[^chars]      Character class: None of following chars
text1|text2   Alternative: text1 or text2 (ie. "or")

    e.g. [^/] matches any character except /

Quantifiers:

? 0 or 1 of the preceding text
* 0 or N of the preceding text  (hungry)
+ 1 or N of the preceding text

    e.g. (.+)\.html? matches foo.htm and foo.html

Grouping:

(text)  Grouping of text

    Either to set the borders of an alternative or
    for making backreferences where the nth group can 
    be used on the target of a RewriteRule with $n

	e.g.  ^(.*)\.html foo.php?bar=$1

Anchors:

^    Start of line anchor
$    End   of line anchor

    An anchor explicitly states that the character right next to it MUST 
    be either the very first character ("^"), or the very last character ("$")
    of the URI string to match against the pattern, e.g.. 
	
	^foo(.+) matches foobar but not eggfoo
	(.*)l$ matches fool but not foo
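
Putting a few of those pieces together, here is a minimal sketch; the photo- prefix and gallery.php are made-up names, purely for illustration..

```apache
Options +FollowSymlinks
RewriteEngine on
# ^ and $       anchor the pattern to the whole request
# ([0-9]+)      one or more digits, captured as $1
# \.            an escaped (literal) dot
# (jpe?g|png)   "jpg", "jpeg" or "png", captured as $2
RewriteRule ^photo-([0-9]+)\.(jpe?g|png)$ /gallery.php?id=$1&type=$2 [NC]
```

With that in place, a request for photo-42.jpeg would be handed to /gallery.php?id=42&type=jpeg, while photo-abc.jpeg or photo-42.gif would be left alone.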
 

shortening URLs

One common use of mod_rewrite is to shorten URL's. Shorter URL's are easier to remember and, of course, easier to type. An example..

beware the regular expression:
Options +FollowSymlinks
RewriteEngine On
RewriteRule ^grab(.*) /public/files/download/download.php$1

this rule would transform this user's URL..

  http://mysite/grab?file=my.zip

server-side, into..

  http://mysite/public/files/download/download.php?file=my.zip

which is a wee trick I use for my distro machine, among other things. everyone likes short URL's, and so will you; using this technique, you can move /public/files/download/ to anywhere else in your site, and all the old links still work fine; simply alter your .htaccess file to reflect the new location. edit one line, done - nice - means even when stuff is way deep in your site you can have cool links like this.. and this; links which are not only short, but flat..
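
A note on that (.*): a RewriteRule pattern is only ever matched against the path, never the query string, and when the target has no ? of its own, Apache passes the original query string along untouched. So a tighter rule should do the same job for the link above; a sketch, using the same paths..

```apache
Options +FollowSymlinks
RewriteEngine On
# matches /grab only; ?file=my.zip is carried over to the target automatically
RewriteRule ^grab$ /public/files/download/download.php
```

Unlike ^grab(.*), this one won't also swallow requests for things like /grabber or /grab-bag.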
 

capturing variables

Slapping (.*) onto the end of the request part of a RewriteRule is just fine when using a simple $_GET variable (strictly speaking, the pattern never sees the query string; Apache carries it over to the target automatically, so long as the target has no ? of its own), but sometimes you want to do trickier things, like capturing particular variables and converting them into other variables in the target URL. Or something else..

When capturing variables, the first thing you need to know about, is the [QSA] flag, which simply tags all the original variables back onto the end of the target URL. This may be all you need. The second thing, is %{QUERY_STRING}, an Apache server variable we can capture values from, using simple RewriteCond (aka. conditional) statements.

In the following example, the RewriteCond statement checks that the query string has the foo variable set, and captures its value while it's there. In other words, only requests for grab that have the foo variable set, will be rewritten, and while we're at it, we'll also switch foo, for bar..

capturing a $_GET variable:
Options +FollowSymlinks
RewriteEngine On
RewriteCond %{QUERY_STRING} foo=(.*)
RewriteRule ^grab(.*) /page.php?bar=%1

would translate a link/user's request for..

http://domain.com/grab?foo=foobar

server-side, into..

http://domain.com/page.php?bar=foobar

Which is to say, the user's browser would be fed page.php (without an [R] flag in the RewriteRule, their address bar would still read /grab?foo=foobar).

The variable bar would be available to your script, with its value set to foobar. This variable has been magically created, by simply using a regular ? in the target of the RewriteRule, and tagging on the first captured backreference, %1.. ?bar=%1

Note how we use the % character, to specify variables captured in RewriteCond statements, aka "Backreferences". This is exactly like using $1 to specify numbered backreferences captured in RewriteRule patterns, except for those captured inside RewriteCond statements, we use % instead of $.

You can use the [QSA] flag in addition to these query string manipulations, and merge them. In the next example, the value of foo becomes the directory in the target URL, the variable file is magically created, and the original query string is then tagged back onto the end of the whole thing..

QSA Overkill!
Options +FollowSymlinks
RewriteEngine On
RewriteCond %{QUERY_STRING} foo=(.+)
RewriteRule ^grab/(.*) /%1/index.php?file=$1 [QSA]

So a request for..

http://domain.com/grab/foobar.zip?level=5&foo=bar

is translated, server-side, into..

http://domain.com/bar/index.php?file=foobar.zip&level=5&foo=bar

Depending on your needs, you could even use flat links and dynamic variables together, something like this could be useful..
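
A minimal sketch of that idea, assuming a made-up version variable in the query string and a hypothetical /grab/section/file layout (neither is from a real setup)..

```apache
Options +FollowSymlinks
RewriteEngine On
# capture the value of "version" from the query string as %1
RewriteCond %{QUERY_STRING} version=([^&]+)
# the two flat path segments become $1 and $2; [QSA] tags the
# original query string back onto the end
RewriteRule ^grab/([^/]+)/(.*) /%1/index.php?section=$1&file=$2 [QSA]
```

So a request for /grab/games/hoopy.zip?version=beta would be handed, server-side, to /beta/index.php?section=games&file=hoopy.zip&version=beta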


By the way, you can easily do the opposite, strip a query string from a URL, by simply putting a ? right at the end of the target part. This example does exactly that, whilst leaving the actual URI intact..

just a demo!
RewriteCond %{QUERY_STRING} .
RewriteRule foo\.php(.*) /foo.php? [L]

The RewriteCond statement only allows requests that have something in their query string, to be processed by the RewriteRule, or else we'd end up in that hellish place, dread to all mod_rewriters.. the endless loop. RewriteCond is often used like this; as a safety-net.
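
On Apache 2.4 and later there is also a [QSD] ("query string discard") flag, which does the same job as the trailing ? in the target. A sketch, assuming a 2.4 server and the same foo.php..

```apache
# Apache 2.4+ only
RewriteCond %{QUERY_STRING} .
# [QSD] throws away the incoming query string
RewriteRule ^foo\.php$ /foo.php [QSD,L]
```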
 

cooler access denied

In part one I demonstrated a drop-dead simple mechanism for denying access to particular files and folders. The trouble with this is the way our user gets a 403 "Access Denied" error, which is a bit like having a door slammed in your face. Fortunately, mod_rewrite comes to the rescue again and enables us to do less painful things. One method I often employ is to redirect the user to the parent folder..

they go "huh?.. ahhh!"
# send them up!
Options +FollowSymlinks
RewriteEngine on
RewriteRule ^(.*)$ ../ [NC]

It works great, though it can be a wee bit tricky with the URLs, and you may prefer to use a harder location, which avoids potential issues in indexed directories, where folks can get in a loop..

they go damn! Oh!
# send them exactly there!
Options +FollowSymlinks
RewriteEngine on
RewriteRule ^(.*)$ /comms/hardware/router/ [NC]

Sometimes you'll only want to deny access to most of the files in the directory, but allow access to maybe one or two files, or file types, easy..

deny with style!
# users can load only "special.zip", and the css and js files.
Options +FollowSymlinks
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !^(.+)\.css$
RewriteCond %{REQUEST_FILENAME} !^(.+)\.js$
RewriteCond %{REQUEST_FILENAME} !special\.zip$
RewriteRule ^(.+)$ /chat/ [NC]

Here we take the whole thing a stage further. Users can access .css (stylesheet) and javascript files without problem, and also the file called "special.zip", but requests for any other filetypes are immediately redirected back up to the main "/chat/" directory. You can add as many types as you need. You could also bundle the filetypes into one line using | (or) syntax, though individual lines are perhaps clearer.
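
The bundled version might look something like this; a sketch under the same assumptions (same /chat/ folder, same special.zip)..

```apache
Options +FollowSymlinks
RewriteEngine On
# css and js bundled into one condition with | (or)
RewriteCond %{REQUEST_FILENAME} !\.(css|js)$
RewriteCond %{REQUEST_FILENAME} !special\.zip$
RewriteRule ^(.+)$ /chat/ [NC]
```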

Here's what's currently cooking inside my /inc/ directory..

all-in-one control..
RewriteEngine on
Options +FollowSymlinks
# allow access with no restrictions to local machine at 192.168.1.3
RewriteCond %{REMOTE_ADDR} !^192\.168\.1\.3$
# allow access to all .css and .js in sub-directories..
RewriteCond %{REQUEST_URI} !\.css$
RewriteCond %{REQUEST_URI} !\.js$
# allow access to the files inside img/, but not a directory listing..
RewriteCond %{REQUEST_URI} !img/(.*)\.
# allow access to these particular files...
RewriteCond %{REQUEST_URI} !comments\.php$
RewriteCond %{REQUEST_URI} !corzmail\.php$
RewriteCond %{REQUEST_URI} !digitrack\.php$
RewriteCond %{REQUEST_URI} !gd-verify\.php$
RewriteCond %{REQUEST_URI} !post-dumper\.php$
RewriteCond %{REQUEST_URI} !print\.php$
RewriteCond %{REQUEST_URI} !source-dump\.php$
RewriteCond %{REQUEST_URI} !textview\.php$
RewriteRule ^(.*)$ / [R,NC,L]
 

prevent hot-linking

Believe it or not, there are some webmasters who, rather than coming up with their own content, will steal yours. Really! Even worse, they won't even bother to copy it to their own server to serve it up, they'll just link to your content! No, it's true; in fact, it used to be incredibly common. These days most people like to prevent this sort of thing, and .htaccess is one of the best ways to do it.

This is one of those directives where the mileage variables are at their limits, but something like this works fine for me..

how DARE they!
Options +FollowSymlinks
# no hot-linking
RewriteEngine On
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(www\.)?corz\.org/ [NC]
RewriteRule .*\.(gif|jpg|png)$

You may see the last line broken into two, but it's all one line (all the directives on this page are). Let's have a wee look at what it does..

We begin by enabling the rewrite engine, as always.

The first RewriteCond line allows direct requests (not from other pages - an "empty referrer") to pass unmolested. The next line means; if the browser did send a referrer header, and the word "corz.org" is not in the domain part of it, then DO rewrite this request.

The all-important final RewriteRule line instructs mod_rewrite to rewrite all matched requests (anything without "corz.org" in its referrer) asking for gifs, jpegs, or pngs, to an alternative image. Mine says "no hotlinking!" You can see it in action here. There are loads of ways you can write this rule. Google for "hot-link protection" and get a whole heap. Simple is best. You could send a wee message instead, or direct them to some evil script, or something. These days, mine is a simple corz.org logo, which I think is rather clever.
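
Bear in mind that a RewriteRule needs a target after the pattern. A minimal sketch of a complete set, assuming a hypothetical replacement image at /img/no-hotlink.png (the path is made up; use your own)..

```apache
Options +FollowSymlinks
# no hot-linking
RewriteEngine On
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(www\.)?corz\.org/ [NC]
# don't rewrite requests for the replacement image itself
RewriteCond %{REQUEST_URI} !/img/no-hotlink\.png$
# /img/no-hotlink.png is a hypothetical path
RewriteRule \.(gif|jpg|png)$ /img/no-hotlink.png [NC,L]
```

If you'd rather just slam the door, swapping the target for a single - and the flags for [F] sends a 403 instead of an image.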
 

lose the "www"

I'm often asked how I prevent the "www" part showing up at my site, so I guess I should add something about that. Briefly, if someone types http://www.corz.org/ into their browser (or uses the www part for any link at corz.org) it is redirected to the plain, rather neat, http://corz.org/ version. This is very  simple to achieve, like this..

beware the regular expression:
Options +FollowSymlinks
RewriteEngine on
RewriteCond %{HTTP_HOST} ^www\.corz\.org [NC]
RewriteRule ^(.*)$ http://corz.org/$1 [R=301,NC]

You don't need to be a genius to see what's going on here. There are other ways you could write this rule, but again, simple is best. Like most of the examples here, the above is pasted directly from my own main .htaccess file, so you can be sure it works perfectly. In fact, I recently updated it (both work)..

here's what I'm currently using:
Options +FollowSymlinks
RewriteEngine on
RewriteCond %{HTTP_HOST} ^www\.(.*) [NC]
RewriteRule ^(.*)$ http://%1/$1 [R=301,NC,L]
 

multiple domains in one root

If you are in the unfortunate position of having your sites living on a host that doesn't support multiple domains, you may be forced to roll your own with .htaccess and mod_rewrite. So long as your physical directory structure is well thought-out, this is fairly simple to achieve.

For example, let's say we have two domains, pointing at a single hosted root; domain-one.com and domain-two.com. In our web server root, we simply create a folder for each domain, perhaps one/, and two/ then in our main (root) .htaccess, rewrite all incoming requests, like this..

All requests NOT already rewritten into these folders, transparently rewrite..
#two domains served from one root..
RewriteCond %{HTTP_HOST} domain-one\.com
RewriteCond %{REQUEST_URI} !^/one
RewriteRule ^(.*)$ one/$1 [L]

RewriteCond %{HTTP_HOST} domain-two\.com
RewriteCond %{REQUEST_URI} !^/two
RewriteRule ^(.*)$ two/$1 [L]

All requests for the host domain-one.com are rewritten (not R=redirected) to the one/ directory, so long as they haven't already been rewritten there (the second RewriteCond). Same story for domain-two.com. Note the leading slash in the second RewriteCond; %{REQUEST_URI} always begins with a slash, so !^/dir-name is the form to use (a condition like !^dir-name would always be true, and the rule would loop).

Also note, with such a simple domain & folder naming scheme, you could easily merge these two rule sets together. This would be unlikely in the real world though, which is why I left them separate; but still, worth noting.
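
For the curious, a merged version might look something like this. It's only a sketch, and it leans on the folder names (one/, two/) matching part of the domain names exactly, which is the assumption that rarely holds in the real world..

```apache
# two domains, one rule set..
# skip anything already rewritten into one/ or two/
RewriteCond %{REQUEST_URI} !^/(one|two)(/|$)
# capture "one" or "two" from the host name as %1
RewriteCond %{HTTP_HOST} domain-(one|two)\.com
RewriteRule ^(.*)$ %1/$1 [L]
```

The capturing RewriteCond goes last, since %1 refers to the last condition that matched.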

Other general settings and php directives can also go in this root .htaccess file, though if you have any further rewrites you'd like to perform; short URLs, htm to php conversion and what-not; it's probably easier and clearer to do those inside the sub-directory's .htaccess files.
 

automatic translation

If you don't read English, or some of your guests don't, here's a neat way to have the wonderful Google translator provide automatic on-the-fly translation for your site's pages. Something like this..

they simply add their country code to the end of the link, or you  do..
Options +FollowSymlinks
RewriteEngine on
RewriteRule ^(.*)-fr$ http://www.google.com/translate_c?hl=fr&sl=en&u=http://corz.org/$1 [R,NC]
RewriteRule ^(.*)-de$ http://www.google.com/translate_c?hl=de&sl=en&u=http://corz.org/$1 [R,NC]
RewriteRule ^(.*)-es$ http://www.google.com/translate_c?hl=es&sl=en&u=http://corz.org/$1 [R,NC]
RewriteRule ^(.*)-it$ http://www.google.com/translate_c?hl=it&sl=en&u=http://corz.org/$1 [R,NC]
RewriteRule ^(.*)-pt$ http://www.google.com/translate_c?hl=pt&sl=en&u=http://corz.org/$1 [R,NC]

You can create your menu with its flags or whatever you like, and add the country code to end of the links.. <a href="page.html-fr" id="... Want to see this page in French?

It is very handy, and I've been using it here at the org for a couple of years, for my international blog readers, all two of them, heh. Almost no one knows about it, mainly because I don't have any links. One day I'll probably do a wee toolbar with flags and what-not. Perhaps not. Trouble is, the Google translator stops translating after a certain amount of characters (which seems to be increasing, good), though these same rules could easily be applied to other translators, and if you find a good one, one that will translate a really huge document on-the-fly, do let me know!

If you wanted to be really clever, you could even perform some kind of IP block check and present the correct version automatically, but that is outside the scope of this document. Note: this may be undesirable for pages where technical commands are given (like this page) because the commands will also be translated. "RewriteEngine dessus" will almost certainly get you a 500 error page!

 

httpd.conf

Remember, if you put these rules in the main server conf file (usually httpd.conf) rather than an .htaccess file, you'll need to use ^/... ... instead of ^... ... at the beginning of the RewriteRule line, in other words, add a slash.
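
For example, the first rule on this page would look something like this inside httpd.conf; a sketch, assuming it sits in the main server or a virtual host section, outside any directory block..

```apache
# httpd.conf (server or virtual host context)
RewriteEngine on
# note the leading slash in both the pattern and the target
RewriteRule ^/(.*)\.htm$ /$1.php [NC]
```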
 

inheritance..

If you are creating rules in sub-folders of your site, you need to read this.

You'll remember how rules in top folders apply to all the folders inside those folders too. We call this "inheritance". Normally this just works. But if you start creating other rules inside subfolders you will, in effect, obliterate the rules already applying to that folder due to inheritance, or "descendancy", if you prefer. Not all the rules, just the ones applying to that subfolder. A wee demonstration..

Let's say I have a rule in my main /.htaccess which redirected requests for files ending .htm to their .php equivalent, just like the example at the top of this very page. now, if for any reason I need to add some rewrite rules to my /osx/.htaccess file, the .htm >> .php redirection will no longer work for the /osx/ subfolder, I'll need to reinsert it, but with a crucial difference..

this works fine, site-wide, in my main .htaccess file
# main (top-level) .htaccess file..
# requests to file.htm goto file.php
Options +FollowSymlinks
RewriteEngine on
RewriteRule ^(.*)\.htm$ http://corz.org/$1.php [R=301,NC]

Here's my updated /osx/.htaccess file, with the .htm >> .php redirection rule reinserted..

but I'll need to reinsert the rules for it to work in this sub-folder
# /osx/.htaccess file..
Options +FollowSymlinks
RewriteEngine on
# RewriteRule some rule that I need here
# RewriteRule some other rule I need here
RewriteRule ^(.*)\.htm$ http://corz.org/osx/$1.php [R=301,NC]

Spot the difference in the subfolder rule: the osx/ in the target. You must add the current path to the new rule. Now it works again, and all the osx/ subfolders will be covered by the new rule. If you remember this, you can go replicating rewrite rules all over the place.

If it's possible to put your entire site's rewrite rules into the main .htaccess file, and it probably is; do that, instead, like this..

it's a good idea to put all your rules in your main .htaccess file..
# root /.htaccess file..
Options +FollowSymlinks
RewriteEngine on
# .htm >> .php is now covered by our main rule, there's no need to repeat it.
# But if we do need some /osx/-specific rule, we can do something like this..
RewriteRule ^osx/(.*)\.foo$ /osx/$1.bar [R=301,NC]

Note, no full URL (with domain) in the second example. Don't let this throw you; with or without is functionally identical, on most servers. Essentially, try it without the full URL first, and if that doesn't work, sigh, and add it - maybe on your next host!

The latter, simpler form is preferable, if only for the tremendous portability it offers - my live site, and my development mirror share the exact same .htaccess files - a highly desirable thing.

By the way, it perhaps doesn't go without saying that if you want to disable rewriting inside a particular subfolder, where it is enabled further up the tree, simply do:

handy for avatar folders, to allow hot-linking, etc..
RewriteEngine off
 

troubleshooting tips..

rewrite logging..

When things aren't working out as you'd expect, the first thing you need to do is enable rewrite logging. I'll assume you are testing these mod_rewrite directives on your development mirror, or similar setup, and can access the main httpd.conf file. If not, why not? Testing mod_rewrite rules on your live domain isn't exactly ideal, is it? Anyway, put this somewhere at the foot of your httpd.conf..

Expect large log files..
#
# ONLY FOR TESTING REWRITE RULES!!!!!
#
RewriteLog "/tmp/rewrite.log"
#RewriteLogLevel 9
RewriteLogLevel 5

Set the file location and logging level to suit your own requirements. If your rule is causing your Apache to loop, load the page, immediately hit your browser's "STOP" button, and then restart Apache. All within a couple of seconds. Your rewrite log will be full of all your diagnostic information, and your server will carry on as before.

Setting a value of 1 gets you almost no information, setting the log level to 9 gets you GIGABYTES! So you must remember to comment out these rules and restart Apache when you are finished because, not only will rewrite logging create space-eating files, it will seriously impact your web server's performance.

RewriteLogLevel 5 is very useful, I find.
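
Note that RewriteLog and RewriteLogLevel were removed in Apache 2.4; rewrite logging there goes through the regular LogLevel directive, and the entries land in the error log. A sketch for a 2.4 development server..

```apache
#
# ONLY FOR TESTING REWRITE RULES!!!!!
# (Apache 2.4 and later)
#
# trace1 is the quietest, trace8 the most verbose
LogLevel alert rewrite:trace3
```

The same warnings about file size and performance apply, so set it back when you are finished.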

Fatal Redirection

If you start messing around with 301 redirects [R=301], aka. "Permanently Redirected", and your rule isn't working, you could give yourself some serious headaches..

Once the browser has been redirected permanently to the wrong address, if you then go on to alter the wonky rule, your browser will still be redirected to the old address (because it's a browser thing), and you may even go on to fix, and then break the rule all over again without ever knowing it. Changes to 301 redirects can take a long time to show up in your browser.

Solution: restart your browser, or use a different one.

Better Solution: Use [R] instead of [R=301] while you are testing. When you are 100% certain the rule does exactly as it's expected to, then switch it to [R=301] for your live site.
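
In practice that might look like this; a sketch using the .htm to .php rule from earlier..

```apache
Options +FollowSymlinks
RewriteEngine on
# while testing: a plain [R] sends a temporary (302) redirect,
# which browsers generally don't hang on to
RewriteRule ^(.+)\.htm$ /$1.php [R,NC,L]
# once it's proven, swap the line above for the permanent version..
# RewriteRule ^(.+)\.htm$ /$1.php [R=301,NC,L]
```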
 

debug-report.php

When things aren't working as you would expect, you probably won't have to enable rewrite logging to get the information you need. What's usually required is no more than a quick readout of all the current variables, $_GET array, and so on; so you can see exactly what happened to the request.

For another purpose, I long ago created debug.php, and later, finding all this information useful in chasing down wonky rewrites, created a "report" version, which rather than output to a file, spits the information straight back into your browser, as well as $_POST, $_SESSION, and $_SERVER arrays, special variables, like __FILE__, and much more.

Usage is simple; you make it your target page, so in a rule like this..

RewriteRule ^(.*)\.html$ /catch-all.php?var=$1

You would have a copy of debug-report.php temporarily renamed to catch-all.php in the root of your server, and type http://testdomain.org/foobar.html into your address bar and, with yer mojo working, debug-report.php leaps into your browser with a shit-load of exactly the sort of information you need to figure out all this stuff. When I'm messing with mod_rewrite, debug-report.php saves me time, a lot. Also, it's free..

 

conclusion

In short, mod_rewrite allows you to send browsers from anywhere to anywhere. You can create rules based not simply on the requested URL, but also on such things as IP address, browser agent (send old browsers to different pages, for instance), and even the time of day; the possibilities are practically limitless.
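
A couple of quick sketches of that sort of thing, in the style of the examples in the Apache rewriting guide; all the file names here are made up..

```apache
Options +FollowSymlinks
RewriteEngine on

# browser-dependent content: Lynx users get a simpler page
RewriteCond %{HTTP_USER_AGENT} ^Lynx [NC]
RewriteRule ^index\.html$ /index-text.html [L]

# time-dependent content: a day page between 07:00 and 19:00,
# a night page otherwise
RewriteCond %{TIME_HOUR}%{TIME_MIN} >0700
RewriteCond %{TIME_HOUR}%{TIME_MIN} <1900
RewriteRule ^foo\.html$ /foo.day.html [L]
RewriteRule ^foo\.html$ /foo.night.html [L]
```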

The ins-and outs of mod_rewrite syntax are topic for a much longer document than this, and if you fancy experimenting with more advanced rewriting rules, I urge you to check out the apache documentation.

If you have apache installed on your system, there will likely be a copy of the apache manual, right here, and the excellent mod_rewriting guide, lives right here. do check out the URL Rewriting Engine notes for the juicy syntax bits. That's where I got the cute quote for the top of the page, too.
 
;o)
(or
 
 
 
 

Before you ask a question..

Firstly, read this at least once in your life. I insist!

NOTE: THIS IS NOT A COMMUNITY. And I am not your free tech dude. Sure, folk sometimes drop back in, but realistically, the chances of someone else coming along and answering your tech question are about as close to zero as it gets; almost no one sticks around but me, the guy who wrote all that text (above).

If you can't be bothered to read the article, I can't be bothered responding. Capiche? I do read all comments, though, and answer questions about the article. I'm also keen to discuss anything you think I've missed, or interesting related concepts in general.

If you are still sure that you want to post your own, personal, tech question, then please ensure that you first, either..

a) Have read the article (above) and have tried "everything" yourself; in which case; post the exact code that isn't working (preferably inside [pre][/pre] tags), or else..

b) Pay me. The PayPal button is at the top right of the page.

Other posts will be ignored and/or deleted.

cbparser powered comments..


cor - 29.04.08 11:38 pm

..back from my own tech issues..

Minty! What is THAT!?
The first error probably comes from the colon ":" in front of the Options statement. That's gotta be wrong.

Next are these crazy (.*[^/]) sections. Surely you mean.. ([^/]*) The bracketed conditions replace the "." character in the more usual (.*), you see.

Thanks dMan. Your desire is fairly trivial to achieve, something like this should do the trick..

RewriteRule ^(.+)\.html$ /go.php?sku=$1 [L]

;o)
(or


Flavio - 02.05.08 7:41 am

First of all, congratulations for this great article.

Back to the main theme... I've been wondering if this would be the best solution to deal with validation (W3C) when using PHP based url queries. The use of the ampersand is the main problem because it's a reserved character.
So I was planning something like redirecting the urls through .htaccess manipulation.

Do you believe this is a good solution for this problem?

Thank you,

Flavio


cor - 02.05.08 8:46 am

The best solution would be to use..
&amp;
..in your page links.

;o)
(or


Flavio - 02.05.08 5:47 pm

The thing is I've been a little confused when I read this article: http://www.w3.org/QA/2005/04/php-session.
I used to not care about web standards, validation and tableless layouts, but now I'm seeing how important they are.


cor - 02.05.08 8:03 pm

You're right; all these things are important. And it's good that you care. However, this particular issue isn't of great importance because almost no one uses trans_sid anymore, mainly because it's so insecure.

The article in question also gives solutions that you can use in your .htaccess, aka. "Apache directives". i.e..
php_value arg_separator.output "&amp;"
But by far the best solution is to disable trans_sid altogether..

php_value session.use_trans_sid 0

or..

php_flag session.use_trans_sid off

which is also shown in the article.

;o)
(or

ps. this now has nothing to do with rewriting, and so should probably be on page one! ;)


Faisal - 05.05.08 8:35 am

Thanks for sharing your knowledge with us. Here is my question after having done several tries.

This is the URL I want to get translated
www.domain.com/index.php?pg=detail&catId=1&catTitle=2

After rewrite, it should look like this
www.domain.com/about_us.html
(about_us is the cat title so it will change with every argument)

However, I have achieved the following.

www.domain.com/detail/8/about_us.html

I have used the following line of code to get this result.
RewriteRule ^detail/(.*)/(.*)$  /sitewonders/index.php?pg=detail&catId=$1&catTitle=$2 [nc]
 

My requirement is that I don't want to show "detail" and "8" in the URL; detail is the page where my code is stored and 8 is the product category. I have no clue how to achieve this. The <a href> code used to build the link is as follows.

<a href="<?=$webPath?>detail/<?=$viewcat['pk_id']?>/<?=str_replace(' ', '_', $viewcat['cat_Title'])?>.html" class="b">
 


Please give me any clue or help to achieve this target.


Derick - 05.05.08 12:56 pm

Your article was very helpful.

I have a problem creating a URL, as follows:

http://www.domain.com/sub-category.php?r=37&sr=38 as

http://www.domain.com/jouets-1er-age/puericulture.php

jouets-1er-age is the category name, which is "r=37" in the URL, so I have to get the category name from the database.

puericulture is the sub-category name, which is "sr=38" in the URL, so I have to get the sub-category name from the database.

Can you please explain how I should use mod_rewrite to do that?

Thanks.


cor - 05.05.08 1:59 pm

Faisal, if the parameters aren't in the original (flat) URL, they won't be available to your script, so you need to have them somewhere. Your rewrite is fine, and the flat link is probably as small as it can be.

Derick, everything you need is in the article. Have a read. If you still can't achieve your goal, get back here and post the actual code that isn't working, and I'll have a look at it.

;o)
(or


Derick - 06.05.08 5:23 am

Thanks cor. I read the article and have developed the code, but I don't get the correct file. It always redirects to index.php, though the name is correct.

http://www.domain.com/sub-category.php?r=37&sr=38 as

http://www.domain.com/jouets-1er-age/puericulture.php


-----------------------------------------------------------------------------------------------
My code of .htaccess is;

php_value register_globals off

RewriteEngine on
Options +FollowSymLinks

# RewriteBase /
# Rule for duplicate content removal : www.domain.com vs domain.com
RewriteCond %{HTTP_HOST} ^ludo [NC]
RewriteRule (.*) http://localhost/ludo/$1 [R=301,L,NC]

#RewriteCond %{REQUEST_URI} ^(/r/sr) [NC,OR] ##optional
RewriteCond %{REQUEST_URI} (/|\.htm|\.php|\.html|/[^.]*)$ [NC]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*) index.php

Please look at my code and help me. I appreciate your kind help.


cor - 06.05.08 8:15 am

Sheesh! Derick, I didn't mean your entire site! <snipped> !

That .htaccess code is a mess. It's like you just threw characters into notepad and hoped for the best. I'm not gonna teach you the basics, dude! I already wrote a big article to do exactly that. Really, have a read, except this time; slowly.

Here's my advice: START AGAIN!

;o)
(or

ps. hints: you only put brackets around characters you want to capture. Look again at my examples in the article, and the many many good examples (my replies) in these comments, and you'll get the idea - there are many examples that are probably exactly what you need. Good luck!


Dman - 06.05.08 9:50 am

Hi Corz,
thank you for helping me in my last post.
It works GREAT! I've read through the comments page and it seems
like mod-rewrite only works with variables and not static filenames.

Here is what I'm trying to do now.
My page has about 6 different category links which go to their individual category page.

These category file names begin with go + (3-5 random numbers) + .html
Below are a few examples to give you a better idea:

go245.html
go1256.html
go15460.html

These are all direct static links which represent specific categories.
And here is what I'm trying to do:

go245.html --> (rewritten to) --> cars.html
go1256.html --> (rewritten to) --> boats.html
go15460.html --> (rewritten to) --> trucks.html

Each link is static with none of that variable or wildcard stuff.
Below were my feeble attempts to make it work, and both failed..

RewriteRule cars.html /go245.html [L] (didn't work)
RewriteRule ^cars.html$ go245.html [L] (didn't work)

------------------------------------
you see - because these are static links I don't think using
variables like $1 or ^([^.]*). is going to work in this situation.
but what the heck do I know - I need an expert and
I sure would be grateful for your expertise.

Thanks again,
Dman


cor - 06.05.08 1:09 pm

Dman, mod_rewrite works with URL requests. Any part of it; dynamic parts, static parts, user agents, protocol version, you name it.

Anyway, are you sure this..

go245.html --> (rewritten to) --> cars.html

Isn't meant to be this..

cars.html --> (rewritten to) --> go245.html

That is, the link (or what the user types into their address bar) would be "cars.html", and the real, server-side page would be "go245.html". Right? Back-to-frontness is very common in mod_rewrite questions.

If so, there's no reason why your first example wouldn't work..

RewriteRule cars.html /go245.html [L]

Requests for "cars.html" would be redirected to a file called "go245.html" in the root of your site. Looks fine.

If it doesn't work, then perhaps another rule is catching it before it gets to that one.

;o)
(or


Dman - 07.05.08 9:44 am

thanks Corz,
wow, I think you're "right on target" once again. I bet my first example (attempt) wasn't working because of what you said, because I do have other lines in my .htaccess file. Here's what it looks like:

-----------------------
RewriteEngine On
RewriteRule ^go([^.]*).html /index.php?link_id=$1 [L]
RewriteRule ^([^.]*).html /index.php?search=2&query=$1 [L]
RewriteRule ^([^.]*).shtml /go.php?sku=$1 [L]

RewriteRule cars.html /go245.html [L]
ErrorDocument 400 /400.htm
ErrorDocument 401 /401.htm
ErrorDocument 403 /403.htm
ErrorDocument 404 /404.htm
ErrorDocument 500 /500.htm

-----------------------

The above is what my .htaccess file looked like when it failed to work. Notice the 4th rule down. Could it be failing because of the other rules above it? Or what catches your eye? Thx again, D.


Praetor - 08.05.08 1:51 am

Great article....... learned a few things. Used the .htaccess in my includes folder. Works like a charm. Used the same in my css folder. No joy! Will have to leave that one alone. Now, if I can figure out how to do rel="nofollow" in the .htaccess, I will be in business.

Thanks for all the information.


Derick - 08.05.08 8:37 am

Dear Corz,

Thank you for your advice for the last post.

I am still learning mod_rewrite. I understand things a little better now, and my redirections are working.
But I have two problems.

1) Once I create links as folders, the images don't load, and all paths are taken from the current virtual folder.

RewriteRule ^jouets-.*-([0-9]+)/.*-([0-9]+)\.html$ /jouet-test.php?r=$1&sr=$2 [L]

http://www.domain.com/jouets-1er-age-37/puericulture-38.html

2) I want to remove the "r" value 37 and the "sr" value 38 from the URL (http://www.domain.com/jouets-1er-age-37/puericulture-38.html). If I remove them, jouet-test.php loads without data; that is, "r" and "sr" are not set. How can I do this?

Thanks.

I appreciate your comments and suggestions.




cor - 08.05.08 9:18 am

Dman, Notice the 2nd rule down. ;) Wooosh!
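
In case the hint is too subtle, here is a sketch of one possible fix, using the rules Dman posted: the catch-all .html rule matches cars.html and, thanks to its [L] flag, the cars.html rule further down is never reached. Putting the specific rule first avoids that. The escaped dots and the extra anchors are tidying added for this sketch, an assumption rather than something Dman posted..

```apache
RewriteEngine On

# Specific rules first, so the catch-all patterns below can't grab them.
RewriteRule ^cars\.html$ /go245.html [L]

# The general rules, in their original order.
RewriteRule ^go([^.]*)\.html$ /index.php?link_id=$1 [L]
RewriteRule ^([^.]*)\.html$ /index.php?search=2&query=$1 [L]
RewriteRule ^([^.]*)\.shtml$ /go.php?sku=$1 [L]
```

Bear in mind that in an .htaccess file the rewritten URL is passed through the rules again, so /go245.html will then match the go rule and end up at index.php?link_id=245.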

Praetor, rel=nofollow is evil.

Well, okay, I guess it has its uses, but .htaccess-automatic? That's just Wrong! I only see a use for it in temporary situations, as in, before the webmaster reviews a public comment, personally. All other uses seem anti-web, to me.

I remember the buzz I got when I first spotted someone had dropped a Wikipedia link my way, and how my heart sank when I read the source. rel=nofollow. Some kind of bandwidth-eating joke, and I fell for it! :erm: Still do…

Derick, if more than one page use either the phrase jouets-1er-age or puericulture in their flat-link, then you need those numbers in the flat-link.

In other words, unless you have a database of word->number mappings that equates jouets-1er-age -> 37, and jouets-something other phrase to 38, something else again to 39, and so on; which you don't appear to; then you cannot remove the numbers. You need them for the regex back-references.

If you want to re-use jouets-1er-age or puericulture in your flat links, then you need to keep the variable numbers inside the flat link, so mod_rewrite has something with which to populate the r and sr variables in the target URI.

That's me said the same thing three different ways! And again, just in case..

If my flat link, the one the user sees, is foo.html, and I want the user to actually get bar.php?var=38, then I need to get the 38 from somewhere. It can either a) come cleanly from the flat-link, in the style of /foo/38.html, or foo-38.html, or foo38.html, or something like that. Or b) a file/database table/etc. exists somewhere, where bar.php could look to know that, for example..

foo=38
w00t=39
roo=40


and so on, and know that "foo" = "38", and the only time that I would use foo in a flat-link, is when I want to get the user to bar.php with var set to 38. Mod_rewrite is very like magic, but you can't just magic variables out of nowhere!
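
A minimal sketch of those two options, as they might look in an .htaccess file. bar.php and var come from the example above; the name parameter in option (b) is hypothetical, and the lookup itself would have to happen inside bar.php..

```apache
RewriteEngine On

# (a) The number travels in the flat link: /foo-38.html -> /bar.php?var=38
RewriteRule ^foo-([0-9]+)\.html$ /bar.php?var=$1 [L]

# (b) Only the word travels in the flat link: /foo.html -> /bar.php?name=foo
# bar.php must then map "foo" to 38 itself (file, database table, etc.).
# "name" is a made-up parameter for this sketch.
# The condition leaves real .html files on disk alone.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([a-z0-9]+)\.html$ /bar.php?name=$1 [L]
```

Either way the value has to exist somewhere; the rule only moves it around. Apache also has RewriteMap for this kind of lookup, but the map has to be declared in the main server or virtual host configuration, not in .htaccess.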

;o)
(or


cor - 08.05.08 9:37 am

I must add, and quite literally, to this page; a link to my php debug script, I might even do a proper section about it, in lieu of its own page one day, as all these sorts of things should have.

Inside the archive (McAfee reports all downloads at corz.org are 0 on the nuisance meter, Green for go!) are two files, a regular back-end version (which spits variables to a real file, and is mostly used to capture and display debug output from inside php scripts), and a "report" version, which is most handy for .htaccess.

What's it for? You ask. Simply, it spits out a big page of stuff; the entire $_SERVER[] array, variables available to the script (say that when you're drunk!), that is; your captured variables, I sometimes drop a link to it into tech-support emails, and invariably folk get back with some positive comment, often a plain old "Wow!". It is handy.

Usage is simple; you make it your target page, so in a rule like this..

RewriteRule ^(.*)\.html$ catch-all.php?var=$1

You would have a copy of debug-report.php renamed to catch-all.php, and type foobar.html into your address bar, and with yer mojo working, debug-report.php leaps into your browser with a shit-load of exactly the sort of information you need to figure out all this stuff. When I'm messing with mod_rewrite, it saves me lots of time. Read-out. And it's free.

I'll drop a link right here, too, by way of motivating myself to get the URL into the clipboard, and in the Style of Brian Tracey, handle every piece of paper only once..

http://corz.org/engine?section=php/corz%20function%20library&source=debug.php

for now..

;o)
(or


Derick - 08.05.08 10:10 am

Thanks Corz.

I get the trick now.

I have another problem. When I create links as

RewriteRule ^jouets-.*-([0-9]+)/.*-([0-9]+)\.html$ /jouet-test.php?r=$1&sr=$2 [L]

http://www.domain.com/jouets-1er-age-37/puericulture-38.html.

But the "jouets-1er-age-37" directory doesn't exist, so none of the images load.

Can you please advise me?




kmm - 09.05.08 2:05 pm

Great info; it's been really helpful in trying to get a handle on rewrites/redirects.

Ok before I pull what little hair I still have out please have a look at this for me.

I want to redirect http://www.domain/?p=about_us to http://www.domain/aboutUs2.php

My .htaccess:
RewriteEngine On

rewritecond %{http_host} ^domain
rewriteRule ^(.*) http://www.domain/$1 [R=301]
rewriteRule ^linkmachine/(.*)$ http://www.domain/?p=linkRedirect [R=301,L]
rewriteRule ^blog/(.*)$ http://tvblog.another-domain.com/temple-view/ [R=301,L]

rewriteRule ^about(.*)$ http://www.domain/aboutUs2.php [R,L]


IndexIgnore *

ErrorDocument 400 /a1Whoops.html
ErrorDocument 401 /a1Whoops.html
ErrorDocument 403 /a1Whoops.html
ErrorDocument 404 /a1Whoops.html
ErrorDocument 500 /a1Whoops.html

All the other rules work fine except this one and I cannot see why. I think it has something to do with the criteria being in the query string, but heh, I'm really out of my depth here.
Thanks in anticipation of help!


Jamie - 13.05.08 1:23 pm

Thank you very much.. Finally an understandable tutorial on .htaccess and Rewrite!


cor - 13.05.08 2:39 pm

Derick, that isn't an .htaccess problem, that's your back-end. If jouets-1er-age-37 isn't a real directory, you can't create "relative" links to images and such on a page with a virtual URI. The user's browser has no idea you are using mod_rewrite.

You'll need to either use full links to the images, from the root of the server (i.e. begin <img src="/...), or else use more rewrites to redirect image requests to the correct folder. There are other solutions.
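
As a rough sketch of the rewrite option, assuming the images really live in /images/ at the root of the site and the pages reference them as images/whatever.png (both assumptions; adjust to suit)..

```apache
# The browser resolves a relative link against the virtual folder and
# asks for /jouets-1er-age-37/images/logo.png; serve /images/logo.png.
RewriteRule ^jouets-[^/]+/images/(.+)$ /images/$1 [L]
```

Root-relative links (src="/images/logo.png") remain the simpler fix, as they need no extra rules at all.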

kmm, correct; you are incorrectly attempting to capture parts of the query string. Two things are wrong with your code..

Firstly, you do not understand the usage of anchors; that is, the "^" and "$" characters in the pattern part of the rules. Using code you don't understand is a recipe for disaster!

Along with some other improvements to that section, I have enlarged the regex special character descriptions, among these you'll find..

An anchor explicitly states that the character right next to it MUST be either the very first character ("^"), or the very last character ("$") of the URI string to match against the pattern …

In light of this information, you will see that the anchor in your first RewriteRule is completely superfluous, and the one in your last RewriteRule is what tells me you don't understand your code. A request for about.php?foo=bar would match your rule, as "about" is at the very start of the URI. See?

I'm assuming you are catching legacy inward links, or something - you can use Rewritecond, to check for the existence and value of the p variable..
RewriteCond %{QUERY_STRING} p=about_us
RewriteRule !aboutUs2\.php /aboutUs2.php [R,L]
Which matches all requests that have the p query variable set to about_us, that haven't already been redirected to aboutUs2.php

If you really need it to be an external redirect, simply tag the http://domain part back on.
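
A slightly tighter sketch of the same idea, for the exact URL kmm posted (/?p=about_us). The anchoring of the query-string test and the trailing question mark are additions for this sketch, not part of the rules above..

```apache
# Only match when p=about_us is a complete query parameter...
RewriteCond %{QUERY_STRING} (^|&)p=about_us(&|$)
# ...and the request is for the site root (an empty path in .htaccess).
# The trailing "?" drops the old query string from the redirect target.
RewriteRule ^$ /aboutUs2.php? [R,L]
```

Because the redirected URL no longer carries p=about_us, the condition cannot match a second time, so there is no loop to guard against.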

By the way, you may be interested to know that I also recently added a new section about capturing variables. A good read for all! Feedback welcome.

Hey! Jamie, you finally got here!

;o)
(or


 
