Apache mod_rewrite bug still lurks

Wednesday, August 20. 2008

There's an enormous Apache mod_rewrite bug that I ran into back in 2005, and to my dismay, it's still there. Long-standing bugs are usually small edge cases that don't affect many people, but this one is a monster that I suspect affects pretty much everyone who uses mod_rewrite; they've just been lucky enough to avoid it.

The basic issue is this: if you match parameters that need URL encoding to be safe, mod_rewrite will not rewrite the back-reference (that's $1 below) correctly. Take this very simple redirect:

    RewriteRule ^(.*)$ index.php?show=$1 [R]

You hit http://www.example.com/a%2Fb and mod_rewrite neatly rewrites it as http://www.example.com/index.php?show=a/b! Notice that it has urldecoded the matched parameter in the replacement string, so the encoded slash arrives as a literal one. Apache 2.2 introduced a new B flag to deal with this, but that apparently suffers from the same problem!

There are two workarounds I've used, and both are horrible: double-encode the source string (if you control both the start and end points of the URLs) so that it survives the spurious urldecode, or base-64 encode it (javascript flavour) and do the decoding yourself. I did warn you they were horrible.

I'll bet there are a zillion mod_rewrite setups out there that suffer from this fundamental problem without anyone having noticed. If a few people voted for this to be fixed, it would probably go away...
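For illustration, here is a minimal sketch of both workarounds. It is written in Python only to show the string handling; the function names are made up for this example, the single unquote call merely stands in for mod_rewrite's extra decode, and reading "javascript flavour" as the URL-safe base-64 alphabet is an assumption.

```python
import base64
from urllib.parse import quote, unquote


def double_encode(value: str) -> str:
    """Percent-encode twice so that one unwanted decode still leaves an encoded value."""
    return quote(quote(value, safe=""), safe="")


def b64url_encode(value: str) -> str:
    """Encode with the URL-safe base-64 alphabet and drop the '=' padding."""
    raw = base64.urlsafe_b64encode(value.encode("utf-8"))
    return raw.decode("ascii").rstrip("=")


def b64url_decode(token: str) -> str:
    """Restore the padding and decode a token produced by b64url_encode."""
    padding = "=" * (-len(token) % 4)
    return base64.urlsafe_b64decode(token + padding).decode("utf-8")


original = "a/b"

# Workaround 1: double-encode on the sending side.
sent = double_encode(original)       # what goes into the link
after_rewrite = unquote(sent)        # stand-in for the spurious decode
recovered = unquote(after_rewrite)   # the receiving script's normal decode

# Workaround 2: base-64, so there is nothing for the rewrite to decode.
token = b64url_encode(original)
restored = b64url_decode(token)
```

With double encoding, the value that reaches the query string is still percent-encoded once, so the slash survives. With base-64, the token contains only letters, digits, "-" and "_", so a stray urldecode leaves it unchanged, at the price of decoding it by hand on the receiving end.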
