codem - blog

Posts Tagged ‘apache’

Zeus and crazy URL rewriting

Recently I had a SilverStripe project go live on a host running the “Zeus” web server. I’d never worked with this beast before and I can see why: a bog-standard URL rewrite using mod_rewrite in Apache turned into a multi-line horror.

Here’s the Apache rewrite, pretty standard stuff that ships with SilverStripe:

<IfModule mod_rewrite.c>
	RewriteEngine On
	RewriteBase /
	RewriteCond %{REQUEST_URI} ^(.*)$
	RewriteCond %{REQUEST_FILENAME} !-f
	RewriteRule .* sapphire/main.php?url=%1&%{QUERY_STRING} [L]
</IfModule>

Here’s the Zeus equivalent:

insensitive match URL into $ with ^/.*\.(gif|jpg|jpeg|png|css|js|swf|php|html|htm|pdf)$
if not matched
	insensitive match URL into $ with ^/(.*)$
	if matched
		look for file at $1
		if not exists
			# Set the default page to be displayed if the URL is not a file or resource
			set URL = /sapphire/main.php?url=$1
			goto END
		endif
	endif
endif

Notice the case-insensitive matching, the Leaning Tower of Ifs and the file extension matching. It works, but not very nicely. For instance, I initially missed the “swf” extension and no Flash files loaded, so it seems to automatically shove anything that doesn’t match the extension list down the pipe with a 404 header, even though the file exists on the server.
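
To make the control flow easier to follow, here is a rough Python model of the Zeus rule above. It is only a sketch of the decision logic, not of how Zeus evaluates anything internally: the function name `first_rule` and the `file_exists` callback are made up for illustration, and the callback stands in for Zeus's "look for file" step.

```python
import re

# Same extension list as the Zeus rule, matched case-insensitively.
STATIC_EXT = re.compile(
    r"^/.*\.(gif|jpg|jpeg|png|css|js|swf|php|html|htm|pdf)$",
    re.IGNORECASE,
)


def first_rule(url, file_exists):
    """Return the URL after applying the rule (hypothetical helper)."""
    # Outer "if not matched": listed extensions skip the rewrite entirely.
    if STATIC_EXT.match(url):
        return url
    # Inner match: capture everything after the leading slash as $1.
    m = re.match(r"^/(.*)$", url, re.IGNORECASE)
    if m and not file_exists(m.group(1)):
        return "/sapphire/main.php?url=" + m.group(1)
    return url
```

With a `file_exists` that always says no, `first_rule("/assets/Banner.SWF", ...)` comes back untouched because of the extension list, while `first_rule("/about-us/", ...)` is sent to `/sapphire/main.php?url=about-us/`. Drop "swf" from the list and the Flash file depends entirely on the file lookup succeeding.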

Edit: I’ve found a better way to do this, which started with reading some Zeus/Apache rewrite translations at this Drupal page.

RULE_0_START:
# get the document root
map path into SCRATCH:DOCROOT from /
# initialize our variables
set SCRATCH:ORIG_URL = %{URL}
set SCRATCH:REQUEST_URI = %{URL}

# see if there are any queries in our URL
match URL into $ with ^(.*)\?(.*)$
if matched then
  set SCRATCH:REQUEST_URI = $1
  set SCRATCH:QUERY_STRING = $2
endif
RULE_0_END:

RULE_1_START:
# prepare to search for file, rewrite if it's not found
set SCRATCH:REQUEST_FILENAME = %{SCRATCH:DOCROOT}
set SCRATCH:REQUEST_FILENAME . %{SCRATCH:REQUEST_URI}

# check to see if the file requested is an actual file or
# a directory with possibly an index.  don't rewrite if so
look for file at %{SCRATCH:REQUEST_FILENAME}
if not exists then
  look for dir at %{SCRATCH:REQUEST_FILENAME}
  if not exists then
    set URL = /sapphire/main.php?url=%{SCRATCH:REQUEST_URI}
    goto QSA_RULE_START
  endif
endif

# if we made it here then it's a file or dir and no rewrite
goto END
RULE_1_END:

QSA_RULE_START:
# append the query string if there was one originally
# the same as [QSA,L] for apache
match SCRATCH:ORIG_URL into % with \?(.*)$
if matched then
  set URL = %{URL}&%{SCRATCH:QUERY_STRING}
endif
goto END
QSA_RULE_END:
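
If the labels and gotos are hard to read, the same three steps can be sketched in Python. This is my own paraphrase of the script, not anything Zeus ships: `rewrite` and its `docroot` argument are hypothetical names, and the filesystem checks stand in for "look for file" and "look for dir".

```python
import os
import re


def rewrite(url, docroot):
    """Return the URL a request would end up at (hypothetical helper)."""
    # RULE_0: split the query string off, using the same greedy regex.
    request_uri, query_string, has_query = url, "", False
    m = re.match(r"^(.*)\?(.*)$", url)
    if m:
        request_uri, query_string, has_query = m.group(1), m.group(2), True

    # RULE_1: real files and directories are served as they are.
    filename = docroot.rstrip("/") + request_uri
    if os.path.isfile(filename) or os.path.isdir(filename):
        return url

    rewritten = "/sapphire/main.php?url=" + request_uri

    # QSA_RULE: put the original query string back, like [QSA,L].
    if has_query:
        rewritten += "&" + query_string
    return rewritten
```

For a docroot that contains only `style.css`, `rewrite("/style.css?v=2", docroot)` is left alone, while `rewrite("/about-us/?flush=1", docroot)` becomes `/sapphire/main.php?url=/about-us/&flush=1`.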

That’s half the problem. After changing this I noticed that the base path in SilverStripe was set to “/index.php”, which broke all the links to the various CSS and JS files. To get around this, make this change in your _config.php file to force the base path:

//force base URL for zeus
Director::setBaseURL('/');

And you should end up with a functioning site mimicking the (simpler) Apache setup.

In any case, even if it does work, my advice is this: anything that makes mod_rewrite even harder to implement, by turning a 4-line rewrite into a 44-line monster, should be avoided.

cache this, Internet Explorer

In my last post about the vagaries of Internet Explorer, I touched upon a technical solution we’d implemented to get Internet Explorer respecting our cache directives.

As we are using Apache, we load its mod_expires module, which lets us specify how long the user agent should cache the images. In this case we use a LocationMatch, with a regular expression matching the paths where our assets are located:

<LocationMatch "^(/path/to/images|/another/path/to/images)">
   FileETag All
   <IfModule mod_expires.c>
     ExpiresActive On
     ExpiresByType image/jpg "access plus 14 days"
     ExpiresByType image/jpeg "access plus 14 days"
     ExpiresByType image/gif "access plus 14 days"
     ExpiresByType image/png "access plus 14 days"
   </IfModule>
</LocationMatch>

In this instance, we’re telling the client to cache jpg, png and gif images for 14 days. We are also ETag-ing these files, another useful measure to force the point on IE.
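
To see what “access plus 14 days” amounts to on the wire, here is a small Python sketch that works out the two headers mod_expires is documented to set, Expires and the max-age part of Cache-Control. The function name `expiry_headers` is made up for illustration, and this only mimics the arithmetic, not Apache itself.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def expiry_headers(access_time, days=14):
    """Headers for 'access plus N days' (hypothetical helper).

    access_time must be a timezone-aware datetime in UTC.
    """
    max_age = int(timedelta(days=days).total_seconds())
    expires = access_time + timedelta(seconds=max_age)
    return {
        "Cache-Control": f"max-age={max_age}",
        # HTTP dates are always expressed in GMT.
        "Expires": format_datetime(expires, usegmt=True),
    }


if __name__ == "__main__":
    print(expiry_headers(datetime.now(timezone.utc)))
```

Fourteen days is 1209600 seconds, so an image requested at noon UTC on 1 January 2009 would carry `Cache-Control: max-age=1209600` and `Expires: Thu, 15 Jan 2009 12:00:00 GMT`.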

Even with this in place, it wouldn’t be Internet Explorer without yet another of its quirks to deal with. If you gzip or deflate content in Apache but exclude images from this compression (for obvious reasons), Internet Explorer will refuse to cache those uncompressed images if a Vary header is being sent to the client. Luckily both mod_gzip and mod_deflate have workarounds for this:

mod_deflate (for Apache 2)

 SetOutputFilter DEFLATE
 SetEnvIfNoCase Request_URI  \.(?:gif|jpe?g|png)$ no-gzip dont-vary
 SetEnvIfNoCase Request_URI  \.(?:exe|t?gz|zip|bz2|sit|rar)$ no-gzip dont-vary

mod_gzip (for Apache 1.3)

 mod_gzip_on    Yes
 mod_gzip_send_vary    Off
 mod_gzip_dechunk  Yes
 mod_gzip_item_include file      \.(html?|txt|css|js|php)$
 mod_gzip_item_include mime      ^text/.*
 mod_gzip_item_include mime      ^application/x-javascript.*
 mod_gzip_item_exclude mime      ^image/.*
 mod_gzip_item_exclude rspheader ^Content-Encoding:.*gzip.*

The relevant entries that turn off the Vary header are “mod_gzip_send_vary Off” for mod_gzip and “dont-vary” for mod_deflate. You’ll know it’s working when responses for images in the relevant locations carry Expires and Cache-Control headers but no Vary header.
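
If you’d rather not eyeball the headers by hand, a check along these lines would do. It is a minimal sketch using only the Python standard library; `fetch_headers` and `image_caching_ok` are names I’ve made up, and the URL you pass in is whatever image you want to test.

```python
import urllib.request


def fetch_headers(url):
    """Send a HEAD request and return the response headers as a dict."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request) as response:
        return dict(response.headers.items())


def image_caching_ok(headers):
    """True if Expires and Cache-Control are present and Vary is absent."""
    # Header names are case-insensitive, so compare in lower case.
    names = {name.lower() for name in headers}
    return (
        "expires" in names
        and "cache-control" in names
        and "vary" not in names
    )
```

Calling `image_caching_ok(fetch_headers(url))` on one of your image URLs should return True once the configuration above is live, and False while a Vary header is still being sent.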

With this in place, Internet Explorer speeds up no end! The flip side, from a development point of view, is that more people will hang on to their crusty old browser for longer. Patience is required for the next 6 months or so.