Thursday, April 16, 2015

Using Apache's Swiss Army Knife - mod_rewrite for RESTful endpoints (part III)

This is the last blog in my series on Apache's mod_rewrite.  In my previous blogs I've explained a little bit about how you can use this Apache module to redirect URLs both internally and externally.

Using Apache's Swiss Army Knife - Part II

I've provided a few examples and talked about some of the gotchas that I've encountered.  Today we're going to discuss how to use mod_rewrite to provide an API endpoint that is an alias for your web server's name without creating another virtual host.  In most cases it makes sense to create another virtual host configuration file to implement your application API.  However, in the event you don't have access to Apache's configuration files, you can use mod_rewrite rules in an .htaccess file to achieve nearly the same result.


Before I begin describing what some might consider a bit of a hack, I should point you to a good piece of documentation on mod_rewrite.  Knowing where not to use mod_rewrite is as important as knowing all of the many things that it can do.  Like Perl, Apache modules provide a lot of redundant tools that will allow you to achieve the same end result.  TIMTOWTDI.  Take a look at this page for more information about when to avoid mod_rewrite.

http://httpd.apache.org/docs/2.2/rewrite/avoid.html

As you may recall, we have a theoretical application hosted at http://www.mysite.com and we'd like to create a RESTful API with an endpoint of http://api.mysite.com.  Let's further suppose that, for whatever reason, we do not want to create a separate virtual host.

First, you'll need to make sure that you add api.mysite.com as a ServerAlias in Apache's VirtualHost configuration section.  Even if you don't have access to Apache's virtual host configuration file, most hosting companies give you some mechanism to set up server aliases.

You'll also need to make sure your server alias resolves to the desired IP address.  Once you've verified those two steps, you can use some mod_rewrite magic.
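
If you do have access to the virtual host configuration, the alias might look something like the sketch below.  The port and DocumentRoot are assumptions for illustration; only the two host names come from our example.

```apache
# Hypothetical virtual host for the example site.
# The port and DocumentRoot below are assumptions, not taken from a real setup.
<VirtualHost *:80>
    ServerName   www.mysite.com
    # Requests addressed to api.mysite.com are served by this same virtual host
    ServerAlias  api.mysite.com
    DocumentRoot /var/www/mysite
</VirtualHost>
```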

RewriteCond

As noted, we'll be using rewrite rules to redirect the endpoint.  Apache's rewrite rules can be conditionally applied by using the RewriteCond directive.  One or more conditions precede a rewrite rule.  These conditions are logically ANDed by default.  Apache will apply the rewrite rule only if all of the conditions match.  You can change this behavior by using the [OR] flag on the conditions to tell Apache to logically OR the conditions so that the rewrite rule will be applied if ANY of the conditions match.
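
Here is a minimal sketch of the difference between the default AND behavior and the [OR] flag, written for virtual host context.  The host name services.mysite.com and the /status paths are made up for illustration.

```apache
# Conditions are ANDed by default: the host AND the method must both match
RewriteCond %{HTTP_HOST} ^api\.mysite\.com$
RewriteCond %{REQUEST_METHOD} ^GET$
RewriteRule ^/status$ /status.html [L]

# With [OR] on the first condition, the rule applies if EITHER host matches
RewriteCond %{HTTP_HOST} ^api\.mysite\.com$ [OR]
RewriteCond %{HTTP_HOST} ^services\.mysite\.com$
RewriteRule ^/status$ /status.html [L]
```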

The RewriteCond directive has two arguments: a test string and a condition string.  The test string can contain any of:

RewriteRule backreferences ($N, 0 <= N <= 9), which refer to groups captured by the pattern of the rewrite rule that follows the conditions
RewriteCond backreferences (%N, 1 <= N <= 9), which refer to groups captured by the last matched RewriteCond
Server variables (%{HTTP_HOST}, %{SERVER_NAME}, etc.)
RewriteMap expansions (${mapname:key|default}), which let you look up a value in a hash or map of key/value pairs

The condition string is typically a PCRE (Perl Compatible Regular Expression), with some special variants described in the mod_rewrite documentation.  I tend to stick with the regular expressions.
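
To make the different kinds of test strings a little more concrete, here is a rough sketch for virtual host context.  The /sites and /public paths, the map name apimap and its file location are all hypothetical, and the three rules are independent examples rather than a rule set meant to be used together.

```apache
# Server variable as the test string; the parenthesized group is captured.
# %1 is the RewriteCond backreference (the subdomain),
# $1 is the RewriteRule backreference (the request path).
RewriteCond %{HTTP_HOST} ^([^.]+)\.mysite\.com$
RewriteRule ^/(.*)$ /sites/%1/$1 [L]

# A RewriteRule backreference as the test string.  The rule's pattern is
# matched before its conditions are evaluated, so $1 is available here.
RewriteCond $1 !^internal/
RewriteRule ^/(.*)$ /public/$1 [L]

# A RewriteMap lookup.  RewriteMap itself must be declared in server or
# virtual host context; the key falls back to "unknown" if it is not found.
RewriteMap apimap txt:/etc/apache2/apimap.txt
RewriteRule ^/([^/]+)/(.*)$ /api.roc/$2?api=${apimap:$1|unknown} [QSA,L]
```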

Rewriting our RESTful Endpoint

Using RewriteCond and a rewrite rule, we can easily create a new endpoint for our API that will invoke our Bedrock page that implements our RESTful API.

RewriteEngine On
UseCanonicalName On

RewriteCond %{HTTP_HOST} api\.mysite\.com
RewriteRule ^/([^/]*)/(.*)$ http://%{SERVER_NAME}/api.roc/$2/?api=$1 [P,QSA,L]

Our rewrite condition tests the host name (api.mysite.com) and then applies a rewrite rule with some special flags (more on that later).  The rewrite rule looks for a URI of the form:

/api-name/extra-path-info

...and then rewrites that URI to:

/api.roc/extra-path-info?api=api-name
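
One gotcha if you really are limited to an .htaccess file: in per-directory context mod_rewrite strips the directory prefix, including the leading slash, before matching the pattern, and UseCanonicalName is only valid in server config, virtual host and directory contexts.  A minimal sketch of an .htaccess variant, assuming the file lives in the document root and hard-coding www.mysite.com in place of %{SERVER_NAME}, might look like this:

```apache
# .htaccess in the document root (per-directory context).
# Assumes mod_rewrite and mod_proxy are enabled and that overrides
# for rewrite rules are allowed by the hosting configuration.
RewriteEngine On

# Only act on requests addressed to the API alias
RewriteCond %{HTTP_HOST} api\.mysite\.com
# No leading slash in the pattern: it has already been stripped.
# The canonical host name is written out explicitly.
RewriteRule ^([^/]*)/(.*)$ http://www.mysite.com/api.roc/$2/?api=$1 [P,QSA,L]
```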

Now our Bedrock page can easily interpret this API endpoint and perform tests to decide what to do.  For example, we can test the query string variable api to disambiguate the API service we want to invoke.

<if $input.api --eq 'customer'>
...
<elseif $input.api --eq 'product'>
...
</if>

We can also look at the extra-path-info using the $env.BEDROCK_PATH_INFO variable to get additional context or API parameters needed by the API depending on how you have designed your endpoints.

<null:args $env.BEDROCK_PATH_INFO.split('/')>

Given an API endpoint that looks like this:

http://api.mysite.com/customer/1234

Our rewrite magic and Bedrock parsing would yield:

$input.api => customer

$env.BEDROCK_PATH_INFO => /1234

$args =>  [
  [0] .. ()
  [1] .. (1234)
  ]

The $input object, which holds our query parameters, contains the name of the API we wish to invoke.  The $env object contains an environment variable (BEDROCK_PATH_INFO) that reveals the extra path information.  Splitting that value with the scalar's split() method creates an array named $args, whose first element is empty because the path begins with a slash.

Helpful Flags [P,QSA,L]

In our rewrite rule we added three flags: P, QSA and L.  The P flag, which we mentioned in an earlier blog, tells mod_rewrite to "Force the substitution URL to be internally sent as a proxy request".  Note that in order to use this flag for internal redirects you'll need to enable mod_proxy, along with mod_proxy_http for http:// targets (you do not need to set ProxyRequests On however!).  You should also be careful when constructing the substitution URL of a rule that uses the P flag, since any part of it that is taken from the request lets the client influence which URLs your server will proxy to.
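
If you manage the server yourself, enabling the proxy modules might look roughly like this.  The module file paths are assumptions and vary by distribution; many hosting setups already load these modules for you.

```apache
# Module paths are assumptions; check your distribution's layout
LoadModule proxy_module      modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so

# Not needed for the [P] flag; leaving forward proxying off is the safer default
ProxyRequests Off
```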

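The QSA flag tells Apache to append the original request's query string to the query string in the substitution (api=$1 in our rule) instead of discarding it.  The L flag stops processing of the current set of rewrite rules, although it is redundant here because the P flag already implies it.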
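
As a worked example of what QSA changes, consider a request that already carries a query string.  The format=json parameter is hypothetical and used only for illustration.

```apache
# Request:      http://api.mysite.com/customer/1234?format=json
#
# With QSA:     /api.roc/1234/?api=customer&format=json
# Without QSA:  /api.roc/1234/?api=customer
#               (the original query string is replaced, so format=json is lost)
RewriteCond %{HTTP_HOST} api\.mysite\.com
RewriteRule ^/([^/]*)/(.*)$ http://%{SERVER_NAME}/api.roc/$2/?api=$1 [P,QSA,L]
```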

We use Apache's UseCanonicalName directive so that the %{SERVER_NAME} variable is the ServerName as defined in the virtual host configuration file.  The variable is used in the rewrite rule to send the proxied request to our Bedrock page.  Because that request is addressed to www.mysite.com rather than api.mysite.com, it no longer matches the RewriteCond, so the rule is not applied a second time.  In effect we are proxying to http://www.mysite.com/api.roc.



Conclusion

mod_rewrite is one of Apache's most powerful and complex modules.  It is indeed the Swiss Army knife of modules.  To find out more about mod_rewrite, visit the Apache documentation.
