Saturday, April 4, 2015

Using Apache's Swiss Army Knife - mod_rewrite for RESTful endpoints

As I've been exploring Bedrock for creating RESTful APIs, I've found that one requirement for a well-crafted API is a set of URIs that make sense.  I've read the phrase "you don't write RESTful APIs, you DESIGN them" in more than one blog.  This usually boils down to exposing the API in a logical and hierarchical manner.  Apache's mod_rewrite is a versatile tool for creating meaningful, well-crafted endpoints that also work with Bedrock.


In a typical application we might need to create a RESTful service for accessing customer information.  To retrieve a list of customers we might consider a URI like:

http://www.mysite.com/api/customers

...or we might want to retrieve information about a specific customer with a customer identifier of 1234 with a URI like:

http://www.mysite.com/api/customers/1234

Our goal would be to use Bedrock to create our service, so how do we translate these URIs into Bedrock page requests?

Using mod_rewrite to rewrite URIs

In order to use Bedrock to provide RESTful endpoints, we'll need a bit of alchemy to translate a URI into a URL that is understood in the context of Bedrock's scripting language. Typically this means we want to rewrite the request so that it targets the specific .roc or .jroc file, along with the query parameters Bedrock needs to satisfy the request.

In our example we'll create a Bedrock page that returns JSON output.  We therefore create a Bedrock file named customers.jroc that will allow us to either retrieve a list of customers or a specific customer's data, depending on whether or not a query parameter (id) is present in the request. Here's my attempt at creating a quick and dirty Bedrock page to provide customer information in JSON format.

<sink>
  <if $input.id>
    <sqlselect "select * from customer where id = ?"
               --bind=$input.id
               --define-var="result"></sqlselect>
  <else>
    <sqlselect "select * from customer"
               --define-var="result"></sqlselect>
  </if>
</sink><var --json $result>

The quick and dirty snippet above lacks a lot of things you should include in a RESTful service (error handling, etc.).  For the purposes of this exercise, however, we'll assume you've already read the five part series.

Suppose we want our API endpoints to look like the ones we defined above:

http://www.mysite.com/api/customers
http://www.mysite.com/api/customers/1234

We'll need to translate these endpoints into these URLs:

http://www.mysite.com/customers.jroc
http://www.mysite.com/customers.jroc?id=1234

Apache's mod_rewrite allows us to interpret URIs and rewrite them for just this purpose.  To use the module, we turn on the rewrite engine by using the RewriteEngine directive in Apache's configuration file (either .htaccess, your virtual host configuration or server configuration as appropriate).

RewriteEngine On

Now we add rewrite rules to translate our programmer friendly URIs into URLs.  Rewrite rules contain a regular expression pattern, a rewrite target and optional flags.

RewriteRule pattern target flags

RewriteRule ^sales/(.*)$ /cgi-bin/sales.cgi/$1 [QSA]

The module can be used to interpret and rewrite the URI in a variety of ways and in a variety of contexts (VirtualHost, Directory).  What gets matched in your regular expressions depends on the context of the rewrite rule.  From the Apache website:

What is matched?

In VirtualHost context, The Pattern will initially be matched against the part of the URL after the hostname and port, and before the query string (e.g. "/app1/index.html").
In Directory and htaccess context, the Pattern will initially be matched against the filesystem path, after removing the prefix that led the server to the current RewriteRule (e.g. "app1/index.html" or "index.html" depending on where the directives are defined).
If you wish to match against the hostname, port, or query string, use a RewriteCond with the %{HTTP_HOST}, %{SERVER_PORT}, or %{QUERY_STRING} variables respectively.

What exactly is matched is important to understand if you use aliases to simplify asset paths.  For example you might use /icons to access /var/www/icons/small.

Alias /icons /var/www/icons/small

<Directory /var/www/icons/small>
  RewriteEngine On
  RewriteRule ^foo\.png /foobar.png
...
</Directory>

Your rewrite rule for http://www.mysite.com/icons/foo.png would be matching against:

foo.png

...not /icons/foo.png
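
To make the prefix stripping concrete, here is a small Python sketch (not Apache code) that mimics what the quoted documentation describes for Directory context.  The helper name per_directory_input is made up for illustration, and the sketch assumes the Alias has already mapped /icons/foo.png to /var/www/icons/small/foo.png.

```python
import re


def per_directory_input(filesystem_path, directory):
    """Return the part of the path a per-directory RewriteRule would see.

    Hypothetical helper: it only mimics the documented behaviour of
    stripping the directory prefix; it is not part of Apache.
    """
    prefix = directory.rstrip("/") + "/"
    if not filesystem_path.startswith(prefix):
        raise ValueError("path is not inside the directory")
    return filesystem_path[len(prefix):]


# Assumed result of the Alias mapping for a request to /icons/foo.png
mapped = "/var/www/icons/small/foo.png"
seen_by_rule = per_directory_input(mapped, "/var/www/icons/small")

# The pattern from the example rule matches the stripped path...
print(seen_by_rule, bool(re.match(r"^foo\.png", seen_by_rule)))
# ...but would not match the original URL path.
print("/icons/foo.png", bool(re.match(r"^foo\.png", "/icons/foo.png")))
```

This is why the pattern in the Directory section is anchored on foo.png rather than on /icons/foo.png.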

Note that once mod_rewrite translates your URI, the result is handed back to Apache's URL parsing engine (although you can tell Apache not to do this using optional flags on your rewrite rules).  That's important because our Bedrock files need to be recognized by Apache and passed to the appropriate handler (Bedrock).  You do need to be careful about how you craft your rewrite rules in order to avoid recursively defined rules that cause looping.

Our Bedrock Rewrite Rules

Let's recap.  We want to translate...

http://www.mysite.com/api/customers
http://www.mysite.com/api/customers/1234

...into something like these URLs:

http://www.mysite.com/customers.jroc
http://www.mysite.com/customers.jroc?id=1234

In order to accomplish the above, we add these rewrite rules to our virtual host configuration file in our document directory section.

<Directory /var/www/htdocs>
...
  RewriteEngine On
  RewriteRule ^api/customers/?$ /customers.jroc
  RewriteRule ^api/customers/([0-9]+)$ /customers.jroc?id=$1
...
</Directory>
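
Before dropping rules like these into a live configuration, it can help to check the patterns offline.  The following is a rough Python sketch that applies the same two regular expressions to the path as it would appear in Directory context (no leading slash).  It is only an approximation: the rewrite function and RULES table are made up for illustration, and the sketch stops at the first matching rule, whereas Apache's real rule processing has more moving parts.

```python
import re

# (pattern, target) pairs mirroring the two RewriteRule lines above.
# Apache's $1 back-reference is written as \1 in Python.
RULES = [
    (re.compile(r"^api/customers/?$"), "/customers.jroc"),
    (re.compile(r"^api/customers/([0-9]+)$"), r"/customers.jroc?id=\1"),
]


def rewrite(path):
    """Return the rewritten URL for path, or None if no rule matches."""
    for pattern, target in RULES:
        match = pattern.search(path)
        if match:
            return match.expand(target)
    return None


for path in ("api/customers", "api/customers/", "api/customers/1234",
             "api/customers/abc"):
    print(path, "->", rewrite(path))
```

A path such as api/customers/abc falls through both rules, so in this sketch it is left alone rather than rewritten.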

Using PATH_INFO

mod_rewrite is indeed a Swiss Army Knife...and so is Bedrock!  TIMTOWTDI, so another, less complex approach is to skip the query string entirely: pass the extra path information along to Bedrock and let Bedrock figure out what to do with it.  Change the rewrite rules as follows:

RewriteRule ^api/customers/?$ /customers.jroc
RewriteRule ^api/customers/([^/]*)/?$ /customers.jroc/$1


...and here's how we use the extra path information and interpret the URIs using Bedrock.

<sink>
  <null:path_info --default=$env.PATH_INFO $env.BEDROCK_PATH_INFO>

  <if $path_info --re '^/(?<id\>\\d+)\/?$'>
    <sqlselect "select * from customer where id = ?"
               --bind=$id
               --define-var="result"></sqlselect>
  <else>
    <sqlselect "select * from customer"
               --define-var="result"></sqlselect>
  </if>
</sink><var --json $result>

Note how we use the named capture group (?<id>...) in the regular expression to create the $id variable.
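
If named capture groups are new to you, here is the same idea as a Python sketch rather than Bedrock.  Python spells the group (?P<id>...) instead of (?<id>...), and the customer_id function is a hypothetical name used only for this illustration.  It assumes the extra path information arrives as a string such as /1234, or is empty when the list of customers is requested.

```python
import re

# A leading slash, one or more digits captured as "id", an optional trailing slash.
PATH_INFO_RE = re.compile(r"^/(?P<id>\d+)/?$")


def customer_id(path_info):
    """Return the customer id found in path_info, or None if there is none."""
    match = PATH_INFO_RE.match(path_info or "")
    return match.group("id") if match else None


for path_info in ("/1234", "/1234/", "", "/abc"):
    print(repr(path_info), "->", customer_id(path_info))
```

When the pattern does not match, the function returns None, which corresponds to the else branch above that selects every customer.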

What About Updating Customers?

So far in our example we assumed the service only needed to fetch information, but of course we might also need to update or insert customers.  We can support operations other than fetching (GET) by inspecting $env.REQUEST_METHOD and writing a snippet for each operation.

<if $env.REQUEST_METHOD --eq 'GET'>
...
<elseif $env.REQUEST_METHOD --eq 'PUT'>
...
<elseif $env.REQUEST_METHOD --eq 'POST'>
...
<elseif $env.REQUEST_METHOD --eq 'DELETE'>
...
</if>

For more information about using Bedrock to create RESTful services check out the five part series.
Next time we'll discuss how we can use the same virtual host with a different ServerAlias to expose our RESTful API as http://api.mysite.com using more mod_rewrite magic.
