sUTL as a Template Language

When you talk to APIs, read a JSON structure from a datastore, or just work in a dynamic language, you end up handling complex JSON-compatible data structures (i.e. dictionaries, lists and simple types). Picking through them in imperative or even functional code can be a pain in the butt.

If you want to generate some HTML, or similar, you would usually use a templating language. For example, say I've got this chunk of JSON from Facebook:

{
  "id": "X999_Y999",
  "from": {
    "name": "Tom Brady",
    "id": "X12"
  },
  "message": "Looking forward to 2010!",
  "actions": [
    {
      "name": "Comment",
      "link": "http://www.facebook.com/X999/posts/Y999"
    },
    {
      "name": "Like",
      "link": "http://www.facebook.com/X999/posts/Y999"
    }
  ],
  "type": "status",
  "created_time": "2010-08-02T21:27:44+0000",
  "updated_time": "2010-08-02T21:27:44+0000"
}

Figure 1: A facebook message

I could use the Mustache templating language with this template:

<html>
<title>Facebook Messages #1</title>
<body>
  <h1>A message retrieved from facebook:</h1>
  <hr>
  <span>From: {{from.name}}</span>
  <span>Message: {{message}}</span>
  <hr>
</body>
</html>

I would generate the following result:

<html>
<title>Facebook Messages #1</title>
<body>
  <h1>A message retrieved from facebook:</h1>
  <hr>
  <span>From: Tom Brady</span>
  <span>Message: Looking forward to 2010!</span>
  <hr>
</body>
</html>

Which looks like this in a browser:

Just add a blink tag to be ready for GeoCities!

In JavaScript I can include Mustache, load my Mustache template from a file somewhere, use JSON.parse to read in the source_data, then do something like this (to_html is the call in older mustache.js releases; newer ones use Mustache.render for the same job):

var html = Mustache.to_html(template, source_data);

instead of writing lots of nasty imperative code.

If you're an experienced programmer who builds anything to do with the web, you will have found yourself doing this kind of thing many times, using some template language or another. Separating the declarative template from other code is a win from a maintainability perspective, and makes it easier to generate more complex html.

But say I don't want to generate html. Instead, I'd like to change the shape of the data above, to make it suitable to pass to a function in my code that expects something specific. For example:

function ProcessMessage(message_obj) 
{
  // expects an object, with the 
  // attributes "fullname" and "message"
  ...
}

I've got that big chunk of parsed JSON in some code, and I need to pass it to ProcessMessage(). I'm back to writing a bunch of nasty imperative code. Why can't I use a template in this situation?
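For comparison, here is roughly what the hand-written version looks like. This is only a sketch: reshapeMessage is a hypothetical helper name of my own, and the defensive checks are my guess at what real code would need when parts of the source can be missing.

```javascript
// Hypothetical helper (not part of any library): reshape the Figure 1
// message by hand into the object ProcessMessage() expects.
function reshapeMessage(source) {
  var result = {};
  // Guard each level, because any part of the source may be missing.
  if (source && source.from && typeof source.from.name === "string") {
    result.fullname = source.from.name;
  }
  if (source && typeof source.message === "string") {
    result.message = source.message;
  }
  return result;
}

// A cut-down copy of the Figure 1 data.
var sampleMessage = {
  "id": "X999_Y999",
  "from": { "name": "Tom Brady", "id": "X12" },
  "message": "Looking forward to 2010!"
};

console.log(reshapeMessage(sampleMessage));
```

Two fields is bearable. The trouble is that every extra field, and every extra level of nesting, adds another guard and another assignment, and the shape of the result gets buried in the control flow.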

sUTL is designed to be that templating solution.

sUTL is a transform language, and I talk about transforms rather than templates. But transforms and templates are the same kind of thing, as is a function; they are all black boxes that take input and produce output.

Given the data from Figure 1, above, ProcessMessage() wants the following data:

{
  "fullname": "Tom Brady",
  "message": "Looking forward to 2010!"
}

How do we get there? With this transform:

{
  "fullname": "^$.from.name",
  "message": "^$.message"
}

If you've ever used JSONPath, you'll note that sUTL can completely replace JSONPath, but it can also return much more interesting results, by combining paths together inside literals as above.

In JavaScript, I could then include sUTL and use the following code to evaluate the transform:

var source_data = {
  ...
}
var transform = {
  "fullname": "^$.from.name",
  "message": "^$.message"
}
var msg = sUTL.evaluate(source_data, transform, {})
var result = ProcessMessage(msg)

which is very similar to the Mustache example above.

So what's happening in this transform?

You'll notice that, like a template, we are basically specifying what we want as a result, and plugging in variables where we want to pull in something from the source_data.

Those variables are called "paths" in sUTL. A path selects a subset of a chunk of data. You can read all about paths here, but the gist of it is this:

A path starts with "&" or "^". Read "&" as "Select the list of things matching the following..." and "^" as "Select the head of the list of things matching the following...".
"$" refers to the Source. ie: Start with source_data.
After this, we have indices separated with dots. Each index is treated as a string, if it's pathing into a dictionary, or an integer if it's pathing into a list.

So "^$.from.name" says: select the head of the list of things matching the following; start from the source, dereference by the index "from", then dereference by the index "name".

Looking at source_data, we can see that this path gives us the string "Tom Brady". Our selection actually does something roundabout: it selects the list containing that result as its only item, then returns us the head. That is, it constructs this:

[ "Tom Brady" ]

then returns the head of that, "Tom Brady".

It does this so that paths can support wildcards.
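The path rules above, plus the idea of plugging paths into a literal, can be sketched in a few lines of JavaScript. To be clear, this is not sUTL's implementation, just my illustration of the semantics described so far. The function names (selectList, evaluatePath, evaluateTransform) are made up for this sketch, it has no wildcard support, and returning null when nothing matches is an assumption on my part.

```javascript
// Illustrative only: NOT sUTL's own code, and no wildcard support.

// Return the list of things matching a sequence of indices,
// starting from `source`. No match gives an empty list.
function selectList(source, indices) {
  var current = source;
  for (var i = 0; i < indices.length; i++) {
    var isContainer = current !== null && typeof current === "object";
    // Array positions are property keys too, so "1" indexes into a list.
    if (!isContainer ||
        !Object.prototype.hasOwnProperty.call(current, indices[i])) {
      return [];
    }
    current = current[indices[i]];
  }
  return [current];
}

// Evaluate one path string such as "^$.from.name" or "&$.from.name".
function evaluatePath(source, path) {
  var parts = path.slice(1).split("."); // drop the leading "^" or "&"
  if (parts[0] !== "$") {
    throw new Error("this sketch only supports paths starting at $");
  }
  var matches = selectList(source, parts.slice(1));
  if (path[0] === "&") {
    return matches; // "&": the whole list of matches
  }
  // "^": the head of the list. Assumption: null when nothing matched.
  return matches.length > 0 ? matches[0] : null;
}

// Walk a transform literal, replacing every path string with its result.
function evaluateTransform(source, transform) {
  if (typeof transform === "string") {
    var isPath = transform[0] === "^" || transform[0] === "&";
    return isPath ? evaluatePath(source, transform) : transform;
  }
  if (Array.isArray(transform)) {
    return transform.map(function (item) {
      return evaluateTransform(source, item);
    });
  }
  if (transform !== null && typeof transform === "object") {
    var result = {};
    Object.keys(transform).forEach(function (key) {
      result[key] = evaluateTransform(source, transform[key]);
    });
    return result;
  }
  return transform; // numbers, booleans and null pass through unchanged
}

var demoSource = {
  "from": { "name": "Tom Brady", "id": "X12" },
  "message": "Looking forward to 2010!"
};
var demoTransform = {
  "fullname": "^$.from.name",
  "message": "^$.message"
};
console.log(evaluateTransform(demoSource, demoTransform));
```

Note how selectList always builds a list, even for a single match, and "^" just takes the head afterwards. That keeps the two path prefixes down to one mechanism.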

Wildcards

Say I'd like to get the list of "names" of "actions" using a transform. ie: I want a list of strings, all the "name" attributes of dictionaries in the "actions" list.

"^$.actions" would give me the list of dictionaries with attributes "name" and "link". But I can use a wildcard to just get the list of names. Here's a transform showing this, with a bunch of other variations:

We get this result:

{
  "nameinalist": [
    "Tom Brady"
  ],
  "actions": [
    "Comment",
    "Like"
  ],
  "oneaction": "Comment",
  "allnames": [
    "Tom Brady",
    "Comment",
    "Like"
  ]
}

Let's break down these results.

nameinalist: This shows that if you use "&" instead of "^" in a path with no wildcards, you'll get your result in a list. Probably not what you want, but sometimes it might be.
actions: "&$.actions.*.name" dereferences the source_data by "actions", then we have a single asterisk. That matches everything at that level (all elements of an array, all keys in a dictionary), in this case all the dictionaries in the "actions" array. Below that, we select the attribute "name". So the result is a list of all "name" elements from the dictionaries in the actions list.
oneaction: This is the same as "actions", but uses "^" instead of "&". This selects the head of the list of things matching $.actions.*.name, ie: the string "Comment".
allnames: Here I'm using a double asterisk wildcard, which is recursive. I'm starting at the root of source_data, then traversing it looking for any attributes called "name", and returning everything I find in a list.
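The breakdown above can be sketched as a small path matcher that carries a list of matches through every step, which is exactly why the list-then-head design pays off. Again, this is my own illustration and not sUTL's code. The helper names are hypothetical, treating "**" as "this node plus everything beneath it" is my assumption, and the paths used for nameinalist and allnames ("&$.from.name" and "&$.**.name") are inferred from the descriptions above rather than copied from the original transform.

```javascript
// Illustrative only: a sketch of wildcard matching, not sUTL's own code.

function isContainer(value) {
  return value !== null && typeof value === "object";
}

// Direct children of a list or dictionary, in order.
function children(value) {
  if (!isContainer(value)) return [];
  return Object.keys(value).map(function (key) { return value[key]; });
}

// The node itself followed by everything beneath it, depth first.
function selfAndDescendants(value) {
  var found = [value];
  children(value).forEach(function (child) {
    found = found.concat(selfAndDescendants(child));
  });
  return found;
}

// Apply one index to every node matched so far.
function step(nodes, index) {
  var next = [];
  nodes.forEach(function (node) {
    if (index === "*") {
      next = next.concat(children(node));
    } else if (index === "**") {
      next = next.concat(selfAndDescendants(node));
    } else if (isContainer(node) &&
               Object.prototype.hasOwnProperty.call(node, index)) {
      next.push(node[index]);
    }
  });
  return next;
}

function evaluateWildcardPath(source, path) {
  var indices = path.slice(1).split("."); // drop the leading "^" or "&"
  if (indices[0] !== "$") {
    throw new Error("this sketch only supports paths starting at $");
  }
  var matches = indices.slice(1).reduce(step, [source]);
  if (path[0] === "&") return matches;
  // Assumption: the head of an empty list is null.
  return matches.length > 0 ? matches[0] : null;
}

// The Figure 1 data.
var figure1 = {
  "id": "X999_Y999",
  "from": { "name": "Tom Brady", "id": "X12" },
  "message": "Looking forward to 2010!",
  "actions": [
    { "name": "Comment", "link": "http://www.facebook.com/X999/posts/Y999" },
    { "name": "Like", "link": "http://www.facebook.com/X999/posts/Y999" }
  ],
  "type": "status",
  "created_time": "2010-08-02T21:27:44+0000",
  "updated_time": "2010-08-02T21:27:44+0000"
};

var wildcardResult = {
  nameinalist: evaluateWildcardPath(figure1, "&$.from.name"),
  actions: evaluateWildcardPath(figure1, "&$.actions.*.name"),
  oneaction: evaluateWildcardPath(figure1, "^$.actions.*.name"),
  allnames: evaluateWildcardPath(figure1, "&$.**.name")
};
console.log(JSON.stringify(wildcardResult, null, 2));
```

Run against the Figure 1 data, this should print the same four values as the result shown earlier. Because every step maps a list of matches to a new list of matches, plain indices, "*" and "**" all fit the same loop.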

Conclusion

sUTL is useful as a templating language for data, where we specify the result we want using paths in place of literal data. It lets you think about your data in terms of what you're trying to achieve, and write a declarative transform that you can separate out from the rest of your code, in an analogous fashion to an HTML templating language like Mustache or Smarty or Jinja.

But sUTL goes a lot deeper; it's a Turing Complete language in its own right. I'll add more articles to this blog, showing how to achieve more complex results with sUTL. The next one will be about processing lists of data. Sign up, and stay tuned!