Lessons

I have a problem. It's not an unusual problem, and I wish to tackle it precisely because of its ubiquity. The problem is, roughly, this: I have a large chunk of code which does some blocking work. I wish to augment this code so that it can be used with Twisted, and I wish to do so in a way that satisfies the following conditions:

  1. A minimum of source code is changed.
  2. There are no deep hacks which are not trivial to explain when taken one by one.
  3. The caller may use either the Twisted or non-Twisted interfaces at their leisure.
  4. At every border of the module, the only difference between the Twisted and non-Twisted results should be one that maybeDeferred can undo.

These conditions should not be difficult to achieve, and yet they seem to constantly stymie many would-be IRC bot authors. I'm going to see if I can improve on this with a couple of lessons from Haskell.

So, first, let's consider why we cannot simply remove data from Twisted interfaces. It's elementary: Deferred computations cannot have their results accessed directly. Instead, actions have to be lifted up into a Deferred, which will run the action when it is ready.

To a Haskell programmer, this sounds quite a bit like how IO behaves, and indeed, Deferred is a Functor (and Monad) that cannot be unwrapped. So, with this in mind, let's look a bit at what kind of interface this would be in Haskell. First, let's consider the type of our computation:

computation :: a -> b

That is, we are taking some data of type a and returning some data of type b. The trick here, for those of you not well-versed in Haskell, is that computation may not do anything outside of these types; it cannot perform I/O or have any impure effects. (Okay, fine, I mean, you can perform horrid hacks to do those things, but remember our rules: No deep hackery here.)

Now, let's consider the type of the Twisted-style computation.

deferredComputation :: Deferred a -> Deferred b

That is, we're taking a Deferred containing data of type a, and returning a Deferred with data of type b.

Now, here's the fun part. We want to generate deferredComputation from computation, to achieve code reuse. How? Well, let's use fmap, since Deferred is a Functor!

computation :: a -> b

fmap :: Functor f => (a -> b) -> f a -> f b

deferredComputation :: Deferred a -> Deferred b
deferredComputation = fmap computation

Hey hey! Pretty cool, right? This might seem super-obvious to people with lots of Haskell experience, but I think it's still worth repeating since not everybody has done this sort of thing before.

And now we return to the land of Python. Python-land. It's time to construct this thing in Python, as well. So, how do we lift a function up into a Deferred in Python?

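def computation(a):
    b = transform(a)  # stand-in for the real work that turns a into b
    return b

def deferredComputation(deferred):
    # addCallback mutates the Deferred in place; its eventual result
    # becomes whatever computation returns.
    deferred.addCallback(computation)
    return deferred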

Think about this for a second. Remember, Deferred objects carry state around with them, so we need this "impure" sort of approach, which is really not actually impure but just object-at-a-time. If you're unsure of exactly what this snippet's doing, go through it one bit at a time.

  1. Take a Deferred which will fire with a value of type a.
  2. Append a callback which transforms a into b.
  3. Return a Deferred which will fire with a value of type b.

Now, let's make this concrete with an example. Let's say that we've got a system that has two implementations of a client, one which is synchronous, and one which is asynchronous. We've isolated and split out these clients such that they have exactly the same setup functions, and they return exactly the same data, with one single difference: One client is blocking and returns the data, and the other client is non-blocking and returns a Deferred which will fire with the data. This is exactly the difference that maybeDeferred can paper over. We've got all of the code set up just the way we want it, according to those conditions I listed earlier.

But! These clients only make up a couple dozen lines of code. There are still thousands of lines of code that only work with the synchronous client. How do we make them work with Twisted without losing our synchronous abilities?

Let's create a helper which will apply the computation to the data. This helper will come with the client and will be tailored to the client's output. For unlifted non-Twisted data, this is simply classic function application, known to Haskellers as ($):

($) :: (a -> b) -> a -> b
f $ a = f a

def apply(f, a):
    return f(a)

Note that my apply is not Python 2's builtin apply function, which expects a sequence of arguments and unpacks it, calling f(*args) rather than f(args).
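To see the difference concretely, here is a small sketch. The tuple and the choice of len are only an illustration, and f(*args) stands in for what Python 2's builtin apply(f, args) would do, since that builtin is gone in Python 3.

```python
def apply(f, a):
    # Plain function application: a is passed through as a single argument.
    return f(a)


args = (1, 2, 3)

# This apply hands the whole tuple to len as one argument.
whole = apply(len, args)

# Unpacking the tuple, as Python 2's builtin apply would, calls
# len(1, 2, 3), which len rejects.
try:
    len(*args)
    unpacked_ok = True
except TypeError:
    unpacked_ok = False
```

So for anything iterable, the two spellings can mean quite different calls, which is why the helper here is defined by hand.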

And for the Deferred-handling case, let's create a slightly more interesting applier which will continue to move data through the Deferred. We already wrote this above, actually, and in Haskell, it would be fmap:

fmap :: Functor f => (a -> b) -> f a -> f b

def deferredApply(f, deferred):
    deferred.addCallback(f)
    return deferred

And now we're ready to put everything together! Here's a small skeleton:

class SyncClient(object):
    @staticmethod
    def applier(f, value):
        return f(value)

    def request(self, s):
        return sync_library_call(s)

class AsyncClient(object):
    @staticmethod
    def applier(f, deferred):
        deferred.addCallback(f)
        return deferred

    def request(self, s):
        return async_library_call(s)

def computation(data):
    transform(data)
    poke(data)
    return data

def request_and_compute(client, resource):
    data = client.request(resource)
    return client.applier(computation, data)

Look at request_and_compute. It has no idea whether it's handling synchronous or asynchronous data, and it doesn't really care; it asks the client to actually apply the computation to the data. And the computation itself is totally unaware of things going on around it. It doesn't even have to be pure; it could do all kinds of side effects with that data. The only requirement for the computation is that it must remember to return the data so that subsequent computations can access it.
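Here is a runnable sketch of that skeleton exercised with both clients. Everything Twisted-shaped in it is an assumption made so the example stands alone: TinyDeferred is a toy that mimics only addCallback and callback (it is not Twisted's Deferred and has no errbacks), sync_library_call and async_library_call are fake library calls, and the computation merely tags a dict.

```python
class TinyDeferred(object):
    """Toy stand-in for a Deferred (an assumption, not Twisted's class)."""

    _unset = object()

    def __init__(self):
        self.result = self._unset
        self.callbacks = []

    def addCallback(self, f):
        self.callbacks.append(f)
        self._run()
        return self

    def callback(self, result):
        self.result = result
        self._run()

    def _run(self):
        # Once a result exists, feed it through each pending callback;
        # every callback's return value replaces the result.
        while self.result is not self._unset and self.callbacks:
            self.result = self.callbacks.pop(0)(self.result)


pending = []  # Deferreds handed out by the fake asynchronous library


def sync_library_call(s):
    # Hypothetical blocking call: the answer is available immediately.
    return {"resource": s}


def async_library_call(s):
    # Hypothetical non-blocking call: the answer is delivered later.
    d = TinyDeferred()
    pending.append((d, {"resource": s}))
    return d


class SyncClient(object):
    @staticmethod
    def applier(f, value):
        return f(value)

    def request(self, s):
        return sync_library_call(s)


class AsyncClient(object):
    @staticmethod
    def applier(f, deferred):
        deferred.addCallback(f)
        return deferred

    def request(self, s):
        return async_library_call(s)


def computation(data):
    data["seen"] = True
    return data  # returned so that later computations can use it


def request_and_compute(client, resource):
    data = client.request(resource)
    return client.applier(computation, data)


# Synchronous: the computed value comes straight back.
sync_result = request_and_compute(SyncClient(), "node1")

# Asynchronous: we get a Deferred back and observe it with a callback.
async_results = []
d = request_and_compute(AsyncClient(), "node1")
d.addCallback(async_results.append)

# Nothing has run yet; now let the fake library deliver its answers.
for deferred, value in pending:
    deferred.callback(value)
```

The same request_and_compute body serves both clients; only the applier decides whether the computation runs now or when the Deferred fires.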

This is the approach I'm taking in a new library I'm hacking together for Ganeti, called Gentleman. I think it'll work out well.

~ C.

Last modified on 2012-09-21 15:09:00

Be Prepared

I like Flask. No, really, I do. Yesterday, during a lightning talk, I claimed that I love it, and if I don't love it, then at least I love the form and function of it. I wouldn't marry it, since I don't think being married to a microframework for Web applications would provide any tax benefits. Maybe I'm getting off-topic?

Flask is built on Werkzeug, and directly uses Werkzeug routes for view lookup and URL building. As a result, anything that can be done to Werkzeug can be done to Flask. A little-known ability of Werkzeug is the ability to add new URL converters. An example and explanation are provided in the Werkzeug documentation on converters. I decided to build some cool converters which would automate some of the work I have to do when working with certain objects.

Without further ado, I would like to present ModelConverter, a class which can convert a segment of a URL representing a text field on a model into an instance of that model, and vice versa.

from __future__ import with_statement

from werkzeug.routing import BaseConverter, ValidationError

from sqlalchemy.orm.exc import MultipleResultsFound, NoResultFound

class ModelConverter(BaseConverter):
    """
    Converts a URL segment to and from a SQLAlchemy model.

    Rather than use an initializer, this class should be subclassed and
    have the `app`, `model`, and `field` class attributes filled in. `app`
    is the Flask application, used to set up a request context for the
    query; `model` is the Flask-SQLAlchemy model to use for queries; and
    `field` is the field on the model to use for lookups.

    The field to use should be Unicode or bytes.
    """

    def to_python(self, value):
        try:
            with self.app.test_request_context():
                obj = self.model.query.filter_by(**{self.field: value}).one()
        except (MultipleResultsFound, NoResultFound):
            raise ValidationError()

        return obj

    def to_url(self, value):
        return getattr(value, self.field)

This particular flavor uses an inheritance-based approach in order to avoid clobbering BaseConverter's initializer, but a compositional approach works too. A make_model_converter convenience function can provide the glue needed to specialize the converter. To apply it to the Flask application, merely modify the URL map after application creation:

from converters import make_model_converter
from models import Character

app = Flask(__name__)

app.url_map.converters["character"] = make_model_converter(app, Character,
    "slug")

And now you can create cool things along the lines of:

@app.route("/<character:c>")
def character(c):
    return render_template("character.html", c=c)
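The make_model_converter helper used above is not shown here. One plausible shape for it, and not necessarily what DCoN's converters.py actually does, is a function that builds a subclass with the class attributes filled in. In this sketch ModelConverter is a bare placeholder so the example runs on its own, and Character and the "the-app" string are hypothetical stand-ins for the real model and Flask application.

```python
class ModelConverter(object):
    # Placeholder so this sketch is self-contained; in the application
    # this would be the BaseConverter subclass defined earlier.
    app = None
    model = None
    field = None


def make_model_converter(app, model, field):
    """Build a ModelConverter subclass bound to one app, model, and field."""
    name = "%sConverter" % model.__name__
    attrs = {"app": app, "model": model, "field": field}
    # type() creates the subclass on the fly; the base class is untouched.
    return type(name, (ModelConverter,), attrs)


class Character(object):
    # Hypothetical stand-in for the Flask-SQLAlchemy model.
    pass


CharacterConverter = make_model_converter("the-app", Character, "slug")
```

Returning a class rather than an instance matters, because the URL map's converters dictionary holds converter classes and Werkzeug instantiates them itself when it builds the routing rules.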

There is one caveat with this technique: the model instances retrieved this way will be detached from SQLAlchemy and the current session will not know about them. If you need to look up any lazily-loaded data on the models, you will need to add them to the current session first. For example, assuming Character.friends is a lazily-loaded one-to-many mapping:

@app.route("/<character:c>/friends")
def character_and_friends(c):
    db.session.add(c)
    return render_template("friends.html", c=c, friends=c.friends)

Today's snippets are all real-world snippets from DCoN, and can be seen in the converters.py and views.py source files.

~ C.

Last modified on 2012-02-15 20:34:00

Japanese Standing Cat

I recently purchased and installed a standing desk. While I normally don't blog about things in my personal life, I figured that this was permissible since it directly affects my ability to write code.

My standing desk, fully loaded

This standing desk is a Fredrik "computer work station," but let's be honest, here: it's a standing desk. It's pretty sturdy and removes the need to hack together various Ikea desks to produce reasonably-scaled tabletops. Pictured here is my desk, in its natural habitat. My workstation and hacked-apart AGP box both have their own monitor, and there is plenty of tabletop room for anything I need to have at my hands.

For those in the audience who aren't yet aware of standing desks, and don't want to read Wikipedia on standing desks, here are the abridged notes:

  • Pros
    • Reduces back stress and pain after a few months
    • Reduces risk of DVT
    • When monitors are elevated, reduces neck stress and pain
  • Cons
    • Slightly more expensive than sitting desks
    • Hard to find
    • Causes foot and ankle stress for the first few weeks
    • Turns one into a hipster programmer

So, on the bright side, this desk has definitely improved my back pain. But, if I start writing Node.js code soon, we'll all know who to blame.

~ C.

Last modified on 2012-01-31 11:42:00
