Monte: Typhon and SmallCaps

Two years of having a real job have made me bitter and grumpy, so I'm gonna stop with the cutesy blog post titles.

Today, I'm gonna talk a bit about Monte. Specifically, I'm going to talk about Typhon's current virtual machine. Typhon used to use an abstract syntax tree interpreter ("AST interpreter") as its virtual machine. It now uses a variant of the SmallCaps bytecode machine. I'll explain how and why.

When I started designing Typhon, I opted for an AST VM because it seemed like it matched Monte's semantics well. Monte is an expression language, which means that every syntactic node is an expression which can be evaluated to a single value. Monte is side-effecting, so evaluation must happen in an environment which can record the side effects. An AST interpreter could have an object representing each syntactic node, and each node could be evaluated within an environment to produce a value.

class Node(object):

    def evaluate(self, environment):
        # Each concrete node type overrides this to evaluate itself within
        # the given environment and return the resulting value.
        pass

Subclasses of Node can override evaluate() to have behavior specific to that node. The Assign class can assign values to names in the environment, the Object class can create new objects, and the Call class can pass messages to objects. This technique is amenable to RPython, since Node.evaluate() has the same signature for all nodes.
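
For a rough illustration (not Typhon's actual classes; the attribute names and the environment and receiver methods here are made up), a couple of such subclasses might look like this:

class Assign(Node):
    """
    Sketch: evaluate the right-hand side, then record it under a name.
    """

    def __init__(self, name, rvalue):
        self._name = name
        self._rvalue = rvalue

    def evaluate(self, environment):
        value = self._rvalue.evaluate(environment)
        # Hypothetical environment API; the point is that the side effect
        # lands in the environment, not in the node.
        environment.assign(self._name, value)
        return value


class Call(Node):
    """
    Sketch: evaluate a receiver and its arguments, then pass a message.
    """

    def __init__(self, receiver, verb, args):
        self._receiver = receiver
        self._verb = verb
        self._args = args

    def evaluate(self, environment):
        receiver = self._receiver.evaluate(environment)
        args = [arg.evaluate(environment) for arg in self._args]
        # Hypothetical object API for passing a message to an object.
        return receiver.call(self._verb, args)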

How well does this work with the RPython just-in-time ("JIT") compiler? It does alright. The JIT has little problem seeing that the nodes are compile-time constants, and strategically applied promotion lets it see into things like Call nodes, which causes method calls to be inlined relatively well; a rough sketch of that trick follows below. However, there are some problems.
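
Before getting to the problems, here is that promotion trick sketched on the hypothetical Call node from above. This isn't Typhon's actual code; only promote() is the real RPython hook:

from rpython.rlib.jit import promote

class Call(Node):

    def __init__(self, receiver, verb, args):
        self._receiver = receiver
        self._verb = verb
        self._args = args

    def evaluate(self, environment):
        receiver = self._receiver.evaluate(environment)
        # Promotion turns the receiver's class into a trace-time constant,
        # letting the JIT resolve the message send and inline the target
        # method into the trace.
        promote(receiver.__class__)
        args = [arg.evaluate(environment) for arg in self._args]
        return receiver.call(self._verb, args)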

The first problem is that Monte has no syntactic loops. Since RPython's JIT is designed to compile loops, some extra footwork is needed to set up the JIT. I took Monte's looping object, __loop, and extended it to detect when its argument is likely to yield a JIT trace. The detection is simple: it checks whether the loop body is a user-defined object, and it only enters the JIT when that is the case. In theory, most bodies will be user-defined, since Monte turns this:

object SubList:
    to coerce(specimen, ej):
        def [] + l exit ej := specimen
        for element in l:
            subguard.coerce(element, ej)
        return specimen

…into this:

object SubList:
    method coerce(specimen, ej):
        def via (__splitList.run([0])) [l] exit ej := specimen
        var validFlag__6 := true
        try:
            __loop.run([l, object _ {
                method run(_, value__8) {
                    __validateFor.run([validFlag__6])
                    subguard.coerce([value__8, ej])
                    null
                }
            }])
        finally:
            validFlag__6 := false
        specimen

This example is from Typhon's prelude. The expansion of loops, and of other syntax, is defined to turn the full Monte language into a smaller language, Kernel-Monte, which resembles Kernel-E. Now, in this example, the nonce loop object generated by the expander is quite specialized and probably couldn't be simplified any further. It is theoretically possible that an optimizer could detect loops which call a single method repeatedly and simplify them more aggressively, but in practice that optimization doesn't exist, so Typhon assumes that all loop bodies are user-defined and allows the JIT to trace every loop.
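
On the Typhon side, the plumbing looks very roughly like the sketch below. The names monteLoop, isUserObject, makeIterator, next, and run are invented for illustration, and the real driver's green and red variables differ, but JitDriver and jit_merge_point are the actual RPython hooks:

from rpython.rlib.jit import JitDriver

loopDriver = JitDriver(greens=["body"], reds=["iterator", "ejector"])

def monteLoop(iterable, body, ejector):
    iterator = iterable.makeIterator()
    # Only bother the JIT when the body is a user-defined object; builtin
    # bodies rarely contain enough work for a trace to pay off.
    shouldTrace = isUserObject(body)
    while True:
        if shouldTrace:
            loopDriver.jit_merge_point(body=body, iterator=iterator,
                                       ejector=ejector)
        # The iterator fires the ejector when it runs out of values, which
        # unwinds control out of this loop.
        values = iterator.next(ejector)
        body.run(values)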

The next hurdle has to do with names. Monte's namespaces are static, which means that it's always possible to know in advance which names are in a scope. This is a powerful property for compilers, since it opens up many kinds of compilation and lowering which aren't available to languages with dynamic namespaces like Python. However, Typhon doesn't have access to the scoping information from the expander, only the AST. This means that Typhon has to redo most of the name and scope analysis in order to figure out things like how big namespaces should be and where to store everything. I initially did all of this at runtime, in the JIT, but it was very slow. This is because the JIT has problems seeing into dictionaries, and it cannot trust that a dictionary is actually constant-size or constant-keyed.
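
Concretely, the environment at that stage was more or less a chain of dictionary-backed scopes, along these lines (a simplification, not Typhon's exact class):

class Environment(object):
    """
    Sketch: a dictionary-backed scope. The JIT cannot assume that the dict
    has a fixed set of keys, so every name lookup stays an opaque dict
    access inside a trace.
    """

    def __init__(self, parent=None):
        self.parent = parent
        self.bindings = {}

    def bind(self, name, slot):
        self.bindings[name] = slot

    def lookup(self, name):
        env = self
        while env is not None:
            if name in env.bindings:
                return env.bindings[name]
            env = env.parent
        raise KeyError(name)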

RPython does have an answer to this, called virtual and virtualizable objects. A virtual object is one that is never constructed within a JIT trace, because the JIT knows that the object won't leave the trace and can be elided entirely. (The literature talks at length about "escape analysis", the formal term for this.) Typhon's AST nodes did generate virtual objects, but only rarely, because most objects are assigned to names in the environment, and the JIT refuses to ignore modifications to the environment.

Virtualizable objects solve this problem neatly. A virtualizable object has one or more attributes, called virtualizable attributes, which can be "out-of-sync" during a JIT trace. A virtualizable can delay being updated, as long as it's updated at some point before the end of the JIT trace. RPython allows fields of integers and floating-point values to be virtualizable, as well as constant-size lists. However, Typhon uses mappings of names for its environments, backed by dictionaries, and dictionaries can't be virtualizable.

The traditional solution to this problem involves assigning an index to every name within a scope, and using a constant-size list with the indices. I did this, but it was arduous and didn't have the big payoff that I had wanted. Why not? Well, every AST node introduces a new scope! This caused scope creation to be expensive. I realized that I needed to compute closures as well.
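
For reference, the shape of that fix in RPython looks roughly like this sketch. The class and field names are invented, but _virtualizable_ and the JitDriver's virtualizables argument are the real hooks, and the locals list has to be constant-size:

from rpython.rlib.jit import JitDriver

class Frame(object):
    """
    Sketch: locals live in a constant-size list indexed by slot number, so
    the JIT can keep the whole list in registers during a trace and write
    it back lazily when the trace exits.
    """

    _virtualizable_ = ["locals[*]"]

    def __init__(self, size):
        self.locals = [None] * size

frameDriver = JitDriver(greens=["node"], reds=["frame"],
                        virtualizables=["frame"])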

Around this time, a few months ago, I began to despair because debugging the AST VM is hard. RPython's JIT logging and tooling are all based around the assumption that there is a low-level virtual machine of some sort which has instructions and encapsulation, and the AST was just too hard to manage in this way. I had had to invent my own tracebacks, my own logging, and my own log viewer. This wasn't going well. I wanted to join the stack-based VM crew and not have piles of ASTs to slog through whenever something wasn't going right. So, I decided to try to implement SmallCaps, from E. E, of course, is the inspiration for Monte, and shares many features with Monte. SmallCaps was based on old Smalltalk systems, but was designed to work with unique E features like ejectors.

So, enough talk, time for some code. First, let's lay down some ground rules. These are the guiding semantics of SmallCaps in Typhon. Keep in mind that we are describing a single-stack automaton with a side stack for exception handling and an environment with frames for local values and closed-over "global" values.

  • All expressions return a value. Therefore, an expression should always compile to some instructions which start with an empty stack and leave a single value on the stack.
  • All patterns perform some side effects in the environment and return nothing. Therefore, they should compile to instructions which consume two values from the stack (specimen and ejector) and leave nothing.
  • When an exception handler is required, it is pushed onto the side stack, and every handler must be dropped once it's no longer needed.

With these rules, the compiler's methods become very obvious.
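
Throughout the following snippets, the compiler argument is an accumulator along these rough lines. This is a sketch, not Typhon's actual Compiler, but it shows the important part: literals, frame names, locals, and call atoms are collected into tables while instructions are appended, so everything can be referred to by index:

class Compiler(object):
    """
    Sketch of the accumulator threaded through compile(). Instructions are
    (opcode, index) pairs; each index points into one of the tables below.
    """

    def __init__(self):
        self.instructions = []
        self.literals = []
        self.frame = []
        self.locals = []
        self.atoms = []

    def addInstruction(self, name, index):
        self.instructions.append((name, index))

    def literal(self, literal):
        # Record the literal and emit the LITERAL instruction for it.
        index = len(self.literals)
        self.literals.append(literal)
        self.addInstruction("LITERAL", index)
        return index

    def addFrame(self, name):
        # Names closed over from outer scopes go into the frame table.
        if name not in self.frame:
            self.frame.append(name)
        return self.frame.index(name)

    def addLocal(self, name):
        # Names bound in this scope are assigned local slot indices.
        if name not in self.locals:
            self.locals.append(name)
        return self.locals.index(name)

    def call(self, verb, arity):
        # Message sends become CALL instructions indexing an interned
        # (verb, arity) pair.
        atom = (verb, arity)
        if atom not in self.atoms:
            self.atoms.append(atom)
        self.addInstruction("CALL", self.atoms.index(atom))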

class Str(Node):
    """
    A literal string.
    """

    def compile(self, compiler):
        index = compiler.literal(StrObject(self._s))

Strings are compiled into a single LITERAL instruction that places a string on the stack. Simple enough.

class Sequence(Node):
    """
    A sequence of nodes.
    """

    def compile(self, compiler):
        for node in self._l[:-1]:
            node.compile(compiler)
            compiler.addInstruction("POP", 0)
        self._l[-1].compile(compiler)

Here we compile sequences of nodes by compiling each node in the sequence, and then using POP to remove each intermediate node's result, since they aren't used. This nicely mirrors the semantics of sequences, which are to evaluate every node in the sequence and then return the value of the ultimate node's evaluation.

This also shows off a variant on the Visitor pattern which Allen, Mike, and I are calling the "Tourist pattern", where an accumulator is passed from node to node in the structure and recursion is directed by each node. This makes managing the Expression Problem much easier, since nodes completely contain all of the logic for each accumulation, and it also simplifies certain transformations. More on that in a future post.

class FinalPattern(Pattern):

    def compile(self, compiler):
        # [specimen ej]
        if self._g is None:
            compiler.addInstruction("POP", 0)
            # [specimen]
        else:
            self._g.compile(compiler)
            # [specimen ej guard]
            compiler.addInstruction("ROT", 0)
            compiler.addInstruction("ROT", 0)
            # [guard specimen ej]
            compiler.call(u"coerce", 2)
            # [specimen]
        index = compiler.addFrame(u"_makeFinalSlot")
        compiler.addInstruction("NOUN_FRAME", index)
        compiler.addInstruction("SWAP", 0)
        # [_makeFinalSlot specimen]
        compiler.call(u"run", 1)
        index = compiler.addLocal(self._n)
        compiler.addInstruction("BINDSLOT", index)
        # []

This pattern is compiled to insert a specimen into the environment, compiling the optional guard along the way and preserving the order of operations. The interspersed comments show the state of the stack between operations; they help me keep track of how things are compiled.

With this representation, the Compiler is able to see the names and indices of every binding introduced during compilation, which means that creating index-based frames as constant-size lists is easy. (I was going to say "trivial," but it was not trivial!)

I was asked on IRC about why I chose to adapt SmallCaps instead of other possible VMs. The answer is mostly that SmallCaps was designed and implemented by people that were much more experienced than me, and that I trust their judgement. I tried several years ago to design a much purer concatenative semantics for Kernel-E, and failed. SmallCaps works, even if it's not the simplest thing to implement. I did briefly consider even smaller semantics, like those of the Self language, but I couldn't find anything expressive enough to capture all of Kernel-E's systems. Ejectors are tricky.

That's all for now. Peace.

~ C.

