Two Great Tastes that Taste Great Together
Monads are, in a way, all about sequences of operations which depend on data living "inside" a data structure and produce instances of that data structure with new data inside. For instance, the list monad sequences operations which depend on things inside a list, and which produce lists themselves. The state monad sequences operations which depend, in a sense, on data living inside a state function and result in new state functions. The "maybe" monad sequences operations depending on values which produce "maybe" values (like (Just x) or (Nothing)).
Think about this enough and you start to look for monads whenever you deal with any kind of data structure. Recently, I've thought a lot about hygienic macros, of which the key component is "syntax", a kind of decorated representation of source code. If you write simple macros that don't do anything like variable capture, you basically don't have to worry about the distinction between syntax and datum, but when you start selectively messing up hygiene, it becomes important.
And you start to notice patterns. Specifically, you are often "unpacking" a piece of syntax, doing something to it, and then "resyntaxing" it when you are done. Can we conceptualize syntax as a monad? If we do so, do we get anything? It is useful? I suspect for many seasoned readers the first is obvious: any data structure which "holds things" has an associated monad. We'll see about the second - I actually don't know myself, and expect to find out as I write this post.
We'll be operating in my Scheme of Choice, Racket and we'll use my
better-monads. To define a monad in
you simply create an instance of the
monad struct, defined as:
(struct monad (bind return plus zero) #:transparent)
zero are optional, and you set them to
#f if you can't
provide meaningful values. That is:
(require functional/better-monads) (define the-id-monad (monad (lambda (v f) (f v)) (lambda (v) v) #f #f))
Creates the identity monad. We can then use it in an
(mlet* in: the-id-monad ((x 10) (y (+ x 11))) (return (+ x y)))
Which will be 31, because the identity monad doesn't do anything at all. The library defines and exports some common monads:
(mlet* in: the-list-monad ((x '(a b c)) (y '(q r s))) (return (list x y)))
'((a q) (a r) (a s) (b q) (b r) (b s) (c q) (c r) (c s))
If you aren't down with monads, read my tutorials on using them in Lisps. I'm told they are helpful.
The Syntax Monad
The easiest place to start with a monad is with
return, since return
is our simplest monadic function. That is, there is a family of
functions which accept values and produce monadic values and return is
often one of the simplest of these functions. What is return in the
The super naive but technically correct answer is:
(define (syntax-return item) (datum->syntax (syntax _) item))
This takes an item, any kind of datum, and wraps it into a
however, a "seed" syntax object from which to copy the syntactic
information that helps maintain (or violate) hygiene. This version of
return just creates a new
syntax object with
(syntax _) as the
seed, but this isn't very useful. We are almost always trying to
create a syntax object which borrows someone else's scope information,
say to capture a variable.
If this were Haskell, we might have to live with this sub-optimal
solution (you don't really need
return as far as I can tell, its
just a handy way to put things into the monad). However, we're using
Scheme and that means we can take advantage of our greater flexibility
to write this version:
(define (syntax-return item . rest) (match rest [(list) (syntax-return item (syntax _))] [(list seed) (datum->syntax seed item)]))
This function takes an optional second argument which can specify the
seed if need be.
better-monads doesn't care about the fact that
this return has a different signature than other returns. All that
matters is that everything works out at runtime.
Bind is often a sticking point but in the syntax monad it is very
easy to understand:
(define (syntax-bind stx stx<datum>) (stx<datum> (syntax->datum stx)))
Here we use a little notational hint I use in my code. The
part of the symbol
stx<datum> indicates that
stx<datum> is a
function waiting for a single value, the
datum, to produce
something, namely a piece of syntax (hence the
stx). The value
stx (without any
<>) is a syntax object.
Bind simply extracts
the value in the syntax using
syntax->datum and calls the function
stx<datum> on that value. The rules of our monad dictate that the
result must be some syntax, but this (again) is enforced at run-time,
not by any type-checker.
We can define our monad in two ways, depending on the behavior of
return we want. The first is:
(require functional/point-free) (define (the-syntax-monad<stx> stx) (monad syntax-bind (partial< syntax-return stx) #f #f))
This monad is parametrized by the syntax we want to use as the seed
return operations. We accomplish this by partially applying
syntax-return on the right with the value
stx passed in during the
partial< means "partially apply on the right" (as
>partial, which means "on the left") and lives in my
As I understand it, this kind of "value parametrized monad" is difficult to represent in Haskell. I've read that the "set monad" is hard to come up with for just this reason. In scheme, you just parametrize the set monad by whatever predicate you want to constitute equality.
We can also write a regular old monad:
(define the-syntax-monad (monad syntax-bind syntax-return #f #f))
Here we just expect the user to provide an appropriate seed
syntax-object when they call
return. I am almost certainly sure you
can't do this in Haskell without jumping through big hoops. Not
really a Haskell programmer, though, so its possible I'm wrong.
But Will it Blend?
Ok, so this is cute. We've got a monad over
syntax but can we use
it to do anything? Well, one application that springs to mind is
that we can use it to defined syntax-oriented versions of standard
(define (syntax-car stx) (mlet* in: (the-syntax-monad<stx> stx) ((contents stx)) (return (car contents))))
Or, using the more convenient notion of lifting a function into a monad:
(define (syntax-car stx) ((lift1 car (the-syntax-monad<stx> stx)) stx))
For functions of two arguments, this looks like:
(define (syntax-cons hd tl) ((lift2 cons (the-syntax-monad<stx> tl)) hd tl))
We can use these functions like so:
(syntax-car (syntax a b c d)) ;-> (syntax a)
(syntax-cons (syntax a) (syntax b c d)) ;-> (syntax a b c d)
In the second case, though, its important to realize that the syntax
a has its syntactic metadata replaced with the meta-data for the
tail, that is
(syntax b c d ). If we want
a to keep its own
meta-data, we should write:
(define (syntax-cons hd tl) (mlet* in: the-syntax-monad ((tl-contents tl)) (return (cons a tl-contents) tl)))
We could define this in a point free way (if we wanted to show off), as such:
(define (syntax-cons hd tl) ((lift1 (>partial cons hd) (the-syntax-monad<stx> tl)) tl))
There are other various and sundry things which it might be easier to do/define using the syntax monad, but without using it a great deal, I'll just say that it seems to blend ok.
We've shown how to define and use a
syntax monad for Scheme syntax
objects. The essence of the implementation is the realization that
syntax, as a kind of wrapper over regular Scheme data, is enough like
a container that we can write a monad over it, much like we can create
a list monad over lists.
The utility of the monad seems to be limited by the fact that syntax containers aren't just dumb boxes around pieces of data, but contain important meta-data which is intimately related to their intended use. It isn't clear to me whether, given this fact, really using the syntax monad would ever be a win, but who knows? At the very least it is an interesting example of what I guess you could call a monad parametrized over a run time value, which is an interesting idiom, anyway.
All this code is up on my git repository.