25 April 2021

Monads, part two: Okay, I guess, but what is it for?

Monads are abstract machines

Hopefully, after reading parts zero and one of this series, you understand what the monad abstraction is and how bind works. What may not be so clear yet is why anyone would ever want to work with a monad. In other words, what can you do with a monad? More importantly, what can you do with a monad that you cannot do more easily without one? This is what I'm going to try and clarify in this post.

Broadly speaking, from a software engineering perspective, a monad lets you define an abstract machine that implements a different model of computation. That may not sound so clear, so let's quickly review our three examples from last week in that light:

  • The Maybe monad changes our model of computation to add short-circuiting.
  • The Logger monad adds debugging traces.
  • The Fiber monad adds parallel execution.

To drive this point home, let's briefly look at the State monad, which, as its name suggests, adds the notion of ambient mutable state:

import java.util.function.Function;

public final class State<S, A> {
    // Simple pair class to allow us to return two values.
    public static final class P<S, A> {
        public final S s;
        public final A a;
        private P(S s, A a) {
            this.s = s;
            this.a = a;
        }
        public String toString() {
            return "P[" + s + ", " + a + "]";
        }
    }

    // Internal function. Note that this is NOT the function users `bind`.
    private final Function<S, P<S, A>> run_state;

    // Private constructor so we control all the possible values for
    // `run_state` (`pure`, `get`, `put`).
    private State(Function<S, P<S, A>> run_state) {
        this.run_state = run_state;
    }

    public <B> State<S, B> bind(Function<A, State<S, B>> f) {
        return new State<>(s0 -> {
            P<S, A> p = this.run_state.apply(s0);
            return f.apply(p.a).run_state.apply(p.s);
        });
    }

    public static <S, A> State<S, A> pure(A a) {
        return new State<>(s -> new P<>(s, a));
    }

    public static <S> State<S, S> get() {
        return new State<>(s -> new P<>(s, s));
    }

    public static <S> State<S, Void> put(S new_state) {
        return new State<>(_s -> new P<>(new_state, null));
    }

    public static <S, A> P<S, A> run(State<S, A> m, S init) {
        return m.run_state.apply(init);
    }

    // -- end state monad abstraction, start example usage

    public static final class Stack {
        private final Integer head;
        private final Stack tail;
        public Stack(Integer h, Stack t) {
            this.head = h;
            this.tail = t;
        }
        public String toString() {
            Stack s = this.tail;
            StringBuilder sb = new StringBuilder();
            sb.append("Stack(" + this.head);
            while (s != null) {
                sb.append(", " + s.head);
                s = s.tail;
            }
            sb.append(")");
            return sb.toString();
        }
    }


    public static State<Stack, Void> push(Integer a) {
        return State.<Stack>get().bind(old_stack ->
               put(new Stack(a, old_stack)));
    }

    public static State<Stack, Integer> pop() {
        return State.<Stack>get().bind(old_stack ->
               put(old_stack.tail).bind(_1 ->
               pure(old_stack.head)));
    }

    public static void main(String[] args) {
        State<Stack, Void> stack_computation =
            pop()      .bind(a ->
            pop()      .bind(b ->
            push(a + b)));

        Stack init_1 = new Stack(10, new Stack(20, null));
        Stack init_2 = new Stack(13, new Stack(256, null));

        System.out.println(run(stack_computation, init_1));
        System.out.println(run(stack_computation, init_2));
    }
}

The important points about that code sample are that:

  1. There is not a single mutation in there (ignoring the StringBuilder part), and yet
  2. The stack operations sure look like they mutate some global state.

Squinting a bit, the stack_computation variable represents a computation that could almost be written as:

a = pop();
b = pop();
push(a + b);

Note once again that, because of the third monadic law, it is perfectly acceptable to define "higher-level" operations, such as push and pop here being defined in terms of get and set. Also note that get and set are very specific to this State class, depend on its implementation details, and have nothing to do with the "Monad" abstraction itself.

It's also important to note that, while it's not part of the monad abstraction either, the run function (get in Maybe, getValue for Logger, exec in Fiber) is an integral part of making anything useful with this State class. In a sense, nothing observable happens until you try and take a value out of a monad.

Who needs an abstract machine?

So, monads define abstract machines. In order to go from there to "should I use monads?", we need to fill in a couple more missing pieces:

  • What's an abstract machine and why should I want one?
  • What are other ways of implementing an abstract machine?
  • How do I recognize problems that can be solved by an abstract machine?

So let's try to fill in those gaps. In this context, an abstract machine is essentially a different programming language. We've seen how to add tracing, short-circuiting, concurrency, and mutation to a purely functional subset of Java. There are many other possibilities; as a simple example, the List monad allows one to add non-determinism.

If you don't have monads, and you want a language with different semantics, you have a few options:

  • Your language has macros. You can define a new embedded language with the semantics you want using macros. Macro programming is usually a bit of a leaky abstraction, so the implementation can get quite messy, but it's usually still possible to get a nice integration with the host language for users of that embedded abstract machine. A great example here is how the go macro from core.async adds concurrency to the Clojure language.
  • Your language has a useable syntax for literal, nestable data structures (e.g. list, array, vector, etc.); you can define your new language in terms of those data structure. This lets you write your interpreter directly in the host language, while also letting you use the host language to generate pieces of your new language. You gain on not needing to parse (or learn) a new syntax, but writing code in your new language feels dinstinctly different.
  • Your language has dynamic late binding, enabling metaprogramming. A good example of this is the Rails framework for Ruby.
  • In some cases, especially if you have none of the above, your best bet is to implement a new programming language.

Comparatively, monads are an absurdly cheap way to define a new mini-language, which means you can use that technique much more freely and, once you master monads, you will start seeing way more opportunities to use it.

How cheap? Well, let's look back at that State class above. In those 90 lines of code, we define a mini-language to invent ambient mutable state, and then we build a second mini language on top of that that simulates a stack machine. That's two languages in less than an hundred lines (of Java, which isn't the most terse language around).

So when should you use monads? When you want different programming language semantics, but not enough that you'd be willing to implement a new language.

To a large extent, how much monads seem useful to you will depend on which language you use, and how much you value purity. If your language has freely-available side-effects everywhere and you're not bothered by that, you probably don't see much of a point in using monads to add ambient "mutable" state, tracing, of short-circuiting, as all of those can be achieved easily (in the small) with (real) mutable state. You may also already have a good enough concurrency model in your language.

If you're not already convinced that functional programming is at least interesting, and you're not striving to write most of your code using immutable values, it is likely you will not find much value in using monads.

How do I start?

When trying to learn functional programming, a good place to start is to look for opportunities to use map, filter and reduce. "Real" purely functional code is not made up only of those, but if you're currently doing mostly imperative code, looking for ways to use these three will flex the right muscles and get you on the path towards learning more functional techniques.

In the same way, if you want to start looking at the world through the monadic lens, you can try to spot places where you can use the State monad. An easy place to start is recursive functions (or groups of functions) that thread some amount of state through, but where most functions don't actually use it and only pass it on to the rest of the call stack.

The most obvious case would be if you have a handful of functions that are already defined as returning pairs of (state, value) instead of just their return value.

What does it cost?

Let's assume the preceding blogs and sections so far have you convinced that monads are interesting and you want to start using them. Should you add them to your current project?

Like most things in software engineering, or any engineering discipline for that matter, the layman's answer is "it depends", while the professional answer is a slightly more elaborate "you should use it when the benefits outweigh the costs". As an engineering discipline, software engineering is fundamentally unique in that, while we mostly follow the same cost/benefit analysis as other engineering disciplines, we do away with the nasty, unnecessary and complicated step of actually trying to objectively measure costs and benefits, and we just go with gut feeling instead, with the vague hope that through enough experience we can shape our guts to eventually get good enough feelings.

In that spirit, let me try to guide your gut through a few key points to consider when evaluating the cost of using monads in a given project.

There are mainly three sources of costs for monadic abstractions, and all of them vary wildly depending on which language you use (and by extension who you work with):

  1. Conceptual overhead,
  2. Syntactic overhead, and
  3. Implementation overhead.

The first hurdle is how much the set of people who are expected to review and maintain the code you are about to write are themselves familiar with monads. I would argue that monadic code is not fundamentally more complex than non-monadic code, just like functional code is not fundamentally more complex than imperative code, but it is definitely a different style and lack of familiarity can get in the way of understanding.

Different monads can have different levels of complexity, and while you can build additional abstractions on top of some monads (e.g. monad transformers), if you and your team are not very familiar with them yet it may be best to just start with a simple one, like Maybe,1 while everyone gets familiar with that new way to structure code.

The second hurdle is the syntactic overhead. This is extremely dependent on the language you are using. In Java, as demonstrated, it's pretty bad. Let us look back at the monadic code in the main method of our State code sample:

        State<Stack, Void> stack_computation =
            pop()      .bind(a ->
            pop()      .bind(b ->
            push(a + b)));

In Haskell, while you could translate the above code directly to explicit >>= calls, yielding:

  pop >>= (\a ->
  pop >>= (\b ->
  push (a + b)))

you can also use special syntax designed just for monads, which reads a lot better:

  do
    a <- pop
    b <- pop
    push $ a + b

in particular:

  • There is no need to close all those parentheses at the end.
  • Variable names precede their binding, just like normal assignment.
  • Unused bindings can simply be omitted (as opposed to Java, which forces us to not only declare them but give them different names).

Scala similarly has special syntax in its for instruction that can be used to implement monads in a fairly (syntactically) lightweight manner. Other languages may have more or less syntactic noise and tools to manage it. Point is, the cost here depends a lot on what language you're using.

Defining monads can also be more or less short depending on the language; as an example, the State monad defined above could be written in Haskell as:

module Main where

import Control.Monad (liftM, ap)

newtype State s a = State (s -> (s, a))

-- magic incantation to get `do` syntax
instance Functor (State s) where fmap = liftM
instance Applicative (State s) where pure = return; (<*>) = ap

instance Monad (State s) where
  return a = State $ \s -> (s, a)
  (State ma) >>= f = State $ \s0 ->
    let (s1, a) = ma s0
    in let State (g) = f a
    in g s1

put :: s -> State s ()
put s = State $ \_ -> (s, ())

get :: State s s
get = State $ \s -> (s, s)

run :: State s a -> s -> (s, a)
run (State f) a = f a

data Stack = Empty | Stack { top :: Int, rest :: Stack } deriving Show

push :: Int -> State Stack ()
push i = do
  old_stack <- get
  put (Stack i old_stack)

pop :: State Stack Int
pop = do
  old_stack <- get
  put (rest old_stack)
  return (top old_stack)

main :: IO ()
main = do
  let stack_computation =
        do
          a <- pop
          b <- pop
          push (a + b)
  let init_1 = Stack 10 (Stack 20 Empty)
  let init_2 = Stack 13 (Stack 256 Empty)

  print $ run stack_computation init_1
  print $ run stack_computation init_2

The final source of cost for monads is the implementation, both in terms of the effort to produce it and the runtime performance overhead. Monads are not (generally) free, though this is by far the hardest one of the three costs to speak of in a general sense. So I'm not going to try. If you're worried about performance, write prototypes and use profilers.


  1. Actually, don't ever use Maybe, use Either instead. It's pretty much the same, except the none constructor can carry a value, typically used to contain a possibly-structured error message of some sort.

Tags: monad-tutorial