This post is part of a series on monads. In the previous post, I presented a technique for implementing new, bespoke monads and applied that technique to well-known, standard ones. In this post, I am going to show two examples of bespoke monads I have recently crafted using this technique, comparing code with and without a monad for the same use-cases and walking through the process of defining the monadic operations.

In my post about genetic search, I presented code that used a monad to thread through the random number generator. I did not, however, show the code without a monad or explain how to come up with it. Let's think through how that would work. Let's first define a Rng type so we get some type safety rather than just passing around a [Double] everywhere. What API does it need? We'll ignore its initialization here, which means it really just needs one function: produce a random number. Because we don't have side-effects, this same function needs to return the new "state" of the random number generator (rng) so we can get another one next time. Otherwise, we'd get a constant random number, which is not the most useful kind.

data Rng = Rng [Double]

rnd :: Rng -> (Rng, Double)
rnd (Rng ds) = (Rng (tail ds), head ds)


With that, we can write a few of the functions used in that blog post. In the interest of space, we're not going to rewrite the entire thing here1, but just look at three functions. Here they are:

make_solution :: Rng -> (Rng, (Double, Double))
make_solution rng_0 =
let (rng_1, rand_x) = rnd rng_0
(rng_2, rand_y) = rnd rng_1
in (rng_2, (rand_x * 10, rand_y * 10))

mutate :: Rng -> (Double, Double) -> (Rng, (Double, Double))
mutate rng_0 (x, y) =
let (rng_1, change_x) = rnd rng_0
(rng_2, dx) = rnd rng_1
(rng_3, change_y) = rnd rng_2
(rng_4, dy) = rnd rng_3
new_x = if change_x < 0.1 then x + dx - 0.5 else x
new_y = if change_y < 0.1 then y + dy - 0.5 else y
in (rng_4, (new_x, new_y))

step :: Rng -> [(solution, Double)] -> (Rng, [(solution, Double)])
step rng_0 prev =
let survivors = take 10 prev ++ take 3 (reverse prev)
(rng_1, children) = rep rng_0 87 (\rng_0 ->
let (rng_1, parent1) = carousel rng_0 prev
(rng_2, parent2) = carousel rng_1 prev
(rng_3, child) = crossover rng_2 (fst parent1) (fst parent2)
(rng_4, child') = mutate rng_3 child
in (rng_4, fit child'))
in (rng_1, srt (survivors <> children))


And here are the monadic versions for reference:

make_solution :: WithRandom (Double, Double)
make_solution = do
rand_x <- GetRand
rand_y <- GetRand
return (rand_x * 10, rand_y * 10)

mutate :: (Double, Double) -> WithRandom (Double, Double)
mutate (x, y) = do
change_x <- GetRand
dx <- GetRand
change_y <- GetRand
dy <- GetRand
let new_x = if change_x < 0.1 then x + dx - 0.5 else x
let new_y = if change_y < 0.1 then y + dy - 0.5 else y
return (new_x, new_y)

step :: [(solution, Double)] -> WithRandom [(solution, Double)]
step prev = do
let survivors = take 10 prev ++ take 3 (reverse prev)
children <- rep 87 (do
parent1 <- carousel prev
parent2 <- carousel (parent1 Data.List.delete prev)
child <- crossover (fst parent1) (fst parent2)
fit <$> mutate child) return$ srt $survivors <> children  It's not a landslide change, and it's really easy to map individual lines between both styles. But if you have to edit one of them, in which one do you think it's the least likely you'll make a mistake in threading through the state of the random number generator? Which one is more work? If you have to add a step to the non-monadic version, will you have the patience to increment all the rng_x variables, or will you go for a rng_2' instead of rng_3 in the middle? How does that approach scale in number of lines of code? How confident are you the non-monadic step function is correct as presented? Is it easy to check at a glance whether the rng_1 on the last line is related to the one set alongside parent1? Assuming we agree the monadic style is better in this case, how do we get there? The structure of the monad is going to be determined by the operations we want to have available; in this case, we have just one operation needed: get the next random number. This operation depends only on the (implicit) state of the monad, meaning we don't need to pass it any argument, and the goal is to get back a random number, so we return a Double. This yields the monadic structure: data WithRandom a where Bind :: WithRandom a -> (a -> WithRandom b) -> WithRandom b Return :: a -> WithRandom a GetRand :: WithRandom Double  What about its interpretation? Building up a monad is all well and good, but we need a run function to give it meaning. In this case, all we really need to do is thread through a [Double], which is reminiscent of the state monad, only we're not trying to be as general. This means we can just copy over the Bind and Return case from the state monad and just make sure we thread the [Double] through; it also means our run function needs to take a [Double] as its initial state. This yields: run :: [Double] -> WithRandom a -> ([Double], a) run rands = \case Return a -> (rands, a) Bind ma f -> let (rands', a) = run rands ma in run rands' (f a) GetRand -> ???  What do we need to do with the GetRand case? Well, we need to do two things: 1. Return the next random number, and 2. Make sure we update the state so the next number is not the same. This, of course, is pretty much what the rnd function did, and is trivial to implement: run :: [Double] -> WithRandom a -> ([Double], a) run rands = \case Return a -> (rands, a) Bind ma f -> let (rands', a) = run rands ma in run rands' (f a) GetRand -> (tail rands, head rands)  The final step is to update the code to use this monad. The process is fairly easy: any function that used to take in a Rng and return a (Rng, a) now takes one fewer argument and returns a WithRandom a. As for the body of the function, simply replace all (rng_x, a) pattern matches with a <-, remove the rng parameters to function calls, and replace rnd with GetRand. (Or implement rnd as rnd = GetRand if you prefer lowercase actions.) Hopefully this example reiterates the points of the previous post: 1. Making a new monad is easy. 2. The monadic form of the code is better, because computers are better than humans at keeping track of tedious things like not accidentally losing a state update. 3. Switching code to monadic form is, in this case, pretty easy too; almost mechanical. Next, let's look at a more complex example. ### Interpreter My second example is an interpreter (see full code for reference). As I described monads as mini-language interpreters, it should come as no surprise that they can be useful in writing actual interpreters. Let's start with the language we want to interpret. We're skipping over the text-to-ast parsing phase and we define the language directly in terms of the following data definitions: newtype Name = Name String deriving (Eq, Ord, Show) newtype Value = Value Int deriving (Eq, Show) data Op = Add | Sub | Mul | NotEq deriving Show data Exp where Lit :: Value -> Exp Var :: Name -> Exp Set :: Name -> Exp -> Exp Bin :: Op -> Exp -> Exp -> Exp Do :: [Exp] -> Exp While :: Exp -> Exp -> Exp Print :: Exp -> Exp deriving Show  This is a very simple, imperative language, with a looping construct and arithmetic operations, as well as variable names. The values are always integers. As an example program, here is some code to compute a factorial: fact :: Int -> Exp fact x = let acc = Name "acc" i = Name "i" in Do [ Set acc (Lit (Value 1)), Set i (Lit (Value x)), While (Bin NotEq (Lit (Value 1)) (Var i)) (Do [ Set acc (Bin Mul (Var acc) (Var i)), Set i (Bin Sub (Var i) (Lit (Value 1))), Print (Var acc) ]) ]  Or, more precisely, this is a function which, given an integer, returns a program that will print each step of computing its factorial. We can easily evaluate this without any monad, by just walking down the tree of expressions and evaluating each node as we come back up. Let's first define what evaluation means here: the language has a Print operation, meaning we need to consider side effects. It also has Set on its variables, meaning we need to carry around some sort of environment. Fortunately, this language only has one, global scope (no let or lambda), so we do not need to worry about nesting environments. For the purpose of this blog post, we choose to represent the side-effects of the program by returning a value of type: data TweIO = Output Int TweIO | Halt deriving Show append :: TweIO -> Value -> TweIO append Halt (Value v) = Output v Halt append (Output p io) v = Output p (append io v)  and we'll assume that the environment has this API: newtype Env = Env (Data.Map Name Value) deriving Show mt_env :: Env mt_env = Env Data.Map.empty lookup :: Env -> Name -> Value lookup (Env m) n = maybe undefined id (Data.Map.lookup n m) insert :: Env -> Name -> Value -> Env insert (Env m) n v = Env$ Data.Map.insert n v m


which, among other things, implies we assume all programs are correct and setup their variables before trying to read them.

With that out of the way, a simple tree-walking evaluator can be written as:

tree_walk_eval :: Exp -> TweIO
tree_walk_eval ex =
let (_, io, _) = loop ex Halt mt_env
in io
where
loop :: Exp -> TweIO -> Env -> (Value, TweIO, Env)
loop exp0 out0 env0 =
case exp0 of
Lit v -> (v, out0, env0)
Var n -> (lookup env0 n, out0, env0)
Set n exp1 -> let (v, out1, env1) = loop exp1 out0 env0
in (v, out1, insert env1 n v)
Bin op e1 e2 ->
let (v1, out1, env1) = loop e1 out0 env0
(v2, out2, env2) = loop e2 out1 env1
in ((bin op) v1 v2, out2, env2)
Do (exps) -> foldl (\(_, out1, env1) exp1 -> loop exp1 out1 env1) (undefined, out0, env0) exps
While condition body ->
let (Value c, out1, env1) = loop condition out0 env0
in if c == 1
then do
let (_, out2, env2) = loop body out1 env1
loop (While condition body) out2 env2
else (undefined, out1, env1)
Print exp1 -> let (v, out1, env1) = loop exp1 out0 env0
in (v, append out1 v, env1)


Here, each iteration of the loop returns a triple (Value, TweIO, Env), where:

• Value is the result of the (presumably nested) expression we evaluated,
• TweIO is the output produced so far, which we need to keep track of, and
• Env is ambient state the nested call might have changed.

At this point, and especially in the context of this blog post, I'm hoping this type of structure is reminiscent of a monad. Specifically, it looks like we are trying to combine two "classic" monads: a state monad to keep track of the environment, and a logger monad to keep track of the side-effects.

So let's define a monad. The operations we need here are:

• Print something.
• Change a variable in the environment.
• Get the value of a variable from the environment.

This translates directly to a monadic structure:

data EvalExec a where
EvalBind :: EvalExec a -> (a -> EvalExec b) -> EvalExec b
EvalReturn :: a -> EvalExec a
EvalPrint :: Value -> EvalExec ()
EvalSet :: Name -> Value -> EvalExec ()
EvalLookup :: Name -> EvalExec Value


And the semantics are given by:

exec :: (Env, TweIO) -> EvalExec a -> (Env, TweIO, a)
exec (env_0, io_0) = \case
EvalBind ma f -> let (env_1, io_1, a) = exec (env_0, io_0) ma
in exec (env_1, io_1) (f a)
EvalReturn v -> (env_0, io_0, v)
EvalLookup n -> (env_0, io_0, lookup env_0 n)
EvalPrint v -> (env_0, append io_0 v, ())
EvalSet n v -> (insert env_0 n v, io_0, ())


With that monad defined, we can write another evaluation function using it:

twe_mon :: Exp -> TweIO
twe_mon exp =
let (_, io, _) = exec (mt_env, Halt) (eval exp)
in io
where
eval :: Exp -> EvalExec Value
eval = \case
Lit v -> return v
Var n -> do
v <- EvalLookup n
return v
Set n exp -> do
v <- eval exp
EvalSet n v
return v
Bin op e1 e2 -> do
v1 <- eval e1
v2 <- eval e2
return $(bin op) v1 v2 Do exps -> do mapM eval exps return undefined While condition body -> do c <- eval condition if Value 1 == c then do _ <- eval body eval (While condition body) else return undefined Print exp -> do v <- eval exp EvalPrint v return v  This looks like more code, as it makes more use of newlines, but I think there's an argument to be made for the readability of:  Bin op e1 e2 -> do v1 <- eval e1 v2 <- eval e2 return$ (bin op) v1 v2


over

      Bin op e1 e2 -> do
let (v1, out1, env1) = loop e1 out0 env0
let (v2, out2, env2) = loop e2 out1 env1
((bin op) v1 v2, out2, env2)


### What if I'm not writing in Haskell?

So far we have a good, fairly simple approach to making monads, which works great in Haskell. How does that work in other languages? I've recently come across a use-case for a monad while writing Clojure code, which I'll walk through in my next post.

1. I, however, did rewrite the entire thing without a monad, in order to make sure the code I am showing works as advertized as part of a larger whole. You can see the entire thing here.