Notes on Optimizing Clojure Code: Numerics
Pretty much every program will, at some point, need to deal with numbers. Numbers are deceptively complicated in programming: on the one hand, numerical operations are often one of the first things introduced in programming textbooks, on the vague hope that this will make it easier for beginners because "they already know maths"; on the other hand, programming with floating-point numbers is rightly considered an advanced, expert topic that very few programmers really master.
Like most programming languages, Clojure has multiple types of numbers, and knowing when to use which is important. Unlike most programming languages, Clojure by default uses type-fluid bound-checked polymorphic functions to manipulate those numbers, and understanding those is also important.
But first, a word about the compiler.
The Clojure compiler
Before we can cover how Clojure handles numerics, there are a couple important points to cover on how the compiler works.
Specifically:
- Clojure is always compiled one form at a time.
- Each form is evaluated after compilation, and can change the state of the compiler.
Another way to say the same thing is that the compiler is designed for the kind
of interactive use you get at a REPL, where it makes sense that evaluating a
def
form must change the state of your compiler so that it now knows about
one more binding.
But the Clojure compiler does not have a special mode for the REPL; it always
works the same way. It never loads a sinlge file at once; this is why Clojure
definitions must precede their use even in code files. The ns
form is
changing the current default namespace the compiler resolves things from.
Those are the obvious ones, because those are the ones everyone uses regularly.
But there are additional vars that can change the behaviour of the compiler,
and today we're specifically interested in a var called *unchecked-math*
.
Unchecked math
Unlike most programming languages (at least, unlike all the languages I know about), Clojure arithmetic operations are safe by default: arithmetic operations (including conversions) are checked for overflow.
This is nice from a correctness perspective, but it does add some level of overhead. In cases where you want native arithmetic operations, you need to explicitly disable those checks.
This can be done in two ways:
- By using the
unchecked-*
family of operations, such asunchecked-add
andunchecked-multiply
, which is fine for some use-cases but can quickly get a bit verbose. - By setting the
*unchecked-math*
var totrue
(it defaults tofalse
). This is a compiler flag, so it needs to be set before compiling the form you want to affect (and reset afterwards).
Here's an example:
(ns t.core
(:require [criterium.core :as crit])
(:gen-class))
(defn bench
[f]
(->> (crit/benchmark (f) {}) :mean first))
(defn checked
[]
(loop [total 0, steps 1000000]
(if (zero? steps)
total
(recur (+ total steps) (dec steps)))))
(defn unchecked
[]
(loop [total 0, steps 1000000]
(if (zero? steps)
total
(recur (unchecked-add total steps) (unchecked-dec steps)))))
(set! *unchecked-math* true)
(defn also-unchecked
[]
(loop [total 0, steps 1000000]
(if (zero? steps)
total
(recur (+ total steps) (dec steps)))))
(set! *unchecked-math* false)
(defn -main
[& args]
(doseq [[n f] [["checked" checked]
["unchecked" unchecked]
["also-unchecked" also-unchecked]]]
(println (format "%-15s: %.2e (%d)" n (bench f) (f)))))
Running this yields:
checked : 2.03e-03 (500000500000)
unchecked : 3.52e-04 (500000500000)
also-unchecked : 3.52e-04 (500000500000)
Which is almost a 10x difference.
Boxing
Java1 has, from the start, been designed with a weird dichotomy between "primitive" numbers and "objects", where primitives are immediate values and objects are referenced values. So when you call a function and pass an object as argument, the function operates on a copy of the pointer, which still references the original object (enabling side effects), whereas when you pass a primitive to a function the function gets a copy of the primitive, and there is no way for the function to change the value of that primitive as seen by the calling code.
Clojure is generally thought of as a dynamic ("untyped") programming language,
and that's for the most part true: Clojure functions literally compile to an
interface that exposes methods of the type invoke(Object, Object, ...)
, and
everything Clojure manipulates is generally typed, at the JVM level, as Object.
Common Clojure operations (think map
, first
, conj
, etc.) cast their
arguments to underlying Java interfaces (or abstract classes) and then call
methods on those. For example, conj
casts its first argument to
IPersistentCollection
and then calls the cons
method on that.
In general, this is not an issue, because the JVM is very good at optimizing away type casts and associated checks.
However, specifically for numerics and because of the dichotomy with primitives
at the Java level, this does mean that by default Clojure functions end up
boxing all their numbers, i.e. passing around Long
and Double
instead of
long
and double
.
The Clojure compiler makes some effort to infer the types of primitive local
variables, such as in the loops in the example above where both total
and
steps
are inferred to be primitive long
s, but it doesn't do any automatic
inference on function arguments.
When defining a function, though, one can annotate an argument to mark it as a primitive number, and the compiler will then make sure that it does not get boxed when that function is called. Similarly, one can annotate the argument vector with the return type of the function. The boxing/unboxing behaviour is typically not optimized away by the JIT, so getting rid of it at compile time can yield substantial benefits. Here's an example:
(defn boxed-add
[a b]
(+ a b))
(defn boxed
[]
(loop [total 0, steps 1000000]
(if (zero? steps)
total
(recur (boxed-add total steps)
(dec steps)))))
(defn non-boxed-add
^long [^long a ^long b]
(+ a b))
(defn non-boxed
[]
(loop [total 0, steps 1000000]
(if (zero? steps)
total
(recur (non-boxed-add total steps)
(dec steps)))))
(defn non-boxed-unchecked-add
^long [^long a ^long b]
(unchecked-add a b))
(defn non-boxed-unchecked
[]
(loop [total 0, steps 1000000]
(if (zero? steps)
total
(recur (non-boxed-unchecked-add total steps)
(unchecked-dec steps)))))
With the same kind of -main
, we get:
boxed : 9.02e-03 (500000500000)
non-boxed : 1.17e-03 (500000500000)
non-boxed-unchecked: 3.55e-04 (500000500000)
We can clearly see that boxing definitely has an impact. Surprisingly, it also
looks like the JIT cuts right through the non-boxed-unchecked-add
function
and manages to get us the same performance as inlining the addition operation.
Because everyone loves ternary logic, you can set *unchecked-math*
to
:warn-on-boxed
to get the same behaviour as true
but also print a
compile-time warning when an arithmetic operation is applied to boxed
arguments.
t.core=> (set! *unchecked-math* :warn-on-boxed)
:warn-on-boxed
t.core=> (defn add [a b] (+ a b))
Boxed math warning, /repl:1:17 - call: public static java.lang.Number
clojure.lang.Numbers.unchecked_add(java.lang.Object,java.lang.Object).
#'t.core/add
t.core=>
"Bigger" numbers
So far we've only been talking about 64-bit values, whether boxed or unboxed.
Clojure also has "native" support for three other numeric types:
clojure.lang.Ratio
, clojure.lang.BigInt
and java.math.BigDecimal
. They
all have their uses, but from a performance perspective it's usually better to
avoid them. They are inherently boxed, and they use non-native operations since
they are designed to handle things that do not fit in native operations.
They each have their own literal syntax: BigInt
uses a N
suffix (2N
),
BigDecimal
uses a M
suffix (2.0M
), and Ratio
uses the expected /
spearator between its two components (2/3
).
The "primed" operators (+'
, -'
, etc.) will auto-promote longs to BigInts,
e.g.:
t.core=> (defn fib [n]
(->> [0 1]
(iterate (fn [[a b]] [b (+' a b)]))
(map first)
(drop n)
first))
#'t.core/fib
t.core=> (fib 10)
55
t.core=> (fib 100)
354224848179261915075N
t.core=>
Because they are not good for performance, I will not say more about them here.
"Smaller" numbers
While the JVM does have support for 32-bit values (specifically float
/int
and Float
/Integer
), Clojure does not. It is not possible to type-hint a
function argument (or return value) as either float
or int
(or the even
smaller byte
and char
).
This can be worked around to some extent by using arrays instead, as Clojure
does let one define an array of any primitive type, but that should be used
only for interop as the arithmetic operations in Clojure will likely still
"upcast" int
and float
to long
and double
respectively. There are a few
unchecked-*-int
operations that work on int
s, but nothing for float
s as
far as I'm aware.
Leiningen's :global-vars
In some cases, it may be appropriate to set *unchecked-math*
globally, for an
entire project, rather than just around a single form as we've done here. There
is an inherent loss of safety, so one should make sure this is appropriate for
the project, but if it is what you want to do and you are using Leiningen,
you can add the :global-vars
entry to your project configuration, e.g.:
(defproject t "app"
:dependencies [[org.clojure/clojure "1.10.0"]
[criterium "0.4.6"]]
:global-vars {*unchecked-math* :warn-on-boxed}
:main ^:skip-aot t.core
:target-path "target/%s"
:profiles {:uberjar {:aot :all}})
You can also add it to a specific profile, just like any Leiningen property, and you can use it to set all the global compiler flags you may want to set.
If you are not using Leiningen, chances are whatever youre using still has some way of stting global compiler flags, so it may be worth looking into that.
Conclusion
Safety by default is nice in general, but when you're striving for maximum
performance it means you have to work a little bit harder. Fortunately, it's
easy to set the *unchecked-math*
flag with the appropriate scope, and
:warn-on-boxed
can readily find any number of easy places where performance
can be greatly improved with a simple annotation.
Clojure, at least as far as this series on optimizing is concerned, compiles down to JVM bytecode. JVM bytecode is not quite Java, but in terms of base types it's close enough that we can think about Java types instead of digging into JVM bytecode.
↩