6 February 2022

Notes on Optimizing Clojure Code: Numerics

Pretty much every program will, at some point, need to deal with numbers. Numbers are deceptively complicated in programming: on the one hand, numerical operations are often one of the first things introduced in programming textbooks, on the vague hope that this will make it easier for beginners because "they already know maths"; on the other hand, programming with floating-point numbers is rightly considered an advanced, expert topic that very few programmers really master.

Like most programming languages, Clojure has multiple types of numbers, and knowing when to use which is important. Unlike most programming languages, Clojure by default uses type-fluid bound-checked polymorphic functions to manipulate those numbers, and understanding those is also important.

But first, a word about the compiler.

The Clojure compiler

Before we can cover how Clojure handles numerics, there are a couple important points to cover on how the compiler works.

Specifically:

  • Clojure is always compiled one form at a time.
  • Each form is evaluated after compilation, and can change the state of the compiler.

Another way to say the same thing is that the compiler is designed for the kind of interactive use you get at a REPL, where it makes sense that evaluating a def form must change the state of your compiler so that it now knows about one more binding.

But the Clojure compiler does not have a special mode for the REPL; it always works the same way. It never loads a sinlge file at once; this is why Clojure definitions must precede their use even in code files. The ns form is changing the current default namespace the compiler resolves things from.

Those are the obvious ones, because those are the ones everyone uses regularly. But there are additional vars that can change the behaviour of the compiler, and today we're specifically interested in a var called *unchecked-math*.

Unchecked math

Unlike most programming languages (at least, unlike all the languages I know about), Clojure arithmetic operations are safe by default: arithmetic operations (including conversions) are checked for overflow.

This is nice from a correctness perspective, but it does add some level of overhead. In cases where you want native arithmetic operations, you need to explicitly disable those checks.

This can be done in two ways:

  • By using the unchecked-* family of operations, such as unchecked-add and unchecked-multiply, which is fine for some use-cases but can quickly get a bit verbose.
  • By setting the *unchecked-math* var to true (it defaults to false). This is a compiler flag, so it needs to be set before compiling the form you want to affect (and reset afterwards).

Here's an example:

(ns t.core
  (:require [criterium.core :as crit])
  (:gen-class))

(defn bench
  [f]
  (->> (crit/benchmark (f) {}) :mean first))

(defn checked
  []
  (loop [total 0, steps 1000000]
    (if (zero? steps)
      total
      (recur (+ total steps) (dec steps)))))

(defn unchecked
  []
  (loop [total 0, steps 1000000]
    (if (zero? steps)
      total
      (recur (unchecked-add total steps) (unchecked-dec steps)))))

(set! *unchecked-math* true)
(defn also-unchecked
  []
  (loop [total 0, steps 1000000]
    (if (zero? steps)
      total
      (recur (+ total steps) (dec steps)))))
(set! *unchecked-math* false)

(defn -main
  [& args]
  (doseq [[n f] [["checked" checked]
                 ["unchecked" unchecked]
                 ["also-unchecked" also-unchecked]]]
    (println (format "%-15s: %.2e (%d)" n (bench f) (f)))))

Running this yields:

checked        : 2.03e-03 (500000500000)
unchecked      : 3.52e-04 (500000500000)
also-unchecked : 3.52e-04 (500000500000)

Which is almost a 10x difference.

Boxing

Java1 has, from the start, been designed with a weird dichotomy between "primitive" numbers and "objects", where primitives are immediate values and objects are referenced values. So when you call a function and pass an object as argument, the function operates on a copy of the pointer, which still references the original object (enabling side effects), whereas when you pass a primitive to a function the function gets a copy of the primitive, and there is no way for the function to change the value of that primitive as seen by the calling code.

Clojure is generally thought of as a dynamic ("untyped") programming language, and that's for the most part true: Clojure functions literally compile to an interface that exposes methods of the type invoke(Object, Object, ...), and everything Clojure manipulates is generally typed, at the JVM level, as Object.

Common Clojure operations (think map, first, conj, etc.) cast their arguments to underlying Java interfaces (or abstract classes) and then call methods on those. For example, conj casts its first argument to IPersistentCollection and then calls the cons method on that.

In general, this is not an issue, because the JVM is very good at optimizing away type casts and associated checks.

However, specifically for numerics and because of the dichotomy with primitives at the Java level, this does mean that by default Clojure functions end up boxing all their numbers, i.e. passing around Long and Double instead of long and double.

The Clojure compiler makes some effort to infer the types of primitive local variables, such as in the loops in the example above where both total and steps are inferred to be primitive longs, but it doesn't do any automatic inference on function arguments.

When defining a function, though, one can annotate an argument to mark it as a primitive number, and the compiler will then make sure that it does not get boxed when that function is called. Similarly, one can annotate the argument vector with the return type of the function. The boxing/unboxing behaviour is typically not optimized away by the JIT, so getting rid of it at compile time can yield substantial benefits. Here's an example:

(defn boxed-add
  [a b]
  (+ a b))

(defn boxed
  []
  (loop [total 0, steps 1000000]
    (if (zero? steps)
      total
      (recur (boxed-add total steps)
             (dec steps)))))

(defn non-boxed-add
  ^long [^long a ^long b]
  (+ a b))

(defn non-boxed
  []
  (loop [total 0, steps 1000000]
    (if (zero? steps)
      total
      (recur (non-boxed-add total steps)
             (dec steps)))))

(defn non-boxed-unchecked-add
  ^long [^long a ^long b]
  (unchecked-add a b))

(defn non-boxed-unchecked
  []
  (loop [total 0, steps 1000000]
    (if (zero? steps)
      total
      (recur (non-boxed-unchecked-add total steps)
             (unchecked-dec steps)))))

With the same kind of -main, we get:

boxed          : 9.02e-03 (500000500000)
non-boxed      : 1.17e-03 (500000500000)
non-boxed-unchecked: 3.55e-04 (500000500000)

We can clearly see that boxing definitely has an impact. Surprisingly, it also looks like the JIT cuts right through the non-boxed-unchecked-add function and manages to get us the same performance as inlining the addition operation.

Because everyone loves ternary logic, you can set *unchecked-math* to :warn-on-boxed to get the same behaviour as true but also print a compile-time warning when an arithmetic operation is applied to boxed arguments.

t.core=> (set! *unchecked-math* :warn-on-boxed)
:warn-on-boxed
t.core=> (defn add [a b] (+ a b))
Boxed math warning, /repl:1:17 - call: public static java.lang.Number
  clojure.lang.Numbers.unchecked_add(java.lang.Object,java.lang.Object).
#'t.core/add
t.core=>

"Bigger" numbers

So far we've only been talking about 64-bit values, whether boxed or unboxed. Clojure also has "native" support for three other numeric types: clojure.lang.Ratio, clojure.lang.BigInt and java.math.BigDecimal. They all have their uses, but from a performance perspective it's usually better to avoid them. They are inherently boxed, and they use non-native operations since they are designed to handle things that do not fit in native operations.

They each have their own literal syntax: BigInt uses a N suffix (2N), BigDecimal uses a M suffix (2.0M), and Ratio uses the expected / spearator between its two components (2/3).

The "primed" operators (+', -', etc.) will auto-promote longs to BigInts, e.g.:

t.core=> (defn fib [n]
           (->> [0 1]
                (iterate (fn [[a b]] [b (+' a b)]))
                (map first)
                (drop n)
                first))
#'t.core/fib
t.core=> (fib 10)
55
t.core=> (fib 100)
354224848179261915075N
t.core=>

Because they are not good for performance, I will not say more about them here.

"Smaller" numbers

While the JVM does have support for 32-bit values (specifically float/int and Float/Integer), Clojure does not. It is not possible to type-hint a function argument (or return value) as either float or int (or the even smaller byte and char).

This can be worked around to some extent by using arrays instead, as Clojure does let one define an array of any primitive type, but that should be used only for interop as the arithmetic operations in Clojure will likely still "upcast" int and float to long and double respectively. There are a few unchecked-*-int operations that work on ints, but nothing for floats as far as I'm aware.

Leiningen's :global-vars

In some cases, it may be appropriate to set *unchecked-math* globally, for an entire project, rather than just around a single form as we've done here. There is an inherent loss of safety, so one should make sure this is appropriate for the project, but if it is what you want to do and you are using Leiningen, you can add the :global-vars entry to your project configuration, e.g.:

(defproject t "app"
  :dependencies [[org.clojure/clojure "1.10.0"]
                 [criterium "0.4.6"]]
  :global-vars {*unchecked-math* :warn-on-boxed}
  :main ^:skip-aot t.core
  :target-path "target/%s"
  :profiles {:uberjar {:aot :all}})

You can also add it to a specific profile, just like any Leiningen property, and you can use it to set all the global compiler flags you may want to set.

If you are not using Leiningen, chances are whatever youre using still has some way of stting global compiler flags, so it may be worth looking into that.

Conclusion

Safety by default is nice in general, but when you're striving for maximum performance it means you have to work a little bit harder. Fortunately, it's easy to set the *unchecked-math* flag with the appropriate scope, and :warn-on-boxed can readily find any number of easy places where performance can be greatly improved with a simple annotation.


  1. Clojure, at least as far as this series on optimizing is concerned, compiles down to JVM bytecode. JVM bytecode is not quite Java, but in terms of base types it's close enough that we can think about Java types instead of digging into JVM bytecode.

Tags: clojure