Notes on Optimizing Clojure Code: Arrays
Arrays are the most primitive collections on the JVM, and therefore the ones with the fastest operations. This doesn't mean they are right for every situation, but when they are, they can greatly enhance performance.
Working with arrays in Clojure is surprisingly easy once you get used to it, but there are a few gotchas worth pointing out.
Type hints
Array operations are only fast if they avoid reflection. Therefore, you really need to make sure you provide appropriate type hints any time you use any array operation.
In most cases, turning on the compiler flag *warn-on-reflection*
will
prevent you from generating slow, reflective code.
Microbenchmark
In the following sections, I will be referencing lines from the output of this program:
(ns t.core
(:require [criterium.core :as crit])
(:gen-class))
(set! *warn-on-reflection* true)
(set! *unchecked-math* :warn-on-boxed)
(defn bench
[n f]
(let [b #(loop [n 1000]
(when-not (== 0 n)
(f)
(recur (dec n))))
time-in-seconds (->> (crit/benchmark (b) {}) :mean first)]
(println (format "%30s: %.3e" n time-in-seconds))))
(defn -main
[& args]
(let [r (fn [n f]
(let [arr (make-array Long/TYPE 10 10)]
(bench n #(f arr))))
v (fn [n s f]
(let [v (into [] (range s))]
(bench (str n " (" s ")") #(f v))))]
(bench "no-op" (fn []))
(r "count" count)
(r "alength (no hint, no warn)" alength)
(r "alength" (fn [^"[[J" arr] (alength arr)))
(r "aget (no hint, warns)" (fn [arr] (aget arr 3)))
(r "aget" (fn [^"[[J" arr] (aget arr 3)))
(r "deep aget (no hint, no warn)" (fn [arr] (aget arr 3 3)))
(r "deep aget (hint ignored)" (fn [^"[[J" arr] (aget arr 3 3)))
(r "aset (no hint, no warn)" (fn [arr] (aset arr 3 3 1)))
(r "aset (hint ignored)" (fn [^"[[J" arr] (aset arr 3 3 1)))
(r "deep aset" (fn [^"[[J" arr] (aset ^longs (aget arr 3) 3 1)))
(v "vector get" 10 (fn [v] (get v 2)))
(v "vector get" 1000 (fn [v] (get v 2)))
(v "vector 'set'" 10 (fn [v] (assoc v 3 1)))
(v "vector 'set'" 1000 (fn [v] (assoc v 3 1)))
(r "nested copy aset" (fn [^"[[J" arr]
(let [copy (aclone arr)
to-change (aclone ^"[J" (aget arr 3))]
(aset to-change 3 1)
(aset copy 3 to-change))))))
Here's our baseline for an empty loop:
no-op: 3.026e-07
alength
You can use count
on an array, but alength
will generate faster code by
directly emitting the correct JVM-level bytecode (the "length" property of an
array is a special property at the bytecode level).
count: 1.015e-04
alength (no hint, no warn): 1.082e-02
alength: 6.589e-06
The count
function has been designed to be fast, but also polymorphic. It
does have a specialization for arrays, meaning it does end up calling the
special JVM "array length" function in the end (so its performance does not
depend on the length of the array), but it does need to first discover that its
argument is an array.
The "no hint" variant illustrates a limit of the *warn-on-reflection*
flag:
it can only detect reflection issues on function calls, when it can
statically determine that the function called has some expectations about its
argument types. Here, the alength
function is just passed as an argument, not
called, so it does not warn. When the function is later called (within r
), we
have lost track of its type requirements.
It's also interesting to note that alength
is faster than count
when
properly hinted, but slower otherwise.
aset
, aget
The two most basic operations on an array are getting and setting its elements
by index. This is what aset
and aget
do.
They are faster than the Clojure collection counterpart (Vector), as should be expected:
aget (no hint, warns): 1.090e-02
aget: 6.539e-06
vector get (10): 7.566e-06
vector get (1000): 1.205e-05
vector 'set' (10): 2.159e-05
vector 'set' (1000): 5.888e-05
nested copy aset: 2.414e-05
Very small vectors are actually represented as plain arrays, hence the very close performance. As the vector size grows, it becomes an increasingly deep tree (branching factor 32).
There is one big caveat about aget
, though, which matters for
multidimensional arrays. aget
is defined as a multi-arity function, but, at
least as of Clojure 1.11.0-rc1, the multi-indices arities always produce
reflective code, and do not warn about it even with *warn-on-reflection*
turned on.
deep aget (no hint, no warn): 1.098e-02
deep aget (hint ignored): 1.046e-02
aset (no hint, no warn): 1.049e-02
aset (hint ignored): 1.102e-02
deep aset: 6.574e-06
The workaround is very easy: do not use the alternative arities for aget
, and
just nest calls instead. The same applies for aset
, and the workaround is
unfortunately a bit uglier as you have to nest aget
calls for a single
aset
. Note that the ^longs
and ^"[J"
type hints are equivalent.
Loops
We've covered individual operations on arrays, but how you put these operations together also matters. Once you're at a point in your performance journey where you need arrays as your data structures, you also need your operations "around" these arrays to be as fast as possible.
In Clojure, the fastest iteration construct is recur
, as it compiles down to
native JVM looping (think of a while
loop in Java). recur
must be in tail
position (this is checked by the compiler) and will not grow the stack; it
jumps back to the first parent anchor (in terms of nested lexical forms) and
takes as many arguments as the parent defined.
Valid parents are either function argument definitions or explicit loop
calls:
(loop [n 0]
(when (< n 5)
(recur (inc n))))
(defn recur-example
[n]
(if (< n 5)
n
(recur (- n 3))))
There is no performance advantage to jumping to a loop
instead of a function,
so you should only introduce a loop
when you need additional looping
variables compared to the current function's arguments.
Some access patterns on arrays are more common than others; Clojure provides
two macros to iterate on arrays in more convenient forms than writing explicit
loops: areduce
and amap
.
Let's look at (amap arr idx ret expr)
in more details:
arr
can be any expression, but it has to be type-hinted to a type of array if we want maximum performance. In practice it is going to be either a symbol or a list.idx
andret
must be symbols, which this anaphoric macro will bind. Therefore, to avoid confusion, they should probably not be bound in the current scope.amap
will initializeret
to a clone ofarr
, then executeexpr
once for each index inarr
withidx
bound to the current index (whileret
remains bound to the clone ofarr
, possibly mutated by previous executions ofexpr
). Once it runs out of indices, it returnsret
.
Here's an example:
t.core=> (set! *warn-on-reflection* true)
true
t.core=> (def arr (make-array Long/TYPE 5))
#'t.core/arr
t.core=> (seq arr)
(0 0 0 0 0)
t.core=> (seq (amap arr
#_=> my-idx-symbol
#_=> my-ret-symbol
#_=> (aset my-ret-symbol my-idx-symbol (* 2 my-idx-symbol))))
Reflection warning, call to static method alength can't be resolved.
Reflection warning, call to static method aclone can't be resolved.
Reflection warning, call to static method aset can't be resolved.
Reflection warning, call to static method aset can't be resolved.
(0 2 4 6 8)
t.core=> (seq (amap ^longs arr
#_=> my-idx-symbol
#_=> my-ret-symbol
#_=> (aset my-ret-symbol my-idx-symbol (* 2 my-idx-symbol))))
(0 2 4 6 8)
t.core=>
Note that the return value of expr
is ignored; the only way expr
can change
the return value of amap
is through mutating ret
.
The other macro, areduce
, works in a similar way. Specifically, looking at
(areduce arr idx ret init expr)
:
arr
is, again, an expression that evaluates (at run time) to an array and that is (at read time) type-hinted to the array type.idx
is a symbol you provide to serve as the index of the current element.init
is an expression that should evaluate to your "initial accumulator" value for the reduction, just like the init argument toreduce
. It cannot reference eitheridx
orret
as they are not bound yet.- Once per index, the expression
expr
will be evaluated withidx
bound to the current index andret
bound to the result of the last evaluation ofexpr
(orinit
for index 0), and its result will be bound toret
for subsequent evaluations. This rebinding ofret
is the main difference withamap
.
Classic sum of all the elements in an array:
t.core=> (def arr (into-array Long/TYPE [1 2 3 4 5]))
#'t.core/arr
t.core=> (areduce ^longs arr
#_=> i res 0
#_=> (+ res (aget ^longs arr i)))
15
t.core=>
Note that if you do use an array-producing expression in the arr
position
instead of a symbol, you have no way of referring to that array again in
expr
. This is technically also the case for amap
, but there you can at
least assume that ret
starts as a clone, which for some cases may be good
enough.
For this reason, areduce
and amap
are most often used with a simple, bound
symbol as their first argument, rather than a complex expression.
Making arrays
If you already have an array, you can make a copy with aclone
. If you know
the type of array you want to create, you can create it with one of the
type-array
functions (boolean-array
, int-array
, etc.). Those functions
type hint their return value, and there is one for each primitive type on the
JVM, plus object-array
. They all have the same arglist and behaviour: if
their only argument is a number, they create an array of that length; if their
only argument is a seq, they create an array with the same number of elements
(and same values) as that seq, provided the types match; if they get two
arguments, the first one must be a number, which will be the size of the array,
and the second one must be a seq of matching type, and will be used to
initialize the array. If the seq is longer than the array size, extra elements
are ignored; if it is shorter, it only initializes the first elements, and the
rest are kept to their default value.
If you want to assert the type of an array without using type-hints, you can use
the "type
s" functions (longs
, booleans
, ints
, etc.); again, there is
one for each primitive type, but this time not for Object
.
For example:
t.core=> (set! *warn-on-reflection* true)
true
t.core=> (aget (int-array [1 2 3]) 0)
1
t.core=> (aget (into-array Integer/TYPE [1 2 3]) 0)
Reflection warning, call to static method aget can't be resolved.
1
t.core=> (aget (ints (into-array Integer/TYPE [1 2 3])) 0)
1
t.core=> (aget (longs (into-array Integer/TYPE [1 2 3])) 0)
Execution error (ClassCastException).
class [I cannot be cast to class [J
t.core=> (seq (float-array 10 [1 2 3]))
(1.0 2.0 3.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0)
t.core=>
For more generic array construction, you can use to-array
, into-array
,
to-array-2d
, or make-array
. Why so many choices?
(into-array type? s)
turns the seqs
into an array of typetype
. Iftype
is omitted, it is set to the class of the first element in the array. If other elements are not the same class or one of its subclass, this will fail at runtime.(make-array type dim [& dims])
returns a multidimensional array of the given type with the given dimensions. The type is not optional and the returned arrays are all of the corresponding length. The initial values are the default value for the given type (0
for numbers,nil
for objects, etc.), per the Java memory model.(to-array coll)
expects ajava.util.Collection
and callsjava.util.Collection#toArray
on it. It's essentially just a convenience wrapper for.toArray
with type-hint included.(to-array-2d coll)
takes anyjava.util.Collection
of which the elements are themselvesjava.util.Collection
, and returns a[[Ljava.lang.Object;
with the same elements.
t.core=> (into-array Integer/TYPE [1 2 3 4])
#object["[I" 0x731a9924 "[I@731a9924"]
t.core=> (seq (into-array Integer/TYPE [1 2 3 4]))
(1 2 3 4)
t.core=> (into-array Integer [1 2 3 4])
Execution error (IllegalArgumentException).
array element type mismatch
t.core=> (into-array Long [1 2 3 4])
#object["[Ljava.lang.Long;" 0x636dde44 "[Ljava.lang.Long;@636dde44"]
t.core=> (seq (into-array Long [1 2 3 4]))
(1 2 3 4)
t.core=> (map seq (make-array Boolean/TYPE 3 2))
((false false) (false false) (false false))
t.core=> (map seq (to-array-2d [[1 2] [3] [4 5 6]]))
((1 2) (3) (4 5 6))
t.core=> (type (to-array-2d [[1 2] [3] [4 5 6]]))
[[Ljava.lang.Object;
t.core=>
In practice, into-array
is the one I've used the most.
Note: calling seq
on an array is the easiest way to get it displayed in a
Clojure REPL.
Conclusion
Arrays are the fastest collection you can use in Clojure, so if you're aiming at writing performant code, you should get familiar with them. Hopefully this blog helps with that.
As always, remember that performance is not the only factor in most code bases. Introducing mutable arrays in a Clojure code base, which is generally expected to deal mostly with immutable, Clojure-native data structures, should not be done lightly.