2 January 2022

AoC 24, part 1: Did I cheat?

I must admit, when I finished reading the problem description, I had no idea how to proceed. In fact, while I did get my two stars on December 24, I would not find a solution to the problem until the evening of December 27. Intrigued? Read on.

Spoilers ahead. This post is about my journey through the day 24 problem of this year's Advent of Code. In this post, I'll describe how I approached the problem on the day it was released, when I was trying to beat the other participants in my private leaderboard, and how I eventually got the right answers without properly solving the problem.

In the next post, I'll describe how I eventually did find a proper solution.

It looked easy at first

When I first started reading, I thought I'd have some advantage on this puzzle: it starts by describing a register machine with 4 registers, which I've written about in the past, recently enough that I felt confident in my ability to write a good, fast interpreter for it.

As I kept reading, though, the problem steadily walked away from my expectations. It turns out interpreting register-machine instructions is not the main difficulty here.

To summarize the problem, you get a program written in some register-based assembly. You are told that the program is going to call the inp instruction 14 times, each time generating a number between \(1\) and \(9\) included, and that your job is to find the sequence of such numbers that results in the value of a specific register (z) at the end of the program to be zero.

It should be immediately apparent that even with a very fast interpreter, it's going to take a while to walk through all \(9^{14}\) possible values.

I knew looking at all possible values would not work. But I also had no other idea, so I started there. The problem asked for the "largest" input, so maybe if I started at the maximum of all nines I would not have to walk through too many of them.

So I started by trying to write a reasonably fast interpreter for the given code. Since I was writing in Clojure, I opted for code generation. Assuming we'd parse the input to something of the form:

[[:inp :w]
 [:add :z :w]
 [:mod :z 2]
 [:div :w 2]
 [:add :y :w]
 [:mod :y 2]
 [:div :w 2]
 [:add :x :w]
 [:mod :x 2]
 [:div :w 2]
 [:mod :w 2]]

my interpreter (compiler?) was:

(defn to-fn
  [instr]
  (let [input (gensym "input")
        w (gensym "w")
        x (gensym "x")
        y (gensym "y")
        z (gensym "z")]
    `(fn [~input]
       (let [~w 0 ~x 0 ~y 0 ~z 0
             ~@(->> instr
                    (mapcat
                      (fn [[i a1 a2]]
                        (let [to-sym {:w w, :x x, :y y, :z z}
                              op {:add +,
                                  :mul *,
                                  :div quot,
                                  :mod mod,
                                  :eql (fn [a b] (if (== a b) 1 0))}]
                          (if (= i :inp)
                            `[~(to-sym a1)
                              (first ~input)
                              ~input (rest ~input)]
                            `[~(to-sym a1)
                              (~(op i)
                              ~(to-sym a1)
                              ~(if (keyword? a2) (to-sym a2) a2))])))))]
             [~w ~x ~y ~z]))))

This will not generate the best possible code, but it will be reasonably efficient, and it was fairly quick to write so I could get started early (I was trying to beat other people on my private leaderboard).

For the above list of instructions, that would generate code that looks like this1:

(fn [input]
  (let [w 0
        x 0
        y 0
        z 0
        w (first input)
        input (rest input)
        z (+ z w)
        z (mod z 2)
        w (quot w 2)
        y (+ y w)
        y (mod y 2)
        w (quot w 2)
        x (+ x w)
        x (mod x 2)
        w (quot w 2)
        w (mod w 2)]
    [w x y z]))

I had little hope for this approach, but I still started that in the background, while I explored other approaches. Needless to say, I eventually stopped it before it could reach any solution. Still, for performance comparison, I'd like to know how long it would have taken.

I now have the solution (98491959997994), which means I can actually estimate that. The corresponding "main" function was:

(defn part1
  [input]
  (let [f (eval (to-fn input))]
    (->> (make-inputs input)
         (map (fn [input] (cons input (f input))))
         (filter (fn [[in w x y z]] (zero? z)))
         ffirst
         (apply str)
         Long/parseLong)))

If I introduce a (take (* 10 1000 1000)) in that chain, the code finishes running in close to 30s, from which we can deduce it would have taken about 12 days to get the answer for part 1, and way more for part 2.

How do I get from 30s to 12 days? I know that the final solution is 98491959997994, so:

(let [difference (- 99999999999999
                    98491959997994)
      number-of-solutions (* difference (/ (Math/pow 9 14)
                                           (Math/pow 10 14)))
      seconds (* 30 (/ number-of-solutions 10 1000 1000))
      days (/ seconds 60 60 24)]
  days)

which yields 11.978.... The number-of-solutions is not exact here, but it's close enough for our purposes.

For part 2, we are asked to find the smallest input, so I would have started from 11...11, and the final solution is 61191516111321. Replacing the difference in the above calculation yields close to 400 days, which is more than I was willing to wait for.

I then spent the next hour or so just running the full function on random inputs, trying to discern any pattern. I didn't find a reliable one, but it did seem that inputs "around" (in 14-dimensional space) a low point were also low.

That gave me an idea.

If the solution space is "locally convex", I can automate the search, rather than do it manually. As I've mentioned in the past, my go-to search is genetic search, which in this case looks roughly like this:

(def monad (-> "data/day24"
             slurp
             (clojure.string/split-lines)
             parse
             to-fn
             eval))

(let [fitness (fn [i]
                (+ (/ (- 99999999999999 (Long/parseLong (apply str i)))
                      99999999999999.0)
                   (-> (monad i) (get 3))))
      mutate (fn [i] (assoc i (rand-int 14) (inc (rand-int 9))))
      crossover (fn [i1 i2]
                  (mapv (fn [x1 x2] (if (> 0.5 (rand)) x1 x2)) i1 i2))
      make-sol (fn [] (vec (repeatedly 14 #(inc (rand-int 9)))))
      carousel (fn [p] (let [maxi (reduce max (map first p))
                             inverted (map (fn [[f i]] [(- maxi f) f i]) p)
                             total (reduce + (map first inverted))
                             roll (rand total)]
                         (loop [r roll
                                [[f' f s] & p] inverted]
                           (if (<= r f')
                             [f s]
                             (recur (- r f') p)))))]
  (defn genetic
    ([] (genetic (->> (repeatedly 100 make-sol)
                      (map (fn [i] [(fitness i) i]))
                      sort)))
    ([init-pop]
     (loop [population (sort init-pop)
            step 0]
       (if (== step 1000)
         population
         (recur (let [survivors (concat (take 10 population)
                                        (take 3 (reverse population)))
                      children (repeatedly
                                 87
                                 #(let [[_ parent1] (carousel population)
                                        [_ parent2] (carousel population)
                                        child (mutate (crossover parent1
                                                                 parent2))]
                                    [(fitness child) child]))]
                  (sort (concat survivors children)))
                (inc step)))))))

Before you judge this code too harshly, remember that this was written in the context of trying something for a time-pressured competition. I am not claiming any of this is good software engineering. Hopefully, if you're familiar with genetic search, you can follow what this code is doing. (And if you're not, maybe my previous post on the topic can help.)

The salient points here are:

  • An "individual" is a vector of 14 elements, where each element is a number from 1 to 9 included.
  • A population is a list of (fitness, individual) tuples, sorted by fitness.
  • Fitness is computed by applying the "monad" function (the instructions given as the input to the problem), plus a fractional part that penalizes small numbers. The monad function always returns an integer, and the goal is to drive it towards zero, so by adding the "(max input - input) / max input" term, we ensure that, should we ever find two inputs for which monad returns 0, the algorithm will consider the larger one better (by making it smaller, as our genetic search tries to minimize fitness).

I played with that a bit. While it did give me a few "zero" inputs to play with, I saw those as clues in trying to understand what the monad function did and how I could derive from that some logic by which I could find the "largest zero".

I have a commit with some intermediate notes on that, specifically me trying to decompose a potential solution into prime factors. Ironically enough, that specific solution was actually the right answer for part 1, but I did not know that at the time. So I kept looking.

After about another hour of playing with the genetic search, I decided that was not going to lead to anything. Yes, that was a wonderfully insightful decision given that it had already given me the right answer.

But I didn't know that. So I deleted all my code so far and started over with a blank slate. And then, after much head scratching, the best I could come up with was to try and do the exhaustive search faster. I still knew that wasn't likely to work, given the sheer scale of the problem, but given I'd already found a solution around 98..., I figured the largest solution might be close enough that there was a small chance this might work. And I had ideas for making this faster, while I had no idea for anything else to try, so I decided to do this.

Specifically, I realized that the search I still had running was very inefficient in that it was, for each nine values of the last input, redoing the whole computation, from scratch, of the 13 first inputs. That was fairly easy to address and should result in a nice speedup.

With the same parsing result as above, this lead to code that looked like:

(defn to-fns
  [instr]
  (let [op {:add +, :mul *, :div quot, :mod mod,
            :eql (fn [a b] (if (== a b) 1 0))}
        arr (with-meta (gensym "arr") {:tag "[J"})
        ret (with-meta (gensym "ret") {:tag "[J"})
        w (gensym "w")
        x (gensym "x")
        y (gensym "y")
        z (gensym "z")
        to-sym {:w w, :x x, :y y, :z z}]
    (->> instr
         (partition-by #{[:inp :w]})
         (partition 2)
         (map (fn [[_ ops]]
                (eval `(fn [~arr in#]
                         (let [~w in#
                               ~x (aget ~arr 1)
                               ~y (aget ~arr 2)
                               ~z (aget ~arr 3)
                               ~ret (make-array Long/TYPE 4)
                               ~@(mapcat (fn [[i a1 a2]]
                                           `[~(to-sym a1)
                                             (~(op i)
                                             ~(to-sym a1)
                                             ~(if (keyword? a2)
                                                (to-sym a2)
                                                a2))])
                                         ops)]
                           (aset ~ret 0 ~(with-meta w {:tag "long"}))
                           (aset ~ret 1 ~(with-meta x {:tag "long"}))
                           (aset ~ret 2 ~(with-meta y {:tag "long"}))
                           (aset ~ret 3 ~(with-meta z {:tag "long"}))
                           ~ret))))))))

(defn run
  [input]
  (let [fns (to-fns input)
        counter (volatile! 0)
        h (fn rec [^longs start inputs fns]
            (vswap! counter inc)
            (when (zero? (rem @counter 10000000))
              (prn [:inputs inputs :start (seq start)]))
            (cond (and (empty? fns) (zero? (aget start 3)))
                  inputs
                  (empty? fns)
                  nil
                  :else
                  (or (rec ((first fns) start 9) (conj inputs 9) (rest fns))
                      (rec ((first fns) start 8) (conj inputs 8) (rest fns))
                      (rec ((first fns) start 7) (conj inputs 7) (rest fns))
                      (rec ((first fns) start 6) (conj inputs 6) (rest fns))
                      (rec ((first fns) start 5) (conj inputs 5) (rest fns))
                      (rec ((first fns) start 4) (conj inputs 4) (rest fns))
                      (rec ((first fns) start 3) (conj inputs 3) (rest fns))
                      (rec ((first fns) start 2) (conj inputs 2) (rest fns))
                      (rec ((first fns) start 1) (conj inputs 1) (rest fns)))))
        init (make-array Long/TYPE 4)]
    (h init [] fns)))

The main difference here is that to-fns returns a list of functions, each of which takes in an array of four elements (the state of the four registers at the end of the previous step) and a new input, and processes just that one input up until the point where the next input is requested. It returns an array representing the state of the four registers just before requesting the next input.

The run function then takes that and runs each function in turn, providing inputs one by one, without ever rerunning the same segment twice.

There's still room for improvement, though. I spent about an hour and a half trying to optimize that code, yielding the final version of to-fns as:

(defn to-fns
  [instr]
  (let [arr (with-meta (gensym "arr") {:tag "[J"})
        ret (with-meta (gensym "ret") {:tag "[J"})
        w (gensym "w")
        x (gensym "x")
        y (gensym "y")
        z (gensym "z")
        to-sym {:w w, :x x, :y y, :z z}
        hint-state (atom {})]
    (->> instr
         (partition-by #{[:inp :w]})
         (partition 2)
         (map-indexed
           (fn [i [_ ops]]
             (eval
               `(fn ~(symbol (str "opti-" i)) [~arr in#]
                  (let [~w in#
                        ~x (aget ~arr 1)
                        ~y (aget ~arr 2)
                        ~z (aget ~arr 3)
                        ~ret (make-array Long/TYPE 4)
                        ~@(mapcat
                            (fn [[i a1 a2]]
                              (match [i
                                      (to-sym a1)
                                      (if (keyword? a2) (to-sym a2) a2)]
                                [:add s1 0] []
                                [:add s1 s2] [s1
                                              (do (swap! hint-state assoc s1 false)
                                                  `(unchecked-add ~s1 ~s2))]
                                [:mul s1 0] [s1
                                             (do (swap! hint-state assoc s1 true)
                                                 0)]
                                [:mul s1 1] []
                                [:mul s1 s2] [s1
                                              (do (swap! hint-state assoc s1 false)
                                                  `(unchecked-multiply ~s1 ~s2))]
                                [:div s1 1] []
                                [:div s1 s2] [s1
                                              (do (swap! hint-state assoc s1 false)
                                                  `(quot ~s1 ~s2))]
                                [:mod s1 s2] [s1
                                              (do (swap! hint-state assoc s1 false)
                                                  `(rem ~s1 ~s2))]
                                [:eql s1 0] []
                                [:eql s1 s2] [s1 (do (swap! hint-state assoc s1 true)
                                                     `(if (== ~s1 ~s2) 0 1))]))
                              ops)]
                    (aset ~ret 0 ~(if (@hint-state w)
                                    w
                                    (with-meta w {:tag "long"})))
                    (aset ~ret 1 ~(if (@hint-state x)
                                    x
                                    (with-meta x {:tag "long"})))
                    (aset ~ret 2 ~(if (@hint-state y)
                                    y
                                    (with-meta y {:tag "long"})))
                    (aset ~ret 3 ~(if (@hint-state z)
                                    z
                                    (with-meta z {:tag "long"})))
                    ~ret))))))))

All of the hint-state related stuff can safely be ignored in trying to understand what that code does; the main point of it is to specify the right type hints in the generated code.2

Compared to previous states of the code, we are doing some relatively simple arithmetic simplifications and inlining operations rather than go through separate functions. This surprised me a little bit but it turns out inlining the symbol for operations (as we're doing here) generates faster code than inlining the functions themselves (as we did in to-fn above).

The careful reader might have noticed a spurious simplification in eql; this one is not generally true, but is true on my input: every eql r1 r2 operation is followed by an eql r 0 operation, meaning I can save one step by switching around the two possible results.

For reference, for the following code segment:

inp w
mul x 0
add x z
mod x 26
div z 1
add x 10
eql x w
eql x 0
mul y 0
add y 25
mul y x
add y 1
mul z y
mul y 0
add y w
add y 12
mul y x
add z y

the above would produce this Clojure code:

(fn opti-5 [arr in]
  (let [w in
        x (aget arr 1)
        y (aget arr 2)
        z (aget arr 3)
        ret (make-array java.lang.Long/TYPE 4)
        x 0
        x (unchecked-add x z)
        x (rem x 26)
        x (unchecked-add x 10)
        x (if (== x w) 0 1)
        y 0
        y (unchecked-add y 25)
        y (unchecked-multiply y x)
        y (unchecked-add y 1)
        z (unchecked-multiply z y)
        y 0
        y (unchecked-add y w)
        y (unchecked-add y 12)
        y (unchecked-multiply y x)
        z (unchecked-add z y)]
    (aset ret 0 w)
    (aset ret 1 x)
    (aset ret 2 y)
    (aset ret 3 z)
    ret))

More could be done, such as detecting that x is never read before being set to 0, and thus we could skip reading its initial value from the previous array. But this is where I did stop on Dec 24, so we won't go further here either.

Performance-wise, this is quite a bit faster. Using the same method of explicitly limiting the number of iterations, I can eliminate the first billion candidates in about 120s. So that's about 25 times faster than the previous exhaustive search (100 times more samples in 4 times longer).

That's a nice speed bump, but that's still about half a day for part 1 and over two weeks for part 2, which is clearly not great.

Getting the stars

So I started this new run, and it fairly quickly overtook the first one. But it was still clearly way too slow and I still didn't have any better idea, so I decided I should try to port some of the performance improvements from the above approach to my genetic search.

As I started playing with the genetic search code again, I suddenly realized I had access to an oracle. While I did not have any way within my code to know when to stop, I could just ask the Advent of Code website by trying to submit whatever solution I found. So I did that, and the solution I had at that point was not the largest one (I'd lost track of the one I had previously found in some code cleanup in-between).

It was fairly easy to tweak my genetic function such that:

  • I can easily reset the search at the REPL by running (genetic), but also ask the REPL to run one more round on the results of the preceding one by running (genetic *1) (*1 is a built-in variable in a Clojure REPL, which is bound to the result of the last command).
  • Newly generated children are always "larger" than some global constant.

The code looked like:

(def best-so-far
  [9 8 4 9 1 9 5 9 9 9 7 9 9 4])

(let [mutate (fn [i] (update i (rand-int 14)
                             (fn [old]
                               (let [n (if (> (rand) 0.5)
                                         (inc old)
                                         (dec old))]
                                 (cond (== 0 n) 9
                                       (== 10 n) 1
                                       :else n)))))
      crossover ...
      make-sol ...
      carousel ...]
   (defn genetic
     ([] (genetic (->> (repeatedly 100 make-sol)
                       (map (fn [i] [(fitness i) i]))
                       sort)))
     ([init-pop]
      (loop [population (sort init-pop)
             step 0]
        (if (== step 100)
          population
          (recur (let [survivors ...
                       make-child ...
                   (loop [nxt survivors
                          seen (set survivors)]
                     (if (== 100 (count nxt))
                       (sort nxt)
                       (let [child (make-child)]
                         (recur (if (or (seen child)
                                        (== -1 (compare (get child 1)
                                                        best-so-far)))
                                  nxt (conj nxt child))
                                (conj seen child))))))
           (inc step)))))))

Where I've elided the bits of code that have not changed. From this point, I essentially just ran the genetic search in the REPL, and each time I found a new maximum, I tried it on the Advent of Code website and updated my best-so-far definition if it wasn't the right answer. I don't think I needed to do that more than three times.

The second part of the problem asked for the smallest input. I did spend some time trying to find a better solution, but eventually I came back to the exact same genetic search with a small modification to the fitness function (to penalize large inputs rather than small ones) and the sign of the comparison with best-so-far.

Performance-wise, it took me about half an hour for each part, once I committed to this "constrained genetic search" approach. That's a lot faster than the exhaustive search I had as my only alternative, but the genetic approach was only possible because I had access to an oracle, which did not sit well with me.

Conclusion

And that's how I got my two stars while still having no idea how to properly solve the problem. I'm not sure whether what I did would be considered cheating, but I definitely did not feel very good about it. So as soon as I had some time to devote to this again (which turned out to be on Dec 27), I spent some more time on it and did, eventually, find what I consider a proper solution. In my next post, I'll relate how I got there and what that final solution is.


  1. I have manually added indentation, removed fully-qualified namespaces, and renamed generated symbols for clarity.

  2. Type hints are crucial to performance when doing interop and to avoid boxing, both of which are things we want to do here (arrays count as interop). But the Clojure compiler forbids putting a type hint on a local variable for which it can, by itself, already infer the final, primitive type. This means that we need to keep track of the circumstances in which it can, and only add the type hint when it can't.

Tags: clojure aoc