This code is currently single-threaded. Parallelizing it would make it considerably more useful.
A first step would be a multithreaded version. More ambitiously, an implementation built on the Akka actor library could scale much better and solve some of the harder problems that Knuth outlines in his paper far more quickly.