Planck Lazy Analysis Cache Loading

January 4, 2016

One of the goals of Planck has been to minimize startup latency. It's already pretty fast, thanks to running bootstrapped ClojureScript in JavaScriptCore. It also supports caching compiled namespaces and static function dispatch.

But a significant portion of Planck's initialization time is spent reading in the analysis cache for the ClojureScript runtime.




What is the analysis cache? It is essentially metadata about code that is generated by the ClojureScript compiler. Think AST, along with other ancillary stuff like docstrings. It is used, for example, to know information about the defs that exist in a namespace when compiling code that uses that namespace. In fact, when you load cached compiled namespaces, the analysis caches for those namespaces are used, while the compiled JavaScript is just … blindly executed.
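To make that a bit more concrete, here is a heavily simplified sketch of the kind of data involved. As far as I know, the compiler state is a map with the per-namespace analysis data under `:cljs.analyzer/namespaces`, and each namespace's defs under `:defs`. The sample entry and the `lookup-def` helper are made up for illustration, and the real entries carry many more keys.

```clojure
;; A simplified, hypothetical slice of compiler state.
;; The real analysis cache holds far more information per def.
(def compiler-state
  (atom {:cljs.analyzer/namespaces
         {'cljs.core
          {:name 'cljs.core
           :defs {'map {:name 'cljs.core/map
                        :doc  "..." ; docstring elided
                        :fn-var true}}}}}))

(defn lookup-def
  "Hypothetical helper: returns the analysis metadata for sym in
  ns-sym, or nil if the namespace or def is unknown."
  [state ns-sym sym]
  (get-in state [:cljs.analyzer/namespaces ns-sym :defs sym]))

;; Looking up a def that exists, and one that does not.
(lookup-def @compiler-state 'cljs.core 'map)
(lookup-def @compiler-state 'cljs.core 'no-such-def)
```

The first lookup returns the metadata map for `map`, and the second returns nil, which is roughly the situation the compiler is in when a namespace's cache has not been loaded.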

As an aside, I wish JavaScriptCore had V8's startup snapshot feature. That would be a very compelling way to initialize Planck, perhaps bypassing analysis cache initialization along with the JavaScript representing the ClojureScript runtime itself!

Since it takes so long to load the analysis cache for cljs.core, I tried a quick experiment, commenting out the code that loads this cache. Planck's startup time was cut nearly in half.

A surprising result is that, even without the cljs.core analysis cache, the evaluation of certain forms like (inc 3) or (+ 1 2) works, while others like (map inc [1 2 3]) fail. I suspect the root cause is that the ClojureScript analyzer can locate and execute the code needed for macroexpansion without even consulting the analysis cache. (Things like inc and + exist both as functions and as macros.)

So, if the analysis cache is not needed for those forms, I thought it would be interesting to delay loading the cache, and to essentially fault it in only when needed. This would make use cases like

planck -e'(+ 1 2)'

even faster. One challenge, though, is that the cljs.core analysis cache is just one portion of a larger map maintained by the compiler that holds all of the analysis caches for all of the loaded namespaces, and this gigantic map can be consulted (or revised) at any time by the compiler when it is doing its work.

But... the cljs.core analysis cache is essentially static and read-only. So, it is a candidate to be treated like a delay, and pulled in only when necessary.

Enter lazy-map. This is a library by Artur Malabarba that essentially lets you defer the realization of values in a map. I took it, and put it into Planck. Instead of eagerly loading the cljs.core analysis cache Transit data (already in use to make Replete faster), Planck now uses lazy-map to set up triggers that load this info on demand, should the compiler touch it.
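The underlying idea can be sketched with nothing more than a plain `delay`. To be clear, this is not Planck's actual code and it does not use the lazy-map API. The `load-core-cache` function is a hypothetical stand-in for decoding the Transit data, and `ns-cache` is a made-up accessor.

```clojure
;; Counts how many times the (stand-in) cache load actually runs.
(def load-count (atom 0))

(defn load-core-cache
  "Hypothetical stand-in for decoding the cljs.core analysis cache."
  []
  (swap! load-count inc)
  {:name 'cljs.core
   :defs {'map {:name 'cljs.core/map}}})

;; The namespaces map holds a delay instead of the decoded cache,
;; so nothing is loaded when the map is built.
(def namespaces
  {'cljs.core (delay (load-core-cache))})

(defn ns-cache
  "Returns the cache for ns-sym, forcing its delay on first access.
  Returns nil for namespaces that are not present."
  [nss ns-sym]
  (some-> (get nss ns-sym) force))
```

With this in place, `@load-count` stays at 0 until something calls `(ns-cache namespaces 'cljs.core)`, and repeated calls reuse the already realized value rather than loading again. The drawback of bare delays is that every lookup has to go through an accessor like `ns-cache`. The appeal of a lazy map, as I understand it, is that it still looks like an ordinary map, so the compiler's existing lookups can trigger the load without being changed.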

It works! On one box, executing

planck -e'(+ 7 (inc 3))'

drops from 247 ms to 155 ms, shaving off a hefty 37% from the startup time. And, this is getting closer to the 80-ms threshold where I suspect you can no longer perceive the startup latency.

This lazy loading has landed in Planck master and will appear in Planck 1.9.

Tags: Planck ClojureScript