Direct Higher-Order Function Calls
When optimizing performance-critical ClojureScript code, function call overhead can matter. By default, when compiling ClojureScript in :advanced mode, the :static-fns compiler option is enabled. This post describes an additional option, :fn-invoke-direct, which extends :static-fns for higher-order functions.
As background, it might be worth reviewing this post, which describes :static-fns, giving a concrete example of what it does to emitted code. Additionally, Static-Free ClojureScript REPL gives a concrete example illustrating why it is good to have :static-fns disabled by default for the REPL.
The :fn-invoke-direct compiler option was introduced with the ClojureScript 1.9.660 release. Let's explore this option by way of an example.
Consider the function
(defn f [h x]
  (h x))
where h is a function passed in as an argument (which makes f a higher-order function), and we want to focus on the code emitted for applying h to x.
Incidentally, Lumo and Planck support the same command line options for enabling :static-fns and :fn-invoke-direct. This makes it really easy to play along at home, trying functions and seeing what code is generated. For either, -s / --static-fns enables :static-fns, and -f / --fn-invoke-direct enables :fn-invoke-direct.
If you look at the function body generated for f with :static-fns disabled, you will see that it is
return h.call(null,x);
Note: to see this, you can enable verbose mode for your REPL. Alternatively, evaluate
(set! *print-fn-bodies* true)
and then evaluate the function f in your REPL.
If instead you enable :static-fns, you get
return (h.cljs$core$IFn$_invoke$arity$1 ? h.cljs$core$IFn$_invoke$arity$1(x) : h.call(null,x));
This looks a bit more complicated, but all it is really doing is this: if the passed function h happens to have been defined as a multi-arity function and defines an arity for one argument, then it statically invokes that implementation. Otherwise it falls back to what you would get without :static-fns enabled: h.call(null,x).
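To make the two branches concrete, here is a minimal JavaScript sketch that wraps the emitted body shown above in a function and calls it with two hand-written stand-ins for g. The names fStatic, gMultiArity and gSingleArity are hypothetical, and the shape of gMultiArity is a simplified assumption about how the compiler represents a multi-arity function (a dispatching function with one property per arity), not actual compiler output.

```javascript
// Body emitted for f with :static-fns enabled, wrapped in a function.
function fStatic(h, x) {
  return (h.cljs$core$IFn$_invoke$arity$1 ? h.cljs$core$IFn$_invoke$arity$1(x) : h.call(null, x));
}

// Assumed, simplified shape of a multi-arity g: a dispatcher on argument
// count, plus one directly callable implementation per arity.
function gMultiArity(x, y) {
  switch (arguments.length) {
    case 1: return gMultiArity.cljs$core$IFn$_invoke$arity$1(x);
    case 2: return gMultiArity.cljs$core$IFn$_invoke$arity$2(x, y);
    default: throw new Error("Invalid arity: " + arguments.length);
  }
}
gMultiArity.cljs$core$IFn$_invoke$arity$1 = function (x) { return x * 2; };
gMultiArity.cljs$core$IFn$_invoke$arity$2 = function (x, y) { return x * y; };

// Assumed shape of a single-arity g: a plain function with no arity properties.
function gSingleArity(x) { return x * 2; }

// Takes the first branch: the arity-1 implementation is invoked directly,
// skipping the dispatcher.
console.log(fStatic(gMultiArity, 2));

// Takes the fallback branch: no arity property, so h.call(null, x) is used.
console.log(fStatic(gSingleArity, 2));
```

Both calls return the same value; the only difference is which branch of the conditional does the work.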
That will result in faster runtime performance in the case where the passed function looks like this:
(defn g
  ([x] (* x 2))
  ([x y] (* x y)))
If we benchmark the call (f g 2) under :advanced, comparing :static-fns enabled against :static-fns disabled, we get these speedups:
V8: 5.6, SpiderMonkey: 1.8, JavaScriptCore: 2.6, Nashorn: 5.2, ChakraCore: 5.1
This is done using (simple-benchmark [] (f g 2) 10000000).
Now, what if instead g were not multi-arity, but looked like
(defn g [x]
  (* x 2))
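Note that this g is essentially the same as far as f is concerned, since f only invokes h with one argument. In this case enabling :static-fns can't provide a speedup: the check for a one-argument arity implementation fails, so the emitted code always ends up in the h.call(null,x) fallback.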
In fact, for this scenario, the simpler code generated with :static-fns disabled runs faster under most VMs. Here are the speedups observed when enabling :static-fns, measured the same way, with this new definition for g:
V8: 0.42, SpiderMonkey: 0.47, JavaScriptCore: 0.62, Nashorn: 1.8, ChakraCore: 0.025
(Yes, it appears that disabling :static-fns is really the right thing to do for this particular code if you are targeting ChakraCore!)
But what if we now also enable :fn-invoke-direct? The code emitted for f in this case has a function body that looks like:
return (h.cljs$core$IFn$_invoke$arity$1 ? h.cljs$core$IFn$_invoke$arity$1(x) : h(x));
The only difference is that now, in the fallback branch, h is directly invoked via h(x), rather than using the more conservative h.call(null,x) construct.
Here are the speedups observed when both :static-fns and :fn-invoke-direct are enabled, compared to a baseline where :static-fns is disabled:
V8: 0.86, SpiderMonkey: 0.54, JavaScriptCore: 0.63, Nashorn: 2.2, ChakraCore: 0.025
In this case :fn-invoke-direct helped recover some of the performance drop. Another way to look at this improvement is to keep :static-fns enabled throughout and compare :fn-invoke-direct enabled against disabled:
V8: 2.1, SpiderMonkey: 1.1, JavaScriptCore: 1.0, Nashorn: 1.2, ChakraCore: 1.0
This last set of speedups is perhaps the more useful view if you are going to have :static-fns enabled anyway and want to see how much enabling :fn-invoke-direct might help.
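The three sets of numbers for the single-arity g are related. Assuming each speedup is a ratio of running times against the stated baseline, the last set should be roughly the second set divided by the first. Here is a small JavaScript sketch of that arithmetic, using only the figures quoted above; the variable names are hypothetical.

```javascript
// Speedups versus :static-fns disabled, single-arity g (figures from the post).
const staticFnsOnly = { V8: 0.42, SpiderMonkey: 0.47, JavaScriptCore: 0.62, Nashorn: 1.8, ChakraCore: 0.025 };
const withInvokeDirect = { V8: 0.86, SpiderMonkey: 0.54, JavaScriptCore: 0.63, Nashorn: 2.2, ChakraCore: 0.025 };

// Dividing the two cancels the shared baseline, leaving the speedup of
// :fn-invoke-direct enabled vs. disabled with :static-fns always on.
const relative = {};
for (const vm of Object.keys(staticFnsOnly)) {
  relative[vm] = withInvokeDirect[vm] / staticFnsOnly[vm];
}

for (const vm of Object.keys(relative)) {
  console.log(vm + ": " + relative[vm].toFixed(2));
}
```

The quotients land close to the listed figures. Small differences are expected because the inputs are themselves rounded to two significant digits.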
Given this, why isn't :fn-invoke-direct enabled by default when :static-fns is enabled? In other words, why is it a separate compiler option? There could still be corner cases where only the more conservative invocation using call produces the correct behavior. For this reason, the option is disabled by default, and you must enable it explicitly.
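As an illustration of how the two fallback forms can differ in plain JavaScript, consider a value that is not itself a function but carries its own call method. This is a hypothetical example chosen to show the mechanics; the post does not say which corner cases the option guards against, and the names viaCall, direct, twice and callableLike are made up for the sketch.

```javascript
// Fallback form used with :static-fns alone.
function viaCall(h, x) { return h.call(null, x); }

// Fallback form used when :fn-invoke-direct is also enabled.
function direct(h, x) { return h(x); }

// An ordinary function: both forms behave identically.
function twice(x) { return x * 2; }
console.log(viaCall(twice, 2) === direct(twice, 2));

// Hypothetical object that is not callable but exposes a `call` method.
const callableLike = {
  call: function (_thisArg, x) { return x * 2; }
};

// The conservative form works, because it only needs a `call` property.
console.log(viaCall(callableLike, 2));

// The direct form throws, because the object itself cannot be invoked.
let directError = null;
try {
  direct(callableLike, 2);
} catch (e) {
  directError = e;
}
console.log(directError instanceof TypeError);
```

For ordinary functions the two forms are interchangeable, which is why the direct form is usually safe.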
I'd suggest giving :fn-invoke-direct a try on your codebase to see if it helps with performance!