Direct Higher-Order Function Calls

November 16, 2017

When optimizing performance-critical ClojureScript code, function call overhead might be important to you. By default, when compiling ClojureScript in :advanced mode, the :static-fns compiler option is enabled. This post describes an additional option, :fn-invoke-direct, which is an extension to :static-fns for higher-order functions.




As background, it might be worth reviewing this post, which describes :static-fns, giving a concrete example of what it does to emitted code. Additionally, Static-Free ClojureScript REPL gives a concrete example illustrating why it is good to have :static-fns disabled by default for the REPL.

The :fn-invoke-direct compiler option was introduced with the ClojureScript 1.9.660 release. Let's explore this option by way of an example.

Consider the function

(defn f [h x]
  (h x))

where h is a function passed as an argument (which makes f a higher-order function), and we want to focus on the code emitted for applying h to x.

Incidentally, Lumo and Planck support the same command line options for enabling :static-fns and :fn-invoke-direct. This makes it really easy to play along at home, trying functions and seeing what code is generated. For either, -s / --static-fns enables :static-fns, and -f / --fn-invoke-direct enables :fn-invoke-direct.

If you look at the function body generated for f with :static-fns disabled, you will see that it is

return h.call(null,x);

To see this, you can enable verbose mode for your REPL, or alternatively evaluate (set! *print-fn-bodies* true) and then evaluate the function f in your REPL.

If instead you enable :static-fns, you get

return (h.cljs$core$IFn$_invoke$arity$1 ? h.cljs$core$IFn$_invoke$arity$1(x) : h.call(null,x));

This looks a bit more complicated, but all it is really doing is this: If the passed function h happens to have been defined as a multi-arity function and defines an arity for one argument, then it statically invokes that implementation. Otherwise it falls back to what you would get without :static-fns enabled: h.call(null,x).

That will result in faster runtime performance in the case where the passed function looks like this:

(defn g
  ([x] (* x 2))
  ([x y] (* x y)))

If we benchmark a call (f g 2) under :advanced, comparing :static-fns disabled vs. enabled, we get these speedups:

V8: 5.6, SpiderMonkey: 1.8, JavaScriptCore: 2.6, Nashorn: 5.2, ChakraCore: 5.1

This is done using (simple-benchmark [] (f g 2) 10000000).

Now, what if instead g were not multi-arity, but looked like

(defn g [x]
  (* x 2))

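As far as f is concerned, this definition is essentially the same, since f only invokes h with one argument. But a single-arity g has no arity-specific implementation for the emitted check to find, so the code falls back to h.call(null,x), and enabling :static-fns can't provide a speedup.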
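To make the two cases concrete, here is a minimal JavaScript sketch. It is a hand-written approximation, not actual compiler output: the names gMulti, gSingle and lastBranch are hypothetical, and it assumes that a multi-arity function is a dispatcher carrying one property per fixed arity while a single-arity function is a plain function carrying none. The test inside f is the same one used in the emitted body shown earlier.

```javascript
// Assumed shape of the multi-arity g: a dispatcher that switches on the
// argument count, with each fixed arity attached as a property.
function gMulti(x, y) {
  switch (arguments.length) {
    case 1: return gMulti.cljs$core$IFn$_invoke$arity$1(x);
    case 2: return gMulti.cljs$core$IFn$_invoke$arity$2(x, y);
    default: throw new Error("Invalid arity: " + arguments.length);
  }
}
gMulti.cljs$core$IFn$_invoke$arity$1 = function (x) { return x * 2; };
gMulti.cljs$core$IFn$_invoke$arity$2 = function (x, y) { return x * y; };

// Assumed shape of the single-arity g: a plain function, no arity properties.
function gSingle(x) { return x * 2; }

// Records which branch ran; for illustration only.
let lastBranch = null;

// Same test-and-fallback as the body emitted with :static-fns enabled.
function f(h, x) {
  if (h.cljs$core$IFn$_invoke$arity$1) {
    lastBranch = "static";
    return h.cljs$core$IFn$_invoke$arity$1(x);
  }
  lastBranch = "fallback";
  return h.call(null, x);
}

console.log(f(gMulti, 2), lastBranch);   // prints: 4 static
console.log(f(gSingle, 2), lastBranch);  // prints: 4 fallback
```

Under these assumptions, the single-arity case pays for the property check and then makes the same h.call(null,x) invocation it would have made anyway.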

In fact, for this scenario, the simpler code generated with :static-fns disabled runs faster under most VMs. Here are the speedups observed with :static-fns enabled, measured the same way, with this new definition for g (values below 1 are slowdowns):

V8: 0.42, SpiderMonkey: 0.47, JavaScriptCore: 0.62, Nashorn: 1.8, ChakraCore: 0.025

(Yes, it appears that disabling :static-fns is really the right thing to do for this particular code if you are targeting ChakraCore!)

But, what if we now also enable :fn-invoke-direct? The code emitted for f in this case has a function body that looks like:

return (h.cljs$core$IFn$_invoke$arity$1 ? h.cljs$core$IFn$_invoke$arity$1(x) : h(x));

The only difference is that now, in the fallback branch, h is directly invoked via h(x), rather than using the more conservative h.call(null,x) construct.

Here are the speedups observed when both :static-fns and :fn-invoke-direct are enabled, compared to a baseline when :static-fns is disabled:

V8: 0.86, SpiderMonkey: 0.54, JavaScriptCore: 0.63, Nashorn: 2.2, ChakraCore: 0.025

In this case :fn-invoke-direct helped recover some of the performance drop. Another way to look at this improvement is the speedup where :static-fns is always enabled, comparing :fn-invoke-direct disabled vs. enabled:

V8: 2.1, SpiderMonkey: 1.1, JavaScriptCore: 1.0, Nashorn: 1.2, ChakraCore: 1.0

This last set of speedups is perhaps a more useful way of looking at things if you are going to assume :static-fns is enabled anyway, and want to see how :fn-invoke-direct might help things if it were enabled.
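The relationship between the three sets of figures can be checked with a little arithmetic: dividing each combined speedup by the corresponding :static-fns-only speedup should roughly reproduce the last set. The sketch below does this in JavaScript using the rounded figures quoted above, on the assumption that all three sets were measured against the same baseline. Because the inputs are already rounded, small discrepancies are expected (V8 comes out at about 2.05 rather than 2.1).

```javascript
// Speedups with :static-fns enabled, relative to :static-fns disabled.
const staticFnsOnly = {
  V8: 0.42, SpiderMonkey: 0.47, JavaScriptCore: 0.62, Nashorn: 1.8, ChakraCore: 0.025
};

// Speedups with both options enabled, relative to the same baseline.
const withInvokeDirect = {
  V8: 0.86, SpiderMonkey: 0.54, JavaScriptCore: 0.63, Nashorn: 2.2, ChakraCore: 0.025
};

// Ratio of the two: the gain from :fn-invoke-direct with :static-fns held on.
const relative = {};
for (const vm of Object.keys(staticFnsOnly)) {
  relative[vm] = withInvokeDirect[vm] / staticFnsOnly[vm];
}

for (const [vm, ratio] of Object.entries(relative)) {
  console.log(vm + ": " + ratio.toFixed(2));
}
```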

Given this, why isn't :fn-invoke-direct enabled by default when :static-fns is enabled? In other words, why is it a separate compiler option? There could still be corner cases where only the more conservative invocation using call produces the correct behavior. For this reason, the option is disabled by default, and you must explicitly enable it.
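One way the two invocation forms can differ in plain JavaScript is in what the callee sees as this. The sketch below illustrates that general language behavior with a hypothetical object ns and function h. Whether this is the kind of corner case the option guards against is an assumption; the point is only that call(null, ...) and a direct call are not always interchangeable.

```javascript
// Hypothetical namespace-like object holding a function that inspects `this`.
const ns = {
  h: function (x) {
    // Report whether the function saw `ns` as its receiver.
    return { receiverIsNs: this === ns, value: x * 2 };
  }
};

// Conservative form: the receiver is explicitly null.
const viaCall = ns.h.call(null, 21);

// Direct form on a property access: the receiver is `ns`.
const viaDirect = ns.h(21);

// With a plain local, neither form passes `ns` as the receiver, so they agree.
const h = ns.h;
const localCall = h.call(null, 21);
const localDirect = h(21);

console.log(viaCall.receiverIsNs, viaDirect.receiverIsNs);     // prints: false true
console.log(localCall.receiverIsNs, localDirect.receiverIsNs); // prints: false false
```

For functions that never look at this, the results are identical in every case, which is consistent with the option being safe for most code.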

I'd suggest giving :fn-invoke-direct a try on your codebase to see if it helps with performance!
