'Lean' Runtime

The problem

Clojure's support for dynamic development incurs runtime costs that may be undesirable in certain production environments. These include:

Startup time
Var.getRawRoot overhead
Heap use
Deployment size

Additionally, the indirection involved in using a Var, especially obtaining a reference to it via Var.intern(String, String), makes Clojure-generated bytecode hard to analyze with tools like ProGuard.

Startup time

Starting a Clojure program takes a significant amount of time, most of which is taken up by the Clojure runtime bootstrapping itself, i.e. loading and initialising namespaces and vars (see Why is Clojure bootstrapping so slow?). For long-running applications, this is probably not an issue. However, this is a problem in other scenarios:

Android

Android applications should ideally load about as fast as Java or Scala applications. Currently, on a high-end phone, a minimal Clojure program takes about one second longer to load, which is perceptible. In some cases, showing a Java splash screen while Clojure bootstraps may be an acceptable solution, but that doesn't work for all programs.

Command line programs/utilities

Depending on the type of utility, Clojure's startup time can dominate the time spent doing the actual work. Current workarounds include using a persistent JVM, e.g. Nailgun, or using ClojureScript with Node.js.

Google App Engine (GAE)

GAE imposes a strict sixty-second time limit for responding to a request. If the application hasn't warmed up, Clojure's startup can take up a significant portion of that time, causing the initial request to time out.

`Var.getRawRoot` overhead

Generally speaking, each time a Var is used within a function invocation, its root binding must be retrieved. While this is a very simple operation, it does require reading a volatile variable, which may impede optimisation. In some programs, this overhead is measurable. Currently, the best workaround is to use a macro or definline, but this results in code that is harder to read and write.
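
To make the indirection concrete, here is a minimal Java sketch. `SketchVar` and `VarCallSite` are hypothetical names, and `SketchVar` is a heavily simplified stand-in for `clojure.lang.Var`, not the real class. It only illustrates that a call through a Var re-reads a volatile field on every invocation, while a direct call does not.

```java
import java.util.function.IntUnaryOperator;

// Hypothetical, simplified stand-in for clojure.lang.Var; not the real class.
final class SketchVar<T> {
    // A volatile root makes a redefinition on one thread visible to the
    // others, but every use has to re-read the field.
    private volatile T root;

    SketchVar(T root) {
        this.root = root;
    }

    T getRawRoot() {
        return root;
    }

    void bindRoot(T newRoot) {
        this.root = newRoot;
    }
}

final class VarCallSite {
    static final SketchVar<IntUnaryOperator> INC = new SketchVar<>(x -> x + 1);

    // Call through the Var: one volatile read per invocation.
    static int viaVar(int x) {
        return INC.getRawRoot().applyAsInt(x);
    }

    // The direct equivalent that needs no Var lookup at all.
    static int direct(int x) {
        return x + 1;
    }

    public static void main(String[] args) {
        System.out.println(viaVar(41)); // prints 42
        INC.bindRoot(x -> x + 100);     // redefinition is seen by the next call
        System.out.println(viaVar(41)); // prints 141
        System.out.println(direct(41)); // prints 42
    }
}
```

The flexibility shown by `bindRoot` is exactly what dynamic development relies on. The cost is that the lookup cannot simply be assumed constant across calls.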

Heap use

Clojure 1.5.1 uses over seventeen megabytes of heap just starting a basic REPL. Much of this results from Clojure loading a large number of Vars that may or may not be used in a given program. For servers with gigabytes of RAM, this may just be noise. However, in more constrained environments, e.g. Android, this can be more of an issue.

Manual tree-shaking via commenting out portions of clojure.core has shown significant reductions in heap use. Unfortunately, namespaces are currently atomic and cannot be easily broken up.
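
For comparing heap figures before and after changes like these, one rough approach is to sample the JVM's own memory statistics. The sketch below uses only the standard `java.lang.management` API. The class name `HeapProbe` is hypothetical, and this is not necessarily how the numbers quoted above were obtained.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

final class HeapProbe {
    // Returns the heap currently in use, in bytes, after requesting a GC.
    // The GC request is only a hint to the JVM, so the result is approximate.
    static long usedHeapBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        memory.gc();
        return memory.getHeapMemoryUsage().getUsed();
    }

    static double toMegabytes(long bytes) {
        return bytes / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        long before = usedHeapBytes();
        // Load the code under measurement here, e.g. bootstrap a runtime.
        long after = usedHeapBytes();
        System.out.printf("baseline: %.1f MB, delta: %.1f MB%n",
                toMegabytes(before), toMegabytes(after - before));
    }
}
```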

Deployment size

Clojure 1.5.1's JAR takes up about 3.5 megabytes. This isn't an issue in many environments, but it can be for mobile applications or when Clojure is used as a library.

Mitigation strategies

A number of mitigation strategies have been proposed to resolve one or more of the above issues:

Static compilation
Lazy loading of Vars
Don't load the user namespace by default
Eliminate the compiler

Static compilation

Overall strategy:
- The general idea is to compile Vars down to static final fields and methods on namespace classes.
- In the ideal case, there are no Vars involved at the point of use/invocation, just an invokestatic or getstatic op.
- However, static methods aren't first-class functions, so there is still a need to be able to invoke them through an object, possibly a Var or another IFn instance.
- Dynamic Vars would remain more or less the same.
Things to think about:
- What's the best way to interoperate with code that explicitly manipulates namespaces or Vars, e.g. ns-resolve or with-redefs?
- Where does metadata go?
- Should a 'static' Var invoke/get the static method/field, or should it have its w
- In a more 'static' environment, is it possible to eliminate the volatile on a Var's root binding?
- Given the changes in how things are compiled and possibly even changes to core Clojure classes, how do we solve the artifact distribution problem? See Build Profiles.
- Is there a way to resolve Vars that enable dynamic development starting from a largely static runtime?
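
The overall strategy can be pictured with a small Java sketch of what a statically compiled namespace might look like. Everything here is hypothetical: the namespace `my.ns`, the class `MyNs`, and its members are illustrative, and `java.util.function.Function` merely stands in for an IFn instance. It is not the output of any existing Clojure compiler mode.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical result of compiling a namespace `my.ns` that contains
// (def greeting "hello") and (defn shout [s] ...).
final class MyNs {
    // The def becomes a static final field: a getstatic at the use site.
    static final String GREETING = "hello";

    // The defn becomes a static method: an invokestatic at the call site.
    static String shout(String s) {
        return s.toUpperCase(Locale.ROOT) + "!";
    }

    // Static methods are not first-class values, so higher-order uses still
    // need an object that forwards to the static method (standing in for IFn).
    static final Function<String, String> SHOUT_FN = MyNs::shout;
}

final class StaticCompilationDemo {
    // Direct use: no Var lookup anywhere on this path.
    static String direct() {
        return MyNs.shout(MyNs.GREETING);
    }

    // Higher-order use: the function object is passed as a value.
    static List<String> higherOrder(List<String> words) {
        return words.stream().map(MyNs.SHOUT_FN).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(direct());                                      // prints HELLO!
        System.out.println(higherOrder(Arrays.asList("lean", "runtime"))); // prints [LEAN!, RUNTIME!]
    }
}
```

The sketch leaves the open questions untouched: nothing here can be redefined at runtime, and there is no obvious place for metadata.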

Lazy loading of Vars

Currently, all Vars are created when their namespace is loaded. What if that work could be deferred?
This strategy has been used in clojure-objc.
While this ameliorates the startup time problem, all it does is put off the work to some other point in the program. However, in conjunction with static compilation, it might mean that the Vars are only created for use in higher-order functions.
Is there an inexpensive way to keep track of whether a Var has been loaded? Could we use the linker for this?
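
One possible reading of the linker question: the JVM already initialises a class lazily, on first active use, and guarantees that this happens exactly once. The sketch below uses the standard initialization-on-demand holder idiom to defer creating a value until it is first read. The names `LazyNs`, `ExpensiveHolder`, and the counter are hypothetical and exist only for illustration. This is not how Clojure or clojure-objc implement it.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class LazyNs {
    // Counts how many values have actually been initialised (illustration only).
    static final AtomicInteger INITIALISED = new AtomicInteger();

    // Initialization-on-demand holder: the JVM runs this class's static
    // initialiser the first time EXPENSIVE is read, and only once.
    private static final class ExpensiveHolder {
        static final String EXPENSIVE = init("expensive");
    }

    private static String init(String name) {
        INITIALISED.incrementAndGet();
        return name + "-value";
    }

    // Accessor used by callers; the first call triggers the initialisation.
    static String expensive() {
        return ExpensiveHolder.EXPENSIVE;
    }

    public static void main(String[] args) {
        System.out.println(INITIALISED.get()); // prints 0: nothing created yet
        System.out.println(expensive());       // prints expensive-value
        System.out.println(expensive());       // prints expensive-value
        System.out.println(INITIALISED.get()); // prints 1: created exactly once
    }
}
```

With this approach no separate "loaded" flag is needed, because class initialisation state is tracked by the JVM itself. As noted above, the work is still done eventually, just at first use instead of at namespace load.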

Don't load the `user` namespace by default

In some environments, creating the user namespace is pure overhead. Removing this can make a difference at startup time and on memory use.
How do we know when to do this or not?

Eliminate the compiler

If it's not needed, why include it? This can make a big difference in deployment size.
See Build Profiles.

Other things to think about

The big one is Build Profiles.
- At least in the Clojure open source world, most libraries are distributed as JARs of Clojure source files. As such, library authors are largely off the hook when it comes to making compilation decisions.
- However, this isn't the case for everything, namely Clojure itself.
What should be the default compilation mode?
What sorts of tests should be run to ensure that the technical solution is optimal?
How independent should each of these changes be? Is it just a master production-/development-mode switch or are individual compilation features independently switchable?
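
The two options in the last question are not mutually exclusive. As a purely illustrative Java sketch (the feature names, `LeanFeature`, and `BuildProfile` are all hypothetical and correspond to no existing Clojure option), a master switch can be modelled as a preset over independently switchable features:

```java
import java.util.EnumSet;

// Hypothetical feature names mirroring the strategies on this page.
enum LeanFeature { STATIC_COMPILATION, LAZY_VARS, SKIP_USER_NS, NO_COMPILER }

final class BuildProfile {
    private final EnumSet<LeanFeature> enabled;

    private BuildProfile(EnumSet<LeanFeature> enabled) {
        this.enabled = enabled;
    }

    // Master switch, development side: everything stays dynamic.
    static BuildProfile development() {
        return new BuildProfile(EnumSet.noneOf(LeanFeature.class));
    }

    // Master switch, production side: every lean feature is on.
    static BuildProfile production() {
        return new BuildProfile(EnumSet.allOf(LeanFeature.class));
    }

    // Independent switches: derive a new profile with one feature toggled.
    BuildProfile with(LeanFeature feature) {
        EnumSet<LeanFeature> copy = EnumSet.copyOf(enabled);
        copy.add(feature);
        return new BuildProfile(copy);
    }

    BuildProfile without(LeanFeature feature) {
        EnumSet<LeanFeature> copy = EnumSet.copyOf(enabled);
        copy.remove(feature);
        return new BuildProfile(copy);
    }

    boolean isEnabled(LeanFeature feature) {
        return enabled.contains(feature);
    }

    public static void main(String[] args) {
        // Production build that still ships the compiler, e.g. to allow eval.
        BuildProfile profile = production().without(LeanFeature.NO_COMPILER);
        System.out.println(profile.isEnabled(LeanFeature.STATIC_COMPILATION)); // prints true
        System.out.println(profile.isEnabled(LeanFeature.NO_COMPILER));        // prints false
    }
}
```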

Development plan

There is a good chance we can have a Google Summer of Code student work on this.
It would be good to see some form of this in the next Clojure release.
