A small journey on how to make your Scala applications faster and slimmer taking advantage of GraalVM native-image.
Scala development is notoriously plagued by its long compilation times and large deployment artifacts. This gets even worse if you are using Docker considering you need a Java Runtime that can take almost 100 megabytes.
In the first quarter of 2018, Oracle publicly announced GraalVM, a project they had been working on for 6 years.
Graal is a polyglot virtual machine providing high performance for individual languages and interoperability with zero performance overhead for creating polyglot applications.
It now supports languages ranging from Java (and other JVM languages such as Scala and Kotlin) to Node.js, Python, and R, and adds out-of-the-box, zero-configuration improvements to the runtime.
Graal has two flavors, the Community Edition and the Enterprise Edition. The former is open source on GitHub and available for Linux, while the latter is not open source, is available for macOS, and requires a license for production use. The Enterprise Edition has benefits such as a smaller footprint, better performance, and sandbox capabilities.
In this blog post we will use the Community Edition on Linux but you can also go through the same steps with the Enterprise version on your Mac.
Now the question is, can GraalVM solve or even help with our problems?
Regarding compilation time, Vojin Jovanovic already wrote a nice article called Compiling Scala Faster with GraalVM and left some ideas for improvements, and Christian Schmitt, from the Play Framework team, wrote Running Play on GraalVM that might interest some Scala developers.
But what about size?
Of all the features GraalVM provides one caught my attention: native-image. The GraalVM native-image is a tool that enables ahead-of-time (AOT) compilation of JVM applications into native executables or shared libraries.
While regular Java Virtual Machine (JVM) code is just-in-time (JIT) compiled at runtime, GraalVM native-image compiles it ahead of time. As described by Codrut Stancu in Instant Netty Startup using GraalVM Native Image Generation, AOT has two main advantages:
First, it improves the start-up time since the code is already pre-compiled into efficient machine code. Secondly, it reduces the memory footprint of Java applications, since it eliminates the need to include infrastructure to load and optimize code at run time.
Still, there are additional advantages such as more predictable performance, less total CPU usage, and the ability to remove unused parts from the runtime binary thus making it smaller, much smaller.
The technology behind the GraalVM native-image is called Substrate virtual machine (VM), which in addition to your application code and dependencies, also contains parts of the Java runtime, including the garbage collector. As Codrut Stancu explains in Instant Netty Startup using GraalVM Native Image Generation:
“To achieve the goals of a self-contained executable, the native-image tool performs static analysis to find ahead-of-time the code that your application uses. This includes the parts of the JDK that your application uses, third party library code, and the VM code itself (also written in Java). Therefore, at run time the native executable depends on your system native libraries, like libc, but even this dependency can be removed if you choose to fully link the native dependencies statically.”
In the end, you should get a native application binary that is both smaller and faster.
Scala, SBT and Docker
At Codacy, we deploy most of our Scala applications using the amazing sbt-native-packager directly from our SBT projects.
This makes for minimal setup and quick and simple deployment. The problem with this deployment method is that the size of the generated docker image is around 100 MB (or higher), even for small applications.
This is due to the size of the Java runtime itself, but also to all the dependencies and Scala boilerplate that are included even when they are not used.
Reducing this to the minimum required should give us an artifact whose size matches the amount of code in it, instead of starting from 100 MB for single-file applications.
Scala, SBT, Docker and GraalVM native-image
GraalVM native-image to the rescue.
The main input native-image needs is the application classpath. In SBT, you can get it by running
sbt 'export runtime:fullClasspath'.
Once we have this, we need to invoke native-image with the obtained classpath, a name for the binary, and the entry point of our application.
You might need more flags and some build tweaks for your specific application, but if your app is simple enough, something like
native-image -cp <APP_CLASSPATH> -H:Name=<BINARY_NAME> -H:Class=<APP_MAIN_CLASS> should do it.
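The two steps above can be chained in a small shell sketch. It assumes that sbt prints the classpath as the last non-empty line of its output, which is worth checking against your sbt version. The function names and the `NATIVE_IMAGE` variable are hypothetical helpers, not part of sbt or GraalVM.

```shell
# Read sbt's output on stdin and keep only the last non-empty line.
# Assumption: that line is the classpath printed by `export`.
extract_classpath() {
  grep -v '^[[:space:]]*$' | tail -n 1
}

# Invoke native-image with a classpath, a binary name and a main class.
# NATIVE_IMAGE is a hypothetical override for the executable to call.
build_native() {
  "${NATIVE_IMAGE:-native-image}" -cp "$1" "-H:Name=$2" "-H:Class=$3"
}

# Usage (requires sbt and GraalVM's native-image on the PATH):
#   classpath="$(sbt 'export runtime:fullClasspath' | extract_classpath)"
#   build_native "$classpath" my-app my.main.App
```

Keeping the classpath in a quoted variable avoids word splitting if any path contains spaces.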
One of the biggest overheads we have at Codacy is in our tools. We run a lot of open source tools to do static analysis, so we have a Scala wrapper invoking each tool and preparing the results in a nice format for us. These projects are composed of only a couple of source files, but always lead to docker images between 100 MB and 200 MB, if not 1 GB when we depend on other platforms like Swift or Haskell (but that is a different problem altogether).
Using GraalVM my aim was to slim the Java runtime and to have a smaller docker image with my statically linked binary.
For the sake of this tutorial, I’ve extracted the scripts so they can be used as standalone steps in any build.
build-native-image.sh is a simple native-image wrapper allowing you to build your binary for Linux and macOS if you have GraalVM configured on your machine, or for Linux if you have docker.
The script should support generic SBT apps and requires a name for the target binary, the main class of your application, and the app version to build the binary name. By default, it will build the binary for Linux using docker, but you can specify the target as native and it will use the locally configured GraalVM to build it for your platform.
By adding the code to a file named
build-native-image.sh and giving execution permissions you can run:
./build-native-image.sh -n my-app -m my.main.App -t docker 1.0.0 and it will produce a binary inside a build directory in your project root.
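As an illustration of the interface described above, here is a minimal sketch of how such a wrapper could parse its options. It is not the original script: the function name, the variable names, and the `build/<name>-<version>` naming scheme are assumptions made for the example.

```shell
# Hypothetical option handling for a build-native-image.sh style wrapper.
# Sets: name, main_class, target, version, binary.
parse_args() {
  target="docker"   # default target, as described in the text
  name=""
  main_class=""
  OPTIND=1
  while getopts "n:m:t:" opt; do
    case "$opt" in
      n) name="$OPTARG" ;;
      m) main_class="$OPTARG" ;;
      t) target="$OPTARG" ;;
      *) return 1 ;;
    esac
  done
  shift $((OPTIND - 1))
  version="${1:-}"

  # Name, main class and version are all required.
  [ -n "$name" ] && [ -n "$main_class" ] && [ -n "$version" ] || return 1

  # Only the two targets mentioned in the text are accepted.
  case "$target" in
    docker|native) ;;
    *) return 1 ;;
  esac

  # Assumed naming scheme for the output inside the build directory.
  binary="build/${name}-${version}"
}
```

After `parse_args "$@"` succeeds, the wrapper can branch on `$target` to run native-image either locally or inside a container.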
The following tables summarize the results obtained by using a standard jar versus using the GraalVM binary (both inside a docker container) for the above case of the codacy-eslint tool.
| | Docker w/ Jar (MB) | Docker w/ GraalVM binary (MB) |
| Node + Yarn + Dependencies | 151 | 166 |
These values reflect the big improvement in the size of the binary where AOT reduced the required libraries versus the full JVM and dependencies.
|Docker w/ Jar (s)||Docker w/ GraalVM binary (s)|
This speed improvement (3s) does not seem to vary significantly with the total run time but is instead a constant startup time improvement. It is, therefore, particularly helpful for short running applications. In our specific case our analyses entail running several small tools, so this small improvement quickly adds up, leading to a high impact on the total analysis time.
Problems and Limitations
- If your application or any of its dependencies are using the Java Native Interface (JNI) you need to enable it with the flag
- Due to AOT and the removal of unused elements, some runtime dependencies might be missing and you need to force their inclusion. For example, when using some XML features I had to add
- Lazy values are also a problem because they depend on static initializers. Substrate VM has limitations regarding what you can do in a static initializer, and sometimes you might need to remove lazy values. In most cases that will also improve your code, since AOT does the work at compile time: resources that are read and structures that are prepared in initializers are ready when the binary runs, reducing startup overhead.
- Since you might still bring in features you are not using, adding
-H:+ReportUnsupportedElementsAtRuntime will let the compilation pass and warn you at runtime instead (which can be dangerous)
- HTTPS support is available since release candidate 7, but you need to distribute the binary together with
libsunec.so because it contains parts of the implementation
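For the HTTPS case, shipping the library alongside the binary can be a single copy step in the build. The sketch below uses a hypothetical helper name, and the location of libsunec.so inside the GraalVM distribution is an assumption that may differ between releases, so the path is passed in explicitly.

```shell
# Copy libsunec.so into the directory that holds the native binary.
# $1: directory containing the binary, $2: path to libsunec.so.
bundle_sunec() {
  binary_dir="$1"
  sunec_lib="$2"
  if [ ! -f "$sunec_lib" ]; then
    echo "libsunec.so not found at $sunec_lib" >&2
    return 1
  fi
  cp "$sunec_lib" "$binary_dir/"
}

# Usage (the path below is an assumption for a JDK 8 based Linux build):
#   bundle_sunec build "$GRAALVM_HOME/jre/lib/amd64/libsunec.so"
```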