How to use GraalVM to make Scala applications faster

A small journey on how to make your Scala applications faster and slimmer by taking advantage of GraalVM native-image.

Intro

Scala development is notoriously plagued by long compilation times and large deployment artifacts. This gets even worse if you are using Docker, considering that you need a Java runtime that can take almost 100 megabytes.

In the first quarter of 2018, Oracle publicly announced a project they had been working on for six years, called GraalVM.

Graal is a polyglot virtual machine providing high performance for individual languages and interoperability with zero performance overhead for creating polyglot applications.

It now supports languages ranging from Java (and other JVM languages such as Scala and Kotlin) to Node.js, Python and R, and adds out-of-the-box, zero-configuration improvements to the runtime.

Graal has two flavors, the Community Edition and the Enterprise Edition. The former is open source on GitHub and available for Linux, while the latter is not open source, is available for macOS, and requires a license for production use. The Enterprise Edition has benefits such as a smaller footprint, better performance and sandbox capabilities.

In this blog post we will use the Community Edition on Linux, but you can also go through the same steps with the Enterprise Edition on your Mac.

Problems

Now the question is, can GraalVM solve or even help with our problems?

Regarding compilation time, Vojin Jovanovic already wrote a nice article called Compiling Scala Faster with GraalVM and left some ideas for improvements, and Christian Schmitt, from the Play Framework team, wrote Running Play on GraalVM, which might interest some Scala developers.

But what about size?

GraalVM features

Of all the features GraalVM provides, one caught my attention: native-image. The GraalVM native-image is a tool that enables ahead-of-time (AOT) compilation of JVM applications into native executables or shared libraries.

While regular Java Virtual Machine (JVM) code is just-in-time (JIT) compiled at runtime, GraalVM native-image compiles it ahead of time. As described by Codrut Stancu in Instant Netty Startup using GraalVM Native Image Generation, AOT has two main advantages:

First, it improves the start-up time since the code is already pre-compiled into efficient machine code. Secondly, it reduces the memory footprint of Java applications, since it eliminates the need to include infrastructure to load and optimize code at run time.

Still, there are additional advantages such as more predictable performance, less total CPU usage, and the ability to remove unused parts from the runtime binary thus making it smaller, much smaller.

The technology behind the GraalVM native-image is called Substrate VM, which, in addition to your application code and dependencies, also contains parts of the Java runtime, including the garbage collector. As Codrut Stancu explains in Instant Netty Startup using GraalVM Native Image Generation:

“To achieve the goals of a self-contained executable, the native-image tool performs static analysis to find ahead-of-time the code that your application uses. This includes the parts of the JDK that your application uses, third party library code, and the VM code itself (also written in Java). Therefore, at run time the native executable depends on your system native libraries, like libc, but even this dependency can be removed if you choose to fully link the native dependencies statically.”

In the end, you should get a native application binary that is both smaller and faster.

Scala, SBT and Docker

At Codacy, we deploy most of our Scala applications using the amazing sbt-native-packager directly from our SBT projects.

This makes for minimal setup and quick, simple deployment. The problem with this deployment method is that the size of the generated Docker image is around 100 MB (or higher), even for small applications.

This is due to the size of the Java runtime itself, but also to all the dependencies and Scala boilerplate that are included even when they are never used.

Reducing this to the minimum required should help us reach an image size that matches the amount of code in the artifact, instead of starting from 100 MB for single-file applications.

Scala, SBT, Docker and GraalVM native-image

GraalVM native-image to the rescue.

The main input native-image requires is the application classpath. In SBT, you can get it by running sbt 'export runtime:fullClasspath'.

Once we have this, we need to invoke native-image with the obtained classpath, a name for the binary and the entrypoint of our application.

You might need some more flags, and you might need to tweak the build for your specific application. But if your app is simple enough, something like native-image -cp <APP_CLASSPATH> -H:Name=<BINARY_NAME> -H:Class=<APP_MAIN_CLASS> should do it.
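
As a rough sketch of the two steps above, the snippet below takes the output of the sbt export and turns it into a native-image command line. It assumes the classpath is the last line sbt prints (log lines come first). The sample output, my-app and my.main.App are placeholders, so it runs without sbt or GraalVM installed, and it only prints the command instead of running it.

```shell
#!/bin/sh
# Sketch only: the sample data and names below are placeholders, not a real build.

# Keep only the last line of sbt's output, assumed to be the classpath.
extract_classpath() {
  tail -n 1
}

# Print the native-image command for a classpath, binary name and main class.
native_image_cmd() {
  printf 'native-image -cp %s -H:Name=%s -H:Class=%s\n' "$1" "$2" "$3"
}

# Hypothetical sbt output standing in for: sbt 'export runtime:fullClasspath'
sample_output='[info] Loading project definition from /app/project
/app/target/scala-2.12/classes:/app/lib/scala-library.jar'

classpath=$(printf '%s\n' "$sample_output" | extract_classpath)
native_image_cmd "$classpath" my-app my.main.App
```

In a real project you would pipe the actual sbt command into extract_classpath and run the printed command. Paths containing spaces would need extra quoting, which this sketch does not handle.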

Example

One of the biggest overheads we have at Codacy is in our tools. We run a lot of open source tools to do static analysis, so we have a Scala wrapper invoking each tool and preparing the results in a nice format for us. These projects consist of only a couple of source files, but they always lead to Docker images between 100 MB and 200 MB, if not 1 GB when we depend on other platforms like Swift or Haskell (but that is a different problem altogether).

Using GraalVM my aim was to slim the Java runtime and to have a smaller docker image with my statically linked binary.

The first example I tried was codacy-eslint. It has other details that I am currently working on in codacy/codacy-eslint#native-image.

For the sake of this tutorial, I’ve extracted the scripts so they can be used as standalone steps in any build.

build-native-image.sh is a simple native-image wrapper that lets you build your binary for Linux and macOS if you have GraalVM configured on your machine, or for Linux if you have Docker.

The script should support generic SBT apps and requires a name for the target binary, the main class of your application, and the app version to build the binary name. By default, it will build the binary for Linux using docker, but you can specify the target as native and it will use the locally configured GraalVM to build it for your platform.

By adding the code to a file named build-native-image.sh and giving it execution permissions, you can run ./build-native-image.sh -n my-app -m my.main.App -t docker 1.0.0 and it will produce a binary inside a build directory in your project root.
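
To make the command-line interface described above more concrete, here is a minimal sketch of how such a wrapper could parse its arguments. It is not the original script: the parse_args function, the variable names and the build/<name>-<version> naming scheme are assumptions made for illustration, and the sketch stops before invoking sbt, Docker or native-image.

```shell
#!/bin/sh
# Hypothetical sketch of the argument handling only; not the original script.

parse_args() {
  name="" main_class="" target="docker"   # docker is the default target
  OPTIND=1
  while getopts "n:m:t:" opt; do
    case "$opt" in
      n) name="$OPTARG" ;;
      m) main_class="$OPTARG" ;;
      t) target="$OPTARG" ;;
      *) return 1 ;;
    esac
  done
  shift $((OPTIND - 1))
  version="${1:-}"

  # Name, main class and version are all required.
  if [ -z "$name" ] || [ -z "$main_class" ] || [ -z "$version" ]; then
    return 1
  fi
  # Only the two targets described in the text are accepted.
  case "$target" in
    docker|native) ;;
    *) return 1 ;;
  esac
  # Assumed naming scheme: build/<name>-<version>
  binary="build/${name}-${version}"
}

if parse_args -n my-app -m my.main.App -t docker 1.0.0; then
  echo "would build $binary for target $target"
else
  echo "usage: build-native-image.sh -n <name> -m <main class> [-t docker|native] <version>" >&2
fi
```

Leaving out -t falls back to the docker target, which matches the default described above.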

Improvements

The following tables summarize the results obtained by using a standard jar versus using the GraalVM binary (both inside a docker container) for the above case of the codacy-eslint tool.

Size

                              Docker w/ Jar (MB)    Docker w/ GraalVM binary (MB)
Node + Yarn + Dependencies    151                   166
Application                   92 (w/ JVM)           24
Alpine base                   4                     4
Total                         247                   194
Improvement                                         ~ -53

These values reflect the big improvement in the size of the application itself, which drops from 92 MB to 24 MB because AOT compilation keeps only the required libraries instead of the full JVM and all dependencies.

In this specific case, because we require over 150 MB of JavaScript dependencies, the reduction might seem small, but in most cases the improvement can be more than half of the size of the Docker image.
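
The relative savings can be worked out from the size figures reported in this post. The short sketch below does the arithmetic with integer division, so percentages are rounded down. The helper name pct_saved is made up for this example, and the third line is a derived scenario (application plus Alpine base, without the Node dependencies) rather than a measured image.

```shell
#!/bin/sh
# Percentage saved when going from $1 MB to $2 MB (integer, rounded down).
pct_saved() {
  echo $(( ($1 - $2) * 100 / $1 ))
}

echo "whole image: $(pct_saved 247 194)%"                  # prints "whole image: 21%"
echo "application layer: $(pct_saved 92 24)%"              # prints "application layer: 73%"
echo "application + Alpine base: $(pct_saved 96 28)%"      # prints "application + Alpine base: 70%"
```

Without the JavaScript dependencies, the image would shrink by roughly 70%, which is consistent with the claim that the improvement can exceed half of the image size.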

Speed

               Docker w/ Jar (s)    Docker w/ GraalVM binary (s)
Total          6                    3
Improvement                         ~ 3

This speed improvement (3 s) does not seem to vary significantly with the total run time; instead, it is a constant improvement in startup time. It is, therefore, particularly helpful for short-running applications. In our specific case, our analyses entail running several small tools, so this small improvement quickly adds up and has a high impact on the total analysis time.

Problems and Limitations

  • If your application or any of its dependencies uses the Java Native Interface (JNI), you need to enable it with the flag -H:+JNI.
  • Due to AOT and the removal of unused elements, some runtime dependencies might be missing and you need to force their inclusion. For example, when using some XML features I had to add -H:IncludeResourceBundles=com.sun.org.apache.xerces.internal.impl.msg.XMLMessages.
  • Lazy values are also a problem because they depend on static initializers. Substrate VM has limitations regarding what you can do in a static initializer, and sometimes you might need to remove lazy values. In most cases that will also improve your code, because AOT does the work at compile time: resources are read and structures are prepared ahead of time, which reduces startup overhead.
  • Because you might still pull in features you are not using, adding -H:+ReportUnsupportedElementsAtRuntime will help you get past compilation, and unsupported elements will be reported at runtime instead (this can be dangerous).
  • HTTPS support is available since release candidate 7, but you need to distribute the binary together with libsunec.so because it contains parts of the implementation.
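
The flags from the list above can be collected into a single native-image invocation. The sketch below maps a few feature names to those flags. The function flags_for and the feature names (jni, xml-messages, defer-unsupported) are invented for this example; only the flags themselves come from the list. Which flags you actually need depends on your application.

```shell
#!/bin/sh
# Sketch: translate hypothetical feature names into the native-image flags
# mentioned in the list above and print them as one space-separated string.
flags_for() {
  flags=""
  for feature in "$@"; do
    case "$feature" in
      jni) flag="-H:+JNI" ;;
      xml-messages) flag="-H:IncludeResourceBundles=com.sun.org.apache.xerces.internal.impl.msg.XMLMessages" ;;
      defer-unsupported) flag="-H:+ReportUnsupportedElementsAtRuntime" ;;
      *) echo "unknown feature: $feature" >&2; return 1 ;;
    esac
    # Append the flag, adding a separating space only when needed.
    flags="${flags:+$flags }$flag"
  done
  printf '%s\n' "$flags"
}

flags_for jni defer-unsupported
```

The printed flags would be appended to the basic native-image -cp ... -H:Name=... -H:Class=... command.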
