Improving Median Lambda Init Times By 23%
Sebastian Staffa
Published on Dec 9, 2024, 12:56 PM
One of the most powerful features of the nix package manager is the ability to modify the build of each and every piece of software with ease - you don't need to, but you can. This is especially useful whenever we want to customize a tool or library for our use case or to optimize it for a specific environment.
In this post, we will leverage the capabilities of nix to build a minimalistic Docker image for a Node.js Lambda application, speed up the median cold start of the Lambda function by 23%, and thus save money in the process.
This is a long blog post; click here if you are just interested in the results.
The idea
I initially got the idea for this blog post when bundling an application for Lambda for a customer of mine. To debug some permission issues, I deployed an empty Lambda function using a custom Docker image and noticed that the cold start times were consistently in the mid-200ms range, which is quite a bit higher than I would expect for a Lambda function that does nothing.
Upon further investigation I noticed that the Node.js AWS Lambda base image, which I was using to build the Docker image containing my function code, was already scratching the 380MB mark unzipped, or 115MB compressed. As a cloud is nothing more than a bunch of computers, I figured that the size of the image might have an impact on the cold start, as the image needs to be downloaded and unpacked before the function can be executed.
To test my hypothesis, I set out to build the smallest image possible that still runs a function. In order to achieve that, I would need maximum control over everything that goes into the image - and what better tool to do that than nix?
About Lambda init times
For those who don't know: AWS Lambda is a serverless "function as a service", which, in simple terms, allows you to upload your code to the service which will handle its execution for you without the need to manage actual VMs or machines.
Whenever a new request is sent to a Lambda function, the service first checks if there are any idle workers available to process it. If not, a new worker is started. The time it takes for this worker to be ready to process the request is called init time and actually consists of multiple init phases: extension, runtime, and function init.
After a worker is done processing a request, it does not shut down instantly, but is kept in a warm state, ready to accept the next request. In this case, no additional init phase is needed. If the worker does not receive a new request within a certain timeframe it is shut down completely.
In practice, you notice these init times most often whenever there is a spike in requests and a lot of additional workers need to be started. If you are using lambda to serve HTTP requests, e.g. through an API Gateway, you may notice your site becoming sluggish.
Building a basic node image using dockerTools
The nix build system provides its own way to build a Docker image from any given nix expression with its dockerTools suite of tools. We'll be using the vanilla dockerTools.buildImage function to build our minimal Docker image.
If we want to run Node.js on a Lambda, we'll at least need the Node.js binary. The derivation to create a Docker image that only contains Node.js 20 would look like this:
{pkgs ? import <nixpkgs> {}}:
pkgs.dockerTools.buildImage {
  name = "node-only-test";
  tag = "latest";

  copyToRoot = pkgs.buildEnv {
    name = "image-root";
    paths = [pkgs.nodejs_20];
    pathsToLink = ["/bin"];
  };

  config = {
    Cmd = ["/bin/node"];
  };
}
We can assert that this derivation works as we would expect by running:
$ docker load -i $(nix-build)
d03ef6a91078: Loading layer [=====================>] 194.7MB/194.7MB
$ docker run -it node-only-test:latest
Welcome to Node.js v20.17.0.
Type ".help" for more information.
> []+[]
''
Adding a handler
Now that we have a working Node.js installation, we need some code to run. For now, a simple console.log will suffice:
"use strict"
console.log("hello world")
To get our handler.js into our image, we need to create a new derivation. As we don't need to bundle or build our JavaScript app right now, we can just copy the file to the nix store as is:
{pkgs ? import <nixpkgs> {}}:
let
  fs = pkgs.lib.fileset;
in
pkgs.stdenv.mkDerivation {
  name = "example-aws-lambda-handler";

  src = fs.toSource {
    root = ./.;
    fileset = ./handler.js;
  };

  postInstall = ''
    mkdir -p $out/lib
    cp -v handler.js $out/lib
  '';
}
Next, we need to update our existing image derivation to include our handler and change the Docker command to run it:
{pkgs ? import <nixpkgs> {}}:
let
  handler = import ./handler.nix {inherit pkgs;};
in
pkgs.dockerTools.buildImage {
  name = "node-plus-handler-test";
  tag = "latest";

  copyToRoot = pkgs.buildEnv {
    name = "image-root";
    paths = [pkgs.nodejs_20 handler];
    pathsToLink = ["/bin"];
  };

  config = {
    Cmd = ["/bin/node" "${handler}/lib/handler.js"];
  };
}
Again, we can validate that our image works by loading and running it:
$ docker load -i $(nix-build)
$ docker run -it node-plus-handler-test:latest
hello world
Building the AWS Lambda RIC
Until now, our image would not work in a Lambda environment, as we need one additional component: The AWS Lambda Runtime Interface Client or RIC for short. This library manages the communication between the Lambda service and our application code that is deployed with the function. It accepts incoming events, calls the handler code, and passes the response back to Lambda.
There are a few different RICs for different languages, but we'll use the native Node.js RIC in this example. This library is available on npm but not yet part of the nixpkgs repository, so we'll need to package it ourselves. Nix already provides a builder for npm packages, called buildNpmPackage, which we can use to build the RIC package. As the RIC has a few native dependencies and requires building the aws-lambda-cpp library using node-gyp, we'll need to add a few additional nativeBuildInputs to our derivation. Not all of them are documented in the aws-lambda-cpp README, so the following derivation required a bit of trial and error to get right:
{
  buildNpmPackage,
  fetchFromGitHub,
  nodejs,
  gcc,
  libgnurl,
  autoconf271,
  automake,
  cmake,
  libtool,
  perl,
}:
buildNpmPackage rec {
  pname = "aws-lambda-nodejs-runtime-interface-client";
  version = "3.2.1";

  src = fetchFromGitHub {
    owner = "aws";
    repo = pname;
    rev = "v${version}";
    hash = "sha256-5NfhSavcrBlGZ4UYXRqPTXhB3FO0DhRq/2dg15D6tFc=";
  };

  inherit nodejs;

  npmDepsHash = "sha256-XyHystDd+oxwhuNr5jpcqeVdMoEMUiSkvNF9P0M29Hs=";

  nativeBuildInputs = [autoconf271 automake cmake libtool perl];
  buildInputs = [gcc libgnurl];

  dontUseCmakeConfigure = true;
}
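Since the derivation declares every dependency as a function argument, it should also be possible to instantiate it with callPackage, which looks the arguments up in nixpkgs automatically. A minimal, untested sketch (the file path mirrors the later examples, and any additional arguments of the full derivation are assumed to have defaults):
{pkgs ? import <nixpkgs> {}}:
# Sketch: let callPackage wire up buildNpmPackage, autoconf271, cmake, etc.
# from nixpkgs and only pin the Node.js version explicitly.
pkgs.callPackage ./aws-lambda-ric/aws-lambda-ric.nix {
  nodejs = pkgs.nodejs_20;
}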
Now that we have a working runtime interface, we need to update our handler.js to conform to the requirements of Lambda:
"use strict"
exports.handler = async (event, context) => {
await new Promise(resolve => setTimeout(resolve, 1000)) // Simulate slow function
return {
statusCode: 200,
body: JSON.stringify({
message: "Hello World!",
streamName: context.logStreamName,
now: new Date().toISOString(),
}),
}
}
All that's left to do is to put the RIC into our image and change the Docker command to run it:
{pkgs ? import (import ./pinned-nixpkgs.nix) {}}:
let
  node = pkgs.nodejs_20;

  lambdaRic = with pkgs; (import ./aws-lambda-ric/aws-lambda-ric.nix {
    nodejs = node;
    inherit buildNpmPackage fetchFromGitHub gcc libgnurl autoconf271 automake cmake libtool perl lib;
  });

  handler = import ./handler.nix {inherit pkgs;};
in
pkgs.dockerTools.buildImage {
  name = "node-with-handler-and-ric";
  tag = "latest";

  copyToRoot = pkgs.buildEnv {
    name = "image-root";
    paths = [node lambdaRic handler];
    pathsToLink = ["/bin"];
  };

  config = {
    Cmd = ["/bin/aws-lambda-ric" "${handler}/lib/handler.handler"];
  };
}
Let's validate that our image still works:
$ docker load -i $(nix-build)
$ docker run -it node-with-handler-and-ric:latest
Executing '/nix/store/p6gq8hm2gf36hsf7wyq7dhywpds08v92-example-aws-lambda-handler/lib/handler.handler' in function directory '/'
terminate called after throwing an instance of 'std::logic_error'
what(): basic_string: construction from null is not valid
Yikes! Did we take a wrong turn somewhere? No, not really. The above error message is a bit misleading. The RIC is trying to connect to the Lambda service and fails to do so (as it is not running in a Lambda environment). To test our image locally, we need the aptly named Lambda Runtime Interface Emulator, or Lambda RIE for short. Luckily, this tool is already available in nixpkgs as aws-lambda-rie.
Adding this package to the build inputs of the image and changing the Docker CMD to
Cmd = ["aws-lambda-rie" "/bin/aws-lambda-ric" "${handler}/lib/handler.handler"];
allows us to run the image locally:
$ docker run -p 8080:8080 -it node-with-handler-and-ric:latest
$ curl "http://localhost:8080/2015-03-31/functions/function/invocations" -d '{}'
{"statusCode":200,"body":"{\"message\":\"Hello World!\",\"streamName\":\"$LATEST\",\"now\":\"2024-11-30T13:00:01.128Z\"}"}%
Note that later examples will omit the RIE package and assume that the image will run in a Lambda environment.
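For reference, here is a sketch of how the image derivation from above might look with the emulator included (the image name is illustrative; aws-lambda-rie is taken straight from nixpkgs, as mentioned):
{pkgs ? import (import ./pinned-nixpkgs.nix) {}}:
let
  node = pkgs.nodejs_20;

  lambdaRic = with pkgs; (import ./aws-lambda-ric/aws-lambda-ric.nix {
    nodejs = node;
    inherit buildNpmPackage fetchFromGitHub gcc libgnurl autoconf271 automake cmake libtool perl lib;
  });

  handler = import ./handler.nix {inherit pkgs;};
in
pkgs.dockerTools.buildImage {
  name = "node-with-handler-and-ric-local";
  tag = "latest";

  copyToRoot = pkgs.buildEnv {
    name = "image-root";
    # aws-lambda-rie provides the emulator binary under /bin
    paths = [node lambdaRic handler pkgs.aws-lambda-rie];
    pathsToLink = ["/bin"];
  };

  config = {
    # The emulator wraps the RIC, which in turn calls our handler
    Cmd = ["aws-lambda-rie" "/bin/aws-lambda-ric" "${handler}/lib/handler.handler"];
  };
}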
Shrinking the image
Now that we have a working image that can run on AWS Lambda, it's time to take a
look at the size of our image. We'll be using
dive
to get a better grasp of the
contents of our image:
┃ ● Current Layer Contents ┣━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Permission UID:GID Size Filetree
dr-xr-xr-x 0:0 310 MB ├── nix
dr-xr-xr-x 0:0 310 MB │ └── store
dr-xr-xr-x 0:0 121 MB │ ├─⊕ h723hb9m43lybmvfxkk6n7j4v664qy7b-python3-3.11.9
dr-xr-xr-x 0:0 67 MB │ ├─⊕ k5inwzpp6a0295pd3nfckk9hq8wmifhz-nodejs-20.15.1
dr-xr-xr-x 0:0 40 MB │ ├─⊕ 9as4j2wqvd88n6xf7bmwsb6n4riqjpww-icu4c-73.2
dr-xr-xr-x 0:0 30 MB │ ├─⊕ c10zhkbp6jmyh0xc5kd123ga8yy2p4hk-glibc-2.39-52
dr-xr-xr-x 0:0 12 MB │ ├─⊕ dsf31aa5l9lqpyk7cgnniwrinw7yzrbb-aws-lambda-nodejs-runtime-interface-client-3.2.1
dr-xr-xr-x 0:0 9.1 MB │ ├─⊕ swcl0ynnia5c57i6qfdcrqa72j7877mg-gcc-13.2.0-lib
dr-xr-xr-x 0:0 6.5 MB │ ├─⊕ 0d6qbqbgq8vl0nb3fy6wi9gfn6j3023d-openssl-3.0.14
dr-xr-xr-x 0:0 5.6 MB │ ├─⊕ 5lgw85a88a1bkwj0v3l367afrpyd4ac0-icu4c-73.2-dev
dr-xr-xr-x 0:0 3.1 MB │ ├─⊕ add5sh635jdzwsqn297fxhd13d5whxn0-ncurses-6.4
dr-xr-xr-x 0:0 2.0 MB │ ├─⊕ zdj2as83y9mcyv8xzhwgl2g8g1rr6mgn-openssl-3.0.14-dev
dr-xr-xr-x 0:0 1.9 MB │ ├─⊕ b7ipdvixma2jn8xv50kl2i55pb7ccb7q-tzdata-2024a
dr-xr-xr-x 0:0 1.8 MB │ ├─⊕ zvwpisszhpkkk8spqyya8n3bpm7wj39p-libunistring-1.1
dr-xr-xr-x 0:0 1.6 MB │ ├─⊕ 516kai7nl5dxr792c0nzq0jp8m4zvxpi-bash-5.2p32
dr-xr-xr-x 0:0 1.5 MB │ ├─⊕ mq90h9rv4yf8cz422ayy56fr4w8mi8fx-sqlite-3.45.3
dr-xr-xr-x 0:0 1.4 MB │ ├─⊕ mr63za5vkxj0yip6wj3j9lya2frdm3zc-coreutils-9.5
dr-xr-xr-x 0:0 1.0 MB │ ├─⊕ axajb2j3y3flz5knwybwa87c8l1zmcm9-openssl-3.0.14-bin
dr-xr-xr-x 0:0 811 kB │ ├─⊕ 6g1aqhzy0h5bzj82warn9vky175cyhag-gdbm-1.23
dr-xr-xr-x 0:0 809 kB │ ├─⊕ g78jna1i5qhh8gqs4mr64648f0szqgw4-xz-5.4.7
dr-xr-xr-x 0:0 757 kB │ ├─⊕ 9w009i6vvmjqcahi779yfx7wmd11m9hj-gmp-with-cxx-6.3.0
dr-xr-xr-x 0:0 469 kB │ ├─⊕ wz24hvvjsgfvz1hp06awzdvsvq4mc6x2-readline-8.2p10
dr-xr-xr-x 0:0 346 kB │ ├─⊕ 9jivp79yv91fl1i6ayq2107a78q7k43i-libidn2-2.3.7
dr-xr-xr-x 0:0 270 kB │ ├─⊕ mgmr6imyh0smpqhccb31dlk3hmnacjpc-expat-2.6.3
dr-xr-xr-x 0:0 248 kB │ ├─⊕ arhy8i96l81wz3zrldiwcmiax2gc2w7s-libuv-1.48.0
dr-xr-xr-x 0:0 216 kB │ ├─⊕ nfi1g1zr5di6xkqjwxwk5kdingf17h3g-mpdecimal-4.0.0
dr-xr-xr-x 0:0 159 kB │ ├─⊕ 2y852kcvb7shrj8f3z8j22pa0iybcbgj-xgcc-13.2.0-libgcc
dr-xr-xr-x 0:0 159 kB │ ├─⊕ yfd49ay99aa1a0jg80jsvnxbyl61fsh6-gcc-13.2.0-libgcc
dr-xr-xr-x 0:0 129 kB │ ├─⊕ c4v5jw6cp68m9a7akr157lyxiwa3byjf-libxcrypt-4.4.36
dr-xr-xr-x 0:0 127 kB │ ├─⊕ pkl664rrz6vb95piixzfm7qy1yc2xzgc-zlib-1.3.1
dr-xr-xr-x 0:0 122 kB │ ├─⊕ 7px4n99mcmdzx8nygx59f28j8g7vj0kb-acl-2.3.2
dr-xr-xr-x 0:0 114 kB │ ├─⊕ 77i5izsm6i7fyzdb5h8w28rcpawwqj6q-zlib-1.3.1-dev
dr-xr-xr-x 0:0 110 kB │ ├─⊕ c3xdbjc25q27x68swhkx1mfyw6vf5pc8-mailcap-2.1.53
dr-xr-xr-x 0:0 107 kB │ ├─⊕ qdr5aqn6v6035mpqxxzxk12i82j7g402-libuv-1.48.0-dev
dr-xr-xr-x 0:0 85 kB │ ├─⊕ dwkhspb2qz0pbkkxlr6ajgqi388phhwa-attr-2.5.2
dr-xr-xr-x 0:0 79 kB │ ├─⊕ zindpxb2vylx0nsh2jjyg2fhaiakf8d9-bzip2-1.0.8
dr-xr-xr-x 0:0 72 kB │ ├─⊕ 6rwkpxli14j08klbsiwn23jmp9r46dmi-libffi-3.4.6
dr-xr-xr-x 0:0 365 B │ └─⊕ p6gq8hm2gf36hsf7wyq7dhywpds08v92-example-aws-lambda-handler
Total Image size: 310 MB
310 MB is quite a lot for a Lambda function and barely less than the public.ecr.aws/lambda/nodejs:20.2024.11.06.17 base image we set out to beat, which weighs in at Total Image size: 378 MB.
Where did we go wrong? It's time to take a closer look at how the packages that we need are structured right now and what their dependencies are. To do this, we can use the nix-store command to inspect the dependencies of a derivation. To get a better understanding of how this tree of dependencies looks, we can use the --graph flag to generate a dot file which we can render as an SVG:
nix-store -q --graph "$(nix-build node-with-handler-and-ric-container.nix)"| dot -Tsvg
If you are missing the dot executable in your path, you can use nix-shell -p graphviz to get a temporary shell containing the graphviz tooling. The output of this command can be piped to a file and opened in any image viewer or browser:
Note: If you want to reproduce this graph, you'll need to inspect the output of the buildEnv builder. Inspecting the output of buildImage will just show a single node for the Docker image, as all dependencies are flattened into a self-contained single layer without any reference to the nix store on the build system.
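A small helper expression along these lines can serve as the build target for the graph — a sketch (file name hypothetical) that only builds the buildEnv which later gets copied into the image:
# image-env.nix (hypothetical): build only the environment root so that
# nix-store -q --graph can show its dependencies on the build system
{pkgs ? import (import ./pinned-nixpkgs.nix) {}}:
let
  node = pkgs.nodejs_20;

  lambdaRic = with pkgs; (import ./aws-lambda-ric/aws-lambda-ric.nix {
    nodejs = node;
    inherit buildNpmPackage fetchFromGitHub gcc libgnurl autoconf271 automake cmake libtool perl lib;
  });

  handler = import ./handler.nix {inherit pkgs;};
in
pkgs.buildEnv {
  name = "image-root";
  paths = [node lambdaRic handler];
  pathsToLink = ["/bin"];
}
The path to this file can then be passed to nix-build inside the nix-store command shown above.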
Eureka! It looks like our aws-lambda-ric derivation is pulling in Python 3.11 as a dependency. This package alone is about 120MB in size, as we can deduce from the dive output. But why?
The npm aws-lambda-ric package is pure JavaScript, but has two dependencies: node-gyp and node-addon-api. Both are only used to build the native RIC bindings written in C++, but are still marked as runtime dependencies in the package.json file. To determine the runtime dependencies that need to be copied to the final derivation, buildNpmPackage runs npm install --omit=dev and copies the resulting node_modules directory to the final derivation. As node-gyp contains a few Python scripts, these are copied to the final derivation as well, which in turn pulls in Python as a dependency.
As we know that these dependencies are not needed during runtime, we can solve this problem easily by removing the whole node_modules directory from the final derivation. This can be done by adding a postInstall hook:
# parameters omitted
buildNpmPackage rec {
  # ...rest of the derivation

  postInstall = lib.optionalString prunePython ''
    # All dependencies of the ric are not actual
    # runtime deps, but are build deps. We can remove
    # them to reduce the size of the final image.
    rm -rf $nodeModulesPath
  '';
}
But we are still not done here! While working through the node-gyp issue, I noticed that the source code of the aws-lambda-cpp library is bundled with the npm package. There are also a few zip files containing the source code of its dependencies in a directory called deps. While we are here, we can remove those as well. This leaves us with this final postInstall hook:
# parameters omitted
buildNpmPackage rec {
  # ...rest of the derivation

  postInstall = lib.optionalString prunePython ''
    # All dependencies of the ric are not actual
    # runtime deps, but are build deps. We can remove
    # them to reduce the size of the final image.
    rm -rf $nodeModulesPath

    # remove build leftovers
    find $out/lib/node_modules/aws-lambda-ric/deps -maxdepth 1 \
      -not \( -path $out/lib/node_modules/aws-lambda-ric/deps/artifacts -prune \) \
      -not \( -name deps \) \
      -exec rm -rf {} \;
  '';
}
Taking a look at our image right now, we can see that we made a lot of progress, as it now stands at Total Image size: 168 MB. But there is still more we can do!
Currently, we are using nodejs_20 from nixpkgs, which contains a full Node.js installation including tools like npm and npx, which we don't need if we just want to run a prebundled JavaScript app. Luckily, nixpkgs also contains a slim package for every Node.js version. In our case this is nodejs-slim_20.
If our handler was the only JS code being added to the image, this package would be a drop-in replacement for nodejs_20, but we also need to build the aws-lambda-ric package. To do so, we need the full Node.js installation. During the build, the package generates a JS wrapper executable. buildNpmPackage patches this executable to point to the Node.js installation used during the build. This in turn pulls the full Node.js installation into the final derivation:
#! /nix/store/syl4snn859kpqvn9qh91kr7n9i4dws04-bash-5.2p32/bin/bash -e
exec "/nix/store/if6aqyl3sl0hz14a12mndj35swb1mcwi-nodejs-20.17.0/bin/node" /nix/store/fv8d452b9hr0pf0w9bdc3ja7n0fb653r-aws-lambda-nodejs-runtime-interface-client-3.2.1/lib/node_modules/aws-lambda-ric/bin/index.mjs "$@"
To fix this, we could either patch the path to the build-time Node.js in the aws-lambda-ric executable after it is built, or we could get rid of the executable altogether by updating our Docker CMD line to include the node invocation directly.
I've chosen to do the former by wrapping the aws-lambda-ric derivation into a new derivation that fixes the shebang line by adding a postFixup hook:
{
  pkgs,
  minifiedNode,
  buildNode,
}:
with pkgs; let
  lambdaRic =
    (import ./aws-lambda-ric/aws-lambda-ric.nix {
      nodejs = buildNode;
      inherit buildNpmPackage fetchFromGitHub gcc libgnurl autoconf271 automake cmake libtool perl lib;
    })
    .overrideAttrs (oldAttrs: {
      postFixup = ''
        escapedNewInterpreter=$(printf '%s\n' "${minifiedNode}" | sed -e 's/[\/&]/\\&/g')
        escapedOldInterpreter=$(printf '%s\n' "${buildNode}" | sed -e 's/[]\/$*.^[]/\\&/g')
        find $out -type f -print0 | xargs -0 sed -i "s/$escapedOldInterpreter/$escapedNewInterpreter/g"
      '';
    });
in
  lambdaRic
Using nodejs-slim pushes our image down to Total Image size: 158 MB according to dive. This is as far down as we can get without removing any of the functionality that Node.js offers.
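For completeness, here is a sketch of how this intermediate 158 MB image could be assembled (the image name is illustrative): the slim Node.js ends up in the image, while the full Node.js is only used to build the RIC via the wrapper derivation shown above (referred to as lambda-ric-with-interpreter.nix later on):
{pkgs ? import (import ./pinned-nixpkgs.nix) {}}:
let
  # slim node is shipped in the image; the full node is only needed to build the RIC
  runtimeNode = pkgs.nodejs-slim_20;

  lambdaRic = import ./lambda-ric-with-interpreter.nix {
    inherit pkgs;
    minifiedNode = runtimeNode;
    buildNode = pkgs.nodejs_20;
  };

  handler = import ./handler.nix {inherit pkgs;};
in
pkgs.dockerTools.buildImage {
  name = "node-slim-with-handler-and-ric";
  tag = "latest";

  copyToRoot = pkgs.buildEnv {
    name = "image-root";
    paths = [runtimeNode lambdaRic handler];
    pathsToLink = ["/bin"];
  };

  config = {
    Cmd = ["/bin/aws-lambda-ric" "${handler}/lib/handler.handler"];
  };
}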
Removing intl capabilities
Usually, when I ship some kind of backend application, I don't need a lot of
I18n support, as the frontend of my application takes care of that. Using this
precondition, we can safely remove all I18n capabilities from the node version
that we want to bundle. This allows us to get rid of the
intl
support from node,
which removes the icu4c
dependency which comes in at 40MB. We can also remove
most of the glibc locales which are another 16MB:
...
dr-xr-xr-x 0:0 40 MB │ ├─⊕ 9as4j2wqvd88n6xf7bmwsb6n4riqjpww-icu4c-73.2
...
dr-xr-xr-x 0:0 30 MB ├── c10zhkbp6jmyh0xc5kd123ga8yy2p4hk-glibc-2.39-5
dr-xr-xr-x 0:0 16 MB │ └── share
dr-xr-xr-x 0:0 16 MB │ ├── i18n
...
Beyond the 40MB of icu4c itself, it also pulls in a few additional dependencies.
To remove intl support, we need to compile Node.js with the --without-intl flag. The nixpkgs builder for Node.js adds the --with-intl=system-icu flag by default, so we need to remove this flag and add the --without-intl flag instead. We'll do this by creating a new derivation that takes any Node.js package and recompiles it without intl by changing the build flags. While we are here, we can also remove some additional stuff that we don't need during runtime, like header files and man pages:
{
  lib,
  nodeToMinify,
}:
let
  minifiedNode = nodeToMinify.overrideAttrs (oldAttrs: {
    postInstall =
      oldAttrs.postInstall
      + ''
        rm -r $out/share
        rm -r $out/lib
        rm -r $out/include
      '';

    configureFlags =
      (lib.remove "--with-intl=system-icu" (lib.flatten oldAttrs.configureFlags))
      ++ ["--without-intl"];
  });
in
  minifiedNode
For glibc we'll create a similar derivation that removes the locales and charmaps that we don't need:
{
  glibc,
  lib,
  localesToKeep ? [],
  charmapsToKeep ? [],
}:
let
  glibCLocales = localesToKeep ++ ["C"];
  glibCCharmaps = charmapsToKeep ++ ["UTF-8"];

  minGlibC = glibc.overrideAttrs (oldAttrs: {
    postInstall =
      oldAttrs.postInstall
      + ''
        find $out/share/i18n/locales -type f ${lib.concatMapStringsSep " " (l: "-not -name '${l}'") glibCLocales} -exec rm {} \;
        find $out/share/i18n/charmaps -type f ${lib.concatMapStringsSep " " (l: "-not -name '${l}.gz'") glibCCharmaps} -exec rm {} \;
      '';
  });
in
  minGlibC
Now we just have to make sure that every package that gets put into the final Docker image uses this minified glibc to prevent the unmodified glibc from being pulled in as a dependency. The simplest way to do this is to define an overlay over the used nixpkgs that replaces the glibc package with the minified version:
{pkgs ? import (import ./pinned-nixpkgs.nix) {}}:
let
  glibcNoLocales = import ./glibc-no-locales.nix {
    glibc = pkgs.glibc;
    lib = pkgs.lib;
  };

  moddedPkgs = pkgs.extend (self: super: {glibc = glibcNoLocales;});

  minifiedNode = import ./minified-node.nix {
    lib = pkgs.lib;
    nodeToMinify = moddedPkgs.nodejs-slim_20;
  };

  lambdaRic = import ./lambda-ric-with-interpreter.nix {
    buildNode = moddedPkgs.nodejs_20;
    pkgs = moddedPkgs;
    inherit minifiedNode;
  };

  handler = import ./handler.nix {pkgs = moddedPkgs;};
in
pkgs.dockerTools.buildImage {
  name = "nodetest";
  tag = "latest";

  copyToRoot = pkgs.buildEnv {
    name = "image-root";
    paths = [lambdaRic handler];
    pathsToLink = ["/bin"];
  };

  config = {
    Cmd = ["/bin/aws-lambda-ric" "${handler}/lib/handler.handler"];
  };
}
This final image now only needs Total Image size: 93 MB (or 31MB compressed), which is almost a 75 percent reduction in size compared to the AWS base image!
Tests
To validate whether my initial assumption is correct and a smaller image indeed equals faster init times, I've set up the following test. I've created three Lambda-compatible Docker images that contain the same function code:
- The first one is built using the AWS base image public.ecr.aws/lambda/nodejs:20.2024.11.06.17:
FROM public.ecr.aws/lambda/nodejs:20.2024.11.06.17
COPY handler.js ${LAMBDA_TASK_ROOT}
CMD [ "handler.handler" ]
- The second is built using the nix derivation described above, but with Intl enabled
- The third image is built using the nix derivation described above, but with Intl disabled and glibc locales removed.
The handler used in all three images is the same as described above. To measure the init times of the Lambda functions, the following test is performed for each image:
- The image is pushed to ECR
- A new Lambda function using the image is created
- The function is invoked once using the AWS CLI. The init times as reported by AWS are logged.
- The function is invoked in parallel up to the parallelization limit of the account. The init times as reported by AWS are logged.
- A new version of the Lambda function is created using the CLI. This forces a new cold start upon the next invocation.
- Steps 3-5 are repeated until the Lambda has been invoked a total of 10000 times.
Step 3 is necessary because the init times of the very first Lambda call after a new deployment can vary wildly from the init times of subsequent calls (I've seen times of up to five seconds, independent of the image). As I don't want to optimize the very first cold start after an update, but rather scale-out events, I will ignore these init times.
Results
A total of 10000 Lambda invocations were performed for each docker image using the test setup described above. Of these, 9000 were cold starts with an init time being reported by Lambda. The following chart shows the distribution of these init times, binned to buckets of 10ms size:
The numbers you see in the chart above don't quite add up to 9000 measurements per image: depending on the behaviour of the Lambda scheduler and the timing of the invocations, some requests were not routed to new workers but instead processed by existing ones.
By computing a few performance indicators, we can quantify the performance gains of our nix-based image even better:
| | AWS Base Image | With Intl | Δ | Without Intl | Δ |
|---|---|---|---|---|---|
| Mean | 246 ms | 204 ms | (-17.07%) | 196 ms | (-20.32%) |
| Median | 236 ms | 191 ms | (-19.06%) | 180 ms | (-23.72%) |
| Standard Deviation | 41 ms | 51 ms | (+24.39%) | 52 ms | (+26.82%) |
The mean and median init times of our nix-based image are significantly lower than those of the AWS base image. Curiously, the standard deviation of the init times is worse than that of the AWS base image. This phenomenon might just be caused by blocking communication with the overarching service or by differences in extension init times, but I am not sure at the moment.
Outlook
In this article, I've used nix to drastically reduce the image size of a Docker image that can be deployed to AWS Lambda, which cuts down the init times of this function.
The code that I am executing in the Lambda function is (intentionally) very simple to keep the focus on the impact of the execution environment. The next step would be to test this image with a real-world application to check whether the performance and size improvements can be translated to a full, usable application.
As I am focusing on Node.js only, it would be interesting to look at other (interpreted) languages like Python or Ruby to see if similar improvements can be made.
If one wanted to reduce the size of the existing derivation even further, there are a few avenues that could be explored. For example, one could try to replace all usages of glibc with the much smaller musl. This would probably require a lot of work, as musl is not a drop-in replacement for glibc, but one could start by looking at how the needed applications are built in Alpine Linux, which uses musl as its libc implementation.
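One possible, untested starting point would be nixpkgs' musl-based package set pkgsMusl — whether Node.js and the RIC's native bindings compile there without additional patches is an open question:
{pkgs ? import (import ./pinned-nixpkgs.nix) {}}:
# Untested sketch: the same slim Node.js, but taken from the musl-based
# package set. Expect build failures that may need to be patched by hand.
pkgs.pkgsMusl.nodejs-slim_20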
Source code
You can find the nix derivation to build the images that I've tested, as well as all intermediate images that I've described throughout this article, on GitHub.
About the header image
Created with OpenAI's GPT-4o on 2024-11-13. Prompt:
A comic-style illustration of a large, majestic whale flying through the sky, breaking through fluffy clouds as it ascends toward a bright, radiant sun. The whale is depicted in mid-air, surrounded by scattered clouds that part as it moves, leaving a dynamic trail. The sunlight shines through, casting a warm golden glow over the scene. The background sky transitions from deep blue near the clouds to bright yellow-orange near the sun. High-energy motion lines and bold colors emphasize the whale's powerful movement, capturing a sense of wonder and freedom.