
Improving Median Lambda Init Times By 23%


Sebastian Staffa

Published on Dec 9, 2024, 12:56 PM

An illustration of a whale breaking through clouds, generated by GPT-4o.

One of the most powerful features of the nix package manager is the ability to modify the build of each and every piece of software with ease - you don't need to, but you can. This is especially useful whenever we want to customize a tool or library for our use case or to optimize it for a specific environment.

In this post, we will leverage the capabilities of nix to build a minimalistic Docker image for a Node.js Lambda application, speed up the median cold start of the Lambda function by 23%, and thus save money in the process.

This is a long blog post; click here if you are just interested in the results.

The idea

I initially got the idea for this blog post when bundling an application for Lambda for a customer of mine. To debug some permission issues I deployed an empty lambda function using a custom Docker image and noticed that the cold start times were consistently in the mid-200ms range, which is quite a bit higher than I would expect for a lambda function that does nothing.

Upon further investigation I noticed that the Node.js AWS Lambda base image that I was using to build my Docker image containing the function code was already scratching the 380MB mark uncompressed (115MB compressed). As a cloud is nothing more than a bunch of computers, I figured that the size of the image might have an impact on the cold start, as the image needs to be downloaded and unpacked before the function can be executed.

To test my hypothesis, I set out to build the smallest image possible that still runs a function. In order to achieve that, I would need maximum control over everything that goes into the image - and what better tool to do that than nix?

About Lambda init times

For those who don't know: AWS Lambda is a serverless "function as a service" which, in simple terms, allows you to upload your code to the service, which then handles its execution for you without the need to manage actual VMs or machines.

Whenever a new request is sent to a Lambda function, the service first checks if there are any idle workers available to process it. If not, a new worker is started. The time it takes for this worker to be ready to process the request is called init time and actually consists of multiple init phases: extension, runtime and function init.

After a worker is done processing a request, it does not shut down instantly, but is kept in a warm state, ready to accept the next request. In this case, no additional init phase is needed. If the worker does not receive a new request within a certain timeframe it is shut down completely.

In practice, you notice these init times most often whenever there is a spike in requests and a lot of additional workers need to be started. If you are using lambda to serve HTTP requests, e.g. through an API Gateway, you may notice your site becoming sluggish.

Building a basic node image using dockerTools

The nix build system provides its own way to build a Docker image from any given nix expression with its dockerTools suite of tools. We'll be using the vanilla dockerTools.buildImage function to build our minimal Docker image.

If we want to run Node.js on a Lambda, we'll at least need the Node.js binary. The derivation to create a Docker image that only contains Node.js 20 would look like this:

{pkgs ? import <nixpkgs> {}}:
pkgs.dockerTools.buildImage {
  name = "node-only-test";
  tag = "latest";

  copyToRoot = pkgs.buildEnv {
    name = "image-root";
    paths = [pkgs.nodejs_20];
    pathsToLink = ["/bin"];
  };
  config = {
    Cmd = ["/bin/node"];
  };
}

We can assert that this derivation works as we would expect by running:

$ docker load -i $(nix-build)
d03ef6a91078: Loading layer [=====================>]  194.7MB/194.7MB
$ docker run -it node-only-test:latest
Welcome to Node.js v20.17.0.
Type ".help" for more information.
> []+[]
''

Adding a handler

Now that we have a working Node.js installation, we need some code to run. For now, a simple console.log will suffice:

"use strict"
console.log("hello world")

To get our handler.js into our image, we need to create a new derivation. As we don't need to bundle/build our JavaScript app right now, we can just copy the file to the nix store as is:

{pkgs ? import <nixpkgs> {}}:
let
  fs = pkgs.lib.fileset;
in
  pkgs.stdenv.mkDerivation {
    name = "example-aws-lambda-handler";
    src = fs.toSource {
      root = ./.;
      fileset = ./handler.js;
    };
    postInstall = ''
      mkdir -p $out/lib
      cp -v handler.js $out/lib
    '';
  }

Next, we need to update our existing image derivation to include our handler and change the Docker command to run it:

{pkgs ? import <nixpkgs> {}}:
let
  handler = import ./handler.nix {inherit pkgs;};
in
  pkgs.dockerTools.buildImage {
    name = "node-plus-handler-test";
    tag = "latest";

    copyToRoot = pkgs.buildEnv {
      name = "image-root";
      paths = [pkgs.nodejs_20 handler];
      pathsToLink = ["/bin"];
    };
    config = {
      Cmd = ["/bin/node" "${handler}/lib/handler.js"];
    };
  }

Again, we can validate that our image works by loading and running it:

$ docker load -i $(nix-build)
$ docker run -it node-plus-handler-test:latest
hello world

Building the AWS Lambda RIC

As it stands, our image would not work in a Lambda environment, as it is missing one additional component: the AWS Lambda Runtime Interface Client, or RIC for short. This library manages the communication between the Lambda service and the application code that is deployed with the function. It accepts incoming events, calls the handler code, and passes the response back to Lambda.

There are a few different RICs for different languages, but we'll use the native Node.js RIC in this example. This library is available on npm and is not yet part of the nixpkgs repository, so we'll need to package it ourselves. nixpkgs already provides a builder for npm packages, called buildNpmPackage, which we can use to build the RIC. As the RIC has a few native dependencies and requires building the aws-lambda-cpp library using node-gyp, we'll need to add a few additional nativeBuildInputs to our derivation. Not all of them are documented in the aws-lambda-cpp README, so the following derivation required a bit of trial and error to get right:

{
  buildNpmPackage,
  fetchFromGitHub,
  nodejs,
  gcc,
  libgnurl,
  autoconf271,
  automake,
  cmake,
  libtool,
  perl,
}:
buildNpmPackage rec {
  pname = "aws-lambda-nodejs-runtime-interface-client";
  version = "3.2.1";

  src = fetchFromGitHub {
    owner = "aws";
    repo = pname;
    rev = "v${version}";
    hash = "sha256-5NfhSavcrBlGZ4UYXRqPTXhB3FO0DhRq/2dg15D6tFc=";
  };

  inherit nodejs;
  npmDepsHash = "sha256-XyHystDd+oxwhuNr5jpcqeVdMoEMUiSkvNF9P0M29Hs=";
  nativeBuildInputs = [autoconf271 automake cmake libtool perl];
  buildInputs = [gcc libgnurl];

  dontUseCmakeConfigure = true;
}
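A quick note on npmDepsHash: when packaging a new version of the RIC, this hash is not known up front. One common workflow is to build once with a placeholder and copy the correct value from the resulting hash-mismatch error (lib is not among the arguments above and would need to be passed in):

# sketch: determining npmDepsHash for a new version
npmDepsHash = lib.fakeHash; # 1. build once with a placeholder
# 2. replace it with the "got: sha256-..." value from the error message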

Now that we have a working runtime interface, we need to update our handler.js to conform to the requirements of Lambda:

"use strict"

exports.handler = async (event, context) => {
  await new Promise(resolve => setTimeout(resolve, 1000)) // Simulate slow function
  return {
    statusCode: 200,
    body: JSON.stringify({
      message: "Hello World!",
      streamName: context.logStreamName,
      now: new Date().toISOString(),
    }),
  }
}

All that's left to do is to put the RIC into our image and change the Docker command to run it:

{pkgs ? import (import ./pinned-nixpkgs.nix) {}}:
let
  node = pkgs.nodejs_20;
  lambdaRic = with pkgs; (import ./aws-lambda-ric/aws-lambda-ric.nix {
    nodejs = node;
    inherit buildNpmPackage fetchFromGitHub gcc libgnurl autoconf271 automake cmake libtool perl lib;
  });
  handler = import ./handler.nix {inherit pkgs;};
in
  pkgs.dockerTools.buildImage {
    name = "node-with-handler-and-ric";
    tag = "latest";

    copyToRoot = pkgs.buildEnv {
      name = "image-root";
      paths = [node lambdaRic handler];
      pathsToLink = ["/bin"];
    };

    config = {
      Cmd = ["/bin/aws-lambda-ric" "${handler}/lib/handler.handler"];
    };
  }

Let's validate that our image still works:

$ docker load -i $(nix-build)
$ docker run -it node-with-handler-and-ric:latest
Executing '/nix/store/p6gq8hm2gf36hsf7wyq7dhywpds08v92-example-aws-lambda-handler/lib/handler.handler' in function directory '/'
terminate called after throwing an instance of 'std::logic_error'
  what():  basic_string: construction from null is not valid

Yikes! Did we take a wrong turn somewhere? No, not really. The above error message is a bit misleading: the RIC is trying to connect to the Lambda service and fails to do so (as it is not running in a Lambda environment). To test our image locally, we need the aptly named Lambda Runtime Interface Emulator, or Lambda RIE for short. Luckily, this tool is already available in nixpkgs as aws-lambda-rie.

Adding this package to the build inputs of the image and changing the Docker command to Cmd = ["aws-lambda-rie" "/bin/aws-lambda-ric" "${handler}/lib/handler.handler"]; allows us to run the image locally.
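For completeness, here is a sketch of the changed parts of the image derivation (assuming the aws-lambda-rie attribute from nixpkgs mentioned above):

# local-testing variant: add the emulator and let it wrap the RIC
copyToRoot = pkgs.buildEnv {
  name = "image-root";
  paths = [node lambdaRic handler pkgs.aws-lambda-rie];
  pathsToLink = ["/bin"];
};
config = {
  Cmd = ["aws-lambda-rie" "/bin/aws-lambda-ric" "${handler}/lib/handler.handler"];
};

With this in place, we can start the container and invoke the function: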

$ docker run -p 8080:8080 -it node-with-handler-and-ric:latest
$ curl "http://localhost:8080/2015-03-31/functions/function/invocations" -d '{}'
{"statusCode":200,"body":"{\"message\":\"Hello World!\",\"streamName\":\"$LATEST\",\"now\":\"2024-11-30T13:00:01.128Z\"}"}%

Note that later examples will omit the RIE package and assume that the image will run in a Lambda environment.

Shrinking the image

Now that we have a working image that can run on AWS Lambda, it's time to take a look at the size of our image. We'll be using dive to get a better grasp of the contents of our image:

┃ ● Current Layer Contents ┣━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Permission     UID:GID       Size  Filetree
dr-xr-xr-x         0:0     310 MB  ├── nix
dr-xr-xr-x         0:0     310 MB  │   └── store
dr-xr-xr-x         0:0     121 MB  │       ├─⊕ h723hb9m43lybmvfxkk6n7j4v664qy7b-python3-3.11.9
dr-xr-xr-x         0:0      67 MB  │       ├─⊕ k5inwzpp6a0295pd3nfckk9hq8wmifhz-nodejs-20.15.1
dr-xr-xr-x         0:0      40 MB  │       ├─⊕ 9as4j2wqvd88n6xf7bmwsb6n4riqjpww-icu4c-73.2
dr-xr-xr-x         0:0      30 MB  │       ├─⊕ c10zhkbp6jmyh0xc5kd123ga8yy2p4hk-glibc-2.39-52
dr-xr-xr-x         0:0      12 MB  │       ├─⊕ dsf31aa5l9lqpyk7cgnniwrinw7yzrbb-aws-lambda-nodejs-runtime-interface-client-3.2.1
dr-xr-xr-x         0:0     9.1 MB  │       ├─⊕ swcl0ynnia5c57i6qfdcrqa72j7877mg-gcc-13.2.0-lib
dr-xr-xr-x         0:0     6.5 MB  │       ├─⊕ 0d6qbqbgq8vl0nb3fy6wi9gfn6j3023d-openssl-3.0.14
dr-xr-xr-x         0:0     5.6 MB  │       ├─⊕ 5lgw85a88a1bkwj0v3l367afrpyd4ac0-icu4c-73.2-dev
dr-xr-xr-x         0:0     3.1 MB  │       ├─⊕ add5sh635jdzwsqn297fxhd13d5whxn0-ncurses-6.4
dr-xr-xr-x         0:0     2.0 MB  │       ├─⊕ zdj2as83y9mcyv8xzhwgl2g8g1rr6mgn-openssl-3.0.14-dev
dr-xr-xr-x         0:0     1.9 MB  │       ├─⊕ b7ipdvixma2jn8xv50kl2i55pb7ccb7q-tzdata-2024a
dr-xr-xr-x         0:0     1.8 MB  │       ├─⊕ zvwpisszhpkkk8spqyya8n3bpm7wj39p-libunistring-1.1
dr-xr-xr-x         0:0     1.6 MB  │       ├─⊕ 516kai7nl5dxr792c0nzq0jp8m4zvxpi-bash-5.2p32
dr-xr-xr-x         0:0     1.5 MB  │       ├─⊕ mq90h9rv4yf8cz422ayy56fr4w8mi8fx-sqlite-3.45.3
dr-xr-xr-x         0:0     1.4 MB  │       ├─⊕ mr63za5vkxj0yip6wj3j9lya2frdm3zc-coreutils-9.5
dr-xr-xr-x         0:0     1.0 MB  │       ├─⊕ axajb2j3y3flz5knwybwa87c8l1zmcm9-openssl-3.0.14-bin
dr-xr-xr-x         0:0     811 kB  │       ├─⊕ 6g1aqhzy0h5bzj82warn9vky175cyhag-gdbm-1.23
dr-xr-xr-x         0:0     809 kB  │       ├─⊕ g78jna1i5qhh8gqs4mr64648f0szqgw4-xz-5.4.7
dr-xr-xr-x         0:0     757 kB  │       ├─⊕ 9w009i6vvmjqcahi779yfx7wmd11m9hj-gmp-with-cxx-6.3.0
dr-xr-xr-x         0:0     469 kB  │       ├─⊕ wz24hvvjsgfvz1hp06awzdvsvq4mc6x2-readline-8.2p10
dr-xr-xr-x         0:0     346 kB  │       ├─⊕ 9jivp79yv91fl1i6ayq2107a78q7k43i-libidn2-2.3.7
dr-xr-xr-x         0:0     270 kB  │       ├─⊕ mgmr6imyh0smpqhccb31dlk3hmnacjpc-expat-2.6.3
dr-xr-xr-x         0:0     248 kB  │       ├─⊕ arhy8i96l81wz3zrldiwcmiax2gc2w7s-libuv-1.48.0
dr-xr-xr-x         0:0     216 kB  │       ├─⊕ nfi1g1zr5di6xkqjwxwk5kdingf17h3g-mpdecimal-4.0.0
dr-xr-xr-x         0:0     159 kB  │       ├─⊕ 2y852kcvb7shrj8f3z8j22pa0iybcbgj-xgcc-13.2.0-libgcc
dr-xr-xr-x         0:0     159 kB  │       ├─⊕ yfd49ay99aa1a0jg80jsvnxbyl61fsh6-gcc-13.2.0-libgcc
dr-xr-xr-x         0:0     129 kB  │       ├─⊕ c4v5jw6cp68m9a7akr157lyxiwa3byjf-libxcrypt-4.4.36
dr-xr-xr-x         0:0     127 kB  │       ├─⊕ pkl664rrz6vb95piixzfm7qy1yc2xzgc-zlib-1.3.1
dr-xr-xr-x         0:0     122 kB  │       ├─⊕ 7px4n99mcmdzx8nygx59f28j8g7vj0kb-acl-2.3.2
dr-xr-xr-x         0:0     114 kB  │       ├─⊕ 77i5izsm6i7fyzdb5h8w28rcpawwqj6q-zlib-1.3.1-dev
dr-xr-xr-x         0:0     110 kB  │       ├─⊕ c3xdbjc25q27x68swhkx1mfyw6vf5pc8-mailcap-2.1.53
dr-xr-xr-x         0:0     107 kB  │       ├─⊕ qdr5aqn6v6035mpqxxzxk12i82j7g402-libuv-1.48.0-dev
dr-xr-xr-x         0:0      85 kB  │       ├─⊕ dwkhspb2qz0pbkkxlr6ajgqi388phhwa-attr-2.5.2
dr-xr-xr-x         0:0      79 kB  │       ├─⊕ zindpxb2vylx0nsh2jjyg2fhaiakf8d9-bzip2-1.0.8
dr-xr-xr-x         0:0      72 kB  │       ├─⊕ 6rwkpxli14j08klbsiwn23jmp9r46dmi-libffi-3.4.6
dr-xr-xr-x         0:0      365 B  │       └─⊕ p6gq8hm2gf36hsf7wyq7dhywpds08v92-example-aws-lambda-handler

A total image size of 310 MB is quite a lot for a Lambda function and barely less than the public.ecr.aws/lambda/nodejs:20.2024.11.06.17 base image we set out to beat, which weighs in at 378 MB.

Where did we go wrong? It's time to take a closer look at how the packages that we need are structured right now and what their dependencies are. To do this, we can use the nix-store command to inspect the dependencies of a derivation. To get a better understanding of what this dependency tree looks like, we can use the --graph flag to generate a dot file, which we can render as an SVG:

nix-store -q --graph "$(nix-build node-with-handler-and-ric-container.nix)"| dot -Tsvg

If you are missing the dot executable in your path, you can use nix-shell -p graphviz to get a temporary shell containing the graphviz tooling. The output of this command can be piped to a file and opened in any image viewer or browser:

tree of all node20 and lambda ric dependencies

Note: If you want to reproduce this graph, you'll need to inspect the output of the buildEnv builder. Inspecting the output of buildImage will just show a single node for the docker image, as all dependencies are flattened into a self-contained single layer without any reference to the nix store on the build system.
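If you want to reproduce this, a minimal file that evaluates only the environment could look like the following sketch (the file name and exact contents are illustrative; for the graph above you would also add the RIC derivation to paths):

# image-root-only.nix (illustrative): build just the buildEnv so that
# nix-store -q --graph shows the full runtime closure
{pkgs ? import <nixpkgs> {}}:
let
  handler = import ./handler.nix {inherit pkgs;};
in
  pkgs.buildEnv {
    name = "image-root";
    paths = [pkgs.nodejs_20 handler];
    pathsToLink = ["/bin"];
  }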

Eureka! It looks like our aws-lambda-ric derivation is pulling in Python 3.11 as a dependency. This package alone is about 120MB in size, as we can deduce from the dive output. But why?

The npm aws-lambda-ric package is pure JavaScript, but has two dependencies: node-gyp and node-addon-api. Both are only used to build the native RIC bindings written in C++, but are still marked as runtime dependencies in the package.json file. To determine the runtime dependencies that need to be copied to the final derivation, buildNpmPackage runs npm install --omit=dev and copies the resulting node_modules directory to the output. As node-gyp contains a few Python scripts, these are copied to the final derivation as well, which in turn pulls in Python as a dependency.

As we know that these dependencies are not needed at runtime, we can solve this problem easily by removing the whole node_modules directory from the final derivation. This can be done by adding a postInstall hook:

# parameters omitted
buildNpmPackage rec {
  # ...rest of the derivation

  postInstall = lib.optionalString prunePython ''
    # All dependencies of the ric are not actual
    # runtime deps, but are build deps. We can remove
    # them to reduce the size of the final image.
    rm -rf $nodeModulesPath
  '';
}
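The hook above references a prunePython flag and lib, which are not among the arguments shown earlier; a sketch of how the parameter list might be extended (the default value is my assumption):

{
  buildNpmPackage,
  fetchFromGitHub,
  lib,                 # needed for lib.optionalString
  prunePython ? true,  # allows opting out of the pruning
  # ...remaining inputs as before
}: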

But we are still not done here! While working through the node-gyp issue, I noticed that the source code of the aws-lambda-cpp library is bundled with the npm package. There are also a few zip files containing the source code of its dependencies in a directory called deps. While we are here, we can remove those as well. This leaves us with the following final postInstall hook:

# parameters omitted
buildNpmPackage rec {
  # ...rest of the derivation

  postInstall = lib.optionalString prunePython ''
    # All dependencies of the ric are not actual
    # runtime deps, but are build deps. We can remove
    # them to reduce the size of the final image.
    rm -rf $nodeModulesPath

    # remove build leftovers
    find $out/lib/node_modules/aws-lambda-ric/deps -maxdepth 1 \
        -not \( -path $out/lib/node_modules/aws-lambda-ric/deps/artifacts -prune \) \
        -not \( -name deps \) \
        -exec rm -rf {} \;
  '';
}

Taking a look at our image now, we can see that we have made a lot of progress: dive reports a total image size of 168 MB. But there is still more we can do!

Currently, we are using nodejs_20 from nixpkgs, which contains a full Node.js installation including tools like npm and npx, which we don't need if we just want to run a prebundled JavaScript app. Luckily, nixpkgs also contains a slim package for every Node.js version; in our case this is nodejs-slim_20.

If our handler were the only JS code being added to the image, this package would be a drop-in replacement for nodejs_20, but we also need to build the aws-lambda-ric package, and for that we need the full Node.js installation. During the build, the package generates a JS wrapper executable, which buildNpmPackage patches to point to the Node.js installation used during the build. This in turn pulls the full Node.js installation into the final derivation:

#! /nix/store/syl4snn859kpqvn9qh91kr7n9i4dws04-bash-5.2p32/bin/bash -e
exec "/nix/store/if6aqyl3sl0hz14a12mndj35swb1mcwi-nodejs-20.17.0/bin/node"  /nix/store/fv8d452b9hr0pf0w9bdc3ja7n0fb653r-aws-lambda-nodejs-runtime-interface-client-3.2.1/lib/node_modules/aws-lambda-ric/bin/index.mjs "$@"

To fix this, we could either rewrite the path in the aws-lambda-ric executable after it is built so that it no longer references the build-time Node.js, or we could get rid of the executable altogether by updating our Docker Cmd to invoke node directly.
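For reference, the latter option would boil down to a Cmd along these lines (a sketch only; the index.mjs path is taken from the wrapper script shown above):

config = {
  Cmd = [
    "${minifiedNode}/bin/node"
    "${lambdaRic}/lib/node_modules/aws-lambda-ric/bin/index.mjs"
    "${handler}/lib/handler.handler"
  ];
};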

I've chosen to do the former by wrapping the aws-lambda-ric derivation into a new derivation that fixes the shebang line by adding a postFixup hook:

{
  pkgs,
  minifiedNode,
  buildNode,
}:
with pkgs; let
  lambdaRic =
    (import ./aws-lambda-ric/aws-lambda-ric.nix {
      nodejs = buildNode;
      inherit buildNpmPackage fetchFromGitHub gcc libgnurl autoconf271 automake cmake libtool perl lib;
    })
    .overrideAttrs (oldAttrs: {
      postFixup = ''
        escapedNewInterpeter=$(printf '%s\n' "${minifiedNode}" | sed -e 's/[\/&]/\\&/g')
        escapedOldInterpeter=$(printf '%s\n' "${buildNode}" | sed -e 's/[]\/$*.^[]/\\&/g')
        find $out -type f -print0 | xargs -0 sed -i "s/$escapedOldInterpeter/$escapedNewInterpeter/g"
      '';
    });
in
  lambdaRic

Using nodejs-slim pushes our image down to a total size of 158 MB according to dive. This is as far down as we can get without removing any of the functionality that Node.js offers.

Removing intl capabilities

Usually, when I ship some kind of backend application, I don't need a lot of I18n support, as the frontend of my application takes care of that. Given this precondition, we can safely remove all I18n capabilities from the Node.js version that we want to bundle. This allows us to get rid of the intl support in Node.js, which removes the icu4c dependency that comes in at 40MB. We can also remove most of the glibc locales, which account for another 16MB:

...
dr-xr-xr-x         0:0      40 MB  │       ├─⊕ 9as4j2wqvd88n6xf7bmwsb6n4riqjpww-icu4c-73.2
...
dr-xr-xr-x         0:0      30 MB          ├── c10zhkbp6jmyh0xc5kd123ga8yy2p4hk-glibc-2.39-5
dr-xr-xr-x         0:0      16 MB          │   └── share
dr-xr-xr-x         0:0      16 MB          │       ├── i18n
...

Beyond the 40MB of icu4c itself, it also pulls a few additional dependencies:

tree of all nodejs-slim_20 dependencies

To remove intl support, we need to compile Node.js with the --without-intl flag. The nixpkgs builder for Node.js adds the --with-intl=system-icu flag by default, so we need to remove this flag and add the --without-intl flag instead. We'll do this by creating a new derivation that takes any Node.js package and recompiles it without intl by changing the build flags. While we are here, we can also remove some additional stuff that we don't need during runtime like header files and man pages:

{
  lib,
  nodeToMinify,
}:
let
  minifiedNode = nodeToMinify.overrideAttrs (oldAttrs: {
    postInstall =
      oldAttrs.postInstall
      + ''
        rm -r $out/share
        rm -r $out/lib
        rm -r $out/include
      '';

    configureFlags =
      (lib.remove "--with-intl=system-icu" (lib.flatten oldAttrs.configureFlags))
      ++ ["--without-intl"];
  });
in
  minifiedNode

For glibc we'll create a similar derivation that removes the locales and charmaps that we don't need:

{
  glibc,
  lib,
  localesToKeep ? [],
  charmapsToKeep ? [],
}:
let
  glibCLocales = localesToKeep ++ ["C"];
  glibCCharmaps = charmapsToKeep ++ ["UTF-8"];

  minGlibC = glibc.overrideAttrs (oldAttrs: {
    postInstall =
      oldAttrs.postInstall
      + ''
        find $out/share/i18n/locales -type f ${lib.concatMapStringsSep " " (l: "-not -name '${l}'") glibCLocales} -exec rm {} \;
        find $out/share/i18n/charmaps -type f ${lib.concatMapStringsSep " " (l: "-not -name '${l}.gz'") glibCCharmaps} -exec rm {} \;
      '';
  });
in
  minGlibC
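As a usage sketch, keeping an additional locale besides the C/UTF-8 defaults would look like this (the en_US locale is just an example):

# illustrative only: keep en_US in addition to the defaults
glibcWithEnUs = import ./glibc-no-locales.nix {
  inherit (pkgs) glibc lib;
  localesToKeep = ["en_US"];
};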

Now we just have to make sure that every package that gets put into the final Docker image uses this minified glibc to prevent the unmodified glibc from being pulled in as a dependency. The simplest way to do this is to define an overlay over the used nixpkgs that replaces the glibc package with the minified version:

{pkgs ? import (import ./pinned-nixpkgs.nix) {}}:
let
  glibcNoLocales = import ./glibc-no-locales.nix {
    glibc = pkgs.glibc;
    lib = pkgs.lib;
  };
  moddedPkgs = pkgs.extend (self: super: {glibc = glibcNoLocales;});

  minifiedNode = import ./minified-node.nix {
    lib = pkgs.lib;
    nodeToMinify = moddedPkgs.nodejs-slim_20;
  };
  lambdaRic = import ./lambda-ric-with-interpreter.nix {
    buildNode = moddedPkgs.nodejs_20;
    pkgs = moddedPkgs;
    inherit minifiedNode;
  };

  handler = import ./handler.nix {pkgs = moddedPkgs;};
in
  pkgs.dockerTools.buildImage {
    name = "nodetest";
    tag = "latest";

    copyToRoot = pkgs.buildEnv {
      name = "image-root";
      paths = [lambdaRic handler];
      pathsToLink = ["/bin"];
    };

    config = {
      Cmd = ["/bin/aws-lambda-ric" "${handler}/lib/handler.handler"];
    };
  }

This final image now weighs in at a total of 93 MB (or 31MB compressed), which is almost a 75 percent reduction in size compared to the AWS base image!

Tests

To validate my initial assumption that a smaller image means faster init times, I've set up the following test. I've created three Lambda-compatible Docker images that contain the same function code:

  • The first one is built using the AWS base image public.ecr.aws/lambda/nodejs:20.2024.11.06.17:
    FROM public.ecr.aws/lambda/nodejs:20.2024.11.06.17
    COPY handler.js ${LAMBDA_TASK_ROOT}
    CMD [ "handler.handler" ]
    
  • The second is built using the nix derivation described above, but with Intl enabled
  • The third image is built using the nix derivation described above, but with Intl disabled and glibc locales removed.

The handler used in all three images is the same as described above. To measure the init times of the Lambda functions, the following test is performed for each image:

  1. The image is pushed to ECR
  2. A new Lambda function using the image is created
  3. The function is invoked once using the AWS CLI. The init times as reported by AWS are logged.
  4. The function is invoked in parallel up to the parallelization limit of the account. The init times as reported by AWS are logged.
  5. A new version of the Lambda function is created using the CLI. This forces a new cold start upon the next invocation.
  6. Steps 3-5 are repeated until the Lambda has been invoked a total of 10000 times.

Step 3 is necessary because the init times of the very first Lambda call after a new deployment can vary wildly from the init times of subsequent calls (I've seen times of up to five seconds, independent of the image). As I don't want to optimize the very first cold start after an update, but rather scale-out events, I will ignore these init times.

Results

A total of 10000 Lambda invocations were performed for each docker image using the test setup described above. Of these, 9000 were cold starts with an init time being reported by Lambda. The following chart shows the distribution of these init times, binned to buckets of 10ms size:

The numbers you see in the chart above don't quite add up to 9000 measurements per image: depending on the behaviour of the Lambda scheduler and the timing of the invocations, some requests were not routed to new workers but were instead processed by existing ones.

By computing a few performance indicators, we can quantify the performance gains of our nix-based image even better:

A comparison of Lambda init time performance indicators. Values are rounded to the nearest integer.

                     AWS Base Image   With Intl   Δ           Without Intl   Δ
Mean                 246 ms           204 ms      (-17.07%)   196 ms         (-20.32%)
Median               236 ms           191 ms      (-19.06%)   180 ms         (-23.72%)
Standard Deviation    41 ms            51 ms      (+24.39%)    52 ms         (+26.82%)

The mean and median init times of our nix-based image are significantly lower than those of the AWS base image. Curiously, the standard deviation of the init times is higher than that of the AWS base image. This might be caused by blocking communication with the overarching Lambda service or by differences in extension init times, but I am not sure at the moment.

Outlook

In this article, I've used nix to drastically reduce the image size of a Docker image that can be deployed to AWS Lambda, which cuts down the init times of this function.

The code that I am executing in the Lambda function is (intentionally) very simple to keep the focus on the impact of the execution environment. The next step would be to test this image with a real-world application to check whether the performance and size improvements can be translated to a full, usable application.

As I am focusing on Node.js only, it would be interesting to look at other (interpreted) languages like Python or Ruby to see if similar improvements can be made.

If one wanted to reduce the size of the existing derivation even further, there are a few avenues that could be explored. For example, one could try to replace all usages of glibc with the much smaller musl. This would probably require a lot of work, as musl is not a drop-in replacement for glibc, but one could start by looking at how the needed applications are built in Alpine Linux, which uses musl as its libc implementation.
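As a very rough, untested starting point, nixpkgs exposes a package set built against musl, so one could try swapping the package set and see what breaks (the build may well fail, which is exactly the work described above):

# untested sketch: build against the musl-based package set instead of glibc
muslPkgs = pkgs.pkgsMusl;
muslNode = muslPkgs.nodejs-slim_20;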

Source code

You can find the nix derivation to build the images that I've tested, as well as all intermediate images that I've described throughout this article, on GitHub.

About the header image

Created with OpenAI's GPT-4o on 2024-11-13. Prompt:

A comic-style illustration of a large, majestic whale flying through the sky, breaking through fluffy clouds as it ascends toward a bright, radiant sun. The whale is depicted in mid-air, surrounded by scattered clouds that part as it moves, leaving a dynamic trail. The sunlight shines through, casting a warm golden glow over the scene. The background sky transitions from deep blue near the clouds to bright yellow-orange near the sun. High-energy motion lines and bold colors emphasize the whale's powerful movement, capturing a sense of wonder and freedom.
