Possible k8s OOM Kill prevention pill 2 - rlimit by taylordowns2000 · Pull Request #1370 · OpenFn/kit

taylordowns2000 · 2026-04-13T18:48:19Z

Background: It took me about 20 seconds to crash a staging worker in Kubernetes:

The above run will show up as "lost" in the next 30 minutes.

This PR uses prlimit to set RLIMIT_AS on each forked child process, capping virtual address space so a runaway
run crashes itself instead of OOM-killing the pod.

It's opt-in by detection: the limit is active when prlimit (from util-linux) is available on Linux, and it is a no-op on macOS / local dev. The PR also adds util-linux to the worker Docker image so that prlimit is available there.
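A minimal sketch of the detect-then-apply flow described above, not the PR's actual code. The helper names `prlimitAvailable`, `prlimitArgs` and `applyAddressSpaceLimit` are hypothetical; only the `prlimit --pid <pid> --as=<soft>:<hard>` invocation comes from the diff under review.

```typescript
import { execFileSync } from 'node:child_process';

// Hypothetical helper: true only on Linux when the prlimit binary can be run.
function prlimitAvailable(platform: string = process.platform): boolean {
  if (platform !== 'linux') return false; // no-op on macOS / local dev
  try {
    execFileSync('prlimit', ['--version'], { stdio: 'ignore' });
    return true;
  } catch {
    return false; // binary missing (util-linux not installed) or not executable
  }
}

// Build the prlimit arguments that cap RLIMIT_AS (virtual address space)
// for an already-running pid. Soft and hard limits are set to the same value.
function prlimitArgs(pid: number, limitMb: number): string[] {
  const limitBytes = Math.floor(limitMb * 1024 * 1024);
  return ['--pid', String(pid), `--as=${limitBytes}:${limitBytes}`];
}

// Apply the cap to a forked child. Returns false when the limit was skipped
// (prlimit unavailable) or when the prlimit call itself failed.
function applyAddressSpaceLimit(pid: number, limitMb: number): boolean {
  if (!prlimitAvailable()) return false;
  try {
    execFileSync('prlimit', prlimitArgs(pid, limitMb), { stdio: 'ignore' });
    return true;
  } catch {
    return false;
  }
}
```

Returning a boolean rather than throwing keeps the feature opt-in: a worker on a machine without prlimit carries on without the limit.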

Testing on staging

Deploy this branch to a worker connected to app.staging.openfn.org
Create a workflow with a job that spikes memory beyond the configured limit, e.g.:

  fn(state => {
    const arr = [];
    while (true) { arr.push(new Array(1e6).fill('x')); }
    return state;
  });

Run the workflow and confirm:

The run fails with an OOM error (not a pod restart)
Other concurrent runs on the same worker are unaffected
The worker recovers and picks up new runs normally

Check worker logs for "cgroup memory enforcement enabled" at startup and "killed by SIGKILL (probable OOM)" on the failing run
Verify no leftover openfn-worker-* directories under the cgroup root after the run completes

AI Usage

Please disclose whether you've used AI anywhere in this PR (it's cool, we just
want to know!):

I have used Claude Code
I have used another model
I have not used AI

You can read more details in our
Responsible AI Policy

Release branch checklist

Delete this section if this is not a release PR.

If this IS a release branch:

Run pnpm changeset version from root to bump versions
Run pnpm install
Commit the new version numbers
Run pnpm changeset tag to generate tags
Push tags git push --tags

Tags may need updating if commits come in after the tags are first generated.

josephjclark · 2026-04-14T07:11:32Z

Gosh there's a lot of stuff here, and I have no idea what any of it does. I'll take a close look at it (probably tomorrow)

josephjclark

I'm a lot more comfortable with this approach over the cgroup stuff. This feels more direct and targeted at the use-case we need.

I still want to read a little more about it, and I'll make a few changes to the PR.

The big thing I'm concerned about is: will the engine be able to set the correct exit condition when rlimit kills the process? I need to test that locally

josephjclark · 2026-04-14T15:48:34Z

+  const hasPrlimit = detectPrlimitSupport(logger);
+
+  if (hasPrlimit && options.maxWorkerMemoryMb) {
+    logger.info(


I would append this to the previous startup log - it'll be way more useful there

josephjclark · 2026-04-14T15:51:05Z


 type WorkerOptions = {
  maxWorkers?: number;
+  maxWorkerMemoryMb?: number; // kernel-level memory limit per child process (cgroup v2)


I would just re-use the existing memoryLimitMb option

How that limit gets used in cgroups or heap memory settings or whatever is an implementation detail. The admin just needs to say "don't let any given job take more than 500mb"

josephjclark · 2026-04-14T15:51:47Z

      allWorkers[child.pid!] = child;
+
+      if (hasPrlimit && options.maxWorkerMemoryMb) {
+        // RLIMIT_AS counts virtual address space, not RSS.


This comment doesn't belong here. We should just pass the mb limit into the applyMemoryLimit function

I think I'd also say: take the memory limit used in the run, add 10% (or 20%?), and set that as the hard process limit.

I don't really know why - I just feel like we should let node control the exit itself, and use the rlimit as a hard fallback
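The headroom idea in this comment can be sketched as a small calculation. The function name `hardLimitBytes` and the default of 10% are illustrative assumptions, not something the PR defines.

```typescript
// Hypothetical helper: take the run's memory limit, add a percentage of
// headroom, and return the result in bytes for use as the OS-level limit.
// Integer arithmetic avoids floating point surprises such as 500 * 1.1.
function hardLimitBytes(runLimitMb: number, headroomPercent: number = 10): number {
  if (runLimitMb <= 0 || headroomPercent < 0) {
    throw new Error('runLimitMb must be positive and headroomPercent non-negative');
  }
  const hardLimitMb = Math.ceil((runLimitMb * (100 + headroomPercent)) / 100);
  return hardLimitMb * 1024 * 1024;
}
```

With a 500mb run limit, 10% headroom gives a 550mb hard limit and 20% gives 600mb, so Node's own heap limit would be expected to trip first and the OS limit only acts as a backstop.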

josephjclark · 2026-04-14T15:53:25Z

+    execFileSync('prlimit', [
+      '--pid',
+      String(pid),
+      `--as=${limitBytes}:${limitBytes}`,


I need to look into:

should the soft and hard limit be the same?

should we be setting address space or RSS? or both?
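For context on the first question: prlimit takes limits as `soft:hard`, and giving only `soft:` or `:hard` changes just that side. A process may raise its own soft limit up to the hard limit, so setting both to the same value stops the child from lifting the cap itself. The sketch below only formats the argument; the helper name `limitSpec` is hypothetical.

```typescript
// Hypothetical helper: format a prlimit resource argument.
// 'as' is virtual address space, 'rss' is resident set size.
function limitSpec(
  resource: 'as' | 'rss',
  softBytes?: number,
  hardBytes?: number
): string {
  if (softBytes === undefined && hardBytes === undefined) {
    throw new Error('at least one of softBytes or hardBytes is required');
  }
  if (softBytes !== undefined && hardBytes !== undefined && softBytes > hardBytes) {
    throw new Error('the soft limit cannot exceed the hard limit');
  }
  const soft = softBytes === undefined ? '' : String(softBytes);
  const hard = hardBytes === undefined ? '' : String(hardBytes);
  return `--${resource}=${soft}:${hard}`;
}
```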

josephjclark · 2026-04-14T15:57:15Z

@@ -0,0 +1,51 @@
+import test from 'ava';


Yeah I don't know what these tests are going to tell us. Maybe this is something to do at the integration test level. Maybe it's more appropriate that we don't test this at all?

taylordowns2000 · 2026-04-14T21:42:50Z

Given the early success of "pill 1", I'm happy to close this. Whatcha think?

josephjclark · 2026-04-15T15:25:45Z

@taylordowns2000 I'm still interested in this but do need to test and probe further.

Memory was still spiking super high even after the fix. It might be that runs are consuming 1.5 gb memory before getting killed by the process - which is still 500mb over the allowed limit. Two of those workflows running at once would kill the worker.

Using rlimit and enforcing memory constraints in the OS should give us far stricter control of memory, which would mean we can kill the job the moment it allocates outside of its bounds. I'm very interested in that.

If it works well, we could even think more seriously about dynamic memory allocation: claiming a run with a 512mb memory limit if you only have 800mb available, for example. Capacity would be determined purely by memory availability, not by an arbitrary number, and we could totally trust it.
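A rough sketch of what memory-based capacity could look like, assuming each run's limit is strictly enforced. The function name `claimableRuns` is hypothetical and nothing like it exists in the PR.

```typescript
// Hypothetical helper: how many runs with a given memory limit can be
// claimed, given the memory currently available to the worker.
function claimableRuns(availableMb: number, runLimitMb: number): number {
  if (runLimitMb <= 0) throw new Error('runLimitMb must be positive');
  return Math.max(0, Math.floor(availableMb / runLimitMb));
}
```

With 800mb available and a 512mb run limit, this allows exactly one run; a second would only be claimed once enough memory is free again.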

I've always been a bit stuck on cgroups (not that I really understand them) because they're designed to restrict a set of processes. But that's not really what I want, I don't think. rlimit feels nicer because I can control it per-process, so it feels more suitable to the task.

josephjclark · 2026-06-24T15:05:13Z

      allWorkers[child.pid!] = child;
+
+      if (hasPrlimit && options.maxWorkerMemoryMb) {
+        // RLIMIT_AS counts virtual address space, not RSS.


I hadn't appreciated what this comment meant when I looked at this before. Nor when I started developing the solution for my own tests.

What's really important is this: prlimit sets a limit of 10x the actual run limit we want. Ten times. It's setting a limit of 10gb per run. But Kubernetes will OOM-kill us when the pod hits 2.5gb. So our rlimit guard will never fire.
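The arithmetic behind this comment, as a rough sketch. It compares a virtual address space cap against a physical pod limit in the same loose way the comment does. The 1gb run limit is an assumption inferred from "10gb per run" at 10x, and `rlimitCanFire` is a hypothetical name.

```typescript
// Hypothetical check: can the RLIMIT_AS cap be reached before the pod's
// memory limit? If the cap is at or above the pod limit, Kubernetes
// OOM-kills the pod first and the rlimit never gets a chance to act.
function rlimitCanFire(
  runLimitMb: number,
  addressSpaceMultiplier: number,
  podLimitMb: number
): boolean {
  const rlimitAsMb = runLimitMb * addressSpaceMultiplier;
  return rlimitAsMb < podLimitMb;
}

// Assumed figures: 1gb run limit, 10x multiplier, 2.5gb pod.
const fires = rlimitCanFire(1024, 10, 2560); // false: 10240mb is far above 2560mb
```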

josephjclark · 2026-06-24T15:12:17Z

Here's a summary:

We explored using prlimit to enforce per-worker memory limits in our Node.js worker pool. The plan was to set RLIMIT_AS (virtual address space) on each child process.

The problem: RLIMIT_AS limits virtual address space, not physical memory. Node/V8 reserves 4-8GB of virtual address space at startup regardless of actual usage, so you have to set RLIMIT_AS to roughly 10x your intended memory limit just to let the process start. In our environment (Kubernetes pod with 2.5GB), that means RLIMIT_AS would never fire before Kubernetes kills the pod anyway — so it provides no protection.

RLIMIT_RSS (physical memory) would be ideal but Linux doesn't enforce it. Cgroups would work but require root or pre-configured Kubernetes resource limits.

Conclusion: prlimit isn't the right tool here. We already set --max-old-space-size on each child process, which limits the V8 JS heap directly. When a worker exceeds that, Node throws a heap OOM, the process exits cleanly, and the pool detects it and raises an OOMError — all before Kubernetes intervenes. That's our enforcement mechanism and it's sufficient for the vast majority of cases (native addon memory leaks are the only gap, and those aren't practical to guard against here).
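A minimal sketch of passing `--max-old-space-size` to a child process, assuming the pool uses Node's `child_process.fork` with `execArgv`. The summary does not show how the worker pool actually sets the flag, and the helper names here are hypothetical.

```typescript
import { fork, ChildProcess } from 'node:child_process';

// Hypothetical helper: the V8 flag that caps the old-generation JS heap,
// expressed in megabytes.
function heapLimitExecArgv(memoryLimitMb: number): string[] {
  if (memoryLimitMb <= 0) throw new Error('memoryLimitMb must be positive');
  return [`--max-old-space-size=${Math.floor(memoryLimitMb)}`];
}

// Hypothetical helper: fork a child whose JS heap is capped by the flag above.
function forkWithHeapLimit(modulePath: string, memoryLimitMb: number): ChildProcess {
  return fork(modulePath, [], { execArgv: heapLimitExecArgv(memoryLimitMb) });
}
```

The flag only bounds the V8 heap, which matches the gap noted above: memory allocated outside the JS heap, for example by native addons, is not covered.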

cgroup

d158f4f

taylordowns2000 added this to Core Apr 13, 2026

github-project-automation Bot moved this to New Issues in Core Apr 13, 2026

taylordowns2000 changed the title ~~cgroup~~ Possible k8s OOM Kill prevention via cgroup - pill 2 Apr 13, 2026

taylordowns2000 changed the title ~~Possible k8s OOM Kill prevention via cgroup - pill 2~~ Possible k8s OOM Kill prevention pill 2 - cgroup Apr 13, 2026

taylordowns2000 requested a review from josephjclark April 13, 2026 18:49

taylordowns2000 added 2 commits April 13, 2026 14:58

fix tests

85f79fe

rlimit

a19869d

taylordowns2000 changed the title ~~Possible k8s OOM Kill prevention pill 2 - cgroup~~ Possible k8s OOM Kill prevention pill 2 - rlimit Apr 13, 2026

bigger

9066dac

josephjclark reviewed Apr 14, 2026

josephjclark mentioned this pull request Jun 23, 2026

Use prlimit #1462

Closed

3 tasks

josephjclark reviewed Jun 24, 2026

josephjclark closed this Jun 24, 2026

github-project-automation Bot moved this from New Issues to Done in Core Jun 24, 2026

taylordowns2000 wants to merge 4 commits into main from memory-concept-2
