Skip to content

Kill local cluster runner child processes on abnormal test exit#4597

Open
tillrohrmann wants to merge 1 commit intorestatedev:mainfrom
tillrohrmann:kill-restate-processes
Open

Kill local cluster runner child processes on abnormal test exit#4597
tillrohrmann wants to merge 1 commit intorestatedev:mainfrom
tillrohrmann:kill-restate-processes

Conversation

@tillrohrmann
Copy link
Copy Markdown
Contributor

The #[restate_core::test] macro installs a panic hook that calls std::process::exit(1), which bypasses Rust Drop impls. This left spawned Restate server processes running as orphans when tests failed.

Add an atexit handler that tracks child PGIDs in a global registry and sends SIGKILL on process exit. Since std::process::exit() runs C atexit handlers, this ensures cleanup even when Drop impls are skipped.

Set LOCAL_CLUSTER_RUNNER_RETAIN_CLUSTER=true to opt out and keep the cluster running for investigation (similar to the existing LOCAL_CLUSTER_RUNNER_RETAIN_TEMPDIR pattern).

Copy link
Copy Markdown

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

The #[restate_core::test] macro installs a panic hook that calls
std::process::exit(1), which bypasses Rust Drop impls. This left
spawned Restate server processes running as orphans when tests failed.

Add an atexit handler that tracks child PIDs in a global registry and
sends SIGKILL on process exit. Since std::process::exit() runs C atexit
handlers, this ensures cleanup even when Drop impls are skipped.

Set LOCAL_CLUSTER_RUNNER_RETAIN_CLUSTER=true to opt out and keep the
cluster running for investigation (similar to the existing
LOCAL_CLUSTER_RUNNER_RETAIN_TEMPDIR pattern).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@tillrohrmann tillrohrmann force-pushed the kill-restate-processes branch from 6236eec to 0136fd9 Compare April 14, 2026 14:01
Copy link
Copy Markdown
Contributor

@muhamadazmy muhamadazmy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! The only question that comes to mind is why we start all child processes in different groups and not just one group. This is normally supported by libc and I think is possible in rust via the https://doc.rust-lang.org/std/os/unix/process/trait.CommandExt.html#tymethod.pre_exec . This means that we don't need to track all group ids of the child processes but just one.

Anyway, the changes looks good to me and is good to merge as is.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants