Description
When nim-libp2p reaches the operating system's file descriptor limit (e.g., ulimit -n), the accept loop in the TCP transport layer falls into an infinite, non-yielding busy loop. This causes the node to instantly peg the CPU at 100%, starving the chronos async event loop and effectively deadlocking the entire application.
This is a critical stability issue for production nodes (which primarily run on Linux) under heavy connection load or targeted connection-spam attacks.
Root Cause & Platform Differences
The bug stems from how the async engine (chronos) interacts with the underlying OS event polling mechanisms (epoll on Linux vs. kqueue on macOS) when the accept() system call fails with EMFILE (Too many open files).
- The TCP Transport accept Loop (tcptransport.nim)
  When tcptransport.accept() encounters TransportTooManyError from Chronos, it catches the exception, logs a debug message, and returns nil.
- The Switch accept Loop (switch.nim)
  When the switch receives a nil connection from the transport layer, it simply calls continue to instantly retry the accept() call without yielding or backing off.
- The Platform Difference (Why it hides on macOS but kills Linux nodes)
  - On Linux (epoll): Chronos uses level-triggered epoll (EPOLLIN). Because there are still pending TCP connections in the kernel's listen backlog that we couldn't accept, epoll instantly wakes up the event loop again. The loop calls accept(), hits EMFILE, returns nil, and loops again immediately. This loops thousands of times a second, consuming 100% CPU.
  - On macOS (kqueue): Chronos uses kqueue. When accept() fails, Chronos removes and re-adds the socket reader. Because no new state change has occurred on the listen socket since it was re-added, kqueue does not wake the event loop. The loop naturally pauses until a new connection arrives, masking the 100% CPU bug on local Mac development machines.
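The interaction between the two loops can be illustrated with a simplified, synchronous model. This is only a sketch: Conn, transportAccept, switchAcceptLoop and the iteration cap are hypothetical names invented for illustration, not the actual nim-libp2p code, and the real loops are async.

```nim
# Simplified model of the retry pattern described above.
# All identifiers are hypothetical; this is not the nim-libp2p source.

type Conn = ref object

var
  acceptCalls = 0
  handled = 0

proc transportAccept(fdsAvailable: bool): Conn =
  ## Models the transport: on fd exhaustion the error is swallowed
  ## and nil is returned to the caller.
  inc acceptCalls
  if not fdsAvailable:
    return nil
  Conn()

proc switchAcceptLoop(maxIterations: int) =
  ## Models the switch: a nil connection triggers an immediate retry.
  ## The cap exists only so that this demo terminates.
  var i = 0
  while i < maxIterations:
    inc i
    let conn = transportAccept(fdsAvailable = false)
    if conn.isNil:
      continue  # no yield, no backoff
    inc handled

switchAcceptLoop(10_000)
echo "accept calls: ", acceptCalls, ", connections handled: ", handled
```

Every iteration performs a failing accept and nothing else, so as long as the fd limit is hit the loop makes no progress while consuming CPU.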
Steps to Reproduce (on Linux)
- Check out this commit: https://github.com/vacp2p/nim-libp2p/tree/2a1411ddc06ccabf33b93a914ddd09a43953a7f5
- Run
  docker build -t reproduce-emfile -f Dockerfile.emfile . && docker run --rm reproduce-emfile
- The same error is logged repeatedly, with a rapidly growing counter: Server accept error (x1830000): [EMFILE] Too many open files in the process
Proposed Solution
Introduce an explicit async backoff mechanism in tcptransport.nim when TransportTooManyError is caught. This yields control back to the event loop, allowing the application to process existing connections, close old ones, and eventually free up file descriptors.
File: libp2p/transports/tcptransport.nim
    except TransportTooManyError as exc:
      debug "Too many files opened", description = exc.msg
  +   await sleepAsync(100.milliseconds)
      return nil
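Adding a 100 ms sleepAsync caps the retry rate at roughly ten accept attempts per second while the limit is hit, and gives the event loop time to process other events between attempts. This mitigates the CPU starvation; new connections are accepted again once existing ones close and free up file descriptors.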
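The effect of the backoff can be sketched with a small chronos program. This is an illustration under stated assumptions, not the patched transport: failingAccept, acceptLoop, otherTask and the iteration counts are hypothetical, and failingAccept simply pretends that every accept hits the fd limit.

```nim
import chronos

type Conn = ref object

var
  acceptAttempts = 0
  otherWork = 0

proc failingAccept(): Future[Conn] {.async.} =
  ## Hypothetical stand-in for an accept() that always hits the fd limit.
  inc acceptAttempts
  await sleepAsync(100.milliseconds)  # the proposed backoff
  return nil

proc acceptLoop() {.async.} =
  ## Retries on nil, as the switch does; bounded so the demo terminates.
  for _ in 0 ..< 3:
    let conn = await failingAccept()
    if conn.isNil:
      continue

proc otherTask() {.async.} =
  ## Represents existing connections that need event-loop time.
  for _ in 0 ..< 5:
    await sleepAsync(10.milliseconds)
    inc otherWork

proc main() {.async.} =
  # Start both tasks, then wait for each to finish.
  let a = acceptLoop()
  let b = otherTask()
  await a
  await b

waitFor main()
echo "accept attempts: ", acceptAttempts, ", other work done: ", otherWork
```

Because each failed accept suspends for 100 ms, the other task gets scheduled and completes all of its steps while the accept loop is still backing off.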