CPU-Intensive Workers over TCP
When processing CPU-heavy jobs over TCP, synchronous work blocks the event loop and can cause the TCP connection to drop, losing all in-flight jobs. This guide explains the problem and how to avoid it.
The Problem
The TCP client sends periodic ping health checks (default: every 30s). Under heavy CPU load, the event loop can’t process these pings in time. After 3 consecutive failures (~90s), the client triggers a forced reconnect, which closes the socket. When the server detects the socket close, it calls releaseClientJobs() and requeues all processing jobs. The worker then fails to ACK completed jobs with:
Error: Job not found or not in processing stateConnection Options
Disable the ping health check and increase the command timeout:
const worker = new Worker('heavy-queue', processor, { concurrency: 3, connection: { port: 6789, pingInterval: 0, // Disable ping health check commandTimeout: 60000, // Increase command timeout to 60s }, useLocks: false, // Avoid lock expiration under load heartbeatInterval: 0, // Disable heartbeat});Non-Blocking CPU Work
Even with pings disabled, long synchronous CPU work blocks heartbeats, lock renewals, and TCP responses. Break up CPU-heavy loops with periodic yields:
// Bad — blocks event loop for entire durationfunction findNthPrime(n: number): number { let count = 0, candidate = 1; while (count < n) { candidate++; if (isPrime(candidate)) count++; } return candidate;}
// Good — yields every 500 iterationsasync function findNthPrime(n: number): Promise<number> { let count = 0, candidate = 1, ops = 0; while (count < n) { candidate++; if (isPrime(candidate)) count++; if (++ops % 500 === 0) await Bun.sleep(0); } return candidate;}await Bun.sleep(0) yields to the event loop for one tick, allowing timers, TCP I/O, and heartbeats to fire.
Default Timeouts Reference
| Setting | Default | Effect under CPU load |
|---|---|---|
pingInterval | 30000ms | 3 consecutive failures → forced reconnect (~90s) |
commandTimeout | 30000ms | Long-running commands timeout |
LOCK_TIMEOUT_MS | 5000ms | Lock expires before worker finishes |
stallInterval | 30000ms | Job marked stalled if no heartbeat |
Alternative: SandboxedWorker
For truly CPU-bound work, consider using SandboxedWorker instead. It runs each job in an isolated Bun Worker thread, so the main event loop is never blocked. This eliminates all TCP timeout issues without requiring connection tuning.