Asynchronous Mode Support in Nuclio
⚠️ Warning: Technical preview
Overview
Nuclio now supports asynchronous function invocation within a single worker process. This feature enhances performance and resource utilization, particularly in use cases where functions handle multiple concurrent requests. It is especially beneficial for I/O-bound operations. Currently, asynchronous mode is available only for HTTP triggers and the Python runtime.
Enabling Async Mode
To enable asynchronous mode, set the spec.triggers.<trigger-name>.mode field to async in the function’s configuration, where <trigger-name> is the name of the trigger.
Asynchronous mode operates by establishing multiple socket connections between the Nuclio processor and the worker process. By default, Nuclio creates 1000 connections per worker.
This value can be customized using the spec.triggers.<trigger-name>.async.maxConnectionsNumber field.
Architecture Considerations and Failure Recovery
Architecture description can be found here.
Async mode is designed to complement, not replace, Nuclio’s synchronous processing. It should be used only when asynchronous handling aligns with the function’s behavior. If a blocking operation is executed within an async function, it blocks the entire worker process, and no other request on that worker can be processed until the operation returns.
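To illustrate why a blocking call stalls every request on a worker, the following standalone sketch uses plain asyncio rather than the Nuclio SDK. The handler names, the request count, and the 0.2-second delay are illustrative assumptions. It runs three handlers concurrently and measures the elapsed time for a blocking call, an awaitable call, and a blocking call offloaded to a thread.

```python
import asyncio
import time


async def blocking_handler(delay: float) -> None:
    # time.sleep blocks the event loop: no other coroutine can run meanwhile.
    time.sleep(delay)


async def cooperative_handler(delay: float) -> None:
    # asyncio.sleep yields control, so other requests keep making progress.
    await asyncio.sleep(delay)


async def offloaded_handler(delay: float) -> None:
    # Unavoidable blocking work can be moved to a thread (Python 3.9+).
    await asyncio.to_thread(time.sleep, delay)


async def run_concurrently(handler, count: int, delay: float) -> float:
    """Run `count` handlers at once and return the wall-clock time taken."""
    start = time.perf_counter()
    await asyncio.gather(*(handler(delay) for _ in range(count)))
    return time.perf_counter() - start


def measure(handler, count: int = 3, delay: float = 0.2) -> float:
    """Convenience wrapper that runs the measurement in a fresh event loop."""
    return asyncio.run(run_concurrently(handler, count, delay))
```

With the blocking handler the three calls run one after another, so the total is roughly three times the delay. The other two variants overlap and finish in roughly one delay.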
When spec.eventTimeout is configured, any connection that exceeds the timeout is marked for restart. The connection is then closed, and the Nuclio processor attempts to re-establish it.
If the connection cannot be re-established due to a blocking operation or any other problem, the processor retries up to 3 times, with a 30-second timeout for each attempt.
If all retries fail, the worker process is restarted to ensure proper recovery.
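The recovery sequence above can be sketched as a simple retry loop. This is a model of the described behavior, not the processor’s actual code: recover_connection and try_connect are hypothetical names, and the sketch assumes that "up to 3 times" means three attempts in total.

```python
from typing import Callable


def recover_connection(
    try_connect: Callable[[float], bool],
    max_retries: int = 3,
    attempt_timeout: float = 30.0,
) -> str:
    """Return "reconnected" on success, or "restart-worker" once retries run out.

    `try_connect` is a hypothetical callable that receives the per-attempt
    timeout in seconds and reports whether the connection was re-established.
    """
    for _ in range(max_retries):
        if try_connect(attempt_timeout):
            return "reconnected"
    # Every attempt failed, so the worker process itself is restarted.
    return "restart-worker"
```

Under these assumptions, a handler that stays blocked can delay recovery by up to 3 × 30 = 90 seconds before the worker is restarted.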
Concurrency and Connection Management
A single pod can process at most numWorkers × maxConnectionsNumber events concurrently: the pod runs numWorkers worker processes, and each worker handles up to maxConnectionsNumber events at a time.
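As a quick illustration of this limit, the helper below computes the per-pod ceiling. The function name is hypothetical, and the default of 1000 is the documented default for maxConnectionsNumber.

```python
def max_concurrent_events(num_workers: int, max_connections_number: int = 1000) -> int:
    """Upper bound on the events one pod can process at the same time."""
    if num_workers < 1 or max_connections_number < 1:
        raise ValueError("both values must be positive")
    return num_workers * max_connections_number
```

For example, a function with 4 workers and the default connection count can process up to max_concurrent_events(4) = 4000 events at once in each pod.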
If all connections are occupied when a new event arrives, the event waits for a connection to become available for up to spec.triggers.<trigger-name>.async.connectionAvailabilityTimeout.
By default, this timeout is set to 10 seconds, but it can be customized to any desired value.
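The waiting behavior can be modeled with a semaphore and a bounded wait. The sketch below is a toy simulation in plain asyncio, not Nuclio’s implementation. ConnectionPool and simulate are hypothetical names, and the sleep stands in for handling an event.

```python
import asyncio


class ConnectionPool:
    """Toy model of a fixed pool of connections with a bounded wait."""

    def __init__(self, max_connections: int, availability_timeout: float) -> None:
        self._slots = asyncio.Semaphore(max_connections)
        self._timeout = availability_timeout

    async def process(self, work_seconds: float) -> str:
        try:
            # Wait for a free connection, but give up after the timeout.
            await asyncio.wait_for(self._slots.acquire(), self._timeout)
        except asyncio.TimeoutError:
            return "failed to allocate connection"
        try:
            await asyncio.sleep(work_seconds)  # stand-in for handling the event
            return "ok"
        finally:
            self._slots.release()


async def simulate(events: int, max_connections: int, timeout: float, work: float) -> list:
    """Submit `events` events at once and collect one outcome per event."""
    pool = ConnectionPool(max_connections, timeout)
    return await asyncio.gather(*(pool.process(work) for _ in range(events)))
```

Calling asyncio.run(simulate(3, 2, 0.1, 0.3)) submits three events to a pool of two connections. The third event gives up because each request takes longer than the availability timeout. Raising the timeout above the request duration lets it succeed.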
Slow init_context and Startup Timeouts
When init_context takes a long time (e.g., loading a large model), the Go processor must wait for the Python wrapper to start listening before it can establish connections.
By default, the startup budget is 3 × readinessTimeoutSeconds, so it scales automatically with the function’s configured readiness window.
If your init_context routinely exceeds this budget, set an explicit override:
spec:
  readinessTimeoutSeconds: 300  # 5 minutes
  triggers:
    myTrigger:
      kind: http
      mode: async
      async:
        establishConnectionTimeout: "20m"  # override: 20 minutes
establishConnectionTimeout accepts any value valid for Go’s time.ParseDuration (e.g., "5m", "600s").
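To check which startup budget applies to a given configuration, the arithmetic can be sketched as follows. Both function names are hypothetical. The parser handles only a simplified subset of the time.ParseDuration syntax (the units ms, s, m, and h), and the sketch assumes that an explicit establishConnectionTimeout replaces the 3 × readinessTimeoutSeconds default.

```python
import re
from typing import Optional

# Seconds per unit for a subset of the units Go's time.ParseDuration accepts.
_UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0}
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)(ms|s|m|h)")


def parse_duration(text: str) -> float:
    """Convert strings such as "5m", "600s" or "1h30m" to seconds."""
    position, total = 0, 0.0
    for match in _TOKEN.finditer(text):
        if match.start() != position:
            break  # a gap means there are characters we cannot parse
        total += float(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        position = match.end()
    if not text or position != len(text):
        raise ValueError(f"unsupported duration: {text!r}")
    return total


def startup_budget_seconds(
    readiness_timeout_seconds: int,
    establish_connection_timeout: Optional[str] = None,
) -> float:
    """Return the explicit override if set, else 3 x the readiness timeout."""
    if establish_connection_timeout is not None:
        return parse_duration(establish_connection_timeout)
    return 3.0 * readiness_timeout_seconds
```

With the configuration above, the default budget would be 3 × 300 = 900 seconds (15 minutes), and the "20m" override raises it to 1200 seconds.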
Troubleshooting
“dial tcp 127.0.0.1:1337: connect: connection refused”
This error occurs during deployment when the Python wrapper has not started listening by the time the startup budget is exhausted.
Causes:
init_context takes longer than 3 × readinessTimeoutSeconds
Solutions:
Increase readiness timeout: Set spec.readinessTimeoutSeconds to a value larger than your init_context duration; the startup budget grows automatically (3×).
Set an explicit override: Set spec.triggers.<trigger-name>.async.establishConnectionTimeout to a fixed duration if you need finer control (e.g., "10m").
“Failed to allocate connection for processing event”
This error occurs when a new event arrives but all connections are occupied, and the connectionAvailabilityTimeout expires before a connection becomes available.
Causes:
High concurrency with long-running requests
Blocking operations in async handlers
maxConnectionsNumber is too low for the workload
connectionAvailabilityTimeout is too short
Solutions:
Increase timeout: Set spec.triggers.<trigger-name>.async.connectionAvailabilityTimeout to a higher value (default is 10s)
Increase connections: Set spec.triggers.<trigger-name>.async.maxConnectionsNumber to a higher value (default is 1000)
Scale horizontally: Increase numWorkers or pod replicas
Review handler code: Ensure async handlers don’t perform blocking operations