Production Integration Patterns

This page covers practical patterns for running Nyckel prediction endpoints reliably in production.

Synchronous vs. asynchronous invocation

Synchronous — Call /invoke inline with the user request. Works well when:

The prediction is shown to the user immediately.
The added latency is acceptable (typically under 200 ms for text, slightly more for images).

Asynchronous — Queue the input, invoke in a background worker, and store the result. Works well when:

The prediction is used internally (routing, tagging, filtering).
You are processing batches of inputs.
You want to decouple prediction latency from user response time.
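
The asynchronous pattern above can be sketched with an in-process queue and a single background thread. This is a minimal illustration, not a recommended production setup: `start_worker`, `jobs`, and `results` are hypothetical names, `invoke` stands for any callable that sends one input to /invoke, and a real deployment would likely use a durable queue and a database instead of in-memory structures.

```python
import queue
import threading

def start_worker(invoke, jobs, results):
    """Start a daemon thread that drains `jobs` and stores predictions in `results`.

    `invoke` is any callable taking one input and returning a prediction
    (hypothetical; in practice it would wrap the call to /invoke).
    """
    def run():
        while True:
            job = jobs.get()
            if job is None:            # sentinel value: stop the worker
                jobs.task_done()
                break
            item_id, data = job
            try:
                results[item_id] = invoke(data)
            except Exception as exc:   # record the failure, keep the worker alive
                results[item_id] = {"error": str(exc)}
            finally:
                jobs.task_done()

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return thread
```

The request handler only calls `jobs.put((item_id, data))` and returns immediately, so prediction latency never reaches the user. Putting `None` on the queue shuts the worker down cleanly.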

Retries and error handling

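Always implement retry logic with exponential backoff for 429 (rate limit) and 5xx (server error) responses. Other 4xx responses indicate a problem with the request itself, so retrying them unchanged will not help.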

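import time
import requests

def invoke_with_retry(function_id, data, token, retries=3):
    """Invoke a function, retrying on 429 and 5xx with exponential backoff."""
    url = f"https://www.nyckel.com/v1/functions/{function_id}/invoke"
    headers = {"Authorization": f"Bearer {token}"}
    for attempt in range(retries):
        # The timeout stops a stalled connection from blocking forever (10 s is an arbitrary choice).
        resp = requests.post(url, json={"data": data}, headers=headers, timeout=10)
        if resp.status_code == 200:
            return resp.json()
        if resp.status_code == 429 or resp.status_code >= 500:
            if attempt < retries - 1:  # no point sleeping after the final attempt
                time.sleep(2 ** attempt)  # exponential backoff: 1 s, 2 s, ...
            continue
        resp.raise_for_status()  # other errors (e.g. 400, 401) are not retried
    raise RuntimeError("Max retries exceeded")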

Store prediction results locally

Do not rely solely on Nyckel as your record of predictions. Store the sampleId, predicted label, confidence, and timestamp in your own database. This lets you:

Audit predictions later.
Submit annotations when outcomes are known.
Analyze model performance over time.
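
As a rough sketch of what local storage could look like, the snippet below writes each prediction to a SQLite table using Python's standard `sqlite3` module. The table layout and the function names `open_prediction_store` and `record_prediction` are illustrative assumptions, not part of Nyckel's API; any database would serve the same purpose.

```python
import sqlite3
from datetime import datetime, timezone

def open_prediction_store(path=":memory:"):
    """Open (or create) a SQLite database with a predictions table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS predictions (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               sample_id TEXT NOT NULL,
               label TEXT NOT NULL,
               confidence REAL NOT NULL,
               predicted_at TEXT NOT NULL
           )"""
    )
    return conn

def record_prediction(conn, sample_id, label, confidence):
    """Store one prediction with a UTC timestamp."""
    predicted_at = datetime.now(timezone.utc).isoformat()
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO predictions (sample_id, label, confidence, predicted_at) "
            "VALUES (?, ?, ?, ?)",
            (sample_id, label, confidence, predicted_at),
        )
```

Each call appends a new row rather than overwriting, so repeated predictions for the same sample remain available for auditing.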

Managing multiple functions

If your application uses more than one Nyckel function (for example, an image classifier and a text classifier), keep function IDs in configuration rather than hardcoded in your application code.

NYCKEL_FUNCTIONS = {
    "image_moderation": "fn_abc123",
    "ticket_routing":   "fn_xyz789",
}

This makes it easy to swap or update functions without code changes.
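
One possible way to keep the mapping out of code entirely is to read it from an environment variable at startup. The sketch below assumes a variable named `NYCKEL_FUNCTIONS` holding a JSON object; both the variable name and the `load_function_ids` helper are hypothetical choices for illustration.

```python
import json
import os

def load_function_ids(env_var="NYCKEL_FUNCTIONS"):
    """Read a {name: function_id} mapping from a JSON environment variable."""
    raw = os.environ.get(env_var)
    if raw is None:
        raise RuntimeError(f"{env_var} is not set")
    functions = json.loads(raw)
    if not isinstance(functions, dict):
        raise ValueError(f"{env_var} must contain a JSON object")
    return functions
```

Failing loudly at startup when the variable is missing or malformed tends to be easier to debug than a missing function ID discovered at request time.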

Rate limits

Nyckel enforces per-account rate limits. If you expect high-volume traffic, contact Nyckel support to discuss your needs. For bursty workloads, consider queuing requests rather than calling /invoke directly at peak.
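
For bursty workloads, a simple client-side throttle in the worker that drains the queue can smooth out peaks. The sketch below spaces calls evenly; the `Throttle` class is hypothetical, and the rate you pass in is your own choice, since the actual account limits are not stated here. It is not thread-safe as written.

```python
import time

class Throttle:
    """Space calls at least 1 / max_per_second seconds apart."""

    def __init__(self, max_per_second, clock=time.monotonic, sleep=time.sleep):
        self.interval = 1.0 / max_per_second
        self.clock = clock      # injectable for testing
        self.sleep = sleep      # injectable for testing
        self.next_allowed = 0.0

    def wait(self):
        """Block until the next call is allowed, then reserve the following slot."""
        now = self.clock()
        if now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.interval
```

A worker would call `throttle.wait()` immediately before each /invoke request, so a burst of queued inputs is sent at a steady pace instead of all at once.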

Note: See the Invoke a function reference page for the complete request and response schema.