
API Reference

Everything you need to integrate Shadow Labs inference into your application.

Authentication

All requests require an API key via the X-API-Key header. Keys are tied to your Stripe subscription and control rate limits, usage metering, and billing.

header
X-API-Key: sk-live-your-key-here
Tip: Keep your API key secret. Never expose it in client-side code. Rotate keys from the dashboard.

Quickstart

Get a response with a single HTTP request. Nothing to install; any HTTP client works.

curl

bash
curl -X POST https://api.shadowlabs.dev/v1/inference \
  -H "X-API-Key: sk-live-your-key" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "hello"}]
  }'

Python

python
import requests

resp = requests.post(
    "https://api.shadowlabs.dev/v1/inference",
    headers={"X-API-Key": "sk-live-your-key"},
    json={
        "messages": [{"role": "user", "content": "hello"}],
        "max_tokens": 1024
    }
)
print(resp.json()["content"])

JavaScript / Node

javascript
const res = await fetch("https://api.shadowlabs.dev/v1/inference", {
  method: "POST",
  headers: {
    "X-API-Key": "sk-live-your-key",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    messages: [{ role: "user", content: "hello" }],
    max_tokens: 1024,
  }),
});
const data = await res.json();
console.log(data.content);

POST /v1/inference

The main inference endpoint. Send messages, get a response enriched with your context and tools.


Request body

Parameter | Type | Description
messages (required) | array | Conversation messages. Each has role and content.
model | string | Model override. Default: claude-sonnet-4-20250514.
max_tokens | integer | Max response tokens. 1–8192. Default: 1024.
temperature | float | Sampling temperature. 0.0–1.0. Default: 0.7.
stream | boolean | Enable SSE streaming. Default: false.
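The documented ranges can be validated client-side before sending, so a bad request fails fast instead of returning a 422. A minimal sketch; `build_payload` is a hypothetical helper name, not part of any official SDK:

```python
def build_payload(messages, model=None, max_tokens=1024, temperature=0.7, stream=False):
    """Build a /v1/inference request body, enforcing the documented ranges."""
    if not 1 <= max_tokens <= 8192:
        raise ValueError("max_tokens must be between 1 and 8192")
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be between 0.0 and 1.0")
    payload = {
        "messages": messages,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "stream": stream,
    }
    if model is not None:
        payload["model"] = model  # omit to use the server default
    return payload
```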

Response

json
{
  "id": "req_7f3a9c2e4b81",
  "content": "The response text...",
  "model": "claude-sonnet-4-20250514",
  "usage": {
    "input_tokens": 312,
    "output_tokens": 535,
    "total_tokens": 847
  },
  "stop_reason": "end_turn"
}

GET /v1/usage

Check your current billing period token usage.


No request body. Auth via X-API-Key header.

json
{
  "customer_id": "cus_abc123",
  "period": "2026-03",
  "total_tokens": 284750,
  "message": "Usage is eventually consistent."
}
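A stdlib-only sketch of calling this endpoint; `get_usage` and `tokens_remaining` are illustrative names, and because usage is eventually consistent, any remaining-quota figure should be treated as approximate:

```python
import json
import urllib.request

def get_usage(api_key, base="https://api.shadowlabs.dev"):
    """Fetch current-period usage. Auth is the same X-API-Key header."""
    req = urllib.request.Request(f"{base}/v1/usage", headers={"X-API-Key": api_key})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def tokens_remaining(usage, monthly_quota):
    """Approximate remaining quota, given your plan's monthly token limit."""
    return max(monthly_quota - usage["total_tokens"], 0)
```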

GET /v1/health


No auth required. Returns service status.

json
{
  "status": "ok",
  "model": "claude-sonnet-4-20250514"
}

Error Handling

Standard HTTP status codes. Errors return JSON with a detail field.

Status | Meaning | Action
401 | Invalid API key | Check the X-API-Key header
422 | Invalid request | Check message format and parameter ranges
429 | Rate limited | Back off and retry with exponential delay
500 | Server error | Contact support if persistent
502 | Upstream error | Retry; transient model failure
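One way to act on these codes is a small dispatch helper. `classify_error` and `error_detail` are hypothetical names; the retryable set follows the table above:

```python
RETRYABLE = {429, 502}  # rate limits and transient upstream failures

def classify_error(status):
    """Map an HTTP error status to a next step, per the table above."""
    if status == 401:
        return "fix_auth"      # check the X-API-Key header
    if status == 422:
        return "fix_request"   # check message format and parameter ranges
    if status in RETRYABLE:
        return "retry"         # back off and retry
    return "escalate"          # 500s: contact support if persistent

def error_detail(body):
    """Error responses carry a JSON 'detail' field explaining what went wrong."""
    return body.get("detail", "unknown error")
```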

Streaming (SSE)

Set "stream": true to receive Server-Sent Events. Token metering accumulates during the stream and reports on completion.

event stream
data: {"type": "content_delta", "delta": "Hello"}
data: {"type": "content_delta", "delta": " world"}
data: {"type": "message_stop", "usage": {"total_tokens": 42}}
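A stream like the one above can be consumed by decoding each `data:` line and joining the deltas. A sketch; `parse_sse_line` and `accumulate` are illustrative helpers that assume each event arrives as one complete line:

```python
import json

def parse_sse_line(line):
    """Return the decoded JSON payload of a 'data:' line, or None otherwise."""
    if not line.startswith("data:"):
        return None
    return json.loads(line[len("data:"):].strip())

def accumulate(lines):
    """Join content deltas; capture usage from the final message_stop event."""
    parts, usage = [], None
    for line in lines:
        event = parse_sse_line(line)
        if event is None:
            continue  # comments, blank keep-alives, other fields
        if event["type"] == "content_delta":
            parts.append(event["delta"])
        elif event["type"] == "message_stop":
            usage = event.get("usage")
    return "".join(parts), usage
```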

Rate Limits

Enforced per API key. Tier depends on your plan.

Plan | Req / min | Tokens / month
Starter | 20 | 10,000
Pro | 120 | 500K + overage
Scale | Custom | Custom

When rate-limited you'll get a 429 with a Retry-After header.
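A sketch of honoring Retry-After, falling back to capped exponential delay when the header is absent. Helper names are illustrative; the loop uses only the standard library:

```python
import json
import time
import urllib.error
import urllib.request

def retry_wait(retry_after, attempt, cap=60):
    """Seconds to sleep: honor Retry-After if present, else exponential backoff."""
    if retry_after is not None:
        return min(float(retry_after), cap)
    return min(2 ** attempt, cap)

def post_with_retry(url, api_key, payload, max_attempts=5):
    """POST to the inference endpoint, retrying on 429 responses."""
    body = json.dumps(payload).encode()
    headers = {"X-API-Key": api_key, "Content-Type": "application/json"}
    for attempt in range(max_attempts):
        req = urllib.request.Request(url, data=body, headers=headers)
        try:
            with urllib.request.urlopen(req) as resp:
                return json.load(resp)
        except urllib.error.HTTPError as err:
            if err.code != 429 or attempt == max_attempts - 1:
                raise  # non-rate-limit error, or out of attempts
            time.sleep(retry_wait(err.headers.get("Retry-After"), attempt))
```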

Shadow Labs API Docs · v1.0 · shadowlabs.dev