API Reference
Everything you need to integrate Shadow Labs inference into your application.
Authentication
All requests require an API key via the X-API-Key header. Keys are tied to your Stripe subscription and control rate limits, usage metering, and billing.
X-API-Key: sk-live-your-key-here
Quickstart
Get a response in three lines. Install nothing — just send HTTP.
curl
```bash
curl -X POST https://api.shadowlabs.dev/v1/inference \
  -H "X-API-Key: sk-live-your-key" \
  -H "Content-Type: application/json" \
  -d '{ "messages": [{"role": "user", "content": "hello"}] }'
```
Python
```python
import requests

resp = requests.post(
    "https://api.shadowlabs.dev/v1/inference",
    headers={"X-API-Key": "sk-live-your-key"},
    json={
        "messages": [{"role": "user", "content": "hello"}],
        "max_tokens": 1024,
    },
)
print(resp.json()["content"])
```
JavaScript / Node
```javascript
const res = await fetch("https://api.shadowlabs.dev/v1/inference", {
  method: "POST",
  headers: {
    "X-API-Key": "sk-live-your-key",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    messages: [{ role: "user", content: "hello" }],
    max_tokens: 1024,
  }),
});
const data = await res.json();
console.log(data.content);
```
POST /v1/inference
The main inference endpoint. Send messages, get a response enriched with your context and tools.
Request body
| Parameter | Type | Description |
|---|---|---|
| messages (required) | array | Conversation messages. Each has role and content. |
| model | string | Model override. Default: claude-sonnet-4-20250514. |
| max_tokens | integer | Max response tokens. 1–8192. Default: 1024. |
| temperature | float | Sampling temperature. 0.0–1.0. Default: 0.7. |
| stream | boolean | Enable SSE streaming. Default: false. |
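A request body using every documented parameter might be assembled like this. The `build_payload` helper is purely illustrative (not part of the API); the range checks mirror the table above, so invalid values fail before the request is sent:

```python
def build_payload(messages, model=None, max_tokens=1024,
                  temperature=0.7, stream=False):
    """Build a /v1/inference request body, validating parameter ranges
    client-side (a sketch; the server enforces the same limits via 422)."""
    if not 1 <= max_tokens <= 8192:
        raise ValueError("max_tokens must be in 1-8192")
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be in 0.0-1.0")
    payload = {
        "messages": messages,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "stream": stream,
    }
    if model is not None:  # omit to use the default model
        payload["model"] = model
    return payload

payload = build_payload([{"role": "user", "content": "hello"}], temperature=0.2)
```

Omitting `model` lets the service pick its default, so the key is only included when you explicitly override it.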
Response
```json
{
  "id": "req_7f3a9c2e4b81",
  "content": "The response text...",
  "model": "claude-sonnet-4-20250514",
  "usage": {
    "input_tokens": 312,
    "output_tokens": 535,
    "total_tokens": 847
  },
  "stop_reason": "end_turn"
}
```

GET /v1/usage
Check your current billing period token usage.
No request body. Auth via X-API-Key header.
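Since reported usage is eventually consistent, a common pattern is to treat the returned total as an estimate and compute remaining quota client-side. `tokens_remaining` is a hypothetical helper (not an API field); the plan limit comes from your tier in the Rate Limits table:

```python
def tokens_remaining(usage: dict, plan_limit: int) -> int:
    """Estimate tokens left this billing period from a /v1/usage response.
    Usage is eventually consistent, so treat this as approximate."""
    return max(plan_limit - usage["total_tokens"], 0)

usage = {"customer_id": "cus_abc123", "period": "2026-03",
         "total_tokens": 284750}
print(tokens_remaining(usage, 500_000))  # 215250 left on a 500K Pro plan
```

A successful response looks like: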
```json
{
  "customer_id": "cus_abc123",
  "period": "2026-03",
  "total_tokens": 284750,
  "message": "Usage is eventually consistent."
}
```

GET /v1/health
No auth required. Returns service status.
```json
{
  "status": "ok",
  "model": "claude-sonnet-4-20250514"
}
```

Error Handling
Standard HTTP status codes. Errors return JSON with a detail field.
| Status | Meaning | Action |
|---|---|---|
| 401 | Invalid API key | Check X-API-Key header |
| 422 | Invalid request | Check message format + param ranges |
| 429 | Rate limited | Back off, retry with exponential delay |
| 502 | Upstream error | Retry — transient model failure |
| 500 | Server error | Contact support if persistent |
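The table above maps directly onto a small client-side dispatcher. This sketch (the `classify` helper is illustrative, not part of any SDK) returns the recommended action for each status code:

```python
# Statuses worth retrying per the table: rate limits and transient
# upstream failures.
RETRYABLE = {429, 502}

def classify(status: int) -> str:
    """Map an HTTP status from the API to the documented action."""
    if status == 401:
        return "check X-API-Key header"
    if status == 422:
        return "fix message format or parameter ranges"
    if status in RETRYABLE:
        return "retry with backoff"
    if status >= 500:
        return "contact support if persistent"
    return "ok"
```

Note that 502 is checked against `RETRYABLE` before the generic `>= 500` branch, so transient upstream failures are retried rather than escalated.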
Streaming (SSE)
Set "stream": true to receive Server-Sent Events. Token metering accumulates during the stream and reports on completion.
```
data: {"type": "content_delta", "delta": "Hello"}

data: {"type": "content_delta", "delta": " world"}

data: {"type": "message_stop", "usage": {"total_tokens": 42}}
```
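Decoding the stream amounts to parsing the JSON after each `data: ` prefix and concatenating the `content_delta` pieces until `message_stop` arrives. A minimal sketch (a real client would iterate over the HTTP response stream, e.g. `requests`' `iter_lines`, rather than a list):

```python
import json

def parse_sse(lines):
    """Yield decoded events from 'data: ...' lines of the SSE stream."""
    for line in lines:
        if line.startswith("data: "):
            yield json.loads(line[len("data: "):])

events = list(parse_sse([
    'data: {"type": "content_delta", "delta": "Hello"}',
    'data: {"type": "content_delta", "delta": " world"}',
    'data: {"type": "message_stop", "usage": {"total_tokens": 42}}',
]))

# Concatenate the deltas; the final message_stop event carries usage.
text = "".join(e["delta"] for e in events if e["type"] == "content_delta")
print(text)  # Hello world
```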
Rate Limits
Enforced per API key. Tier depends on your plan.
| Plan | Req / min | Tokens / month |
|---|---|---|
| Starter | 20 | 10,000 |
| Pro | 120 | 500K + overage |
| Scale | Custom | Custom |
When rate-limited you'll get a 429 with a Retry-After header.
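A retry loop can honor `Retry-After` when the server provides it and otherwise fall back to exponential backoff with jitter. The `backoff_delay` helper and its constants (1 s base, 60 s cap) are a sketch, not prescribed by the API:

```python
import random

def backoff_delay(attempt: int, retry_after=None) -> float:
    """Seconds to wait before retry `attempt` (0-based).
    Honors a server-supplied Retry-After; otherwise exponential
    backoff (2**attempt seconds) with up to 1s of jitter, capped at 60s."""
    if retry_after is not None:
        return float(retry_after)
    return min(60.0, (2 ** attempt) + random.random())
```

In practice you would call this after each 429 or 502, passing the parsed `Retry-After` header from the response when present, and give up after a bounded number of attempts.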
Shadow Labs API Docs · v1.0 · shadowlabs.dev