API Reference
Everything you need to integrate Shadow Labs inference into your application.
Authentication
All requests require an API key via the X-API-Key header. Keys are tied to your Stripe subscription and control rate limits, usage metering, and billing.
X-API-Key: sk-live-your-key-here
Quickstart
Get a response in three lines. Install nothing — just send HTTP.
curl
```bash
curl -X POST https://api.shadowlabs.dev/v1/inference \
  -H "X-API-Key: sk-live-your-key" \
  -H "Content-Type: application/json" \
  -d '{ "messages": [{"role": "user", "content": "hello"}] }'
```
Python
```python
import requests

resp = requests.post(
    "https://api.shadowlabs.dev/v1/inference",
    headers={"X-API-Key": "sk-live-your-key"},
    json={
        "messages": [{"role": "user", "content": "hello"}],
        "max_tokens": 1024,
    },
)
print(resp.json()["content"])
```
JavaScript / Node
```javascript
const res = await fetch("https://api.shadowlabs.dev/v1/inference", {
  method: "POST",
  headers: {
    "X-API-Key": "sk-live-your-key",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    messages: [{ role: "user", content: "hello" }],
    max_tokens: 1024,
  }),
});
const data = await res.json();
console.log(data.content);
```
POST /v1/inference
The main inference endpoint. Send messages, get a response enriched with your context and tools.
Request body
| Parameter | Type | Description |
|---|---|---|
| messages (required) | array | Conversation messages. Each has role and content. |
| model | string | Model override. Default: claude-sonnet-4-20250514. |
| max_tokens | integer | Max response tokens. 1–8192. Default: 1024. |
| temperature | float | Sampling temperature. 0.0–1.0. Default: 0.7. |
| stream | boolean | Enable SSE streaming. Default: false. |
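A request body using every documented parameter might be assembled like this. The `build_payload` helper is purely illustrative (not part of the API); the range checks mirror the table above, so invalid values fail before the request is sent:

```python
def build_payload(messages, model=None, max_tokens=1024,
                  temperature=0.7, stream=False):
    """Build a /v1/inference request body, validating parameter ranges
    client-side (a sketch; the server enforces the same limits via 422)."""
    if not 1 <= max_tokens <= 8192:
        raise ValueError("max_tokens must be in 1-8192")
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be in 0.0-1.0")
    payload = {
        "messages": messages,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "stream": stream,
    }
    if model is not None:  # omit to use the default model
        payload["model"] = model
    return payload

payload = build_payload([{"role": "user", "content": "hello"}], temperature=0.2)
```

Omitting `model` lets the service pick its default, so the key is only included when you explicitly override it.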
Response
```json
{
  "id": "req_7f3a9c2e4b81",
  "content": "The response text...",
  "model": "claude-sonnet-4-20250514",
  "usage": {
    "input_tokens": 312,
    "output_tokens": 535,
    "total_tokens": 847
  },
  "stop_reason": "end_turn"
}
```

GET /v1/usage
Check your current billing period token usage.
No request body. Auth via X-API-Key header.
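Since reported usage is eventually consistent, a common pattern is to treat the returned total as an estimate and compute remaining quota client-side. `tokens_remaining` is a hypothetical helper (not an API field); the plan limit comes from your tier in the Rate Limits table:

```python
def tokens_remaining(usage: dict, plan_limit: int) -> int:
    """Estimate tokens left this billing period from a /v1/usage response.
    Usage is eventually consistent, so treat this as approximate."""
    return max(plan_limit - usage["total_tokens"], 0)

usage = {"customer_id": "cus_abc123", "period": "2026-03",
         "total_tokens": 284750}
print(tokens_remaining(usage, 500_000))  # 215250 left on a 500K Pro plan
```

A successful response looks like: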
```json
{
  "customer_id": "cus_abc123",
  "period": "2026-03",
  "total_tokens": 284750,
  "message": "Usage is eventually consistent."
}
```

GET /v1/health
No auth required. Returns service status.
```json
{
  "status": "ok",
  "model": "claude-sonnet-4-20250514"
}
```

Error Handling
Standard HTTP status codes. Errors return JSON with a detail field.
| Status | Meaning | Action |
|---|---|---|
| 401 | Invalid API key | Check X-API-Key header |
| 422 | Invalid request | Check message format + param ranges |
| 429 | Rate limited | Back off, retry with exponential delay |
| 502 | Upstream error | Retry — transient model failure |
| 500 | Server error | Contact support if persistent |
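The table above maps directly onto a small client-side dispatcher. This sketch (the `classify` helper is illustrative, not part of any SDK) returns the recommended action for each status code:

```python
# Statuses worth retrying per the table: rate limits and transient
# upstream failures.
RETRYABLE = {429, 502}

def classify(status: int) -> str:
    """Map an HTTP status from the API to the documented action."""
    if status == 401:
        return "check X-API-Key header"
    if status == 422:
        return "fix message format or parameter ranges"
    if status in RETRYABLE:
        return "retry with backoff"
    if status >= 500:
        return "contact support if persistent"
    return "ok"
```

Note that 502 is checked against `RETRYABLE` before the generic `>= 500` branch, so transient upstream failures are retried rather than escalated.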
Streaming (SSE)
Set "stream": true to receive Server-Sent Events. Token metering accumulates during the stream and reports on completion.
```
data: {"type": "content_delta", "delta": "Hello"}

data: {"type": "content_delta", "delta": " world"}

data: {"type": "message_stop", "usage": {"total_tokens": 42}}
```
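Decoding the stream amounts to parsing the JSON after each `data: ` prefix and concatenating the `content_delta` pieces until `message_stop` arrives. A minimal sketch (a real client would iterate over the HTTP response stream, e.g. `requests`' `iter_lines`, rather than a list):

```python
import json

def parse_sse(lines):
    """Yield decoded events from 'data: ...' lines of the SSE stream."""
    for line in lines:
        if line.startswith("data: "):
            yield json.loads(line[len("data: "):])

events = list(parse_sse([
    'data: {"type": "content_delta", "delta": "Hello"}',
    'data: {"type": "content_delta", "delta": " world"}',
    'data: {"type": "message_stop", "usage": {"total_tokens": 42}}',
]))

# Concatenate the deltas; the final message_stop event carries usage.
text = "".join(e["delta"] for e in events if e["type"] == "content_delta")
print(text)  # Hello world
```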
Rate Limits
Enforced per API key. Tier depends on your plan.
| Plan | Req / min | Tokens / month |
|---|---|---|
| Starter | 20 | 10,000 |
| Pro | 120 | 500K + overage |
| Scale | Custom | Custom |
When rate-limited you'll get a 429 with a Retry-After header.
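A retry loop can honor `Retry-After` when the server provides it and otherwise fall back to exponential backoff with jitter. The `backoff_delay` helper and its constants (1 s base, 60 s cap) are a sketch, not prescribed by the API:

```python
import random

def backoff_delay(attempt: int, retry_after=None) -> float:
    """Seconds to wait before retry `attempt` (0-based).
    Honors a server-supplied Retry-After; otherwise exponential
    backoff (2**attempt seconds) with up to 1s of jitter, capped at 60s."""
    if retry_after is not None:
        return float(retry_after)
    return min(60.0, (2 ** attempt) + random.random())
```

In practice you would call this after each 429 or 502, passing the parsed `Retry-After` header from the response when present, and give up after a bounded number of attempts.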
Shadow Labs API Docs · v1.0 · shadowlabs.dev