Gateway API
The Crafted AI gateway exposes a single OpenAI-compatible endpoint:
POST /v1/chat/completions

It accepts the same request shape as https://api.openai.com/v1/chat/completions
and returns the same response shape — including for Anthropic models, which
the gateway translates transparently.
Base URL
| Environment | Base URL |
|---|---|
| Staging | https://api-staging.craftedai.co/v1 |
| Production | https://api.craftedai.co/v1 (coming soon) |
Authentication
Send your gateway key in the Authorization header as a Bearer token:

```
Authorization: Bearer cai_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
```

Gateway keys are minted in the control panel and look like cai_…. They
authenticate the request and identify the organisation, key, and any
configured rate limits — no separate org header is needed.
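As a minimal sketch, building the header looks like this (the `auth_header` helper and its prefix check are illustrative, not part of any SDK):

```python
def auth_header(key: str) -> dict:
    """Build the Authorization header for a gateway request."""
    # Gateway keys start with "cai_"; failing fast here catches the common
    # mistake of passing a provider key instead of a gateway key.
    if not key.startswith("cai_"):
        raise ValueError("expected a cai_… gateway key")
    return {"Authorization": f"Bearer {key}"}
```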
Request
Bodies are JSON, capped at 4 MB. The shape mirrors OpenAI’s
chat.completions.create parameters.
```shell
curl "https://api-staging.craftedai.co/v1/chat/completions" \
  -H "Authorization: Bearer $CRAFTED_AI_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      { "role": "system", "content": "You are concise." },
      { "role": "user", "content": "Summarise SOLID in one sentence." }
    ],
    "temperature": 0.2
  }'
```

```typescript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.CRAFTED_AI_KEY,
  baseURL: 'https://api-staging.craftedai.co/v1',
});

const res = await client.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [
    { role: 'system', content: 'You are concise.' },
    { role: 'user', content: 'Summarise SOLID in one sentence.' },
  ],
  temperature: 0.2,
});
```

```python
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["CRAFTED_AI_KEY"],
    base_url="https://api-staging.craftedai.co/v1",
)

res = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are concise."},
        {"role": "user", "content": "Summarise SOLID in one sentence."},
    ],
    temperature=0.2,
)
```

Streaming
Set stream: true to receive Server-Sent Events. The gateway injects
stream_options.include_usage = true automatically, so the final SSE chunk
carries the same usage block you would see on a non-streamed response. This
keeps usage tracking accurate without any client-side change.

```typescript
const stream = await client.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'Tell me a joke.' }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? '');
}
```

Errors
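Client-side, accumulating the deltas and picking up the injected usage block can be sketched as follows. The chunk dicts here are hand-written stand-ins shaped like OpenAI streaming payloads, not live gateway output:

```python
def collect_stream(chunks):
    """Join delta text and capture the usage block from the final chunk.

    `chunks` is an iterable of dicts shaped like OpenAI streaming payloads;
    with include_usage injected by the gateway, the last chunk carries usage.
    """
    parts, usage = [], None
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            parts.append(choice.get("delta", {}).get("content") or "")
        if chunk.get("usage"):
            usage = chunk["usage"]
    return "".join(parts), usage
```

This is the same accumulation an SDK performs internally; the point is that no extra request option is needed to receive usage.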
Errors follow the OpenAI envelope so existing SDK error-handling continues to work. The shape is:

```json
{
  "error": {
    "message": "The model `gpt-foo` does not exist or is not configured for your organization.",
    "type": "invalid_request_error",
    "param": "model",
    "code": "model_not_found"
  }
}
```

Common error codes:
| HTTP | code | Meaning |
|---|---|---|
| 400 | invalid_request_error | Body failed validation. |
| 400 | unsupported_parameter | A field is recognised but not accepted (e.g. legacy params). |
| 401 | invalid_api_key | Missing or unrecognised cai_… key. |
| 404 | model_not_found | Model is not configured for your organisation, or doesn’t exist. |
| 429 | rate_limit_exceeded | Per-key RPM or daily-token cap hit. |
| 502 | upstream_unavailable | The upstream provider returned an unrecoverable error. |
| 503 | rate_limiter_unavailable | Rate-limiter Redis outage; gateway fails closed for billing safety. |
Plugins and tools
User-supplied tools[] are passed through to the upstream provider unchanged.
Crafted AI plugin orchestration only activates when your organisation has
plugins configured and the request omits tools.
When orchestration is active, the gateway can inject first-party and MCP-backed
plugin tools, execute selected tool calls server-side, append tool results, and
continue the model loop until a final assistant response is produced. Tool names
are prefixed with the plugin slug, such as support-crm__lookup_customer.
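Given that naming convention, recovering the plugin slug and tool name from a prefixed identifier is a simple split on the double underscore (the `split_tool_name` helper is illustrative, not part of the gateway SDK):

```python
def split_tool_name(name: str) -> tuple[str, str]:
    """Split a "<plugin-slug>__<tool>" identifier into its two parts."""
    slug, sep, tool = name.partition("__")
    if not sep:
        raise ValueError(f"not a plugin-prefixed tool name: {name!r}")
    return slug, tool
```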
For WhatsApp inbound replies, the reply worker calls the same gateway runtime
with the WhatsApp app’s enabledPluginIds, so only the selected plugins are
eligible for injection.
Read Plugins for the runtime rules and MCP integrations for installation.