Quickstart

Crafted AI is a drop-in gateway for OpenAI-compatible chat completions. Point the OpenAI SDK at our base URL, swap the API key for a cai_… gateway key, and you get unified usage tracking, BYOK provider keys, and per-key rate limits — without touching the rest of your code.

1. Create an account

2. Add a provider credential (BYOK)

Open Credentials → Provider keys in the control panel and paste an OpenAI or Anthropic API key. Provider keys are encrypted at rest with the gateway’s MODEL_CONFIG_ENCRYPTION_KEY and are never returned by the API after they are stored.

3. Mint a gateway key

Open Credentials → Gateway keys → New key. Copy the cai_… value — it is shown exactly once. Optionally set a requests-per-minute (RPM) cap and a daily token cap for the key. Treat the value like any other secret.

4. Configure your client

Set two environment variables:

export CRAFTED_AI_KEY="cai_..."
export CRAFTED_AI_BASE_URL="https://api-staging.craftedai.co/v1"
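A misconfigured variable usually shows up only as an authentication or 404 error on the first request, so it can help to check both values up front. The following is a minimal sketch, not part of the gateway or the SDK. `load_gateway_config` is a hypothetical helper, and its checks (a `cai_` prefix, an `https` URL ending in `/v1`) are assumptions drawn from the example values above.

```python
import os
from urllib.parse import urlparse


def load_gateway_config(env=os.environ):
    """Read and sanity-check the two variables from step 4 (hypothetical helper)."""
    key = env.get("CRAFTED_AI_KEY", "")
    base_url = env.get("CRAFTED_AI_BASE_URL", "").rstrip("/")

    # Gateway keys are shown as cai_... values; anything else is probably
    # a provider key pasted by mistake.
    if not key.startswith("cai_"):
        raise ValueError("CRAFTED_AI_KEY should be a cai_... gateway key")

    parsed = urlparse(base_url)
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError("CRAFTED_AI_BASE_URL should be an https:// URL")
    # Assumption: the base URL ends in /v1, as in the export above.
    if not parsed.path.endswith("/v1"):
        raise ValueError("CRAFTED_AI_BASE_URL should end with /v1")

    return {"api_key": key, "base_url": base_url}
```

The returned dictionary uses the same keyword names as the Python client constructor in step 5, so it can be passed as `OpenAI(**load_gateway_config())`.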

5. Send your first request

curl:

curl "$CRAFTED_AI_BASE_URL/chat/completions" \
  -H "Authorization: Bearer $CRAFTED_AI_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      { "role": "user", "content": "Say hello in one short sentence." }
    ]
  }'

Node.js:

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.CRAFTED_AI_KEY,
  baseURL: process.env.CRAFTED_AI_BASE_URL,
});

const res = await client.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'Say hello in one short sentence.' }],
});

console.log(res.choices[0]?.message?.content);

Python:

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["CRAFTED_AI_KEY"],
    base_url=os.environ["CRAFTED_AI_BASE_URL"],
)

res = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Say hello in one short sentence."}],
)

print(res.choices[0].message.content)
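If the gateway key carries an RPM cap (step 3), a burst of requests may be rejected. This page does not document the rejection format, so the sketch below assumes the conventional HTTP 429 status and retries with exponential backoff. It uses only the Python standard library; `post_chat` and `with_backoff` are hypothetical helper names, not gateway or SDK functions.

```python
import json
import time
import urllib.error
import urllib.request


def post_chat(payload, base_url, api_key):
    """POST to /chat/completions, mirroring the curl call above."""
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def with_backoff(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send(); on HTTP 429 wait base_delay, then double it, and retry."""
    for attempt in range(max_attempts):
        try:
            return send()
        except urllib.error.HTTPError as err:
            # Assumption: an exceeded cap is signalled with status 429.
            # Any other status, or the final attempt, is re-raised unchanged.
            if err.code != 429 or attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

A call would look like `with_backoff(lambda: post_chat(payload, base_url, api_key))`, where `payload` is the same JSON body as in the curl example. If the gateway turns out to return a `Retry-After` header, honouring it would be preferable to a fixed schedule.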

6. Inspect usage

Open app.craftedai.co/usage to see the call appear with token counts and cost, broken down by gateway key and model.
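To cross-check the dashboard against what your client saw, you can total the `usage` block that OpenAI-style chat completion responses carry. This is a rough sketch: `summarize_usage` is a hypothetical helper, and it assumes the gateway passes the standard `prompt_tokens`, `completion_tokens`, and `total_tokens` fields through unchanged. Cost and latency are not part of the response body here and are left to the usage page.

```python
def summarize_usage(responses):
    """Sum the usage blocks of OpenAI-style chat completion responses.

    Each item is a parsed JSON response body (a dict), for example the
    decoded output of the curl call in step 5.
    """
    totals = {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0}
    for res in responses:
        # Responses without a usage block contribute nothing.
        usage = res.get("usage") or {}
        for field in totals:
            totals[field] += usage.get(field, 0)
    return totals
```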

What you get for free

Drop-in OpenAI compatibility — works with the official OpenAI SDKs and any OpenAI-compatible client.
Anthropic translation — request a Claude model and the gateway adapts the request and response on the fly.
Usage tracking — every call is recorded with prompt/completion tokens, cost, and latency, scoped to your organisation, gateway key, and model.
Streaming — set stream: true and the gateway forwards Server-Sent Events with stream_options.include_usage already injected so the final chunk carries token counts.
Plugin runtime — omit tools[] and the gateway can inject configured organisation plugins, including MCP-backed tools and first-party plugins.
WhatsApp MVP path — configure a Twilio sender, model, system prompt, and selected plugins so inbound WhatsApp messages can reuse the same gateway.
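As an illustration of the streaming behaviour described above, the sketch below folds a sequence of Server-Sent Event lines into the final text and the token counts. It assumes the gateway forwards chunks in the OpenAI streaming shape: `data: {...}` lines with `choices[].delta.content`, a final usage chunk with an empty `choices` list, and a closing `data: [DONE]`. `collect_stream` is a hypothetical helper name.

```python
import json


def collect_stream(lines):
    """Fold OpenAI-style SSE lines into (text, usage)."""
    text_parts = []
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and SSE comment lines
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        # Content chunks carry a delta; the usage chunk has empty choices.
        for choice in chunk.get("choices", []):
            piece = (choice.get("delta") or {}).get("content")
            if piece:
                text_parts.append(piece)
        if chunk.get("usage"):
            usage = chunk["usage"]
    return "".join(text_parts), usage
```

The function accepts any iterable of decoded text lines, so it can be fed from an HTTP response read line by line or from a recorded stream in a test.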

Next: read the Gateway API reference for the full request and error envelope, configure WhatsApp, connect MCP integrations, or jump to Credentials for rotation and rate-limit guidance.