For API Integrators
Casola exposes OpenAI-compatible endpoints, so existing SDKs and tools work without changes. Create a token, make your first call, and set up a workflow — all in about ten minutes.
Create an API token
- Sign in at casola.ai and go to Tokens (/tokens).
- Click Create token. Select the User Access scope group (user:read+user:write) — this covers all inference and data operations.
- Copy the token immediately. It’s only displayed once.
For service accounts or CI pipelines, admins can create tokens with narrower scopes. See the API Tokens guide and Scopes reference for the full list.
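The examples below export the token as CASOLA_API_KEY. In application code it helps to read that variable once and fail fast with an actionable error instead of hitting a 401 later. A minimal sketch (the helper name is ours, not part of any Casola SDK):

```python
import os

def get_casola_token() -> str:
    """Read the Casola API token from the environment, failing fast if unset."""
    token = os.environ.get("CASOLA_API_KEY")
    if not token:
        raise RuntimeError(
            "CASOLA_API_KEY is not set. Create a token at casola.ai (/tokens) "
            'and run: export CASOLA_API_KEY="csl_..."'
        )
    return token
```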
Make your first API call
Casola’s base URL is https://api.casola.ai. All OpenAI-compatible endpoints live under /openai/v1/. Set your token as an environment variable to use in the examples below:

```shell
export CASOLA_API_KEY="csl_your_token_here"
```

Chat completion

The most common starting point — send a message to an LLM:

```shell
curl https://api.casola.ai/openai/v1/chat/completions \
  -H "Authorization: Bearer $CASOLA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen3.5-4B",
    "messages": [{"role": "user", "content": "What is Casola?"}]
  }'
```

Response:

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "Qwen/Qwen3.5-4B",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Casola is a distributed AI inference platform..."},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 12, "completion_tokens": 25, "total_tokens": 37}
}
```
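The fields your code usually needs from this response are the assistant text and the token usage. A small stdlib-only sketch, assuming only the response shape shown above:

```python
import json

def extract_reply(response_json: str) -> tuple[str, int]:
    """Pull the assistant text and total token count from a chat completion response."""
    body = json.loads(response_json)
    reply = body["choices"][0]["message"]["content"]
    total_tokens = body["usage"]["total_tokens"]
    return reply, total_tokens
```

For anything beyond quick scripts, the SDK clients below do this parsing for you.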
Image generation

```shell
curl https://api.casola.ai/openai/v1/images/generations \
  -H "Authorization: Bearer $CASOLA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "flux",
    "prompt": "a neon-lit alley in the rain",
    "size": "1024x1024"
  }'
```

Response:

```json
{
  "created": 1711234567,
  "data": [{"url": "https://cdn.casola.ai/outputs/img_abc123.png"}]
}
```
Python (OpenAI SDK)

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.casola.ai/openai/v1",
    api_key="YOUR_TOKEN",
)

# Chat
chat = client.chat.completions.create(
    model="Qwen/Qwen3.5-4B",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(chat.choices[0].message.content)

# Image
image = client.images.generate(
    model="flux",
    prompt="a neon-lit alley in the rain",
    size="1024x1024",
)
print(image.data[0].url)
```
TypeScript (OpenAI SDK)

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.casola.ai/openai/v1",
  apiKey: "YOUR_TOKEN",
});

// Chat
const chat = await client.chat.completions.create({
  model: "Qwen/Qwen3.5-4B",
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(chat.choices[0].message.content);

// Image
const image = await client.images.generate({
  model: "flux",
  prompt: "a neon-lit alley in the rain",
  size: "1024x1024",
});
console.log(image.data[0].url);
```
Use async requests for long-running jobs

Video generation and other heavy tasks can take longer than a typical HTTP timeout. Add "async": true to your request body to get a 202 response with a request ID, then poll for the result.
Submit an async request
```shell
curl -X POST https://api.casola.ai/openai/v1/images/generations \
  -H "Authorization: Bearer $CASOLA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "flux", "prompt": "a sunset timelapse", "async": true}'
```

Response (202):

```json
{
  "id": "req_abc123",
  "object": "image.generation.async",
  "status": "pending"
}
```
Poll for the result

```shell
curl https://api.casola.ai/fal/requests/req_abc123 \
  -H "Authorization: Bearer $CASOLA_API_KEY"
```

While processing:

```json
{"request_id": "req_abc123", "status": "processing"}
```

When complete:

```json
{
  "request_id": "req_abc123",
  "status": "completed",
  "images": [{"url": "https://cdn.casola.ai/outputs/img_abc123.png", "width": 1024, "height": 1024}]
}
```
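The submit-and-poll flow can be wrapped in a small helper. This is a sketch rather than an official client: fetch_status stands in for the GET /fal/requests/{id} call above, the pending/processing/completed statuses are taken from the responses shown, and treating any other status as terminal is our assumption (check the API reference for the exact failure states).

```python
import time
from typing import Callable

def wait_for_result(
    fetch_status: Callable[[], dict],
    poll_interval: float = 2.0,
    timeout: float = 300.0,
) -> dict:
    """Poll an async Casola request until it completes or the timeout elapses.

    fetch_status should GET /fal/requests/{request_id} (with the Authorization
    header) and return the parsed JSON body.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        body = fetch_status()
        status = body.get("status")
        if status == "completed":
            return body
        if status not in ("pending", "processing"):
            # Any other status is treated as terminal (exact states: see API reference).
            raise RuntimeError(f"request ended with status {status!r}: {body}")
        time.sleep(poll_interval)
    raise TimeoutError("async request did not complete before the timeout")
```

For production polling, consider increasing poll_interval for long video jobs to stay well under any rate limits.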
Set up a workflow

Workflows let you chain multiple models into a pipeline — for example, generate an image, then upscale it, then convert to video.
- Go to Workflows (/workflows/new) in the UI to build a workflow visually.
- Or create one via the API — see the Workflows guide for the DAG format and execution endpoints.
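Client-side, the same chaining idea is just sequential calls where each step's output feeds the next step's input; a workflow expresses this declaratively as a DAG so Casola can run it for you. A toy sketch of the linear case (the step functions here are hypothetical stand-ins, not Casola endpoints):

```python
from typing import Callable

def run_pipeline(seed: str, steps: list[Callable[[str], str]]) -> str:
    """Run a linear pipeline: each step's output becomes the next step's input.

    In practice each step would call a Casola endpoint and return the asset URL
    it produced (generate -> upscale -> video, as in the example above).
    """
    value = seed
    for step in steps:
        value = step(value)
    return value
```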
Evaluate models
Browse available models and their current status at /models or via the API:

```shell
curl https://api.casola.ai/openai/v1/models \
  -H "Authorization: Bearer $CASOLA_API_KEY"
```

Model status (online, warming up, standby, offline) is available at /api/model-status. See the Models reference for capabilities, latency profiles, and supported parameters.
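Since the endpoint is OpenAI-compatible, the model list should follow the OpenAI list shape ({"object": "list", "data": [{"id": ...}, ...]}). Pairing it with /api/model-status lets you route requests only to models that are currently online. A sketch, with one loud caveat: the status payload here (a plain model-to-status mapping) is an assumed shape, so check the Models reference for the real schema.

```python
def online_models(models_response: dict, status_by_model: dict[str, str]) -> list[str]:
    """Return IDs of listed models whose status is 'online'.

    models_response: parsed JSON from GET /openai/v1/models (OpenAI list shape).
    status_by_model: assumed mapping of model ID to status, e.g. from /api/model-status.
    """
    return [
        model["id"]
        for model in models_response.get("data", [])
        if status_by_model.get(model["id"]) == "online"
    ]
```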
Next steps
- API Tokens guide — scopes, rotation, and service accounts
- Workflows guide — build automated pipelines
- Models reference — all available models and their parameters
- API reference — full endpoint documentation