
Studio

Studio is Casola’s unified creative interface. It works like a chat assistant that can generate images, videos, and speech — all in one conversation. Open it from the home page or navigate to Studio in the sidebar.

[Screenshot: Studio page showing a conversation with a generated image and the mode selector]

Studio supports five modes, selectable via chips at the top of the input:

Mode    What it does
Auto    Detects the most likely modality from your input (default)
Chat    Text-only conversation — generation tools only fire if you explicitly ask
Image   Biased toward image generation
Video   Biased toward video generation
Voice   Biased toward text-to-speech

Auto mode analyzes your prompt and infers intent. When it detects a match, a hint appears next to the chip — for example, “Auto → Image”. If Auto guesses wrong, switch to an explicit mode.
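The docs don’t describe how Auto classifies prompts (the real mechanism is LLM-based). Purely to illustrate the concept, here is a minimal keyword heuristic — the hint words and function name are invented for this sketch:

```python
# Hypothetical sketch of Auto-mode modality inference.
# Casola's actual classifier is LLM-based; this keyword heuristic
# only illustrates the idea of mapping a prompt to a mode chip.

IMAGE_HINTS = ("draw", "picture", "photo", "illustration", "image")
VIDEO_HINTS = ("animate", "video", "clip", "footage")
VOICE_HINTS = ("say", "speak", "narrate", "read aloud", "voice")

def infer_mode(prompt: str) -> str:
    """Return the most likely mode chip for a prompt, defaulting to Chat."""
    p = prompt.lower()
    for hints, mode in ((IMAGE_HINTS, "Image"),
                        (VIDEO_HINTS, "Video"),
                        (VOICE_HINTS, "Voice")):
        if any(h in p for h in hints):
            return mode
    return "Chat"
```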

The quality dropdown in the top bar controls generation parameters across all modalities:

Level      Images                    Video
Fast       15 steps, 3.5 guidance    20 steps
Balanced   Model defaults            Model defaults
Max        40 steps, 7.5 guidance    50 steps, 7.5 guidance

Higher step counts produce more detail at the cost of longer generation time. Higher guidance scale makes the output follow your prompt more closely. Quality applies to all media generated in that conversation.
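The quality table can be read as a parameter lookup. In this sketch the values come from the table above, but the key names (`steps`, `guidance`) are assumptions, not Casola’s documented field names:

```python
# Quality presets from the table above, as a lookup.
# Key names ("steps", "guidance") are assumed; values are from the docs.
# An empty dict means "use the model's defaults" (Balanced).

QUALITY_PRESETS = {
    "fast":     {"image": {"steps": 15, "guidance": 3.5},
                 "video": {"steps": 20}},
    "balanced": {"image": {}, "video": {}},
    "max":      {"image": {"steps": 40, "guidance": 7.5},
                 "video": {"steps": 50, "guidance": 7.5}},
}

def params_for(level: str, modality: str) -> dict:
    """Return generation overrides for a quality level and modality."""
    return QUALITY_PRESETS[level][modality]
```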

Type a prompt and press Enter. Studio sends your message to an LLM that decides which tool to call:

  • generate_image — text-to-image with aspect ratio options (square, portrait, landscape)
  • generate_video — text-to-video or image-to-video
  • generate_speech — text-to-speech with voice selection
  • transcribe_audio — audio-to-text
  • search_library — search your past creations
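These tools map naturally onto LLM function-calling schemas. A sketch of what two of them might look like — the tool names and aspect-ratio options come from this page, but the parameter shapes are assumptions:

```python
# Hypothetical function-calling schemas for two Studio tools.
# Tool names are from the docs; parameter shapes are assumed.

TOOLS = [
    {
        "name": "generate_image",
        "description": "Text-to-image generation.",
        "parameters": {
            "type": "object",
            "properties": {
                "prompt": {"type": "string"},
                "aspect_ratio": {
                    "type": "string",
                    "enum": ["square", "portrait", "landscape"],
                },
            },
            "required": ["prompt"],
        },
    },
    {
        "name": "generate_speech",
        "description": "Text-to-speech with voice selection.",
        "parameters": {
            "type": "object",
            "properties": {
                "text": {"type": "string"},
                "voice": {"type": "string"},
            },
            "required": ["text"],
        },
    },
    # generate_video, transcribe_audio, and search_library follow the same shape.
]
```

The LLM picks a tool from this list each turn (or none, in Chat mode) and Studio executes the call and streams the result back into the conversation.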

Results appear inline in the conversation. Images are clickable for a full-size view, videos play inline (Shift+click for lightbox), and audio has a built-in player.

Each media result includes a View Code button that shows the equivalent API call (curl, Python, or TypeScript). Use this to reproduce the exact generation programmatically or integrate it into your application.
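The View Code output drops straight into a script. A hedged Python sketch of what such a call might look like — the endpoint URL and payload field names below are placeholders, not Casola’s real API; copy the actual values from the View Code dialog:

```python
import json
import urllib.request

# Placeholder endpoint and payload — substitute the values that
# View Code shows for your generation.
API_URL = "https://api.example.com/v1/images/generate"  # hypothetical
payload = {
    "prompt": "a lighthouse at dusk, oil painting",
    "aspect_ratio": "landscape",
    "steps": 40,      # Max quality, per the table above
    "guidance": 7.5,
}

req = urllib.request.Request(
    API_URL,
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json",
             "Authorization": "Bearer YOUR_API_KEY"},
)
# urllib.request.urlopen(req) would send the request; it is left out
# so the sketch runs without network access or a real key.
```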

The assistant can chain multiple tools in one turn — for example, generating an image and then describing it.

The sidebar lists all your Studio sessions, sorted by most recent. Each session shows its title (taken from the first message) and last update time.

  • Click New Session to start fresh
  • Click a session to resume it
  • Delete sessions you no longer need

Sessions are saved to your Library automatically.

  • Iterate in context — ask the assistant to modify a previous result (“make it warmer”, “add a sunset background”). It uses conversation history to refine outputs.
  • Chain modalities — generate an image, then ask to animate it into a video, or describe a scene and ask for both an image and narration.
  • Switch modes mid-conversation — if Auto picks the wrong tool, switch to the explicit mode chip and resend your prompt.
  • Use View Code to graduate — prototype in Studio, then export the API call for production use.

Use Studio when you want a conversational workflow — asking the AI to generate, iterate, and combine different media types in context. Use the dedicated Image Generation, Chat, or Voice pages when you want direct control over model selection and advanced parameters.