Task Types

Task types define the input/output contract for jobs and agent nodes. Each model supports one or more task types — see the Models reference for which models support which tasks.

Overview

Task Type	Category	Sync	Description
`openai/chat-completion`	Text	Yes	Chat completion (LLM)
`openai/chat-completion/vision`	Text	Yes	Chat completion with image input
`openai/chat-completion/ocr`	OCR	Yes	OCR via vision model
`openai/embeddings`	Text	Yes	Text embeddings
`openai/rerank`	Text	Yes	Document reranking
`openai/score`	Text	Yes	Text pair similarity scoring
`openai/audio-speech`	Audio	Yes	Text-to-speech
`openai/audio-transcription`	Audio	Yes	Speech-to-text
`openai/image-generation`	Image	Yes	Image generation (OpenAI format)
`fal/text-to-image`	Image	Yes	Image generation (Fal format)
`fal/image-edit`	Image	Yes	Image editing / inpainting
`fal/text-to-video`	Video	No	Text-to-video generation
`fal/image-to-video`	Video	No	Image-to-video animation
`fal/speech-to-video`	Video	No	Speech-driven video (talking head)
`fal/audio-transcription`	Audio	Yes	Speech-to-text (Fal format)
`fal/video-interpolate`	Video	No	Frame interpolation (slow motion) — coming soon
`fal/video-upscale`	Video	No	Video super-resolution — coming soon
`fal/ocr`	OCR	Yes	Document OCR (Fal format)
`openai/chat-completion/moderation`	Moderation	Yes	Content moderation via chat model

Sync: "Yes" means the task can be dispatched synchronously, with the result returned inline. Tasks marked "No" are async only and return a request_id for polling.
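
As a minimal sketch, the Sync column of the table above can be kept as a lookup so client code knows whether to expect an inline result or a request_id. The names `SYNC_BY_TASK` and `dispatch_mode` are illustrative and not part of the API.

```python
# Sync column copied from the Overview table, keyed by task type.
# The dictionary and helper names are hypothetical, for illustration only.
SYNC_BY_TASK = {
    "openai/chat-completion": True,
    "openai/chat-completion/vision": True,
    "openai/chat-completion/ocr": True,
    "openai/embeddings": True,
    "openai/rerank": True,
    "openai/score": True,
    "openai/audio-speech": True,
    "openai/audio-transcription": True,
    "openai/image-generation": True,
    "fal/text-to-image": True,
    "fal/image-edit": True,
    "fal/text-to-video": False,
    "fal/image-to-video": False,
    "fal/speech-to-video": False,
    "fal/audio-transcription": True,
    "fal/video-interpolate": False,
    "fal/video-upscale": False,
    "fal/ocr": True,
    "openai/chat-completion/moderation": True,
}


def dispatch_mode(task: str) -> str:
    """Return "sync" or "async" for a task type listed in the Overview table."""
    if task not in SYNC_BY_TASK:
        raise ValueError(f"unknown task type: {task}")
    return "sync" if SYNC_BY_TASK[task] else "async"
```

A caller could branch on `dispatch_mode(task)` to decide whether to read the result directly or start polling.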

Text Tasks

`openai/chat-completion`

Standard chat completion following the OpenAI API format.

Field	Type	Required	Description
`messages`	array	Yes	Array of `{role, content}` message objects
`model`	string	No	Model ID (set automatically when using a specific endpoint)
`temperature`	number	No	Sampling temperature (0-2)
`top_p`	number	No	Nucleus sampling
`max_tokens`	number	No	Maximum tokens to generate
`stream`	boolean	No	Enable streaming response
`stop`	string/array	No	Stop sequences
`tools`	array	No	Tool/function definitions

Output: choices[0].message.content (text)

Models: qwen3-0.6b-fp8, qwen3.5-4b, qwen3.5-9b, gpt-oss-20b, deepseek-ocr-v1, deepseek-ocr-v2
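
The sketch below shows one way to assemble a request body from the fields above and read the documented output path. It only builds and inspects dictionaries; how the body is sent (endpoint URL, authentication) is not covered here, and the helper names are hypothetical.

```python
from typing import Optional


def build_chat_payload(
    messages: list,
    *,
    temperature: Optional[float] = None,
    max_tokens: Optional[int] = None,
    stream: Optional[bool] = None,
) -> dict:
    """Assemble a request body using field names from the table above."""
    if not messages:
        raise ValueError("messages must contain at least one entry")
    for message in messages:
        if "role" not in message or "content" not in message:
            raise ValueError("each message needs 'role' and 'content'")
    # The table documents a 0-2 range for temperature.
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be within 0-2")
    payload: dict = {"messages": list(messages)}
    optional = {"temperature": temperature, "max_tokens": max_tokens, "stream": stream}
    # Omit unset optional fields so server-side defaults apply.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


def extract_text(response: dict) -> str:
    """Read the documented output path choices[0].message.content."""
    return response["choices"][0]["message"]["content"]


payload = build_chat_payload(
    [{"role": "user", "content": "Summarize this in one line."}],
    temperature=0.7,
)
```

Checking the temperature range on the client side is a design choice made for this example; the service may apply its own validation.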

`openai/chat-completion/vision`

Chat completion with image input. Same parameters as openai/chat-completion, but messages can include image content:

{
  "messages": [{
    "role": "user",
    "content": [
      {"type": "text", "text": "What's in this image?"},
      {"type": "image_url", "image_url": {"url": "https://..."}}
    ]
  }]
}

Models: qwen3.5-4b, qwen3.5-9b
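
For illustration, a small helper can produce the mixed text and image content shown above. The function name `vision_message` and the example URL are hypothetical.

```python
def vision_message(text: str, image_url: str) -> dict:
    """Build one user message combining a text part and an image part."""
    if not image_url:
        raise ValueError("image_url must be a non-empty string")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }


# Placeholder URL, used only to show the resulting structure.
payload = {"messages": [vision_message("What's in this image?", "https://example.com/cat.png")]}
```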

`openai/chat-completion/ocr`

OCR via a vision-capable model. Uses the same message format as openai/chat-completion/vision, with models optimized for extracting text from images.

Models: deepseek-ocr-v1, deepseek-ocr-v2

`openai/embeddings`

Generate vector embeddings for text.

Field	Type	Required	Description
`input`	string/array	Yes	Text or array of texts to embed
`encoding_format`	string	No	`float` or `base64`
`dimensions`	number	No	Output dimensions

Output: data[0].embedding (float array)

Models: sglang-qwen3-0.6b-fp8-embed, sglang-gpt-oss-20b-embed
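
A common use of the returned float array is comparing two texts by cosine similarity. The sketch below reads the documented output path and computes that score with the standard library. The helper names are hypothetical, and the response shape beyond data[0].embedding is not assumed.

```python
import math


def first_embedding(response: dict) -> list:
    """Read the documented output path data[0].embedding."""
    return response["data"][0]["embedding"]


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm
```

Vectors compared this way should come from the same model and, if `dimensions` is set, the same output size.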

`openai/rerank`

Rerank documents by relevance to a query.

Field	Type	Required	Description
`model`	string	Yes	Model ID
`query`	string	Yes	Search query
`documents`	array	Yes	Array of strings or `{text}` objects
`top_n`	number	No	Number of top results to return
`return_documents`	boolean	No	Include document text in results

Output: Ranked documents with relevance scores

Models: qwen3-0.6b-fp8-score, gpt-oss-20b-score
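
The `documents` field accepts either plain strings or `{text}` objects. A minimal sketch of a payload builder that accepts both forms follows; the helper name is hypothetical, and the requirement that `top_n` be at least 1 is an assumption.

```python
from typing import Optional


def build_rerank_payload(
    model: str,
    query: str,
    documents: list,
    *,
    top_n: Optional[int] = None,
    return_documents: Optional[bool] = None,
) -> dict:
    """Assemble a rerank request body from the fields in the table above."""
    if not documents:
        raise ValueError("documents must not be empty")
    for doc in documents:
        is_text_object = isinstance(doc, dict) and "text" in doc
        if not (isinstance(doc, str) or is_text_object):
            raise TypeError("each document must be a string or a {text} object")
    payload: dict = {"model": model, "query": query, "documents": list(documents)}
    if top_n is not None:
        if top_n < 1:  # assumption: a non-positive top_n is not meaningful
            raise ValueError("top_n must be at least 1")
        payload["top_n"] = top_n
    if return_documents is not None:
        payload["return_documents"] = return_documents
    return payload


payload = build_rerank_payload(
    "qwen3-0.6b-fp8-score",
    "how do I reset my password?",
    ["Resetting your password", {"text": "Billing and invoices"}],
    top_n=1,
)
```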

`openai/score`

Compute similarity scores between text pairs.

Field	Type	Required	Description
`model`	string	Yes	Model ID
`text_1`	string/array	Yes	First text(s)
`text_2`	string/array	Yes	Second text(s)

Output: Similarity scores

Models: qwen3-0.6b-fp8-score, gpt-oss-20b-score

Audio Tasks

`openai/audio-speech`

Convert text to spoken audio.

Field	Type	Required	Description
`input`	string	Yes	Text to speak
`voice`	string	No	Voice ID (model-specific)
`response_format`	string	No	`mp3`, `opus`, `aac`, `flac`, `wav`, `pcm`
`speed`	number	No	Playback speed multiplier

Output: audio_url (audio file URL)

Models: qwen3-tts (9 voices), fox-tts (150+ voices)
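
As an illustrative sketch, a request body for this task can be checked against the `response_format` values listed above before it is sent. The helper name is hypothetical, and the rule that `speed` must be positive is an assumption.

```python
from typing import Optional

# Values listed for response_format in the table above.
ALLOWED_FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}


def build_speech_payload(
    text: str,
    *,
    voice: Optional[str] = None,
    response_format: Optional[str] = None,
    speed: Optional[float] = None,
) -> dict:
    """Assemble a text-to-speech request body."""
    if not text:
        raise ValueError("input text must not be empty")
    if response_format is not None and response_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format}")
    if speed is not None and speed <= 0:  # assumption: multiplier must be positive
        raise ValueError("speed must be positive")
    payload: dict = {"input": text}
    optional = {"voice": voice, "response_format": response_format, "speed": speed}
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


payload = build_speech_payload("Welcome back.", response_format="wav")
```

Voice IDs are model-specific, so the example leaves `voice` unset.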

`openai/audio-transcription`

Transcribe audio to text.

Field	Type	Required	Description
`audio_url`	string	Yes	URL to the audio file
`language`	string	No	Language code (ISO 639-1)
`task`	string	No	`transcribe` or `translate`

Also supports multipart file upload with a file field.

Output: text (transcribed text)

Models: whisper-large-v3-turbo

`fal/audio-transcription`

Same as openai/audio-transcription but using the Fal request format.

Field	Type	Required	Description
`audio_url`	string	Yes	URL to the audio file
`language`	string	No	Language code
`task`	string	No	`transcribe` or `translate`

Output: text

Models: whisper-large-v3-turbo

Image Tasks

`openai/image-generation`

Generate images using the OpenAI-compatible format.

Field	Type	Required	Description
`prompt`	string	Yes	Image description
`n`	number	No	Number of images
`size`	string	No	Image dimensions
`quality`	string	No	`standard` or `hd`
`response_format`	string	No	`url` or `b64_json`

Output: data[0].url (image URL)

Models: qwen-image-2512

`fal/text-to-image`

Generate images using the Fal format, which exposes more parameters than the OpenAI format.

Field	Type	Required	Description
`prompt`	string	Yes	Image description
`negative_prompt`	string	No	What to avoid
`image_size`	string	No	`square_hd`, `square`, `portrait_4_3`, `portrait_16_9`, `landscape_4_3`, `landscape_16_9`
`num_inference_steps`	number	No	Denoising steps
`guidance_scale`	number	No	Prompt adherence strength
`num_images`	number	No	Number of images (max 10)
`seed`	number	No	Reproducibility seed
`loras`	array	No	LoRA configs `[{path, scale}]` (scale 0-4)
`enable_safety_checker`	boolean	No	Content safety filter
`output_format`	string	No	Output image format

Output: images[0].url (image URL)

Models: nunchaku-flux1-schnell, sglang-diffusion-flux2-klein-4b, qwen-image-2512, sglang-diffusion-qwen-image-2512-fp8
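
Several fields in this table carry limits: `image_size` is an enumeration, `num_images` is capped at 10, and each LoRA `scale` lies in 0-4. The sketch below checks those documented limits while building a payload. The helper name is hypothetical, the lower bound of 1 for `num_images` is an assumption, and the LoRA path in the example is a placeholder.

```python
from typing import Optional

# Values listed for image_size in the table above.
IMAGE_SIZES = {
    "square_hd", "square", "portrait_4_3",
    "portrait_16_9", "landscape_4_3", "landscape_16_9",
}


def build_text_to_image_payload(
    prompt: str,
    *,
    image_size: Optional[str] = None,
    num_images: Optional[int] = None,
    seed: Optional[int] = None,
    loras: Optional[list] = None,
) -> dict:
    """Assemble a fal/text-to-image request body with basic limit checks."""
    if not prompt:
        raise ValueError("prompt must not be empty")
    if image_size is not None and image_size not in IMAGE_SIZES:
        raise ValueError(f"unsupported image_size: {image_size}")
    # Upper bound of 10 is documented; the lower bound of 1 is an assumption.
    if num_images is not None and not 1 <= num_images <= 10:
        raise ValueError("num_images must be between 1 and 10")
    for lora in loras or []:
        if "path" not in lora:
            raise ValueError("each LoRA config needs a 'path'")
        if not 0 <= lora.get("scale", 1) <= 4:
            raise ValueError("LoRA scale must be within 0-4")
    payload: dict = {"prompt": prompt}
    optional = {
        "image_size": image_size,
        "num_images": num_images,
        "seed": seed,
        "loras": loras,
    }
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


payload = build_text_to_image_payload(
    "a lighthouse at dusk",
    image_size="landscape_16_9",
    num_images=2,
    seed=42,
    loras=[{"path": "https://example.com/style.safetensors", "scale": 0.8}],
)
```

Fixing `seed` in the payload is what makes a generation reproducible across calls with otherwise identical parameters.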

`fal/image-edit`

Edit or inpaint images.

Field	Type	Required	Description
`prompt`	string	Yes	Edit instruction
`image_url`	string	Yes	Source image URL
`mask_url`	string	No	Mask for inpainting
`strength`	number	No	Edit strength (0-1)
`num_inference_steps`	number	No	Denoising steps
`guidance_scale`	number	No	Prompt adherence
`num_images`	number	No	Number of results (max 10)
`seed`	number	No	Reproducibility seed
`loras`	array	No	LoRA configs

Output: images[0].url (image URL)

Models: nunchaku-flux1-schnell, sglang-diffusion-flux2-klein-4b, qwen-image-edit-2511, sglang-diffusion-qwen-image-edit-2511-fp8

Video Tasks

All video tasks are async only — they return a request_id for polling.
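
A polling loop for a returned request_id might look like the sketch below. The status endpoint and its response shape are not described in this section, so the status lookup is passed in as a callable, and the `status` field with the values `"completed"` and `"failed"` are assumptions made for illustration.

```python
import time
from typing import Callable


def poll_until_done(
    request_id: str,
    fetch_status: Callable[[str], dict],
    *,
    interval_s: float = 2.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Call fetch_status(request_id) until it reports a terminal state.

    fetch_status is supplied by the caller and wraps the real status request.
    The "status" key and its values are hypothetical placeholders.
    """
    deadline = clock() + timeout_s
    while True:
        result = fetch_status(request_id)
        status = result.get("status")
        if status == "completed":
            return result
        if status == "failed":
            raise RuntimeError(f"request {request_id} failed")
        if clock() >= deadline:
            raise TimeoutError(f"request {request_id} did not finish in {timeout_s}s")
        sleep(interval_s)
```

Injecting `sleep` and `clock` keeps the loop testable without real waiting. Once the loop returns, the video URL would be read from the documented output path video.url.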

`fal/text-to-video`

Generate video from a text prompt.

Field	Type	Required	Description
`prompt`	string	Yes	Video description
`resolution`	string	No	`480p`, `720p`
`aspect_ratio`	string	No	`16:9`, `9:16`, `1:1`
`num_inference_steps`	number	No	Denoising steps
`guidance_scale`	number	No	Prompt adherence
`num_frames`	number	No	Number of frames
`fps`	number	No	Frames per second
`seed`	number	No	Reproducibility seed
`output_format`	string	No	Video format

Output: video.url (video URL)

Models: wan22-ti2v, sglang-diffusion-wan22-t2v-a14b-fp8, ltx2-distilled
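
The `num_frames` and `fps` fields together determine clip length: assuming frames play back at the requested rate, duration in seconds is num_frames divided by fps. A small sketch of that arithmetic, with hypothetical helper names:

```python
def clip_duration_seconds(num_frames: int, fps: float) -> float:
    """Playback length of a clip, assuming frames are shown at the given rate."""
    if num_frames <= 0 or fps <= 0:
        raise ValueError("num_frames and fps must be positive")
    return num_frames / fps


def frames_for_duration(seconds: float, fps: float) -> int:
    """Frame count needed for a target duration, rounded to the nearest frame."""
    if seconds <= 0 or fps <= 0:
        raise ValueError("seconds and fps must be positive")
    return round(seconds * fps)


# A 5-second clip at 24 frames per second.
payload = {
    "prompt": "waves rolling onto a beach",
    "fps": 24,
    "num_frames": frames_for_duration(5, 24),
}
```

Individual models may restrict which frame counts or frame rates they accept; those limits are not listed here.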

`fal/image-to-video`

Animate a still image into video.

Field	Type	Required	Description
`prompt`	string	Yes	Motion description
`image_url`	string	Yes	Source image URL
`resolution`	string	No	`480p`, `720p`
`aspect_ratio`	string	No	`16:9`, `9:16`, `1:1`
`num_inference_steps`	number	No	Denoising steps
`guidance_scale`	number	No	Prompt adherence
`num_frames`	number	No	Number of frames
`fps`	number	No	Frames per second
`seed`	number	No	Reproducibility seed

Output: video.url (video URL)

Models: wan22-ti2v, ltx2-distilled

`fal/speech-to-video`

Generate talking-head video driven by audio.

Field	Type	Required	Description
`audio_url`	string	Yes	Audio file URL
`image_url`	string	Yes	Face/character image URL
`prompt`	string	No	Additional scene description
`num_inference_steps`	number	No	Denoising steps
`guidance_scale`	number	No	Prompt adherence
`seed`	number	No	Reproducibility seed

Output: video.url (video URL)

Models: wan22-s2v

`fal/video-interpolate` (coming soon)

Increase video frame rate (slow motion effect). No model is currently enabled for this task.

`fal/video-upscale` (coming soon)

Upscale video resolution. This task is currently disabled while the model is being stabilized.

Agent Usage

Task types are used as the task field in agent DAG nodes. Each node specifies a task type and a model, and can wire outputs from upstream nodes into its inputs.

{
  "nodes": [
    {
      "id": "generate",
      "task": "fal/text-to-image",
      "model": "nunchaku-flux1-schnell",
      "payload": {
        "prompt": "{{input.prompt}}"
      }
    }
  ]
}
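
As a rough sketch, the same definition can be assembled and checked in code before it is submitted. The helper name is hypothetical, and treating `id`, `task`, `model`, and `payload` as required keys with unique ids is an assumption drawn from the example above rather than a documented rule.

```python
import json

# Keys present on the example node above; treating them as required is an assumption.
REQUIRED_KEYS = ("id", "task", "model", "payload")


def build_agent(nodes: list) -> dict:
    """Wrap nodes in the {"nodes": [...]} shape after basic sanity checks."""
    seen = set()
    for node in nodes:
        missing = [key for key in REQUIRED_KEYS if key not in node]
        if missing:
            raise ValueError(f"node is missing keys: {missing}")
        if node["id"] in seen:
            raise ValueError(f"duplicate node id: {node['id']}")
        seen.add(node["id"])
    return {"nodes": list(nodes)}


agent = build_agent([
    {
        "id": "generate",
        "task": "fal/text-to-image",
        "model": "nunchaku-flux1-schnell",
        # The placeholder is passed through unchanged as a literal string.
        "payload": {"prompt": "{{input.prompt}}"},
    }
])
agent_json = json.dumps(agent, indent=2)
```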

See the Agents guide for details on building multi-step pipelines.
