Video Generation API
Generate AI videos from text, images, video references, and audio inputs — all through a single unified endpoint. The generation mode is automatically determined by the combination of parameters you provide.
Endpoint
POST https://api.evolink.ai/v1/videos/generations
Request Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `model` | string | Yes | - | Model ID. Use `seedance-2.0` |
| `prompt` | string | Yes | - | Text description of the desired video (max 2000 tokens). Use @ tags to reference uploaded files |
| `image_urls` | array | No | - | Reference image URLs (up to 9). See Input File Requirements |
| `video_urls` | array | No | - | Reference video URLs (up to 3). See Input File Requirements |
| `audio_urls` | array | No | - | Reference audio URLs (up to 3). See Input File Requirements |
| `duration` | integer | No | `5` | Video duration in seconds. Any integer from 4 to 15. Longer durations cost more |
| `quality` | string | No | `720p` | Video resolution: `480p`, `720p`, or `1080p`. Higher quality costs more |
| `aspect_ratio` | string | No | `16:9` | Aspect ratio: `16:9`, `9:16`, `1:1`, `4:3`, `3:4`, `21:9`, or `adaptive` |
| `generate_audio` | boolean | No | `true` | Whether to generate synchronized audio. Enabling increases cost |
| `callback_url` | string | No | - | HTTPS URL for task completion callback. See Webhooks |
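The documented ranges can be checked on the client before a request is sent, which avoids spending a round trip on a value the server would reject. The following is a minimal sketch: `build_payload` is a hypothetical helper, not part of any official SDK, and it only encodes the limits listed in the table above.

```python
ALLOWED_QUALITIES = {"480p", "720p", "1080p"}
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16", "1:1", "4:3", "3:4", "21:9", "adaptive"}


def build_payload(prompt, duration=5, quality="720p", aspect_ratio="16:9",
                  generate_audio=True, callback_url=None, model="seedance-2.0"):
    """Build a request body, rejecting values outside the documented ranges.

    Hypothetical helper: the server remains the source of truth for validation.
    """
    if not prompt:
        raise ValueError("prompt is required")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(duration, bool) or not isinstance(duration, int):
        raise ValueError("duration must be an integer")
    if not 4 <= duration <= 15:
        raise ValueError("duration must be between 4 and 15 seconds")
    if quality not in ALLOWED_QUALITIES:
        raise ValueError(f"quality must be one of {sorted(ALLOWED_QUALITIES)}")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {sorted(ALLOWED_ASPECT_RATIOS)}")
    if callback_url is not None and not callback_url.startswith("https://"):
        raise ValueError("callback_url must be an HTTPS URL")

    payload = {
        "model": model,
        "prompt": prompt,
        "duration": duration,
        "quality": quality,
        "aspect_ratio": aspect_ratio,
        "generate_audio": generate_audio,
    }
    # Only send callback_url when the caller actually supplied one.
    if callback_url is not None:
        payload["callback_url"] = callback_url
    return payload
```

The returned dictionary can be passed as the `json=` argument of `requests.post`, as in the examples further down.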
Generation Modes
The API auto-detects the generation mode based on which input parameters you provide:
| Inputs Provided | Mode | Description |
|---|---|---|
| `prompt` only | Text-to-Video | Generate video from text description |
| `prompt` + `image_urls` (1 image) | Image-to-Video | Animate a reference image |
| `prompt` + `image_urls` (2 images) | First-Last-Frame | Generate transition between two keyframes |
| `prompt` + any combination of `image_urls`, `video_urls`, `audio_urls` | Multimodal | Use @ tags in prompt to assign roles to each input. See Multimodal Reference |
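To preview which mode a request is likely to trigger, the table can be mirrored in a small client-side function. This is only a sketch: `detect_mode` is a hypothetical name, and because the table rows overlap, the precedence used here (any video, audio, or more than two images means Multimodal) is an assumption. The server performs its own detection and may decide differently.

```python
def detect_mode(image_urls=None, video_urls=None, audio_urls=None):
    """Guess the generation mode from the inputs, following the table above.

    Assumed precedence: video or audio references, or more than two images,
    are treated as multimodal; otherwise the image count decides.
    """
    images = len(image_urls or [])
    videos = len(video_urls or [])
    audios = len(audio_urls or [])

    if videos or audios or images > 2:
        return "multimodal"
    if images == 2:
        return "first-last-frame"
    if images == 1:
        return "image-to-video"
    return "text-to-video"
```

For example, `detect_mode(image_urls=["a.png", "b.png"])` yields `"first-last-frame"`, while adding a single audio URL to the same call yields `"multimodal"`.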
Input File Requirements
Images
| Property | Limit |
|---|---|
| Max count | 9 per request |
| Max file size | 30MB per image |
| Supported formats | .jpeg, .png, .webp, .bmp, .tiff, .gif |
Videos
| Property | Limit |
|---|---|
| Max count | 3 per request |
| Max file size | 50MB per video |
| Supported formats | .mp4, .mov |
| Duration | 2–15 seconds |
| Pixel range | 409,600 (480p) – 927,408 (720p) |
Audio
| Property | Limit |
|---|---|
| Max count | 3 per request |
| Max file size | 15MB per audio |
| Supported formats | .mp3, .wav |
| Total duration | ≤ 15 seconds |
Total file limit: Maximum 12 files across all modalities per request.
Face restriction: Realistic human face uploads are not supported and will be automatically rejected.
All file URLs must be directly accessible by the server.
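The count and format limits above lend themselves to a quick pre-flight check. The sketch below uses a hypothetical `check_inputs` helper that inspects only the number of URLs and their file extensions. It cannot verify file size, duration, pixel count, or face content, and it assumes `.jpg` is accepted as an alias of `.jpeg` (the examples on this page use `.jpg` URLs, but the format table lists only `.jpeg`). URLs without a file extension will be flagged even if the server would accept them.

```python
import posixpath
from urllib.parse import urlparse

# (max count, allowed extensions) per modality, taken from the tables above.
LIMITS = {
    # ".jpg" is an assumption: the table lists ".jpeg" only.
    "image": (9, {".jpeg", ".jpg", ".png", ".webp", ".bmp", ".tiff", ".gif"}),
    "video": (3, {".mp4", ".mov"}),
    "audio": (3, {".mp3", ".wav"}),
}
MAX_TOTAL_FILES = 12


def check_inputs(image_urls=(), video_urls=(), audio_urls=()):
    """Return a list of problems; an empty list means counts and extensions look fine."""
    groups = {
        "image": list(image_urls),
        "video": list(video_urls),
        "audio": list(audio_urls),
    }
    problems = []
    for kind, urls in groups.items():
        max_count, extensions = LIMITS[kind]
        if len(urls) > max_count:
            problems.append(f"too many {kind} files: {len(urls)} > {max_count}")
        for url in urls:
            # Look at the URL path only, so query strings do not hide the extension.
            ext = posixpath.splitext(urlparse(url).path)[1].lower()
            if ext not in extensions:
                problems.append(f"unsupported {kind} format: {url}")

    total = sum(len(urls) for urls in groups.values())
    if total > MAX_TOTAL_FILES:
        problems.append(f"too many files in total: {total} > {MAX_TOTAL_FILES}")
    return problems
```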
Examples
Text-to-Video
```python
import requests

response = requests.post(
    "https://api.evolink.ai/v1/videos/generations",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json"
    },
    json={
        "model": "seedance-2.0",
        "prompt": "A luxury watch rotating slowly on a marble surface, soft studio lighting, product showcase, cinematic 4K",
        "duration": 8,
        "quality": "1080p",
        "aspect_ratio": "16:9",
        "generate_audio": False
    }
)
print(response.json())
```
Image-to-Video
```python
import requests

# A single reference image selects Image-to-Video mode.
response = requests.post(
    "https://api.evolink.ai/v1/videos/generations",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json"
    },
    json={
        "model": "seedance-2.0",
        "prompt": "The woman turns her head slowly and smiles, hair gently flowing in the wind",
        "image_urls": ["https://example.com/portrait.jpg"],
        "duration": 5,
        "quality": "1080p"
    }
)
```
First-Last-Frame
```python
import requests

# Two reference images: the first is the opening frame, the second the closing frame.
response = requests.post(
    "https://api.evolink.ai/v1/videos/generations",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json"
    },
    json={
        "model": "seedance-2.0",
        "prompt": "Smooth camera pan revealing the landscape, golden hour lighting",
        "image_urls": [
            "https://example.com/frame-start.jpg",
            "https://example.com/frame-end.jpg"
        ],
        "duration": 8,
        "quality": "1080p"
    }
)
```
Multimodal with @Tags
```python
import requests

# Each @ tag in the prompt assigns a role to one of the referenced files.
response = requests.post(
    "https://api.evolink.ai/v1/videos/generations",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json"
    },
    json={
        "model": "seedance-2.0",
        "prompt": "@Image1 as first frame, replicate @Video1 camera movement, @Audio1 for BGM rhythm",
        "image_urls": ["https://example.com/scene.jpg"],
        "video_urls": ["https://example.com/reference-camera.mp4"],
        "audio_urls": ["https://example.com/bgm.mp3"],
        "duration": 10,
        "quality": "1080p"
    }
)
```
See Multimodal Reference for the full @ tag syntax and role assignments.
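A common mistake with tagged prompts is referencing a file that was never attached. The sketch below flags such tags before the request is sent. It assumes, based only on the example above, that tags take the form `@Image1`, `@Video1`, `@Audio1` and are numbered from 1 in the order of the corresponding URL array. The Multimodal Reference is authoritative for the actual syntax, and `find_dangling_tags` is a hypothetical helper.

```python
import re

# Assumed tag shape, inferred from the example prompt: @Image1, @Video2, @Audio1, ...
TAG_PATTERN = re.compile(r"@(Image|Video|Audio)(\d+)")


def find_dangling_tags(prompt, image_urls=(), video_urls=(), audio_urls=()):
    """Return the @ tags in the prompt that point past the end of their URL list."""
    counts = {
        "Image": len(image_urls),
        "Video": len(video_urls),
        "Audio": len(audio_urls),
    }
    dangling = []
    for kind, index in TAG_PATTERN.findall(prompt):
        # Tags are assumed to be 1-based, so valid indexes run from 1 to the list length.
        if not 1 <= int(index) <= counts[kind]:
            dangling.append(f"@{kind}{index}")
    return dangling
```

An empty result means every tag found by the pattern has a matching file. It does not prove that the prompt is otherwise valid.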
Response
```json
{
  "id": "task-unified-1761313744-vux2jw0k",
  "object": "video.generation.task",
  "created": 1761313744,
  "model": "seedance-2.0",
  "status": "pending",
  "progress": 0,
  "type": "video",
  "task_info": {
    "can_cancel": true,
    "estimated_time": 165,
    "video_duration": 8
  },
  "usage": {
    "billing_rule": "per_call",
    "credits_reserved": 12,
    "user_group": "default"
  }
}
```
Response Fields
| Field | Type | Description |
|---|---|---|
| `id` | string | Unique task identifier for status polling |
| `object` | string | Always `video.generation.task` |
| `created` | integer | Task creation Unix timestamp |
| `model` | string | Model used for generation |
| `status` | string | `pending`, `processing`, `completed`, or `failed` |
| `progress` | integer | Progress percentage (0–100) |
| `type` | string | Output type: `text`, `image`, `audio`, or `video` |
| `task_info.can_cancel` | boolean | Whether the task can be cancelled |
| `task_info.estimated_time` | integer | Estimated completion time in seconds |
| `task_info.video_duration` | integer | Requested video duration in seconds |
| `usage.billing_rule` | string | Billing rule (`per_call`, `per_token`, `per_second`) |
| `usage.credits_reserved` | number | Estimated credits consumed |
| `usage.user_group` | string | User group category |
Audio Generation
Seedance 2.0 can automatically generate synchronized audio including voice, sound effects, and background music based on your text prompt and visual content.
- Set `generate_audio` to `true` (default) to enable audio
- Place dialogue within double quotes in your prompt for better voice generation
- Example: The man stopped the woman and said: "Remember, you must never point at the moon with your finger."
- Set `generate_audio` to `false` for silent video output
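The quoting convention above is easy to break when dialogue comes from user input that already contains double quotes. As a small illustration, the hypothetical `with_dialogue` helper below composes a prompt so that the only double quotes in the result surround the spoken line. Swapping inner double quotes for single quotes is a design choice of this sketch, not documented API behavior.

```python
def with_dialogue(scene, line):
    """Append a spoken line to a scene description, wrapped in double quotes."""
    # Replace stray double quotes so the dialogue is delimited exactly once.
    spoken = line.replace('"', "'")
    return f'{scene.rstrip()} "{spoken}"'


prompt = with_dialogue(
    "The man stopped the woman and said:",
    "Remember, you must never point at the moon with your finger.",
)
```

The resulting `prompt` matches the example in the list above and can be sent with `generate_audio` left at its default of `true`.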
Prompt Tips
- Be specific about camera angles, lighting, and motion
- Include style keywords: "cinematic", "slow motion", "aerial shot"
- Describe the subject, action, and atmosphere
- Maximum prompt length is 2000 tokens
- For detailed prompt engineering strategies, see our Seedance 2.0 Prompt Guide
Related
- Multimodal Reference: Control generation with @tag references for images, videos, and audio
- Async Tasks: Poll task status and retrieve results
- Webhooks: Real-time completion notifications via `callback_url`
- SDKs & Examples: Python, Node.js, Go, and cURL integration code