Image-to-Video API
Turn 1 or 2 images into a video. The behavior is determined by how many images you pass:
- 1 image → First-frame mode. The image becomes the first frame of the video; the model generates forward motion from it.
- 2 images → First-last-frame mode. The first image opens the video and the second image closes it; the model generates the transition animation between them.
Endpoint
POST https://api.evolink.ai/v1/videos/generations
Model ID: seedance-2.0-image-to-video
Fast Variant
seedance-2.0-fast-image-to-video also auto-detects first-frame vs first-last-frame mode based on the image count.
Request Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| model | string | Yes | — | Must be seedance-2.0-image-to-video |
| prompt | string | Yes | — | Natural-language description of motion, camera, and atmosphere. ≤ 500 Chinese characters or ≤ 1000 English words |
| image_urls | array<string> | Yes | — | 1 or 2 publicly accessible image URLs |
| duration | integer | No | 5 | Video duration in seconds, range 4–15 |
| quality | string | No | 720p | 480p or 720p |
| aspect_ratio | string | No | 16:9 | 16:9, 9:16, 1:1, 4:3, 3:4, 21:9, or adaptive |
| generate_audio | boolean | No | true | Whether to generate synchronized audio |
| callback_url | string | No | — | HTTPS URL for the task-completion callback, max 2048 characters |
Note: Images are passed as URLs only — Base64 inlining is not supported. URLs must be publicly GET-able without authentication and must not redirect to login pages.
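Since out-of-range parameters are rejected server-side, it can help to validate the payload before sending. A minimal client-side sketch of the parameter table above (the `validate_payload` helper and its error messages are illustrative, not part of the API):

```python
def validate_payload(payload: dict) -> list[str]:
    """Return a list of problems with an image-to-video request payload."""
    errors = []
    urls = payload.get("image_urls", [])
    if len(urls) not in (1, 2):
        errors.append("image_urls must contain 1 or 2 URLs")
    if not all(u.startswith(("http://", "https://")) for u in urls):
        errors.append("image_urls must be publicly accessible URLs, not Base64 data")
    if not (4 <= payload.get("duration", 5) <= 15):
        errors.append("duration must be 4-15 seconds")
    if payload.get("quality", "720p") not in ("480p", "720p"):
        errors.append("quality must be 480p or 720p")
    if payload.get("aspect_ratio", "16:9") not in (
        "16:9", "9:16", "1:1", "4:3", "3:4", "21:9", "adaptive"
    ):
        errors.append("unsupported aspect_ratio")
    return errors
```

Calling `validate_payload` on a request with `"duration": 20` would surface the range violation locally instead of waiting for an invalid_request response.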
Image Input Requirements
| Constraint | Limit |
|---|---|
| Count | 1 or 2 images |
| Format | .jpeg, .png, .webp |
| Dimensions | 300–6000 px per side |
| Aspect ratio | 0.4 – 2.5 (i.e. 2:5 to 5:2) |
| Max size per image | 30 MB |
| Total request body | ≤ 64 MB |
Any request exceeding these limits returns invalid_request. Realistic human faces are not supported — the system rejects them automatically.
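Because any violation returns invalid_request, a local pre-check of image metadata can save a round trip. Here is the constraint table expressed as a pure function (the helper name is illustrative; how you obtain width, height, and size, e.g. via Pillow, is up to you):

```python
def check_image(width: int, height: int, size_bytes: int, fmt: str) -> list[str]:
    """Check one input image against the documented limits."""
    problems = []
    if fmt.lower() not in ("jpeg", "png", "webp"):
        problems.append(f"unsupported format: {fmt}")
    for side in (width, height):
        if not (300 <= side <= 6000):
            problems.append(f"side {side}px outside 300-6000px")
    ratio = width / height
    if not (0.4 <= ratio <= 2.5):
        problems.append(f"aspect ratio {ratio:.2f} outside 0.4-2.5")
    if size_bytes > 30 * 1024 * 1024:
        problems.append("image exceeds 30 MB")
    return problems
```

Note this cannot catch the realistic-face rejection, which happens server-side.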
First-Frame Mode (1 image)
curl -X POST https://api.evolink.ai/v1/videos/generations \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "seedance-2.0-image-to-video",
"prompt": "The camera slowly pushes in and the scene comes alive, with wind gently moving the grass in the background.",
"image_urls": ["https://example.com/first-frame.jpg"],
"duration": 5,
"aspect_ratio": "adaptive"
}'
aspect_ratio: "adaptive" automatically matches the output's aspect ratio to the input image.
First-Last-Frame Mode (2 images)
curl -X POST https://api.evolink.ai/v1/videos/generations \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "seedance-2.0-image-to-video",
"prompt": "A smooth transition from the sunrise ocean to the sunset ocean in the same location",
"image_urls": [
"https://example.com/sunrise.jpg",
"https://example.com/sunset.jpg"
],
"duration": 6,
"quality": "720p",
"aspect_ratio": "16:9"
}'
Both images should have similar dimensions and aspect ratios — otherwise the model may produce distortion during the transition.
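One way to catch mismatched pairs before submitting is to compare the two aspect ratios. The tolerance below is an arbitrary illustration, not a documented threshold:

```python
def ratios_similar(w1: int, h1: int, w2: int, h2: int, tolerance: float = 0.1) -> bool:
    """True if the two images' aspect ratios differ by less than `tolerance` (relative)."""
    r1, r2 = w1 / h1, w2 / h2
    return abs(r1 - r2) / max(r1, r2) < tolerance
```

For example, a 1920x1080 opening frame pairs cleanly with a 1280x720 closing frame, but not with a 1080x1920 portrait image.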
Python Example
import requests
response = requests.post(
"https://api.evolink.ai/v1/videos/generations",
headers={
"Authorization": "Bearer YOUR_API_KEY",
"Content-Type": "application/json"
},
json={
"model": "seedance-2.0-image-to-video",
"prompt": "The model slowly turns, hair flowing gently in the wind",
"image_urls": ["https://example.com/portrait.jpg"],
"duration": 5,
"quality": "720p"
}
)
task = response.json()
print(f"Task ID: {task['id']}")
Response
{
"id": "task-unified-1774857405-abc123",
"object": "video.generation.task",
"created": 1774857405,
"model": "seedance-2.0-image-to-video",
"status": "pending",
"progress": 0,
"type": "video",
"task_info": {
"can_cancel": true,
"estimated_time": 165,
"video_duration": 5
},
"usage": {
"billing_rule": "per_second",
"credits_reserved": 50,
"user_group": "default"
}
}
Field semantics are identical to other Seedance 2.0 models — see Async Tasks for the full lifecycle.
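As an illustration of the client side of that lifecycle, here is a polling loop with the status fetcher injected, so it can be adapted to whatever retrieval endpoint your integration uses. The terminal status names are assumptions, not taken from this page:

```python
import time

def wait_for_task(fetch_status, poll_interval=5.0, timeout=600.0,
                  terminal=("completed", "failed", "cancelled")):
    """Poll fetch_status() until the task reaches a terminal status or times out.

    fetch_status: callable returning the latest task dict (e.g. a GET on the
    task by its `id`). The `terminal` status names here are illustrative.
    """
    deadline = time.monotonic() + timeout
    while True:
        task = fetch_status()
        if task.get("status") in terminal:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task.get('id')} still {task.get('status')}")
        time.sleep(poll_interval)
```

Injecting the fetcher also makes the loop easy to test against canned status sequences; in production, prefer the callback_url webhook over polling when you can accept inbound HTTPS.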
FAQ
What happens if I pass 3 images?
Returns invalid_request. Image-to-video strictly requires 1 or 2 images. If you need more than 2 images as style or subject references, use Reference-to-Video.
Do the image URLs have to be self-hosted?
Not required. Any publicly GET-able URL works. For production pipelines that need reproducibility, host images on your own object storage (OSS / S3 / R2) to avoid third-party URL expiration.
Will the output preserve human faces from the input?
If the input image contains a realistic human face, the request is rejected outright. For face-consistent virtual characters, synthesize non-realistic faces with another tool first, then feed them to this API.
Related
- Models Overview
- Text-to-Video API
- Reference-to-Video API — When you need more than 2 images or multimodal inputs
- Fast Models — seedance-2.0-fast-image-to-video
- Async Tasks / Webhooks