Multimodal Reference
Seedance 2.0 supports a powerful @ tag reference system that lets you assign specific roles to uploaded images, videos, and audio files within your prompt. This gives you fine-grained creative control over the generated video.
@Tag Syntax
Reference uploaded files in your prompt using @ tags that correspond to the position of each URL in its respective array:
| Tag Format | Maps To | Example |
|---|---|---|
| @Image1 – @Image9 | image_urls[0] – image_urls[8] | @Image1 as first frame |
| @Video1 – @Video3 | video_urls[0] – video_urls[2] | replicate @Video1 camera movement |
| @Audio1 – @Audio3 | audio_urls[0] – audio_urls[2] | @Audio1 for BGM rhythm |
Tags are 1-indexed — @Image1 refers to the first URL in image_urls, @Image2 to the second, and so on.
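The 1-indexed mapping can be sketched with a small client-side helper. This is not part of the API, just a hypothetical utility showing how each @ tag in a prompt resolves to a 0-based position in its URL array:

```python
import re

# Hypothetical helper: map each @ tag in a prompt to (array_name, 0-based index).
# Tag numbers are 1-indexed in the prompt, so we subtract 1.
TAG_PATTERN = re.compile(r"@(Image|Video|Audio)([1-9])")

def tag_indices(prompt):
    """Return (array_name, index) pairs for every @ tag found in the prompt."""
    arrays = {"Image": "image_urls", "Video": "video_urls", "Audio": "audio_urls"}
    return [(arrays[kind], int(num) - 1) for kind, num in TAG_PATTERN.findall(prompt)]

print(tag_indices("@Image1 as first frame, replicate @Video1 camera movement"))
# [('image_urls', 0), ('video_urls', 0)]
```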
File Limits
| Type | Max Count | Supported Formats | Max Size | Duration |
|---|---|---|---|---|
| Images | 9 | .jpeg, .png, .webp, .bmp, .tiff, .gif | 30MB each | — |
| Videos | 3 | .mp4, .mov | 50MB each | 2–15s total |
| Audio | 3 | .mp3, .wav | 15MB each | ≤ 15s total |
Total limit: 12 files across all modalities per request.
Face restriction: Realistic human face uploads are automatically rejected.
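A client-side pre-flight check can catch limit violations before the request is sent. The sketch below mirrors the documented count limits (9 images, 3 videos, 3 audio files, 12 files total); size and duration checks would need the actual files and are omitted:

```python
# Sketch of a pre-flight check against the documented per-type and total
# file-count limits. The function name and return shape are illustrative,
# not part of the API.
LIMITS = {"image_urls": 9, "video_urls": 3, "audio_urls": 3}

def check_file_limits(image_urls=(), video_urls=(), audio_urls=()):
    counts = {"image_urls": len(image_urls),
              "video_urls": len(video_urls),
              "audio_urls": len(audio_urls)}
    errors = [f"{name}: {n} exceeds max {LIMITS[name]}"
              for name, n in counts.items() if n > LIMITS[name]]
    if sum(counts.values()) > 12:
        errors.append(f"total files {sum(counts.values())} exceeds 12")
    return errors  # an empty list means the request is within limits
```

Note that the per-type maxima sum to 15, so a request can satisfy every per-type limit and still exceed the 12-file total.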
Image @Tag Roles
Use image references to control visual elements of the generated video:
| Role | Prompt Pattern | Description |
|---|---|---|
| First frame | @Image1 as first frame | Use the image as the opening frame of the video |
| Last frame | @Image2 as last frame | Use the image as the closing frame |
| Character reference | @Image1 as character | Maintain character appearance throughout |
| Style reference | @Image1 as style reference | Apply the visual style (colors, mood, aesthetics) |
| Scene reference | @Image1 as scene | Use as background or environment reference |
| Object reference | @Image1 as object | Reference a specific object to appear in the video |
| Composition | @Image1 as composition reference | Follow the layout and framing of the image |
Video @Tag Roles
Use video references to transfer motion, timing, and camera work:
| Role | Prompt Pattern | Description |
|---|---|---|
| Camera movement | replicate @Video1 camera movement | Copy the camera trajectory (pan, tilt, zoom, dolly) |
| Choreography | replicate @Video1 choreography | Match body/object motion patterns |
| Effects | replicate @Video1 effects | Transfer visual effects and transitions |
| Rhythm | match @Video1 rhythm | Sync cut timing and motion pacing |
| Full replication | replicate @Video1 | Reproduce overall motion, camera, and pacing |
| Audio extraction | use @Video1 audio | Extract and use the audio track from the reference video |
Audio @Tag Roles
Use audio references to drive the rhythm and soundtrack of the video:
| Role | Prompt Pattern | Description |
|---|---|---|
| Background music | @Audio1 for BGM rhythm | Sync motion energy and cuts to the music beat |
| Sound effects | @Audio1 as sound effects | Align visual events with audio cues |
| Beat sync | sync to @Audio1 beat | Match motion peaks to musical beats |
API Example
A complete multimodal request combining image, video, and audio references:
```python
import requests

response = requests.post(
    "https://api.evolink.ai/v1/videos/generations",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json",
    },
    json={
        "model": "seedance-2.0",
        "prompt": (
            "@Image1 as first frame, @Image2 as character reference. "
            "Replicate @Video1 camera movement. "
            "Sync to @Audio1 beat. "
            "A cinematic tracking shot through a neon-lit alley at night."
        ),
        "image_urls": [
            "https://example.com/scene-start.jpg",
            "https://example.com/character-ref.jpg",
        ],
        "video_urls": [
            "https://example.com/camera-reference.mp4",
        ],
        "audio_urls": [
            "https://example.com/soundtrack.mp3",
        ],
        "duration": 10,
        "quality": "1080p",
        "aspect_ratio": "16:9",
    },
)
print(response.json())
```
Common Patterns
Character Consistency
Maintain the same character across different scenes by providing a clear character reference image:
@Image1 as character reference. The woman walks through a busy market, picking up an apple, examining it closely.
Camera Replication
Copy the exact camera trajectory from a reference video onto a completely new scene:
@Image1 as first frame. Replicate @Video1 camera movement. A sweeping drone shot over snow-covered mountains.
Music Video
Sync generated visuals to an audio track's beat and rhythm:
@Image1 as style reference. Sync to @Audio1 beat. Fast cuts of urban street scenes, neon lights, dancing figures.
Rules and Restrictions
- Tags must match the array position: @Image1 is always image_urls[0]
- You cannot reference more files than provided in the URL arrays
- Maximum 12 files total across all modalities
- Realistic human face images are automatically rejected
- Video references increase generation cost
- All URLs must be directly accessible by the server (no authentication, no redirects to login pages)
- Prompt length limit: 2000 tokens, including @ tag text
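The rule that a prompt cannot reference more files than the arrays provide can be enforced client-side. A minimal sketch (the function name is illustrative, not part of the API):

```python
import re

# Sketch: collect @ tags whose number points past the end of the
# corresponding URL array, per the "cannot reference more files than
# provided" rule.
def undefined_tags(prompt, image_urls=(), video_urls=(), audio_urls=()):
    provided = {"Image": len(image_urls),
                "Video": len(video_urls),
                "Audio": len(audio_urls)}
    return [f"@{kind}{num}"
            for kind, num in re.findall(r"@(Image|Video|Audio)([1-9])", prompt)
            if int(num) > provided[kind]]

print(undefined_tags("@Image1 and @Video2",
                     image_urls=["a.jpg"], video_urls=["b.mp4"]))
# ['@Video2']
```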
Related
- Video Generation API — Full endpoint reference with all parameters
- Seedance 2.0 Multimodal Tags Guide — In-depth tutorial with creative examples
- Camera Movement API Tutorial — Replicate camera work from reference videos
- SDKs & Examples — Integration code in Python, Node.js, Go, and cURL