Seedance 2.0 Multimodal Reference Guide: Using Natural Language to Drive Image / Video / Audio References
Drive up to 9 images + 3 videos + 3 audio clips in seedance-2.0-reference-to-video using plain natural language. Includes 10 copy-paste prompt templates and clears up the @Tag syntax myth.

First, let's clear up a common myth. There's a rumor that Seedance 2.0 supports tag-style syntax like
@Image1, @Video1, @Audio1. The actual API has no such syntax. Seedance 2.0's seedance-2.0-reference-to-video model accepts up to 9 images + 3 videos + 3 audio clips as reference assets, but you describe what each asset is for using natural language in the prompt — not with any special symbol. This article teaches you how to write effective natural-language prompts that drive multimodal generation precisely.
Most AI video generators accept a single text prompt and leave the model free to interpret it. Seedance 2.0's reference-to-video mode lets you provide multiple reference assets in a single request: images to define style or characters, videos to convey camera pacing, audio to set mood and rhythm. This is one of the key capabilities that differentiates it from Sora 2, Kling 3.0, and Veo 3.1.
This guide covers:
- The real API structure and input limits of reference-to-video
- How to "assign roles" to each asset using natural language in your prompt
- 10 ready-to-copy prompt templates
- Common mistakes and debugging tips
To run the API as you read, get a free EvoLink API key — it takes 30 seconds.
1. The Real API Structure (Get the Foundation Right)
The reference-to-video request body is very simple:
```json
{
  "model": "seedance-2.0-reference-to-video",
  "prompt": "…",
  "image_urls": ["…", "…"],
  "video_urls": ["…"],
  "audio_urls": ["…"],
  "duration": 8,
  "quality": "720p",
  "aspect_ratio": "16:9"
}
```
Core rules:
| Dimension | Limit |
|---|---|
| image_urls | 0–9 images, JPEG/PNG/WebP, 300–6000 px per side, ≤ 30 MB each |
| video_urls | 0–3 clips, MP4/MOV, 2–15 s each, ≤ 15 s total, 480p–720p |
| audio_urls | 0–3 clips, WAV/MP3, 2–15 s each, ≤ 15 s total |
| Request body | ≤ 64 MB total (no Base64 inlining) |
| Hard constraint | Audio-only is not allowed — always provide at least 1 image or 1 video as a visual anchor |
| quality | Only 480p and 720p (1080p is not supported) |
| prompt | ≤ 500 Chinese chars or ≤ 1000 English words |
What does NOT exist:
- ❌ @Image1 / @Video1 / @Audio1 tag syntax
- ❌ A dedicated field to mark an asset as "first frame" / "style reference" / "character reference"
- ❌ Per-asset role-assignment JSON fields
Seedance 2.0's design philosophy is "let the prompt itself carry the role assignment" — you tell the model in plain language "image 1 is the character, video 1 drives the camera, audio 1 is the soundtrack", and the model understands your references by their order in the arrays.
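These limits are easy to trip over in client code. As a sketch only (the `validate_request` helper below is hypothetical, not part of any official SDK; its thresholds simply mirror the table above), a client-side pre-flight check might look like:

```python
# Hypothetical pre-flight validator for reference-to-video request
# bodies. Thresholds mirror the limits table above; this is a sketch,
# not an official SDK.

VALID_QUALITY = {"480p", "720p"}  # 1080p is not supported

def validate_request(body: dict) -> list[str]:
    """Return a list of problems; an empty list means the body looks OK."""
    errors = []
    images = body.get("image_urls", [])
    videos = body.get("video_urls", [])
    audios = body.get("audio_urls", [])

    if len(images) > 9:
        errors.append("image_urls: at most 9 images")
    if len(videos) > 3:
        errors.append("video_urls: at most 3 clips")
    if len(audios) > 3:
        errors.append("audio_urls: at most 3 clips")
    # Audio-only is rejected: at least one visual anchor is required.
    if audios and not images and not videos:
        errors.append("audio-only is not allowed: add at least 1 image or video")
    if body.get("quality") not in VALID_QUALITY:
        errors.append("quality must be 480p or 720p")
    # Rough English-word count; the Chinese-character limit is not checked here.
    if len(body.get("prompt", "").split()) > 1000:
        errors.append("prompt: at most 1000 English words")
    return errors

issues = validate_request({
    "model": "seedance-2.0-reference-to-video",
    "prompt": "Use audio 1 as the soundtrack.",
    "audio_urls": ["https://example.com/bgm.mp3"],
    "quality": "1080p",
})
# Flags both the audio-only body and the unsupported 1080p quality.
```

Running a check like this locally catches the two hard rejections (audio-only, 1080p) before you spend a request.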
2. The Core Writing Pattern
Split your prompt into two sections: asset role assignment + scene description.
[Asset role assignment] — two or three sentences naming each asset's job
[Scene description] — full visual description of what you want to see
Example, with 1 image + 1 video + 1 audio:
```
Use image 1 for the art style and color palette;
replicate video 1's camera movement and pacing;
use audio 1 as background music throughout.

Scene: a young rider weaving through the streets of Tokyo after rain,
neon lights reflecting on the wet asphalt,
the camera pushing forward from behind the rider into a side close-up,
pacing rising and falling with the music.
```
Key points:
- Use "image 1 / video 1 / audio 1" — the important thing is telling the model which array index you mean.
- References must follow array order. If you put two images in image_urls, "image 1" maps to image_urls[0] and "image 2" to image_urls[1]. Scrambling the order confuses the model.
- Assign one primary role per asset. Trying to make a single image "the first frame, the character, and the style all at once" is a recipe for confusion.
- Be specific in the scene description. "Shoot something cool" is worthless.
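The two-part pattern is mechanical enough to generate in code. A minimal sketch (the `build_prompt` helper is hypothetical, not part of any SDK) that keeps the numbered references aligned with array order by construction:

```python
# Hypothetical helper that assembles the two-part prompt: role
# assignments first, scene description second. Roles are passed in the
# same order as the URL arrays, so "image 1" always means image_urls[0].

def build_prompt(image_roles=(), video_roles=(), audio_roles=(), scene=""):
    parts = []
    for i, role in enumerate(image_roles, start=1):
        parts.append(f"image {i} {role}")
    for i, role in enumerate(video_roles, start=1):
        parts.append(f"video {i} {role}")
    for i, role in enumerate(audio_roles, start=1):
        parts.append(f"audio {i} {role}")
    assignment = "; ".join(parts) + "."
    return f"{assignment} Scene: {scene}"

prompt = build_prompt(
    image_roles=["provides the character's appearance"],
    video_roles=["drives the camera movement and pacing"],
    audio_roles=["is the background music throughout"],
    scene="a young rider weaving through rain-soaked Tokyo streets.",
)
# Assignment sentence first, then the scene description, with indices
# matching the positions in image_urls / video_urls / audio_urls.
```

Because the roles and the URL arrays come from the same ordered inputs, the "array order mismatch" mistake (see section 4) cannot occur.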
3. Ten Ready-to-Copy Prompt Templates
Each template below reflects real API behavior. Substitute your own asset URLs and key details.
Template 1: Single image as first frame driver (simplest)
Use for: static image + light motion
```json
{
  "model": "seedance-2.0-image-to-video",
  "prompt": "Use the provided image as the first frame. The camera slowly pushes in, the person lifts her head and smiles, wind moves her hair gently.",
  "image_urls": ["https://example.com/portrait.jpg"],
  "duration": 5,
  "quality": "720p"
}
```
Tip: Single-image driving is better served by seedance-2.0-image-to-video than by reference-to-video — it has dedicated optimization for first-frame behavior.
Template 2: First-last-frame transition
```json
{
  "model": "seedance-2.0-image-to-video",
  "prompt": "Smoothly transition from the first image to the second. Use camera panning and lighting changes to bridge the two scenes.",
  "image_urls": [
    "https://example.com/sunrise.jpg",
    "https://example.com/sunset.jpg"
  ],
  "duration": 6,
  "quality": "720p",
  "aspect_ratio": "16:9"
}
```
Template 3: Art style transfer (multi-image reference)
```json
{
  "model": "seedance-2.0-reference-to-video",
  "prompt": "The overall art style references the color palette, lighting, and texture of the 3 provided images. Scene: a small-town summer market at dusk, crowds moving through warm amber light.",
  "image_urls": [
    "https://example.com/style-1.jpg",
    "https://example.com/style-2.jpg",
    "https://example.com/style-3.jpg"
  ],
  "duration": 8,
  "quality": "720p",
  "aspect_ratio": "16:9"
}
```
Template 4: Character consistency
```json
{
  "model": "seedance-2.0-reference-to-video",
  "prompt": "The female character's appearance stays consistent with image 1. Scene: she walks into a vintage cafe, orders a latte, sits by the window, and opens a book.",
  "image_urls": ["https://example.com/character-ref.jpg"],
  "duration": 8,
  "quality": "720p",
  "aspect_ratio": "16:9"
}
```
Tip: Realistic human faces are not supported. Use virtual characters or illustrated styles.
Template 5: Camera replication (video reference)
```json
{
  "model": "seedance-2.0-reference-to-video",
  "prompt": "Replicate video 1's orbital camera movement and velocity curve. Subject replaced with a classical sculpture in a museum hall at dusk.",
  "video_urls": ["https://example.com/orbit-shot.mp4"],
  "duration": 8,
  "quality": "720p",
  "aspect_ratio": "16:9"
}
```
Template 6: Music-driven pacing (audio reference)
```json
{
  "model": "seedance-2.0-reference-to-video",
  "prompt": "Use audio 1 as the soundtrack for the entire video; shot changes sync with the beat. Scene: fast cuts of city night life — neon, raindrops, silhouettes, cab headlights flashing past.",
  "image_urls": ["https://example.com/city-mood.jpg"],
  "audio_urls": ["https://example.com/synthwave.mp3"],
  "duration": 10,
  "quality": "720p"
}
```
Note: Audio-only is not allowed — you must include at least 1 image or 1 video as a visual anchor.
Template 7: Full three-modal composition
```json
{
  "model": "seedance-2.0-reference-to-video",
  "prompt": "The character's appearance references image 1; replicate video 1's first-person perspective and camera pacing; use audio 1 as background music throughout. Scene: a young rider weaving through rain-soaked Tokyo streets, neon reflections on the asphalt.",
  "image_urls": ["https://example.com/rider.jpg"],
  "video_urls": ["https://example.com/pov.mp4"],
  "audio_urls": ["https://example.com/bgm.mp3"],
  "duration": 10,
  "quality": "720p",
  "aspect_ratio": "16:9"
}
```
Template 8: Product ad (preserving product appearance)
```json
{
  "model": "seedance-2.0-reference-to-video",
  "prompt": "The sneaker's appearance stays identical to image 1 — upper color, laces, and logo all match. Scene: the shoe rotates slowly on a transparent acrylic pedestal, soft studio lighting, gray gradient background.",
  "image_urls": ["https://example.com/sneaker.jpg"],
  "duration": 6,
  "quality": "720p",
  "aspect_ratio": "1:1"
}
```
Template 9: Pure text (no reference assets)
reference-to-video can also run with no references — but in that case it's cheaper and simpler to use seedance-2.0-text-to-video directly:
```json
{
  "model": "seedance-2.0-text-to-video",
  "prompt": "A macro lens focuses on a green glass frog on a leaf. The focus gradually shifts from its smooth skin to its completely transparent abdomen, where a bright red heart is beating powerfully and rhythmically.",
  "duration": 8,
  "quality": "720p",
  "aspect_ratio": "16:9"
}
```
Template 10: Dialogue generation (put speech in double quotes)
Seedance 2.0 recognizes content inside straight double quotes and runs dedicated speech synthesis:
```json
{
  "model": "seedance-2.0-text-to-video",
  "prompt": "She stops, turns to the boy, and says: \"You finally understood.\" Close-up on her face, expression shifting from determination to warmth.",
  "duration": 6,
  "quality": "720p",
  "generate_audio": true
}
```
4. Common Mistakes & Debugging
Mistake 1: Using @Image1-style pseudo-tag syntax
Symptom: The model completely ignores your references and outputs content unrelated to your assets.
Cause: The API has no such syntax. @Image1 is treated as an ordinary string in the prompt and is not parsed as a reference.
Fix: Switch to natural language — "image 1", "video 1", "audio 1".
Mistake 2: Making a single asset play multiple roles
❌ Image 1 is the first frame, the character reference, AND the style reference
✅ Image 1 opens the scene; image 2 provides the character reference
Mistake 3: Array order doesn't match prompt references
If your prompt says "video 1" then "video 2", then video_urls[0] must be what you think of as "video 1". Reordering the array will shuffle the references.
Mistake 4: Sending only audio_urls with no visual anchor
This returns invalid_request. Always include at least 1 image or 1 video.
Mistake 5: Using quality: "1080p"
Seedance 2.0 API does not support 1080p. Only 480p and 720p are valid.
Mistake 6: Using the non-existent bare model ID "model": "seedance-2.0"
You must use a full model ID like seedance-2.0-reference-to-video. See Models Overview for the full matrix.
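Several of these mistakes are machine-checkable before you submit. A hedged pre-flight lint (again a sketch, not an official tool; it covers the pseudo-tag, array-order, and bare-model-ID mistakes) could look like:

```python
import re

# Hypothetical pre-flight lint for the mistakes listed above.
# A sketch only, not part of any official SDK.

def lint_request(body: dict) -> list[str]:
    warnings = []
    prompt = body.get("prompt", "")

    # Mistake 1: @Image1-style pseudo-tags are just plain text to the API.
    if re.search(r"@(Image|Video|Audio)\d", prompt, re.IGNORECASE):
        warnings.append("pseudo-tag syntax (@Image1 etc.) is not parsed; "
                        "use natural language like 'image 1'")

    # Mistake 3: the prompt references an index the arrays don't have.
    for kind, key in (("image", "image_urls"),
                      ("video", "video_urls"),
                      ("audio", "audio_urls")):
        n = len(body.get(key, []))
        for m in re.finditer(rf"\b{kind} (\d+)", prompt, re.IGNORECASE):
            if int(m.group(1)) > n:
                warnings.append(f"prompt mentions '{kind} {m.group(1)}' "
                                f"but {key} has only {n}")

    # Mistake 6: a bare "seedance-2.0" is not a valid model ID.
    if body.get("model") == "seedance-2.0":
        warnings.append("use a full model ID such as "
                        "seedance-2.0-reference-to-video")
    return warnings

warnings = lint_request({
    "model": "seedance-2.0",
    "prompt": "Use @Image1 for style and video 2 for pacing.",
    "video_urls": ["https://example.com/a.mp4"],
})
# Flags the pseudo-tag, the missing video 2, and the bare model ID.
```

Note that the index check is also why keeping role assignments in array order matters: the lint can only verify that "video 2" exists, not that it is the clip you meant.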
5. When to Use reference-to-video (And When Not To)
Use reference-to-video when:
- You need to reference more than 2 images (image-to-video caps at 2)
- You need a video as a cinematography reference
- You need audio to drive visual pacing
- You need simultaneous style transfer + character consistency
Don't use reference-to-video when:
- You only have a text prompt → use text-to-video, it's cheaper
- You have 1 or 2 images and want them to "come alive" → use image-to-video, which has dedicated optimization for first-frame behavior
- You need to iterate quickly over many candidates → use the matching Fast model
6. Next Steps
- Reference-to-Video API full reference — All parameters, limits, response schema
- Models Overview — Decision guide across the 6 Seedance 2.0 models
- Quick Start — Run your first request in 5 minutes
- Get a free API key — 30-second signup
If you come across any other tutorial mentioning @Image1 / @Video1 / @Audio1-style tag syntax, ignore it — that's not real Seedance 2.0 API behavior. The official documentation is the source of truth.