Seedance 2.0: prompt patterns and six use cases for AI video
A creator's guide to Seedance 2.0 — multi-reference inputs, native 2K output, multi-shot timelines, and six battle-tested prompts with results inside OmniArt.

Seedance 2.0 is the model creators reach for when the brief reads like a director's brief. ByteDance shipped it in February 2026 as a unified multimodal diffusion Transformer that accepts text, up to nine images, up to three reference videos, and up to three audio files in a single prompt, each addressable with tags such as @image1, @video1, and @audio1. That makes it a strong option for keeping a character consistent across a multi-shot timeline. This guide covers the prompt grammar the model responds to and six tested use cases, each with its prompt and result.
What Seedance 2.0 is
Seedance 2.0 generates 4–15-second clips at up to 2K with native stereo audio. The headline isn't resolution — it's the multi-reference architecture and the timeline-style multi-shot prompting.
| Spec | Value |
|---|---|
| Architecture | Unified multimodal diffusion Transformer |
| Max resolution | 2K |
| Duration | 4–15 seconds |
| Image inputs | up to 9 (@image1–@image9) |
| Video inputs | up to 3 (@video1–@video3) |
| Audio inputs | up to 3 (@audio1–@audio3) |
| Native audio output | Yes — dialogue, SFX, ambience, music |
| Lip-sync languages | 7+ |
| Modes | Standard, Fast |
Why the multi-reference system matters
Most video models accept one reference, or none. Seedance 2.0 accepts a stack and lets you bind each reference to a role inside the prompt. Use @image1 for the character's face, @image2 for the costume, @image3 for the location, @video1 for the camera move you want, @audio1 for the music bed. The output respects each as a discrete instruction instead of averaging them into noise.
That's the practical reason character likeness holds across shots: the same @image reference goes into every shot in the timeline, and the model uses it as the identity anchor rather than re-inferring the character from the prompt each time.
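The role binding described above can be sketched as a small helper that turns a tag-to-role mapping into explicit prompt sentences. This is a minimal sketch, not part of any Seedance or OmniArt API: `bind_references` and `LIMITS` are hypothetical names, and only the tag syntax and the per-type limits (9 images, 3 videos, 3 audio files) come from the spec table.

```python
# Hypothetical helper. Only the @tag syntax and the per-type limits are taken
# from the spec table; the function itself is illustrative.
LIMITS = {"image": 9, "video": 3, "audio": 3}


def bind_references(roles):
    """Turn {tag: role} into one explicit sentence per reference.

    Raises ValueError if a tag is malformed or outside the documented limits.
    """
    sentences = []
    for tag, role in roles.items():
        kind = tag.rstrip("0123456789")   # "image1" -> "image"
        index = tag[len(kind):]           # "image1" -> "1"
        if kind not in LIMITS or not index.isdigit():
            raise ValueError(f"unrecognized tag: {tag}")
        if not 1 <= int(index) <= LIMITS[kind]:
            raise ValueError(f"{tag} is outside 1..{LIMITS[kind]}")
        sentences.append(f"Use @{tag} for {role}.")
    return " ".join(sentences)


bindings = bind_references({
    "image1": "the character's face",
    "image2": "the costume",
    "image3": "the location",
    "video1": "the camera move",
    "audio1": "the music bed",
})
print(bindings)
```

Prepending the returned sentences to the scene description keeps every reference tied to a named role, rather than leaving the model to guess which upload is which.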
Prompt structure that works
Seedance 2.0 rewards a six-part structure.
- Subject — who or what is on screen
- Action / movement — what they do
- Setting / environment — where it happens
- Visual style — film references, palette, era
- Camera direction — specific cinematography terms
- Lighting — direction, quality, time of day
A good template prompt:
"Subject (with @image1 reference if applicable). Action. Setting. Visual style. Camera direction (specific cinematography term). Lighting detail."
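To make the six-part structure harder to skip, it can be encoded as a small data class that refuses to render when a part is missing. This is a sketch under assumptions: `ShotPrompt` and its field names are hypothetical, and the example values are adapted from the product commercial prompt later in this guide.

```python
from dataclasses import dataclass, fields


@dataclass
class ShotPrompt:
    """Hypothetical container for the six prompt parts, in template order."""
    subject: str
    action: str
    setting: str
    style: str
    camera: str
    lighting: str

    def render(self) -> str:
        sentences = []
        for field in fields(self):
            text = getattr(self, field.name).strip().rstrip(".")
            if not text:
                raise ValueError(f"missing part: {field.name}")
            # Capitalize the first letter and close each part with a period.
            sentences.append(text[0].upper() + text[1:] + ".")
        return " ".join(sentences)


prompt = ShotPrompt(
    subject="a luxury perfume bottle",
    action="rotates slowly as golden liquid catches the light",
    setting="black marble surface, gold dust floating in the air",
    style="high-end commercial photography style",
    camera="macro close-up, slow 360-degree orbit",
    lighting="studio lighting with warm rim light",
)
rendered = prompt.render()
print(rendered)
```

Keeping the parts as separate fields also makes it easy to vary one part at a time (for example only the camera direction) between test generations.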
Multi-shot timeline notation
For multi-shot work, write the timeline directly into the prompt.
0–4s: wide establishing shot, character (in @image1) walks into the scene
4–8s: medium tracking shot follows the character (in @image1) across the room
8–12s: 360-degree orbit around the table the character (in @image1) reaches
Pin the same @image1 across every segment. Likeness stays consistent through the cut.
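The timeline notation is regular enough to generate from a shot list. The sketch below is illustrative only: `build_timeline` is a hypothetical helper, and the checks it applies (total length within the 4–15 second range from the spec table, the identity tag present in every segment) are this guide's recommendations expressed as code, not rules enforced by the model.

```python
def build_timeline(shots, identity_tag="@image1"):
    """Render (duration_seconds, description) pairs as 'start–end s' lines.

    Hypothetical helper: checks the total against the 4-15 s clip range and
    requires the identity tag in every segment description.
    """
    total = sum(duration for duration, _ in shots)
    if not 4 <= total <= 15:
        raise ValueError(f"total duration {total}s is outside 4-15s")

    lines, start = [], 0
    for duration, description in shots:
        if identity_tag not in description:
            raise ValueError(f"segment starting at {start}s does not pin {identity_tag}")
        end = start + duration
        lines.append(f"{start}–{end}s: {description}")
        start = end
    return "\n".join(lines)


timeline = build_timeline([
    (4, "wide establishing shot, character (in @image1) walks into the scene"),
    (4, "medium tracking shot follows the character (in @image1) across the room"),
    (4, "360-degree orbit around the table the character (in @image1) reaches"),
])
print(timeline)
```

Working from durations rather than hand-typed timestamps avoids gaps or overlaps between segments when a shot is lengthened or removed.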
Reference tagging discipline
A short rulebook that pays off:
- Use @image1, @image2 for face photos and product shots.
- Use @video1 for the camera move you want copied.
- Use @audio1 when the audio bed matters more than the model's default.
- Reference each tag explicitly in the text. Don't rely on the model to infer which reference is which role.
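The last rule in particular is easy to check mechanically before spending credits. The following is a minimal sketch, assuming the uploaded references are known as a list of tag strings; `check_tags` and `TAG_PATTERN` are hypothetical names, not part of any platform API.

```python
import re

# Matches the tag forms used in this guide: @image1, @video2, @audio3, ...
TAG_PATTERN = re.compile(r"@(?:image|video|audio)\d+")


def check_tags(prompt, uploaded):
    """Compare tags mentioned in the prompt text with the uploaded references."""
    used = set(TAG_PATTERN.findall(prompt))
    uploaded = set(uploaded)
    return {
        "unreferenced": sorted(uploaded - used),  # uploaded but never mentioned
        "unknown": sorted(used - uploaded),       # mentioned but never uploaded
    }


report = check_tags(
    "The woman in @image1 crosses the room; copy the camera move from @video1.",
    ["@image1", "@image2", "@video1"],
)
print(report)  # {'unreferenced': ['@image2'], 'unknown': []}
```

An entry under `unreferenced` means an upload has no stated role in the text; an entry under `unknown` means the prompt names a tag with nothing behind it.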
Six tested use cases with prompts
Each prompt below is one we've run on Seedance 2.0. The Result line under each prompt is what we got, with generation time measured on Standard 720p.
1. Cinematic film scene
"A retired detective in a long dark coat walks through a rain-soaked alley at night. Neon signs reflect red and blue on the wet cobblestones. He pauses, lights a cigarette, and glances over his shoulder. Slow push-in from wide shot to medium close-up. Film noir style, anamorphic lens flare, teal-orange color grading, film grain."
Result. Smooth camera push-in. Convincing rain reflections, natural coat movement. Cigarette lighting renders without hand distortion. Rain and city ambient audio generated in sync. ~70 seconds.
2. Product commercial
"A luxury perfume bottle rotates slowly on a black marble surface. Golden liquid catches the light as it turns. Soft particles of gold dust float in the air around it. Macro close-up, slow 360-degree orbit camera. Studio lighting with warm rim light, high-end commercial photography style."
Result. Glass refraction and liquid behavior accurate. Particle drift natural. Smooth full rotation, correct light angles, marble texture visible. ~65 seconds.
3. Music video
"A female singer in a flowing red silk dress performs on a rooftop at sunset. City skyline stretches behind her. Wind blows her hair and dress dramatically. She sings with emotional intensity, arms spread wide. Dynamic tracking shot circling around her. Golden hour backlighting, lens flare, vibrant warm tones."
Result. Realistic dress physics. Fluid tracking orbit. Face stays consistent through the rotation. Hair movement matches wind direction. Generated ambient musical track. ~75 seconds.
4. Character portrait in motion
"An elderly Japanese craftsman in a traditional wooden workshop, morning light streaming through paper screens. He slowly lifts a hand-forged ceramic tea bowl, examining it with quiet pride. His weathered hands rotate the bowl gently. Close-up of his hands, then slow tilt up to reveal his face. Wabi-sabi aesthetic, warm natural light, documentary portrait quality."
Result. Correct finger count. Natural joint movement. Smooth tilt from hands to face. Realistic light through screens. Faint workshop ambient sounds. Realistic skin texture. ~80 seconds.
5. Nature and landscape
"Aerial drone shot gliding over a misty mountain valley at sunrise. Layers of fog roll between emerald green peaks. A winding river reflects the golden morning light below. Eagles soar through the frame at eye level. Smooth forward tracking with slight descent. Epic landscape, volumetric fog, golden hour lighting."
Result. Independent fog layers create convincing depth. River reflections update with camera position. Strong palette balance. Volumetric fog renders cleanly. Wind and bird call audio. ~55 seconds — the fastest of the six.
6. Anime and fantasy
"An anime warrior princess stands atop a cliff overlooking a burning medieval city at night. Her long silver hair and crimson cape billow in the wind. She draws a glowing blue katana, electricity crackling along the blade. Cherry blossom petals swirl around her. Dynamic low-angle shot with slow push-in. Cel-shading style, vibrant neon accents, dramatic speed lines."
Result. Consistent cel-shading throughout. Fluid katana draw. Electricity effect integrates naturally. Independently moving cherry blossoms. Firelight interaction with cape. Dramatic sword swoosh audio. ~70 seconds.
Common errors and fixes
| Problem | Cause | Fix |
|---|---|---|
| Prompt rejected | Face keywords or ambiguous phrasing | Remove explicit face descriptions; use @image references instead |
| Black frames | Overly complex prompt | Cut to one action per 4–5 seconds; lower resolution for the test |
| Character face changes between shots | No consistent reference | Pin the same @image1 in every shot of the timeline |
| Audio out of sync | Joint diffusion mismatch | Regenerate with audio disabled, add the bed separately |
| Hand or finger distortion | Complex hand interaction without reference | Add a reference image of the desired hand pose |
| "AI-generated" texture | Over-reliance on style keywords | Add physical details — materials, lighting, lens type |
Seedance 2.0 vs Seedance 1.0
If you've used 1.0, the gap to 2.0 is wider than the version number suggests.
| Feature | 1.0 | 2.0 |
|---|---|---|
| Architecture | Separate pipelines | Unified diffusion Transformer |
| Image input | 1 optional | up to 9, addressable via @tag |
| Video input | None | up to 3 |
| Audio input | None | up to 3 |
| Native audio output | No | Yes |
| Max resolution | 1080p | 2K |
| Duration | 5–10s | 4–15s |
| Multi-shot | Basic | Timeline storyboard with cross-shot consistency |
| Hand quality | Frequent artifacts | Noticeably improved |
| In-video editing | No | Yes — character / object swap |
| First-attempt usable | ~60% | 90%+ |
When to choose something else
Seedance 2.0 isn't the right tool for every brief.
| Need | Better choice |
|---|---|
| 4K at 60fps for broadcast | Veo 3 |
| Frame-level motion direction | Runway Gen-4.5 |
| Cheapest 720p social with audio | Grok Imagine |
| Fastest iteration loop | HappyHorse 1.0 |
| Heavy parameterized lens control | PixVerse V6 |
| Long single-take scene | Sora 2 |
Pricing on OmniArt
Seedance 2.0 is credit-priced inside the OmniArt video workspace. Standard 720p runs at 30 credits per second; Fast 720p at 20 credits per second. Ultra members get a 40% credit discount across both modes. As a rough check on the iteration math: a 5-second Standard 720p clip is 150 credits, a 5-second Fast 720p clip is 100.
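The iteration math above is simple enough to script when budgeting a batch of test clips. This is a sketch under stated assumptions: the per-second rates and the 40% Ultra discount come from the paragraph above, while `clip_cost` is a hypothetical helper, and how the platform rounds discounted totals is not stated in the source.

```python
# Credits per second at 720p, as quoted above.
RATES = {"standard": 30, "fast": 20}
ULTRA_DISCOUNT_PERCENT = 40


def clip_cost(seconds, mode="standard", ultra=False):
    """Estimate credits for one 720p clip.

    Assumption: the Ultra discount is applied to the clip total, unrounded.
    """
    if not 4 <= seconds <= 15:
        raise ValueError("Seedance 2.0 clips run 4-15 seconds")
    cost = RATES[mode] * seconds
    if ultra:
        cost = cost * (100 - ULTRA_DISCOUNT_PERCENT) / 100
    return cost


print(clip_cost(5))                        # Standard, 5 s
print(clip_cost(5, mode="fast"))           # Fast, 5 s
print(clip_cost(5, ultra=True))            # Standard with Ultra discount
print(clip_cost(5, mode="fast", ultra=True))
```

Under these assumptions the same 5-second clip costs an Ultra member 90 credits on Standard and 60 on Fast, so ten Fast test iterations cost as much as four undiscounted Standard clips.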
Warning
ByteDance has not published explicit commercial usage rights for Seedance 2.0 outputs as of this writing. For high-stakes commercial work, double-check the platform license terms before delivery.
Getting started on OmniArt
Seedance 2.0 sits inside the OmniArt video workspace next to PixVerse V6, BACH, Sora 2, Veo 3, Kling 3.0, HappyHorse 1.0, and Grok Imagine. Same credit balance, same reference upload, same prompt grammar.
Start with the cinematic film scene prompt above to feel out the multi-reference workflow, then move to the music-video brief once you want to test face consistency across motion.
If you're choosing between Seedance 2.0 and HappyHorse 1.0, the HappyHorse 1 vs Seedance 2 comparison walks through the trade-offs shot by shot. For longer narrative sequences, the BACH cinematographer guide is the stronger starting point.