Dialogue-driven scene design with emotion direction

Seedance 2.0 Short Drama & Dialogue Prompts

Short drama prompts combine visual staging with spoken dialogue, character emotion tags, and sound design. Seedance 2.0's native lip-sync (8+ languages) makes it possible to generate dialogue scenes without post-production audio alignment. This template covers the dialogue prompt structure, emotion tagging, two-person staging, and the dramatic reveal pattern.

Daily update status: Coming soon

Last updated: 2026-03-27

Current status

Templates, examples, and future media evidence belong here, not inside the general learning guides.

Media evidence

Image evidence: coming soon

Video evidence: coming soon

Dialogue prompt structure

A dialogue prompt has two layers: a visual layer and a dialogue layer. The visual layer describes framing, camera, lighting, and action as in any video prompt. The dialogue layer adds spoken lines in quotes, each with a speaker tag and an emotion modifier. Format each line as: [Character A, emotion]: "Dialogue line." Place each dialogue line in the timestamped segment where it is spoken. The model reads the dialogue text to drive lip-sync timing and facial expression.
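The two-layer format can be sketched as a small text helper. This is only an illustration of the prompt layout described above, not part of any Seedance tooling: `DialogueLine`, `format_dialogue`, and `format_segment` are hypothetical names, and the output is plain prompt text that you would paste into the model yourself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DialogueLine:
    """One spoken line (hypothetical helper type, not a Seedance API)."""
    speaker: str
    emotion: str
    text: str


def format_dialogue(line: DialogueLine) -> str:
    """Render the dialogue layer as [Speaker, emotion]: "Text"."""
    return f'[{line.speaker}, {line.emotion}]: "{line.text}"'


def format_segment(start: int, end: int, visuals: str,
                   line: Optional[DialogueLine] = None) -> str:
    """Render one timestamped segment: visual layer first, dialogue after."""
    parts = [visuals.strip()]
    if line is not None:
        parts.append(format_dialogue(line))
    return f"{start}-{end}s: " + " ".join(parts)


if __name__ == "__main__":
    man = DialogueLine("Man", "struggling to hold composure",
                       "I think we both know this isn't working anymore.")
    print(format_segment(3, 6, "Over-the-shoulder from behind the woman.", man))
```

Keeping the visual text and the dialogue line as separate inputs makes it easy to reuse the same staging with a different line or emotion.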

Emotion tags and character direction

Emotion tags tell the model how to perform a line: not just what is said, but how. Common tags: [whisper], [angry outburst], [tearful], [sarcastic], [nervous laugh], [deadpan], [joyful]. Place the tag before the dialogue line; when the speaker is named, the emotion goes inside the same brackets, as in [Character A, tearful]. You can also add physical direction: [Character A, tearful, gripping the table edge]: "I can't do this anymore." The model uses these combined cues to generate matching facial micro-expressions, posture shifts, and gesture timing.
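A minimal sketch of composing a speaker tag from an emotion plus optional physical direction, assuming the bracketed, comma-separated layout shown above. `speaker_tag` and `directed_line` are hypothetical helper names introduced for illustration.

```python
def speaker_tag(speaker: str, emotion: str, *directions: str) -> str:
    """Join speaker, emotion, and any physical directions into one bracket."""
    cues = [speaker, emotion, *directions]
    cleaned = [cue.strip() for cue in cues if cue and cue.strip()]
    return "[" + ", ".join(cleaned) + "]"


def directed_line(speaker: str, emotion: str, text: str,
                  *directions: str) -> str:
    """Render a full line: the tag, a colon, then the quoted dialogue."""
    return f'{speaker_tag(speaker, emotion, *directions)}: "{text}"'


if __name__ == "__main__":
    print(directed_line("Character A", "tearful",
                        "I can't do this anymore.",
                        "gripping the table edge"))
```

Empty or whitespace-only directions are dropped, so the same call works with or without physical direction.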

Two-person staging and shot planning

Two-person dialogue benefits from classic cinematic staging: establish with a medium two-shot showing both characters, then cut to over-the-shoulder (OTS) shots for each speaker, and use reaction close-ups for emotional beats. Describe the staging pattern in your prompt: '0-2s: Medium two-shot establishing both characters at the table. 2-5s: OTS from behind Character B focusing on Character A speaking. 5-7s: Reverse OTS on Character B's reaction.' This gives the model clear spatial and temporal structure.
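The staging pattern above is a list of back-to-back time ranges, so it can help to check that the ranges leave no gaps or overlaps before writing them into a prompt. The sketch below is a hypothetical helper (`render_shot_plan` is not a Seedance function) that validates a plan and renders it in the `0-2s: ...` style used on this page.

```python
from typing import List, Tuple

# (start_seconds, end_seconds, shot description)
Shot = Tuple[float, float, str]


def render_shot_plan(shots: List[Shot], total: float) -> str:
    """Check that shots tile 0..total with no gaps, then render one per line."""
    if not shots:
        raise ValueError("shot plan is empty")
    expected_start = 0.0
    lines = []
    for start, end, description in shots:
        if start != expected_start:
            raise ValueError(
                f"shot starts at {start}s but previous shot ended at {expected_start}s")
        if end <= start:
            raise ValueError(f"shot {start}-{end}s has no duration")
        lines.append(f"{start:g}-{end:g}s: {description}")
        expected_start = end
    if expected_start != total:
        raise ValueError(f"plan ends at {expected_start}s, expected {total}s")
    return "\n".join(lines)


if __name__ == "__main__":
    plan = [
        (0, 2, "Medium two-shot establishing both characters at the table."),
        (2, 5, "OTS from behind Character B focusing on Character A speaking."),
        (5, 7, "Reverse OTS on Character B's reaction."),
    ]
    print(render_shot_plan(plan, total=7))
```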

Sound design for drama

Ambient mood sets emotional context before dialogue begins: rain on windows for melancholy, cafe chatter for casual warmth, total silence for tension. Dramatic stings, such as a single piano note or a low bass hum, punctuate reveal moments. Strategic silence (cutting all ambient sound) right before a key line amplifies its impact. Specify these choices on an SFX: line, giving the timestamp at which each cue occurs.
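As a rough illustration, the sound cues can be kept as (time, description) pairs and rendered into a single SFX: line in time order. `sfx_line` is a hypothetical helper; the exact wording of the rendered line is an assumption modeled on the SFX lines in the examples on this page.

```python
from typing import List, Tuple


def sfx_line(cues: List[Tuple[float, str]]) -> str:
    """Render sound cues, sorted by time, as one 'SFX: ...' prompt line."""
    if not cues:
        raise ValueError("at least one sound cue is required")
    ordered = sorted(cues, key=lambda cue: cue[0])
    parts = [f"{description} at {time_s:g}s" for time_s, description in ordered]
    return "SFX: " + ", ".join(parts) + "."


if __name__ == "__main__":
    print(sfx_line([
        (8, "cafe ambient returns"),
        (3, "cafe ambient drops to silence"),
    ]))
```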

Input / Output examples

Breakup scene with emotional dialogue

A 10-second two-person dialogue scene using emotion tags, OTS staging, and ambient sound design.

10-second short drama, naturalistic cinematic look, warm desaturated palette, shallow depth of field.
0-3s: Medium two-shot of a man and woman sitting across a small cafe table, late afternoon light through the window, ambient cafe sounds softly present, tense silence between them.
3-6s: Over-the-shoulder from behind the woman, focusing on the man's face. [Man, struggling to hold composure]: "I think we both know this isn't working anymore." Slight tremor in his jaw, eyes avoiding direct contact.
6-8s: Reverse OTS on the woman's reaction, her eyes glisten but she does not cry. [Woman, quiet resolve]: "I was waiting for you to say it first." She picks up her coffee cup with steady hands.
8-10s: Wide two-shot pulling back slowly, both characters sit in silence, cafe ambient sound returns louder than before, the space between them feels larger.
SFX: Cafe ambient drops to silence at 3s, returns at 8s.
Negative: any text, subtitles, logos or watermarks

A naturalistic breakup scene with accurate lip-sync on both dialogue lines, matching facial emotions (suppressed pain vs quiet resolve), and ambient sound design that drops to silence during the exchange and returns after.

Evidence: multimodal guide

Late-night confession monologue

An 8-second single-character monologue shot with intimate framing and minimal staging.

8-second short drama, intimate handheld aesthetic, cool blue-orange nighttime palette.
0-2s: Close-up of a young woman sitting on apartment stairs at night, phone screen glow illuminating her face from below, shallow depth of field, city ambient hum in background.
2-6s: Slow push-in to extreme close-up as she speaks into the phone. [Woman, vulnerable, half-smiling through tears]: "I know it's 2 AM and I shouldn't be calling, but I just needed to hear your voice. Even if you don't pick up." Natural breath pauses between phrases, subtle lip quiver.
6-8s: She lowers the phone and looks up at the night sky, camera holds on her profile, a distant siren fades, her expression settles into quiet acceptance.
SFX: City ambient contracts to near-silence during dialogue, distant siren at 7s.
Negative: any text, subtitles, logos or watermarks

An intimate single-character scene with precise lip-sync to the monologue, matching emotional progression from vulnerability to acceptance, and environmental sound design that isolates the character's voice.

Comedic workplace misunderstanding

A 10-second two-person comedy dialogue with timing-driven humor and reaction shots.

10-second short comedy, bright sitcom lighting, slightly warm color grade, medium depth of field.
0-3s: Medium shot of a modern office, Character A stands holding a folder, Character B sits at desk looking up expectantly. Bright fluorescent office lighting, ambient keyboard and printer sounds.
3-6s: Medium close-up on Character A. [Character A, overly confident]: "I finished the entire Henderson report. Every. Single. Page." Punctuates with a proud folder tap on the desk.
6-8s: Cut to Character B examining the folder. Beat of silence. [Character B, deadpan]: "This is the lunch menu from the Italian place downstairs." Holds up a page with an incredulous look.
8-10s: Reverse to Character A, frozen smile slowly collapsing. No dialogue, just the slow realization. A phone rings in the background breaking the silence.
SFX: Comedic beat of silence at 6-7s, phone ring at 9s.
Negative: any text, subtitles, logos or watermarks

A comedy scene with precise dialogue timing where humor comes from the deadpan delivery and reaction shot, with accurate lip-sync and a well-timed beat of silence before the punchline.

Drama trailer teaser with hook ending

A 10-second trailer-style teaser that hooks viewers with a mystery cliff-hanger, designed for social media promotion.

10-second drama trailer, cinematic thriller aesthetic, teal-orange color grade, anamorphic lens flare.
0-2s: Rapid montage — three quick cuts: a hand slamming a door, eyes widening in shock, a phone screen lighting up in the dark. Each cut lasts 0.6s, sharp sound design stings between cuts.
2-5s: Slow-motion medium shot of a woman walking down an empty hospital corridor at night, flickering fluorescent lights, her heels echoing, camera tracking backwards in front of her, tension building.
5-8s: She stops at a door, reaches for the handle. Reverse angle close-up of her face — fear mixed with determination. [Woman, tense whisper]: "You said no one would get hurt."
8-10s: Door swings open to blinding white light, camera pushes into the light, screen whites out, title card hold frame with clean center space.
SFX: Staccato stings for montage, corridor echo, silence before dialogue, bass drop on door open.
Negative: any text, subtitles, logos or watermarks

A trailer-style teaser with rapid montage opening, tension-building tracking shot, one dialogue hook line with lip-sync, and a white-out cliff-hanger ending suitable for title card overlay.
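The four example prompts above share one layout: a style header, timestamped segments, an SFX line, and a Negative line. A minimal sketch of assembling that layout from parts follows; `assemble_prompt` is a hypothetical helper, and the default negative text is simply the one used in the examples on this page.

```python
from typing import List

DEFAULT_NEGATIVE = "any text, subtitles, logos or watermarks"


def assemble_prompt(header: str, segments: List[str], sfx: str,
                    negative: str = DEFAULT_NEGATIVE) -> str:
    """Join header, timestamped segments, SFX and Negative lines in order."""
    lines = [header.rstrip(".") + "."]  # header ends with exactly one period
    lines.extend(segments)
    lines.append(f"SFX: {sfx}")
    lines.append(f"Negative: {negative}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(assemble_prompt(
        "8-second short drama, intimate handheld aesthetic",
        ["0-2s: Close-up of a young woman on apartment stairs at night.",
         "2-8s: Slow push-in as she speaks into the phone."],
        "City ambient contracts to near-silence during dialogue.",
    ))
```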

Frequently asked questions

How many dialogue lines can I include in a single generation?

Two to four lines per 10-second generation works reliably. Each line needs enough screen time for lip-sync and reaction. More than four lines in 10 seconds forces rushed delivery that looks unnatural. For longer conversations, generate in segments and join them using the video extension workflow.
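A quick way to apply this guideline is to count the tagged dialogue lines in a prompt before generating. The sketch below is a hypothetical check, not a Seedance feature: it looks for the `[Speaker, emotion]: "..."` pattern, and scaling the four-line limit linearly for durations other than 10 seconds is an assumption rather than something stated above.

```python
import re

# Matches a bracketed tag followed by a colon and an opening quote,
# e.g. [Woman, quiet resolve]: "
DIALOGUE_PATTERN = re.compile(r'\[[^\[\]]+\]:\s*"')


def count_dialogue_lines(prompt: str) -> int:
    """Count tagged dialogue lines in a prompt."""
    return len(DIALOGUE_PATTERN.findall(prompt))


def too_dense(prompt: str, duration_s: float = 10.0,
              max_per_10s: int = 4) -> bool:
    """Return True if the prompt exceeds the line budget for its duration.

    Linear scaling of the budget with duration is an assumption.
    """
    budget = max_per_10s * duration_s / 10.0
    return count_dialogue_lines(prompt) > budget


if __name__ == "__main__":
    prompt = ('3-6s: [Man, tense]: "This isn\'t working." '
              '6-8s: [Woman, quiet resolve]: "I was waiting for you to say it."')
    print(count_dialogue_lines(prompt), too_dense(prompt))
```

If the check fails, split the conversation into separate generations as described above rather than shortening screen time per line.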

Does Seedance 2.0 lip-sync work for non-English dialogue?

Yes. Seedance 2.0 supports native lip-sync across 8+ languages including English, Chinese, Japanese, Korean, Spanish, French, German, and Portuguese. Write the dialogue in the target language and the model matches mouth shapes to that language's phonemes.

How do I handle scenes where characters talk over each other?

Overlapping dialogue is difficult for current-generation models. Instead, stage the overlap as one character starting to speak while the other visibly reacts: describe the listener's mouth opening to interrupt. Generate the visual overlap, then layer the actual overlapping audio tracks in post-production for the cleanest result.
