Dialogue-Driven Scene Design and Emotion Direction

Seedance 2.0 Short Drama and Dialogue Prompts

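Short drama prompts combine visual staging with spoken dialogue, character emotion tags, and sound design. Seedance 2.0's native lip sync (8+ languages) makes it possible to generate dialogue scenes without aligning audio in post-production. This template covers dialogue prompt structure, emotion tags, two-character staging, and dramatic reveal patterns.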

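Daily update status: coming soon

Last updated: 2026-03-27

Current status

Templates, examples, and follow-up media proof should be collected here rather than scattered across broad tutorial guides.

Media proof

Image proof: coming soon

Video proof: coming soon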

Dialogue Prompt Structure

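A dialogue prompt has two layers: visual and dialogue. The visual layer describes composition, camera, lighting, and action, exactly as in a regular video prompt. The dialogue layer adds quoted lines, speaker labels, and emotion modifiers. Each line follows the format [Character A, emotion]: "Dialogue text." Place each dialogue line at its corresponding timestamp. The model reads the dialogue text to drive lip-sync timing and facial expression.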
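The line format above can be assembled programmatically when prompts are built from a script. The following is a minimal sketch, not an official Seedance API: `dialogue_line` and `timed_beat` are hypothetical helper names, and the output is just plain prompt text.

```python
def dialogue_line(speaker: str, emotion: str, text: str) -> str:
    """Format one dialogue line as [Speaker, emotion]: "text"."""
    return f'[{speaker}, {emotion}]: "{text}"'


def timed_beat(start_s: int, end_s: int, visual: str, line: str = "") -> str:
    """Join the time range, the visual description, and an optional dialogue line."""
    beat = f"{start_s}-{end_s}s: {visual}"
    return f"{beat} {line}" if line else beat


# Hypothetical usage: one beat with a visual layer and a dialogue layer.
line = dialogue_line(
    "Man",
    "struggling to hold composure",
    "I think we both know this isn't working anymore.",
)
print(timed_beat(3, 6, "Over-the-shoulder shot focusing on the man's face.", line))
```

Keeping the visual description and the dialogue line as separate arguments mirrors the two-layer structure, so either layer can be revised without touching the other.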

Emotion Tags and Character Direction

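Emotion tags tell the model how to deliver a line: not just what is said, but how it is said. Common tags include [whisper], [angry outburst], [tearful], [sarcastic], [nervous laugh], [deadpan], and [joyful]. Place the tag before the line. You can also add physical direction: [Character A, tearful, gripping the table edge]: "I can't do this anymore." The model uses these combined cues to generate matching facial micro-expressions, posture shifts, and gesture timing.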
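As an illustration of combining an emotion tag with physical direction, here is a small sketch. The names `COMMON_TAGS`, `is_common_tag`, and `directed_line` are hypothetical; the tag list simply copies the tags named above and is not an exhaustive or enforced vocabulary, since free-form emotions also appear in the examples.

```python
from typing import Optional

# Tags listed in this guide; free-form emotion descriptions are still allowed.
COMMON_TAGS = (
    "whisper", "angry outburst", "tearful", "sarcastic",
    "nervous laugh", "deadpan", "joyful",
)


def is_common_tag(tag: str) -> bool:
    """Return True if the tag is one of the commonly used tags above."""
    return tag.strip().lower() in COMMON_TAGS


def directed_line(speaker: str, tag: str, text: str,
                  physical: Optional[str] = None) -> str:
    """Build [Speaker, tag, physical direction]: "text"; physical is optional."""
    cues = [speaker, tag]
    if physical:
        cues.append(physical)
    label = ", ".join(cues)
    return f'[{label}]: "{text}"'


print(directed_line("Character A", "tearful", "I can't do this anymore.",
                    physical="gripping the table edge"))
```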

Two-Character Staging and Shot Planning

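Two-character dialogue scenes suit classic film staging: open with a medium two-shot that establishes where both characters are, cut to over-the-shoulder (OTS) shots of each speaker, and use reaction close-ups at emotional peaks. Describe the staging pattern in the prompt: "0-2s: Medium two-shot establishing both characters at the table. 2-5s: OTS from behind Character B, focusing on Character A speaking. 5-7s: Reverse OTS capturing Character B's reaction." This gives the model a clear spatial and temporal structure.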

Sound Design for Dramatic Scenes

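Ambient sound sets the emotional tone before any dialogue begins: rain outside the window conveys melancholy, cafe chatter creates warmth, and complete silence builds tension. Dramatic sound effects, such as a single piano note or a low drone, punctuate the moment of a reveal. Strategically cutting all ambient sound just before a key line amplifies its impact. Use an SFX tag to specify these audio choices at the corresponding timestamps.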
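One way to keep audio cues tied to their timestamps is to store them as (time, description) pairs and render the SFX line from that list. This is a minimal sketch with a hypothetical `sfx_line` helper; the "description at Ns" wording follows the SFX lines used in this guide's examples.

```python
from typing import List, Tuple


def sfx_line(events: List[Tuple[int, str]]) -> str:
    """Render timed audio cues as a single 'SFX: ...' prompt line, ordered by time."""
    if not events:
        raise ValueError("at least one audio cue is required")
    ordered = sorted(events, key=lambda event: event[0])
    cues = [f"{description} at {second}s" for second, description in ordered]
    return "SFX: " + ", ".join(cues) + "."


# Cues may be listed in any order; they are sorted by timestamp when rendered.
print(sfx_line([
    (8, "ambient returns"),
    (3, "cafe ambient drops to silence"),
]))
```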

Input / Output Examples

Breakup Scene with Emotional Dialogue

A 10-second two-character dialogue scene using emotion tags, over-the-shoulder staging, and ambient sound design.

10-second short drama, naturalistic cinematic look, warm desaturated palette, shallow depth of field.
0-3s: Medium two-shot of a man and woman sitting across a small cafe table, late afternoon light through the window, ambient cafe sounds softly present, tense silence between them.
3-6s: Over-the-shoulder from behind the woman, focusing on the man's face. [Man, struggling to hold composure]: "I think we both know this isn't working anymore." Slight tremor in his jaw, eyes avoiding direct contact.
6-8s: Reverse OTS on the woman's reaction, her eyes glisten but she does not cry. [Woman, quiet resolve]: "I was waiting for you to say it first." She picks up her coffee cup with steady hands.
8-10s: Wide two-shot pulling back slowly, both characters sit in silence, cafe ambient sound returns louder than before, the space between them feels larger.
SFX: Cafe ambient drops to silence at 3s, returns at 8s.
Negative: any text, subtitles, logos or watermarks

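A naturalistic breakup scene with accurate lip sync on both lines, matching facial emotion (suppressed pain vs. quiet relief), and ambient sound that drops to silence during the dialogue and returns afterward.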

Evidence: multimodal guide

Late-Night Monologue Confession

An 8-second single-character monologue shot with intimate framing and minimal staging.

8-second short drama, intimate handheld aesthetic, cool blue-orange nighttime palette.
0-2s: Close-up of a young woman sitting on apartment stairs at night, phone screen glow illuminating her face from below, shallow depth of field, city ambient hum in background.
2-6s: Slow push-in to extreme close-up as she speaks into the phone. [Woman, vulnerable, half-smiling through tears]: "I know it's 2 AM and I shouldn't be calling, but I just needed to hear your voice. Even if you don't pick up." Natural breath pauses between phrases, subtle lip quiver.
6-8s: She lowers the phone and looks up at the night sky, camera holds on her profile, a distant siren fades, her expression settles into quiet acceptance.
SFX: City ambient contracts to near-silence during dialogue, distant siren at 7s.
Negative: any text, subtitles, logos or watermarks

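An intimate single-character scene with precise monologue lip sync, an emotional progression from vulnerability to acceptance, and ambient sound design that isolates the character's voice during the dialogue.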

Workplace Comedy Misunderstanding

A 10-second two-character comedy exchange in which timing and reaction shots drive the humor.

10-second short comedy, bright sitcom lighting, slightly warm color grade, medium depth of field.
0-3s: Medium shot of a modern office, Character A stands holding a folder, Character B sits at desk looking up expectantly. Bright fluorescent office lighting, ambient keyboard and printer sounds.
3-6s: Medium close-up on Character A. [Character A, overly confident]: "I finished the entire Henderson report. Every. Single. Page." Punctuates with a proud folder tap on the desk.
6-8s: Cut to Character B examining the folder. Beat of silence. [Character B, deadpan]: "This is the lunch menu from the Italian place downstairs." Holds up a page with an incredulous look.
8-10s: Reverse to Character A, frozen smile slowly collapsing. No dialogue, just the slow realization. A phone rings in the background breaking the silence.
SFX: Comedic beat of silence at 6-7s, phone ring at 9s.
Negative: any text, subtitles, logos or watermarks

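A comedy scene with precise dialogue timing, humor that comes from the deadpan line and the reaction shot, accurate lip sync, and a well-placed beat of silence before the punchline.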

Short Drama Trailer: Suspense Hook Ending

A 10-second trailer-style suspense hook, suited to social media promotion.

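10-second short drama trailer, cinematic suspense aesthetic, teal-and-orange palette, anamorphic widescreen lens flares.
0-2s: Rapid montage of three quick cuts: a hand shoving a door open, eyes widening in terror, a phone screen lighting up in the dark. Each shot lasts 0.6 seconds, with sharp sound stingers between shots.
2-5s: Slow-motion medium shot of a woman walking down an empty hospital corridor late at night, fluorescent lights flickering, high heels echoing, camera tracking backward ahead of her, tension steadily building.
5-8s: She stops at a door and reaches for the handle. Reverse close-up on her face, fear mixed with determination. [Woman, tense whisper]: "You said no one would get hurt."
8-10s: The door opens, blinding white light floods in, camera pushes into the light, frame whites out, leaving clean central space for a title card.
SFX: Sharp stingers during the montage, corridor echo, silence before the dialogue, low-frequency swell as the door opens.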
Negative: any text, subtitles, logos or watermarks

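A trailer-style suspense hook: a rapid montage opening, a tense tracking shot, a single lip-synced dialogue hook, and a white-out cliffhanger ending suited to a title overlay.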

FAQ

How many lines of dialogue can a single generation include?

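A 10-second generation works best with 2-4 lines of dialogue. Each line needs enough screen time for lip sync and reaction. More than 4 lines in 10 seconds forces an unnaturally fast speaking pace. For longer exchanges, generate the dialogue in segments and join them with a video extension workflow.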
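For a longer exchange, the segmenting step can be sketched as follows. The `split_dialogue` helper is hypothetical; it splits a script into the fewest clips that respect the 4-line ceiling, then balances the lines across them so that no clip ends up with a single stray line when that can be avoided.

```python
import math
from typing import List


def split_dialogue(lines: List[str], max_per_clip: int = 4) -> List[List[str]]:
    """Split dialogue lines into evenly sized clips of at most max_per_clip lines."""
    if max_per_clip < 1:
        raise ValueError("max_per_clip must be at least 1")
    if not lines:
        return []
    clip_count = math.ceil(len(lines) / max_per_clip)
    base, extra = divmod(len(lines), clip_count)
    clips, index = [], 0
    for clip_number in range(clip_count):
        # The first `extra` clips take one additional line each.
        size = base + (1 if clip_number < extra else 0)
        clips.append(lines[index:index + size])
        index += size
    return clips


script = [f"line {n}" for n in range(1, 10)]  # a 9-line exchange
for clip in split_dialogue(script):
    print(len(clip), clip)
```

Each resulting clip would then be generated separately and joined with the video extension workflow mentioned above.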

Does Seedance 2.0 lip sync support non-English dialogue?

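Yes. Seedance 2.0 natively supports lip sync in 8+ languages, including English, Chinese, Japanese, Korean, Spanish, French, German, and Portuguese. Write the dialogue in the target language and the model will match mouth shapes to that language's phonemes.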

How do I handle scenes where characters talk over each other?

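Current generation models struggle with overlapping dialogue. Instead, stage the overlap as one character starting to speak while the other visibly reacts, for example by describing the listener opening their mouth to interrupt. Generate the visual overlap, then layer the actual overlapping audio tracks in post-production for best results.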