Model family · Video

Seven references. One consistent shot.

Vidu is built around reference-to-video: hand it subjects, sets, costumes, props, and styles, and it composes them into one shot that holds — up to sixteen seconds with the audio already synced. In MML ONE, nine tiers answer your storyboard, episode after episode.

By ShengShu Technology

Start free · See what it does best

What it's best at

Made for stories that come back next week.

Five things Vidu does for serialized work — from ShengShu's own releases and leaderboard placements.

Reference-to-video, ranked first

Combine up to 7 reference images (subjects, environments, costumes, props, styles) into one consistent shot. Ranked No. 1 on SuperCLUE's first reference-to-video leaderboard and No. 1 on Artificial Analysis at Q3 launch.

Sixteen seconds, sound synced

A single Q3 generation carries up to 16 seconds of video with synchronized native audio — dialogue, music, foley, ambience in layers.

Serialized by design

Multi-shot composition, camera control, spatial continuity across shots, and a subject library that carries characters episode to episode.

VFX without a plate

Six classes of cinematic effects generated natively — particles, fluids, lighting, transitions, dynamic motion, camera moves.

Performance lineage

Q2 introduced micro-expression acting and multiple-entity consistency, with a global API since October 2025 — the acting arrived before the spectacle.

In MML ONE

Nine Vidu tiers, ready to route.

Exactly what our catalog serves today — Q3 line first, then Q2.

Vidu Q3 Pro: Highest picture quality in the Q3 line; the only tier that takes audio references.

Vidu Q3: The current generation: reference-to-video with synchronized audio, realism-tuned.

Vidu Q3 Turbo: The Q3 speed tier, with faster turnarounds for coverage passes.

Vidu Q3 Mix: The Q3-Mix variant, launched publicly via partner channels in June 2026.

Vidu Q3 (drama-tuned configuration): A drama-oriented Q3 configuration via our provider channel: 8–12 second episodes at up to 4K with up to 14 reference images. Not a ShengShu-announced tier.

Vidu Q3 (ad-tuned configuration): An advertising-oriented Q3 configuration via our provider channel: 3–16 seconds, up to 4K, all five aspect ratios. Not a ShengShu-announced tier.

Vidu Q2 Pro: The Q2 quality tier, and the one tier that accepts reference video clips.

Vidu Q2: The reference-to-video generation that took Vidu global, October 2025.

Vidu Q2 Turbo: The Q2 speed tier.

Tier lineup as served in MML ONE on the day this page was written. The drama/ad-tuned rows are provider-channel configurations of Q3, not ShengShu-official tiers — the in-app catalog is the live truth.
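
As a rough illustration, the nine rows above could be modelled as a small lookup table. This is only a sketch: the `Tier` type, the `CATALOG` list, and the `tiers_with` helper are hypothetical names invented for this example, not MML ONE or ShengShu identifiers, and the flags encode only what the tier descriptions state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    line: str                 # "Q3" or "Q2"
    official: bool = True     # False for provider-channel configurations
    audio_refs: bool = False  # accepts audio references
    video_refs: bool = False  # accepts reference video clips


# Hypothetical catalog mirroring the lineup: Q3 line first, then Q2.
CATALOG = [
    Tier("Vidu Q3 Pro", "Q3", audio_refs=True),
    Tier("Vidu Q3", "Q3"),
    Tier("Vidu Q3 Turbo", "Q3"),
    Tier("Vidu Q3 Mix", "Q3"),
    Tier("Vidu Q3 (drama-tuned configuration)", "Q3", official=False),
    Tier("Vidu Q3 (ad-tuned configuration)", "Q3", official=False),
    Tier("Vidu Q2 Pro", "Q2", video_refs=True),
    Tier("Vidu Q2", "Q2"),
    Tier("Vidu Q2 Turbo", "Q2"),
]


def tiers_with(*, audio_refs=False, video_refs=False, official_only=False):
    """Return the names of tiers that satisfy every requested capability."""
    return [
        t.name
        for t in CATALOG
        if (not audio_refs or t.audio_refs)
        and (not video_refs or t.video_refs)
        and (not official_only or t.official)
    ]
```

Under these assumptions, `tiers_with(video_refs=True)` yields only Vidu Q2 Pro and `tiers_with(audio_refs=True)` yields only Vidu Q3 Pro, matching the two "only tier" notes in the lineup. The in-app catalog remains the source of truth.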

[Image: paper-cut collage of a film strip unspooling across a paper stage, a little projector beam lighting a stack of take cards]

In the MML ONE flow

Where Vidu earns its place.

Takes land in the storyboard as versioned assets. Recurring characters, anime and stylized work, and episode-to-episode continuity are what you route to Vidu.

Storyboard → Shots → Cut

The board syncs from the script; shots arrive with context attached.

Animated shorts

Style locked, characters on model, from first board to last frame.

Music videos

A place, a performer, a rhythm — built, captured and cut to the track.

The honest part

Know before you route.

Per Vidu's own FAQ, speech output covers English, Japanese, and Chinese only, a narrower list than some competing families offer.

Reference video input is narrow: one tier (Q2 Pro), one to two clips of at most five seconds. Everything else works from image references.

The drama/ad-specialized configurations are China-channel offerings without vendor-published spec pages — their 16-second / 4K headline specs do not apply to the standard Q2/Q3 tiers (≤10 seconds, 1080p).
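
The first two caveats above lend themselves to a pre-flight check before a shot is routed. The sketch below is illustrative only: `check_request`, its parameters, and the tier strings are hypothetical and do not correspond to a real MML ONE or Vidu API. The limits are copied from the caveats themselves.

```python
SPEECH_LANGUAGES = {"English", "Japanese", "Chinese"}
VIDEO_REF_TIER = "Vidu Q2 Pro"   # the one tier that accepts reference clips
MAX_VIDEO_REFS = 2               # one to two clips
MAX_CLIP_SECONDS = 5.0           # each clip at most five seconds


def check_request(tier, speech_language=None, video_ref_seconds=()):
    """Return a list of problems; an empty list means no caveat is hit."""
    problems = []

    # Caveat 1: speech output is limited to three languages.
    if speech_language is not None and speech_language not in SPEECH_LANGUAGES:
        problems.append(f"speech output is not offered in {speech_language}")

    # Caveat 2: reference video is accepted by one tier, with tight limits.
    clips = list(video_ref_seconds)
    if clips:
        if tier != VIDEO_REF_TIER:
            problems.append(f"{tier} works from image references, not video clips")
        if len(clips) > MAX_VIDEO_REFS:
            problems.append(f"at most {MAX_VIDEO_REFS} reference clips, got {len(clips)}")
        if any(seconds > MAX_CLIP_SECONDS for seconds in clips):
            problems.append(f"each reference clip must be {MAX_CLIP_SECONDS:g} s or shorter")

    return problems
```

For example, `check_request("Vidu Q3", speech_language="French")` returns a single problem, while a Vidu Q2 Pro request with two clips of four and five seconds returns an empty list.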

Public Alpha

Bring one film into the graph.

Start with a premise, a screenplay, or a folder of references. We'll set up your provider keys and walk through the first scene with you.