Lip-Sync: The basics

Lip-Sync is Dubly.AI’s feature that re-animates the speaker’s mouth to match the translated audio. The result: a dubbed video that looks like it was filmed in the target language.

When to use Lip-Sync

Lip-Sync is recommended when:

The speaker is visible on camera (interview, selfie video, presenter shot)
The mouth is clearly visible for most of the shot
You want the highest-quality, “native-feeling” result
The video contains at least 10 seconds of clear, unobstructed footage of the speaker’s face, which the AI needs to sync the lips accurately

Lip-Sync is not needed for voice-over style content where the speaker is off-screen (screencasts, product tours, animations).

Requirements for best results

Resolution: 720p or higher
Face size: the speaker’s face fills ≥ 10% of the frame
Stable footage: avoid rapid zooms or motion blur on the face
Head movement and occlusion: heavy head movement or hair covering the mouth can reduce quality
Footage length: at least 10 seconds of clear, high-quality footage of the speaker’s face
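
If you want to screen clips before uploading them, the numeric requirements above can be turned into a simple pre-flight check. The sketch below is illustrative only and is not part of Dubly.AI: the ClipStats class, its field names, and the lip_sync_warnings function are hypothetical, and you would fill in the measurements with your own tooling. It also assumes that "720p" means the shorter side of the frame is at least 720 pixels, and that "face size" is the face bounding-box area divided by the frame area.

```python
from __future__ import annotations

from dataclasses import dataclass

# Thresholds taken from the requirements list above.
MIN_SHORT_SIDE_PX = 720          # assumption: "720p" = shorter side >= 720 px
MIN_FACE_AREA_FRACTION = 0.10    # assumption: face box area / frame area
MIN_CLEAR_FACE_SECONDS = 10.0


@dataclass
class ClipStats:
    """Hypothetical container for measurements you collect yourself."""
    width_px: int
    height_px: int
    face_area_fraction: float   # 0.0 to 1.0
    clear_face_seconds: float   # seconds with the face clearly visible


def lip_sync_warnings(clip: ClipStats) -> list[str]:
    """Return a human-readable warning for each requirement the clip misses."""
    warnings = []
    if min(clip.width_px, clip.height_px) < MIN_SHORT_SIDE_PX:
        warnings.append("resolution below 720p")
    if clip.face_area_fraction < MIN_FACE_AREA_FRACTION:
        warnings.append("face fills less than 10% of the frame")
    if clip.clear_face_seconds < MIN_CLEAR_FACE_SECONDS:
        warnings.append("less than 10 seconds of clear face footage")
    return warnings


if __name__ == "__main__":
    clip = ClipStats(width_px=1280, height_px=720,
                     face_area_fraction=0.15, clear_face_seconds=12.0)
    problems = lip_sync_warnings(clip)
    print("OK for Lip-Sync" if not problems else problems)
```

An empty list means the clip meets the three measurable requirements. Stable footage, head movement, and hair covering the mouth are not covered by this check and still need a visual review.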