Lip-sync: The basics

Lip-Sync is Dubly.AI’s feature that re-animates the speaker’s mouth to match the translated audio. The result: a dubbed video that looks like it was filmed in the target language.

Lip-Sync is recommended when:

  • The speaker is visible on camera (interview, selfie video, presenter shot)
  • The mouth is clearly visible for most of the shot
  • You want the highest-quality, “native-feeling” result
  • The video contains at least 10 seconds of clear, unobstructed footage of the speaker’s face, which the AI needs to lip-sync accurately

Lip-Sync is not needed for voice-over style content where the speaker is off-screen (screencasts, product tours, animations).

For best results, the source footage should meet these guidelines:

  • Resolution: 720p or higher
  • Face size: speaker fills ≥ 10% of the frame
  • Stable footage: avoid rapid zooms or motion blur on the face
  • Unobstructed mouth: heavy head movement or hair covering the mouth can reduce quality
  • Duration: at least 10 seconds of high-quality footage
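
The numeric guidelines above can be turned into a quick pre-upload check. The following is a minimal sketch, not part of Dubly.AI: `ClipInfo`, `lip_sync_issues`, and the constant names are hypothetical, and the values for face size and clear-footage duration are assumed to come from your own inspection of the clip. It also assumes that "720p" refers to the shorter side of the frame (so vertical selfie videos are handled) and that "fills ≥ 10% of the frame" means the face area divided by the frame area.

```python
from dataclasses import dataclass


@dataclass
class ClipInfo:
    """Hypothetical summary of a clip; not a Dubly.AI data structure."""
    width: int                  # frame width in pixels
    height: int                 # frame height in pixels
    face_area_fraction: float   # face area / frame area, from 0.0 to 1.0
    clear_face_seconds: float   # seconds of clear, unobstructed face footage


# Thresholds taken from the guidelines above.
MIN_SHORT_SIDE_PX = 720     # "720p or higher", read here as the shorter side
MIN_FACE_FRACTION = 0.10    # speaker fills at least 10% of the frame
MIN_CLEAR_SECONDS = 10.0    # at least 10 seconds of clear footage


def lip_sync_issues(clip: ClipInfo) -> list[str]:
    """Return the guidelines the clip misses; an empty list means it passes."""
    issues = []
    if min(clip.width, clip.height) < MIN_SHORT_SIDE_PX:
        issues.append("resolution below 720p")
    if clip.face_area_fraction < MIN_FACE_FRACTION:
        issues.append("face fills less than 10% of the frame")
    if clip.clear_face_seconds < MIN_CLEAR_SECONDS:
        issues.append("less than 10 seconds of clear face footage")
    return issues


if __name__ == "__main__":
    interview = ClipInfo(width=1920, height=1080,
                         face_area_fraction=0.18, clear_face_seconds=42.0)
    print(lip_sync_issues(interview))  # prints []
```

The qualitative guidelines (stable footage, limited head movement, no hair over the mouth) are left out of the sketch because they have no numeric threshold here; those still need a visual check.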