Deepfake videos enable an Asian news broadcaster to bolster its expansion strategy with minimal investment
Speech Synthesis and Lip Sync model
The Akaike Edge
Experienced ML and DL Ops teams
Efficient Deployment
Ongoing Maintenance
TTS and Video Synthesis
The broadcaster had more than 260,000 hours of video in its archives. To maximize the reusability of these media assets, a few anchors from the newsroom's panel were selected from the available footage.
Post-video selection
After the videos were selected, an AI pipeline was assembled for image synthesis and automated lip synchronization, blending Computer Vision, Deep Learning, and Generative Adversarial Network (GAN) technology.
Custom speech solution tailored to the speaking-face video
The custom solution converted written text into natural-sounding, expressive speech using deep neural networks trained on recordings of human speech. The synthesized speech segment was then accurately matched to a video of a speaking face using a GAN.
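The two-stage flow described above (neural TTS, then GAN-driven lip sync) can be sketched as a simple pipeline. This is an illustrative outline only: every function, name, and data shape here is a hypothetical placeholder standing in for trained models, not the broadcaster's actual implementation.

```python
# Hypothetical sketch of the text-to-anchor-video pipeline.
# All functions below are placeholders for trained models.

from dataclasses import dataclass
from typing import List


@dataclass
class SpeechSegment:
    """Synthesized speech paired with its source text."""
    text: str
    samples: List[float]  # placeholder waveform


def synthesize_speech(text: str) -> SpeechSegment:
    """Stand-in for the neural TTS stage: text -> expressive speech audio."""
    # A real system would run a deep TTS network trained on human speech;
    # here we fabricate a dummy waveform (~80 samples per character).
    return SpeechSegment(text=text, samples=[0.0] * (len(text) * 80))


def lip_sync(face_frames: List[str], speech: SpeechSegment) -> List[str]:
    """Stand-in for the GAN stage: re-render each face frame to match audio."""
    # A real generator would be conditioned on audio features and the anchor's
    # face frames; here we just tag each frame to show the data flow.
    return [f"{frame}+synced" for frame in face_frames]


def text_to_anchor_video(text: str, face_frames: List[str]) -> List[str]:
    """Full pipeline: written script -> speech -> lip-synced video frames."""
    speech = synthesize_speech(text)
    return lip_sync(face_frames, speech)
```

The design point is the decoupling: the TTS stage and the lip-sync stage communicate only through the speech segment, so either model can be retrained or swapped without touching the other.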
Research shows that 87% of Vision AI projects do not yield the expected results, either because insufficient training data stalls the project or because deployment is too slow. Our AI experts can help you accelerate in data-sparse environments.