Veo 3: Google’s AI capable of effortlessly creating ultra-realistic videos – with sound

At its I/O 2025 conference, Google unveiled Veo 3, an AI model capable of automatically generating realistic videos with an integrated soundtrack. This innovation marks a major breakthrough in the field of AI-assisted audiovisual creation, combining image quality, synchronized sound and creative control.


A technological leap

At the end of January 2024, Google, through its Google Research division, presented “Lumiere” (a nod to the Lumière brothers), an AI model that generated videos from a simple text prompt. This prototype could turn text into very short video clips, but they had no sound and were often incoherent.

With Veo 3, Google’s AI goes one step further: it produces both image and audio (dialogue, ambient sound, music) and synchronizes voices with lip movements. A first on this scale. Even animals can be made to speak realistically.

What Veo 3 does (and others don’t)

Unlike competing models, such as OpenAI’s Sora, which currently generate videos without sound, Veo 3 integrates sound natively. From a simple text prompt, it can generate an entire scene (image, sound effects, dialogue, music) with a level of coherence rarely achieved by an AI.

https://twitter.com/medhini_n/status/1924915630656168159

Example: type in “A fox crosses a snowy forest during a snowstorm, BBC documentary style”, and Veo 3 creates a tracking shot of the fox in low-angled light, with the sound of paws on snow… and even a documentary-style narration.

Google relies on its Lyria (audio) and Chirp (voice) models and a lip-sync system to match image and dialogue.

https://twitter.com/jerrod_lew/status/1924934440486371589

This breakthrough marks the end of AI’s “silent cinema” and brings automated generation a step closer to professional audiovisual production.

The videos produced by Veo 3 are so realistic that it’s difficult to distinguish them from real footage, as videos released in recent days have shown. Internet users and professionals alike point to the fluidity of movement, fidelity to real-world physics, handling of light and consistency of the characters, right down to details like five-fingered hands, traditionally a problem for AI.

Here are some examples of videos created with Veo 3:

"Freelancers" by Dave Clark | Google I/O 2025 | Made with Flow
Dear "Stranger" | 浮生若夢 by Junie Lau · Google I/O 2025
"Electric Pink" by Henry Daubrez • Google I/O 2025
Google Veo 3 Demo - Movie Scenes and Character Voices

Features designed with creators in mind

Veo 3 doesn’t just automate video. It also gives users direct control over the result.

“Cinematic commands” such as focal length and camera movement can be entered directly in the prompt. Google’s creation tool also offers its own camera-movement controls.

To match the script and atmosphere as closely as possible, you can create “ingredients” to be used in the video: for example, a certain type of vehicle, a photo of a character or a stylistic detail.

The tool lets you create each shot individually and also design quick transitions between shots. To get started more quickly, you can give it images as shot references.

These features are accessible via two Google interfaces: Flow, a simplified application for content creators and educators; and Vertex AI, Google Cloud’s platform for developers and businesses.

Aware of the risks of deepfakes and misinformation, Google says it has built safeguards into its solution, with a SynthID digital watermark to authenticate each video generated, as well as filters to prevent the creation of illicit or copyright-infringing content.

Veo 3 price and availability

For the time being, Veo 3 is available in limited access to subscribers to the Google AI Ultra plan ($249.99/month), in the United States only.

https://twitter.com/TheoMediaAI/status/1925210469133877286

To find out more, visit the Google website.

A new stage reached

Veo 3 marks a major advance in AI-based video generation. By combining image and native sound, Google offers a tool that redefines audiovisual creation. Its potential applications are numerous: education, advertising, prototyping… But above all, the professional quality of the short clips it generates points to an unprecedented upheaval in the world of video.

As with AI-generated images, this technology risks upsetting our relationship with reality. The line between real and fake is becoming increasingly blurred and ever harder to draw.

Presumably, Google has made extensive use of the millions of videos available on YouTube to train its AI, once again raising the sensitive issue of copyright. A potentially worrying precedent, even if most creators have probably consented to this use by accepting, often without reading them, the platform’s terms of use.