Google’s generative AI filmmaking program Flow has reached a milestone. The tech giant confirmed exclusively to CNET that Flow creators have produced over 100 million AI videos in the program. Thanks in part to its advanced AI video model, Veo 3, Flow allows users to generate video clips and edit them together to create scenes.
It’s been 90 days since Google surprised us with Flow at its annual I/O developers conference. According to Elias Roman, senior director of product management for Flow in Google Labs, much of the time since has been spent “hustling just to keep up with the demand.”
Flow is a departure from Google’s previous generative AI work. For years, the company’s AI efforts have been focused on Gemini, its all-in-one chatbot. It has flooded its products with AI, such as Search’s AI Overviews and Gmail’s AI-generated summaries. Its research assistant tool, NotebookLM, with its AI audio generator that can transform documents into personal podcasts, continually rolls out new features.
The industry leader has spent billions of dollars trying to win the race to develop the most advanced AI for average Google searchers, developers and, yes, even artists and creators. Reaching 100 million AI videos is a significant milestone for the company, and it helps show us what the future of AI-enabled creation might look like.
Getting in the AI Flow
To compete with Midjourney and Stable Diffusion, Google created a crop of AI image models, originally named ImageFX and now known as Imagen (pronounced “imagine”). Its previous generative media models were better suited for amateur or enthusiast creators, not professionals, and they didn’t dominate the AI creative space. That all changed with Veo 3.
Google dropped Veo 3, its newest AI video model, at May’s I/O conference. Veo 3 leapfrogged the competition with a somewhat obvious but first-in-industry advancement: AI videos with synchronized, AI-generated audio. The model garnered a ton of attention online, and Google reported over 40 million AI videos just seven weeks later.
“What Veo 3 allowed was a much wider set of people to create very compelling videos, engaging all the senses out of the box. You didn’t have to stitch together a toolkit,” said Roman. “To be able to do the Foley [ambient sounds], the sound effects, the soundtrack, the dialog, all of that, and not make the user think about each of those modalities in a specific way, I think, is a big unlock, too.”
Veo 3 is one of several AI models you can use in the filmmaker tool. Flow was built for professional creators and filmmakers, a step beyond the simple image and video generation available with Gemini. Google intentionally moved away from its original ImageFX nomenclature while building off the ImageFX interface, Roman said, and wanted Flow to combine the most advanced Imagen and Veo models with Gemini, which was used in the training of Veo and “basically speaks native Veo.”
Flow is one way to combine all those AI models and pieces, uniting Google’s different generative AI models for seamless video creation and editing.
What makes Flow different from Veo and Imagen
Flow was built to focus on consistency, that is, the ability to maintain visual identity from one clip to the next. If you have a 90-second video of your character drinking coffee in a cafe, you don’t want their hair length or eye color changing every 8 seconds between scenes. That consistency is important for professional projects, and it’s also difficult to attain. Roman called it the “Achilles heel of AI video.”
Flow has several tools to help you maintain that consistency, and in my testing, they do give you a new level of control over your work that was previously lacking in Google AI tools. The best way I can describe Flow is as an upscaled version of simple video generator interfaces, with the option to export multiple clips into a simplified version of a Premiere Pro-like timeline.
AI tools often get upgraded with the hope that they’ll become more useful for professional creators, though the target audience isn’t automatically drawn to using them. Generative AI is a contentious issue in creative industries, especially when it comes to wholesale creation of text, image and video. AI enthusiasts might laud the creativity and speed of AI models, but creators continue to voice legitimate concerns about how AI is trained and deployed. It’s why publishers and artists have filed lawsuits against AI companies alleging copyright infringement. It’s why workers in data-rich industries face job security concerns as executives look to cut costs.
Another issue with AI is the type of imagery it can create. Last year, users found Gemini could produce images of people of color in Nazi soldier uniforms. Google apologized for what the company called “inaccuracies in some historical image generation depictions” and said it was working to improve those depictions immediately.
(Google’s guidelines prohibit the creation of abusive and illegal AI content. Roman said that improving the enforcement of its safety policies is aided by technological updates and real-life usage and reports.)
Going forward, Roman said Flow is working on expanding Veo 3’s capabilities, improving consistency, and adding new features like bespoke voices for character work. The project’s north star is making creation more accessible to people.
“We can lower the barriers that prevent a much wider set of people from telling stories through video, and we can raise the ceiling on what kind of stories can be told through video,” said Roman. “Some of them are going to be funny and silly, like wild street interviews or Yeti ASMR bloggers, and some of them are gonna be really powerful.”
How to use Google’s Flow for AI videos
Flow, which is part of Google Labs and accessible via its AI Test Kitchen, is available to paying Google AI subscribers in its $20 per month Pro plan and $250 per month Ultra plan (currently discounted to $125 for three months). Google Labs’s privacy notice says that “human reviewers read, annotate and process” your Labs interactions and tool outputs to improve its AI models. (Your Labs data is stored for up to 18 months by default, and the company advises you not to upload or submit confidential information. Google’s general privacy center has more info.)
I spent some time testing Flow, generating clips and stitching them together using its scenebuilder. Several tools are only available to Flow users.
Ingredients-to-video: There are a couple of ways you can prompt to generate video clips, including the self-explanatory text-to-video and image/frames-to-video. Ingredients-to-video is a new one worth exploring. With this method, you upload specific pictures and add a text prompt, and Flow will piece the parts together. For example, you can upload a picture of a man, a product photo of a specific jacket, and a scenic background, and then Flow can combine them and animate the video.
Extending clips and smoothing transitions: Extend can help you lengthen clips. In the scenebuilder timeline, drag the end of one clip’s frame out to your desired length. If you’re going to generate a new video and want a smooth transition, I recommend going to the end of the first clip and hitting the plus button at the top of the marker to save the final frame to your library. You can then use that image in a frames-to-video prompt to maintain that consistency from clip to clip.
Doodling and making edits: If you’re editing a frame or image in a separate document, you can upload your marked-up image to Flow and instruct the model to implement the changes. You can also do that with images you’ve drawn on, and it can bring those doodles to life. This is a developing feature — a new prototype for this is in the works now — but it’s definitely fun to stretch Flow’s capabilities like that.
Prompting with Gemini: There’s no way to have Gemini automatically create and/or improve your prompts directly in Flow (something I hope changes in a future update), but you can use the chatbot to help you craft the perfect prompt. If you’re struggling to bring more detail-oriented ideas to life, try letting Gemini help you out.
For more, check out the top AI image generators and a guide to writing the best AI image prompts.