Here are the new features in Veo 3.1, Google’s latest AI-powered video model


OpenAI’s new Sora app has dominated the conversation around hyper-realistic AI video over the past few weeks. Sora makes it easy for users to create short videos that look real enough to fool most people, including videos featuring the likenesses of real people.

But before Sora dropped, it was Google that raised concerns about lifelike AI-powered videos. With Veo 3, Google launched an AI model that produces not only lifelike video, but also realistic audio in sync with the action. Sound effects, ambient noise, and even dialogue are generated alongside the video itself, selling the entire effect from one simple prompt.

Veo 3.1

Now, Google is back with an upgrade to Veo, appropriately named Veo 3.1, which the company announced in a blog post on Wednesday. This isn’t a complete overhaul or a revolutionary new video model. Instead, Veo 3.1 builds on Veo 3, adding “richer sound” and enhanced realism that Google says produces more “realistic” textures. The new model also supports new narrative controls, which arrive alongside upgrades to Flow, Google’s AI video editor. Flow users now have more precise editing controls, and can add audio to existing features such as Ingredients to Video, Frames to Video, and Extend.

What does that mean in practice? According to Google, Ingredients to Video with Veo 3.1 lets users add reference images to their scenes, such as a specific person, a clothing item, or an environment. Flow can then work these elements into the final product, as you can see in the demo video below:

Building on this feature, Flow now lets you add new elements to an existing scene as well. Using Insert, you can tell Veo 3.1 to add new characters, details, lighting effects, and more to your clip. Google says the reverse works too: users can remove any elements they don’t like from a generation.

Google also now has a new way for users to dictate how a scene is generated, called Frames to Video. Users choose reference frames for the beginning and end of the scene, and Flow with Veo 3.1 then fills in the gap, creating a scene that starts and ends on those images.

There’s also now a way to create longer videos than previous iterations of Flow allowed. The new Extend feature lets you either continue the action of the current clip, or jump to a new scene that follows it, though Google says the feature is most useful for creating one longer continuous shot. According to the company, Extend can create videos that last more than a minute.

Veo 3.1 is available to users in the Gemini app as well as Vertex AI, as long as you have a Google AI Pro subscription. Developers can access it via the Gemini API. Google says the Ingredients to Video, Frames to Video, and Extend capabilities are coming to the Gemini API, but Insert and Remove are not available. Extend is also not yet available in the Vertex AI API.
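For developers curious what that API access looks like, here is a minimal sketch using Google’s `google-genai` Python SDK, which handles Veo requests as long-running operations. The model identifier `veo-3.1-generate-preview` and the example prompt are assumptions for illustration (the article doesn’t name the model ID); check Google’s current docs before relying on them. The network call only runs if an API key is set.

```python
# Sketch: requesting a Veo video clip via the google-genai SDK.
# The model ID below is an assumption, not confirmed by the article.
import os
import time

def build_request(prompt: str) -> dict:
    """Assemble the parameters for a video-generation call."""
    return {
        "model": "veo-3.1-generate-preview",  # assumed model ID
        "prompt": prompt,
    }

request = build_request("A lighthouse at dusk, waves crashing, ambient sound")

# Only attempt the actual call when credentials are available.
if os.environ.get("GEMINI_API_KEY"):
    from google import genai

    client = genai.Client()  # reads GEMINI_API_KEY from the environment
    operation = client.models.generate_videos(**request)

    # Video generation is asynchronous: poll until the operation finishes.
    while not operation.done:
        time.sleep(10)
        operation = client.operations.get(operation)

    video = operation.response.generated_videos[0]
    client.files.download(file=video.video)
    video.video.save("lighthouse.mp4")
```

The polling loop reflects the SDK’s long-running-operation pattern: video generation takes on the order of minutes, so the client checks back periodically rather than blocking on a single call.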

Is this really a good thing?

Google sees all these developments as a boon for creators and creativity, but I’m skeptical. I can see Veo 3.1 and Flow being useful for visualizing footage before shooting or animating it (i.e., as a storyboarding tool), or even as a way for new and budding filmmakers to learn editing by seeing their ideas in a more realized form. Overall, though, I don’t think AI-generated content is the future, or at least not the future most of us want. Sure, there’s humor or novelty in some AI-generated videos, but I bet most people who enjoy them do so ironically, or only on social media.

The idea of replacing human filmmakers and actors with AI generations seems ridiculous, especially when we’re all at risk of being misled. Is it really a good thing for companies like Google and OpenAI to make it easy to create hyper-realistic, fully rendered scenes, when those videos could just as easily be used to trick audiences? Maybe this is grumbling from someone resistant to change, but I don’t think most of us want to see our favorite shows and movies, made with passion and emotion, replaced by realistic-looking people giving soulless, robotic performances.
