The Best AI Video Generators (and How They Compare to Each Other)

2025-04-28 14:30:00
Credit: Google Veo 2/Lifehacker


AI video generators are rapidly improving and becoming more widely available, with Google's Veo 2 now built into the Gemini app for anyone paying for a Google One AI Premium plan. Like OpenAI's Sora, Runway, Adobe's Firefly, and others, Veo 2 enables you to create a professional-looking video from nothing more than a text prompt.

With Veo 2 now available to paying users, it seems like a good opportunity to test these different AI video generators against each other, and compare their strengths and weaknesses—and to assess where we're at with AI video in general. We keep being told that these tools will transform movie-making, or at least fill the internet with AI slop, but are they actually practically useful?

Microsoft seems to think so, having used AI-generated video in a recent ad. However, only parts of the clip were AI-made: shots with quick cuts and limited motion, where hallucinations are less likely to happen or be noticed.

For the purposes of this guide I'm going to take a look at Google Veo 2 and put it up against Sora, Runway, and Firefly. Other video generators are available, but these are four of the most prominent: They all cost money to access (starting from $20 a month), so you'll need to sign up for a month at least to play around with them.

Bouncing balls

If you're as old as I am, you'll remember an incredible ad Sony made to promote its new 1080p Bravia televisions in 2005 (above). More than 100,000 bouncy balls were dropped on the steep streets of San Francisco while the cameras rolled, and it was a compelling watch (the behind-the-scenes story is pretty fun, too).

This is a real challenge for AI, involving a lot of physics and movement. The prompt I used was: "Thousands of individual, brightly colored balls bouncing down a steep, sunny street in San Francisco, in slow-motion. The camera moves carefully down the street as the balls bounce downwards, passing trees and parked cars."

The Google Veo 2 attempt isn't bad. There's some weird physics going on here, but it looks reasonably natural, and could work as a short clip if you're not looking too closely. The background elements are well-rendered, and the instructions in my original prompt were followed pretty closely.

Sora seems confused about the scene it's supposed to be rendering. There are colored balls for sure, but they move as a confusing mush, and defy gravity. The pace of the video is OK, even if it's going in the opposite direction to the one I requested, and the background parts of the video look fine on the whole.

Runway gets the vibes pretty close, if you compare it to the original Sony clip, but again, there are several problems: The balls aren't at all consistent, the movement isn't what I asked for, and it looks as though there's an alien watching from a window in the top right corner. The street does look pretty cool though.

Firefly is probably the worst of the bunch, here. Most of the balls are stationary, and those that are moving aren't very well-rendered. The street looks OK but it's nothing special—there's definitely a retro video game feel to it. As with the Sora clip, the camera is taking me up the street when I really wanted to go down.

"Jurassic Park" scene

If AI is going to replace the actual people who make movies, then it needs to be able to create scenes as powerful as the "welcome to Jurassic Park" one in Spielberg's 1993 movie: the moment where Richard Attenborough as John Hammond reveals the dinosaurs to his visitors for the first time (above).

I was curious to see what AI would make of the scene. The prompt was: "At the top of a hill, two paleontologists slowly stagger along through the grass. As they do so the camera pulls back for a wider shot, revealing a wide clearing and a lake below. There are dinosaurs slowly walking through the lake and the trees."

The clip from Google Veo 2 looks pretty good. The camera isn't really moving in the way I described, and the paleontologists aren't really staggering (and they're not on a hill either), but the scenery looks good and the dinosaurs look OK. It's rather generic overall, but it's a decent effort.

Sora goes a little bit crazy with this prompt. The camera movements are jerky and don't follow the instructions I gave, and the dinosaurs look like weird shape-shifting creatures. The best I can say about this effort is that all the elements I described are included, and the surrounding scenery is reasonably well done.

As for Runway, it's probably the closest to what I wanted when it comes to the camera movements and the overall feel of the scene. The lake and the dinosaurs look realistic enough, but it's by no means a perfect rendering—where does the red-shirted paleontologist disappear to?

It's another poor effort from Firefly. I'm not sure it knows what paleontologists are, and the dinosaurs are very small. The lake and the surrounding forest are done to an OK standard, though, even if there's a noticeable AI sheen to everything in the frame. The camera movements have been translated well here.

"The Living Daylights" scene

One more: the memorable Bond and Kara border-crossing scene in The Living Daylights, where they scoot down a snowy mountain on a cello case (above). I don't need to hire Timothy Dalton or Maryam d'Abo, learn how to operate a camera, or travel to Austria, because AI can make the whole scene for me.

The prompt for this one was: "A man and a woman in winter clothing are sliding down a snow-covered road on a cello case. There is a barrier on the road, and as they reach it, both characters duck under it."

Google Veo 2 manages this pretty well, everything considered—the scene looks mostly realistic and fun, and that does look a bit like a cello case. We do have to ignore the two people going through the road barrier as if it isn't there, but at least there is a barrier there (something the other AI models couldn't grasp).

Over to Sora, and again, it's not terrible. OK, that's not really a cello case, and surely the two people would be facing forward, but the snowy road and the surrounding trees look good—it's an immersive scene. Where's my road barrier, Sora? I want to see these people ducking under it.

As for Runway, whatever videos it was trained on, they sure weren't videos of people riding cello cases down mountains. The people are blending into each other, elements in the shot are shifting shape, and it just looks weird. The snowy scenery and the actual live snow effect do look good, though.

Who knows what Adobe Firefly is thinking here. The physics in this one make absolutely no sense, the characters aren't consistent, and there's no road barrier to duck under. It's actually unsettling to watch. We do get a snowy road, a cello case, and two people in the clip, however.

There's no clear winner

I think the Veo 2 videos impressed me most overall, though Runway seems good for realism more often than not. Across the board we have a lot of problems with physics, realism, and prompt interpretation. These are all clearly AI videos, with numerous weird quirks and inconsistencies.

Now, I wasn't expecting these AI generators to get anywhere near the quality of professional ads or movies: It's just not possible to recreate those scenes with only a text prompt and a few minutes of time and effort. I'm not trying to take a cheap shot at these tools, which are obviously very clever, but rather point out some of the fundamental issues with AI video.

These balls aren't bouncing. Credit: Adobe Firefly/Lifehacker

With more careful work and expertise, I could probably get something that looked a lot better, and clearly these video generators are going to improve over time. Who knows what they'll be able to produce in five or 10 years? If you check out the showcased videos on these platforms, you can see that great results are possible.

Personally, though, I'm not convinced these AI tools will ever fully replace traditional film work, no matter how well they're trained. To get something like the Sony ad in AI, you'd have to write reams and reams of incredibly detailed prompts, and even then you might not get what you wanted. Would AI think up the frog jumping out of the drain? Results are quick and easy, sure, but you're offloading most of the creative decisions to AI. These videos feel computer-generated.

One of these people is about to disappear. Credit: Runway/Lifehacker

AI doesn't really know how a ball bounces, or what a dinosaur looks like, or which way people should face as they slide down a snowy road on a cello case. It approximates and calculates based on all the videos it's previously seen, and those shortcomings show up a lot more in video than they do with images or text. You'll notice most AI videos, including the examples above, don't include elements that come in and out of shot, because the AI is likely to forget what they look like if they're not visible.

And I haven't even had space here to cover the copyright issues or the energy cost to the planet. No doubt we'll see an increasing number of AI-made ads and shorts as time goes on and the technology improves, but it's worth going back to the famous warning in Jurassic Park: Being so preoccupied with whether we can do it, we don't stop to think about whether we should.

Disclosure: Lifehacker’s parent company, Ziff Davis, filed a lawsuit against OpenAI in April, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.

David Nield

David Nield is a technology journalist from Manchester in the U.K. who has been writing about gadgets and apps for more than 20 years.

Summary

The article analyzes current AI video generation capabilities by testing four popular tools against each other: Google Veo 2, OpenAI's Sora, Runway, and Adobe Firefly.

### Key Insights:

1. **Quality Limitations**:
   - Current AI video generators struggle with realistic physics, prompt interpretation, and consistent scene elements.
   - The tools often produce videos that feel "computer-generated" rather than lifelike.

2. **Prompt Complexity**:
   - Matching something like the Sony Bravia ad would take reams of incredibly detailed prompts.
   - Even with precise instructions, the AI may not deliver the expected result.

3. **Realism vs. Approximation**:
   - AI models approximate scenes based on the videos they have previously seen rather than understanding real-world physics or objects (e.g., how a ball bounces, what a dinosaur looks like, which way people should face as they slide down a snowy road).
   - These shortcomings show up more in video than in images or text.

4. **Scene Inconsistencies**:
   - Most AI videos avoid elements that move in and out of shot, because the AI is likely to forget what they look like when they are not visible.
   - Requested elements such as the road barrier may be omitted entirely.

5. **Creative Control**:
   - Generating video with AI offloads most of the creative decisions to the AI.

### Examples Provided:

1. **Bouncing Balls**:
   - **Google Veo 2**: Some weird physics, but reasonably natural, with the prompt followed closely.
   - **Sora**: The balls move as a confusing mush and defy gravity; the camera travels in the wrong direction.
   - **Runway**: Close to the feel of the original ad, but the balls are inconsistent and the movement is not what was requested.
   - **Adobe Firefly**: Most of the balls are stationary; the camera travels in the wrong direction.

2. **"Jurassic Park" Scene**:
   - **Google Veo 2**: Good scenery and acceptable dinosaurs, but the camera movement and staggering were not followed.
   - **Sora**: Jerky camera movements and shape-shifting dinosaurs, though every requested element is present.
   - **Runway**: Closest to the requested camera movement and feel, but one paleontologist disappears.
   - **Adobe Firefly**: Unconvincing paleontologists and very small dinosaurs, though the camera movement is handled well.

3. **Snowy Mountain Scene ("The Living Daylights")**:
   - **Google Veo 2**: Mostly realistic, and the only tool to include a barrier, though the characters pass straight through it.
   - **Sora**: Immersive snowy scenery, but no road barrier and not really a cello case.
   - **Runway**: People blend into each other and elements shift shape; the snow effect looks good.
   - **Adobe Firefly**: The physics make no sense, the characters are inconsistent, and there is no road barrier.

### Future Prospects:

- **Improvement Over Time**: The generators are expected to improve, and the platforms' showcased videos show that great results are possible with more careful work and expertise.
- **Ethical Considerations**: The article notes, without covering in depth, the copyright issues and the energy cost of AI video.

### Conclusion:

The author found the Veo 2 videos most impressive overall, with Runway often good for realism, but declares no clear winner. All of the clips are recognizably AI-made, and the author is not convinced these tools will ever fully replace traditional film work.