The Best AI Video Generators (and How They Compare to Each Other)

AI video generators are rapidly improving and becoming more widely available, with Google's Veo 2 now built into the Gemini app for anyone paying for a Google One AI Premium plan. Like OpenAI's Sora, Runway, Adobe's Firefly, and others, Veo 2 enables you to create a professional-looking video from nothing more than a text prompt.

With Veo 2 now available to paying users, it seems like a good opportunity to test these different AI video generators against each other, compare their strengths and weaknesses, and assess where we're at with AI video in general. We keep being told that these tools will transform movie-making, or at least fill the internet with AI slop, but are they actually useful in practice?

Microsoft seems to think so, having used generative AI in a recent ad. However, only parts of the clip were AI-made: shots with quick cuts and limited motion, where hallucinations are less likely to happen or be noticed.

For the purposes of this guide I'm going to take a look at Google Veo 2 and put it up against Sora, Runway, and Firefly. Other video generators are available, but these are four of the most prominent: They all cost money to access (starting from $20 a month), so you'll need to sign up for a month at least to play around with them.

Bouncing balls

If you're as old as I am, you'll remember an incredible ad Sony made to promote its new 1080p Bravia televisions in 2005 (above). More than 100,000 bouncy balls were dropped on the steep streets of San Francisco while the cameras rolled, and it was a compelling watch (the behind-the-scenes story is pretty fun, too).

This is a real challenge for AI, involving a lot of physics and movement. The prompt I used was: "Thousands of individual, brightly colored balls bouncing down a steep, sunny street in San Francisco, in slow-motion. The camera moves carefully down the street as the balls bounce downwards, passing trees and parked cars."

The Google Veo 2 attempt isn't bad. There's some weird physics going on here, but it looks reasonably natural, and could work as a short clip if you're not looking too closely. The background elements are well-rendered, and the instructions in my original prompt were followed pretty closely.

Sora seems confused about the scene it's supposed to be rendering. There are colored balls for sure, but they move as a confusing mush, and defy gravity. The pace of the video is OK, even if it's going in the opposite direction to the one I requested, and the background parts of the video look fine on the whole.

Runway gets the vibes pretty close, if you compare it to the original Sony clip, but again, there are several problems: The balls aren't at all consistent, the movement isn't what I asked for, and it looks as though there's an alien watching from a window in the top right corner. The street does look pretty cool though.

Firefly is probably the worst of the bunch here. Most of the balls are stationary, and those that are moving aren't very well-rendered. The street looks OK but it's nothing special, with a definite retro video game feel to it. As with the Sora clip, the camera is taking me up the street when I really wanted to go down.

"Jurassic Park" scene

If AI is going to replace the actual people who make movies, then it needs to be able to create scenes as powerful as the "welcome to Jurassic Park" one in Spielberg's 1993 movie: the moment where Richard Attenborough as John Hammond reveals the dinosaurs to his visitors for the first time (above).

I was curious to see what AI would make of the scene. The prompt was: "At the top of a hill, two paleontologists slowly stagger along through the grass. As they do so the camera pulls back for a wider shot, revealing a wide clearing and a lake below. There are dinosaurs slowly walking through the lake and the trees."

The clip from Google Veo 2 looks pretty good. The camera isn't really moving in the way I described, and the paleontologists aren't really staggering (and they're not on a hill either), but the scenery looks good and the dinosaurs look OK. It's rather generic overall, but it's a decent effort.

Sora goes a little bit crazy with this prompt. The camera movements are jerky and don't follow the instructions I gave, and the dinosaurs look like weird shape-shifting creatures. The best I can say about this effort is that all the elements I described are included, and the surrounding scenery is reasonably well done.

As for Runway, it's probably the closest to what I wanted when it comes to the camera movements and the overall feel of the scene. The lake and the dinosaurs look realistic enough, but it's by no means a perfect rendering—where does the red-shirted paleontologist disappear to?

It's another poor effort from Firefly. I'm not sure it knows what paleontologists are, and the dinosaurs are very small. The lake and the surrounding forest are done to an OK standard, though, even if there's a noticeable AI sheen to everything in the frame. The camera movements have been translated well here.

"The Living Daylights" scene

One more: the memorable Bond and Kara border-crossing scene in The Living Daylights, where they scoot down a snowy mountain on a cello case (above). I don't need to hire Timothy Dalton or Maryam d'Abo, learn how to operate a camera, or travel to Austria, because AI can make the whole scene for me.

The prompt for this one was: "A man and a woman in winter clothing are sliding down a snow-covered road on a cello case. There is a barrier on the road, and as they reach it, both characters duck under it."

Google Veo 2 manages this pretty well, everything considered—the scene looks mostly realistic and fun, and that does look a bit like a cello case. We do have to ignore the two people going through the road barrier as if it isn't there, but at least there is a barrier there (something the other AI models couldn't grasp).

Over to Sora, and again, it's not terrible. OK, that's not really a cello case, and surely the two people would be facing forward, but the snowy road and the surrounding trees look good—it's an immersive scene. Where's my road barrier, Sora? I want to see these people ducking under it.

As for Runway, whatever videos it was trained on, they sure weren't videos of people riding cello cases down mountains. The people are blending into each other, elements in the shot are shifting shape, and it just looks weird. The snowy scenery and the actual live snow effect do look good, though.

Who knows what Adobe Firefly is thinking here. The physics in this one make absolutely no sense, the characters aren't consistent, and there's no road barrier to duck under. It's actually unsettling to watch. We do get a snowy road, a cello case, and two people in the clip, however.

There's no clear winner

I think the Veo 2 videos impressed me most overall, though Runway seems good for realism more often than not. Across the board we have a lot of problems with physics, realism, and prompt interpretation. These are all clearly AI videos, with numerous weird quirks and inconsistencies.

Now, I wasn't expecting these AI generators to get anywhere near the quality of professional ads or movies: It's just not possible to recreate those scenes with only a text prompt and a few minutes of time and effort. I'm not trying to take a cheap shot at these tools, which are obviously very clever, but rather point out some of the fundamental issues with AI video.

[Image: Bouncing balls. These balls aren't bouncing. Credit: Adobe Firefly/Lifehacker]

With more careful work and expertise, I could probably get something that looked a lot better, and clearly these video generators are going to improve over time. Who knows what they'll be able to produce in five or 10 years? If you check out the showcased videos on these platforms, you can see that great results are possible.

Personally, though, I'm not convinced these AI tools will ever fully replace traditional film work, no matter how well they're trained. To get something like the Sony ad in AI, you'd have to write reams and reams of incredibly detailed prompts, and even then you might not get what you wanted. Would AI think up the frog jumping out of the drain? Results are quick and easy, sure, but you're offloading most of the creative decisions to AI. These videos feel computer-generated.

[Image: People walking. One of these people is about to disappear. Credit: Runway/Lifehacker]

AI doesn't really know how a ball bounces, or what a dinosaur looks like, or which way people should face as they slide down a snowy road on a cello case. It approximates and calculates based on all the videos it's previously seen, and those shortcomings show up a lot more in video than they do with images or text. You'll notice most AI videos, including the examples above, don't include elements that come in and out of shot, because the AI is likely to forget what they look like if they're not visible.

And I haven't even had space here to cover the copyright issues or the energy cost to the planet. No doubt we'll see an increasing number of AI-made ads and shorts as time goes on and the technology improves, but it's worth going back to the famous warning in Jurassic Park: Being so preoccupied with whether we can do it, we don't stop to think about whether we should.

Disclosure: Lifehacker’s parent company, Ziff Davis, filed a lawsuit against OpenAI in April, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.
