Sora Is Out – Kind Of… Capabilities Evaluated
Demand for computing power catalyzed by Sora is continuing to soar, further benefiting infrastructure vendors in Suqian, Jiangsu Province, China, on February 19, 2024. (Photo Illustration by Costfoto/NurPhoto via Getty Images)
The world is getting a sneak peek at a major transformative technology of our time, one that’s probably going to displace more resources than some of the other multimedia models we’ve seen in the past.
With the capability to create full video from prompts, Sora is going to make all kinds of casting, staging and expensive film operations obsolete. And it’s supposed to be available right now! At least to ChatGPT Plus and Pro users.
A Big Shift
Think about all of the work involved in manufacturing, using and developing physical film for pictures. Think about what the digital camera did to that industry.
Sora is going to be so much more transformative, on such a scale, that it’s worth paying close attention as it gets into the hands of a wider group of users.
Marques Brownlee is a highly influential YouTuber who got a front-row seat to early Sora exploration.
In a YouTube critique, Brownlee points out how capable and adept the program is at generating realistic-looking footage, while admitting that it still has a way to go.
Was it Gandhi or Stalin?
For those used to playing the game where you guess which person said something, Brownlee shows us a series of short videos and asks us to figure out whether they are real or AI-generated.
What you see is that it can be pretty hard to tell.
There aren’t a lot of blatant cues to let us know which videos were made with AI and which weren’t.
Not Many Clues
There are a few obvious tells when it comes to AI video, but they’re mostly based on our knowledge of the world around us.
Here are three that popped out to me:
Not factual – one way you can tell a video is AI-generated is if it contains factually wrong elements, such as a landscape you’ve seen before rendered inaccurately.
If you know, for example, that there is not a dilapidated shack on the top of some hill, you’ll note the difference when the AI shows you an aerial view.
Not ‘ugly’ enough – it seems that Sora, like still-image and generative AI technologies in general, tends to produce glitzy, beautiful results. In other words, one of the few tells you have is that the program doesn’t tend to create mediocre-looking footage with real composition or lighting problems, or subjects that aren’t telegenic and prepped for film.
However, you could presumably generate this kind of thing with more prompting.
Not credible – another way to tell AI video is if you see tentacle monsters or other unreal things cropping up in the margins. But again, that has more to do with our knowledge of the world than anything we can assess visually. If you see a tentacle god arise from a lake in a Sora video, it’s going to look real.
Now, after I wrote these down, I watched the rest of Brownlee’s video, and he lays out some additional lapses in Sora’s hyper-realism, so check these out:
First, there is the lack of object permanence, where items or characters can appear and disappear randomly. There’s also a sort of ‘ghost image’ phenomenon where an object lacks substance – for example, in Brownlee’s video, we see cars passing through another car in what is supposed to be a real-life street scene.
In general, Brownlee points out, Sora struggles with physics. It doesn’t always know how objects behave in motion, or what direction they should be moving in, if it’s working from a still image.
Then there are some issues with speed, where a video may slow down and speed up for no reason.
All of this aside, some of the videos will be so realistic that we have trouble telling them apart from live footage.
Access to Sora Right Now
Over at OpenAI’s site, it looks like Sora access has been temporarily paused due to high traffic volumes.
Brownlee talks about this eventuality in his video:
“I kind of wonder how long it will take when it’s open for everyone to use,” he says, noting that a 1080p clip of about 10 seconds takes a couple of minutes for him to generate.
Use Cases for Sora
As Brownlee points out, Sora may be most useful for people who want to create cartoon or claymation features.
That’s because realistic physics is hard to get right, but cartoons and stop-motion footage are more forgiving. They’re more abstract, and that’s going to be one of the first realms where Sora becomes most useful, although Brownlee also cites fake CCTV camera footage as another desirable use for the platform.
He shows off a ‘Santa versus Frosty’ Mortal Kombat game video completely made up by the AI, as well as a job interview scene in which the model fills in details without any additional prompting.
However, he suggests there are some big unknowns with this technology, and we’re moving through uncharted waters.
When we all start using Sora, what will we use it for? How will it affect our lives?
People are wondering what’s going to happen to the major industries centered around Southern California locales like Burbank and Hollywood.
But the effect of the technology will likely go far beyond that. Stay tuned as I continue to document what’s coming out now at the end of what’s been a banner year for large language models.