OpenAI will now let you create videos from verbal cues

16 Feb 2024

CNN

CNN —

Artificial intelligence leader OpenAI introduced a new AI model called Sora, which it claims can create “realistic” and “imaginative” 60-second videos from quick text prompts.

In a blog post on Wednesday, the company said Sora is capable of generating videos up to 60 seconds in length from text instructions, with the ability to serve up scenes with multiple characters, specific types of motion, and detailed backgrounds.

“The model understands not only what the user has asked for in the prompt, but also how those things exist in the physical world,” the blog post said.

OpenAI said it intends to train the AI models so they can “help people solve problems that require real-world interaction.”

This is the latest effort from the company behind the viral chatbot ChatGPT, which continues to push the generative AI movement forward. Although “multi-modal models” are not new and text-to-video models already exist, what sets Sora apart is the length and accuracy that OpenAI claims for it, according to Reece Hayden, a senior analyst at market research firm ABI Research.

Hayden said these types of AI models could have a big impact on digital entertainment markets with new personalized content being streamed across channels.

“One obvious use case is within TV; creating short scenes to support narratives,” Hayden said. “The model is still limited though, but it shows the direction of the market.”

At the same time, OpenAI said Sora is still a work in progress with clear “weaknesses,” particularly when it comes to spatial details of a prompt – mixing up left and right – and cause and effect. It gave the example of a video in which someone takes a bite out of a cookie, but the cookie shows no bite mark afterward.

For now, OpenAI’s messaging remains focused on safety. The company said it plans to work with a team of experts to test the latest model and look closely at various areas including misinformation, hateful content and bias. The company said it is also building tools to help detect misleading information.

Sora will first be made available to cybersecurity professionals, called “red teamers,” who can assess the product for harms or risks. OpenAI is also granting access to a number of visual artists, designers and filmmakers to collect feedback on how creative professionals could use it.

The latest update comes as OpenAI continues to advance ChatGPT.

Earlier this week, the company said it is testing a feature in which users can control ChatGPT’s memory, allowing them to ask the platform to remember chats to make future conversations more personalized or tell it to forget what was previously discussed.