Google AI Gemini beats OpenAI, human experts in tests

7 Dec 2023

The Australian Financial Review

“We’ll explore what monetisation might look like, but we don’t have anything specific on that right now,” Sissie Hsiao, vice president in charge of Assistant and Bard at Google, said.

A third, cut-down version of the AI, Gemini Nano, will appear in Android phones starting with Google’s Pixel 8 Pro phone, to answer complex voice, video, photo and written questions on the phone itself, without the need for an internet connection.

In a video demonstration, Gemini identifies that the photo is homework, marks it, and explains the errors.

In a global launch event, the company showed off Gemini Ultra performing a range of tasks which, until now, were generally reserved for humans.

In one pre-recorded demonstration, Gemini Ultra was shown a photo of a child’s physics homework, and was able to read it, mark it, and explain the maths and physics errors the child had made, going into levels of detail far beyond what most parents would be capable of.

In another demonstration, two objects were held up in front of a webcam – an orange and a fidget spinner – and the AI was able to identify them both and explain that citrus and the spinner had something in common: they both could be “calming”.

Eli Collins, the vice president in charge of product at Google DeepMind, which developed Gemini, said one of the main features of Gemini was that it was less likely to “hallucinate” than other AIs.

“Improving the accuracy of responses was one of the core training objectives of the model. When we talk about getting a better score on these benchmarks, it’s often a result of improving Gemini’s ability to reason and to answer questions factually,” he said.

(And, indeed, the Google search engine does contain plenty of references to the “calming” effects of both citrus and fidget spinners.)

When the orange was replaced by a Rubik’s Cube, the AI identified they were both examples of toys that adults, as well as children, play with.

Coding edge

Google also showed off Gemini figuring out what a complex join-the-dots puzzle was depicting, before anyone even joined the dots (“This is a picture of a crab,” the AI pre-empted). The AI watched as someone performed a simple sleight of hand with a ball and three cups, and correctly predicted the ball would be in the left cup.

But Gemini would not just be used for homework, puzzles and party tricks, Google said.

In tests where it was pitted against (presumably talented) human software developers attending a coding competition, Gemini was better than 85 per cent of them, Mr Collins said.

Starting on Wednesday, Gemini would be used to power Google’s software-writing platform, AlphaCode, where it would be able to solve “nearly twice as many problems” as the previous AI, he said.

Not only was the new AI inherently multimodal and able to program in a number of programming languages, it was also inherently multilingual, having been trained on more than 100 human languages.

But, as with Google’s previous AIs, it will speak only one language at first: English. Other languages would quickly follow, Mr Collins said.