Gemini 1.5 Flash: Is it worth the hype?

Demis Hassabis says Gemini 1.5 Flash excels at summarisation, chat applications, image and video captioning, and more
An undated image of Gemini Flash's logo. — Google

Google has broadened its Gemini family with a new variant called Gemini 1.5 Flash, a move that has caused a considerable stir in the world of artificial intelligence (AI).

Gemini 1.5 Pro vs Gemini 1.5 Flash

With features that include handling complex tasks, images, video and speech, the new model sits between the on-device Gemini Nano and the cloud-based Gemini Pro.

Like OpenAI’s newly launched GPT-4o, Gemini 1.5 Flash is a natively multimodal model, built to produce smoother output for real-time conversations.

Read more: Google I/O 2024 — Gemini 1.5 Flash unveils the power of compact AI

Gemini 1.5 Flash

For now, the model is available only to developers, who can build the generative technology into their own applications. That could lead to a surge in third-party live chat apps built on Gemini 1.5 Flash.

Besides giving swift and efficient responses backed by its accurate comprehension of text, images, video and speech, Gemini 1.5 Flash is a more affordable option than Gemini Pro, which costs around 20 times more.

“We know from user feedback that some applications need lower latency and a lower cost to serve. This inspired us to keep innovating,” said Google DeepMind CEO Demis Hassabis, unveiling Flash as a “model that’s lighter-weight than 1.5 Pro, and designed to be fast and efficient to serve at scale.”

The only AI tool that can be said to rival Gemini 1.5 Flash is OpenAI’s recently unveiled GPT-4o, which also delivers fast responses because it, too, is a natively multimodal model developed for real-time conversations. That said, a fact-based assessment suggests that Gemini 1.5 Flash is, to some extent, the less capable model in terms of reasoning.

“1.5 Flash excels at summarisation, chat applications, image and video captioning, data extraction from long documents and tables, and more,” Hassabis said.