Google I/O 2024: Unveiling the future with AI and Android 15

"We want everyone to benefit from what Gemini can do," Google CEO Sundar Pichai says

Tech Desk - May 15, 2024

Google CEO Sundar Pichai addresses the developers at the Google I/O 2024 event held on May 14, 2024. — YouTube/Google

Artificial intelligence took centre stage from the get-go at the highly anticipated Google I/O 2024 conference on Tuesday. A captivating video showcasing Google's advancements in AI, particularly the much-discussed Gemini model, set the tone for the event.

Google CEO Sundar Pichai took the stage first, signaling a potential deep dive into the world of AI innovations. "We want everyone to benefit from what Gemini can do," Pichai declared, hinting at its widespread integration across Google's services.

Pichai also addressed the evolving landscape of Google Search. He revealed that users are increasingly employing longer keyword searches, necessitating advancements in search capabilities. To address this, Pichai announced the nationwide rollout of "AI Overviews" in the US, with a global launch on the horizon.

'Ask Photos with Gemini'

Pichai wasn't done showcasing Gemini 1.5 Pro's potential. Google Photos will leverage its power for smarter searches. A new feature called "Ask Photos with Gemini" allows users to find specific photos based on contextual inquiries.

The reach of Gemini extends beyond personal applications. It's poised to revolutionise Google Workspace. Imagine using Gemini to search your Gmail messages, or to automatically summarize key takeaways from a Google Meet recording — these are just a few of the exciting possibilities on the horizon.

'Making AI helpful for everyone'

In the realm of mobile search, Gemini 1.5 Pro can become your personal assistant, helping you unearth receipts, schedule package pickups, and more. Gemini can act as your travel companion, suggesting fun and relevant activities based on your destination.

Living up to its goal of "Making AI helpful for everyone," Google is weaving Gemini's capabilities into the very fabric of our digital lives.

Breakdown of the latest developments:

Gemini 1.5 Pro goes public: This version, boasting a one-million token capacity, is now available for everyone, including developers and consumers.
Enhanced context for developers: In private preview, developers can now utilise Gemini with a doubled context window of two million tokens.
Multilingual Gemini Advanced: The one-million token variant of Gemini Advanced is now accessible in 35 languages, breaking down language barriers for global users.
Gemini 1.5 Pro arrives in Workspace Labs: Developers can now experiment with Gemini 1.5 Pro within the Google Workspace development environment.

Project Astra

Moreover, Google introduced Project Astra, a new initiative that utilises video to deliver smarter answers. Imagine pointing your phone's camera at an object and having it instantly identified — Project Astra goes beyond that.

It can analyse even complex things like code and propose relevant modifications. Project Astra boasts impressive contextual awareness as well. By analysing visual clues in videos, it can even assist you in finding misplaced belongings, eliminating the need for lengthy descriptions.

Imagen 3

Google wasn't done innovating in the realm of AI-powered creativity. Google's Doug Eck unveiled Imagen 3, the company's most powerful image generation model yet. Built from the ground up, Imagen 3 leverages generative AI to create even more realistic and intricate images. This innovative tool can also be used to render text descriptions into stunning visuals.

A still taken from the Google I/O event livestreamed on May 14, 2024. — YouTube/Google

Veo — generative video model

Google is also making waves in the world of AI-powered music creation with Music AI Sandbox, a YouTube tool designed to inspire creators. This innovative platform allows users to experiment by blending various musical styles and crafting original compositions.

In the realm of video editing, Google introduced Veo, a groundbreaking generative video model capable of producing high-resolution (1080p) videos based on user prompts. This technology is integrated into a tool called VideoFX, empowering video editors and creators with a whole new dimension of creative possibilities.

Notably, acclaimed filmmaker Donald Glover has already harnessed Veo's potential for upcoming projects. Veo's ability to generate video from scratch marks a significant leap forward in the advancement of AI.

Google's hardware update

Furthermore, Google unveiled a three-pronged approach to solidify its position at the forefront of technology:

Trillium: This next-generation Tensor Processing Unit (TPU) promises a significant leap in processing power and will be available to Cloud customers later in 2024.
Axion Processor: Google has designed a custom CPU based on the ARM architecture, hinting at potential performance improvements for future devices.
AI Hypercomputer: This groundbreaking supercomputer architecture utilises liquid cooling technology within Google's data centers, enabling the development of even more powerful AI capabilities.

Google Search in Gemini Era

The Google I/O event brought exciting news for search enthusiasts with the unveiling of AI Overviews powered by Gemini. This AI integration promises a more streamlined search experience.

Gone are the days of piecing together information from multiple searches — multi-step reasoning allows Gemini to tackle complex queries. Imagine asking about a yoga studio and receiving a comprehensive review analysis, all within a single search. Planning meals becomes effortless as Gemini compiles recipes and suggests restaurants when needed.

Search results are also getting a makeover. Forget generic listings; Gemini personalises your results based on your preferences.

Craving live music with your meal? Searching for a seasonal rooftop dining experience? Gemini tailors your options accordingly. Finally, Google Search is embracing visual input. Troubleshooting a broken record player? Simply record a video and let Gemini analyze the footage frame-by-frame to diagnose the issue and provide solutions.

Smarter Gmail experience

Google I/O also revealed a significant upgrade for Gmail on mobile devices, powered by the AI marvel, Gemini. Get ready for a more streamlined and intelligent email experience. Imagine initiating an email and having Gemini appear instantly, offering relevant suggestions based on the content.

Need a roof repair? Simply search your emails — Gemini will not only locate the relevant messages but also present a comparison of repair services with links and prices. Even automatic replies are getting smarter. These will incorporate recommendations from Gemini, including helpful information like links and pricing for relevant services (based on the email content).

But that's not all. Gemini can become your personal receipt management assistant. It can gather receipts from various messages and organize them into a convenient spreadsheet, eliminating the need for manual tracking. On top of that, Gemini can analyse your receipt data to uncover spending patterns and identify areas for potential savings.

Gemini on Android

Google's Sameer Samat took centre stage to discuss exciting advancements in Android, all centred on the power of AI. Three key breakthroughs are slated for this year:

Enhanced search: Android search is getting a boost. Users can expect a more intuitive and powerful search experience.

Gemini as your AI assistant: Get ready to interact with your Android device in a whole new way. Gemini is poised to become your personal AI assistant, seamlessly integrated within the Android ecosystem.

On-Device AI for unmatched experiences: On-device AI capabilities are expanding, unlocking a new wave of innovative and personalised experiences on your Android device.

As part of these advancements, Circle to Search, a feature that allows users to search for information based on what's on their screen, will be available on a wider range of Android phones, going beyond Samsung Galaxy and Google Pixel devices.

Gemini can now also appear as an overlay on top of your current app, providing seamless assistance without interrupting your workflow. Furthermore, a new drag-and-drop feature allows you to effortlessly transfer images from Gemini to other apps. Finally, say goodbye to tedious PDF searches. Gemini can now analyse entire documents to answer your questions, saving you valuable time and frustration.

To conclude, Google stated that it prioritises responsible AI development by adhering to strict principles for service usage. This includes marking AI-generated content like images, audio, and video with watermarks.