A day after OpenAI impressed with a startlingly improved ChatGPT AI model, Google showed off an equally stunning vision for how AI will improve the products that billions of people use every day.
The updates, announced at its annual Google I/O developer conference, come as the company is trying to push beyond its core advertising business with new devices and AI-powered tools. Artificial intelligence was so top of mind during the event that Google CEO Sundar Pichai said at the end of the presentation that the term "AI" had been said 120 times - as counted by none other than its AI platform Gemini.
During the keynote, Google showed how it wants its AI products to become a bigger part of users' lives, such as by sharing information, interacting with others, finding objects around the house, making schedules, shopping and using an Android device. Google essentially wants its AI to be part of everything you do.
Pichai kicked off the event by highlighting various new features powered by its latest AI model Gemini 1.5 Pro. One new feature, called Ask Photos, allows users to search their saved pictures for deeper insights, such as asking when their daughter learned to swim or recalling their license plate number.
He also showed how users can ask Gemini 1.5 Pro to summarize all recent emails from their child's school, with the model analysing attachments, summarizing key points and spitting out action items.
Meanwhile, Google executives took turns demonstrating other capabilities, such as how the latest model could "read" a textbook and turn it into a kind of AI lecture featuring natural-sounding teachers that answer questions.
Just one day before, OpenAI - one of the tech industry's leaders in artificial intelligence - unveiled a new AI model that it says will make chatbot ChatGPT smarter and easier to use. GPT-4o aims to turn ChatGPT into a digital personal assistant that can engage in real-time, spoken conversations and interact using text and "vision." It can view screenshots, photos, documents or charts uploaded by users and have a conversation about them.
Google also showed off Gemini's latest abilities to take different kinds of input - "multimodal" capabilities to take in text, voice or images - as a direct response to ChatGPT's efforts. A Google executive also demoed a virtual "teammate" that can help users stay on top of to-do lists, organize data and manage workflow.
The company also highlighted search improvements that allow users to ask more natural or more focused questions and that provide various versions of the responses, such as in-depth or summarized results. Search can also make targeted suggestions, such as recommending kid-friendly restaurants in certain locations, or note what might be wrong with a gadget, such as a camera, when a user takes a video of the issue via Google Lens. The goal is to take the legwork out of searching on Google, the company said.
The company also briefly teased Project Astra, developed by Google's DeepMind AI lab, which will allow AI assistants to help users in their everyday lives by using phone cameras to interpret information about the real world, such as identifying objects and even finding misplaced items. It also hinted at how it would work on augmented reality glasses.
Google said that later this year it will integrate more AI functions into phones. For example, users will be able to drag and drop images created by AI into Google Messages and Gmail and ask questions about YouTube videos and PDFs on an Android device.
And in a move that will likely appeal to many, a new built-in tool for Android will help detect suspicious activity in the middle of a call, such as a scammer trying to imitate a user's bank.
According to analyst Jacob Bourne, from market research firm Emarketer, it's no surprise AI took centre stage at this year's Google developer conference.
"By showcasing its latest models and how they'll power existing products with strong consumer reach, Google is demonstrating how it can effectively differentiate itself from rivals," he said.
He believes the reception of the new tools will be an indicator of how well Google can adapt its search product to meet the demands of the generative AI era.
"To maintain its competitive edge and satisfy investors, Google will need to focus on translating its AI innovations into profitable products and services at scale," he said.
As the company grows its AI footprint, it said it will introduce more protections to cut down on potential misuse. Google is expanding its existing SynthID feature to detect AI-generated content. Last year, the tool added watermarks to AI-generated images and audio.
Google said it is also partnering with experts and institutions to test and improve the capabilities in its new models.
Although the company has doubled down on artificial intelligence in the past year, it has also met significant roadblocks. Last year, shortly after Google introduced its generative AI tool - then called Bard and since renamed Gemini - the company's share price dropped after a demo video of the tool showed it producing a factually inaccurate response to a question about the James Webb Space Telescope.
More recently, the company hit pause in February on Gemini's ability to generate images of people after it was blasted on social media for producing historically inaccurate images that largely showed people of colour in place of White people.
Gemini, like other AI tools such as ChatGPT, is trained on vast troves of online data. Experts have long warned about the shortcomings of AI tools, such as the potential for inaccuracies, biases and the spreading of misinformation. Still, many companies are forging ahead on AI tools or partnerships.
Apple may be interested in licensing Google's Gemini AI engine, which includes chatbots and other AI tools, and building it into upcoming iPhones and its iOS 18 features, Bloomberg reported in March. Apple is also reportedly talking to ChatGPT creator OpenAI.
CNN