Google I/O is upon us next week, and as always, there’s sure to be an avalanche of new products, features, and upgrades unveiled amid the hoopla on stage. Google Gemini and its many AI offshoots will certainly be among them. Confirmed previews and rumors abound, but there are some announcements I’m particularly keen to hear about. I’ve put together a set of highlights I anticipate coming from Google as it shares the next stage of its plans for Gemini.
Project Mariner’s AI agents
Project Mariner is Google’s answer to the growing prevalence of AI agents like Manus and Browser Use. Rather than simply surfacing links, Mariner is designed to interact with the web the way a human does, using an invisible mouse and keyboard to fill in forms, find things within websites, and click the necessary buttons to complete tasks.
Mariner might fill out your tax forms, book a trip, and send in any complaints you have to a company’s customer service. Though not strictly Gemini, Google DeepMind’s creation is very much part of the story of Gemini helping automate digital activities for people. In fact, Mariner is supposedly going to integrate with Gemini Advanced and Google Chrome. This would be especially impactful for people who manage repetitive admin tasks, navigate government or insurance websites, or simply want a more efficient way to handle online chores.
Gemini’s personal memory
Persistent memory is a constant but usually imperfectly realized dream for generative AI assistants. Google is expected to unveil an upgrade to Gemini’s memory that will mean no longer needing to remind the AI of your preferences. Gemini could remember that you dislike morning meetings, prefer metric units, or always book aisle seats on flights.
Like ChatGPT, Gemini is expected to both remember things from its interactions with you and offer a custom instructions setting where you can manually add things you want it to remember. Of course, Google is likely to assure users that the persistent memory feature is opt-in and that it includes controls allowing users to view, edit, and delete what Gemini remembers.
Imagen 4 and Veo 3
Imagen and Veo are Google’s generative AI image and video creation tools, respectively. Google is expected to debut the latest versions of both at I/O. Imagen 4 is supposed to be much better at producing photorealistic images and at matching what the prompt actually asks for. It should also be better at staying consistent in whatever style you request. Veo 3 is also going for a more consistent style from clip to clip. Both will also be integrated with Gemini, giving easy access to content creators, students, and really anyone who wants a quick picture or video.
Sharing Gems
Gemini Gems, the customized and focused Gemini models any user can create, are useful for all kinds of activities. You can make your own motivational coach, a meal-planning nutritionist, or an art critic for your latest drawings. What you can’t do right now is share them with other people.
Gems are basically like the custom GPTs available from ChatGPT, except GPTs are shareable and findable in the GPT Store. Google is expected to match that and start allowing users to share their Gems with others. You might see anything from a classroom-specific tutoring Gem to tools for coding for different outlets, or just a bunch of Gems designed to recommend movies. And a Gem marketplace wouldn’t just benefit users. Google would love to build up a community around Gemini like it has with apps on the Play Store. Shareable Gems might be the best gateway to that kind of community-driven ecosystem.
erichs211@gmail.com (Eric Hal Schwartz)