Google’s SIMA 2 agent uses Gemini to reason and act in virtual worlds

Google DeepMind shared on Thursday a research preview of SIMA 2, the next generation of its generalist AI agent. It integrates the language and reasoning powers of Gemini, Google’s large language model, to move beyond simply following instructions to understanding and interacting with its environment.

Like many of DeepMind’s projects, including AlphaFold, the first version of SIMA was trained on hundreds of hours of video game data to learn how to play multiple 3D games like a human, even some games it wasn’t trained on. SIMA 1, unveiled in March 2024, could follow basic instructions across a wide range of virtual environments, but it only had a 31% success rate for completing complex tasks, compared to 71% for humans.

“SIMA 2 is a step change and improvement in capabilities over SIMA 1,” Joe Marino, senior research scientist at DeepMind, said in a press briefing. “It’s a more general agent. It can complete complex tasks in previously unseen environments. And it’s a self-improving agent. So it can actually self-improve based on its own experience, which is a step towards more general-purpose robots and AGI systems more generally.”

SIMA 2 is powered by the Gemini 2.5 Flash-Lite model. AGI refers to artificial general intelligence, which DeepMind defines as a system capable of a wide range of intellectual tasks with the ability to learn new skills and generalize knowledge across different areas.

Working with so-called “embodied agents” is crucial to generalized intelligence, DeepMind’s researchers say. Marino explained that an embodied agent interacts with a physical or virtual world via a body – observing inputs and taking actions much like a robot or human would – whereas a non-embodied agent might interact with your calendar, take notes, or execute code.

Jane Wang, a research scientist at DeepMind with a background in neuroscience, told TechCrunch that SIMA 2 goes far beyond gameplay.

“We’re asking it to actually understand what’s happening, understand what the user is asking it to do, and then be able to respond in a common-sense way that’s actually quite difficult,” Wang said.

By integrating Gemini, SIMA 2 doubled its predecessor’s performance, uniting Gemini’s advanced language and reasoning abilities with the embodied skills developed through training.

Marino demoed SIMA 2 in No Man’s Sky, where the agent described its surroundings – a rocky planet surface – and determined its next steps by recognizing and interacting with a distress beacon. SIMA 2 also uses Gemini to reason internally. In another game, when asked to walk to the house that’s the color of a ripe tomato, the agent showed its thinking – ripe tomatoes are red, therefore I should go to the red house – then found and approached it.

Being Gemini-powered also means SIMA 2 follows instructions based on emojis: “You instruct it 🪓🌲, and it’ll go chop down a tree,” Marino said.

Marino also demonstrated how SIMA 2 can navigate newly generated photorealistic worlds produced by Genie, DeepMind’s world model, correctly identifying and interacting with objects like benches, trees, and butterflies.

Gemini also enables self-improvement without much human data, Marino added. Where SIMA 1 was trained entirely on human gameplay, SIMA 2 uses it as a baseline to provide a strong initial model. When the team puts the agent into a new environment, it asks another Gemini model to create new tasks and a separate reward model to score the agent’s attempts. Using these self-generated experiences as training data, the agent learns from its own mistakes and gradually performs better, essentially teaching itself new behaviors through trial and error as a human would, guided by AI-based feedback instead of humans.
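
For readers who want a concrete picture of that loop, here is a toy sketch in Python. It is not DeepMind's code: every name in it (`propose_task`, `score_attempt`, `ToyAgent`, `self_improve`) is hypothetical, the "models" are simple stand-in functions, and the update rule is an assumption chosen only to show the shape of the cycle, in which one component proposes tasks, the agent attempts them, a separate scorer grades the attempts, and the graded attempts become training data.

```python
import random
from dataclasses import dataclass, field

# Hypothetical task list; a real system would generate tasks with a language model.
TASKS = ["chop tree", "find beacon", "go to red house"]


def propose_task(rng: random.Random) -> str:
    """Stand-in for the task-generating model: pick a task to attempt."""
    return rng.choice(TASKS)


def score_attempt(succeeded: bool) -> float:
    """Stand-in for the separate reward model: 1.0 for success, 0.0 otherwise."""
    return 1.0 if succeeded else 0.0


@dataclass
class ToyAgent:
    # Per-task success probability; the low starting value plays the role
    # of the baseline model trained on human gameplay.
    skill: dict = field(default_factory=lambda: {t: 0.2 for t in TASKS})

    def attempt(self, task: str, rng: random.Random) -> bool:
        """Try a task; succeed with probability equal to the current skill."""
        return rng.random() < self.skill[task]

    def train(self, experiences, lr: float = 0.1) -> None:
        """Assumed toy update: every scored attempt moves skill toward 1.0,
        with successes counting four times as much as failures."""
        for task, reward in experiences:
            step = lr if reward > 0 else lr / 4
            self.skill[task] += step * (1.0 - self.skill[task])


def self_improve(agent: ToyAgent, rounds: int = 50, seed: int = 0) -> ToyAgent:
    """Run the propose -> attempt -> score -> train cycle for several rounds."""
    rng = random.Random(seed)
    for _ in range(rounds):
        batch = []
        for _ in range(10):
            task = propose_task(rng)
            succeeded = agent.attempt(task, rng)
            batch.append((task, score_attempt(succeeded)))
        agent.train(batch)  # self-generated experience used as training data
    return agent


if __name__ == "__main__":
    trained = self_improve(ToyAgent())
    for task, skill in trained.skill.items():
        print(f"{task}: {skill:.2f}")
```

In this sketch no human labels enter the loop after the starting skill values are set, which mirrors the article's point that feedback comes from an AI scorer rather than from people. How SIMA 2 actually updates its model is not described in the briefing.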

DeepMind sees SIMA 2 as a step toward unlocking more general-purpose robots.

“If we think of what a system needs to do to perform tasks in the real world, like a robot, I think there are two components of it,” Frederic Besse, senior staff research engineer at DeepMind, said during a press briefing. “First, there is a high-level understanding of the real world and what needs to be done, as well as some reasoning.”

If you ask a humanoid robot in your house to go check how many cans of beans you have in the cupboard, the system needs to understand all of the different concepts – what beans are, what a cupboard is – and navigate to that location. Besse says SIMA 2 touches more on that high-level behavior than it does on lower-level actions, which he refers to as controlling things like physical joints and wheels.

The team declined to share a specific timeline for implementing SIMA 2 in physical robotics systems. Besse told TechCrunch that DeepMind’s recently unveiled robotics foundation models – which can also reason about the physical world and create multi-step plans to complete a mission – were trained differently and separately from SIMA.

While there’s also no timeline for releasing more than a preview of SIMA 2, Wang told TechCrunch the goal is to show the world what DeepMind has been working on and see what kinds of collaborations and potential uses are possible.