Image Credits: Justin Sullivan / Getty Images
4:18 PM PST · December 11, 2025
Google released on Thursday a “reimagined” version of its research agent Gemini Deep Research based on its much-ballyhooed state-of-the-art foundation model, Gemini 3 Pro.
This new agent isn’t just designed to produce research reports, though it can still do that. It now allows developers to embed Google’s SOTA-model research capabilities into their own apps. That capability is made possible through Google’s new Interactions API, which is designed to give devs more control in the coming agentic AI era.
The new Gemini Deep Research tool is an agent equipped to synthesize mountains of information and handle a large context dump in the prompt. Google says it’s used by customers for tasks ranging from due diligence to drug toxicity safety research.
Google also says it will soon be integrating this new deep research agent into services, including Google Search, Google Finance, its Gemini App and its popular NotebookLM. This is another step towards preparing for a world where humans don’t Google anything anymore; their AI agents do.
The tech giant says that Deep Research benefits from Gemini 3 Pro’s status as its “most factual” model, one that is trained to minimize hallucinations during complex tasks.
AI hallucinations, where the LLM just makes stuff up, are an especially crucial issue for long-running, deep reasoning agentic tasks, in which many autonomous decisions are made over minutes, hours, or longer. The more choices an LLM has to make, the greater the chance that even one hallucinated choice will invalidate the entire output.
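To put rough numbers on that compounding risk, here is a minimal Python sketch. It rests on two simplifying assumptions that do not come from Google: every decision has the same hallucination rate, and decisions are independent of one another. The function name and the 1% rate are hypothetical, chosen only for illustration.

```python
def chance_of_any_hallucination(per_step_rate: float, steps: int) -> float:
    """Probability that at least one of `steps` decisions is hallucinated.

    Assumes (for illustration only) that each decision fails independently
    with the same probability `per_step_rate`.
    """
    if not 0.0 <= per_step_rate <= 1.0:
        raise ValueError("per_step_rate must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    # P(at least one error) = 1 - P(no errors in any step)
    return 1.0 - (1.0 - per_step_rate) ** steps


if __name__ == "__main__":
    # Hypothetical 1% per-decision hallucination rate.
    for steps in (10, 100, 1000):
        risk = chance_of_any_hallucination(0.01, steps)
        print(f"{steps:>5} decisions -> {risk:.1%} chance of at least one hallucination")
```

Under these assumptions, even a small per-decision error rate makes at least one hallucination likely once a task involves a hundred or more decisions, which is why factuality matters more for long-running agents than for single-shot answers.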
To prove its progress claims, Google has also created yet another benchmark (as if the AI world needs another one). The new benchmark is unimaginatively named DeepSearchQA, and is intended to test agents on complex, multi-step information-seeking tasks. Google has open sourced this benchmark.
It also tested Deep Research on Humanity’s Last Exam, a much more interestingly named, independent benchmark of general knowledge filled with impossibly niche tasks; and BrowserComp, a benchmark for browser-based agentic tasks.
As you might expect, Google’s new agent bested the competition on its own benchmark and on Humanity’s Last Exam. However, OpenAI’s ChatGPT 5 Pro was a surprisingly close second all the way around and slightly bested Google on BrowserComp.
But those benchmark comparisons were obsolete almost the moment Google published them, because on the same day, OpenAI launched its highly anticipated GPT 5.2, codenamed Garlic. OpenAI says its newest model bests its rivals, particularly Google, on a suite of the typical benchmarks, including OpenAI’s homegrown one.
Perhaps one of the most interesting parts of this announcement was the timing. Knowing that the world was awaiting the release of Garlic, Google dropped some AI news of its own.














