Google Unveils Thinking Robot Models, But Experts Say They’re Too Slow for Real-World Use

By CTOL Editors - Lang Wang

Google’s Gemini Robotics AI Dazzles on Stage, but Experts Warn It’s Not Ready for Prime Time

Google DeepMind pulled back the curtain on Wednesday with what it described as a major step toward “solving AGI in the physical world.” The company showcased two new artificial intelligence models that don’t just execute commands but appear to reason, plan, and carry out robotic tasks in ways that look astonishingly human.

The polished demos turned heads. Robots folded laundry, sorted trash, and explained their decisions out loud as if they were thinking through the process. Yet behind the spotlight, experts urge caution. They argue that while the breakthroughs are exciting, the road to reliable, everyday intelligent machines remains long and full of obstacles.

[Figure: Benchmark performance of Gemini Robotics models]

A New Breed of Robots

The stars of Google’s announcement were Gemini Robotics 1.5 and its sibling Gemini Robotics-ER 1.5. Unlike older robotics systems that acted more like autopilot software, these models aim to think before they act. They can reason about their surroundings, break down multi-step tasks, and even adapt when something unexpected happens.

Carolina Parada, a researcher on the project, summed up Google’s ambition: “We’re powering an era of physical agents — enabling robots to perceive, plan, think, use tools, and act to better solve complex, multi-step tasks.”

Here’s how it works. The Gemini Robotics-ER 1.5 model acts like the robot’s “high-level brain.” It figures out what needs to be done, using spatial awareness, natural language, and online tools. For example, if asked to sort waste, it can search Google for local recycling rules before deciding where each item belongs. The standard Gemini Robotics 1.5 then takes those plans and converts them into precise movements, all while keeping up its own reasoning process.
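
To make that division of labor concrete, here is a minimal sketch of the planner/executor loop in Python. Every helper in it (web_search, plan_task, execute_step) is a hypothetical stand-in for illustration, not Google’s actual interface:

```python
# Hypothetical sketch of the two-model split: a high-level planner
# (standing in for Gemini Robotics-ER 1.5) decomposes the task, and a
# low-level executor (standing in for Gemini Robotics 1.5) carries out
# each step. None of these functions are Google's real API.

def web_search(query: str) -> str:
    """Stand-in for the planner's tool use, e.g. looking up recycling rules."""
    return "glass -> recycling; food scraps -> compost"

def plan_task(instruction: str, context: str) -> list[str]:
    """Stand-in for the 'high-level brain': break the job into sub-steps."""
    return [f"{instruction} ({rule})" for rule in context.split("; ")]

def execute_step(step: str) -> bool:
    """Stand-in for the executor: convert one sub-step into motion."""
    print(f"executing: {step}")  # the real model narrates its reasoning
    return True

def run(instruction: str) -> None:
    context = web_search(instruction)             # consult online tools first
    for step in plan_task(instruction, context):  # plan, then act step by step
        if not execute_step(step):
            print(f"step failed, would re-plan: {step}")
            break

run("sort the waste")
```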

In Google’s demo, a robot received the command to sort objects into compost, recycling, and trash bins. Without extra training, it researched local guidelines, analyzed each item, and carried out the task — narrating its thought process along the way.

Perhaps most impressive, the models can perform “cross-embodiment learning.” Skills gained on one robot design transfer seamlessly to completely different machines. A task learned on Google’s ALOHA 2 research robot carried over to Apptronik’s humanoid Apollo and the Franka bi-arm robot with no extra coaching. That kind of generalization has long been a holy grail for roboticists.

Flashy Demos, But Not the Full Story

Despite the jaw-dropping demonstrations, industry veterans advise a more sober view. The engineering team at CTOL.digital described the technology as “impressive in demos but slow and early-stage” in real-world trials.

The ability to “think before acting” is genuinely novel, they said, and could reduce the painstaking fine-tuning usually needed for different robots. But in practice, the models showed noticeable lag and shaky reliability in messy, unpredictable environments.

Latency emerged as a major issue. The models’ step-by-step reasoning, governed by what Google calls a “thinking budget,” demands heavy computation. That slows performance, a dealbreaker for robots expected to work quickly in the real world.
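
For developers on the preview, that trade-off is tunable. Below is a minimal sketch of capping the reasoning compute through the Gemini API’s thinking-budget setting, using the google-genai Python client; the model id shown is an assumption based on Google’s preview naming, so substitute whatever Google AI Studio lists:

```python
# Minimal sketch: cap the "thinking budget" to trade plan depth for speed.
from google import genai
from google.genai import types

client = genai.Client()  # reads the API key from the GEMINI_API_KEY env var

response = client.models.generate_content(
    model="gemini-robotics-er-1.5-preview",  # assumed preview id, not confirmed
    contents="Plan the steps to sort these items into compost, recycling, and trash.",
    config=types.GenerateContentConfig(
        # Fewer thinking tokens lowers latency but yields shallower plans.
        thinking_config=types.ThinkingConfig(thinking_budget=1024),
    ),
)
print(response.text)
```

A smaller budget trims latency at the cost of shallower reasoning, which is exactly the tension testers flagged.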

“The preview limitations include shifting APIs, compute costs, and heavy dependency on prompt quality and visual inputs,” the CTOL.digital team noted. In other words, these models are ideal for experimentation but far from ready for factories, hospitals, or homes.

Benchmarks vs. Real Life

Google didn’t come empty-handed. The company boasted that Gemini Robotics-ER 1.5 set records across 15 academic benchmarks, including tests of spatial reasoning, video analysis, and embodied question-answering. On paper, the model looks like a star pupil.

But benchmarks rarely capture the chaos of daily life. A robot might ace sorting colorful blocks in a pristine lab, only to freeze up when confronted with dim lighting, cluttered countertops, or oddly shaped objects in a real kitchen. That gulf between theory and practice remains one of robotics’ toughest hurdles.

Safety in the Spotlight

With machines that can reason more autonomously, safety is no longer a side issue — it’s central. Google says it has built in layers of protection, including high-level checks on safety before any action, alignment with broader AI safety policies, and low-level systems for collision avoidance.
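
Google has not published the internals, but the layered approach it describes can be pictured as a pair of independent gates. The sketch below is purely illustrative; every function in it is a hypothetical stand-in:

```python
# Hypothetical sketch of layered safety gating: a high-level semantic
# check runs before any action, and a low-level physical guard runs last.

def semantic_safety_check(action: str) -> bool:
    """High-level gate: reject actions that violate safety policy."""
    banned = {"hand scissors point-first", "block emergency exit"}
    return action not in banned

def collision_free(action: str) -> bool:
    """Low-level gate: stand-in for sensor-based collision avoidance."""
    return True  # a real system would consult sensors and a motion planner

def act(action: str) -> None:
    if not semantic_safety_check(action):
        print(f"refused (policy): {action}")
    elif not collision_free(action):
        print(f"refused (collision risk): {action}")
    else:
        print(f"executing: {action}")

act("place bottle in recycling bin")
```

Each extra gate adds a check before any motion, which is one reason safety and speed pull against each other in the current system.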

The company also rolled out a new version of its ASIMOV benchmark, a dataset designed to test how well robots handle semantic safety. Early trials showed Gemini Robotics-ER 1.5 handled safety rules fairly well, thanks in part to its ability to think about context before moving.

Still, CTOL.digital engineers flagged concerns. They stressed that “safety layers are required” and warned that tradeoffs between safety and speed will continue to dog the system in its current form.

Why It Matters

Google’s unveiling highlights a shift in how the tech world sees AI’s future. Instead of just automating repetitive tasks, the focus now is on creating machines that can reason and adapt like people. If it works, the payoff could be enormous. Smarter robots could revolutionize industries from manufacturing and logistics to healthcare and home assistance.

For developers, the Gemini Robotics-ER 1.5 model is already available through Google AI Studio. The more advanced Gemini Robotics 1.5 is limited to select partners for now. That staggered release suggests Google knows the technology still has limitations, even as it drums up excitement.

CTOL.digital captured the mood best: “There’s genuine excitement around unified planning and the ‘think before act’ framing. But there’s also skepticism about whether this represents genuine ‘thinking’ or sophisticated marketing.”

The Long Road Ahead

Google’s announcement lands in the middle of an arms race among tech giants to prove their large language models can do more than churn out text. By grounding AI in physical tasks, Google is trying to claim an edge.

Even so, independent evaluators predict the technology is “still years away from household adoption,” though it may prove useful sooner in enterprise pilots, where conditions can be tightly controlled.

For now, Gemini Robotics 1.5 feels less like a polished product and more like a moonshot — a glimpse of what’s possible, not what’s ready today. As robots begin to plan, reason, and act in ways that feel startlingly human, the question isn’t whether they’ll reshape daily life, but when.

History tells us revolutions don’t happen overnight. They unfold in small, almost invisible steps. One day, a robot might quietly sort your recycling or fold your clothes without fuss. That’s when you’ll know the age of thinking machines has truly arrived.
