For decades, the dream of having a humanoid robot perform everyday tasks—folding laundry, sorting recycling, or packing bags—has fueled the imagination of scientists and sci-fi fans alike. Today, companies like Apptronik and DeepMind are bringing us closer to that reality, showcasing robots that respond to natural language commands and perform complex, multi-step tasks. But how advanced are these robots really, and how far are we from machines that think like humans?
Apptronik’s Apollo: A Peek into the Future
Apptronik recently released a blog post and video series highlighting its humanoid robot, Apollo. In the clips, Apollo folds clothes, sorts items into bins, and even packs objects into a person’s bag, all guided by simple voice instructions.
These demonstrations showcase Google DeepMind’s latest robotics AI models, Gemini Robotics 1.5 and Gemini Robotics-ER 1.5, running on Apollo. DeepMind emphasizes that the models enable robots to “perceive, plan, and think” in order to complete complex, multi-step tasks.
The excitement is understandable—seeing a humanoid robot carry out human-like chores is visually impressive and sparks imaginations about the future of household robotics.
Understanding Vision-Language-Action Models
According to Ravinder Dahiya, a Northeastern University professor of electrical and computer engineering, it’s important to remain cautious about claims that robots are capable of independent thought. Dahiya co-authored a Nature Machine Intelligence report on integrating AI with robotics and stresses that these systems are not thinking in the human sense.
- Gemini Robotics 1.5 and Gemini Robotics-ER 1.5 are vision-language-action (VLA) models, meaning they analyze the environment using a combination of vision sensors, image data, and language instructions.
- Gemini Robotics 1.5 converts visual input and instructions into motor commands.
- Gemini Robotics-ER 1.5 focuses on understanding physical spaces, planning actions, and making logistical decisions within its environment.
The integration of these models allows Apollo to interpret human commands and perform complex tasks in a structured way, but the underlying process is heavily reliant on predefined algorithms and extensive training data.
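To make that division of labor concrete, here is a minimal, purely illustrative Python sketch of a vision-language-action pipeline: a planner breaks a spoken instruction into sub-steps, and a policy turns each sub-step plus the current camera frame into motor commands. All names here (Observation, plan_task, act) are hypothetical placeholders, not the Gemini Robotics API.

```python
# Hypothetical vision-language-action (VLA) pipeline sketch.
# Class and function names are illustrative placeholders only,
# not the actual Gemini Robotics interfaces.
from dataclasses import dataclass


@dataclass
class Observation:
    image: bytes       # camera frame from the robot's vision sensors
    instruction: str   # natural-language command from the user


def plan_task(obs: Observation) -> list[str]:
    """Stand-in for the embodied-reasoning role: break a high-level
    instruction into ordered sub-steps grounded in the scene."""
    # A real model would derive these steps from the image and instruction.
    return [
        f"locate the objects relevant to: {obs.instruction}",
        "grasp the next object",
        "place it at the target location",
    ]


def act(step: str, obs: Observation) -> list[float]:
    """Stand-in for the VLA policy: map one sub-step plus the current
    camera frame to low-level motor commands (e.g. joint targets)."""
    return [0.0] * 7   # placeholder command for a 7-joint arm


if __name__ == "__main__":
    obs = Observation(image=b"", instruction="pack the water bottle into the bag")
    for step in plan_task(obs):
        command = act(step, obs)
        print(step, "->", command)
```

Even in this toy form, the structure shows why such systems look deliberate: every “decision” is a learned mapping from sensor data and text to a predefined kind of output, shaped entirely by training data.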
How “Smart” Are These Robots?
At first glance, Apollo’s abilities might seem magical. But according to Dahiya, what appears to be independent thinking is really the product of sophisticated scenario planning and algorithmic instructions.
“It’s easy to iterate visual and language models because there’s a large amount of data available,” he explains. “Vision in AI is not new—it’s been around for decades.”
The innovation lies in combining visual AI with large language models, enabling users to issue instructions in plain language instead of programming commands. This fusion represents an important step in making humanoid robots more intuitive and user-friendly.
The Limits of Current Humanoid Robots
Despite these advances, experts agree that we are still far from creating humanoid robots with human-level sensing or reasoning capabilities. Apollo can follow instructions and manipulate objects, but it cannot feel, think independently, or understand context like a human.
Researchers, including Dahiya, are exploring advanced sensing technologies that could give robots a sense of touch. Dahiya is developing electronic robot skins designed to provide tactile feedback, helping machines manipulate both soft and hard objects more accurately.
Unlike vision, where enormous datasets already exist, tactile sensing requires specialized training data that is still scarce. Other human senses, such as pain perception and smell, remain largely unreplicated in robots, highlighting the challenges of creating truly human-like machines.
Multi-Sensor Integration: The Next Step
Dahiya emphasizes that relying solely on vision is insufficient for robots operating in unpredictable environments.
“For uncertain environments, robots need to integrate multiple sensor modalities—not just vision,” he explains. This includes touch, pressure, and potentially chemical sensing to navigate real-world scenarios effectively.
Developing these capabilities is crucial for household applications. For example, folding laundry is one thing, but handling fragile items, sensing heat, or navigating crowded spaces requires complex sensor fusion that goes beyond current AI and robotic models.
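As a rough illustration of what sensor fusion adds, the toy function below (hypothetical sensor names, scales, and thresholds) blends a vision-based fragility estimate with tactile pressure and temperature readings when choosing a grip force; a vision-only system simply has no access to the last two signals.

```python
# Toy multi-sensor grasp decision. Sensor names, scales, and thresholds
# are hypothetical; the point is only that touch and temperature supply
# information a camera cannot.

def choose_grip_force(vision_fragility: float,
                      contact_pressure: float,
                      surface_temp_c: float) -> float:
    """Blend vision, tactile, and thermal cues into a single grip force.

    vision_fragility: 0.0 (rigid) to 1.0 (very fragile), estimated from images
    contact_pressure: normalized tactile-skin reading, 0.0 to 1.0
    surface_temp_c:   fingertip temperature in degrees Celsius
    """
    # Start from a nominal force and back off for fragile-looking objects.
    force = 5.0 * (1.0 - vision_fragility)   # newtons, illustrative scale

    # Touch overrides vision: a pressure spike means the object is deforming.
    if contact_pressure > 0.8:
        force *= 0.5

    # A hot surface triggers a safety response no camera would provide.
    if surface_temp_c > 60.0:
        force = 0.0   # release and abort the grasp

    return max(force, 0.0)


# Example: a fragile-looking cup that is already deforming under contact.
print(choose_grip_force(vision_fragility=0.7, contact_pressure=0.85, surface_temp_c=22.0))
```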
The Role of Large Language Models
Large language models are central to making robots like Apollo accessible and functional for non-experts. By interpreting everyday language, these models enable robots to:
- Understand task instructions without coding knowledge.
- Plan multi-step operations based on context.
- Adjust actions dynamically according to environmental inputs.
This combination of AI and robotics makes humanoid robots more adaptable, but it still depends on high-quality datasets and well-structured programming rules rather than genuine understanding or thought.
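A schematic sketch of that loop might look like the following. The function names are invented for illustration and the “model” is a canned stand-in, but the shape (plain-language instruction in, ordered steps out, re-planning when a step fails) mirrors the three capabilities listed above.

```python
# Schematic control loop around a language-model planner. The functions
# below are invented stand-ins: llm_plan would call a hosted model in a
# real system, and step_succeeded would use perception to verify progress.

def llm_plan(instruction: str, scene_summary: str) -> list[str]:
    """Turn an everyday instruction plus a text description of the scene
    into an ordered list of steps (canned output for illustration)."""
    return ["pick up the red cup", "place it on the tray", "report done"]


def step_succeeded(step: str) -> bool:
    """Pretend perception check; fails once on the placement step."""
    return step != "place it on the tray"


def run(instruction: str) -> None:
    plan = llm_plan(instruction, scene_summary="a cluttered kitchen counter")
    for step in plan:
        print("executing:", step)
        if not step_succeeded(step):
            # Dynamic adjustment: feed the failure back into the planner
            # and request a revised plan with updated context.
            plan = llm_plan(
                f"{instruction}; previous attempt failed at '{step}'",
                scene_summary="the tray is blocked by a bowl",
            )
            print("re-planning:", plan)
            break


run("clear the red cup from the counter")
```

The re-planning branch is where the “dynamic adjustment” happens, and even there the behavior is driven by data and rules rather than by genuine understanding.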
Industry Implications and Future Outlook
The advancements in humanoid robotics have significant implications for homes, industries, and healthcare. Imagine robots assisting the elderly, helping with chores, or performing repetitive industrial tasks with precision.
However, widespread adoption depends on solving key limitations:
- Sensing capabilities: Developing tactile and multi-sensor feedback systems.
- Contextual awareness: Allowing robots to understand environments dynamically.
- Human-like reasoning: Integrating AI models that can make ethical and safety decisions.
While Apollo and similar robots represent progress, experts like Dahiya caution that claims of autonomous thinking are overstated. The current generation of humanoid robots is a proof of concept, showcasing potential rather than delivering fully autonomous solutions.
Frequently Asked Questions
What is a humanoid robot?
A humanoid robot is a machine designed to resemble and mimic human actions. It can perform tasks like picking up objects, folding clothes, or interacting with people using sensors, AI, and motor controls.
Are humanoid robots ready for home use?
Not yet. While robots like Apptronik’s Apollo can perform specific tasks, experts caution that they lack the human-level reasoning, touch, and adaptability required for safe, everyday household use.
How do these robots understand commands?
They use large language models and vision-language-action models to process visual input and natural-language instructions, converting them into precise motor actions.
Can humanoid robots think like humans?
No. Current robots rely on pre-programmed algorithms and training data; they don’t possess consciousness or independent reasoning, despite appearing intelligent.
What tasks can robots like Apollo currently perform?
Apollo can fold clothes, sort items into bins, and pack objects into bags based on voice commands. However, these tasks are limited to controlled environments.
How far are we from having fully autonomous humanoid robots?
Experts say we are decades away. While progress in AI and robotics is rapid, creating robots with human-like sensing, reasoning, and decision-making remains a major challenge.
Are humanoid robots safe for households now?
Currently, they are mostly safe in controlled demonstrations, but widespread household use is not recommended until advanced sensing and safety protocols are fully developed.
Conclusion
Humanoid robots are making remarkable strides, blending AI, vision, and language models to perform household tasks with increasing efficiency. Robots like Apollo showcase the potential of this technology, offering a glimpse of a future where machines can assist in everyday chores. However, experts caution that we are still far from achieving human-level sensing, reasoning, and adaptability in robots. Current models rely heavily on structured data and algorithms, lacking true understanding or independent thought.

