Models, simulation, edge compute, and cloud training are all part of the same system. The hard part is where they fail to meet cleanly.
Core concept
A VLA takes images and task context as input, then outputs actions. The attraction is simpler software and broader generalization across tasks.
The unresolved question is reliability across new environments, tools, bodies, and failure cases.
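As a rough sketch of that input/output contract (the interface below is hypothetical and not any specific model's API; the 7-DoF action shape is an assumption), a VLA policy maps an image plus a task instruction to a low-level action:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Action:
    """Joint-space command; real systems also carry gripper state, timing, etc."""
    joint_deltas: List[float]

class VLAPolicy:
    """Hypothetical vision-language-action policy interface."""

    def act(self, image: bytes, instruction: str) -> Action:
        # A real model would encode the image and text jointly and decode
        # an action token sequence; this sketch returns a no-op command.
        return Action(joint_deltas=[0.0] * 7)

policy = VLAPolicy()
a = policy.act(image=b"", instruction="pick up the red mug")
print(len(a.joint_deltas))  # 7 joints in this sketch
```

The point of the shape is the simplification the section describes: perception, language grounding, and control collapse into one call.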
Models, simulation, digital twins, and edge compute are increasingly being sold as one integrated path.
- Models: A general model layer for robots. If builders adopt it, more of robotics starts to look like a software platform problem.
- Synthetic data: Scenario generation aimed at reducing the cost of learning from scarce physical traces.
- Simulation: Training and testing environments that try to move more work upstream before a robot reaches the field.
- Digital twins: Virtual copies of factories, warehouses, and workcells for planning, layout testing, and operations.
- Edge compute: On-robot compute for low-latency inference, perception, and safety-critical loops.
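The last point, safety-critical loops, has a concrete shape: a control step that refuses to act on stale model output. A minimal sketch (the 50 Hz budget and the fallback action are assumptions, not from the source):

```python
import time

CONTROL_PERIOD_S = 0.02  # assumed 50 Hz control loop budget

def run_step(infer, safe_stop_action):
    """Run one control step; fall back to a safe action on a deadline miss."""
    start = time.monotonic()
    action = infer()
    if time.monotonic() - start > CONTROL_PERIOD_S:
        # Inference overran the loop period: the world has moved on,
        # so do not trust the stale output.
        return safe_stop_action
    return action

def fast_infer():
    return "move"

def slow_infer():
    time.sleep(0.05)  # simulate an inference stall past the deadline
    return "move"

print(run_step(fast_infer, "stop"))  # move
print(run_step(slow_infer, "stop"))  # stop
```

This is why such loops stay on the robot: a cloud round trip can blow the budget on any given step.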
The model layer matters because it determines how much robotics can inherit the economics of software.
- VLA: The strongest ecosystem push so far around humanoid and general robot control.
- VLA: A bet that one policy family can generalize across many robot bodies and tasks.
- Vision-language to action: An early proof that language-conditioned models can output robot actions rather than only descriptions.
- Multimodal: Google’s push toward stronger spatial reasoning and dexterity in the Gemini family.
- Open source: A generalist open policy trained on Open X-Embodiment data.
- On-device: Small open models matter in robotics because many workloads cannot assume a persistent cloud connection.
The boundary between robot and cloud is mostly a question of latency, safety, and cost.
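That framing can be written down as a placement rule. A toy sketch (all thresholds and the cloud round-trip figure are invented for illustration):

```python
def place_workload(latency_budget_ms: float, safety_critical: bool,
                   cloud_rtt_ms: float, cost_per_call: float,
                   max_cost_per_call: float) -> str:
    """Toy rule: safety-critical or tight-latency loops stay on-robot;
    everything else goes to the cloud only if round trip and cost fit."""
    if safety_critical or latency_budget_ms < cloud_rtt_ms:
        return "edge"
    if cost_per_call > max_cost_per_call:
        return "edge"
    return "cloud"

# A 10 ms safety loop stays on the robot; a 500 ms planning call can leave it.
print(place_workload(10, True, 80, 0.001, 0.01))    # edge
print(place_workload(500, False, 80, 0.001, 0.01))  # cloud
```

The real decision has more dimensions (privacy, connectivity, fleet scale), but latency, safety, and cost already partition most workloads.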
Reality gap
95% success in simulation vs. 60% success in reality.
That gap is the real technical problem. Better simulation matters only if it improves what survives once the robot reaches real people, inventory, and environments.
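The gap is easy to quantify once both success rates are measured. A small sketch (the trial counts are assumptions; only the 95% and 60% rates come from the figures above):

```python
def transfer(sim_successes: int, sim_trials: int,
             real_successes: int, real_trials: int) -> dict:
    """Summarize sim-to-real transfer: both success rates, the absolute gap,
    and the fraction of sim performance that survives contact with reality."""
    sim_rate = sim_successes / sim_trials
    real_rate = real_successes / real_trials
    return {
        "sim": sim_rate,
        "real": real_rate,
        "gap": sim_rate - real_rate,
        "retained": real_rate / sim_rate,  # 1.0 would be perfect transfer
    }

# The rates above: 95/100 in sim, 60/100 in reality (assumed 100 trials each).
stats = transfer(95, 100, 60, 100)
print(f"{stats['gap']:.2f} gap, {stats['retained']:.0%} retained")
```

Tracking "retained" over time is one way to tell whether better simulation is actually improving what survives deployment, rather than just inflating the sim number.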
World models
World models matter because they may let teams learn more from each unit of real-world data. They do not remove the need for verification.
The loop: capture, simulate, expand, verify.
1. Real-world capture: Fleet sensors, teleoperation traces, and egocentric video record what actually happened in the field.
2. World models: Physics-aware models turn those traces into replayable environments and counterfactuals.
3. Synthetic expansion: Generated scenarios widen coverage before scarce robot-hours are spent in reality.
4. Verification: Reality remains the gate. Policies still need real-world validation before they earn deployment.
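One way to make the verification gate concrete is a statistical deployment check: require that the lower confidence bound on real-world success, not the raw rate, clears a threshold. A sketch (the 90% threshold and trial counts are assumptions; the Wilson score interval is one standard choice, not something the source specifies):

```python
import math

def wilson_lower(successes: int, trials: int, z: float = 1.96) -> float:
    """Lower bound of the 95% Wilson score interval for a success rate."""
    if trials == 0:
        return 0.0
    p = successes / trials
    denom = 1 + z * z / trials
    center = p + z * z / (2 * trials)
    margin = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials))
    return (center - margin) / denom

def ready_to_deploy(successes: int, trials: int, threshold: float = 0.90) -> bool:
    """Gate: deploy only if we are confident the true success rate clears it."""
    return wilson_lower(successes, trials) >= threshold

# Same observed rate (~96.7%), very different confidence.
print(ready_to_deploy(58, 60))    # False: too few trials to be sure
print(ready_to_deploy(580, 600))  # True
```

The design choice is the point: world models and synthetic expansion can raise the observed rate cheaply, but only real trials shrink the interval that the gate actually checks.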