Why do Sora 2 and Veo 3.1 fail at physics simulations for jumping, throwing, and falling?
Both Sora 2 and Veo 3.1 struggle with physics simulations because they predict visual patterns rather than understanding actual physical laws governing motion, gravity, and momentum.
Fundamental architectural limitation: According to research from MIT's Computer Science and Artificial Intelligence Laboratory, current video generation models lack explicit physics engines and instead rely on correlations learned from training data. These systems approximate what falling or jumping "looks like" without calculating trajectories, applying gravitational acceleration (about 9.8 m/s² near Earth's surface), or enforcing conservation of momentum. When generating complex motion such as a thrown basketball, the models frequently produce unrealistic arcs, inconsistent velocities, or objects that defy basic Newtonian mechanics.
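For contrast, here is a minimal sketch of the calculation an explicit physics engine would perform for a thrown ball or a jump. It assumes ideal projectile motion (no air drag, no spin, launch from ground level), and the function names are illustrative rather than taken from any particular engine.

```python
import math

G = 9.8  # m/s^2, the gravitational acceleration quoted in the text


def projectile_position(v0, angle_deg, t, g=G):
    """Return (x, y) in metres at time t for a launch from the origin.

    Assumes ideal projectile motion: constant gravity, no air drag.
    """
    theta = math.radians(angle_deg)
    vx = v0 * math.cos(theta)  # horizontal speed stays constant
    vy = v0 * math.sin(theta)  # initial vertical speed
    x = vx * t
    y = vy * t - 0.5 * g * t * t  # gravity bends the path into a parabola
    return x, y


def flight_time(v0, angle_deg, g=G):
    """Time in seconds until the object returns to its launch height."""
    return 2 * v0 * math.sin(math.radians(angle_deg)) / g


def apex_height(v0, angle_deg, g=G):
    """Peak height in metres, fixed entirely by the vertical takeoff speed."""
    vy = v0 * math.sin(math.radians(angle_deg))
    return vy * vy / (2 * g)


if __name__ == "__main__":
    # Hypothetical throw: 8 m/s at 50 degrees above horizontal.
    total = flight_time(8.0, 50.0)
    for step in range(5):
        t = total * step / 4
        x, y = projectile_position(8.0, 50.0, t)
        print(f"t={t:.2f}s  x={x:.2f}m  y={y:.2f}m")
```

The point of the sketch is that once the launch speed and angle are known, the arc, the flight time, and the peak height are all determined. A model that only predicts pixels has no such constraint tying a jump's height to its takeoff velocity.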
Real-world failure patterns: Users consistently report specific glitches, including objects that pause mid-air, trajectories that curve impossibly, characters whose jump heights don't match their takeoff velocity, and thrown objects that accelerate or decelerate unnaturally. These temporal consistency errors become especially apparent in sequences longer than 3 to 4 seconds, where the models lose track of object momentum and position.
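Glitches like a mid-air pause can be detected numerically if an object's height has been tracked frame by frame. The sketch below is one possible check, not an established benchmark: it estimates vertical acceleration from three consecutive heights and flags frames where the estimate strays from free fall. The tracked heights, the frame interval, and the 25% tolerance are all assumptions chosen for illustration.

```python
def vertical_accelerations(heights, dt):
    """Estimate vertical acceleration at each interior frame.

    Uses the central second difference (y[i+1] - 2*y[i] + y[i-1]) / dt^2,
    which is exact for a parabolic (free-fall) height profile.
    """
    return [
        (heights[i + 1] - 2 * heights[i] + heights[i - 1]) / dt ** 2
        for i in range(1, len(heights) - 1)
    ]


def implausible_frames(heights, dt, g=9.8, tolerance=0.25):
    """Return indices of frames whose acceleration differs from -g
    by more than tolerance * g. The tolerance is an arbitrary choice.
    """
    flagged = []
    for i, a in enumerate(vertical_accelerations(heights, dt), start=1):
        if abs(a + g) > tolerance * g:
            flagged.append(i)
    return flagged


if __name__ == "__main__":
    dt = 0.1  # hypothetical sampling interval in seconds
    # Ideal heights for an object launched upward at 5 m/s.
    ideal = [5.0 * (k * dt) - 4.9 * (k * dt) ** 2 for k in range(10)]
    # Simulated glitch: the object holds its height for two extra frames.
    paused = ideal[:4] + [ideal[3]] * 2 + ideal[4:]
    print("ideal :", implausible_frames(ideal, dt))
    print("paused:", implausible_frames(paused, dt))
```

In practice the heights would have to come from an object tracker and be converted from pixels to metres, which adds noise, so a real check would need smoothing and a looser tolerance.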
Important context: While both systems excel at static or slow-motion scenarios, dynamic physics remains an acknowledged frontier challenge. The issue isn't computing power; it's the fundamental approach of predicting pixels rather than solving the equations of motion.