If you follow the AI world on X, you’ve probably seen clips in recent weeks of video game-like worlds in which users play as a dog on a sunny beach, a vial of poison in the play Hamlet or a pack of cigarettes on the floor of Penn Station.
These videos, generated with Google’s Genie 3 model, highlight the promise of so-called world models, which simulate real-world environments and aim to approximate the physics of how objects move and humans interact with their surroundings.
There’s only one issue: these kinds of world models are expensive to run. Odyssey, another developer of world models, has to use an entire H200 chip for each user who accesses its Odyssey-2 model through its application programming interface, CEO Oliver Cameron told me.
That can cost Odyssey several dollars per hour, Cameron said. In comparison, running a 70-billion-parameter text model like Llama 3 costs just a few cents per hour.
It’s even more expensive to power Odyssey’s more advanced Odyssey-2 Pro model, which takes several H200 chips to run per user, Cameron said.