Tesla’s been working on a new feature, an end-to-end world model for autonomous driving, and according to Phil Duan, the company’s Autopilot Vision Lead, it could be the missing piece needed to achieve fully self-driving cars.
Duan discussed the World Model during a presentation at the Computer Vision and Pattern Recognition (CVPR 2023) conference. He described it as a “basic model that can be fine-tuned for downstream tasks,” much like how language models are pre-trained and then tweaked for specific applications. The downstream tasks Duan mentioned include detecting the volume, surfaces, and objects around the car, as well as recognizing lane markings and traffic lights. Right now, Tesla handles these tasks separately, with different networks for lanes and lights. The World Model could unify them.
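To make the "one pre-trained model, many downstream tasks" idea concrete, here is a minimal toy sketch in plain Python. It is not Tesla's architecture: the class names, head names, layer sizes, and random weights are all hypothetical, chosen only to show a shared backbone whose features are computed once and then reused by small task-specific heads.

```python
import math
import random

random.seed(0)  # reproducible toy weights


def linear(in_dim, out_dim):
    """Return a random weight matrix: out_dim rows of in_dim weights."""
    return [[random.uniform(-1, 1) for _ in range(in_dim)] for _ in range(out_dim)]


def apply(weights, x):
    """Matrix-vector product of a weight matrix and an input vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


class SharedBackbone:
    """Hypothetical stand-in for the pre-trained model: input -> shared features."""

    def __init__(self, in_dim, feat_dim):
        self.weights = linear(in_dim, feat_dim)

    def __call__(self, x):
        return [math.tanh(v) for v in apply(self.weights, x)]


class TaskHead:
    """Hypothetical small task-specific layer fine-tuned on top of the backbone."""

    def __init__(self, feat_dim, out_dim):
        self.weights = linear(feat_dim, out_dim)

    def __call__(self, features):
        return apply(self.weights, features)


backbone = SharedBackbone(in_dim=8, feat_dim=4)

# Head names and output sizes are illustrative assumptions only.
heads = {
    "volume": TaskHead(4, 5),
    "lanes": TaskHead(4, 2),
    "traffic_lights": TaskHead(4, 3),
}


def run_all_tasks(x):
    """Compute the shared features once, then feed them to every head."""
    features = backbone(x)
    return {name: head(features) for name, head in heads.items()}


outputs = run_all_tasks([0.1] * 8)
```

The design point is that adding a new task means adding one more small head rather than training another full network, which is roughly the contrast the article draws with separate networks for lanes and lights.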
The key to building this World Model? Data, and lots of it. “We spend more time on data than the model itself,” Duan said. With a fleet of 4 million vehicles capturing rare and challenging scenarios, Tesla’s got a ton of data to feed into its “data engine.”
While Duan focused mostly on perception, Ashok Elluswamy from Tesla also discussed prediction and control. Compared to Wayve’s World Model concept, Tesla’s seems more advanced.
When asked about next steps, Duan said the World Model is “crucial for end-to-end autonomous driving,” though he didn’t share many details. To handle the huge amount of data the World Model needs, Tesla’s experimenting with “dynamic resolution.” They’re also using custom neural network accelerators and chips to process everything fast enough for real-time perception in the vehicle.
The World Model gives Tesla a “4D understanding of the world,” according to Duan. The next challenge is figuring out “how to derive different applications in a very lightweight way” while still representing complex driving scenarios.
While Transformers have been hot in AI lately, deploying them in vehicles isn’t easy. But with homegrown accelerators, the Full Self-Driving computer, and a skilled deployment team, Tesla’s equipped to combine CNNs and Transformers as needed to get the job done. Performance and completing the actual driving tasks are higher priorities than theoretical debates over approaches.
The World Model could be the breakthrough that leads to fully autonomous Teslas, but we’ll have to wait for more details to know for sure. For now, the hints suggest Tesla’s progress on autonomy is accelerating.