Spatial Intelligence
- Sam Gebhardt
- Sep 21, 2024
- 1 min read
Updated: Sep 24
The Next Leap: From Language Models to Large World Models
For the past few years, most of our breakthroughs in AI have been built on language. Large Language Models (LLMs) learn from vast amounts of text, turning words into knowledge and patterns. Even diffusion models, famous for generating breathtaking images, are still steered by essentially one-dimensional input: a text prompt. Language, after all, is synthetic and linear, no matter how expansive the output feels.
But the real frontier is moving beyond words. Large World Models (LWMs) aim to learn from the same reality we live in: a three-dimensional world shaped by time. While early work with neural radiance fields (NeRFs) has shown what’s possible—capturing scenes in striking 3D detail—it’s still just scratching the surface. What lies ahead is an evolution in training AI: systems that don’t just read about the world, but experience and model it in its full dimensionality.
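To ground what "capturing scenes in striking 3D detail" means in practice, here is a minimal, illustrative sketch of the core NeRF rendering step: sample points along a camera ray, query a field for density and color at each point, and alpha-composite the samples into a pixel. In a real NeRF that field is a trained neural network; the toy_field function below is a hand-coded stand-in, and every name and parameter here is an assumption for illustration rather than anything from this post or from a specific NeRF codebase.

```python
# Minimal sketch of the NeRF rendering idea: march along a camera ray,
# query a field for density and color, and alpha-composite into a pixel.
# toy_field is a hand-coded stand-in for the learned network.
import numpy as np

def toy_field(points):
    """Stand-in for the learned MLP: returns (density, RGB) for each 3D point.
    Here: a soft sphere of radius 1 at the origin, colored by position."""
    r = np.linalg.norm(points, axis=-1)
    density = np.clip(5.0 * (1.0 - r), 0.0, None)        # opaque inside, empty outside
    color = 0.5 + 0.5 * points / (r[..., None] + 1e-8)   # pseudo-normal coloring
    return density, np.clip(color, 0.0, 1.0)

def render_ray(origin, direction, near=0.0, far=4.0, n_samples=64):
    """Volume-render one ray: C = sum_i T_i * (1 - exp(-sigma_i * delta_i)) * c_i."""
    t = np.linspace(near, far, n_samples)
    delta = np.diff(t, append=far)                        # spacing between samples
    points = origin + t[:, None] * direction              # sample points along the ray
    sigma, rgb = toy_field(points)
    alpha = 1.0 - np.exp(-sigma * delta)                  # per-segment opacity
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))  # transmittance T_i
    weights = trans * alpha
    return (weights[:, None] * rgb).sum(axis=0)           # composited pixel color

# Example: one ray looking at the toy sphere from z = -3 toward +z.
pixel = render_ray(np.array([0.0, 0.0, -3.0]), np.array([0.0, 0.0, 1.0]))
print(pixel)  # an RGB value in [0, 1]
```

A real system would swap toy_field for a small neural network and optimize it so that pixels rendered this way match photographs taken from known camera poses; that optimization is what lets the model capture a scene rather than just read a description of it.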
Fascinating podcast on this topic HERE
Fei-Fei Li and Justin Johnson's links


