EWM: European World Models

Awarded Resources: 90,000 node hours

System Partition: Leonardo BOOSTER

Allocation Period: May 2026 to November 2026

World models enable AI systems to learn internal representations for understanding, prediction, and planning. EWM advances the Joint-Embedding Predictive Architecture (JEPA) framework to create scalable, multimodal world models trained on diverse data including internet-scale video, images, text, and robotic trajectories.

The project delivers three core contributions: (1) enhanced JEPA encoders achieving state-of-the-art performance on vision tasks; (2) Vision-Language Models (VLMs) that integrate JEPA representations with European sovereign large language models (LLMs) for multilingual, multimodal reasoning; and (3) Vision-Language-Action models (VLAs) that enable language-conditioned robot control.

All models, code, and weights will be released under permissive open-source licenses. By developing sovereign AI capabilities on European infrastructure, EWM establishes competitive alternatives to proprietary systems, so that European researchers, industries, and citizens benefit from cutting-edge foundation models with transparent governance.

Principal Investigator, Institution and Country

Sebastian Houben, Bonn-Rhein-Sieg University of Applied Sciences, Germany