Towards European Digital Sovereignty: Building Efficient Multilingual LLMs Aligned with the AI Act

Awarded Resources (in node hours): 300,000

System Partition: Leonardo BOOSTER

Allocation Period: Oct 2025 - 2 months

This project aims to develop a Large Language Model (LLM) with a strong focus on Italian and other European languages, counterbalancing the current dominance of English in AI systems. Very few existing open models comply with the European Union’s Artificial Intelligence Act (AI Act), and those that do rely on dense, energy-intensive architectures.

In contrast, our approach combines open and controlled datasets with efficient model designs and training strategies, significantly reducing energy consumption and costs while ensuring high levels of trustworthiness and transparency. In particular, we aim to train a medium-sized Mixture of Experts (MoE) model. This follows the current state-of-the-art trend: MoE models can achieve performance comparable to much larger dense models while using only a fraction of their computational and energy resources.
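To illustrate why an MoE model uses only a fraction of the compute of a comparable dense model, the following is a minimal sketch of top-k expert routing for a single token. It is not the project's architecture: the function names (`top_k_route`, `moe_layer`, `active_fraction`), the number of experts, and the value of k are all hypothetical choices made for illustration, and the experts are assumed to be of equal size.

```python
import math

def top_k_route(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are a softmax over the selected logits only, so they sum to 1.
    """
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    peak = max(router_logits[i] for i in top)  # subtract for numerical stability
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

def moe_layer(x, experts, router_logits, k):
    """Combine the outputs of the k selected experts for one token vector x.

    Only the selected experts are evaluated; the others cost nothing.
    """
    out = [0.0] * len(x)
    for idx, weight in top_k_route(router_logits, k):
        y = experts[idx](x)
        out = [o + weight * v for o, v in zip(out, y)]
    return out

def active_fraction(num_experts, k):
    """Fraction of expert parameters used per token (equal-sized experts)."""
    return k / num_experts

# Hypothetical setup: four toy "experts" that simply scale their input.
experts = [lambda x, s=s: [s * v for v in x] for s in (1.0, 2.0, 3.0, 4.0)]
logits = [0.1, 2.0, -1.0, 2.0]  # router scores for one token (made up)
output = moe_layer([1.0, 2.0], experts, logits, k=2)
```

In this toy setup only two of the four experts run for the token, so `active_fraction(4, 2)` is 0.5. Production MoE models typically route each token to a small subset of a much larger expert pool, which is where the savings relative to a dense model of similar total size come from.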

These models are increasingly recognised as strategic assets for the future of AI, as they combine efficiency, adaptability, and sustainability. By working directly with these frontier technologies, we seek to bring advanced know-how to Europe and Italy, strengthening the region’s capacity to design, train, and deploy cutting-edge generative models in compliance with European values and regulations.

The project builds on our ongoing work on the CINECA–LEONARDO supercomputing infrastructure, where we have already successfully conducted large-scale experiments. However, as we are approaching the end of our current computational allocation, we plan to extend this work on the MareNostrum supercomputer, which provides access to the latest NVIDIA H100 GPUs and improved energy efficiency. This is a crucial step towards scaling and refining our model.

The resulting model will be fully aligned with the AI Act, advancing sustainable and reliable AI, strengthening linguistic inclusivity, and reinforcing Europe’s digital sovereignty. Moreover, it will be released as open weights, contributing to the European open-source ecosystem and enabling researchers, public institutions, and businesses to build, audit, and adapt trustworthy AI solutions locally.

By fostering transparency, interoperability, and community-driven innovation, this initiative seeks to empower Europe to lead in the responsible development of multilingual, sustainable, and regulation-compliant language technologies.

Principal Investigator, Company and Country

Cipolla Salvatore, INGEGNERIA INFORMATICA - S.P.A., Italy