PixelGen: High-Resolution Pixel-Space Generation with Diffusion and Autoregression

Awarded Resources: 40,000 node hours

System Partition: Leonardo BOOSTER

Allocation Period: October 2026 - April 2027

AI Technology: Vision (image recognition, image generation, text recognition OCR, etc.); Generative Language Modeling

The team proposes to advance generative models that synthesize images directly in pixel space, targeting high-fidelity detail and scalable training.

Latent generative models are efficient, but pixel-space modeling avoids decoding bottlenecks and preserves fine structure. Its main limitations are long sequences and training instability.
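The long-sequence limitation can be made concrete with some simple arithmetic. The sketch below counts the tokens a transformer would process for a square image split into non-overlapping patches, with and without a downsampling autoencoder. The function name `num_tokens`, the patch sizes, and the 8x downsampling factor are illustrative assumptions, not values taken from the proposal.

```python
def num_tokens(resolution: int, patch_size: int, downsample: int = 1) -> int:
    """Count tokens for a square image split into non-overlapping patches.

    `downsample` models an optional autoencoder that shrinks each spatial
    side by that factor before patchification (1 means raw pixel space).
    All parameter values used below are hypothetical.
    """
    if resolution % downsample != 0:
        raise ValueError("resolution must be divisible by downsample")
    side = resolution // downsample
    if side % patch_size != 0:
        raise ValueError("downsampled side must be divisible by patch_size")
    return (side // patch_size) ** 2


if __name__ == "__main__":
    for res in (256, 512, 1024):
        per_pixel = num_tokens(res, patch_size=1)                # one token per pixel
        pixel_patch = num_tokens(res, patch_size=4)              # assumed pixel-space patch size
        latent = num_tokens(res, patch_size=2, downsample=8)     # assumed latent setup
        print(f"{res}x{res}: per-pixel={per_pixel}, "
              f"pixel patch 4={pixel_patch}, latent={latent}")
```

Under these assumed settings, a 256x256 image yields 65,536 tokens at one token per pixel, 4,096 tokens with 4x4 pixel patches, and 256 tokens in the latent setup. Because the cost of standard self-attention grows quadratically with sequence length, the gap widens quickly as resolution increases.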

Using EuroHPC resources, the team will train and benchmark pixel-level diffusion baselines (e.g., PixelDiT, JiT-style training) and develop improved training and evaluation protocols, together with an autoregressive (AR) paradigm, through large-scale ablations across resolutions and model sizes. The goal is to enable reproducible, open research on scalable pixel generation.

Principal Investigator, Institution and Country

Matthew Blaschko, KU Leuven, Belgium