Recent 3D generative models such as TRELLIS and Hunyuan3D achieve high-quality asset generation, yet fine-grained stylistic control remains challenging. The state-of-the-art method StyleSculptor introduced texture-geometry disentanglement via Style-Disentangled Attention (SD-Attn), enabling partial control, but two critical limitations persist:
- incomplete disentanglement, which fails for geometry-only style transfer when the stylistic cues are subtle or localized, and
- no explicit color control, since existing methods treat style holistically and couple color, texture, and geometry.
This project proposes Tri-StyleGen, a novel framework combining
- Image-Prompt Additivity for training-free decomposition of 3D embeddings (sketched below) and
- learnable attribute tokens fine-tuned to explicitly separate geometry, color, and texture in TRELLIS’s latent space (see the second sketch below).
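The proposal does not specify the additivity rule, so the following is only a minimal sketch of how Image-Prompt Additivity could operate, assuming attribute directions are isolated by simple arithmetic on frozen image-prompt embeddings. The `encoder`, the neutral-reference pairing, and the mixing weights are illustrative assumptions, not the project's final design.

```python
# Minimal sketch (assumption): attribute directions are estimated by embedding
# arithmetic on a frozen image encoder and added to a content embedding,
# with no training involved.
import torch


def embed(encoder, image: torch.Tensor) -> torch.Tensor:
    """Map an image to a conditioning embedding with a frozen encoder."""
    with torch.no_grad():
        return encoder(image)


def additive_style_embedding(encoder, content_img, style_img, neutral_img,
                             weights=(1.0, 0.7)):
    """Compose a conditioning embedding as: content + w * (style - neutral).

    `neutral_img` is a hypothetical reference that shares the style image's
    content but not its style; the difference isolates an attribute direction
    that can be added to the content embedding in a training-free way.
    """
    e_content = embed(encoder, content_img)
    e_style = embed(encoder, style_img)
    e_neutral = embed(encoder, neutral_img)
    w_content, w_style = weights
    return w_content * e_content + w_style * (e_style - e_neutral)
```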
This approach overcomes StyleSculptor’s coupling limitations and enables independent control of visual attributes. Tri-StyleGen will deliver open-source models and benchmarks, advancing controllable 3D generation for applications in gaming, VR/AR, and digital content creation.
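As a companion illustration, the second sketch shows one plausible form of the learnable attribute tokens, assuming a frozen conditional generator (e.g., TRELLIS's flow transformer) that cross-attends to a sequence of conditioning tokens. Token counts, dimensions, and the concatenation scheme are assumptions for illustration, not the proposed architecture.

```python
# Minimal sketch (assumption): one learnable token bank per attribute is
# appended to the image-prompt conditioning sequence; only these banks are
# fine-tuned while the backbone and image encoder stay frozen.
import torch
import torch.nn as nn


class AttributeTokens(nn.Module):
    def __init__(self, dim: int = 1024, tokens_per_attr: int = 8):
        super().__init__()
        self.geometry = nn.Parameter(torch.randn(tokens_per_attr, dim) * 0.02)
        self.color = nn.Parameter(torch.randn(tokens_per_attr, dim) * 0.02)
        self.texture = nn.Parameter(torch.randn(tokens_per_attr, dim) * 0.02)

    def forward(self, cond_tokens: torch.Tensor,
                use=("geometry", "color", "texture")) -> torch.Tensor:
        """Append the selected attribute tokens to the conditioning sequence.

        cond_tokens: (batch, n_tokens, dim) image-prompt embedding.
        Omitting a name from `use` removes that attribute's influence, which
        is the intended mechanism for independent geometry/color/texture
        control at inference time.
        """
        banks = {"geometry": self.geometry, "color": self.color,
                 "texture": self.texture}
        extra = torch.cat([banks[k] for k in use], dim=0)
        extra = extra.unsqueeze(0).expand(cond_tokens.size(0), -1, -1)
        return torch.cat([cond_tokens, extra], dim=1)
```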
Joost Van de Weijer, Universitat Autonoma de Barcelona / Computer Vision Center, Spain