AI Technology: Deep Learning; Vision (image recognition, image generation, text recognition OCR, etc.)
The proliferation of photorealistic AI-generated images poses urgent challenges for content authenticity, misinformation prevention, and EU AI Act compliance.
Existing watermarking methods for diffusion models suffer from fundamental limitations: sampling-based approaches (e.g., Tree-Ring, Shallow Diffuse) require costly DDIM inversion for detection (~10 s per image) and support only zero-bit detection, while fine-tuning-based methods (e.g., Stable Signature, AquaLoRA) permanently modify model weights and are limited to 48-bit payloads.
The team proposes DiffMark, a differentiable multi-bit watermarking framework that resolves all three limitations by leveraging Latent Consistency Models. DiffMark injects a learned perturbation at every denoising timestep and extracts the embedded message via a lightweight decoder in a single forward pass, achieving detection latency under 1 second.
The method supports a 256-bit message capacity, sufficient for user identification, timestamping, and metadata encoding, and operates on frozen, unmodified diffusion models as a universal plug-in.
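To make the 256-bit budget concrete, the following is a minimal sketch of how such a payload could be laid out. The field split (64-bit user ID, 64-bit timestamp, 128-bit metadata tag) and the names `pack_payload` and `unpack_payload` are illustrative assumptions, not part of DiffMark as described here.

```python
import hashlib
import struct

PAYLOAD_BITS = 256  # capacity stated in the abstract


def pack_payload(user_id: int, timestamp: int, metadata: bytes) -> list[int]:
    """Pack three fields into 256 bits (hypothetical layout, MSB first)."""
    # Assumption: metadata is represented by a truncated SHA-256 tag (128 bits).
    tag = hashlib.sha256(metadata).digest()[:16]
    # 8 bytes user ID + 8 bytes timestamp + 16 bytes tag = 32 bytes = 256 bits.
    raw = struct.pack(">QQ", user_id, timestamp) + tag
    return [(byte >> (7 - i)) & 1 for byte in raw for i in range(8)]


def unpack_payload(bits: list[int]) -> tuple[int, int, bytes]:
    """Invert pack_payload: recover user ID, timestamp, and metadata tag."""
    if len(bits) != PAYLOAD_BITS:
        raise ValueError("expected exactly 256 bits")
    raw = bytes(
        sum(bit << (7 - i) for i, bit in enumerate(bits[k:k + 8]))
        for k in range(0, PAYLOAD_BITS, 8)
    )
    user_id, timestamp = struct.unpack(">QQ", raw[:16])
    return user_id, timestamp, raw[16:]
```

Under this assumed layout, the bit list returned by `pack_payload` would be the message handed to the watermark embedder, and `unpack_payload` would be applied to the decoder's output.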
Expected outcomes include robust watermarking with bit accuracy above 99% for 64-bit messages and above 95% for 256-bit messages, a comprehensive benchmark suite, an open-source code release, and a submission to a top-tier venue (CVPR or AAAI).
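The bit-accuracy targets can be read as the fraction of embedded bits that the decoder recovers correctly. A minimal sketch of that metric follows; the function name `bit_accuracy` is an assumption for illustration, and the reported figures would presumably be averaged over many images.

```python
def bit_accuracy(embedded: list[int], decoded: list[int]) -> float:
    """Fraction of positions where the decoded bit equals the embedded bit."""
    if len(embedded) != len(decoded):
        raise ValueError("messages must have the same length")
    if not embedded:
        raise ValueError("messages must be non-empty")
    matches = sum(int(a == b) for a, b in zip(embedded, decoded))
    return matches / len(embedded)


# Example: a 64-bit message with a single flipped bit.
message = [1, 0] * 32
received = message.copy()
received[5] ^= 1  # flip one bit
print(bit_accuracy(message, received))  # 63/64 = 0.984375
```

At 64 bits a single error on one image already drops that image's accuracy to about 98.4%, so an average above 99% implies that most images decode without any bit errors.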
Principal Investigator, Institution and Country
Nhien-An Le-Khac, University College Dublin, Ireland