Shanghai, China – In a collaborative effort, Bilibili, a leading video-sharing platform in China, and Shanghai Jiao Tong University have announced the launch of MT-Color, a novel controllable image colorization framework powered by diffusion models. This innovative tool allows users to precisely colorize images at the instance level, leveraging both instance-aware text prompts and masks.
The development of MT-Color addresses a significant challenge in image colorization: accurately applying colors to specific objects within an image while avoiding unwanted color bleeding. The framework achieves this through several key features:
- Precise Instance-Level Colorization: MT-Color enables users to target specific objects within an image for colorization, ensuring that each object’s color aligns with its textual description. This level of control is a significant advancement over previous methods that often struggle to differentiate between closely positioned objects.
- Pixel-Level Mask Attention Mechanism: To prevent color spillover, MT-Color employs a sophisticated pixel-level mask attention mechanism. This mechanism effectively restricts color application to the intended object, maintaining clean color boundaries and preventing unwanted color contamination.
- Instance Mask and Text Guidance Module: The framework incorporates an instance mask and text guidance module to address color binding errors. This module ensures that the correct color is associated with the corresponding object, even in complex scenes with multiple objects.
- Multi-Instance Sampling Strategy: To enhance instance awareness, MT-Color utilizes a multi-instance sampling strategy. This strategy improves the framework’s ability to recognize and differentiate between various objects in the image, leading to more accurate and visually appealing colorization results.
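The idea behind mask-restricted attention can be illustrated with a small sketch. This is not MT-Color's actual implementation, which the announcement does not detail; the function name `masked_cross_attention`, its parameters, and the toy data are all hypothetical. The sketch assumes each pixel carries an instance ID taken from the instance masks, each text token carries the ID of the instance it describes, and a pixel is only allowed to attend to tokens with a matching ID.

```python
import math


def masked_cross_attention(queries, keys, values, pixel_instance, token_instance):
    """Illustrative only: each pixel attends solely to text tokens of its own instance.

    queries:        one feature vector per pixel
    keys, values:   one vector per text token
    pixel_instance: instance ID of each pixel (assumed to come from instance masks)
    token_instance: instance ID each text token describes
    """
    dim = len(queries[0])
    out_dim = len(values[0])
    outputs = []
    for q, pix_id in zip(queries, pixel_instance):
        # Scaled dot-product scores; tokens of other instances are masked out.
        scores = []
        for k, tok_id in zip(keys, token_instance):
            if tok_id == pix_id:
                scores.append(sum(a * b for a, b in zip(q, k)) / math.sqrt(dim))
            else:
                scores.append(None)  # masked: this token cannot influence the pixel

        allowed = [s for s in scores if s is not None]
        if not allowed:
            # No token describes this pixel's instance: contribute nothing.
            outputs.append([0.0] * out_dim)
            continue

        # Numerically stable softmax over the allowed tokens only.
        peak = max(allowed)
        weights = [0.0 if s is None else math.exp(s - peak) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]

        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(out_dim)]
        )
    return outputs


# Toy example: two pixels, two tokens ("red" for instance 0, "blue" for instance 1).
queries = [[1.0, 0.0], [0.0, 1.0]]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]  # stand-ins for color features
result = masked_cross_attention(queries, keys, values, [0, 1], [0, 1])
print(result)
```

In this toy setup the pixel belonging to instance 0 receives only the "red" value and the pixel belonging to instance 1 receives only the "blue" value, which is the behavior the article describes as preventing color spillover. A real diffusion model would apply the same masking to batched tensors inside its attention layers rather than looping over Python lists.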
Furthermore, the collaboration has resulted in the creation of the GPT-Color dataset, a high-quality resource providing instance-level annotations. This dataset is specifically designed to support more refined image colorization tasks and serves as a valuable resource for future research and development in the field.
According to the developers, MT-Color outperforms existing methods in both color accuracy and visual quality. The generated images are said to be more consistent with human visual perception, offering a more natural and realistic colorization experience. The framework produces high-resolution (512×512) colorized images with rich, natural colors and clear details.
The potential applications of MT-Color are vast, ranging from restoring old black and white photographs to enhancing the visual appeal of digital content. Its flexible user control, facilitated by text descriptions and masks, allows for fine-tuned adjustments to the colorization process, catering to a wide range of user needs.
The release of MT-Color marks a step forward in controllable image colorization and shows what collaboration between academia and industry can produce. As AI continues to evolve, tools like MT-Color are likely to play a growing role in visual content creation.