SenseNova V6: SenseTime Unveils Powerful Multimodal AI Suite

Author: 智能小编 (AI Editor)

April 12, 2025 #aitools, #每日AI快讯 (Daily AI News)

Beijing, China – SenseTime, a leading artificial intelligence company, has announced the release of its SenseNova V6 series, the sixth generation of its multimodal foundation models. The new series is built on a 600-billion-parameter Mixture-of-Experts (MoE) architecture and natively fuses text, images, and video.

According to SenseTime, the SenseNova V6 series performs strongly on both pure-text and multimodal tasks, with benchmark results that exceed those of leading models such as GPT-4.5 and Gemini 2.0 Pro.

The series comprises four distinct versions, each tailored for specific applications:

SenseNova V6 Pro: This flagship model, powered by 620 billion parameters, is designed for native fusion of text, images, and video, aiming to compete with leading international models.
SenseNova V6 Reasoner Pro: Equipped with advanced reasoning capabilities, this version is designed to assist in solving complex problems.
SenseNova V6 Video: Specializing in video understanding, this model is ideal for applications in education, tourism, and other scenarios requiring in-depth video analysis.
SenseNova V6 Omni: This lightweight, all-modal interactive model provides a real-time interactive experience.

SenseTime highlights the key features of SenseNova V6 as strong reasoning, strong interaction, and long-term memory. The models are capable of performing reasoning and analysis on medium-to-long videos, providing accurate answers in real-time audio-visual interactions, and offering emotionally nuanced expressions.

Applications and Implications

SenseTime positions the SenseNova V6 series for use across several industries. In education, it envisions the models providing personalized tutoring and support. The models are also designed to power embodied intelligence, serving as the brain, eyes, ears, and mouth for robots.

Conclusion

The release of SenseTime’s SenseNova V6 series marks a significant advancement in multimodal AI. By natively fusing text, images, and video, these models offer enhanced capabilities for understanding and interacting with the world. The range of intended applications, from education to robotics, positions SenseNova V6 as a potential enabler of future AI products. Competition among AI model developers is intensifying, and SenseNova V6 is SenseTime’s bid to keep pace with leading international models.
