The Beijing Academy of Artificial Intelligence (BAAI) recently released BGE-M3, a general-purpose embedding model. It supports more than 100 languages and offers leading multilingual and cross-lingual retrieval capabilities. BGE-M3 handles input text at different granularities, including sentences, paragraphs, passages, and full documents, with a maximum input length of 8192 tokens. It also combines three retrieval modes in a single model: dense retrieval, sparse retrieval, and multi-vector retrieval. BGE-M3 achieves state-of-the-art results on multiple evaluation benchmarks.
The release marks an important milestone for BAAI in natural language processing. The model's cross-lingual retrieval capability gives researchers and developers worldwide a powerful tool, and its multilingual support lets users from different language backgrounds benefit from the same retrieval features.
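For readers unfamiliar with the three retrieval modes named above, the following is a minimal sketch of how each one typically scores a query against a document. It is a toy illustration, not BGE-M3's implementation: the function names and all numbers are hypothetical, and the multi-vector score assumes the common late-interaction formulation (best match per query token, averaged).

```python
import math


def dense_score(q_vec, d_vec):
    """Dense retrieval: cosine similarity between one vector per text."""
    dot = sum(a * b for a, b in zip(q_vec, d_vec))
    norm = math.sqrt(sum(a * a for a in q_vec)) * math.sqrt(sum(b * b for b in d_vec))
    return dot / norm if norm else 0.0


def sparse_score(q_weights, d_weights):
    """Sparse retrieval: sum of weight products over terms both texts share."""
    return sum(w * d_weights[term] for term, w in q_weights.items() if term in d_weights)


def multi_vector_score(q_vecs, d_vecs):
    """Multi-vector retrieval (late interaction, assumed): each query token
    vector is matched to its most similar document token vector, and the
    best-match similarities are averaged."""
    best = [max(dense_score(q, d) for d in d_vecs) for q in q_vecs]
    return sum(best) / len(best)


# Hypothetical toy inputs, chosen only to make the arithmetic easy to follow.
dense = dense_score([3.0, 4.0], [3.0, 4.0])
sparse = sparse_score({"vector": 0.5, "model": 0.25}, {"model": 0.5, "retrieval": 0.75})
multi = multi_vector_score([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [-1.0, 0.0]])
print(dense, sparse, multi)
```

In this sketch the dense score compares one vector per text, the sparse score rewards only shared terms, and the multi-vector score compares texts token by token. A model that produces all three representations allows the scores to be used separately or combined.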
Source: https://mp.weixin.qq.com/s/y-c-EelxbSUMmrZNCeqeAA