News Title: "Megvii releases multimodal large model Vary, which converts documents to Markdown from a single instruction"

Keywords: Megvii, multimodal large model, document conversion

News Content: Vary, a multimodal large model proposed by Megvii, has recently attracted wide attention in the industry. It supports both Chinese and English and performs document-level OCR (Optical Character Recognition).

Traditionally, converting a document image to Markdown takes several steps: text recognition, layout detection and reading-order sorting, formula and table processing, and text cleaning. Vary handles the task in one pass: the user enters a one-sentence instruction and gets the document's result directly, end to end.

The technology comes from Megvii's research team. Vary is reported to rely on deep learning and training on large-scale datasets, and to deliver strong capability and accuracy. Its arrival is said to improve work efficiency considerably and to make document work more convenient.

Going forward, Megvii plans to keep investing in R&D to improve Vary's technical level and service quality and to offer users better products and services. It also hopes to work with more partners to advance artificial intelligence technology and contribute to building a smart society.

Source: https://www.qbitai.com/2023/12/109275.html
