Huawei's team has recently introduced a new large language model architecture, Pangu-π. Building on the traditional Transformer, it enhances non-linearity, effectively mitigating the feature-collapse problem and significantly improving the expressive power of the model's output representations.
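The article does not spell out how the extra non-linearity is introduced. As a rough illustration only, the PyTorch sketch below adds a small nonlinear branch alongside a standard Transformer feed-forward block; the class name NonlinearEnhancedFFN, its dimensions, and the extra branch are hypothetical choices for illustration and are not taken from the Pangu-π paper.

```python
import torch
import torch.nn as nn

class NonlinearEnhancedFFN(nn.Module):
    """Illustrative feed-forward block with an extra nonlinear branch.

    This is not the published Pangu-π design; it only sketches the general
    idea of injecting additional non-linearity into a Transformer block so
    that token representations are less prone to collapsing onto each other.
    """

    def __init__(self, d_model: int = 512, d_hidden: int = 2048):
        super().__init__()
        self.up = nn.Linear(d_model, d_hidden)
        self.down = nn.Linear(d_hidden, d_model)
        self.act = nn.GELU()
        # Hypothetical extra branch: a second, lightweight nonlinear path
        # added next to the usual residual connection, so the block as a
        # whole applies more non-linearity to each token.
        self.extra_branch = nn.Sequential(nn.Linear(d_model, d_model), nn.GELU())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Standard residual FFN plus the additional nonlinear branch.
        return x + self.down(self.act(self.up(x))) + self.extra_branch(x)


if __name__ == "__main__":
    block = NonlinearEnhancedFFN()
    tokens = torch.randn(2, 16, 512)   # (batch, sequence length, d_model)
    print(block(tokens).shape)         # torch.Size([2, 16, 512])
```

The intuition behind such a sketch is that when successive layers behave too much like linear maps, token features tend to become increasingly similar (feature collapse); adding nonlinear paths is one generic way to counteract that.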

When trained on the same data, Pangu-π (7B) surpasses the similarly sized LLaMA 2 across multiple tasks while also delivering roughly 10% faster inference. At the 1B scale, Pangu-π's performance reaches the current state of the art (SOTA).

In addition, Huawei has used the Pangu-π architecture to train "YunShan", a large model specialized for the financial and legal domains. These results come from researchers at Huawei's Noah's Ark Lab and other institutions.

Title: Huawei Unveils Pangu-π Architecture, Outperforming LLaMA
Keywords: Huawei, Pangu-π, Artificial Intelligence

Source: https://mp.weixin.qq.com/s/Beg3yNa_dKZKX3Fx1AZqOw
