news studio

# USC Unveils the Parameter Size of GPT-3.5: Approximately 7 Billion

Keywords: Parameter Size, USC Research, GPT-3.5 Model.

## News Content

A recent study from the University of Southern California (USC) suggests that the parameter count of OpenAI's GPT-3.5-turbo model may be only around 7 billion. The three USC authors recovered the model's unpublished embedding dimension (embedding size), which the study indicates is most likely 4096 or 4608.

This finding is a useful reference point when compared against current open-source large models such as Llama and Mistral. Prior work shows that open models with an embedding dimension of 4096 have roughly 7 billion parameters; an embedding dimension that is too large or too small makes the network too wide or too narrow, which hurts model performance.
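The link between embedding dimension and total parameter count can be illustrated with a back-of-envelope calculation. The sketch below assumes a Llama-style decoder-only architecture; only the embedding dimension (4096) comes from the USC study, while the layer count, FFN width, and vocabulary size are assumptions borrowed from Llama-7B for illustration.

```python
# Rough parameter count for a Llama-style decoder-only transformer.
# Only d_model = 4096 is taken from the USC study; n_layers, d_ffn,
# and vocab are assumed values matching Llama-7B.

def transformer_params(d_model=4096, n_layers=32, d_ffn=11008, vocab=32000):
    embed = vocab * d_model              # token embedding table
    attn = 4 * d_model * d_model         # Q, K, V, and output projections
    ffn = 3 * d_model * d_ffn            # SwiGLU: gate, up, down projections
    per_layer = attn + ffn
    lm_head = vocab * d_model            # untied output projection
    return embed + n_layers * per_layer + lm_head

total = transformer_params()
print(f"{total / 1e9:.1f}B parameters")  # ≈ 6.7B with these assumptions
```

With these assumed hyperparameters the count lands at about 6.7 billion, which is why an embedding dimension of 4096 points toward a roughly 7-billion-parameter model.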

Therefore, the USC team speculates that GPT-3.5-turbo may also have around 7 billion parameters. They note that this estimate should hold unless GPT-3.5-turbo uses an MoE (Mixture of Experts) architecture, in which case the total parameter count could differ.

This research was reported by QbitAI (量子位), prompting the industry to re-evaluate the parameter size of GPT-3.5-turbo. It was previously widely speculated that GPT-3.5-turbo had more than 100 billion parameters; the USC team's work offers a new perspective.

This finding offers useful guidance for large-model research and development: it helps us better understand the behavior of GPT-3.5-turbo and suggests new directions for future research and applications of large models.

【来源】https://mp.weixin.qq.com/s/y0RQ0aOrHGLzLJKxbyGxMw
