Apple Researchers Introduce Autoregressive Image Model AIM, Showing LLM-Like Scaling in Visual Representation Learning

Researchers at Apple recently introduced a new vision model, the Autoregressive Image Model (AIM), in a paper titled “Scalable Pre-training of Large Autoregressive Image Models.” The work could open a new chapter for large-scale image learning.

In the paper, the Apple researchers ask whether training ViT (Vision Transformer) models with an autoregressive objective yields the same scaling behavior in representation learning that large language models (LLMs) show. Their results indicate that model capacity scales readily to billions of parameters and that the models make effective use of large amounts of uncurated image data.
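
To make the idea of an autoregressive objective on images concrete, here is a minimal Python sketch. It is not the paper's implementation. It assumes a grayscale image, patches read in raster order, and a mean squared error loss. The names `patchify`, `mean_of_prefix`, and `autoregressive_loss` are hypothetical, and the averaging predictor is only a stand-in for a causal ViT.

```python
from typing import Callable, List

Patch = List[float]


def patchify(image: List[List[float]], patch: int) -> List[Patch]:
    """Split a 2-D grayscale image into flattened patches in raster order."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be divisible by patch size")
    patches = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            patches.append([image[r][c]
                            for r in range(top, top + patch)
                            for c in range(left, left + patch)])
    return patches


def mean_of_prefix(prefix: List[Patch]) -> Patch:
    """Hypothetical stand-in predictor: average of all patches seen so far."""
    n = len(prefix)
    return [sum(p[i] for p in prefix) / n for i in range(len(prefix[0]))]


def autoregressive_loss(patches: List[Patch],
                        predict: Callable[[List[Patch]], Patch]) -> float:
    """Mean squared error of predicting patch k from patches 0..k-1 only."""
    total, count = 0.0, 0
    for k in range(1, len(patches)):
        guess = predict(patches[:k])  # causal: patch k and later are hidden
        target = patches[k]
        total += sum((g - t) ** 2 for g, t in zip(guess, target))
        count += len(target)
    return total / count
```

For example, `autoregressive_loss(patchify(image, 2), mean_of_prefix)` scores how well each patch of a 4x4 `image` is predicted from the patches before it. In a real model the predictor would be a transformer with causal attention, trained to minimize this kind of loss.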

The finding matters for image processing and machine learning. First, it suggests that vision models can keep benefiting from more image data and larger capacity, which may improve accuracy on downstream tasks. Second, it opens new possibilities for using large-scale image data without manual curation, which could support progress in tasks such as image recognition and classification.

AIM may also influence the broader field of artificial intelligence. If autoregressive vision models do scale as well as LLMs in representation learning, they could become an important direction for large-scale pre-trained models, further advancing AI and improving its performance across application areas.

In short, AIM is significant both for image processing and for artificial intelligence as a whole. We look forward to seeing how the model is used in future research and how it drives related technologies forward.

Source: https://www.jiqizhixin.com/articles/2024-01-18-7