AI21 Labs Open-Sources Jamba, a Novel Large Language Model with a Hybrid Architecture
Tel Aviv, Israel – AI21 Labs, a leading artificial intelligence research company, has announced the open-source release of Jamba, a groundbreaking large language model (LLM) that combines the best of both worlds: the Transformer architecture and a novel Structured State Space Model (SSM) called Mamba. This hybrid approach aims to deliver high-quality outputs, exceptional throughput, and low memory consumption, setting a new standard for LLMs.
Unlike most existing LLMs, such as GPT, Gemini, and Llama, which rely solely on the Transformer architecture, Jamba leverages the strengths of both Mamba and Transformer. This combination gives Jamba a 256K-token context window, significantly boosting its ability to process long texts efficiently.
“Jamba represents a significant leap forward in the evolution of LLMs,” said Yoav Shoham, CEO of AI21 Labs. “By integrating Mamba’s SSM with the proven Transformer architecture, we’ve created a model that delivers exceptional performance across a range of tasks, while also being highly efficient and scalable.”
Key Features of Jamba:
- SSM-Transformer Hybrid Architecture: Jamba is the first production-ready model to utilize Mamba SSM in conjunction with the Transformer architecture. This innovative approach enhances the model’s performance and efficiency.
- Large Context Window: Jamba boasts a 256K context window, enabling it to handle longer text sequences and tackle more complex natural language processing tasks.
- High Throughput: Compared to similarly sized models like Mixtral 8x7B, Jamba achieves a 3x throughput improvement when processing long contexts, making it ideal for large-scale data handling.
- Single-GPU Large Capacity Processing: Jamba can process contexts of up to 140K tokens on a single GPU, significantly enhancing its accessibility and deployment flexibility.
- Open-Source Weight License: Jamba’s weights are released under the Apache 2.0 license, granting researchers and developers the freedom to use, modify, and optimize the model, fostering collaboration and innovation.
- NVIDIA API Integration: Jamba will be available as an NVIDIA NIM inference microservice within the NVIDIA API catalog, allowing enterprise developers to seamlessly deploy the model using the NVIDIA AI Enterprise software platform.
- Optimized MoE Layers: Jamba employs Mixture-of-Experts (MoE) layers within its hybrid structure, activating only a subset of parameters during inference, enhancing efficiency and performance.
Technical Architecture:
Jamba’s architecture utilizes a block-and-layer approach, allowing for the successful integration of both Mamba SSM and the Transformer. Each Jamba block comprises either an attention layer or a Mamba layer, followed by a multi-layer perceptron (MLP), resulting in an overall ratio of one Transformer (attention) layer out of every eight layers.
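The one-in-eight ratio described above can be illustrated with a minimal Python sketch. This is not AI21's implementation: the function name `build_layer_schedule`, the total of 32 layers, and the position of the attention layer inside each group of eight are all assumptions chosen purely for illustration.

```python
from collections import Counter
from typing import List


def build_layer_schedule(
    num_layers: int,
    attention_period: int = 8,   # one attention layer per this many layers (from the article)
    attention_offset: int = 0,   # ASSUMPTION: where the attention layer sits in each group
) -> List[str]:
    """Return the mixer type ("attention" or "mamba") for every layer index."""
    if attention_period <= 0:
        raise ValueError("attention_period must be positive")
    if not 0 <= attention_offset < attention_period:
        raise ValueError("attention_offset must lie inside one period")
    return [
        "attention" if i % attention_period == attention_offset else "mamba"
        for i in range(num_layers)
    ]


# ASSUMPTION: 32 layers is a hypothetical depth, not a figure from the article.
schedule = build_layer_schedule(num_layers=32)
counts = Counter(schedule)
print(schedule[:8])
print(dict(counts))
```

In this sketch every layer would still be followed by an MLP, as the article states; only the choice between attention and Mamba changes from layer to layer.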
Furthermore, Jamba utilizes MoE to increase the total number of model parameters while limiting the number of active parameters used during inference. This strategy allows for greater model capacity without a corresponding increase in computational demands. To maximize model quality and throughput on a single 80GB GPU, AI21 Labs has optimized the number of MoE layers and experts used, leaving sufficient available memory for common inference workloads.
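To make the gap between total and active parameters concrete, here is a rough arithmetic sketch in Python. It assumes a simple top-k routing scheme in which each token uses only `top_k` of the experts in every MoE layer. The function name `moe_param_counts` and all of the numbers are hypothetical toy values, not Jamba's actual configuration.

```python
from typing import Tuple


def moe_param_counts(
    shared_params: int,    # parameters every token uses (attention, Mamba, embeddings, ...)
    expert_params: int,    # parameters in a single expert MLP
    num_experts: int,      # experts available in each MoE layer
    top_k: int,            # experts actually used per token in each MoE layer
    num_moe_layers: int,   # number of layers that use MoE
) -> Tuple[int, int]:
    """Return (total_params, active_params_per_token) for a top-k MoE model."""
    if not 1 <= top_k <= num_experts:
        raise ValueError("top_k must be between 1 and num_experts")
    total = shared_params + num_moe_layers * num_experts * expert_params
    active = shared_params + num_moe_layers * top_k * expert_params
    return total, active


# ASSUMPTION: toy values chosen only to show the effect.
total, active = moe_param_counts(
    shared_params=1_000_000,
    expert_params=500_000,
    num_experts=16,
    top_k=2,
    num_moe_layers=4,
)
print(f"total={total:,} active={active:,} ratio={active / total:.2f}")
```

Adding experts raises the total count (capacity) while the active count, which drives per-token compute, depends only on `top_k`. All expert weights still have to fit in memory, which is why the number of MoE layers and experts has to be balanced against the capacity of a single GPU.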
Performance Comparison:
According to AI21 Labs, the Jamba model demonstrates excellent results across various benchmarks, including HellaSwag, ARC Challenge, and MMLU. It matches or even surpasses state-of-the-art models in its size class, such as Llama 2 13B, Llama 2 70B, Gemma 7B, and Mixtral 8x7B, across a wide range of tasks, including language understanding, scientific reasoning, and common sense reasoning.
Open-Sourcing Jamba:
AI21 Labs has made the decision to open-source Jamba to foster research and innovation within the AI community. The company believes that by sharing this powerful model, it can accelerate the development of new and improved LLMs, ultimately benefiting the entire field.
“We believe that open-sourcing Jamba will empower researchers and developers to push the boundaries of what’s possible with LLMs,” said Ori Ram, Chief Scientist at AI21 Labs. “We’re excited to see what the community will create with this powerful new tool.”
Jamba’s open-source release is a significant step forward in the democratization of AI research. It provides researchers and developers with access to a cutting-edge LLM, enabling them to explore new possibilities and contribute to the advancement of the field. With its unique hybrid architecture, impressive performance, and open-source license, Jamba is poised to become a valuable resource for the AI community, driving innovation and accelerating the development of next-generation LLMs.
Source: https://ai-bot.cn/ai21-jamba/
