World's First Open-Source
Hybrid Attention Reasoning Model

MiniMax-M1 delivers breakthrough performance with 456B parameters, a 1M-token context length, and only 25% of the inference FLOPs of DeepSeek R1 at 100K-token generation. Fully open-source under the Apache 2.0 license.

456B
Total Parameters
1M
Token Context
25%
FLOPs vs DeepSeek R1

Revolutionary Architecture

MiniMax-M1 introduces groundbreaking hybrid attention mechanisms and MoE architecture, setting new standards for open-source AI models.

Hybrid Attention

Combines Lightning Attention with Softmax Attention for optimal balance of efficiency and precision in reasoning tasks.

Mixture of Experts

456B total parameters with only 45.9B activated per token, delivering exceptional performance with reduced computational overhead.

Open Source Excellence

Released under Apache 2.0 license, enabling unrestricted commercial use and community-driven innovation.

Technical Features

Advanced capabilities designed for real-world applications

Hybrid Attention Mechanism

Combines Lightning Attention with Softmax Attention for optimal efficiency and precision

🧠

Mixture of Experts (MoE)

456B total parameters with only 45.9B activated per token for efficient computation

📊

Long Context Support

Native 1M token context window, expandable to 4M tokens during inference

⚙️

Efficient Compute

Consumes only 25% of the FLOPs of DeepSeek R1 when generating 100K tokens

Benchmark Comparison

Comprehensive performance analysis against leading models

MiniMax-M1 Benchmark Performance Comparison

Left: Benchmark performance comparison of leading commercial and open-weight models across competition-level mathematics, coding, software engineering, agentic tool use, and long-context understanding tasks. We use the MiniMax-M1-80k model here for MiniMax-M1. Right: Theoretical inference FLOPs scaling with generation length (# tokens).

Performance Benchmarks

Leading performance across mathematics, coding, reasoning, and long-context tasks

Mathematics

AIME 2024: 86.0
AIME 2025: 76.9
MATH-500: 96.8

Coding

LiveCodeBench: 65.0
FullStackBench: 68.3

Reasoning

GPQA Diamond: 70.0
MMLU-Pro: 81.1
ZebraLogic: 86.8

Software Engineering

SWE-bench Verified: 56.0

Long Context

OpenAI-MRCR (128k): 73.4
OpenAI-MRCR (1M): 56.2
LongBench-v2: 61.5

Deployment Options

Multiple deployment methods to suit your needs

HuggingFace

Download directly from HuggingFace repository

MiniMax-M1-40k
MiniMax-M1-80k
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("MiniMax-AI/MiniMax-M1-40k")

vLLM

Recommended for production deployment with excellent performance

Outstanding throughput
Efficient memory management
Batch processing

pip install vllm
vllm serve MiniMax-AI/MiniMax-M1-40k

Transformers

Direct deployment using Transformers library

Development and experimentation
pip install transformers torch
python -c "from transformers import pipeline; model = pipeline('text-generation', model='MiniMax-AI/MiniMax-M1-40k')"

Frequently Asked Questions

Common questions about MiniMax-M1 and detailed answers

What makes MiniMax-M1 different from other AI models?

MiniMax-M1 is the world's first open-weight, large-scale hybrid-attention reasoning model. It features a unique hybrid Mixture-of-Experts (MoE) architecture combined with a Lightning Attention mechanism. With 456B total parameters and only 45.9B activated per token, it offers superior efficiency while natively supporting a 1M-token context length, roughly 8x the 128K context windows of many competing models.
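The sparse-activation idea behind an MoE layer can be illustrated with a toy top-k router. This is a minimal sketch only: the expert count, sizes, and routing details below are illustrative, not MiniMax-M1's actual configuration.

```python
import numpy as np

def moe_forward(x, experts_w, router_w, k=2):
    """Toy MoE layer: route each token to its top-k experts.

    x         : (tokens, d)        token activations
    experts_w : (n_experts, d, d)  one weight matrix per expert
    router_w  : (d, n_experts)     router projection
    Only k of n_experts run per token, so compute scales with k,
    not with the total expert (parameter) count.
    """
    logits = x @ router_w                      # (tokens, n_experts)
    top = np.argsort(logits, axis=-1)[:, -k:]  # top-k expert indices per token
    # Softmax over the selected logits only.
    sel = np.take_along_axis(logits, top, axis=-1)
    gate = np.exp(sel - sel.max(-1, keepdims=True))
    gate /= gate.sum(-1, keepdims=True)
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        for j in range(k):
            e = top[t, j]
            out[t] += gate[t, j] * (x[t] @ experts_w[e])
    return out

rng = np.random.default_rng(0)
tokens, d, n_experts = 4, 8, 16
y = moe_forward(rng.normal(size=(tokens, d)),
                rng.normal(size=(n_experts, d, d)) * 0.1,
                rng.normal(size=(d, n_experts)))
print(y.shape)  # (4, 8)
```

Each token's output mixes only its k selected experts, which is why 456B total parameters can cost only 45.9B activated parameters per token.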

How efficient is MiniMax-M1 compared to other models?

MiniMax-M1 is highly efficient thanks to its Lightning Attention mechanism. Compared to DeepSeek R1, M1 consumes only 25% of the FLOPs at a generation length of 100K tokens. This makes it particularly suitable for complex tasks requiring extensive reasoning and long input processing.

What are the two versions available and their differences?

MiniMax-M1 comes in two versions: M1-40K and M1-80K, referring to their thinking budgets. The 80K version offers enhanced reasoning capabilities for more complex tasks, while the 40K version provides a good balance of performance and efficiency. Both versions share the same core architecture and 456B parameter count.

How does MiniMax-M1 perform on benchmarks?

MiniMax-M1 demonstrates strong performance across various benchmarks. On AIME 2024, it achieves 86.0% (80K) and 83.3% (40K). For software engineering tasks like SWE-bench Verified, it scores 56.0% and 55.6% respectively. It particularly excels in long-context tasks, achieving 73.4% on OpenAI-MRCR (128k) and 56.2% on 1M token tasks.

How can I deploy and use MiniMax-M1?

MiniMax-M1 can be deployed using multiple methods: vLLM (recommended for production), Transformers, or HuggingFace. The model is available on HuggingFace Hub and GitHub. For general use, you can try the online chatbot at chat.minimaxi.com or use the API for development purposes.

What is the Lightning Attention mechanism?

Lightning Attention is MiniMax-M1's innovative attention mechanism that enables efficient scaling of test-time compute. It combines with traditional Softmax Attention in a hybrid architecture, allowing the model to process extremely long contexts (up to 4M tokens during inference) while maintaining computational efficiency.
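The efficiency gain of linear-attention families like Lightning Attention comes from reordering the attention computation: instead of forming the n-by-n score matrix (QK^T)V, a feature map phi lets you compute phi(Q)(phi(K)^T V), which is linear in sequence length. The sketch below shows that generic trick, not MiniMax's actual Lightning Attention kernel:

```python
import numpy as np

def softmax_attention(Q, K, V):
    """Standard attention: materializes an n x n score matrix, O(n^2)."""
    S = Q @ K.T / np.sqrt(Q.shape[-1])
    P = np.exp(S - S.max(-1, keepdims=True))
    P /= P.sum(-1, keepdims=True)
    return P @ V

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0) + 1e-6):
    """Kernelized 'linear' attention: reorder (phi(Q) phi(K)^T) V as
    phi(Q) (phi(K)^T V). The d x d summary KV replaces the n x n score
    matrix, so cost is O(n) in sequence length."""
    Qp, Kp = phi(Q), phi(K)
    KV = Kp.T @ V                  # (d, d): sequence dimension summed away
    Z = Qp @ Kp.sum(0)             # (n,): per-query normalizer
    return (Qp @ KV) / Z[:, None]

rng = np.random.default_rng(1)
n, d = 6, 4
Q, K, V = (rng.normal(size=(n, d)) for _ in range(3))
print(softmax_attention(Q, K, V).shape, linear_attention(Q, K, V).shape)
```

A hybrid architecture interleaves layers of both kinds, keeping softmax attention's precision where it matters while the linear layers keep long-context cost manageable.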

Does MiniMax-M1 support function calling?

Yes, MiniMax-M1 supports function calling capabilities. The model can identify when external functions need to be called and output function call parameters in a structured format. This makes it suitable for agentic applications and tool use scenarios.
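In practice, a structured function call is typically emitted as JSON naming the function and its arguments, which the host application parses and dispatches. The snippet below is a hedged sketch of that loop; the JSON field names and the `get_weather` function are illustrative, not MiniMax-M1's exact tool-call schema.

```python
import json

# Hypothetical model output containing a structured tool call; the exact
# schema MiniMax-M1 emits may differ -- this JSON shape is illustrative.
model_output = '{"name": "get_weather", "arguments": {"city": "Shanghai", "unit": "celsius"}}'

def dispatch(raw, registry):
    """Parse a tool-call payload and invoke the matching local function."""
    call = json.loads(raw)
    fn = registry[call["name"]]
    return fn(**call["arguments"])

def get_weather(city, unit):
    # Stub standing in for a real weather API.
    return f"22 degrees {unit} in {city}"

print(dispatch(model_output, {"get_weather": get_weather}))
# -> 22 degrees celsius in Shanghai
```

The function's return value would then be fed back to the model as a tool message so it can continue the agentic loop.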

What is the training methodology behind MiniMax-M1?

MiniMax-M1 is trained using large-scale reinforcement learning (RL) on diverse problems ranging from traditional mathematical reasoning to sandbox-based, real-world software engineering environments. The training utilizes CISPO, a novel algorithm that clips importance sampling weights instead of token updates, which outperforms other competitive RL variants.
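The distinction between clipping the importance-sampling weight and clipping the token update can be sketched in a few lines. This toy objective is built only from the one-sentence description above; the clip range, numbers, and exact loss form are illustrative assumptions, not the published CISPO algorithm.

```python
import numpy as np

def cispo_token_loss(logp_new, logp_old, advantage, eps=0.2):
    """Toy CISPO-style objective for a single token.

    The importance-sampling ratio r = pi_new / pi_old is clipped and
    treated as a constant (a stop-gradient in a real framework), so every
    token still contributes a gradient through logp_new -- unlike
    PPO-style clipping, which can zero out the update for tokens whose
    ratio leaves the trust region.
    """
    r = np.exp(logp_new - logp_old)
    r_clipped = np.clip(r, 1 - eps, 1 + eps)  # stop-gradient here in practice
    return -(r_clipped * advantage * logp_new)

# A ratio far outside the clip range: PPO-style clipping would drop this
# token's gradient entirely; here the weight is merely capped at 1 + eps.
loss = cispo_token_loss(logp_new=np.log(0.9), logp_old=np.log(0.3), advantage=1.0)
print(round(float(loss), 4))
```

Keeping a (capped) gradient signal from every token is the property that makes this style of clipping attractive for long reasoning rollouts, where rare tokens can carry most of the learning signal.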

What license is MiniMax-M1 released under?

MiniMax-M1 is released under the Apache 2.0 license, making it fully open-source and free for both commercial and research use. This allows developers and researchers to freely use, modify, and distribute the model while maintaining attribution.

How can I get support or contact the MiniMax team?

For technical support, questions, or feedback, you can contact the MiniMax team at [email protected]. You can also follow the project on GitHub for updates and community discussions.