MiniMax-M3

Summary

MiniMax-M3 is an open-weight, natively multimodal mixture-of-experts (MoE) model release from MiniMax, powered by MiniMax Sparse Attention (MSA). The linked Hugging Face model card describes about 428B total parameters, about 23B activated parameters, image/video/text input, 1M-context support, and coding/agentic usage.
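As a quick sanity check on what the total and activated counts imply, the sketch below computes the share of parameters used per token. It relies only on the approximate figures quoted above; the function and constant names are illustrative and do not come from the model card.

```python
# Approximate figures as quoted in the summary above (both are "about" values).
TOTAL_PARAMS_B = 428   # total parameters, in billions
ACTIVE_PARAMS_B = 23   # parameters activated per token, in billions


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the share of parameters activated per token in a sparse MoE model."""
    if total_b <= 0:
        raise ValueError("total_b must be positive")
    return active_b / total_b


fraction = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
print(f"Active share per token: {fraction:.1%}")
```

With these inputs the active share is roughly 5% of the total, which is why the activated count, rather than the total, is the more relevant number for per-token compute. Memory needed to hold the weights still scales with the total.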

Role In The Wiki

MiniMax-M3 is the release-level evidence that MSA is deployed in a production, open-weight multimodal model stack rather than existing only as a standalone attention paper. It belongs near encoder-free/native multimodal releases and long-context serving efficiency sources.

For the foundation time-series agenda, MiniMax-M3 is not evidence for numeric time-series modeling. Its transferable lesson is that long-context, multimodal, agentic models increasingly make context length a serving-systems problem as much as an architecture problem.
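To illustrate why context length becomes a serving-systems problem, here is a minimal sketch of how a dense key-value cache grows with sequence length. Every model dimension below is a hypothetical placeholder and not MiniMax-M3's actual configuration. The formula is the generic dense-attention estimate; how MSA changes cache or compute requirements is not covered by this page.

```python
def kv_cache_bytes(
    context_len: int,
    num_layers: int,
    num_kv_heads: int,
    head_dim: int,
    bytes_per_value: int = 2,  # 2 bytes assumes a 16-bit format such as bf16
) -> int:
    """Estimate dense KV-cache size for one sequence.

    The leading factor of 2 accounts for storing both keys and values.
    """
    return 2 * context_len * num_layers * num_kv_heads * head_dim * bytes_per_value


# Hypothetical dimensions chosen only for illustration (NOT MiniMax-M3's config).
HYPOTHETICAL_LAYERS = 60
HYPOTHETICAL_KV_HEADS = 8
HYPOTHETICAL_HEAD_DIM = 128

for tokens in (8_000, 128_000, 1_000_000):
    size = kv_cache_bytes(
        tokens, HYPOTHETICAL_LAYERS, HYPOTHETICAL_KV_HEADS, HYPOTHETICAL_HEAD_DIM
    )
    print(f"{tokens:>9} tokens -> {size / 1e9:.1f} GB per sequence")
```

The estimate is linear in context length, so under these placeholder dimensions a single 1M-token sequence needs on the order of hundreds of gigabytes of cache. That cost falls on memory capacity, batching, and scheduling in the serving stack regardless of how the attention computation itself is made cheaper.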

Official Artifacts

Hugging Face model: MiniMaxAI/MiniMax-M3
GitHub model repo: MiniMax-AI/MiniMax-M3
Attention/kernel code: MiniMax-AI/MSA
Paper: MiniMax Sparse Attention

License Note

The model weights use the MiniMax Community License with commercial-use conditions. Treat MiniMax-M3 as open-weight, not as a standard Apache-2.0 or MIT model release.

Evidence

MiniMax Sparse Attention

Alex Open Research Wiki
