How to Run Minimax M1 via API: A Complete Guide

Rebecca Kovács

19 June 2025

MiniMax M1, developed by a Shanghai-based AI startup, is a groundbreaking open-weight, large-scale hybrid-attention reasoning model. With a 1-million-token context window, efficient reinforcement learning (RL) training, and competitive benchmark performance, it is well suited to complex tasks such as long-context reasoning, software engineering, and agentic tool use. This guide surveys MiniMax M1's benchmarks and then walks step by step through running it via the OpenRouter API.

💡
Want a great API Testing tool that generates beautiful API Documentation?

Want an integrated, All-in-One platform for your Developer Team to work together with maximum productivity?

Apidog delivers all your demands, and replaces Postman at a much more affordable price!

MiniMax M1 Benchmarks: A Performance Overview

MiniMax M1 stands out due to its unique architecture and cost-effective training. Available in two variants—M1-40k and M1-80k, based on their “thinking budgets” or output lengths—it excels in multiple benchmarks. Below, we dive into its key performance metrics.

MiniMax M1-40k delivers above-average quality with an MMLU score of 0.808 and an Intelligence Index of 61. It outperforms many open-weight models in complex reasoning tasks. The M1-80k variant further enhances performance, leveraging extended computational resources. MiniMax M1 shines in benchmarks like FullStackBench, SWE-bench, MATH, GPQA, and TAU-Bench, surpassing competitors in tool-use scenarios and software engineering, making it ideal for debugging codebases or analyzing lengthy documents.

MiniMax M1 Pricing

Source: ArtificialAnalysis.ai

MiniMax M1-40k is cost-competitive at $0.82 per 1 million tokens (3:1 input-to-output ratio). Input tokens cost $0.40 per million, and output tokens cost $2.10 per million, cheaper than the industry average. MiniMax M1-80k is slightly pricier due to its extended thinking budget. Volume discounts are available for enterprise users, enhancing affordability for large-scale deployments.
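The quoted blended rate follows directly from the per-token prices and the stated 3:1 ratio; a quick arithmetic check:

```python
input_price = 0.40   # USD per 1M input tokens (from the figures above)
output_price = 2.10  # USD per 1M output tokens

# Blended rate assuming the 3:1 input-to-output token ratio used in the quote
blended = (3 * input_price + output_price) / 4
print(f"Blended price per 1M tokens: ~${blended:.3f}")  # ~$0.825, i.e. the $0.82 figure
```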

MiniMax M1 Architecture and Training

MiniMax M1’s hybrid-attention design blends Lightning Attention (linear cost) with periodic Softmax Attention (quadratic but expressive) and a sparse MoE routing system, activating ~10% of its 456 billion parameters. Its RL training, powered by the CISPO algorithm, enhances efficiency by clipping importance sampling weights. MiniMax M1 was trained on 512 H800 GPUs in three weeks, a remarkable feat.
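As a back-of-envelope check on the sparse-activation claim (figures taken from this article; the exact active-parameter count may differ slightly from this rounding):

```python
total_params = 456e9  # total parameter count (from the article)
active_frac = 0.10    # ~10% of parameters activated per token via MoE routing

active = total_params * active_frac
print(f"Active parameters per token: ~{active / 1e9:.0f}B")  # ~46B
```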

MiniMax M1 excels in long-context reasoning, cost-effectiveness, and agentic tasks, though its output speed lags. Its open-source Apache 2.0 license enables fine-tuning or on-premises deployment for sensitive workloads. Next, we explore running MiniMax M1 via the OpenRouter API.

Running MiniMax M1 via OpenRouter API

OpenRouter offers a unified, OpenAI-compatible API to access MiniMax M1, simplifying integration. Below is a step-by-step guide to running MiniMax M1 using OpenRouter.


Step 1: Set Up an OpenRouter Account

  1. Visit OpenRouter’s website and sign up using email or OAuth providers like Google.
  2. Generate an API key in the “API Keys” section of your dashboard and store it securely.
  3. Add funds to your account via credit card to cover API usage costs. Check for promotions, as providers occasionally discount MiniMax M1 on OpenRouter.
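Before writing application code, you can verify that the key is accepted. The sketch below uses only the standard library and OpenRouter's key-status endpoint (`/api/v1/auth/key`, per OpenRouter's docs at the time of writing; adjust the path if it has changed):

```python
import os
import urllib.error
import urllib.request

# Key-status endpoint; an authenticated GET returns details about your key
API_URL = "https://openrouter.ai/api/v1/auth/key"

def check_key(key: str) -> bool:
    """Return True if OpenRouter accepts the key (HTTP 200 on key status)."""
    req = urllib.request.Request(API_URL, headers={"Authorization": f"Bearer {key}"})
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.status == 200
    except urllib.error.HTTPError:
        return False  # e.g. 401 for an invalid key

key = os.environ.get("OPENROUTER_API_KEY")
if key:
    print("API key valid:", check_key(key))
else:
    print("Set OPENROUTER_API_KEY to run this check.")
```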

Step 2: Understand MiniMax M1 on OpenRouter

On OpenRouter, MiniMax M1 is optimized for long-context reasoning, agentic tool use, and software engineering workloads. It typically defaults to the M1-40k variant, with pricing at ~$0.40 per million input tokens and $2.10 per million output tokens.

Step 3: Make MiniMax M1 API Requests

OpenRouter’s API works with OpenAI’s SDK. Here’s how to send requests:

Prerequisites

  1. Python 3.8 or later installed.
  2. The OpenAI Python SDK (pip install openai).
  3. Your OpenRouter API key from Step 1.

Sample Code

Below is a Python script to query MiniMax M1:

python

from openai import OpenAI

# Initialize the client with OpenRouter's endpoint and your API key
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="your_openrouter_api_key_here"
)

# Define the prompt and parameters
prompt = "Summarize the key features of MiniMax M1 in 100 words."
model = "minimax/minimax-m1"  # Specify MiniMax M1
max_tokens = 200
temperature = 1.0  # For creative responses
top_p = 0.95  # For coherence

# Make the API call
response = client.chat.completions.create(
    model=model,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": prompt}
    ],
    max_tokens=max_tokens,
    temperature=temperature,
    top_p=top_p
)

# Extract and print the response
output = response.choices[0].message.content
print("Response:", output)

Explanation

The client points the standard OpenAI SDK at OpenRouter's endpoint, so no OpenRouter-specific library is needed. The model string minimax/minimax-m1 routes the request to MiniMax M1, max_tokens caps the length of the reply, and temperature and top_p control sampling randomness and diversity. The response follows the familiar OpenAI chat-completion format.

Step 4: Handle MiniMax M1 Responses

The API returns a JSON object with MiniMax M1's output in choices[0].message.content. Ensure your input stays within the 1-million-token context window. If the reply is cut off, increase max_tokens or ask the model to continue in a follow-up message.
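The truncation check can be wrapped in a small helper. This is a sketch: it assumes an OpenAI-SDK-style response object like the one returned by the sample script, and `extract_text` is an illustrative name, not part of any SDK:

```python
def extract_text(response):
    """Return the reply text and warn if it was cut off at max_tokens.

    Works with OpenAI-SDK-style chat-completion response objects.
    """
    choice = response.choices[0]
    if choice.finish_reason == "length":  # the reply hit the max_tokens cap
        print("Warning: output truncated; raise max_tokens or ask the model to continue.")
    return choice.message.content
```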

Step 5: Optimize MiniMax M1 for Specific Tasks

Tune the sampling parameters to the task: lower the temperature for code generation and factual Q&A, and keep it near 1.0 for creative writing. For long-document analysis, the 1-million-token context window lets you send the full text in a single request instead of chunking it.
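One convenient pattern is a small table of per-task presets. The values below are illustrative starting points, not official MiniMax recommendations:

```python
# Hypothetical per-task sampling presets; tune to your own workload.
PRESETS = {
    "code": {"temperature": 0.2, "top_p": 0.9},          # precise, deterministic
    "summarization": {"temperature": 0.5, "top_p": 0.95},
    "creative": {"temperature": 1.0, "top_p": 0.95},     # matches the sample script
}

def params_for(task: str) -> dict:
    """Return sampling parameters for a task, with a balanced default."""
    return PRESETS.get(task, {"temperature": 0.7, "top_p": 0.95})

print(params_for("code"))
```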

Step 6: Monitor MiniMax M1 Usage and Costs

Track usage and costs in OpenRouter’s dashboard. Optimize prompts to minimize token counts, reducing input and output expenses.
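Per-request costs can also be estimated locally from the usage field the API returns, using the rates quoted in the pricing section (a sketch; `request_cost` is an illustrative helper, not an SDK function):

```python
INPUT_PRICE = 0.40 / 1_000_000   # USD per input token (from the pricing section)
OUTPUT_PRICE = 2.10 / 1_000_000  # USD per output token

def request_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate one request's cost from a response's usage counts."""
    return prompt_tokens * INPUT_PRICE + completion_tokens * OUTPUT_PRICE

# e.g. usage = response.usage
#      request_cost(usage.prompt_tokens, usage.completion_tokens)
print(f"50k in / 2k out: ${request_cost(50_000, 2_000):.4f}")  # → $0.0242
```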

Step 7: Explore Advanced MiniMax M1 Integrations

Beyond single-shot requests, you can stream responses for interactive interfaces, wire the model into agent frameworks for tool use, or, thanks to the Apache 2.0 license, fine-tune and self-host the open weights for sensitive workloads.

Troubleshooting MiniMax M1

  1. 401 errors: verify your API key and that the Authorization header is set correctly.
  2. 429 errors: you are being rate-limited; retry with exponential backoff.
  3. Payment errors: top up your OpenRouter balance in the dashboard.
  4. Truncated replies: raise max_tokens or continue the conversation in a follow-up message.
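For transient failures such as rate limits, a generic retry-with-backoff wrapper works well with any SDK call (a sketch; `with_retries` is an illustrative helper, not part of the OpenAI SDK):

```python
import random
import time

def with_retries(call, max_attempts=5, base_delay=1.0):
    """Retry a zero-argument callable with exponential backoff and jitter.

    Wrap the API call in a lambda, e.g.:
        with_retries(lambda: client.chat.completions.create(...))
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # give up after the final attempt
            # 1s, 2s, 4s, ... plus jitter to avoid synchronized retries
            time.sleep(base_delay * 2 ** attempt + random.uniform(0, base_delay))
```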

Conclusion

MiniMax M1 is a powerful, cost-effective AI model with unmatched long-context capabilities and strong reasoning performance. Its open-source nature and efficient training make it accessible for diverse applications. Using OpenRouter’s API, developers can integrate MiniMax M1 into projects like document summarization or code generation. Follow the steps above to get started and explore advanced deployment options for production. MiniMax M1 unlocks scalable, reasoning-driven AI for developers and enterprises alike.
