AI Reasoning · AI Agent · mimo.xiaomi.com
MiMo-V2.5-Pro-UltraSpeed: a 1T-parameter model at 1000 tokens per second
Compiled by KHAO Editorial — aggregated from 1 source. See llms.txt for citation guidance.
◌ Single Source
The speed of AI reasoning is no different: it defines the boundaries of intelligence itself.
Key facts
- The MiMo-V2.5-Pro-UltraSpeed API launches simultaneously at a limited-time promotional price: 3× the cost of MiMo-V2.5-Pro, with approximately 10× the generation speed
- Xiaomi released MiMo-V2.5-Pro-UltraSpeed in collaboration with TileRT, which it says breaks 1000 tokens/s decode speed on a 1-trillion-parameter model for the first time
- To ensure quality and fairness under resource constraints, the following rules apply: each account may enter the queue up to 10 times per day; each session is capped at 30 minutes; sessions idle
Summary
From the first roaring racer of the combustion age to the sonic boom that shattered the sound barrier, humanity's hunger for speed is written into its DNA. Xiaomi has released MiMo-V2.5-Pro-UltraSpeed in collaboration with TileRT, which it describes as the first time decode speed has passed 1000 tokens/s on a 1-trillion-parameter model. The MiMo-V2.5-Pro-UltraSpeed API launches simultaneously at a limited-time promotional price: 3× the cost of MiMo-V2.5-Pro for approximately 10× the generation speed (API only; Token Plan not supported). Because high-speed inference resources are limited, MiMo-V2.5-Pro-UltraSpeed is available through an application-based, limited-time window. API platform: platform.xiaomimimo.com/ultraspeed. For standard model access, follow the MiMo-V2.5 model series.
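The price and speed multiples above can be turned into a rough trade-off estimate. The sketch below is illustrative only: the function name `compare_tiers` and the normalized price of 1.0 unit per token are hypothetical, no actual prices are given in the source, and the baseline speed is inferred by dividing 1000 tokens/s by the "approximately 10×" figure. It also assumes the decode rate is sustained for the whole response and ignores queueing and prompt processing time.

```python
def compare_tiers(
    tokens: int,
    fast_tps: float = 1000.0,           # decode speed stated for UltraSpeed
    speedup: float = 10.0,              # "approximately 10x" (treated as exact here)
    price_multiple: float = 3.0,        # UltraSpeed price relative to MiMo-V2.5-Pro
    base_price_per_token: float = 1.0,  # hypothetical normalized unit, not a real price
) -> dict:
    """Estimate generation time and relative cost for both tiers."""
    base_tps = fast_tps / speedup       # inferred baseline decode speed
    base_seconds = tokens / base_tps
    fast_seconds = tokens / fast_tps
    base_cost = tokens * base_price_per_token
    fast_cost = base_cost * price_multiple
    return {
        "base_seconds": base_seconds,
        "fast_seconds": fast_seconds,
        "seconds_saved": base_seconds - fast_seconds,
        "base_cost": base_cost,
        "fast_cost": fast_cost,
        "extra_cost": fast_cost - base_cost,
    }


if __name__ == "__main__":
    result = compare_tiers(20_000)
    for key, value in result.items():
        print(f"{key}: {value}")
```

Under these assumptions, a 20,000-token response takes about 200 seconds on the standard tier and about 20 seconds on UltraSpeed, at three times the per-token cost. Whether that is worthwhile depends on how much the waiting time matters for the workload, for example in multi-step agent loops where each step blocks on the previous one.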