Xiaomi MiMo-v2.5-Pro-UltraSpeed Achieves 1000 TPS Generation

Original: MiMo-v2.5-Pro-UltraSpeed: 1T model with 1000 tokens per second

Why This Matters

A breakthrough in AI inference speed could enable real-time applications that were previously limited by latency.

Xiaomi released MiMo-v2.5-Pro-UltraSpeed, a 1-trillion-parameter AI model that generates more than 1000 tokens per second. Developed with TileRT, the model offers roughly 10x the output speed at 3x the API cost during a limited trial period, June 9-23, 2026.

Xiaomi launched MiMo-v2.5-Pro-UltraSpeed, which it claims is the first 1-trillion-parameter model to exceed 1000 tokens per second (TPS) in decode speed, reaching up to 1200 TPS. The model was developed in collaboration with TileRT. The API costs 3x as much as standard MiMo-v2.5-Pro but delivers approximately 10x the generation speed. Access requires application approval and is available only from June 9 to 23, 2026. Approved users receive free chat access with restrictions: 10 queue entries per day, 30-minute sessions, and automatic release after 5 minutes of idle time. Xiaomi positions the speed gain as enabling real-time AI decision loops for time-critical applications, including quantitative trading, fraud detection, coding assistance, and medical analysis.
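To make the speed and price trade-off concrete, the arithmetic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 100 TPS baseline is inferred from the "approximately 10x" claim rather than published, the 2000-token response length is hypothetical, and the function ignores queueing and time to first token.

```python
def decode_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to stream num_tokens at a steady decode rate.

    Ignores queueing, network latency, and time to first token.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


ULTRASPEED_TPS = 1000.0  # headline figure from the announcement
SPEEDUP = 10.0           # "approximately 10x", per the announcement
# Assumption: baseline implied by the two figures above, not a published number.
STANDARD_TPS = ULTRASPEED_TPS / SPEEDUP
PRICE_MULTIPLIER = 3.0   # UltraSpeed API price relative to standard MiMo-v2.5-Pro

response_tokens = 2000   # hypothetical response length

fast = decode_seconds(response_tokens, ULTRASPEED_TPS)
slow = decode_seconds(response_tokens, STANDARD_TPS)

print(f"UltraSpeed: {fast:.1f} s, standard (assumed): {slow:.1f} s")
print(f"Seconds saved: {slow - fast:.1f}, relative cost: {PRICE_MULTIPLIER:.0f}x")
```

Under these assumptions, the same response arrives in about a tenth of the time for three times the price, which is the trade-off a latency-sensitive application would be weighing.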

Source

mimo.xiaomi.com