Google completes Gemini 3.1 tier with Flash-Lite: A high-speed, low-cost utility for enterprise scale
Google launches Gemini 3.1 Flash-Lite, offering 2.5X faster token delivery and 1/8th the cost of Gemini 3.1 Pro for enterprise-scale AI.
By: AXL Media
Published: Mar 5, 2026, 5:49 AM EST
Source: The information in this article was sourced from VentureBeat

Technical Performance and Benchmarks
Gemini 3.1 Flash-Lite is optimized for high-throughput enterprise tasks where latency is the primary barrier to user experience. It achieves an output speed of 363 tokens per second, significantly outpacing Gemini 2.5 Flash.
Despite its "Lite" designation, the model maintains high cognitive scores across several standardized benchmarks:
MMMLU (Multilingual Q&A): 88.9%
GPQA Diamond (Scientific Knowledge): 86.9%
MMMU-Pro (Multimodal Understanding): 76.8%
Categories
Topics
Related Coverage
- Google Transforms Gemini Into Interactive AI Workstation With 3D Simulations and Lyria 3 Music
- Multidisciplinary Study Outlines Shift Toward Multimodal AI Systems for Global Deception Detection
- NYU Researchers Discover Longer AI Processing Delays Increase Perceived Quality Among 240 Study Participants
- Canadian AI Leader Cohere Acquires Germany's Aleph Alpha to Spearhead European Sovereign Technology Alliance