Google completes Gemini 3.1 tier with Flash-Lite: A high-speed, low-cost utility for enterprise scale

Google launches Gemini 3.1 Flash-Lite, offering 2.5X faster token delivery and 1/8th the cost of Gemini 3.1 Pro for enterprise-scale AI.

By: AXL Media

Published: Mar 5, 2026, 5:49 AM EST

Source: The information in this article was sourced from VentureBeat

Google completes Gemini 3.1 tier with Flash-Lite: A high-speed, low-cost utility for enterprise scale - article image

Technical Performance and Benchmarks

Gemini 3.1 Flash-Lite is optimized for high-throughput enterprise tasks where latency is the primary barrier to user experience. It achieves an output speed of 363 tokens per second, significantly outpacing Gemini 2.5 Flash.

Despite its "Lite" designation, the model maintains high cognitive scores across several standardized benchmarks:

MMMLU (Multilingual Q&A): 88.9%

GPQA Diamond (Scientific Knowledge): 86.9%

MMMU-Pro (Multimodal Understanding): 76.8%

Topics

Enterprise AI
Multimodal AI
Google DeepMind
Gemini 3.1 Flash-Lite
Google Gemini 3.1
AI cost efficiency
Vertex AI
Gemini 3.1 Pro vs Flash-Lite

Related Coverage

Google Transforms Gemini Into Interactive AI Workstation With 3D Simulations and Lyria 3 Music
Multidisciplinary Study Outlines Shift Toward Multimodal AI Systems for Global Deception Detection
NYU Researchers Discover Longer AI Processing Delays Increase Perceived Quality Among 240 Study Participants
Canadian AI Leader Cohere Acquires Germany's Aleph Alpha to Spearhead European Sovereign Technology Alliance

Google completes Gemini 3.1 tier with Flash-Lite: A high-speed, low-cost utility for enterprise scale

Categories

Topics

Related Coverage