Beyond the Token: Google's Per-Minute Pricing and the Disruption of Real-Time AI Economics
May 14, 2026
The token has been the unit of account for AI inference since OpenAI launched its first public API in 2020. Nearly every pricing page, cost model, and engineering estimate in the industry has been denominated in dollars per million tokens. In 2026, Google broke with that convention: the Gemini Live API is priced not per token but at $0.005 per minute of audio interaction. This is not a minor pricing variant. It is a structural challenge to the assumptions behind real-time AI application budgets. Knowing when per-minute pricing beats per-token pricing, and when it does not, is now a required competency for any engineering leader deploying AI at scale.
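To make the comparison concrete, here is a minimal break-even sketch. Only the $0.005 per-minute rate comes from the figures above. The per-token price and the token throughput used in the example are hypothetical placeholders, not quoted rates from any provider, and the function names are illustrative.

```python
# Break-even sketch: per-minute vs. per-token pricing for one minute of audio.
# PER_MINUTE_RATE_USD is the rate cited in the article; every other number
# passed to these functions is a hypothetical assumption for illustration.

PER_MINUTE_RATE_USD = 0.005


def per_token_cost_per_minute(tokens_per_minute: float,
                              usd_per_million_tokens: float) -> float:
    """Cost of one minute of interaction under per-token billing."""
    return tokens_per_minute * usd_per_million_tokens / 1_000_000


def break_even_tokens_per_minute(usd_per_million_tokens: float,
                                 per_minute_rate_usd: float = PER_MINUTE_RATE_USD) -> float:
    """Token throughput at which both pricing models cost the same per minute."""
    return per_minute_rate_usd * 1_000_000 / usd_per_million_tokens


def cheaper_model(tokens_per_minute: float,
                  usd_per_million_tokens: float,
                  per_minute_rate_usd: float = PER_MINUTE_RATE_USD) -> str:
    """Return which billing model is cheaper for the given workload."""
    token_cost = per_token_cost_per_minute(tokens_per_minute, usd_per_million_tokens)
    return "per-minute" if per_minute_rate_usd < token_cost else "per-token"


if __name__ == "__main__":
    # Hypothetical workload: $2.00 per million tokens, 1,500 tokens per minute.
    hypothetical_price = 2.00
    hypothetical_throughput = 1_500
    print(break_even_tokens_per_minute(hypothetical_price))
    print(cheaper_model(hypothetical_throughput, hypothetical_price))
```

Under these assumed inputs, a workload consuming fewer tokens per minute than the break-even figure is cheaper on per-token billing, and one consuming more is cheaper on per-minute billing. Real workloads would also need to account for input and output tokens being priced differently, which this sketch collapses into a single blended rate.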
