AI models' per-token price has stopped falling

Simon Willison points out something anyone running an AI workflow in production has already seen on the invoice: the three main labs, Google, OpenAI and Anthropic, raise their API prices with every release. GPT-5.5 costs twice as much as GPT-5.4, and Opus 4.7 costs roughly 1.46 times as much as Opus 4.6. The assumption many built their business case on, that the per-token cost always falls, no longer holds. The provider moves the price out from under you from one version to the next, and anyone with an agent in production has two choices: absorb the increase or redo the math.

Here is the point for TCO (total cost of ownership). The API cost isn't a line you can fix in the budget for twelve months. It's a variable, and someone else sets it. It comes on top of maintenance, the dedicated person, and the refactor when the model changes behavior. Anyone calculating the annual cost of a workflow on today's token prices alone is underestimating it.
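To put a rough number on that underestimate, here is a minimal sketch in Python. Only the two multipliers (2.0 and 1.46) come from the figures above. The workload (500 million tokens a month), the base price ($10 per million tokens), the switch after six months and the function name `annual_api_cost` are all hypothetical, chosen only to make the arithmetic visible. It also assumes token volume stays constant across versions, which may not hold in practice.

```python
def annual_api_cost(monthly_tokens_m, price_per_m, multiplier=1.0, switch_month=12):
    """Annual API spend in dollars.

    monthly_tokens_m: tokens used per month, in millions (hypothetical workload)
    price_per_m:      starting price per million tokens (hypothetical)
    multiplier:       price ratio of the new model version to the old one
    switch_month:     number of months spent at the old price before switching
    """
    months_at_old_price = switch_month
    months_at_new_price = 12 - switch_month
    old_spend = months_at_old_price * monthly_tokens_m * price_per_m
    new_spend = months_at_new_price * monthly_tokens_m * price_per_m * multiplier
    return old_spend + new_spend


# Budget built on today's price only: twelve months at a flat rate.
flat = annual_api_cost(500, 10.0)

# Same workload, but the price doubles after six months (the 2x case).
doubled = annual_api_cost(500, 10.0, multiplier=2.0, switch_month=6)

# Same workload with a 1.46x increase after six months.
raised = annual_api_cost(500, 10.0, multiplier=1.46, switch_month=6)

for label, cost in [("flat", flat), ("2x at month 6", doubled), ("1.46x at month 6", raised)]:
    overrun = (cost - flat) / flat * 100
    print(f"{label}: ${cost:,.0f} ({overrun:.0f}% over the flat budget)")
```

Under these assumptions, a price that doubles halfway through the year puts the API line 50% over a budget built on today's price, before counting maintenance or refactoring. The real figure depends on when, and whether, you move to the new version.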

Why this matters for anyone building enterprise AI: the per-token cost is a variable the provider decides, not a fixed line in the budget.

◆ ◆ ◆

Source

https://simonwillison.net/2026/May/19/gemini-35-flash/