Shadow AI is any AI tool, model, or agent in use across an enterprise that hasn't been sanctioned, inventoried, or budgeted. It includes free-tier LLM accounts, IDE coding assistants, AI features auto-enabled inside sanctioned SaaS, browser extensions that route prompts to third-party models, and single-purpose agents employees provision on personal accounts.

How much bigger is shadow AI than the AI IT knows about?

Most AI use never reaches IT. Microsoft's Work Trend Index found 78% of employees who use AI at work bring their own tools rather than company-approved ones, so what IT can see is only a fraction of what is running.

What are the risks of shadow AI?

The primary risk is data exposure. When employees use personal AI accounts, company data leaves your environment for a third party you have no contract with. IBM found that breaches at organizations with high levels of shadow AI cost $670,000 more on average than breaches at organizations with little or none, and 65% of shadow-AI incidents involved customer PII.

How do you measure AI ROI?

Measure AI ROI as the business value attributable to AI divided by its fully-loaded cost, sanctioned and shadow. This requires three measurements: adoption, attribution, and marginal utility. Spend alone is not a measure of ROI.
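As a minimal sketch of that ratio, assuming hypothetical quarterly figures (none of these numbers come from a real deployment), the calculation might look like the following. The class and field names are illustrative. The point is that estimated shadow spend sits in the denominator alongside sanctioned spend.

```python
from dataclasses import dataclass


@dataclass
class AiCosts:
    """Fully loaded AI cost for one period, in dollars (hypothetical fields)."""
    sanctioned: float  # licenses and API spend that IT approved
    shadow: float      # estimated spend on tools IT never approved


def ai_roi(attributed_value: float, costs: AiCosts) -> float:
    """Return business value attributable to AI divided by fully loaded cost."""
    total_cost = costs.sanctioned + costs.shadow
    if total_cost <= 0:
        raise ValueError("total cost must be positive")
    return attributed_value / total_cost


# Illustrative numbers only.
costs = AiCosts(sanctioned=400_000, shadow=100_000)
print(ai_roi(attributed_value=750_000, costs=costs))  # 1.5
```

Leaving the shadow figure out of this example would report a ratio of 1.875 instead of 1.5, which is why spend that IT cannot see distorts the result.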

Why do most companies fail to get ROI from AI?

MIT research found 95% of enterprise GenAI pilots produce no measurable P&L impact. The common failure isn't model choice — it's that most organizations never instrument adoption or attribute outcomes.

What is the best metric for AI ROI?

Cost per outcome: cost per resolved ticket, per shipped commit, per booked lead, compared against the manual baseline.
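A rough sketch of that comparison, assuming made-up monthly figures for a support team (the function names and dollar amounts are hypothetical), could look like this:

```python
def cost_per_outcome(total_cost: float, outcomes: int) -> float:
    """Dollars spent per unit of outcome (ticket, commit, lead)."""
    if outcomes <= 0:
        raise ValueError("outcomes must be positive")
    return total_cost / outcomes


def saving_vs_baseline(ai_cost: float, manual_cost: float) -> float:
    """Fraction of the manual per-outcome cost removed by the AI workflow."""
    return (manual_cost - ai_cost) / manual_cost


# Illustrative numbers only: 4,000 resolved tickets in a month.
ai = cost_per_outcome(total_cost=12_000, outcomes=4_000)      # AI-assisted
manual = cost_per_outcome(total_cost=40_000, outcomes=4_000)  # manual baseline
print(ai, manual, saving_vs_baseline(ai, manual))
```

A negative saving would mean the AI workflow costs more per outcome than the manual process it replaced.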

What are the three stages of AI transformation?

The three stages are See (visibility into what AI you run, what it costs, and what it returns), Control (budgets, guardrails, and policy), and Optimize (model routing, vendor consolidation, and cost reduction). They are sequential — each stage depends on the one before it.

Why do most AI transformation efforts fail?

Most fail on sequencing, not technology. Enterprises try to control or optimize AI spend before they can see their full AI footprint. Skipping the See stage causes the later stages to collapse.

What is AI tool sprawl?

AI tool sprawl is the accumulation of unsanctioned, uncoordinated AI tools across an organization, each adopted by a team in isolation and none mapped to a shared standard, budget, or outcome. Most of these tools never reach IT's inventory.

What does 'AI tokens are the new currency' mean?

AI tokens have started to function like money: they are metered, spent, and increasingly substituted for things people used to pay for in dollars or time. Some companies have begun allocating token budgets to employees on top of salary.

How should companies measure AI token ROI?

By shifting the unit from tokens consumed to value produced per token: tokens per support ticket resolved, per deal closed, or per engineer-hour saved. The goal is a ledger that ties every token of spend to a business outcome.

The AI Pricing Shift: From Fixed Seats to Variable Bills

Category: AI Pricing Shift, AI Budget Forecasting

Published date: Jun 2, 2026

Every major AI vendor you depend on changed how it charges you in the last twelve months. Cursor, Anthropic, GitHub Copilot: all three moved primarily to consumption billing. The AI budget your board signed off on in Q4 was built on per-seat math. The invoice landing now runs on per-token math. Those two numbers no longer reconcile, and the gap is widening every quarter.

This is the AI pricing shift. And most CEOs are still managing AI like a subscription when it has quietly become a variable cost.

The flat-rate era is over

You budgeted for AI the way you budget for software: a number of seats times a price per seat, locked for the year. That model is gone.

GitHub Copilot moved every plan to usage-based billing on June 1, 2026. The Business plan still reads $19 per user, but now includes only $19 of AI credits, with everything above that billed at API rates. Anthropic stripped the bundled token discount from Claude Enterprise seats in April 2026, a change licensing experts told The Register could “potentially triple costs” for some customers. And when Cursor quietly swapped its Pro plan’s pricing in mid-2025, one user reported burning $350 in overages in a single week against a $20 mental model.
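To make the gap between seat math and metered math concrete, here is a small sketch using the $19 seat fee and $19 of included credits cited above. It assumes, for illustration only, that credits apply per user and do not pool across the team, and that each user's usage is already expressed in dollars at API rates. The usage figures are invented.

```python
def monthly_bill(seat_fee: float, included_credits: float,
                 usage_by_user: list[float]) -> float:
    """Seat fees plus any per-user usage above the included credits.

    Assumption: credits are per user and unused credits are not shared.
    """
    seats = seat_fee * len(usage_by_user)
    overage = sum(max(0.0, used - included_credits) for used in usage_by_user)
    return seats + overage


# Hypothetical dollar value of each user's monthly usage at API rates.
usage = [5.0, 19.0, 60.0, 250.0]

per_seat_budget = 19.0 * len(usage)         # what seat math predicts
metered = monthly_bill(19.0, 19.0, usage)   # what the invoice shows
print(per_seat_budget, metered)  # 76.0 348.0
```

In this invented team of four, one heavy user accounts for most of the overage, which is the pattern a per-seat budget cannot express.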

Three vendors. One direction. The seat fee is now a cover charge, and the real bill is metered.

What is consumption billing, and why does it break your budget?

Consumption billing is a pricing model where you pay for AI by the token consumed, not by the seat assigned. Your bill moves with usage, not with headcount.

That single change rewires the economics. A subscription is a fixed cost: predictable, flat, easy to forecast a year out. Token consumption behaves like cost of goods sold: it scales with activity, spikes with demand, and varies in ways that have nothing to do with how many people you hired. You’re carrying AI on your P&L as a fixed cost. Your vendors have repriced it as a variable one.

And it’s not even a stable variable. A widely circulated industry essay, “The Token Budget Wars,” makes the point that two runs of the same workflow on the same input can differ in token cost by five to ten times, with nothing visibly going wrong. Retry loops, context inflation, and over-routing to frontier models for trivial tasks all compound silently. So the line item you set once a year is now driven by mechanics your finance team cannot see and your vendors can reprice overnight.
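A toy model can show how those mechanics compound. The sketch below assumes an agent-style workflow in which every call resends the accumulated context, and it uses hypothetical token counts and prices that do not reflect any vendor's actual rates. It is not taken from the essay cited above.

```python
def run_tokens(steps: int, base_context: int, tokens_per_step: int,
               retries_per_step: int) -> int:
    """Total tokens for one run where each call resends the full context."""
    total = 0
    context = base_context
    for _ in range(steps):
        attempts = 1 + retries_per_step
        # Every attempt pays for the whole context plus the new output.
        total += attempts * (context + tokens_per_step)
        # Only the successful output is appended to the context.
        context += tokens_per_step
    return total


def run_cost(tokens: int, price_per_1k: float) -> float:
    """Dollar cost of a run at a flat price per 1,000 tokens."""
    return tokens * price_per_1k / 1000


# Same workflow, same input: a clean run on a cheaper model...
clean = run_cost(run_tokens(5, 2_000, 500, retries_per_step=0), 0.01)
# ...versus two silent retries per step, routed to a model priced 2x higher.
messy = run_cost(run_tokens(5, 2_000, 500, retries_per_step=2), 0.02)
print(clean, messy, messy / clean)  # the ratio is about 6x
```

Neither run fails or returns a visibly different answer in this model. The difference shows up only on the invoice.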

If Microsoft can’t forecast this, neither can your FP&A team

Here is the part that should reframe your next board conversation. Microsoft’s own engineering division cancelled most internal Claude Code licenses in 2026, six months after rollout, because engineers burned through the team’s entire annual AI budget. Uber exhausted its full 2026 AI coding budget by April, and its COO told staff he could not draw a clear line from token spend to consumer outcomes.

These are two of the most sophisticated technical organizations on earth, with every tooling and talent advantage available. They could not forecast their own AI consumption. That isn’t a process failure at Microsoft and Uber. It’s a structural property of consumption-billed AI. Which means the right question for a CEO is not “why can’t my CFO forecast this?” It’s “what layer do we need so the forecast becomes possible?”

The answer is the same one any business uses to manage a variable cost it cares about: real-time instrumentation. You don’t manage COGS off a quarterly invoice. You meter it as it’s incurred. A vendor-neutral view of consumption, updated continuously and attributed to teams and outcomes, is what turns an unforecastable line item back into a managed one. That visibility layer is the problem Guickly exists to solve.
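As an illustration of what metering as it's incurred can mean in practice, here is a minimal, vendor-neutral ledger sketch. It is a hypothetical design, not a description of how any particular product works. The event fields and class names are assumptions chosen for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageEvent:
    """One metered AI call, recorded when it happens (hypothetical schema)."""
    vendor: str
    team: str
    workflow: str
    cost_usd: float


class SpendLedger:
    """Accumulates usage events and totals them along any dimension."""

    def __init__(self) -> None:
        self._events: list[UsageEvent] = []

    def record(self, event: UsageEvent) -> None:
        self._events.append(event)

    def total_by(self, field: str) -> dict[str, float]:
        """Sum cost grouped by 'vendor', 'team', or 'workflow'."""
        totals: dict[str, float] = defaultdict(float)
        for event in self._events:
            totals[getattr(event, field)] += event.cost_usd
        return dict(totals)


ledger = SpendLedger()
ledger.record(UsageEvent("vendor_a", "support", "ticket_triage", 1.5))
ledger.record(UsageEvent("vendor_b", "support", "ticket_triage", 2.5))
ledger.record(UsageEvent("vendor_a", "engineering", "code_review", 4.0))

print(ledger.total_by("team"))    # {'support': 4.0, 'engineering': 4.0}
print(ledger.total_by("vendor"))  # {'vendor_a': 5.5, 'vendor_b': 2.5}
```

Because each event carries its team and workflow at write time, the same records answer both the finance question (spend by vendor) and the attribution question (spend by outcome-producing workflow).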

What it looks like when you can see it

In a 1,000-person enterprise we modeled, AI spend ran $558k over 90 days across OpenAI, Anthropic, Copilot, and a tail of smaller tools. Set that as a flat annual budget and the first usage-based renewal blows the model apart. Instrument it instead, and the same spend becomes legible: which vendor, which team, which workflow, and crucially, which dollars are producing a return. The enterprises that win the next renewal cycle aren’t the ones that spent least. They’re the ones that could see the meter running before the invoice arrived.
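For a sense of scale, the 90-day figure can be turned into an annual run rate. The sketch below does that arithmetic, then adds a projection under a hypothetical 10% quarter-over-quarter usage growth rate. The growth rate is an assumption for illustration and is not part of the modeled enterprise.

```python
def annualized_run_rate(spend: float, days_observed: int,
                        days_in_year: int = 365) -> float:
    """Scale observed spend to a full year, assuming usage stays flat."""
    return spend / days_observed * days_in_year


def four_quarter_projection(first_quarter_spend: float,
                            quarterly_growth: float) -> float:
    """Sum four quarters of spend growing at a fixed rate each quarter."""
    return sum(first_quarter_spend * (1 + quarterly_growth) ** q
               for q in range(4))


flat = annualized_run_rate(558_000, 90)
print(flat)  # 2263000.0

# Hypothetical: usage grows 10% each quarter instead of staying flat.
growing = four_quarter_projection(558_000, 0.10)
print(round(growing - 4 * 558_000))  # extra dollars versus a flat budget
```

A flat annual budget built from the first quarter would miss whatever the second figure adds, and under consumption billing that difference arrives as an invoice rather than as a renewal negotiation.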

The question to take into your next board meeting

Stop asking “what did we budget for AI this year?” That question assumes a fixed cost that no longer exists. Ask instead: “do we see our AI consumption in real time, and can we attribute it to outcomes?” If the answer is no, your AI budget is a guess that your vendors get to revise without telling you.

From AI chaos to AI advantage starts with seeing the meter.

FAQ

What is the AI pricing shift?

The AI pricing shift is the move by major AI vendors away from flat per-seat subscriptions toward usage-based, consumption billing. In a twelve-month span, Cursor, Anthropic, and GitHub Copilot all changed their models so that token consumption, not seat count, drives the bill.

What is consumption-based AI billing?

Consumption-based billing charges for AI by the token or unit of usage rather than a fixed monthly seat fee. The seat fee becomes a small base charge, and the majority of the cost scales with how much the tool is actually used, which makes the bill behave like a variable cost rather than a fixed one.

Why can’t enterprises forecast their AI spend?

Because consumption-billed AI is structurally hard to predict. The same workflow can vary in token cost by five to ten times, retry loops and context inflation compound silently, and usage isn’t tied to headcount. Even Microsoft and Uber, with full engineering resources, blew through their 2026 AI budgets early.

How is usage-based AI pricing different from SaaS pricing?

SaaS pricing is a fixed cost: a set number of seats at a set price, predictable for a year. Usage-based AI pricing behaves like cost of goods sold: it scales with activity, spikes with demand, and varies independently of how many people you employ. The two require completely different budgeting and monitoring approaches.

What should a CEO do about the AI pricing shift?

Stop treating AI as a fixed annual line item and start managing it like a variable cost. That means real-time, vendor-neutral visibility into consumption, attributed to teams and outcomes, so you can see usage as it’s incurred rather than reacting to a surprise invoice or a mid-year renewal.

Your AI transformation

starts with visibility.

See every AI tool. Track every dollar. Control every budget. Optimize every call. One platform, live in under an hour.

Talk to Founder

GUICKLY

The AI Transformation Platform

Guickly gives enterprises complete visibility and control over their AI transformation from adoption through optimization. Trusted by teams that are AI-first.
