Shadow AI is any AI tool, model, or agent in use across an enterprise that hasn't been sanctioned, inventoried, or budgeted. It includes free-tier LLM accounts, IDE coding assistants, AI features auto-enabled inside sanctioned SaaS, browser extensions that route prompts to third-party models, and single-purpose agents employees provision on personal accounts.

How much bigger is shadow AI than the AI IT knows about?

Most AI use never reaches IT. Microsoft's Work Trend Index found 78% of employees who use AI at work bring their own tools rather than company-approved ones, so what IT can see is only a fraction of what is running.

What are the risks of shadow AI?

The primary risk is data exposure. When employees use personal AI accounts, company data leaves your environment for a third party you have no contract with. IBM found shadow-AI-related breaches cost $670,000 more on average, and 65% of them involved customer PII.

What are the three stages of AI transformation?

The three stages are See (visibility into what AI you run, what it costs, and what it returns), Control (budgets, guardrails, and policy), and Optimize (model routing, vendor consolidation, and cost reduction). They are sequential — each stage depends on the one before it.

Why do most AI transformation efforts fail?

Most fail on sequencing, not technology. Enterprises try to control or optimize AI spend before they can see their full AI footprint. Skipping the See stage causes the later stages to collapse.

What is the AI pricing shift?

The AI pricing shift is the move by major AI vendors away from flat per-seat subscriptions toward usage-based, consumption billing. Cursor, Anthropic, and GitHub Copilot all changed their models so that token consumption, not seat count, drives the bill.

Why can't enterprises forecast their AI spend?

Because consumption-billed AI is structurally hard to predict. The same workflow can vary in token cost by five to ten times, and even Microsoft and Uber blew through their 2026 AI budgets early.

What is AI tool sprawl?

AI tool sprawl is the accumulation of unsanctioned, uncoordinated AI tools across an organization, each adopted by a team in isolation and none mapped to a shared standard, budget, or outcome. Most of these tools never reach IT's inventory.

What does 'AI tokens are the new currency' mean?

AI tokens have started to function like money: they are metered, spent, and increasingly substituted for things people used to pay for in dollars or time. Some companies have begun allocating token budgets to employees on top of salary.

How should companies measure AI token ROI?

By shifting the unit from tokens consumed to value produced per token: tokens per support ticket resolved, per deal closed, or per engineer-hour saved. The goal is a ledger that ties every token of spend to a business outcome.

How to Measure AI ROI: A Leadership Framework

Q: What is the best metric for AI ROI?

Cost per outcome: cost per resolved ticket, per shipped commit, per booked lead, compared against the manual baseline.

Category: AI ROI

Published date: Jun 2, 2026

Your company will spend more on AI this year than last, probably by a lot. Ask your leadership team what last year’s spend returned and you’ll get a number nobody can defend. That’s not a local failure. 95% of enterprise GenAI pilots produce no measurable P&L impact, according to MIT’s 2025 research. The 5% that do have one thing in common, and it isn’t which model they picked.

They measure AI ROI properly. Almost nobody else does.

The measurement gap is the real gap

Most leadership teams measure AI by looking at the bill. The bill tells you what you spent. It tells you nothing about what you got. So the board conversation stalls on the only number anyone can produce, and the number is an input, not a return.

The data backs the pattern. McKinsey’s State of AI found only about 6% of companies attribute more than 5% of EBIT to AI, and only 39% report any enterprise-level EBIT impact at all. It isn’t that AI doesn’t work. It’s that most organizations can’t see whether it’s working. Spend is the lagging indicator of AI ROI. The things that predict ROI sit upstream, and almost no one instruments them.

What is AI ROI, actually?

AI ROI is the business value attributable to AI divided by the fully loaded cost of producing it, counting both sanctioned and shadow spend.

Both halves of that fraction are usually broken. The denominator is wrong because shadow AI hides 20% to 40% of real spend outside the budget finance can see. The numerator is missing entirely because most companies never attribute outcomes to the tools that produced them. You can’t compute a ratio when one term is understated and the other doesn’t exist.
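To make the effect of an understated denominator concrete, here is a minimal Python sketch of the ratio defined above. The function name `ai_roi` and every dollar figure are hypothetical, chosen only so that shadow spend is 40% of the real total, the upper end of the range quoted above.

```python
def ai_roi(attributed_value: float, sanctioned_cost: float, shadow_cost: float = 0.0) -> float:
    """Business value attributable to AI divided by fully loaded cost."""
    total_cost = sanctioned_cost + shadow_cost
    if total_cost <= 0:
        raise ValueError("total cost must be positive")
    return attributed_value / total_cost


# Hypothetical figures for illustration only.
value = 1_500_000.0      # outcomes attributed to AI, in dollars
sanctioned = 600_000.0   # the spend finance can see
shadow = 400_000.0       # spend outside the budget (40% of the real total)

reported_roi = ai_roi(value, sanctioned)        # ignores shadow spend
actual_roi = ai_roi(value, sanctioned, shadow)  # fully loaded

print(f"reported: {reported_roi:.2f}x, fully loaded: {actual_roi:.2f}x")
```

With these assumed inputs the reported ratio overstates the fully loaded one, which is the sense in which the denominator is "wrong." The numerator problem is different: if `value` was never attributed, there is nothing to pass in.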

The three layers leadership must measure

A defensible AI ROI number rests on three measurements, in order. Skip any one and the number collapses.

Layer 1: Adoption. Is what you bought actually being used? This is the leading indicator, and it’s the one boards never ask for. Buying seats is not adoption. Dormant licenses produce zero return at full cost, which is negative ROI dressed as investment. Before you ask what AI returned, ask how fluent your organization is with it, by department, weighted by headcount. AI fluency is the leading indicator of AI ROI. Spend is the lagging one.
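One way to read "by department, weighted by headcount" is sketched below in Python. The `Department` fields, the definition of an active user, the seat price, and all sample numbers are assumptions made for illustration; the article does not prescribe a specific fluency formula.

```python
from dataclasses import dataclass


@dataclass
class Department:
    name: str
    headcount: int
    seats: int         # licenses purchased
    active_users: int  # people who actually used the tool in the period (assumed definition)


def weighted_adoption(departments: list[Department]) -> float:
    """Headcount-weighted average of each department's adoption rate."""
    staffed = [d for d in departments if d.headcount > 0]
    total_headcount = sum(d.headcount for d in staffed)
    if total_headcount == 0:
        raise ValueError("no headcount to weight by")
    weighted_sum = sum((d.active_users / d.headcount) * d.headcount for d in staffed)
    return weighted_sum / total_headcount


def dormant_seat_cost(departments: list[Department], seat_price: float) -> float:
    """Cost of purchased seats that nobody used: full cost, zero return."""
    dormant = sum(max(d.seats - d.active_users, 0) for d in departments)
    return dormant * seat_price


# Hypothetical organization.
org = [
    Department("Engineering", headcount=200, seats=200, active_users=150),
    Department("Sales", headcount=100, seats=100, active_users=20),
    Department("Support", headcount=100, seats=50, active_users=50),
]

adoption = weighted_adoption(org)
wasted = dormant_seat_cost(org, seat_price=30.0)
print(f"weighted adoption: {adoption:.0%}, dormant seat cost: ${wasted:,.0f}")
```

Weighting by headcount keeps a small, fully adopted team from masking a large team that barely uses its seats.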

Layer 2: Attribution. Which dollar produced which outcome? Aggregate AI spend tells you nothing actionable. Spend mapped to a workflow, a team, and an outcome tells you everything. The unit you want is cost per result: cost per resolved ticket, cost per shipped commit, cost per booked lead. When you can express AI as a cost-per-outcome, you can compare it to the manual baseline and the ROI becomes arithmetic instead of argument.
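The cost-per-outcome comparison can be sketched in a few lines of Python. The `WorkflowSpend` record and the sample workflows, costs, and baselines are hypothetical; the point is only to show spend mapped to a workflow, a team, and an outcome, then compared against the manual baseline.

```python
from dataclasses import dataclass


@dataclass
class WorkflowSpend:
    workflow: str
    team: str
    ai_cost: float                  # AI spend attributed to this workflow
    outcomes: int                   # e.g. tickets resolved, leads booked
    manual_cost_per_outcome: float  # baseline cost of doing it without AI


def cost_per_outcome(w: WorkflowSpend) -> float:
    """AI cost divided by outcomes produced; infinite if nothing was produced."""
    if w.outcomes == 0:
        return float("inf")
    return w.ai_cost / w.outcomes


def baseline_multiple(w: WorkflowSpend) -> float:
    """Manual cost per outcome relative to AI cost per outcome (above 1 means AI is cheaper)."""
    return w.manual_cost_per_outcome / cost_per_outcome(w)


# Hypothetical workflows.
tickets = WorkflowSpend("ticket resolution", "Support", 12_000.0, 8_000, 6.0)
leads = WorkflowSpend("lead booking", "Sales", 9_000.0, 300, 24.0)

for w in (tickets, leads):
    print(f"{w.workflow}: ${cost_per_outcome(w):.2f} per outcome, "
          f"{baseline_multiple(w):.1f}x vs manual")
```

In this made-up data the support workflow beats its manual baseline while the sales workflow does not, a difference the aggregate bill would hide.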

Layer 3: Marginal utility. Is the next dollar still worth it? Foundation Capital’s Jaya Gupta calls this “marginal token utility,” the business value created by each additional dollar of inference. ROI isn’t a single annual figure. It’s a curve. Some workflows return 4x; some return nothing past a certain volume. Leadership’s job is to find where the next dollar stops paying and reallocate it. You can’t do that without the first two layers feeding it live.
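Treating ROI as a curve means looking at the return on each additional tranche of spend rather than the running average. The Python sketch below assumes you already have cumulative spend and cumulative attributed value for one workflow; the function names, the break-even threshold of 1.0, and the data points are illustrative assumptions, not a method taken from Gupta's writing.

```python
from typing import Optional


def marginal_returns(points: list[tuple[float, float]]) -> list[float]:
    """Value added per extra dollar between consecutive (cumulative_spend, cumulative_value) points."""
    returns = []
    for (s0, v0), (s1, v1) in zip(points, points[1:]):
        if s1 <= s0:
            raise ValueError("points must be sorted by strictly increasing spend")
        returns.append((v1 - v0) / (s1 - s0))
    return returns


def spend_where_returns_stop(points: list[tuple[float, float]],
                             threshold: float = 1.0) -> Optional[float]:
    """Spend level after which the next tranche returns less than `threshold` per dollar."""
    for (spend, _), m in zip(points, marginal_returns(points)):
        if m < threshold:
            return spend
    return None


# Hypothetical curve for one workflow: (cumulative spend, cumulative value).
curve = [(0.0, 0.0), (10_000.0, 40_000.0), (20_000.0, 65_000.0),
         (30_000.0, 72_000.0), (40_000.0, 73_000.0)]

margins = marginal_returns(curve)
cutoff = spend_where_returns_stop(curve)
average = curve[-1][1] / curve[-1][0]
print(f"marginal returns: {margins}, stop adding spend after ${cutoff:,.0f}, "
      f"average ROI: {average:.2f}x")
```

In this invented curve the average return still looks healthy at the highest spend level even though the last tranches return well under a dollar per dollar, which is why a single annual figure can mislead.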

Three layers. Adoption tells you it’s being used. Attribution tells you what it produced. Marginal utility tells you whether to keep spending. The bill, by itself, tells you none of it. Seeing all three in one place, sanctioned and shadow, is the problem Guickly was built to solve.

What the 5% look like when they measure

The companies getting real AI returns aren’t guessing. IBM credited its AskHR agents and related automation with $3.5B in cumulative productivity gains, a figure it could claim only because it knew exactly which functions to automate and which to expand. Goldman Sachs reported developer productivity gains of more than 20% from its firmwide assistant rollout, a number it could state only because it first measured adoption across 12,000 engineers.

The pattern holds at smaller scale too. In a 1,000-person enterprise we modeled, AI drove the cost per support ticket from $6.10 to $1.42, lifted gross margin by 4.2 points, and returned a blended 3.8x per AI dollar. None of those numbers exist without the three layers underneath them. The companies in the 95% spent on the same tools. They just never built the instrumentation to know what came back.

The question for your next board meeting

Don’t walk in with the AI spend number. Everyone has that one, and it answers nothing. Walk in with three: how fluent the organization is, what each major workflow’s cost per outcome is versus its manual baseline, and where the next AI dollar stops paying. If your team can’t produce those, you don’t have an AI ROI problem yet. You have an AI measurement problem. And the measurement problem is the one that decides whether you end up in the 5%.

FAQ

How do you measure AI ROI? Measure AI ROI as the business value attributable to AI divided by its fully loaded cost, counting both sanctioned and shadow spend. Doing that requires three upstream measurements: adoption (whether tools are actually used), attribution (which spend produced which outcome), and marginal utility (whether the next dollar still returns value). Spend alone is not a measure of ROI.

Why do most companies fail to get ROI from AI? MIT research found 95% of enterprise GenAI pilots produce no measurable P&L impact, and McKinsey found only about 6% of companies attribute more than 5% of EBIT to AI. The common failure isn’t model choice. It’s that most organizations never instrument adoption or attribute outcomes, so they can’t tell whether AI is working.

What is the best metric for AI ROI? Cost per outcome is the most useful single metric: cost per resolved ticket, per shipped commit, per booked lead, compared against the manual baseline. It turns AI ROI from an argument into arithmetic and lets leadership compare value across very different workflows.

Is AI fluency a leading indicator of AI ROI? Yes. Adoption and fluency are leading indicators because dormant licenses produce zero return at full cost. If a tool isn’t being used, no downstream ROI is possible. Spend is a lagging indicator, which is why measuring it alone tells leadership nothing about return.

What is marginal token utility? Marginal token utility, a term coined by Foundation Capital’s Jaya Gupta, is the business value created by each additional dollar of AI inference. It reframes ROI as a curve rather than a single figure: some workflows keep paying as spend grows, others stop, and leadership’s job is to find the inflection and reallocate.

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "How do you measure AI ROI?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Measure AI ROI as the business value attributable to AI divided by its fully loaded cost, counting both sanctioned and shadow spend. Doing that requires three upstream measurements: adoption (whether tools are actually used), attribution (which spend produced which outcome), and marginal utility (whether the next dollar still returns value). Spend alone is not a measure of ROI."
      }
    },
    {
      "@type": "Question",
      "name": "Why do most companies fail to get ROI from AI?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "MIT research found 95% of enterprise GenAI pilots produce no measurable P&L impact, and McKinsey found only about 6% of companies attribute more than 5% of EBIT to AI. The common failure isn't model choice. It's that most organizations never instrument adoption or attribute outcomes, so they can't tell whether AI is working."
      }
    },
    {
      "@type": "Question",
      "name": "What is the best metric for AI ROI?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Cost per outcome is the most useful single metric: cost per resolved ticket, per shipped commit, per booked lead, compared against the manual baseline. It turns AI ROI from an argument into arithmetic and lets leadership compare value across very different workflows."
      }
    },
    {
      "@type": "Question",
      "name": "Is AI fluency a leading indicator of AI ROI?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Yes. Adoption and fluency are leading indicators because dormant licenses produce zero return at full cost. If a tool isn't being used, no downstream ROI is possible. Spend is a lagging indicator, which is why measuring it alone tells leadership nothing about return."
      }
    },
    {
      "@type": "Question",
      "name": "What is marginal token utility?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Marginal token utility is the business value created by each additional dollar of AI inference. It reframes ROI as a curve rather than a single figure: some workflows keep paying as spend grows, others stop, and leadership's job is to find the inflection and reallocate."
      }
    }
  ]
}
</script>

Your AI transformation starts with visibility.

See every AI tool. Track every dollar. Control every budget. Optimize every call. One platform, live in under an hour.

Talk to Founder

GUICKLY

The AI Transformation Platform

Guickly gives enterprises complete visibility and control over their AI transformation from adoption through optimization. Trusted by teams that are AI-first.
