Why Is Measuring AI Adoption So Hard?
And why tying it to performance reviews doesn't really move the needle
I was on a call this week with someone who runs AI transformation at a mid-sized company. His CTO told him (on a Friday, of course) that his new job is to ‘make the company an AI company.’ Come back Monday with a plan. He did not come back with a plan. He came back with a set of observations, curious what they all added up to.
Copilot is mandatory at his org. Use it, or your performance review notices. Usage is measured in tokens, so… a handful of engineers have set up token-consuming factories: idle prompts firing all day, climbing the internal leaderboard. (Because there is a leaderboard now.)
The enterprise metrics platform is producing beautiful dashboards. But nobody trusts the dashboards. The dashboards only measure Copilot, and half the real AI work is happening in Claude and Codex. The CTO keeps asking, ‘How do we measure this?’ The board keeps asking, ‘How do we measure this?’ He is trying to answer, ‘How do we measure this?’
And the real question, the one nobody is asking yet, is: what is ‘this,’ actually?
See If This Feels Familiar 👇
If you lead AI transformation at your company, you are probably sitting somewhere around Waypoint 2 or 3 on the Hyperadaptive Map. Past the initial pilots. Past the first AI Champion rollouts. Deep enough that the pressure to show results is real. Early enough that the system is not actually producing results yet.
This is the stretch where mandates get issued. Because mandates feel like action. Copilot for everyone. AI usage tied to reviews. Adoption targets on the board deck.
JPMorgan just tied AI adoption to performance reviews for 65,000 engineers. Light user, heavy user, non-user. That is the category you get sorted into. That is your performance story now.
The logic is clean. The outcomes are not.
Why The Dashboard Is Lying To You
Three days ago, TechCrunch published a piece that named a phenomenon I have been watching show up in every Waypoint 2 organization I work with. They called it ‘tokenmaxxing.’ Engineers running idle AI agents to farm usage. Fragmenting prompts so each one counts separately. Doing whatever the metric rewards.
The numbers inside this are not subtle. One study found that engineers with the largest token budgets produced the most pull requests, yes, but at twice the throughput for ten times the cost: roughly five times the spend per pull request. The measurement metric and the business metric diverge sharply.
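To see the divergence concretely, here is a minimal sketch with hypothetical numbers (the dollar figures and PR counts are mine, chosen only to match the study's 2x-throughput, 10x-cost shape):

```python
# A minimal sketch, with hypothetical numbers, of how the measured
# metric (tokens consumed) and the business metric (cost per merged
# pull request) pull in opposite directions.

def cost_per_pr(prs_merged: int, token_cost_usd: float) -> float:
    """Business metric: dollars of model spend per merged pull request."""
    return token_cost_usd / prs_merged

# Baseline engineer: modest token budget.
baseline = cost_per_pr(prs_merged=10, token_cost_usd=100.0)

# Heavy user, per the study's shape: 2x the throughput, 10x the cost.
heavy = cost_per_pr(prs_merged=20, token_cost_usd=1000.0)

print(f"baseline: ${baseline:.2f}/PR, heavy: ${heavy:.2f}/PR")
# baseline: $10.00/PR, heavy: $50.00/PR
# The token dashboard ranks the heavy user 10x higher. The business
# metric says each of their pull requests costs 5x as much.
```

Whatever the real numbers are at your company, the shape is the same: a dashboard sorted by tokens and a ledger sorted by cost per outcome will rank the same engineers in opposite orders.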
Meta has built an internal leaderboard for employee AI token usage. Salesforce is publicly pushing back, calling token consumption a vanity metric and proposing something they call Agentic Work Units instead. HRZone has been running stories on what they are calling ‘performance theatre,’ with Reddit threads of employees admitting they fabricate AI usage to satisfy mandates they privately think are wrong.
And 74% of companies buying AI tools still cannot show tangible business value from the investment. The gap between ‘we bought the seats’ and ‘we got the outcome’ is where most of these programs are sitting right now.
The dashboard is lying because the measurement system is operating on the wrong layer.
Why ‘Let’s Just Measure Better’ Doesn’t Work
When an organization is bifurcating (a small group of power users pulling ahead while everyone else runs token factories to meet the mandate), the default instinct is to fix the measurement. Better metrics. Story points adjusted for AI. Business value per commit.
I understand the instinct. I have watched smart people chase it for the last year. It does not work, and I think I know why.
The measurement disconnect is structural. It comes from bolting AI adoption onto a Linear Organization: the hierarchical, siloed org chart where strategy flows down through layers and work moves sequentially across departments.
You cannot measure your way out of a structural problem. You can only redesign the work.
MIT Sloan’s research lines up with what I see in the field. Organizations that build systematic feedback loops between humans and AI are six times more likely to derive substantial financial benefits from AI. Not organizations that measure better. Organizations that design different feedback architecture into the work itself.
When did the measurement system at your company last get redesigned for what AI actually changes about the work?
So What Do You Actually Do?
The Hyperadaptive Map names three specific Moves for organizations stuck between Waypoint 2 and Waypoint 3. All three are about work design, not measurement.
The first is standing up a proper AI Activation Hub, which is almost the opposite of mandatory Copilot. An Activation Hub is a small, cross-functional team whose job is to find valuable AI work, measure it, make it repeatable, and teach peers how to do it. Not a center of excellence. Not a governance body. A learning engine.
The second is formalizing AI Leads inside each domain. These are the power users who are already pulling ahead. Instead of letting them keep pulling ahead, you change their job description. Their new job is to spread the judgment they have developed, peer to peer. The bifurcation does not close because you punish the non-adopters. It closes because the adopters become teachers.
Read More: Why Appointing AI Leads Isn’t Enough (And What to Do Instead)
The third is replacing your adoption metrics with AI Learning Flywheel metrics. Instead of ‘how many people used Copilot this week,’ you measure ‘how many new valuable AI patterns did we identify, package, and spread this month.’ The unit of counting changes from users to patterns. Everything downstream of that changes.
None of these Moves is measurable against token usage. That is the point.
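To make that concrete, here is a minimal sketch of what counting patterns instead of users might look like. The Pattern record, its fields, and the example entries are all hypothetical illustrations of mine, not definitions from the Hyperadaptive model:

```python
# A minimal sketch, assuming a hypothetical pattern-tracking record,
# of flywheel metrics that count patterns rather than users.
from dataclasses import dataclass

@dataclass
class Pattern:
    name: str
    identified: bool    # found, and shown to produce value somewhere
    packaged: bool      # written up so a peer team can repeat it
    teams_adopted: int  # spread: teams using it beyond the originator

def flywheel_metrics(patterns: list[Pattern]) -> dict[str, int]:
    """Count the flywheel stages: identified -> packaged -> spread."""
    return {
        "identified": sum(p.identified for p in patterns),
        "packaged": sum(p.packaged for p in patterns),
        "spread": sum(p.teams_adopted >= 2 for p in patterns),
    }

month = [
    Pattern("LLM-drafted migration scripts", True, True, 4),
    Pattern("AI-generated test scaffolds", True, True, 1),
    Pattern("Prompted incident summaries", True, False, 0),
]
print(flywheel_metrics(month))
# {'identified': 3, 'packaged': 2, 'spread': 1}
# None of these counters can be inflated by idle prompts or
# fragmented requests. You move them by teaching, not by farming.
```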
He is going to walk into his Monday one-on-one with no plan, because he already knows the plan is not the answer. What he is bringing back is an observation: that the company is living inside the pattern, the measurement system is part of the pattern, and you cannot escape a pattern by measuring it harder.
If you are in the same seat, I would love to know what you are seeing. Are your dashboards producing confidence, or confusion? Are your power users spreading judgment, or climbing past everyone else while the token factories run?
Hit reply. I read every one.
If this is showing up in your organization, the Waypoint Finder is free and takes five minutes. It returns your current waypoint and the three Moves most likely to accelerate you.
Get the Deep Dive
If what I write is resonating, consider getting the book. Trust me, it is $$ well spent.
Or, better yet, trust John Ford and his review.
Hyperadaptive: Rewiring the Enterprise to Become AI-Native releases May 12th. Pre-order at hyperadaptive.solutions/book. The book covers the full model; this essay is one waypoint inside it.


