From Asking to Delegating: What Changes When AI Starts Doing the Work | The AI Leader Lab

You've probably used ChatGPT to tidy up an email or pull the details out of a long report. Useful stuff, but notice what's happening in that exchange: you ask, the model answers, and you do everything that comes next.

Agentic AI moves the line. Instead of just answering, the model is connected to tools and carries out a task, or a series of tasks. It opens files, runs steps, drafts the thing, checks its own work, and comes back when it needs you. That's a different relationship, and it changes the question a leader is actually facing.

Most of the AI conversation aimed at leaders right now is some version of "we need to use it" and "get your people using it." Roll out the licences, run a lunch-and-learn, count the logins and maybe the overall licence cost.

The problem is that usage tells you very little once the tool can do the work. A team can all be "using AI" and have changed nothing about how the work happens. The better question is not whether people are using it, but what they're actually handing over, and whether the work has been rebuilt around that or just sped up.

A 130-year-old lesson about electric motors

When factories first swapped steam engines for electric motors, productivity barely moved. The economist Paul David showed why (David, 1990). The factories kept their old steam-era layouts, with everything crowded around one central power source, and they simply dropped an electric motor where the steam engine used to sit. Same floor plan, but with a new motor.

The real gains came years later, when firms redesigned the factory floor around what electric power could now do. Smaller motors meant you could put power exactly where you needed it and rearrange the line however the work demanded. That redesign, not the motor itself, was the thing that mattered.

Agentic AI is sitting at the early stage of that curve. Bolt it onto the old way of working and you get a faster version of what you already had. The bigger shift comes from rebuilding the work around what can now be delegated. In my own framework I call the bolt-on version Addition, and the rebuilt version Multiplication. The history says the second one is where the value lives, and it takes deliberate and persistent effort to get there.

The elephant in the room

The clearest data on all this comes from a recent OpenAI study of its own coding-and-work agent, Codex (Johnston et al., 2026). OpenAI researchers, studying OpenAI's own product, using OpenAI's own staff as the headline example. It's worth reading with that firmly in mind, and the authors say so themselves: their own employees are not representative of a normal organisation. Internally there are no usage limits, nearly everyone is fluent with the tools, and the buy-in is total.

Of course the numbers look extreme.

So I'm not treating their frontier figures as a target. I'm treating them as a picture of what this looks like when every obstacle is taken away. And the structural lesson underneath, the electric-motor point, doesn't depend on OpenAI at all. It rests on decades of independent work on why new technologies take so long to pay off: the gains tend to arrive late, because firms have to rebuild themselves around the technology before the payoff shows up.

As the work shifts toward delegation, the people who can frame a task well and judge whether it came back right get more valuable, not less (Tambe, 2026). And the delegation argument the OpenAI paper leans on traces back, in part, to a study from Anthropic, one of its own rivals (Hitzig et al., 2026). I find that quietly reassuring. When two competitors land on the same finding, it's usually because the finding is real.

The size of what people hand over is climbing fast

One number in the OpenAI study stands out. Among ordinary individual users, not OpenAI staff, the share handing the model a task that would take a person an hour or more climbed from 35.4% last December to 70.2% by May. The eight-hour version, a full working day of effort, went from 2.1% to 25.6% over the same six months. A quarter of users are now delegating a day's work in one go, where six months earlier almost nobody did.

OpenAI: Distribution of users by peak task complexity

That's the shift in a couple of figures. The work people hand over is getting bigger, and quickly.

What actually happened in the non-technical teams

Here's the bit that I think matters most for someone who isn't a software engineer.

Inside OpenAI, the agentic tool started where you'd expect, with the developers. Then it spread outwards to other teams, and the striking examples are legal and recruiting.

In January those teams were close to zero usage. By early April they were doing around a fifth of their AI work on the agent, and within roughly a month that jumped to about three quarters. Not a slow creep but a monumental shift, once the conditions were right. By June, legal had reached 88% of its AI work on the agent and recruiting 89%, not far behind engineering at 99%.

OpenAI: OpenAI workers' share of output tokens, by job function

And what were they using it for? Mostly knowledge work, not code. Documents, reports, summaries, the written and structured outputs that fill a normal working week. For the finance and accounting people specifically, producing and working through knowledge artefacts was the single biggest category of use, at 29%, ahead of any kind of coding. Spreadsheets and reports, not software.

What pushed it along wasn't clever technology, by the authors' account. It was training sessions, regular feedback loops, access that didn't make people ration themselves, and a culture where people shared the shortcuts they'd found. Unglamorous, and mostly a people-and-habits job rather than a tooling one.

Now hold that against the outside world. In ordinary organisations, legal is the lowest adopter of agentic tools, somewhere around 1.9% of the average legal user's work. Set that next to the 88% those teams reached inside OpenAI. The same models are available to everyone; the difference is entirely in the conditions around them. That gap is the whole argument.

A few things to ponder

The honest version of "what about jobs" isn't "nothing changes." The work redistributes. Routine execution matters less, and judgement and review matter more. That's a redesign of roles, and pretending otherwise insults a senior reader.
Senior people aren't exempt from this. The data shows agentic use rising across every seniority level. Senior individual contributors and principals sit at around 12% of their AI work on the agent, and managers and directors at 5%, but every level is moving, and the senior end leans on it for planning and review rather than typing. If the top of the house treats it as something for juniors, that's a signal in itself.
The cheapest thing to copy from the frontier example isn't the technology. It's the wrapper around it: a feedback loop, plus access that doesn't make people count their questions.
Verifiable work moves first. Software went first because you can check it, so the closer a task is to "you'd know quickly if it were wrong," the sooner an agent can take a real swing at it. Worth knowing where your own easy-to-check work sits.

The bigger picture, and the bit that's on us

Whether agentic AI lands as a cutting exercise or a building one is mostly a leadership choice, not a technology outcome.

Point it at the organisation as "where can we trim" and you get fear, thinner roles, and people watching the exits. Point it at "what could our people do that they genuinely can't do today" and you get something else entirely. The tool is the same but the framing, and the redesign that follows, is the job.

It would be easier if this were a procurement decision, but it isn't. It's a decision about how the work gets organised, and what we free people up to actually do. The teams that got the most out of it didn't have better software than everyone else. They had the conditions, and someone deliberately built those conditions.

If you've used a chatbot at work and wondered what the fuss is about, that instinct is fair. The chatbot was the warm-up. The real question is what you'd rebuild if the work could be handed over rather than just sped up.

♻️ Found this useful? Repost it to your network.

🔔 Follow Martin Wheatley for more on AI for leadership.

✅ Helping leaders become AI-confident. Putting AI to work with your people, without replacing what makes them great.

⭐️ If that's the conversation you're interested in having, leave a comment, drop a DM, book a call, or visit my website. All links in my profile.

References

Johnston, D., Holtz, D., Richmond, A. M., Ong, C., Tambe, P., and Chatterji, A. (2026). The Shift to Agentic AI: Evidence from Codex. OpenAI. https://openai.com/index/how-agents-are-transforming-work/

David, P. A. (1990). The Dynamo and the Computer: An Historical Perspective on the Modern Productivity Paradox. American Economic Review, 80(2), 355-361.

Tambe, P. B. (2026). Reskilling the Workforce for AI: Domain Expertise and Algorithmic Literacy. Management Science, 72(1), 515-537.

Hitzig, Z., Massenkoff, M., Lyubich, E., Heller, R., and McCrory, P. (2026). Agentic Coding and Persistent Returns to Expertise. Anthropic. https://cdn.sanity.io/files/4zrzovbb/website/433472e34b60db1a52ebf0b8c6600f057b6908c5.pdf