Claims of productivity gains from AI coding tools aren’t consistent

There’s a lot of noise around AI tools like GitHub Copilot, Cursor, and similar platforms claiming to supercharge developer productivity. The story usually sounds the same: faster coding, smarter suggestions, accelerated delivery. And yet, when we look away from the marketing and into the actual workflows of engineering teams, the results are far more modest. According to LeadDev’s 2025 Engineering Leadership Report, only 6% of 617 surveyed engineering leaders observed substantial productivity gains from these tools.

That’s the data. Just 6%.

Now compare that to the messaging from the vendors themselves. GitHub claims 88% of Copilot users “feel” more productive. Jit’s Director of Engineering, Daniel Koch, went on record stating his team delivers features up to three times faster using Cursor. JPMorgan Chase’s Global CIO, Lori Beer, shared a 10–20% improvement among internal engineering teams using a homegrown AI assistant. These numbers sound great, and in isolated, controlled environments, they might even be accurate. But “feeling” productive and achieving sustainable, measurable productivity are different things.

You can feel fast, but are you actually moving faster, and in the right direction?

As with any new tech, perceived success can get inflated fast, especially when it’s driven from the top without close scrutiny of daily execution. That disconnect is starting to show. Developers are using the tools, but the dramatic leaps promised by early AI hype aren’t showing up in delivery metrics, sprint performance, or overall output. In most cases, we’re seeing incremental improvements, if that. And for engineering leaders tasked with driving performance at scale, isolated success stories aren’t enough. They need compounding value across the organization.

That’s where the current narrative around AI coding tools breaks apart. We’re chasing scale from examples that may not scale. It’s not that the tools lack potential; they clearly do. But real productivity isn’t happening just because someone saved five minutes writing a function. It happens when systems evolve, friction gets removed, and teams solve problems faster. We’re not quite there yet.

Executives should treat these early results as signals, not solutions. Invest, iterate, stay close to your frontline engineers. Skip the noise. Focus on what’s actually moving the needle.

AI coding tools are primarily focused on code generation and related tasks

There’s no question AI tools are becoming part of the developer’s daily environment. Since the release of ChatGPT in late 2022, tools like GitHub Copilot and Cursor have been integrated directly into the coding workspace, inside popular editors like VS Code and the JetBrains IDEs. Developers are using them for what they do well: code generation, refactoring, documentation. That’s where the bulk of usage is happening. According to the LeadDev Engineering Leadership Report, 47% of engineering leaders noted their teams use AI for code generation, 45% for refactoring, and 44% for documentation.

These are useful tasks. They save time, and they reduce repetitive load. But they’re not where the biggest problems are. The hard constraints live elsewhere: in test feedback loops, deployment delays, and communication gaps between engineering and other teams. AI usage in areas like bug fixing stands at just 22%. Internal communication? Barely 28%. That’s where time is wasted. That’s where progress stalls.

When code is easy to write but slow to ship, you haven’t solved the real problem.

Many tools today focus on the visible parts of the development process. They don’t address the underlying inefficiencies that stretch timelines and bleed momentum. Rebecca Murphey, Field CTO at Swarmia, pointed out that delays often come not from the hands-on typing of code, but from the waiting: waiting for tests to pass, for builds to complete, for approvals to finalize. These aren’t coding problems. They’re system inefficiencies. And AI isn’t currently being used strategically enough to fix them.

Here’s the nuance: AI applied inside a narrow scope creates limited impact. Organizations that view AI as a feature enhancement will get feature-level gains. Those that want to unlock exponential improvement have to be willing to address foundational challenges, where the real friction exists. That means teams need to move beyond asking how AI can help them write code faster, and start asking where they’re losing time across the entire software delivery process.

Executives looking to scale engineering effectiveness should rethink how AI tools are deployed. Not to automate isolated developer tasks but to resolve persistent, structural friction. Focus your efforts where human cycles are being wasted, not just where AI can autocomplete a function. The bigger gains live behind the scenes.

Top-down adoption of AI tools often misses the real blockers

There’s a pattern we’re seeing across many organizations: AI tools are being brought in by leadership teams who want fast improvements, but developers, the people doing the work, are often left out of the decision-making process. The result is clear. Tools get deployed to fix problems that aren’t actually slowing teams down, while the real blockers go unaddressed.

Andrew Zigler, Senior Developer Advocate at LinearB, called out the core issue: most tools being rolled out today focus solely on writing code. That’s just one part of the process. When you introduce tools without developers weighing in, you risk solving problems that aren’t priorities or introducing new friction where there was none.

For AI to deliver actual impact, it has to be integrated where real pain exists. That integration doesn’t start with tool selection; it starts with identifying what’s broken or inefficient in your workflows. Laura Tacho, CTO at DX, made this point clear: leaders need to move away from pitching AI as a silver bullet. Instead, they need to precisely map where developers are losing time or facing barriers, and align AI capabilities to those points.

Early adoption often leans toward top-down enthusiasm. It’s a common dynamic: tools get pitched as transformative, then purchased and handed out. But in the absence of bottom-up validation from engineers, deployment loses traction. Murphey reinforced this. You don’t start with a solution. You start by identifying the exact point of friction, work through it, and then repeat that process until you’ve reduced the real constraints.

The nuance for executives is simple: don’t let vision outrun insight. Yes, it’s leadership’s job to aim high and invest in innovation. But achieving meaningful change in engineering, especially with AI, is not about mass distribution. It’s about precision. Find the bottlenecks. Listen to your developers. Bring them in early. If you don’t, you risk implementing tools that offer little value and add management overhead.

Adopting AI because it’s the trend isn’t a strategy. Solving for the exact problem your team faces, then scaling that solution, creates real leverage.

Successful AI integration in engineering requires a collaborative approach

AI in software development won’t transform your organization if it’s treated as a license rollout. The tools alone don’t solve anything. What matters is how they’re used, and more importantly, why they’re being used. If your development teams aren’t involved in that conversation from the start, all you’re doing is spending budget on software that might never meaningfully integrate with the way your people work.

Laura Tacho was clear on this point: if you want change across the organization, you need contributions from across the organization. That starts by grounding your AI strategy in real feedback from the developers who understand the bottlenecks and friction points inside their sprints, systems, and teams. It’s not about passing out access and seeing what sticks; it’s about coordination and alignment around a clear improvement path.

Rebecca Murphey pointed this out as well. Most efforts fail because leadership jumps straight to solution mode. But the real value unlocks when you start with a detailed problem diagnosis: what’s slowing teams down, where are systems breaking, and how do you progressively remove those points of inefficiency? That’s the correct sequence. Anything else just adds complexity.

The nuance for leaders: rollouts don’t equal results. You’re not aiming for superficial adoption. You’re aiming for measurable performance gains. That happens when developers trust that the tools being introduced are built to solve something important, not just handed to them because someone at the top saw a demo. AI can lead to faster delivery, higher-quality software, and lower burnout, but only where it’s applied with precision and backed by shared ownership across the organization.

Move deliberately. Don’t throw tech at teams and expect impact. Start from the ground up, align teams, and execute on the parts that move output metrics. Then scale. That’s how AI drives real operating leverage in engineering.

Main highlights

  • AI productivity claims are overstated: Most engineering leaders (94%) don’t see significant productivity gains from AI coding tools, despite vendor marketing. Leaders should challenge inflated claims and focus on internal performance metrics before scaling adoption.
  • AI tools address surface tasks, not core blockers: While adoption is high for code generation and refactoring, AI rarely impacts deeper issues like testing delays or team communication. Executives should target AI toward real workflow bottlenecks to unlock meaningful gains.
  • Top-down adoption risks misalignment: Deploying AI without developer input often leads to poor integration and missed opportunities. Leaders should involve engineering teams early to ensure tools address actual pain points where they work.
  • Impact requires organization-wide alignment: Giving developers AI tools without clear workflows or shared objectives limits ROI. AI initiatives should be tied to cross-functional efforts focused on resolving root inefficiencies with bottom-up validation.

Alexander Procter

August 5, 2025
