AI-generated code speeds up development, but doesn’t guarantee ROI

AI coding tools are moving fast. They’ve made writing code faster and easier, cutting out barriers and letting teams iterate on software much more quickly. That’s good. But if your goal is return on investment, as it should be, then squeezing speed out of the development phase alone won’t cut it. What happens after the code is written matters more.

Charity Majors, CTO of Honeycomb, a company focused on performance and observability, says AI went for code generation first because it was easy. But actual value shows up deeper in the process. Think about testing, security, deployment, and maintenance: these steps shape long-term performance. And without AI helping here, you end up offsetting those early time savings with mounting complexity down the line.

Here’s the real issue: speed at the beginning without intelligence at the end slows you down where it matters most, in production. AI must be part of the full SDLC if you want predictable outcomes and fewer interruptions. It has to work not just for writing code, but for shipping and sustaining it.

Executives should focus AI investments where the most impact is created: quality control, incident response, and system health. Those are the areas where downtime hits customer experience and company reputation directly. Rushing into AI with a limited scope misses the compounding effect of lifecycle-level integration.

Accelerating development is just the starting line. ROI comes from making every part of the delivery process smarter and faster.

AI’s real opportunity lies in production, not just in coding

We’re now at the point where AI’s role in software development needs to expand. Coding is only one piece of it, and it’s already partly solved. If your team is still focused solely on faster code output, you’re already behind. The real performance gains show up in production, where your apps live, scale, and fail.

Charity Majors makes a clear point here: production is where the meaningful progress will happen. Faster, real-time feedback loops let developers validate AI-suggested code changes immediately. Incorporating canary deployments and feature flags means you can test features in slices, learn faster, and reverse wrong turns with minimal disruption. This is where speed scales responsibly.
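To make the mechanics concrete, here is a minimal sketch of the percentage-based flagging that underpins canary-style rollouts. The flag name, rollout percentage, and bucketing scheme are illustrative assumptions, not any particular vendor’s API:

```python
import hashlib

# Hypothetical rollout table: feature name -> % of users exposed.
ROLLOUTS = {"ai-assisted-retry-logic": 5}

def is_enabled(feature: str, user_id: str) -> bool:
    """Deterministically bucket a user into [0, 100) and compare to the rollout %."""
    pct = ROLLOUTS.get(feature, 0)
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100 < pct

# The same user always lands in the same bucket, so a 5% slice stays stable
# while you watch error rates and decide whether to widen or reverse it.
print(is_enabled("ai-assisted-retry-logic", "user-4821"))
```

Because the bucketing is deterministic, widening the rollout from 5% to 25% only adds users; nobody flips back and forth between code paths mid-experiment.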

Code is now deployed more like a conversation: iterative, dynamic, ongoing. Developers aren’t stopping and planning a release months down the line. They’re shipping continuously. That shift demands tighter control in production, or you end up fighting fires instead of building roadmaps. AI can make that process either more chaotic or more decisive.

For leaders, there’s a strategic inflection point here. You either integrate your AI into fast-paced production processes, or you’ll struggle to capture its real advantage. This isn’t about shipping more code. It’s about creating stable environments where updates roll out safely and data drives continuous improvement.

Any upside AI delivers has to be measurable in live operations. If it doesn’t make production better, safer, and faster, it doesn’t move the needle.

AI-generated code reduces clarity and weakens system reliability if left unchecked

AI-generated code does what it promises: it writes fast. But the cost of that speed is starting to show. Teams are running into code that’s bloated, poorly understood, and hard to maintain. That’s a real issue, especially in production, where code quality and observability matter most.

Charity Majors from Honeycomb puts it bluntly: the real difficulty isn’t generating the code; it’s understanding it when something breaks. This is where the gap opens up. Most AI-generated code lacks the kind of transparency engineers need during incidents. When there’s a production failure, vague commit histories and general-purpose AI suggestions don’t help you find the root cause. If you don’t understand your system fully, you can’t fix it quickly. That kills team velocity when it matters most.

It’s also making debugging harder. When you don’t know who wrote the code, or if a human even fully reviewed it, you strain your operations teams trying to trace errors back to their origin. Unless you have advanced observability tools already in place, you’re not seeing enough to fix things fast. And most companies don’t.

This isn’t theoretical. Studies already show that AI-generated code introduces more bugs, expands codebases unnecessarily, and often hides vulnerabilities that surface later. That’s expensive. Not just financially, but in terms of trust and engineering bandwidth.

You want speed, but not at the expense of clarity. If your pipeline produces more, but you can’t maintain what you ship, you’re building fragility into your infrastructure. Resilient systems demand code you can understand, own, and improve at any moment.

Code ownership is non-negotiable as AI blurs authorship

AI is changing who writes the code, but it’s not changing who’s responsible for it. That still sits with your engineering team. And in this new dynamic, ownership matters more than authorship.

Here’s what’s happening: as AI tools assist more with writing and refactoring code, the lines start to blur. Who owns what? Who understands it well enough to defend it during failures or outages? Charity Majors is clear: “You need to own your code.” Not in a theoretical way, but in a practical, accountable, production-ready way.

Fast delivery without deep ownership creates instability. When developers are shipping code they haven’t fully thought through or tested in proper environments, you’re increasing the chance of firefighting later. That cuts deep into engineering efficiency, reliability, and customer trust.

Tight feedback loops solve this. If developers can see the behavior of their code immediately, in an environment that closely mirrors production, they can spot flaws before they reach users. That’s scalable accountability. It also builds better instincts: developers get better at spotting their own risks sooner and shipping more intentionally.

This is where leadership should focus. AI should enhance the ability to deliver structured, high-quality code, but not at the expense of accountability. Systems need to be built to reinforce code ownership across teams. That includes implementing smarter release workflows, canary deployments, feature gating, and rollback systems, so changes can be tested with users, monitored in real time, and dialed back if required.
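As a rough illustration of what “dialed back if required” looks like in code, here is a sketch of a progressive rollout loop with automatic rollback. The step sizes, error threshold, and the two stand-in functions are assumptions; in practice your deploy tooling and observability backend would supply them:

```python
import random
import time

CANARY_STEPS = [1, 5, 25, 50, 100]  # % of traffic on the new version
MAX_ERROR_RATE = 0.01               # assumed SLO-derived threshold

def check_error_rate(version: str) -> float:
    # Stand-in for a query to your observability backend.
    return random.uniform(0.0, 0.02)

def set_traffic_split(version: str, percent: int) -> None:
    # Stand-in for a call to your router or load balancer.
    print(f"{version}: routing {percent}% of traffic")

def progressive_rollout(version: str, soak_seconds: float = 1.0) -> bool:
    """Widen the canary step by step; roll back the moment errors exceed budget."""
    for pct in CANARY_STEPS:
        set_traffic_split(version, pct)
        time.sleep(soak_seconds)  # in production: long enough to gather real signal
        if check_error_rate(version) > MAX_ERROR_RATE:
            set_traffic_split(version, 0)  # automatic rollback
            return False
    return True

progressive_rollout("v2.4.1-canary")
```

The point isn’t the specific loop; it’s that rollback is a first-class, automated path rather than a late-night manual scramble.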

Relying on AI doesn’t remove responsibility. It increases the importance of core engineering discipline. Teams that prioritize ownership will use AI to move faster and smarter. The others will spend their time reacting to production issues they didn’t see coming.

AI can upskill developers, but overreliance weakens core engineering capability

AI coding tools are now powerful enough to support workflows beyond code generation: everything from API refactoring to system health checks. Used correctly, they help junior engineers move faster and learn context more efficiently. This is progress. But handing off too much to AI can eventually degrade human capability.

Charity Majors, CTO of Honeycomb, makes this plain: when developers only supervise the AI instead of solving real problems themselves, their skills start to decline. The irony is clear. The same tools built to augment engineers can, if used carelessly, make them less technically competent over time.

That’s why it’s critical for companies to architect workflows that use AI to strengthen, not replace, developer engagement. For example, having AI generate code is valuable, but only if the engineers are actively evaluating, editing, and improving the output. When teams interact with AI through iterative prompting, error-checking, and context-guided adjustments, they develop stronger instincts over time. This creates compound value: faster delivery and sharper engineers.

At Honeycomb, they’re experimenting with frameworks that treat AI tools as interactive coaches instead of passive generators. This means guiding developers with checklists, prompting reflection, and enforcing best practices through automated feedback loops. It’s not just about delivering code; it’s about improving how that code is designed, tested, and deployed.

For executives, the takeaway is simple: AI should not be a short-term shortcut. It should be a long-term amplifier. If your AI strategy doesn’t promote foundational learning, it will erode capability across your engineering teams even as output rises.

ROI from AI tools demands resilience metrics, not just speed gains

You can’t manage what you don’t measure. And when it comes to AI investment, the usual performance metrics won’t give you the full picture. Time-to-code and lines-of-output are surface-level. Real ROI shows up in resilience: how well your systems perform when stressed, how quickly issues are resolved, and whether humans trust the tooling.

Charity Majors points to engineer trust as one of the most important metrics. If the people using the AI rely on it, you’re on the right path. If they avoid it, or spend more time managing its mistakes, you’re losing value. Qualitative signals matter here, even if they don’t show up in your ops dashboard.

But subjective signals aren’t the only ones that matter. Systems must also prove they can hold under pressure. Every company faces failure; it’s a matter of when, not if. That’s where Service Level Objectives (SLOs) come in. If AI-enhanced systems lower failure rates, shorten incident response times, or make issue triage more accurate, those indicators are hard evidence. These are measurable, operational gains that directly tie into customer experience and infrastructure efficiency.
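For a sense of scale, the arithmetic behind an SLO’s error budget is simple. Assuming an illustrative 99.9% availability objective over a 30-day window:

```python
# Back-of-the-envelope SLO math; the target and downtime figures are illustrative.
slo_target = 0.999
window_minutes = 30 * 24 * 60                      # 43,200 minutes in the window
error_budget = (1 - slo_target) * window_minutes   # 43.2 minutes of allowed downtime

downtime_so_far = 12.0  # hypothetical minutes of downtime this window
print(f"Budget: {error_budget:.1f} min, remaining: {error_budget - downtime_so_far:.1f} min")
```

If AI-assisted triage shortens incidents, the gain shows up directly here, as budget you didn’t burn.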

Leadership should track resilience as closely as they track velocity. Faster code generation that leads to slower recoveries, higher rollback rates, or increased downtime negates its own benefit. Evaluate AI in real conditions. Monitor production. Stress-test often. And recognize that operational robustness, not raw output, is the strongest case for AI’s long-term value.

Any AI-backed system should scale without increasing fragility. If it doesn’t, you’re measuring the wrong outcomes.

Platform engineering is replacing traditional DevOps in the age of AI

DevOps isn’t going away. It’s just being reshaped. The responsibilities tied to DevOps, especially around deployment, observability, and production ownership, are consolidating under what’s now called platform engineering. This shift isn’t theoretical. It’s already operational.

Charity Majors, CTO at Honeycomb, explains it clearly: we’re at the twilight stage of the DevOps movement. Not because we’ve perfected it, but because its cultural principles have largely been accepted and baked into modern engineering. What’s emerging now is a model built for speed, autonomy, and AI integration, where teams don’t just develop code; they run it, monitor it, and own its outcomes in production.

Platform engineering supports this model by providing internal infrastructure and tooling that engineers use directly. No handoffs, minimal bottlenecks, and better scale, especially where AI is involved. With AI accelerating parts of development and deployment, teams need systemized ways to operate safely and consistently. Platform teams create that foundation.

For executives, this transition means more than new team titles. It’s a shift in responsibility structure. Engineers are now accountable for what happens after code is merged. That operational ownership, once siloed in DevOps or site reliability engineering (SRE) teams, is moving firmly into core development teams. And it needs to be supported with tools that enable secure automation, clear observability, and fast rollback capability.

If you’re investing in AI to move fast, but relying on legacy operations to catch problems, you’re running outdated infrastructure around a modern process. Platform engineering bridges that. It scales operational best practices across teams and ensures AI-generated output doesn’t outpace your ability to manage it.

AIOps is where AI will deliver real ROI, not just code output

Most organizations are still focused on AI for writing code. That’s useful, but it barely scratches the surface. The stronger impact, and the one that ties directly to operational efficiency, is AIOps. That’s where AI tools assist with debugging, monitoring, incident response, and system optimization.

Charity Majors is direct about this: AI should move beyond generating more code and start improving lifecycle performance. That means helping engineers identify issues faster, optimize infrastructure on demand, and reduce time to recovery after a failure. These are areas where performance, cost savings, and system uptime intersect, and where real value is created at scale.

AIOps is still maturing, but the direction is clear. Instead of relying on manual alerts, engineers can use AI-driven observability to detect behavior shifts in systems early. Instead of waiting for errors to escalate, AI can flag anomalies, auto-triage incidents, and assist with root-cause analysis. It strengthens engineering confidence while protecting customer experience.
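To ground that, here is a deliberately simple sketch of the kind of behavior-shift detection AIOps tooling automates: flag a metric when it drifts several standard deviations from its recent baseline. Real systems use far richer models; the window size and threshold here are assumptions:

```python
from collections import deque
from statistics import mean, stdev

class DriftDetector:
    def __init__(self, window: int = 60, threshold: float = 3.0):
        self.baseline = deque(maxlen=window)  # recent samples, e.g. p99 latency
        self.threshold = threshold            # z-score that counts as anomalous

    def observe(self, value: float) -> bool:
        """Return True if the new sample deviates sharply from the baseline."""
        anomalous = False
        if len(self.baseline) >= 10:
            mu, sigma = mean(self.baseline), stdev(self.baseline)
            if sigma > 0 and abs(value - mu) / sigma > self.threshold:
                anomalous = True  # candidate for alerting or auto-triage
        self.baseline.append(value)
        return anomalous

detector = DriftDetector()
for latency_ms in [120, 118, 125, 122, 119, 121, 117, 124, 120, 123, 310]:
    if detector.observe(latency_ms):
        print(f"Anomaly: {latency_ms} ms deviates from the recent baseline")
```

The same shape of logic, applied across thousands of signals with models that learn seasonality, is what lets AI surface incidents before a human would have noticed.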

From a leadership perspective, AIOps aligns engineering with business goals by reducing risk exposure, minimizing downtime costs, and giving teams more time to build rather than fix. It also allows teams to handle higher system complexity without growing operational headcount at the same rate. That’s margin improvement through smarter automation, not just more productivity.

Companies that treat AI as a lifecycle tool rather than a code generator will gain the long-term advantages. Those that don’t will be dealing with more systems, more change, and more fragility, with no strategic buffer in place.

The future of AI in engineering is full-lifecycle. Leaders who understand that early will lead both in speed and in resilience.

Recap

AI in software development is moving quickly, but speed alone doesn’t justify investment. The value isn’t just in writing more code. It’s in reducing operational friction, increasing resilience, and strengthening engineering capability over time. That requires disciplined implementation, not just automation for automation’s sake.

Executives should focus less on how fast AI can generate output, and more on what it takes to support that output through production. Lifecycle integration, observability, and real code ownership aren’t optional; they’re the foundations of scalable software in an AI-powered world. Every shortcut that skips these fundamentals adds compounded risk to future operations.

This is about setting your teams up to deliver long-term performance, not just short-term throughput. AI tools that enhance decision-making, improve clarity, and reinforce accountability will outperform those that focus only on faster code generation.

If the ambition is to build systems that move fast, stay up, and adapt continuously, then AI’s role needs to evolve from contributor to collaborator across the entire engineering lifecycle.

Alexander Procter

August 6, 2025

12 Min