AI-driven CI/CD pipelines are susceptible to malicious prompt injections
There’s a new problem brewing in how software gets deployed, and it’s worth your attention. AI is being embedded into CI/CD pipelines, automated systems that take code and push it into production. They’re fast, scalable, and efficient. But researchers at Aikido Security found a serious weakness: these AI-enabled systems can be tricked into running harmful commands just by reading the wrong text from a GitHub issue or a pull request.
Here’s how it works. Say your pipeline uses AI tools like Gemini CLI or OpenAI Codex to automate parts of the workflow. Those tools often build commands from pieces of the workflow’s input: user-submitted pull request descriptions, commit messages, issue comments. If those inputs aren’t filtered or validated, attackers can hide malicious instructions inside them. The AI reads the input, treats it as a legitimate prompt, and acts on it. No complicated hacks. Just plain text.
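To make the pattern concrete, here is a minimal sketch of a vulnerable GitHub Actions workflow. The `ai-cli` command and the prompt wording are hypothetical stand-ins for whichever agent the pipeline uses; the point is the direct expansion of `github.event.issue.body` into the prompt.

```yaml
# Hypothetical workflow: an AI step whose prompt embeds untrusted issue text.
name: ai-triage
on:
  issues:
    types: [opened]

jobs:
  triage:
    runs-on: ubuntu-latest
    steps:
      # DANGEROUS: the issue body is expanded directly into the prompt string,
      # so any instructions an attacker writes in the issue become part of the prompt.
      - name: Ask the AI to summarize the issue
        run: |
          ai-cli --prompt "Summarize and label this issue: ${{ github.event.issue.body }}"
```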
This means attackers, without hacking into your system, can get the pipeline to leak secrets, manipulate your codebase, or rerun processes with privileged access. In the test environment Aikido built, they showed how Gemini CLI could be manipulated into exposing credentials, all triggered from a standard GitHub issue. No elevated permissions needed.
Systems like GitHub Actions and GitLab CI/CD weren’t designed with this risk in mind. The AI layer changes how commands are interpreted: it’s no longer just fixed scripts running; it’s a model generating logic in real time, based on user input. That changes the threat model entirely.
AI’s speed is an advantage, but when it’s connected to systems that make high-stakes decisions, you need controls. You can’t just trust that a model won’t misinterpret something; it doesn’t know intent, it only predicts patterns. If those patterns include “run a shell command,” and the input nudges it in that direction, you’ve lost control.
Google patched the issue in Gemini CLI after Aikido brought it to their attention. That’s a good move. But plenty of other AI-integrated pipelines are still running blind. Anyone building or overseeing software infrastructure should be asking: are our automation systems making decisions based on untrusted inputs? If the answer isn’t a clear no, that’s a problem.
We need safer defaults and a better understanding of how these systems process data. And we don’t need to wait. The first step is to stop assuming every input is safe, especially in public or open-source contexts. It’s basic hygiene. Your AI does what it’s told, even if it doesn’t know it’s being told to do something wrong.
The exploit, termed “PromptPwnd,” leverages specific CI/CD configurations
The vulnerability, called “PromptPwnd,” is not some vague theoretical exploit. It’s a direct consequence of poor configuration. Two design decisions, both common, converge to open the door.
First, the AI agent in the pipeline is granted privileged access. This often includes tokens like GITHUB_TOKEN or cloud credentials, powerful keys used to run builds, push code, or access sensitive services. These aren’t low-level permissions. They have reach. Second, the prompts that guide the AI’s behavior include variables pulled straight from user-submitted fields like commit messages or issue descriptions. No checks. No filters. Just raw input passed into the system.
Now put those two together. You have an AI agent with broad authority taking direct instruction, whether anyone intended it or not, from unverified external sources. That’s where the risk escalates. Attackers don’t need to break into your system if you’ve already handed over decision-making power to a model working off public input.
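Concretely, the risky combination looks something like the sketch below: a write-capable token and secrets handed to the same step that consumes unfiltered pull request text. The trigger, secret names, and `ai-cli` command are illustrative, not a reproduction of any specific product’s integration.

```yaml
# Hypothetical job combining the two risky decisions:
# 1) broad privileges for the AI step, 2) untrusted text in its prompt.
name: ai-review
on:
  pull_request_target:        # runs with the base repository's permissions and secrets
    types: [opened, edited]

jobs:
  ai-review:
    runs-on: ubuntu-latest
    permissions:
      contents: write          # the agent can push code
      pull-requests: write     # and can modify pull requests
    steps:
      - name: AI review of the pull request
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}   # privileged token in the agent's environment
          CLOUD_KEY: ${{ secrets.CLOUD_KEY }}         # hypothetical cloud credential
        run: |
          # DANGEROUS: the PR description is interpolated straight into the prompt.
          ai-cli --prompt "Review this change: ${{ github.event.pull_request.body }}"
```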
Aikido Security walked through exactly how this plays out. They crafted a comment in a GitHub issue that looked innocent but included instructions that got picked up by the AI. That input was treated as part of the prompt. From there, the AI generated shell commands that the CI/CD pipeline then executed. Even though no real tokens were used in the test, the outcome was clear: unintended commands ran with no manual oversight.
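Aikido has not published the exact step from its test environment, but the final link in the chain generally takes a shape like the hypothetical fragment below: model output is treated as executable, with `ai-cli` again standing in for the agent.

```yaml
      # DANGEROUS: whatever the model prints is executed as shell commands on the runner.
      - name: Apply the AI's suggested fix
        run: |
          ai-cli --prompt "Emit shell commands to fix this issue: ${{ github.event.issue.body }}" | bash
```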
C-suite leaders need to understand that this isn’t just misuse of privilege; it’s unintended delegation. You’re giving real operational power to a system that’s trained to complete prompts based on pattern recognition, not judgment. This changes how trust needs to function in software infrastructure. Traditional security barriers won’t hold if internal systems can be redirected from the outside.
The architecture behind these AI integrations is broadly reused across platforms. It’s not just one product. Gemini CLI, Claude Code Actions, GitHub AI Inference, OpenAI Codex: they all follow a similar pattern, and that pattern is flawed if you don’t control the data entering the prompt.
Executives should treat this as a priority issue. The danger isn’t in what’s already happened; it’s in how many environments are unknowingly exposed. Don’t assume your existing configurations are safe. Have your teams explicitly verify: are we passing untrusted text into AI prompts? Are we limiting what those AI agents can do? If not, your exposure is higher than you think.
Mitigation strategies are available and being actively promoted to secure these vulnerable CI/CD configurations
The good news is this vulnerability isn’t without solutions. Aikido Security didn’t just point out the issue; they released tools to detect and reduce the risk. Their approach focuses on giving development and security teams practical visibility into where unsafe configurations exist.
They’ve published a set of open-source detection rules for the scanner Opengrep, designed specifically to scan GitHub Actions YAML files for signs that user-controlled inputs feed directly into AI prompts. In parallel, they offer a free code-scanning tool that works across GitHub and GitLab repositories. It flags insecure CI/CD patterns, excessive token privileges, and AI workflow flaws that would otherwise go unnoticed until they’re exploited.
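The sketch below is not one of Aikido’s published rules; it is a simplified illustration of what such a detection rule can look like in Opengrep’s Semgrep-compatible rule syntax, using a regex to flag workflow files that expand issue or pull request text.

```yaml
rules:
  - id: untrusted-event-text-in-workflow
    # Simplified sketch: flag GitHub Actions files that expand issue/PR text
    # via a workflow expression, where it may reach an AI prompt unfiltered.
    languages: [yaml]
    severity: WARNING
    message: >
      User-controlled event text (issue/PR body, title, or comment) is expanded
      inside this workflow. If it feeds an AI prompt, it enables prompt injection.
    patterns:
      - pattern-regex: \$\{\{\s*github\.event\.(issue|pull_request|comment)\.(body|title)\s*\}\}
```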
This matters because these attacks don’t need complex exploits. They happen when trusted systems process untrusted data without oversight. That’s a preventable failure. If your pipeline takes an issue description and feeds it directly into an AI that can run commands, you’ve built in a vulnerability, by design. The fix isn’t about blocking AI; it’s about structuring your prompts and system boundaries correctly.
Aikido’s guidelines are clear: treat AI-generated output the same way you’d treat third-party code. It’s not safe by default. Validate everything. Put limits on what AI agents are allowed to execute, and never pass user-submitted content directly into AI prompts without sanitization. These adjustments don’t slow innovation; they prevent unnecessary exposure.
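As one illustration of what that looks like in a GitHub Actions workflow, the sketch below applies two of those adjustments: the token is scoped to read-only, and the untrusted issue text is handed to the hypothetical `ai-cli` agent as data through an environment variable rather than spliced into the command line. This closes the shell-injection path; constraining what the agent is allowed to execute still has to be handled separately.

```yaml
# Safer sketch: least-privilege token, no direct interpolation of user text.
name: ai-triage
on:
  issues:
    types: [opened]

permissions:
  contents: read   # the workflow's token is read-only

jobs:
  triage:
    runs-on: ubuntu-latest
    steps:
      - name: Ask the AI to summarize the issue
        env:
          # Untrusted text is passed as data via an environment variable, not
          # expanded into the shell command or the prompt template itself.
          ISSUE_BODY: ${{ github.event.issue.body }}
        run: |
          # Assumes the hypothetical ai-cli reads ISSUE_BODY from the environment
          # and is instructed to treat it strictly as data, not as instructions.
          ai-cli --prompt "Summarize the issue text provided in ISSUE_BODY."
```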
From a leadership perspective, the priority is simple. Ask your teams to audit these workflows. If AI tools are present in your CI/CD pipelines, you need assurance on four things: that untrusted data is filtered, that AI actions are sandboxed, that tokens are scoped conservatively, and that any dynamic AI-generated commands are subject to review or containment.
This is about raising the default standard: moving automation forward while keeping it grounded in security controls. There’s no upside in ignoring risk that stems from your own configs. You’re not just protecting your codebase; you’re reinforcing operational reliability. It’s faster to secure this now than to deal with cleanup when it breaks.
Even minimal access levels can facilitate exploitation, widening the threat landscape
One of the more critical takeaways from Aikido Security’s research is this: you don’t need elevated access to trigger one of these AI-driven attacks. In many real-world CI/CD configurations, simply opening a public issue or submitting a pull request is enough to compromise the pipeline, if the system is poorly structured.
That lowers the barrier for malicious actors significantly. These aren’t insider threats or zero-day exploits. This is basic input exploitation. Anyone who can view a public repository, with no credentials and no prior trust, can inject prompt-based instructions into descriptions or comments. If those inputs reach an AI model running with high-privilege access inside an automated workflow, they can lead to command execution, data exposure, or worse.
Some workflows still require collaborator permissions to exploit, but not all. Aikido found cases where general access is enough. That expands the risk surface beyond typical internal security models and moves the problem into the public domain. If your CI/CD pipeline listens to AI-generated commands, and if that AI takes input from open submissions, your attack surface is larger than you likely realize.
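One common way to raise that bar in GitHub Actions is to gate the AI step on the commenter’s relationship to the repository, as in the hedged sketch below. This restricts who can trigger the agent; it does not, by itself, sanitize what trusted users write.

```yaml
name: ai-assistant
on:
  issue_comment:
    types: [created]

jobs:
  assist:
    # Only run the AI step for comments from owners, members, or collaborators;
    # drive-by comments from unknown accounts never reach the agent.
    if: contains(fromJSON('["OWNER", "MEMBER", "COLLABORATOR"]'), github.event.comment.author_association)
    runs-on: ubuntu-latest
    steps:
      - name: Run the AI assistant on trusted comments only
        run: ai-cli --prompt-file prompt.txt   # hypothetical agent invocation
```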
For leadership, this should prompt immediate reassessment of trust models, particularly in open-source or externally accessible environments where large contributor bases interact with code and configuration artifacts. Verification shouldn’t stop at authentication. It needs to extend to behavioral control: what can each workflow actually do? What inputs do they consume? Are we exposing sensitive automation processes to the public through silent assumptions in our design?
Aikido demonstrated these exploit paths in controlled settings. They didn’t need to use real tokens or break into systems to prove the point. That’s confirmation that existing infrastructure, if left unaudited, may already be vulnerable purely due to poor architectural hygiene. What looks like an edge case is actually a widespread pattern that will grow in risk as AI-integrated automation spreads.
The takeaway is straightforward: minimal external input can trigger maximum internal consequences if the configuration allows it. Businesses should respond with targeted actions: restrict the scope of AI capabilities in pipelines, decouple AI decision-making from user comments, and lock down what gets executed as a result of any prompt. Threat detection isn’t enough when the entry point is wide open by design.
Key highlights
- AI workflows can be exploited through public inputs: If AI tools in CI/CD pipelines process unfiltered data from GitHub issues or pull requests, attackers can silently inject commands. Leaders should mandate input validation and privilege limits across all automated workflows.
- PromptPwnd stems from flawed system design: The issue arises when AI is both highly privileged and fed user-generated content. Decision-makers should review all automation logic that allows models to interpret and act on external inputs.
- Practical mitigation tools exist and should be adopted: Open-source scanners like Opengrep can flag insecure workflows, identify overly permissive tokens, and catch unsafe AI prompt usage. Executives should prioritize integrating these scans into the development pipeline to reduce exposure.
- Low-level access can trigger high-level consequences: Even non-collaborators can exploit some flawed workflows simply by opening issues. Leaders must push for immediate audits of public repositories and treat all external inputs as potential threat vectors.