Security · Jun 16, 2026

62% of AI-Generated Code Has Security Flaws: The Secure SDLC Was Never Built for This

Traditional SDLC assumes humans write code, humans review code, and security checks happen at fixed phases. AI breaks all three assumptions. Code, dependencies, CI workflows, and infrastructure now arrive together — pre-generated, pre-approved, and pre-merged before any human reviews them.

Secure SDLC · AI Code Generation · Active ASPM · DevSecOps · AI Pipeline Security

A Cloud Security Alliance analysis published in 2025 found that 62% of AI-generated code contains design flaws or known vulnerabilities due to missing threat models and policy context. By 2026, that proportion has only grown. AI now generates code, dependencies, CI workflows, infrastructure definitions, and even the pipeline configurations that gate release approvals — and it generates them together, in a single step, often merged and deployed before any human reviews the result.

The Secure SDLC was designed for a development model that no longer exists. The original model assumed engineers write code, reviewers inspect it line by line, and security checks happen at fixed phases: design review, implementation, build, release. AI-driven development compresses these phases. Code, libraries, and infrastructure arrive pre-bundled. The step-by-step reviews that secure SDLC processes rely on do not happen because there is no sequence to review — there is a single artifact produced in a single step.

IBM's 2025 Cost of a Data Breach report highlighted how small SDLC gaps translate into real runtime exposure. The implication is unambiguous: when build-time checks are bypassed by automated generation, the gaps become exploit paths. The organizations that will operate AI-generated code securely in 2026 are the ones that recognize the SDLC has shifted from a sequential human-driven process to a continuous, automated, runtime-correlated risk control system.

Why Build-Time Security Fails for AI-Generated Applications

AI can generate entire applications in a single step: code, dependencies, CI workflows, deployment configurations, and infrastructure-as-code definitions. These changes are committed by AI agents, merged through preconfigured CI rules, and approved in bulk as a single pull request. The step-by-step reviews and approvals that secure SDLC processes rely on are bypassed — not maliciously, but structurally. The artifact is a single unit. The human reviewer either accepts the unit or rejects it; they cannot review individual decisions within it.

This means security risk no longer comes only from bugs in the code. Even "correct" code can be dangerous depending on the APIs it calls, the permissions it has, and the data it can access in production. Build-time checks alone miss this because they cannot see how the application actually behaves at runtime. A piece of code that passes SAST, SCA, and secrets scanning may still call an API with overly broad scope, may still inherit a service account with read access to the entire database, may still deploy with default security settings disabled.

The limits are structural. Security tools that operate on static snapshots of code or dependencies assume the snapshot represents deliberate choices. In AI-driven pipelines, the snapshot represents a model's best guess at what the developer wanted, evaluated against a context the model does not fully understand. The traditional interpretation, that a clean SAST result indicates a secure component, does not hold when the component's security properties emerge from how the code interacts with its runtime environment.

The Three Broken Assumptions

The traditional SDLC rests on three assumptions. AI-driven development invalidates all three.

Assumption 1: Human authorship implies reviewable intent. When a human writes code, they make conscious decisions: which library to use, which authentication pattern to apply, which permissions to request. The code reflects the author's understanding of the requirements. AI-generated code does not reflect a human's understanding of the requirements — it reflects a model's pattern-matching against similar code it has seen. The intent is the user's prompt, not the developer's mental model. Reviewing the resulting code requires reconstructing the intent from the code itself, which is what threat modeling tries to do — but threat modeling depends on understanding the architecture, and the architecture was generated in the same step as the code.

Assumption 2: Sequential phases produce reviewable checkpoints. The traditional SDLC moves through requirements, design, implementation, testing, deployment, and operations as distinct stages. Each stage has defined inputs, defined outputs, and a defined review gate. AI-driven development collapses these stages. Requirements, design, implementation, and even initial testing are produced together. The "design review" gate does not exist because there was no design phase. The "implementation review" gate does not exist because the implementation was generated as part of the design.

Assumption 3: Security tools can reason about risk at the artifact level. SAST tools analyze code for known vulnerability patterns. SCA tools analyze dependencies for known CVEs. Secrets scanners look for hardcoded credentials. These tools operate on individual artifacts and reason about risk at the artifact level. In AI-generated codebases, the risk emerges from the interaction between artifacts: the code calls an API, the API requires an authentication scope, the scope is granted by an IAM policy, the IAM policy is too broad, and the code's behavior in production is more permissive than the code's text suggests. None of these tools can reason about the interaction.

The three assumptions are not independent. They reinforce each other: code review depends on reviewable intent; review gates depend on sequential phases; artifact-level analysis depends on artifacts that are meaningfully distinct from each other. When AI breaks one assumption, the others become harder to defend.

The Six Phases of Secure SDLC in AI-Driven Pipelines

A Secure SDLC for AI-driven pipelines does not replace the traditional phases — it restructures them to operate in an environment where humans and automated systems share ownership. The six phases are:

1. Risk Assessment and Requirements Definition

The SSDLC begins by identifying security requirements alongside functional requirements. In AI-driven development, this includes defining acceptable use of AI-generated code, approved dependency sources, model input restrictions, and access boundaries. The requirements document becomes a policy that the AI generation tool and the runtime enforcement layer both reference.

Missing requirements let AI output make unsafe assumptions that shape the entire lifecycle. If the requirements document does not specify that production databases must use read-only credentials, the AI will generate code that uses whatever credentials are easiest to call. If the requirements document does not specify that user input must be validated at the API boundary, the AI will generate code that trusts user input. The requirements document is the source of truth for security intent; without it, the model's defaults are the security policy.
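One way to make the requirements document something both the generation tool and the enforcement layer can reference is to express each requirement as a checkable rule. The following is a minimal sketch, not a prescribed format: the requirement IDs and the service fields (`db_credential_mode`, `validates_input_at_boundary`) are hypothetical names chosen to mirror the two examples above.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Requirement:
    """One security requirement, expressed as a checkable predicate."""
    req_id: str
    description: str
    check: Callable[[dict], bool]


# Hypothetical requirements mirroring the two examples in the text.
REQUIREMENTS = [
    Requirement(
        "SEC-001",
        "Production database access must use read-only credentials",
        lambda svc: svc.get("db_credential_mode") == "read-only",
    ),
    Requirement(
        "SEC-002",
        "User input must be validated at the API boundary",
        lambda svc: svc.get("validates_input_at_boundary") is True,
    ),
]


def violated_requirements(service: dict) -> list[str]:
    """Return the IDs of requirements the generated service does not satisfy.

    A missing field counts as a violation, so the model's default is
    never silently accepted as policy.
    """
    return [r.req_id for r in REQUIREMENTS if not r.check(service)]


# Illustrative description of what a generated service requests.
generated_service = {"db_credential_mode": "read-write"}
print(violated_requirements(generated_service))
```

Treating an unspecified field as a failure is a deliberate choice in this sketch: it forces the security intent to be stated rather than inferred.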

2. Threat Modeling and Secure Design Review

This phase identifies potential attack paths before implementation. In AI-generated systems, threat modeling must account for automated architecture decisions, API exposure, permission inheritance, and data access patterns suggested by models. Skipping this step risks systemic vulnerabilities that are hard to fix later.

The practical challenge: threat modeling assumes an architecture to model. When AI generates the architecture together with the code, the threat model must be created from a combination of the requirements document and the AI's stated design rationale. The model can produce design documentation as part of the generation process — and should, as a mandatory output. The threat modeler works from this documentation, identifies the trust boundaries and data flows, and validates that the AI's design decisions align with the security requirements. Mismatches between requirements and design are flagged before code is generated.

3. Secure Implementation and Static Analysis

During implementation, SSDLC emphasizes secure coding practices and static analysis to identify weaknesses early. In AI-driven pipelines, static analysis evaluates code that may not have been consciously written by a developer, increasing the importance of understanding what the code does rather than how it was created.

The shift in focus is from line-by-line review to behavior-level review. The static analyzer flags potential issues; the reviewer (human or automated) determines whether the flagged code is actually a problem in the deployment context. A code pattern flagged for SQL injection risk is a real issue if untrusted input is concatenated into a query that reaches a database; it is likely a false positive if the query is parameterized or the input is constrained to a strict allowlist upstream. The reviewer's job is to validate the context, not to re-review the line.

4. Security Testing and Code Review

Security testing expands beyond functionality to analyze how the application behaves under misuse or attack conditions. In AI-generated codebases, human code review often shifts toward approval rather than deep inspection, with reviewers focusing on build success or test results instead of security implications across the stack. Automated tests report issues in isolation. Unless findings are linked to runtime behavior, the results say little about actual exposure.

The phase produces two artifacts: a test report and a risk register. The test report catalogs identified vulnerabilities with severity ratings. The risk register tracks which vulnerabilities are accepted, mitigated, or remediated, and which carry forward to the runtime phase. For AI-generated code, the risk register is the document that connects build-time findings to runtime risk — the place where a SAST warning either becomes a deployed vulnerability or is formally accepted with documented justification.
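A risk register entry can be modeled as a small record with an explicit disposition. The sketch below is illustrative only: the field names are hypothetical, and the rule that everything except a remediated finding carries forward to the runtime phase is an assumption, not something the process above mandates.

```python
from dataclasses import dataclass
from enum import Enum


class Disposition(Enum):
    OPEN = "open"
    ACCEPTED = "accepted"
    MITIGATED = "mitigated"
    REMEDIATED = "remediated"


@dataclass
class RiskEntry:
    finding_id: str
    severity: str
    disposition: Disposition = Disposition.OPEN
    justification: str = ""


def carries_forward(entry: RiskEntry) -> bool:
    """True if the finding should be tracked in the runtime phase.

    Assumption: only remediated findings are closed. Accepted, mitigated,
    and open findings still exist in the deployed code in some form.
    """
    return entry.disposition is not Disposition.REMEDIATED


def validate(entry: RiskEntry) -> None:
    """Reject an acceptance that has no documented justification."""
    if entry.disposition is Disposition.ACCEPTED and not entry.justification.strip():
        raise ValueError(f"{entry.finding_id}: accepted without justification")
```

Calling `validate` before an entry is saved makes "formally accepted with documented justification" a property the register enforces rather than a convention.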

5. Secure Configuration and Deployment Validation

Before release, SSDLC requires verification of configurations related to authentication, authorization, secret handling, and environment isolation. AI-generated pipelines and Infrastructure as Code (IaC) frequently introduce overly permissive settings that slip into production. This phase ensures earlier controls translate into safe operations.

The validation includes: are the IAM roles scoped to least privilege? Are the secrets stored in a managed vault, not in environment variables? Are the network policies enforcing egress restrictions? Are the container images scanned for vulnerabilities and signed for integrity? Is the deployment manifest's security context aligned with the application's actual runtime requirements? Each check answers a specific question about whether the production environment matches the design.
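Several of these questions can be answered mechanically. The sketch below assumes a hypothetical, simplified manifest dictionary (`iam_actions`, `env`, `egress_allowlist`, `image`); real manifests and IAM policies are more complex, and the security-context question is left out because it depends on application-specific requirements.

```python
def validate_deployment(manifest: dict) -> list[str]:
    """Return human-readable failures; an empty list means every check passed."""
    failures = []

    # Least privilege: treat any wildcard action as over-permissive.
    for action in manifest.get("iam_actions", []):
        if "*" in action:
            failures.append(f"IAM action too broad: {action}")

    # Secrets: flag environment variables whose names look like credentials.
    secret_markers = ("SECRET", "PASSWORD", "TOKEN", "API_KEY")
    for name in manifest.get("env", {}):
        if any(marker in name.upper() for marker in secret_markers):
            failures.append(f"Secret-like value in environment variable: {name}")

    # Network: require an explicit, non-empty egress allowlist.
    if not manifest.get("egress_allowlist"):
        failures.append("No egress restrictions defined")

    # Supply chain: require a scanned and signed image.
    image = manifest.get("image", {})
    if not image.get("scanned"):
        failures.append("Container image not scanned")
    if not image.get("signed"):
        failures.append("Container image not signed")

    return failures
```

The name-based secret check is a heuristic and will miss credentials stored under innocuous names; it is meant to show the shape of the gate, not to replace a secrets scanner.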

6. Security Assessment and Runtime Monitoring

The final phase evaluates the system after deployment, focusing on how it behaves in production. Live permissions, API usage, and data access patterns determine whether identified weaknesses are reachable. In AI-driven environments, this phase is critical for validating whether earlier assumptions hold once the system is exposed to real traffic and inputs.

Runtime monitoring is the layer that catches what build-time checks cannot. An AI-generated application may have passed every static check and still behave insecurely in production — because the model's assumptions about the runtime context were wrong, because the configuration drifted, or because the application's behavior at machine velocity is different from its behavior in test. Runtime monitoring observes the actual behavior and flags deviations from the expected security baseline.

This is where Facio (the HITL-first agent runtime) provides the enforcement and observation layer for AI agents specifically. Every tool invocation is policy-checked at runtime, every authorization decision is logged, and the audit trail captures the full decision chain — not just the artifact, but the runtime behavior that the artifact produces. The runtime monitoring layer is the one that closes the gap between "the code passed review" and "the code behaves securely."
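The general pattern of a policy check plus an audit record on every tool invocation can be sketched in a few lines. This is a generic illustration and not Facio's API: the policy table, function names, and log format are all hypothetical.

```python
import json
import time

# Hypothetical policy: tool name -> scopes that tool may be invoked with.
POLICY = {
    "read_file": {"workspace"},
    "http_get": {"internal"},
}

# In-memory audit trail for the sketch; a real system would persist this.
AUDIT_LOG: list[str] = []


def authorize(agent_id: str, tool: str, scope: str) -> bool:
    """Check one tool invocation against the policy and log the decision."""
    allowed = scope in POLICY.get(tool, set())
    AUDIT_LOG.append(json.dumps({
        "ts": time.time(),
        "agent": agent_id,
        "tool": tool,
        "scope": scope,
        "decision": "allow" if allowed else "deny",
    }))
    return allowed


def invoke(agent_id: str, tool: str, scope: str, fn, *args):
    """Run fn only if the invocation is authorized; a denial raises."""
    if not authorize(agent_id, tool, scope):
        raise PermissionError(f"{tool} with scope {scope!r} denied for {agent_id}")
    return fn(*args)
```

Denials are logged as well as approvals, so the audit trail records what the agent attempted and not only what it was allowed to do.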

Active ASPM: The Foundation for Secure SDLC in AI-Driven Pipelines

Traditional ASPM (Application Security Posture Management) aggregates findings from individual tools: SAST, SCA, secrets scanning, IaC scanning. Active ASPM treats secure SDLC as an interconnected system. Security decisions are derived from how code, pipelines, artifacts, APIs, and runtime services relate to one another, rather than from isolated scan results.

The shift from aggregation to correlation is the architectural difference. A SAST finding on line 247 of file X is a static data point. A SAST finding on line 247 of file X, in a code path that calls API Y, which requires permission Z, which is granted to role R, which is assigned to user U, is a security context. The vulnerability's actual risk depends on the entire chain, not on the line where the pattern was detected.
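The chain described above can be written as a series of lookups. In this sketch the four maps are hypothetical stand-ins for data that would come from different tools (code analysis, API inventory, IAM); the names X, Y, Z, R, and U follow the example in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    file: str
    line: int
    rule: str


# Hypothetical inventory data, keyed the way the example above describes.
CALLS_API = {("X", 247): "Y"}        # code location -> API it calls
API_REQUIRES = {"Y": "Z"}            # API -> permission it requires
PERMISSION_ROLES = {"Z": ["R"]}      # permission -> roles that grant it
ROLE_PRINCIPALS = {"R": ["U"]}       # role -> principals assigned to it


def security_context(finding: Finding) -> dict:
    """Walk the chain from a static finding to the principals it can affect.

    The walk stops at the first missing link, so a partial context shows
    where correlation data is absent.
    """
    ctx = {"finding": finding}
    api = CALLS_API.get((finding.file, finding.line))
    if api is None:
        return ctx
    ctx["api"] = api
    permission = API_REQUIRES.get(api)
    if permission is None:
        return ctx
    ctx["permission"] = permission
    roles = PERMISSION_ROLES.get(permission, [])
    ctx["roles"] = roles
    ctx["principals"] = sorted(
        {p for role in roles for p in ROLE_PRINCIPALS.get(role, [])}
    )
    return ctx
```

A finding that resolves only to `{"finding": ...}` is the isolated data point; one that resolves through to principals is the security context.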

For AI-driven pipelines, the correlation extends further. The AI generation tool's output is itself an input to the correlation system. The tool produced file X with the pattern on line 247; the prompt that requested this output included certain requirements; the requirements document specifies certain security constraints; the runtime environment enforces certain policies. The correlation determines whether the output satisfies the constraints in the runtime environment.

This is what "active" means in Active ASPM. The system does not just collect findings; it actively reasons about whether the findings represent actual risk given the deployment context. The shift is from static posture assessment to dynamic risk correlation.

What the Correlation Reveals

When signals are correlated across the SDLC, several patterns emerge that are invisible in isolated scans:

Code-to-runtime reachability. A SAST finding may flag a vulnerability that is technically present in the code but is unreachable in production. A correlated system determines: is this code path actually invoked? Is the input vector reachable from an external request? Is the vulnerable function called by any production code? The answer determines whether the finding is actionable or whether it can be deferred.
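At its simplest, the reachability question is a graph search from externally exposed entry points. The sketch below assumes a precomputed call graph with hypothetical function names; static call graphs can miss dynamic dispatch and reflection, which is why an unreachable result supports deferring a finding rather than closing it.

```python
from collections import deque


def reachable_from(call_graph: dict[str, list[str]], entry_points: list[str]) -> set[str]:
    """Breadth-first search over a call graph from the given entry points."""
    seen = set(entry_points)
    queue = deque(entry_points)
    while queue:
        fn = queue.popleft()
        for callee in call_graph.get(fn, []):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


# Hypothetical call graph: caller -> functions it calls.
call_graph = {
    "handle_request": ["parse_body", "load_user"],
    "load_user": ["run_query"],
    "legacy_export": ["unsafe_eval"],  # never called from an entry point
}

reachable = reachable_from(call_graph, ["handle_request"])
# A finding in run_query is actionable; one in unsafe_eval can be deferred.
```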

Permission-to-code alignment. An IAM role may grant broad permissions, but the actual code may use only a subset. The correlation determines whether the granted permissions exceed the used permissions. Excessive grants are a risk indicator; the runtime correlation surfaces them.
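The comparison itself is a set difference. A minimal sketch, assuming the granted set comes from the IAM policy and the observed set from runtime access logs; the permission strings are hypothetical.

```python
def permission_gap(granted: set[str], observed: set[str]) -> dict[str, set[str]]:
    """Compare permissions granted to a role with permissions observed in use."""
    return {
        # Granted but never used: candidates for removal.
        "excess": granted - observed,
        # Attempted but not granted: worth investigating.
        "ungranted": observed - granted,
    }


gap = permission_gap(
    granted={"db:read", "db:write", "db:admin"},
    observed={"db:read"},
)
```

The observation window matters: a permission used only during monthly jobs will look unused in a one-week sample.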

Pipeline-to-artifact integrity. A CI pipeline produces an artifact. The artifact is deployed to production. The correlation determines whether the deployed artifact is the one the pipeline produced: no manual override, no substituted image, no silent rebuild. The integrity chain is verified end-to-end.
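The core of this check is comparing the digest of what is deployed with the digest the pipeline recorded at build time. The sketch below uses SHA-256 from the standard library; the function names are illustrative, and a digest match alone does not prove who produced the artifact, which is what signing the recorded digest would add.

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the artifact bytes."""
    return hashlib.sha256(data).hexdigest()


def is_pipeline_artifact(deployed: bytes, recorded_digest: str) -> bool:
    """True if the deployed bytes hash to the digest the pipeline recorded."""
    return sha256_digest(deployed) == recorded_digest


# The pipeline records the digest when it builds the artifact.
built = b"artifact-v1"
recorded = sha256_digest(built)

# A silent rebuild or substituted image produces different bytes.
same = is_pipeline_artifact(built, recorded)
rebuilt = is_pipeline_artifact(b"artifact-v1-rebuilt", recorded)
```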

Build-time-to-runtime finding reconciliation. A SAST finding is flagged at build time. The deployment proceeds. The runtime monitor observes the vulnerable code path in production. The correlation closes the loop: the build-time finding is now a confirmed runtime exposure. The severity is updated; the remediation is escalated.
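The reconciliation step can be sketched as a join between build-time findings and code paths observed in production. Raising severity by exactly one level is an assumption made for illustration; the path names are hypothetical.

```python
SEVERITY_ORDER = ["low", "medium", "high", "critical"]


def reconcile(build_findings: dict[str, str], observed_paths: set[str]) -> dict[str, str]:
    """Raise severity one level for findings whose code path ran in production.

    build_findings maps a code path to its build-time severity.
    Severity is capped at the highest level.
    """
    updated = {}
    for path, severity in build_findings.items():
        if path in observed_paths:
            idx = min(SEVERITY_ORDER.index(severity) + 1, len(SEVERITY_ORDER) - 1)
            updated[path] = SEVERITY_ORDER[idx]
        else:
            updated[path] = severity
    return updated


result = reconcile(
    {"orders.export": "medium", "admin.debug": "critical", "legacy.sync": "low"},
    observed_paths={"orders.export", "admin.debug"},
)
```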

These correlations are impossible in scan-centric models. They require an architectural commitment to treating the SDLC as a continuous risk-control system, not as a sequence of isolated checkpoints.

The Continuous Risk Control Loop

The Secure SDLC for AI-driven pipelines operates as a continuous loop, not as a sequence of phases with defined endpoints. The six phases are present at any given moment; what changes is which phase is most active for a given change.

When a developer (or an AI agent on the developer's behalf) commits a change, the risk assessment phase re-evaluates whether the change fits within the documented requirements. The threat modeling phase re-evaluates whether the change introduces new attack paths. The implementation phase analyzes the generated code. The testing phase runs the appropriate test suites. The configuration phase validates the deployment manifest. The runtime phase observes the deployed behavior and reports back.

The loop closes when runtime findings inform earlier phases. A runtime anomaly that surfaces in production triggers a re-evaluation of the threat model, an update to the test suite, and a revision of the requirements. The next generation of code, by the same AI or a different one, is informed by the runtime findings. The SDLC becomes a learning system, not a static process.

Where Human Review Adds Value in the Loop

If the loop is mostly automated, what is left for the human? The answer is the same one that applies to the AI red-teaming analysis from June 2026: triage, context, and design.

Triage. Automated systems produce findings at scale. The human's job is to determine which findings reflect real risk in a specific deployment context, which are false positives, and which represent acceptable risk. The triage decision is the one that determines what gets fixed and what gets shipped.

Context. The requirements document, the threat model, and the design rationale require human judgment. The AI can generate code that satisfies the requirements it was given; it cannot determine whether the requirements themselves are correct. The human defines the security intent; the system enforces it.

Design. Architecture decisions — which services to call, which data stores to access, which authentication patterns to use — are design decisions, not implementation decisions. The AI can implement the design; the human makes the design. For AI agents specifically, the design includes the tool allowlist, the permission scopes, the runtime policy boundaries, and the HITL approval thresholds — the structural decisions that determine the security architecture.

Placet.io (the HITL inbox and messenger) provides the structured approval layer for these design decisions: when an agent deployment requires architectural review, when a tool allowlist change requires human authorization, when a runtime policy decision crosses a configured threshold. The human review is recorded as part of the audit trail, making the design decisions attributable and reviewable.
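The structural decisions named above (tool allowlist, approval threshold, audit trail) can be captured in a small policy object. This is a generic sketch and not the Placet.io or Facio API; the class, the integer risk score, and the decision labels are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AgentPolicy:
    tool_allowlist: set[str]
    approval_threshold: int  # risk scores at or above this need a human decision
    audit: list[tuple[str, str]] = field(default_factory=list)

    def decide(self, tool: str, risk_score: int) -> str:
        """Return 'deny', 'needs_human_approval', or 'allow', and record it."""
        if tool not in self.tool_allowlist:
            decision = "deny"
        elif risk_score >= self.approval_threshold:
            decision = "needs_human_approval"
        else:
            decision = "allow"
        self.audit.append((tool, decision))
        return decision
```

The allowlist and threshold are the parts a human sets; the routing and recording are the parts the system performs on every invocation.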

The Bottom Line

The Secure SDLC was never built for AI-driven development. The original model — sequential phases, human authorship, artifact-level review — is structurally incompatible with pipelines where code, dependencies, and infrastructure are generated together, in a single step, often merged and deployed before any human review.

The replacement is not a new SDLC. It is a continuous risk control loop: six phases operating concurrently, correlated signals from build to runtime, and human judgment applied where it adds the most value — at the design, context, and triage boundaries. The organizations that will operate AI-generated code securely in 2026 are the ones that have made this architectural shift, from sequential checkpoints to continuous risk correlation, and from static posture assessment to active runtime enforcement.

The 62% of AI-generated code with security flaws is not a reason to slow AI adoption. It is a reason to build the SDLC that AI-driven development requires — one where the design, implementation, deployment, and runtime layers are continuously correlated, and where the security intent defined in the requirements document is enforced at every layer from code generation to tool invocation.

The alternative is shipping code at machine velocity, much of it flawed, and discovering the flaws in production breach reports. The discipline of secure development has not changed. The way the SDLC implements it has.

