Last week, OpenAI lifted the veil on Aardvark, an internal AI security agent that allegedly does what most scanners only promise: fixing bugs and catching exploits before they become a problem. Picture an AI-powered Kaiju quite literally scouring codebases, sniffing out exploits and firing back merge-ready patches.

For European companies weighed down by legacy code, Aardvark marks a real shift: AppSec is no longer just shifting left. OpenAI envisions a Kaiju AI that reads, reasons, and fixes bugs and exploits faster than any attacker can find them. The economics are clear: stop scanning for problems and start deploying agents that solve them.
OpenAI’s most recent headline release was Sora 2, an AI video mash-up that spawned a million Sam Altman videos. So why is it releasing Aardvark in a closed beta? According to Aitel, it comes down to SDLC necessity driven by economics and risk. When each pre-training run costs millions, even a single bug can send millions in compute spend down the drain. Any next-gen SDLC without an AI tool in its code quality regime will raise serious eyebrows. OpenAI’s investment in Aardvark confirms that AI-native AppSec is now seen as core infrastructure: tools like Aardvark are a category line in the SDLC.
Aardvark runs continuously across repositories, ranks vulnerabilities by exploitability, and focuses engineering effort where risk is real. It generates minimal diffs with paired regression tests, then routes fixes for human approval. The engine excels at the logic errors and cryptographic mistakes that reviewers miss.
Aardvark closed-beta feedback: early internal runs on specialized codebases showed strong accuracy, so high-value modules should pass through Aardvark before release.
The integration blueprint is a closed loop that starts in the IDE and ends with audit-ready evidence. Each step produces artifacts that engineers can trust, a critical factor in limiting risk for European insurers:
Live threat modeling: add likely attack paths and required controls as you write.
Secure patterns: align code to CWE and OWASP from the first commit.
Policy feedback: enforce GDPR, ISO 27001, and DORA in context.
Dynamic probing: crawl services, fuzz with intent, correlate anomalies.
Goal-driven agents: pursue privilege escalation or RCE with tool use.
Production signals: learn normal traffic and flag abuse patterns.
Concise patches: trace dependencies and propose diffs that hold.
Tests included: generate regression coverage with each fix.
Issue class removal: run campaigns that retire families of bugs.
Safety checks: confirm both correctness and performance impact.
Evidence: pack all findings and fixes for future audits.
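The closed loop above can be sketched as a minimal pipeline. This is an illustrative sketch only, not a real Aardvark API: the `Finding` fields, the `run_pipeline` function, and the example CWE identifiers are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical sketch of the closed loop: scan -> rank by exploitability ->
# patch with paired tests -> human approval -> audit-ready evidence.
# All names here are illustrative, not a real Aardvark API.

@dataclass
class Finding:
    cwe: str               # e.g. "CWE-89" (SQL injection)
    exploitability: float  # 0.0 (theoretical) .. 1.0 (trivially exploitable)
    patch: str             # minimal diff proposed by the agent
    tests_pass: bool       # did the paired regression tests pass?

def run_pipeline(findings, approve):
    """Rank findings by exploitability, keep only patches whose regression
    tests pass, route survivors to a human approver, and return an
    audit-ready evidence list."""
    evidence = []
    for f in sorted(findings, key=lambda f: f.exploitability, reverse=True):
        if not f.tests_pass:
            continue  # safety check: never ship an unvalidated diff
        if approve(f):  # a human stays in control of deployment
            evidence.append({"cwe": f.cwe, "patch": f.patch, "approved": True})
    return evidence

findings = [
    Finding("CWE-89", 0.9, "--- fix sqli", True),
    Finding("CWE-327", 0.6, "--- fix weak hash", False),  # tests fail, dropped
]
print(run_pipeline(findings, approve=lambda f: True))
```

The key design choice mirrored here is that the agent only proposes: every diff must pass its paired tests and a human gate before it becomes evidence.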
As AI-driven security moves from concept to implementation, these tools become part of everyday delivery: analyzing code, proposing changes, and validating results. Automating fixes inside the SDLC reinforces both application quality and resilience, from vulnerability detection to QA automation.
We integrate Aardvark-class agents with QA and AppSec to align with our ISO 27001 certification, GDPR, SOC 2, and DORA while maintaining clear traceability across changes.
This operating model keeps auditors satisfied, budgets predictable, and release trains on schedule. One essential consideration for AI tools is cost control: set agent cost caps in tokens, ideally per repository.
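The per-repository token cap can be enforced with a small budget gate in the pipeline. A minimal sketch, assuming hypothetical repository names and cap values; nothing here is a real Aardvark or OpenAI interface.

```python
# Minimal sketch of per-repository token cost caps for an AI agent.
# Repo names and cap values are illustrative assumptions.

class TokenBudget:
    def __init__(self, caps):
        self.caps = dict(caps)              # repo -> token cap for the period
        self.spent = {repo: 0 for repo in caps}

    def charge(self, repo, tokens):
        """Record agent token usage for a repo. Returns False once the
        charge would exceed the cap, so the pipeline can pause the agent
        instead of overspending."""
        if self.spent[repo] + tokens > self.caps[repo]:
            return False
        self.spent[repo] += tokens
        return True

budget = TokenBudget({"payments-service": 2_000_000, "legacy-portal": 500_000})
print(budget.charge("payments-service", 1_500_000))  # True: within the cap
print(budget.charge("payments-service", 600_000))    # False: would exceed it
```

Keeping the caps per repository, rather than global, lets high-risk modules get a larger agent budget while legacy repos stay on a tight leash.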
Dave Aitel, a member of the technical staff at OpenAI, goes into detail on the company’s new security product, Aardvark, below:
Below are Dave Aitel’s recommendations for integrating AI into the SDLC, focusing on the practical implications for software development and cybersecurity. Aitel argues that integrating AI tools like Aardvark into the SDLC is moving from a competitive advantage to a necessity driven by economic reality.
The goal is not merely to find a high volume of low-quality bugs, but to apply intelligence to critical problems. The question remains whether AI can detect and mitigate zero-day vulnerabilities at all.
A software company that adopts a tool like Aardvark should make the process easier for its developers and maintain a good relationship with the development teams.
Aitel’s findings on Solidity suggest that specialized, high-risk code should be among the first to be fed into Aardvark, or alternatively, into your AI SDLC pipeline.
High-Risk, High-Reward: Given the AI’s success in finding bugs in Solidity, any company working with smart contracts or similarly complex, high-value code should run it through an AI analysis tool as a critical pre-deployment check.
| Area | Recommendation | Rationale/Benefit |
| --- | --- | --- |
| Strategy | Make AI Code Quality a Core Requirement. | Reduces overall system instability (crashes, errors) and security risk. |
| Economics | Budget for AI Tokens/Compute. | The cost of AI analysis is less than the cost of undetected bugs/downtime. |
| Implementation | Integrate Continuous Monitoring with an Automated Validator. | Counters “software entropy” (1-2% of commits introduce flaws) and maintains high signal-to-noise ratio. |
| Vulnerability Focus | Target Logic Flaws and Crypto Code. | AI excels at these complex areas that are difficult for human review and traditional tools. |
| Process Control | Maintain Human Control over Deployment. | A human must validate all fixes/patches to ensure 100% accuracy before merging. |
| Culture | Prioritize Confidentiality and Help. | Ensures AI is seen as a supportive tool, not a public shaming mechanism for developers. |
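The “software entropy” row above can be made concrete with a back-of-the-envelope calculation. The commit volume below is an illustrative assumption; the 1–2% flaw rate comes from the table.

```python
# Back-of-the-envelope check of the "software entropy" claim: if 1-2% of
# commits introduce flaws, a busy team accumulates new flaws continuously,
# which is why the table recommends continuous monitoring with an
# automated validator. Commit volume is an assumed example figure.

def expected_new_flaws(commits_per_day, flaw_rate, days):
    """Expected count of newly introduced flaws over a period."""
    return commits_per_day * flaw_rate * days

# An assumed 50 commits/day at a 1.5% flaw rate over a 90-day quarter:
print(expected_new_flaws(50, 0.015, 90))  # 67.5 new flaws per quarter
```

Even at the low end of the range, the flaw count grows linearly with commit volume, so a one-off scan decays in value almost immediately.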
The future is not one of no vulnerabilities, but one of two internets, as Bruce Schneier predicted. The first will be the “new” internet, composed of code that is “born secure,” continuously validated by AI agents from its first commit.
The second will be the “legacy” internet, composed of the billions of lines of existing code that are not easily scanned or patched by these new agents. This legacy code becomes the primary, undefended attack surface for offensive AI, creating a systemic risk that will define cybersecurity in the next decade. TINQIN’s goal is to explore all the tools and techniques that help organizations inventory, defend, and ring-fence this newly vulnerable legacy estate.