HARDENED Cybersecurity Intelligence | Issue No. 004 · March 23, 2026 · Monday Deep Dive · hardened.news |
|
| > The signal. Not the noise. — For teams that defend. |
|
| Enterprise | Cloud & DevOps | Developers | End Users |
|
| 01 — // Lead Story — Deep Dive |
|
|
AI vs AI: When Agents Attack
An autonomous AI agent breached McKinsey’s internal chatbot in under two hours. A popular agent platform’s marketplace is distributing malware. Core AI infrastructure has critical unpatched flaws. This week’s theme: the tools organisations use to deploy AI are themselves under attack.
| Gate 1 — Active Exploitation | Gate 2 — Blast Radius |
On February 28, security firm CodeWall ran a controlled assessment against McKinsey’s internal AI chatbot, Lilli. The test subject was not a human penetration tester — it was an autonomous AI agent, pointed at the target with no credentials, no insider access, and no human guidance. Within approximately two hours, the agent had achieved full read-write access to the production database.
The attack chain was straightforward. The agent discovered publicly exposed API documentation, identified 22 unauthenticated endpoints, and exploited unparameterised JSON inputs to execute SQL injection. Read-write access meant an attacker could, in theory, silently rewrite Lilli’s system prompts — a prompt poisoning scenario that could manipulate the outputs seen by McKinsey’s 43,000+ employees without any visible indicator of compromise.
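The unparameterised-input flaw class is simple to reproduce. A minimal sketch of why string-built SQL falls to injection while a bound parameter does not (the table, field names, and payload are invented for illustration, not Lilli's actual schema):

```python
import sqlite3

# In-memory stand-in for a production database (invented schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (user_id TEXT, body TEXT)")
conn.execute("INSERT INTO messages VALUES ('u1', 'hello'), ('u2', 'secret')")

def fetch_messages_unsafe(user_id: str):
    # String interpolation: the attacker's input becomes SQL syntax.
    query = f"SELECT body FROM messages WHERE user_id = '{user_id}'"
    return conn.execute(query).fetchall()

def fetch_messages_safe(user_id: str):
    # Bound parameter: the driver treats the input as a literal value.
    return conn.execute(
        "SELECT body FROM messages WHERE user_id = ?", (user_id,)
    ).fetchall()

payload = "u1' OR '1'='1"
print(len(fetch_messages_unsafe(payload)))  # 2: injection returns every row
print(len(fetch_messages_safe(payload)))    # 0: payload matched as a literal
```

The fix costs one placeholder character per parameter, which is why unparameterised JSON-to-SQL paths in a 2026 production system draw so much criticism.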
A note on the disputed data claims. CodeWall reported that 46.5 million messages, 728,000 files, 57,000 user accounts, and 95 system prompts were “accessible” through the compromised database. McKinsey’s third-party forensic investigation found no evidence that any data was actually accessed by unauthorised parties. Security analyst Edward Kiledjian flagged that CodeWall’s framing conflates “theoretically reachable” with “verified as exfiltrated” — a meaningful distinction. What is not disputed: the unauthenticated endpoints were real, the SQL injection worked, and read-write production access was achieved. McKinsey patched all endpoints, took the development environment offline, and blocked public API documentation within hours of disclosure.
The McKinsey assessment is a proof of concept for a broader problem: organisations are deploying internal AI systems faster than they are securing them. And the threat is no longer limited to human attackers working through a checklist. Autonomous agents can discover, chain, and exploit vulnerabilities at machine speed — which means the window between “deployed” and “compromised” is shrinking.
Threat 01 — Agent Platform Compromise
OpenClaw: Critical Flaws and a Poisoned Marketplace
OpenClaw (250K+ GitHub stars) carries two critical CVEs. CVE-2026-22172 (CVSS 9.9) lets clients self-declare admin scopes with no server-side verification: a complete authorisation bypass. CVE-2026-25253 (CVSS 8.8) enables authentication token theft via the gatewayUrl parameter, leading to remote code execution. Separately, researchers found 1,184 malicious skills in the OpenClaw marketplace (up from 341 in the initial scan) distributing keyloggers on Windows and Atomic Stealer on macOS, a campaign dubbed “ClawHavoc.” SecurityScorecard’s STRIKE team found 135K+ instances exposed to the public internet, with 15K+ directly vulnerable to RCE. CNCERT issued two alerts (March 8 and 10); Microsoft’s Security Blog published a warning on February 19. |
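The authorisation-bypass class behind CVE-2026-22172 reduces to one rule: scopes must come from the server's own records, never from the request. A minimal sketch (tokens, scope names, and request shape are invented, not OpenClaw's actual API):

```python
# Server-side token store: the only legitimate source of granted scopes.
# Token values and scope names are hypothetical.
TOKEN_SCOPES = {"tok-reader": {"read"}, "tok-admin": {"read", "admin"}}

def authorise_unsafe(request: dict) -> bool:
    # Trusts whatever scopes the client claims for itself -- the bypass.
    return "admin" in request.get("scopes", [])

def authorise_safe(request: dict) -> bool:
    # Looks the token up server-side; client-declared scopes are ignored.
    granted = TOKEN_SCOPES.get(request.get("token"), set())
    return "admin" in granted

# A read-only token that simply claims admin scope in the request body.
forged = {"token": "tok-reader", "scopes": ["admin"]}
print(authorise_unsafe(forged))  # True: self-declared admin accepted
print(authorise_safe(forged))    # False: server-side lookup rejects it
```

Any field the client can set is attacker-controlled; authorisation decisions built on such fields are decorative.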
Threat 02 — AI Infrastructure Stack
Bedrock, LangSmith, SGLang: Three Unrelated Flaws, One Pattern
Three separate vulnerabilities across the AI tooling stack reveal a common gap: trust boundaries are weak or absent. Amazon Bedrock AgentCore Code Interpreter allows outbound DNS queries that enable sandbox escape to interactive shells (CVSS 7.5, no CVE assigned; AWS considers this “intended functionality” and declines to patch). LangSmith had a URL parameter injection flaw (CVE-2026-25750, CVSS 8.5) enabling token theft and account takeover, discovered by Miggo Security and patched in December 2025. SGLang, an open-source LLM serving framework, has two CVSS 9.8 flaws (CVE-2026-3059 and CVE-2026-3060) involving deserialisation of untrusted objects leading to remote code execution, plus CVE-2026-3989 (CVSS 7.8); all three were discovered by Orca Security researcher Igor Stepansky and remain unpatched. |
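Deserialisation of untrusted objects is dangerous because, in formats like Python's pickle, "loading data" can execute code. A hedged sketch of the bug class (the SGLang CVEs involve this class; the framework's actual wire format may differ, and the Payload class and mark_executed helper here are invented):

```python
import json
import pickle

# Side-effect flag standing in for arbitrary attacker code
# (a real exploit would spawn a shell instead).
EXECUTED = {"flag": False}

def mark_executed():
    EXECUTED["flag"] = True
    return "attacker result"

class Payload:
    def __reduce__(self):
        # pickle records this callable and invokes it on load.
        return (mark_executed, ())

blob = pickle.dumps(Payload())  # what an attacker sends over the wire
pickle.loads(blob)              # merely loading the bytes runs mark_executed()
print(EXECUTED["flag"])         # True: deserialisation executed code

# A schema-constrained format can only yield inert data or reject the input.
try:
    json.loads(blob.decode("latin-1"))
except json.JSONDecodeError:
    print("json.loads refused the blob: data, not code")
```

The general control: never feed untrusted bytes to a deserialiser that can construct arbitrary objects; use a data-only format with schema validation at the trust boundary.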
Threat 03 — AI in Consumer Software
Chrome Gemini Panel: Privilege Escalation via Extensions
CVE-2026-0628 (CVSS 8.8): when Chrome’s Gemini AI loaded in the side panel rather than a standard tab, it ran with permissions far beyond what the panel should have had, effectively gaining the same trust level as a full browser process. Malicious extensions with only basic declared permissions could exploit this gap to access the victim’s camera and microphone, capture screenshots of any open website, and read local files. Discovered by Gal Weizman at Palo Alto Networks Unit 42. No confirmed in-the-wild exploitation, but a public proof of concept exists. Patched in Chrome 143.0.7499.192. |
The pattern across all three threat vectors is the same: AI systems are being deployed with insufficient authentication, overly permissive trust boundaries, and minimal supply chain verification. An autonomous agent exploiting an unsecured API. A marketplace distributing malware through its official skill registry. A cloud provider classifying a sandbox escape as intended behaviour. A browser embedding an AI panel that inherits privileges it was never meant to have.
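"Treat skills like dependencies" has a concrete minimum: pin a content hash for every marketplace artefact and refuse to install anything that does not match, exactly as lockfiles do for package managers. A minimal sketch (the skill name, bytes, and pinning scheme are invented, not OpenClaw's actual mechanism):

```python
import hashlib

# Integrity pins recorded at vetting time (hypothetical skill registry).
PINNED = {
    "pdf-summariser": hashlib.sha256(b"trusted skill bytes").hexdigest(),
}

def install_skill(name: str, artefact: bytes) -> bool:
    # Recompute the digest of what the marketplace actually served.
    digest = hashlib.sha256(artefact).hexdigest()
    if PINNED.get(name) != digest:
        # A swapped or trojaned artefact fails closed.
        raise ValueError(f"skill {name!r} failed integrity check")
    return True

print(install_skill("pdf-summariser", b"trusted skill bytes"))  # True
try:
    install_skill("pdf-summariser", b"trojaned skill bytes")
except ValueError as exc:
    print(exc)  # integrity check blocks the swapped artefact
```

Hash pinning does not vet a skill's behaviour, but it does guarantee that what ran your review is what gets installed, which alone would have blunted a registry-swap campaign like ClawHavoc.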
// Framework: NIST SP 800-218A — Secure Software Development for Generative AI
| [✓] | Secure AI API endpoints. NIST 800-218A extends the SSDF with AI-specific controls — starting with authentication on every endpoint that touches model inputs or outputs. The McKinsey finding is a textbook failure here. |
| [✓] | Verify supply chain integrity for AI plugins and skills. The OpenClaw marketplace campaign shows that AI agent registries require the same vetting as package managers. Treat skills like dependencies. |
| [✓] | Enforce execution sandboxing for AI workloads. When a cloud provider classifies a sandbox escape as “intended functionality,” your compensating controls are your only defence. Restrict outbound network access from AI execution environments. |
| [✓] | Define trust boundaries for embedded AI features. Browser-integrated AI panels, IDE copilots, and enterprise chatbots all inherit permissions from their host environment. Map those inherited privileges explicitly. |
Primary source: NIST SP 800-218A →
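For the sandboxing item above, the real control belongs at the network layer (security groups, an egress proxy, DNS filtering), but the policy itself is simple: default-deny outbound, allowlist the few destinations the workload needs. An in-process Python illustration of that default-deny shape (hostnames invented; this monkeypatch is a teaching sketch, not a production control):

```python
import socket

# Destinations this AI execution environment is allowed to reach (invented).
ALLOWED_HOSTS = {"api.internal.example"}

_real_create_connection = socket.create_connection

def guarded_create_connection(address, *args, **kwargs):
    host, _port = address
    if host not in ALLOWED_HOSTS:
        # Default-deny: anything off the allowlist is refused before any
        # packet leaves -- the choke point a DNS-tunnelling escape needs open.
        raise PermissionError(f"egress to {host!r} blocked by policy")
    return _real_create_connection(address, *args, **kwargs)

socket.create_connection = guarded_create_connection

try:
    socket.create_connection(("attacker.example", 53), timeout=1)
except PermissionError as exc:
    print(exc)  # egress to 'attacker.example' blocked by policy
```

When the provider will not close the hole, this policy, enforced outside the sandbox where the workload cannot unhook it, is the compensating control the checklist refers to.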
|
| 02 — // What This Means for Canadian Organisations |
Regulatory exposure across PIPEDA, OSFI E-23, and Bill C-8 (successor to C-26) for AI deployments with insufficient access controls
Every story in this issue maps directly to Canadian regulatory obligations. An internal AI chatbot with unauthenticated API endpoints that could expose employee data is a PIPEDA accountability failure — the organisation deploying the system bears responsibility for safeguarding personal information regardless of whether a breach is confirmed or merely “theoretically reachable.” PIPEDA’s Breach of Security Safeguards Regulations make no exception for AI systems: if a breach creates a real risk of significant harm, the organisation must report to the Privacy Commissioner and notify affected individuals regardless of the technology involved.
Framework 1 — Federally Regulated Financial Institutions
OSFI E-23 — Model Risk Management
OSFI E-23 (effective May 1, 2027) will require federally regulated financial institutions to maintain inventories of AI models, assess risk at deployment, and conduct ongoing monitoring. The OpenClaw and Bedrock findings demonstrate exactly why: an AI agent platform with a compromised marketplace, or a cloud execution environment with no sandbox boundaries, would fail E-23’s model validation and third-party risk requirements.
The advisory: Financial institutions using AI agent platforms should audit their marketplace skill inventories and execution sandbox configurations now, before E-23’s validation requirements take effect.
Primary source: OSFI E-23 Guideline → |
Framework 2 — Critical Infrastructure
Bill C-8 (Successor to C-26) — AI in Designated Operators
Bill C-26 passed both chambers but died when Parliament was prorogued in January 2025, before receiving Royal Assent. Its critical infrastructure cybersecurity provisions were reintroduced as Bill C-8 in the 45th Parliament and are currently in committee review. The bill designates telecommunications, energy, finance, and transportation as critical infrastructure sectors subject to cybersecurity obligations, and empowers the Governor in Council to direct operators to take specific actions to protect critical cyber systems. Organisations in these sectors deploying AI agent platforms with known critical vulnerabilities face potential compliance exposure once the bill receives Royal Assent.
The advisory: Even before Royal Assent, designated operators running AI agent platforms should verify they are on patched versions and have removed or audited any third-party marketplace skills. The direction of travel is clear.
Primary source: Bill C-26/C-8 Legislative Info → |
|
| 03 — // On Our Radar + Patch Priority |
|
// On Our Radar — Not Yet at Critical Threshold
| → | NIST SP 800-218A adoption tracking: Published in final form, but enterprise adoption is still early. Watch for CCCS alignment guidance and whether OSFI references it in E-23 supplementary materials. NIST → |
| → | AI agent marketplace security standards: No industry-wide vetting standard exists for AI agent skills or plugins. The OpenClaw ClawHavoc campaign may accelerate work here. Expect vendor-led proposals before any formal standard emerges. |
| → | AWS Bedrock sandbox policy: BeyondTrust’s finding that AWS classifies DNS-based sandbox escape as intended behaviour sets a concerning precedent. Monitor for AWS policy changes or compensating control guidance. |
|
| // Patch Priority — This Fortnight |
| P1 — NOW | CVE-2026-22172 (OpenClaw, CVSS 9.9) — Authorisation bypass. Patch to v2026.3.12. | DevOps |
| P1 — NOW | CVE-2026-3059/3060 (SGLang, CVSS 9.8) — RCE via object deserialisation. NO PATCH — restrict network access. | ML Eng |
| P2 — Week | CVE-2026-25253 (OpenClaw, CVSS 8.8) — Auth token theft leading to RCE. Patch to v2026.1.29. | DevOps |
| P2 — Week | CVE-2026-0628 (Chrome Gemini, CVSS 8.8) — Privilege escalation via extensions. Update to Chrome 143.0.7499.192. | All |
| P3 — Month | CVE-2026-25750 (LangSmith, CVSS 8.5) — Token theft via URL injection. Patched Dec 2025 — verify your instance. | ML Eng |
|
Editor’s Source Note — Issue #004: McKinsey/Lilli assessment sourced from The Register, Inc., Outpost24, The Stack, SecurityWeek, NeuralTrust, BankInfoSecurity, and The Decoder. CodeWall’s data claims (46.5M messages, 728K files) are disputed by McKinsey’s third-party forensics; both positions presented. Edward Kiledjian’s analysis cited for the “theoretically reachable vs. verified” distinction. OpenClaw CVEs sourced from NVD, TheHackerWire, The Hacker News, Dark Reading, SecurityWeek, SonicWall, SOCRadar, Microsoft Security Blog, SCMP, TechRadar, eSecurity Planet, and Particula. ClawHavoc malware campaign (1,184 malicious skills) and STRIKE team data (135K+ exposed instances) from SecurityScorecard. CNCERT alerts confirmed (March 8 and 10). Amazon Bedrock sandbox escape from BeyondTrust research. LangSmith CVE-2026-25750 from Miggo Security. SGLang CVEs from Orca Security researcher Igor Stepansky, confirmed via The Hacker News and SC Media. Chrome Gemini CVE-2026-0628 credited to Gal Weizman, Palo Alto Networks Unit 42. NIST SP 800-218A referenced from NIST CSRC. Canadian regulatory citations verified against primary government sources: OSFI E-23 (osfi-bsif.gc.ca), PIPEDA Breach of Security Safeguards Regulations (laws-lois.justice.gc.ca), Bill C-26/C-8 (parl.ca). Note: Bill C-26 died on prorogation January 2025; its provisions were reintroduced as Bill C-8 in the 45th Parliament. HARDENED has no commercial relationship with any vendor or security firm mentioned in this issue. |
|
HARDENED | HARDENED is published for general informational and educational purposes. All threat data is sourced from publicly available security research and cited accordingly. This newsletter does not constitute professional security advice. Security configurations and threat landscapes vary by organisation. Consult a qualified security professional for implementation guidance specific to your environment. All data as of March 23, 2026. How we work: HARDENED uses AI agents for research, drafting, and automation. Every issue is reviewed by humans before publication. If you spot an error, reply directly — we correct the record promptly. Sources: The Register, Inc., Outpost24, The Stack, SecurityWeek, NeuralTrust, BankInfoSecurity, The Decoder, NVD, TheHackerWire, The Hacker News, Dark Reading, SonicWall, SOCRadar, Microsoft Security Blog, SCMP, TechRadar, eSecurity Planet, Particula, SecurityScorecard, CNCERT, BeyondTrust, Miggo Security, Orca Security, SC Media, Palo Alto Networks Unit 42, NIST CSRC, OSFI, Parliament of Canada hardened.news |