AI-driven vulnerability discovery & exploitation capabilities
Impact: Unauthenticated remote → full root access
Stack buffer overflow: 304 bytes written past a 128-byte buffer. No stack canary: the buffer is declared as int32_t[32], so plain -fstack-protector does not instrument the function. No KASLR: the kernel load address is predictable. Info leak: an unauthenticated NFSv4 EXCHANGE_ID call provides the host UUID and boot time needed for handle creation.
Exploit: 6-packet ROP chain appends attacker SSH key to /root/.ssh/authorized_keys. Full autonomous discovery and exploitation in several hours.
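The missing canary can be illustrated with a simplified model of how stack-protector modes choose which functions to instrument. This is a sketch, not compiler code: it assumes plain -fstack-protector only covers functions with character arrays of roughly 8 bytes or more, while -fstack-protector-strong covers any function with a local array. The `needs_canary` helper and its parameters are hypothetical, and real compilers also consider cases such as alloca and address-taken locals.

```python
CHAR_TYPES = {"char", "signed char", "unsigned char"}

def needs_canary(mode, elem_type, length, elem_size, ssp_buffer_size=8):
    """Simplified model: does a function with one local array get a canary?"""
    if mode == "strong":
        # -fstack-protector-strong: any local array qualifies.
        return length > 0
    if mode == "plain":
        # -fstack-protector: only sufficiently large character arrays qualify
        # (the exact threshold semantics are approximated here).
        return elem_type in CHAR_TYPES and length * elem_size >= ssp_buffer_size
    raise ValueError(f"unknown mode: {mode}")

# The 128-byte buffer from the text, declared as int32_t[32].
print(needs_canary("plain", "int32_t", 32, 4))   # False
print(needs_canary("strong", "int32_t", 32, 4))  # True
# The same 128 bytes declared as a char array would be covered in plain mode.
print(needs_canary("plain", "char", 128, 1))     # True
```

Under this model the element type, not the buffer size, decides whether the plain mode adds a canary.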
Impact: Remote kernel crash via crafted SACK packet
Two bugs in SACK hole tracking: (1) the code validates the end of a SACK range but not its start; (2) if a SACK block deletes the only hole and also triggers an append, the code writes through a NULL pointer. The exploit relies on signed integer overflow in the TCP sequence comparison (int)(a - b) < 0. Placing the SACK start ~2³¹ away overflows the sign bit in both comparisons, satisfying a condition that should be impossible.
Discovery cost: <$50 for specific run. Total $20k for 1000 runs finding dozens of vulnerabilities.
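The sign-bit trick in the sequence comparison can be reproduced with plain 32-bit arithmetic. The sketch below emulates the C idiom (int)(a - b) < 0 in Python; the names `hole_start`, `snd_max`, and `sack_start` and their values are hypothetical stand-ins chosen for illustration, not fields taken from the affected kernel.

```python
def to_int32(x):
    """Reinterpret the low 32 bits of x as a signed 32-bit integer."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x >= 0x80000000 else x

def seq_lt(a, b):
    """Emulates the C idiom (int)(a - b) < 0 on 32-bit sequence numbers."""
    return to_int32(a - b) < 0

def seq_gt(a, b):
    """Emulates (int)(a - b) > 0."""
    return to_int32(a - b) > 0

hole_start = 1000  # hypothetical start of a tracked hole
snd_max = 2000     # hypothetical highest sequence number sent

# Near the window, no value is both below hole_start and above snd_max.
nearby = any(seq_lt(s, hole_start) and seq_gt(s, snd_max) for s in range(4000))
print(nearby)  # False

# A start placed about 2**31 away satisfies both comparisons at once.
sack_start = (snd_max + 2**31 - 500) & 0xFFFFFFFF
print(seq_lt(sack_start, hole_start), seq_gt(sack_start, snd_max))  # True True
```

Both subtractions land on opposite sides of the sign bit, so the value appears to be below the hole and above the highest sent byte simultaneously.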
Impact: Out-of-bounds heap write (difficult to exploit)
Slice counter is a 32-bit int, but the tracking table uses 16-bit entries. The table is initialized with memset(..., -1, ...), so every 16-bit entry holds 65535, which serves as the sentinel. If an attacker creates a frame with 65,536 slices, slice #65535 collides with the sentinel. The decoder treats a nonexistent neighbor as real and writes out of bounds.
Significance: The underlying bug (-1 sentinel) has existed since 2003 and became exploitable in a 2010 refactor. It was missed by all fuzzers for 16 years, despite FFmpeg being one of the most thoroughly fuzzed projects.
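The sentinel collision comes down to a 32-bit counter being compared against 16-bit table entries. The sketch below models that under stated assumptions: `make_table` and `neighbor_in_slice` are hypothetical helpers, and the comparison logic is an illustration of the idea rather than the decoder's actual code.

```python
import array

def make_table(n_entries):
    """Table of unsigned 16-bit entries with every byte 0xFF, like memset(..., -1, ...)."""
    table = array.array("H")
    table.frombytes(b"\xff" * (table.itemsize * n_entries))
    return table

def neighbor_in_slice(table, idx, slice_num):
    """Assumed check: a neighbor is real if its entry matches the slice number.

    The slice counter is wider than the entry, so only its low 16 bits
    take part in the comparison.
    """
    return table[idx] == (slice_num & 0xFFFF)

table = make_table(4)                      # no entry has been assigned yet
print(table[0])                            # 65535
print(neighbor_in_slice(table, 0, 7))      # False: the entry is still the sentinel
print(neighbor_in_slice(table, 0, 65535))  # True: slice 65535 matches the sentinel
```

For every slice number below 65535 the untouched entry reads as "no neighbor"; only the last representable value makes an unassigned entry look assigned.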
Impact: Malicious guest → host memory write
Production VMM written in a memory-safe language, with the vulnerability in an unsafe operation (e.g. Rust unsafe, Java JNI/sun.misc.Unsafe, Python ctypes). VMMs must interact with hardware using raw pointers. Easy DoS; potentially exploitable as part of a chain.
SHA-3: b63304b28375c023abaa305e68f19f3f8ee14516dd463a72a2e30853
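The SHA-3 values in this document act as commitments: publishing a digest now allows the full details to be released later and checked against it. A minimal sketch of that check follows, assuming SHA3-224 (suggested by the 56-hex-character digest length, not stated in the source) and a hypothetical placeholder report in place of the undisclosed one.

```python
import hashlib
import hmac

def commit(report: bytes) -> str:
    """Return the hex digest that would be published ahead of disclosure."""
    return hashlib.sha3_224(report).hexdigest()

def verify(report: bytes, commitment: str) -> bool:
    """Check a later-released report against the published digest."""
    return hmac.compare_digest(commit(report), commitment.lower())

# Hypothetical placeholder; the real committed documents are not public.
report = b"placeholder vulnerability write-up"
published = commit(report)

print(len(published))                      # 56
print(verify(report, published))           # True
print(verify(report + b"!", published))    # False
```

Any change to the released text, even one byte, produces a different digest, which is what makes the earlier publication binding.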
Impact: TLS certificate authentication bypass, certificate forgery
Additional crypto bugs (SHA-3):
05fe117f9278cae788601bca74a05d48251eefed8e6d7d3dc3dd50e0
8af3a08357a6bc9cdd5b42e7c5885f0bb804f723aafad0d9f99e5537
eead5195d761aad2f6dc8e4e1b56c4161531439fad524478b7c7158b
Impact: Local unprivileged → root via 2-4 vulnerability chains
Example 4-vuln chain: (1) Bypass KASLR, (2) Read kernel struct, (3) Write to freed heap object, (4) Heap spray to place struct at write location → grant root permissions. Most unpatched. Recent example: e2f78c7ec165.
SHA-3 commitments:
b23662d05f96e922b01ba37a9d70c2be7c41ee405f562c99e1f9e7d5
c2e3da6e85be2aa7011ca21698bb66593054f2e71a4d583728ad1615
c1aa12b01a4851722ba4ce89594efd7983b96fee81643a912f37125b
6114e52cc9792769907cf82c9733e58d632b96533819d4365d582b03
Impact: Multiple complete authentication bypasses
Method: Reverse-engineered from stripped binaries
SHA-3:
d4f233395dc386ef722be4d7d4803f2802885abc4f1b45d370dc9f97
f4adbc142bf534b9c514b5fe88d532124842f1dfb40032c982781650

| Metric | Value | Context |
|---|---|---|
| Zero-days discovered | 1000s | 99% unpatched |
| Oldest vulnerability | 27 years | OpenBSD TCP SACK |
| Firefox exploit improvement | 90.5x | 181 vs 2 (Opus 4.6) |
| OSS-Fuzz tier 5 hijacks | 10 | Opus 4.6: 0 |
| Human validator agreement | 89% | Exact severity match |
| N-day weaponization | 40/100 | 2024-2025 Linux CVEs |
Security capabilities emerged from general code and reasoning improvements, not targeted training. The same improvements that make the model better at patching also make it better at exploiting.
Language models enable systematic file-by-file review. The FreeBSD vulnerability survived 17 years not because it was subtle, but because human auditors skip files, assuming "someone checked that." Models don't make that assumption.
Complex multi-stage exploits (ROP chains, packet splitting, heap spraying) that required weeks of expert work now complete in hours. Friction-based defenses are weakening against AI-assisted adversaries.
Disclosed and patched vulnerabilities become exploitable in hours. The patch itself is a roadmap to the bug. The window between disclosure and mass exploitation is collapsing.
Still effective: KASLR (requires info leak), strong stack canaries (-fstack-protector-strong), W^X
Weakening: Defense-in-depth measures relying on tedium rather than impossibility
-fstack-protector-strong, enable full KASLR, verify W^X
Source: Anthropic Red Team Technical Report (April 7, 2026)