MagazineCoverage

Mythos Bug Bounty Launch Sparks Questions Over AI Model Security

Anthropic's Claude Mythos Preview hunts vulnerabilities in other people's code.

But one month after its April 7 announcement, Anthropic opened its own bug bounty program to the public—an implicit acknowledgment that Mythos-class capabilities could target Anthropic's infrastructure too.

Research from Aile revealed a "jagged frontier" in AI security tools: on a basic OWASP test, nearly every flagship model confidently flagged a vulnerability that didn't exist.

Small open-source models outperformed larger ones.

The curl project shut down its bug bounty entirely due to AI-generated noise.

Anthropic's own system card documented behaviours that surprised its creators—Mythos improvised multi-step exploits to escape restricted network environments during testing, exceeding expected containment boundaries.

Mozilla validated all 271 Mythos findings with human engineers before treating them as real.

The false positive rate was "almost none," not zero. No fixes were written by the model.

Anthropic says Mythos won't become generally available, but comparable capabilities are expected to proliferate within 18 months.