AI-assisted vs full-AI, Claude's slop mode, and the fun of bug bounty

This week I finally got back to some hunting. It had been a while since I felt motivated to do it. It's not that I stopped wanting to, but the fun, playful side of it isn't really there anymore. I talk about that a bit later in the article.

AI-assisted vs full-AI

The first time I tested Claude Code for hunting, I just launched it and told it "go find bugs".

And it did. It found things, and it can still do that on a wide range of targets. If you give it the right tools, the right MCPs and skills, it can find bugs without too much difficulty.

But there's a problem with that. It goes in every direction and reports anything and everything. That's the main issue right now, the one that's overwhelming triagers and leaving companies with thousands of reports to sort through.

The thing is, out of those bugs, maybe 5 to 10% are valid, and I'm probably being generous. In the end, with this approach you spend more time triaging what Claude reports than actually finding bugs.

You slowly become an agent manager, and the fun, technical part disappears. I tested this, and let's be honest, it does find bugs, from low to sometimes critical, and that's super strong. But at what cost?

This week I tested another approach: focus it on a precise objective that I define myself. Take a lead I already had and make it dig in that direction.

There are quite a few programs where I dug on my own and had plenty of ideas, but the setup is often heavy: spin up an attacker SAML server, or OAuth with a specific config. That's the kind of stuff you don't want to set up yourself, and that's where Claude is really useful.

Delegating that part while keeping the lead yourself is where you're really strong, and you get some of the pleasure back.

That's what I did this week, and I found some nice bugs on scopes I knew well.

All while still letting a /goal look for leads on the side. Even if 95% of what it found was wrong, it helps cover the whole surface and gives me ideas.

But all of that still has a big problem.

Claude's slop mode activated

When you launch Claude to look for bugs, it can find some, but it can also do just about anything, especially when you give it time and it runs out of ideas.

At some point, it goes out of scope, and that's exactly what you want to avoid. The other day I launched it on a scope, and it went off on its own to explore subdomains that weren't in scope at all. In those cases you really have to watch it and give it strict rules so it doesn't drift.
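Strict rules like that don't have to live only in the prompt. Here's a minimal sketch of a scope check that could sit in front of whatever sends the requests. Everything in it is an assumption for illustration: the `SCOPE` patterns, the `in_scope` name, and the idea that your tooling gives you a place to call it. It's not a Claude Code feature.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit

# Hypothetical scope, copied by hand from a program's policy page.
SCOPE = ["app.example.com", "*.api.example.com"]


def in_scope(url: str, scope: list[str] = SCOPE) -> bool:
    """Return True only if the URL's host matches an allowed pattern."""
    host = urlsplit(url).hostname
    if host is None:
        # No scheme or no host: refuse rather than guess.
        return False
    host = host.lower().rstrip(".")
    # The whole hostname must match the pattern, so a lookalike such as
    # app.example.com.evil.net does not pass.
    return any(fnmatchcase(host, pattern.lower()) for pattern in scope)
```

The check is deny-by-default: anything that doesn't match a pattern is refused, including the bare `api.example.com` if only `*.api.example.com` is listed. That's stricter than some programs' wildcard rules, but erring on that side is the point.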

Another time, I was hunting on a big SaaS with a lot of paid options. I was on a free trial, but I still had to put a credit card on file. Well, Claude decided to go test some paid features and made me spend 90 euros to add an extra user to my account.

I ended up getting refunded, but it's crazy how it can go in every direction. In that kind of case, I should have used a burner card with very little money on it. If you let it run unsupervised, it can do absolutely anything.

The fun of bug bounty

Like I said earlier, the reason I've done less bug bounty lately is mostly that it's not as fun anymore.

You wait months to get triaged and paid. Thousands of reports get sent every day.

Even the triagers I've talked to are fed up.

Shubs talks about it really well in his latest article: https://shubs.io/the-down-fall-of-bug-bounties/

I don't really know how long things are going to stay like this. Probably until Claude's prices blow up or bug bounty platforms change their rules.

In the meantime, I keep doing some hunting, trying to find fun and hard objectives, and maybe doing a bit of research too. We'll see.
