Parasite bots are quietly eating your hosting bill
AI scrapers that never send you a visitor still ship you a bandwidth invoice every month. Here's how to see the damage and stop paying for it.
- bandwidth
- infrastructure
- ai-bots
Your hosting bill has been creeping up and you can't quite explain why. Traffic looks flat. Real user visits are steady. But egress is double what it was last year.
Welcome to the parasite bot era.
What a parasite bot actually is
We split bot traffic into two buckets:
- Citation bots. They read your page and might send you a user later — via a ChatGPT link, a Perplexity citation, a Claude response. These earn their keep.
- Parasite bots. They read your page, train a model on it, and never send you anything back. No traffic, no citation, no attribution. Just a line item on your bandwidth invoice.
The ratio is worse than you think. On a typical content site we see 60% of AI crawler traffic classified as parasite, and that number is climbing.
The math
Let's put real numbers on it. Say you get 500k monthly page views and your pages average 500 KB of HTML + assets:
- Bot share of traffic: 35% of visits (175,000 bot hits)
- Parasite share of bots: 60% of bot hits
- Parasite hits/month: 105,000
- Bandwidth burned: 105,000 × 500 KB ≈ 52.5 GB
- Cost on Vercel (~$0.15/GB egress): ~$7.88/month
- Cost on AWS CloudFront (~$0.08/GB egress): ~$4.20/month
Eight dollars doesn't sound like much. Multiply it by every domain you own, every year, for the rest of your company's life, and it compounds into real money, and it only goes up as models get hungrier.
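The arithmetic above is simple enough to sketch. Here's a minimal calculator; the per-GB rates are assumptions based on the example figures, since actual Vercel and CloudFront egress pricing varies by plan and region:

```python
def parasite_cost(page_views: int, avg_page_kb: float,
                  bot_share: float, parasite_share: float,
                  rate_per_gb: float) -> dict:
    """Estimate monthly bandwidth (GB) and cost wasted on parasite bots."""
    parasite_hits = page_views * bot_share * parasite_share
    gb = parasite_hits * avg_page_kb / 1_000_000  # KB -> GB (decimal)
    return {"hits": parasite_hits, "gb": gb, "cost": gb * rate_per_gb}

# Numbers from the example: 500k views, 500 KB pages, 35% bots, 60% parasite.
vercel = parasite_cost(500_000, 500, 0.35, 0.60, rate_per_gb=0.15)
print(f"{vercel['hits']:,.0f} hits, {vercel['gb']:.1f} GB, ${vercel['cost']:.2f}/mo")
# → 105,000 hits, 52.5 GB, $7.88/mo
```

Swap in your own traffic numbers and your CDN's egress rate to size the problem for one domain, then multiply across your portfolio.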
Why robots.txt doesn't save you
Three problems:
- It's advisory. Well-behaved bots honor it. The ones you most want to block often don't.
- It's static. You update it once and forget it. Meanwhile new crawlers show up every month.
- It's coarse. You can only allow or deny. You can't say "let ChatGPT read but not train" or "allow for citations but block for training."
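The coarseness problem is visible in the file format itself. All robots.txt gives you is allow/deny per user-agent (GPTBot and OAI-SearchBot are real OpenAI crawlers; the comments are our gloss, and there is no directive for "read but don't train"):

```txt
User-agent: GPTBot         # OpenAI's training crawler
Disallow: /

User-agent: OAI-SearchBot  # OpenAI's search crawler, can cite you
Allow: /
```

And again: this only works if the crawler chooses to read the file and honor it.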
You need edge-level enforcement — a layer that inspects every request, checks the user-agent and IP against an always-fresh bot catalog, and drops the parasites before they hit your origin.
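In pseudocode terms, that enforcement layer is a lookup plus a verdict. A minimal sketch, assuming a hypothetical in-memory catalog (a real one would be refreshed continuously and would verify source IP ranges too, since user-agent strings are trivially spoofed):

```python
# Illustrative catalog entries only — not a complete or current list.
BOT_CATALOG = {
    "GPTBot": "parasite",         # trains models, sends nothing back
    "CCBot": "parasite",          # Common Crawl scraper
    "OAI-SearchBot": "citation",  # may send users via citations
    "PerplexityBot": "citation",
}

def classify(user_agent: str) -> str:
    """Return 'parasite', 'citation', or 'human' for a request's user-agent."""
    ua = user_agent.lower()
    for needle, kind in BOT_CATALOG.items():
        if needle.lower() in ua:
            return kind
    return "human"

def handle(user_agent: str) -> int:
    """Drop parasites with a 403 before the request reaches the origin."""
    return 403 if classify(user_agent) == "parasite" else 200
```

The same logic ports directly to an edge middleware (a Cloudflare Worker, a Vercel middleware function) where it runs before your origin ever sees the request.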
What to do this week
Three concrete steps:
- Run the Bandwidth Waste Calculator to size the problem on your own traffic.
- Run the AI Bot Audit to see what your robots.txt is actually doing today.
- Decide whether the number you see is worth a fifteen-minute integration or another year of compounding waste.
The tools are free. The decision is yours.
Next step
Serve llms.txt from 300+ POPs in milliseconds.
Dive into Edge delivery — how it works, what it costs, and what it replaces on your stack.
Read the Edge delivery page

Liked this? Get the next one. One email on Tuesdays. No tracking pixels, no filler.