Web traffic isn’t just humans anymore. Bots — both good and bad — crawl websites at massive scale. Search engines, monitoring tools, and uptime checkers are helpful. But bad bots — scraping content, harvesting emails, or launching automated attacks — can harm performance, skew analytics, and expose security weaknesses.
In this article, we’ll explore how to identify bad bot traffic using:
- IP addresses and IP ranges
- User agent strings
And how to block them at the server or application level.
Why Bot Traffic Matters
Before we dive into code and tools, it’s important to understand the impact of bad bot traffic:
- Skewed analytics and performance metrics
- Server load and increased hosting costs
- Content scraping and intellectual property theft
- Credential stuffing and automated attacks
Many website owners are now taking stronger, more proactive steps against automated activity. For example, a growing number of U.S. site operators have begun blocking non‑U.S. traffic in an effort to reduce bot activity and large‑scale scraping. A detailed analysis of this trend can be found here:
https://www.searchen.com/2025/04/03/u-s-website-owners-increasingly-blocking-non-u-s-traffic-to-combat-bot-activity-and-scraping/
Step 1 — Detecting Bot Traffic
Check Your Analytics
Start with tools like Google Analytics 4:
- Sudden spikes in sessions
- Abnormally high bounce rates or implausible session durations (near zero or unusually long)
- Unusual referral sources
These patterns often signal non‑human traffic.
Analyze Server Logs
Raw logs contain client IP and user agent information:
123.45.67.89 - - [09/Feb/2026:13:45:01 +0000] "GET /index.html HTTP/1.1" 200 5123 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
In this example:
- 123.45.67.89 is the client IP
- The user agent identifies Googlebot
Logs are one of the most reliable sources for determining traffic patterns.
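To get a quick picture of who is hitting your server hardest, you can tally requests per IP and per user agent straight from the log. The following is a minimal Python sketch, assuming a standard combined-format access log at /var/log/nginx/access.log (an example path; point it at your own log):

from collections import Counter
import re

LOG_PATH = "/var/log/nginx/access.log"  # assumed path; adjust for your server

# Standard combined log format:
# IP - user [time] "request" status bytes "referrer" "user agent"
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+ "[^"]*" "([^"]*)"')

ip_counts, ua_counts = Counter(), Counter()

with open(LOG_PATH, encoding="utf-8", errors="replace") as log:
    for line in log:
        match = LINE_RE.match(line)
        if match:
            ip, user_agent = match.groups()
            ip_counts[ip] += 1
            ua_counts[user_agent] += 1

print("Top client IPs:")
for ip, count in ip_counts.most_common(10):
    print(f"{count:>8}  {ip}")

print("Top user agents:")
for ua, count in ua_counts.most_common(10):
    print(f"{count:>8}  {ua}")

The IPs and user agents that dominate these counts are your first candidates for the checks in the next two steps.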
Step 2 — Identify Suspicious IP Addresses and Ranges
Bots often originate from predictable IP blocks, such as data center, hosting provider, or VPN ranges. You can identify them in a couple of ways:
Reverse DNS Lookup
Use dig or an online reverse‑lookup tool to see whether an IP resolves back to a known provider:
dig -x 123.45.67.89 +short
If the hostname is generic or belongs to a known proxy/VPN or hosting provider, the traffic may well be automated. Legitimate crawlers identify themselves here, too: Googlebot, for example, resolves to a hostname on googlebot.com or google.com, which you can confirm with a follow-up forward lookup.
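If you have a longer list of suspect addresses, the same lookup can be scripted. Here is a minimal Python sketch using only the standard library (the sample IPs are placeholders carried over from the examples above):

import socket

# Placeholder addresses pulled from your logs
suspect_ips = ["123.45.67.89", "98.76.54.123"]

for ip in suspect_ips:
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)  # reverse (PTR) lookup
    except OSError:
        hostname = "(no PTR record)"
    print(f"{ip} -> {hostname}")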
Consult IP Reputation Databases
Services such as IPinfo, Spamhaus, and various threat intelligence feeds maintain reputation data for IP addresses and blocks.
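You can also query a DNS-based blocklist (DNSBL) programmatically. The sketch below checks an address against Spamhaus ZEN by reversing its octets and performing a DNS lookup; the address is listed only if the query resolves. (Heavy or commercial use of public DNSBLs is subject to the provider's terms, and queries routed through some public resolvers may be refused.)

import socket

def is_listed(ip, dnsbl="zen.spamhaus.org"):
    # DNSBL convention: reverse the octets and append the blocklist zone,
    # e.g. 123.45.67.89 -> 89.67.45.123.zen.spamhaus.org
    query = ".".join(reversed(ip.split("."))) + "." + dnsbl
    try:
        socket.gethostbyname(query)  # resolves only when the IP is listed
        return True
    except socket.gaierror:
        return False

print(is_listed("123.45.67.89"))  # placeholder IP from the log example above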
Step 3 — Identify Suspicious User Agents
Many bots identify themselves with recognizable, non‑standard user agent strings (though more sophisticated ones spoof real browsers). Some common patterns include:
- bot
- crawler
- spider
- scraper
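A simple case-insensitive match against these patterns will flag most self-identified bots. Keep in mind that legitimate crawlers such as Googlebot and Bingbot also contain "bot", so treat a match as a prompt to investigate (or verify via reverse DNS) rather than an automatic block. A minimal Python sketch with placeholder user agents:

import re

# Case-insensitive patterns commonly seen in automated clients
BOT_PATTERN = re.compile(r"bot|crawler|spider|scraper", re.IGNORECASE)

sample_user_agents = [
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "python-requests/2.31.0",  # automated, but not caught by this pattern alone
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
]

for ua in sample_user_agents:
    label = "bot-like" if BOT_PATTERN.search(ua) else "not flagged"
    print(f"{label:<12} {ua}")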
Step 4 — Block Bad Traffic by IP and User Agent
Nginx Example (IP Range Block)
# Block entire bad IP range
deny 123.45.67.0/24;
deny 98.76.54.123;
Nginx Example (User Agent Block)
if ($http_user_agent ~* (bot|crawler|spider|scraper)) {
return 403;
}
Apache Example (mod_rewrite)
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} (bot|crawler|spider|scraper) [NC]
RewriteRule .* - [F,L]
Blocking in Cloudflare
If your site is behind Cloudflare, you can also block bad traffic at the edge using:
- IP Access Rules
- Bot Fight Mode
- Custom Firewall rules
Step 5 — Monitor and Adjust
Bad bots evolve constantly. Monitoring is essential:
- Watch for new IP spikes
- Log denied requests
- Update patterns as needed
Automated tools like fail2ban or commercial services can help.
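As a lightweight do-it-yourself check, you can periodically scan the access log for addresses that rack up denied (403) responses, which indicates the rules above are catching repeat offenders worth blocking permanently. A minimal Python sketch, assuming the same combined-format log path as earlier and an arbitrary example threshold:

from collections import Counter
import re

LOG_PATH = "/var/log/nginx/access.log"  # assumed path; adjust for your server
DENIED_THRESHOLD = 50                   # arbitrary example threshold

# Capture the client IP and the HTTP status from a combined-format log line
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "[^"]*" (\d{3}) ')

denied = Counter()
with open(LOG_PATH, encoding="utf-8", errors="replace") as log:
    for line in log:
        match = LINE_RE.match(line)
        if match and match.group(2) == "403":
            denied[match.group(1)] += 1

for ip, count in denied.most_common():
    if count >= DENIED_THRESHOLD:
        print(f"{ip} triggered {count} denied requests; consider a permanent block")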
Conclusion
Identifying bad bot traffic by IP range and user agent empowers you to clean up analytics, protect content, and secure your site. With server‑level rules or cloud firewall policies, you can mitigate the majority of automated abuse.