Hi, I have the following rewrite rule in place on one of our staging
sites to redirect bots and malicious scripts to our corporate page:
RewriteCond %{HTTP_USER_AGENT} ^$ [OR]
RewriteCond %{HTTP_USER_AGENT} ^.*(<|>|'|%0A|%0D|%27|%3C|%3E|%00).* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*(HTTrack|clshttp|archiver|loader|email|nikto|miner|python).* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*(winhttp|libwww\-perl|curl|wget|harvest|scan|grab|extract).* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*(Googlebot|SemrushBot|PetalBot|Bytespider|bingbot).* [NC]
RewriteRule (.*) https://guardiandigital.com$1 [L,R=301]
However, it doesn't always seem to work: requests like this one from Googlebot still receive a 200 response instead of being redirected:
66.249.68.6 - - [08/Jul/2024:11:43:41 -0400] "GET /robots.txt HTTP/1.1" 200 343 r:"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" 0/5493 1145/6615/343 H:HTTP/1.1 U:/robots.txt s:200
Rather than changing my rules and then waiting for the condition to recur (i.e., for Googlebot to crawl the site again), I'd like to simulate the request above against my ruleset and see whether it matches. Is this possible?
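For what it's worth, I've been sanity-checking the conditions offline with a quick Python approximation of the User-Agent regexes (re.IGNORECASE stands in for [NC]; this is only a sketch, not mod_rewrite itself), and it indicates that the logged Googlebot UA should match the pattern list:

```python
import re

# Rough offline approximation of the RewriteCond chain above: a request is
# redirected when the User-Agent is empty, contains a suspicious character
# or encoding, or matches one of the bot/tool substrings. The patterns are
# copied from the ruleset; [NC] is modeled with re.IGNORECASE.
BLOCK_PATTERNS = [
    re.compile(r"^$"),
    re.compile(r"(<|>|'|%0A|%0D|%27|%3C|%3E|%00)", re.IGNORECASE),
    re.compile(r"(HTTrack|clshttp|archiver|loader|email|nikto|miner|python)",
               re.IGNORECASE),
    re.compile(r"(winhttp|libwww\-perl|curl|wget|harvest|scan|grab|extract)",
               re.IGNORECASE),
    re.compile(r"(Googlebot|SemrushBot|PetalBot|Bytespider|bingbot)",
               re.IGNORECASE),
]

def would_redirect(user_agent: str) -> bool:
    """Return True if the User-Agent would satisfy the RewriteCond chain."""
    return any(p.search(user_agent) for p in BLOCK_PATTERNS)

ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
print(would_redirect(ua))  # -> True
```

So the UA itself does appear to match, which makes me suspect the problem lies elsewhere in the configuration; that's why I'd like a way to replay the request against Apache's actual rewrite engine.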
Thanks,
Dave