On Thu, Jul 11, 2024 at 4:49 AM Marc <m...@f1-outsourcing.eu> wrote:

> >
> > RewriteCond %{HTTP_USER_AGENT} ^$ [OR]
> > RewriteCond %{HTTP_USER_AGENT} ^.*(<|>|'|%0A|%0D|%27|%3C|%3E|%00).* [NC,OR]
> > RewriteCond %{HTTP_USER_AGENT} ^.*(HTTrack|clshttp|archiver|loader|email|nikto|miner|python).* [NC,OR]
> > RewriteCond %{HTTP_USER_AGENT} ^.*(winhttp|libwww\-perl|curl|wget|harvest|scan|grab|extract).* [NC,OR]
> > RewriteCond %{HTTP_USER_AGENT} ^.*(Googlebot|SemrushBot|PetalBot|Bytespider|bingbot).* [NC]
> > RewriteRule (.*) https://guardiandigital.com/$1 [L,R=301]
> >
> >
> > SetEnvIf user-agent "(?i:GoogleBot)" googlebot=1
> > SetEnvIf user-agent "(?i:SemrushBot)" googlebot=1
> > SetEnvIf user-agent "(?i:PetalBot)" googlebot=1
> > SetEnvIf user-agent "(?i:Bytespider)" googlebot=1
> > SetEnvIf user-agent "(?i:bingbot)" googlebot=1
> >
> >
> >   <RequireAny>
> >         Require ip 1.2.3.4
> >         Require env googlebot
> >   </RequireAny>
> >
>
> I would think that mod_security is more efficient for this
> SecRule REQUEST_HEADERS:User-Agent "xxxx" "id:'13006',phase:2,log,deny,status:200"
>
> Why allow SemrushBot, PetalBot and Bytespider? If they don't give you
> traffic, block them. Better to add entries for Yandex and DuckDuckGo.
> DuckDuckGo is getting better than Google. Maybe start looking for AI
> crawlers too.
>
> > I was also originally trying to associate the rewriterules with the
> > requireany using <If> but then realized I didn't even have to do that -
> > it would just automatically get processed independently. It looks so
> > simple now, but took me a while to make it this simple.
> >
> >
>
> What also helps is blocking these cloud providers; just get their IP ranges:
>
> - Amazon
> - googleusercontent
> - DigitalOcean
> - OVH
>
>
>
> PS. Don't give Google the credit of having the bot variable named after them ;).
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscr...@httpd.apache.org
> For additional commands, e-mail: users-h...@httpd.apache.org


The following bit:

"has to appear in .htaccess because it's processed after the virtualhost
config and any requireall/requireany entries are overridden that already
appear there"

makes no sense. You can just set up your vhost properly to produce the
expected behaviour.
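
As a rough illustration of what that could look like, here is a minimal
sketch that keeps the access rules in the vhost itself. The ServerName,
the DocumentRoot path and the allowed_bot variable name are placeholders
I made up for the example; only the 1.2.3.4 address and the bot names
come from the quoted config. Treat it as one possible layout, not as the
poster's actual setup.

```apache
# Sketch only: ServerName, DocumentRoot and the allowed_bot variable
# are hypothetical placeholders, not values from the thread.
<VirtualHost *:80>
    ServerName www.example.com
    DocumentRoot "/var/www/example"

    # Flag the crawlers that should be let through (mod_setenvif).
    # SetEnvIfNoCase matches the header value case-insensitively.
    SetEnvIfNoCase User-Agent "Googlebot|bingbot" allowed_bot=1

    <Directory "/var/www/example">
        # Ignore .htaccess files in this tree, so the rules below
        # are the only access rules applied here.
        AllowOverride None

        # Grant access to the trusted address or to a flagged crawler.
        <RequireAny>
            Require ip 1.2.3.4
            Require env allowed_bot
        </RequireAny>
    </Directory>
</VirtualHost>
```

If .htaccess has to stay enabled for other directives, a narrower
AllowOverride value that leaves out AuthConfig should still stop Require
lines there from taking effect, since Require belongs to that override
class.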
