Many thanks to all for your contributions.
My conclusion is that the method I currently use would be slow with many bots.
It seems the best option is to use nginx maps:
https://forum.nginx.org/read.php?11,255678,270417#msg-270417
https://community.centminmod.com/threads/blocking-bad-or-aggressive-bots.643
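For anyone reading this later, a minimal sketch of that map approach might look like this, assuming the map lives at http{} level (for example in an include) so it isn't repeated per domain; the bot names and the 444 status are only examples:

# http{} level, shared by all server blocks
map $http_user_agent $bad_bot {
    default                        0;
    ~*(MJ12bot|AhrefsBot|DotBot)   1;   # case-insensitive regex entries
    ~*(SemrushBot|PaperLiBot)      1;
}

server {
    listen 80;
    server_name example.com;

    # one short test per server block instead of a long regex in every vhost
    if ($bad_bot) {
        return 444;   # close the connection without sending a response
    }
}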
I find Naxsi hard to debug. For me, it generated many false positives. YMMV
You could also look at the nginx module naxsi:
https://github.com/nbs-system/naxsi
It gives you more flexibility with regexes and actions.
--
StackStar Managed Hosting Services : https://www.stackstar.com
Shift8 Web Design in Toronto : https://www.shift8web.ca
On Mon, Nov 14, 2016 at 10:04 AM, debilish99 wrote:
I use nginx maps which, depending on user agent, either block, rate limit or
whitelist:
https://community.centminmod.com/threads/blocking-bad-or-aggressive-bots.6433/
As the list gets large, nginx maps just make it easier to manage.
Posted at Nginx Forum:
https://forum.nginx.org/read.php?2,270930,270
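A rough sketch of how that block / rate-limit / whitelist split can be wired up with two maps and limit_req; the zone name, rate and bot names below are made-up placeholders, not taken from the linked thread:

# 2 = block outright, 1 = rate limit, 0 (default) = leave alone / whitelisted
map $http_user_agent $ua_class {
    default                          0;
    ~*(MJ12bot|SemrushBot|AhrefsBot) 2;
    ~*(bot|crawl|spider)             1;
}

# only class-1 requests get a non-empty key, so only they are rate limited
map $ua_class $bot_limit_key {
    default "";
    1       $binary_remote_addr;
}

limit_req_zone $bot_limit_key zone=botlimit:10m rate=30r/m;

server {
    listen 80;
    server_name example.com;

    limit_req zone=botlimit burst=5 nodelay;

    if ($ua_class = 2) {
        return 444;
    }
}

Requests whose limit key is empty are not counted against the zone, which is what makes the per-class rate limiting work.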
Hi there!
That's what I do too, in two different ways:
if ($http_user_agent ~* MJ12bot|SemrushBot) {
    return 403;
}

if ($http_user_agent ~* bot|crawl|spider|tools|java) {
    rewrite ^ http://www.cnrtl.fr/definitio
fwiw,
I use the map approach discussed here.
I've a list of a hundred or so 'bad bots'.
I reply with a 444. Screw 'em.
IMO, the performance hit of blocking them is far less than the performance
havoc they wreak if allowed to (try to) scan your site, &/or the inevitable
flood of crap from you
Comparing strings is CS101. If map is a linear search, that should be something
to improve. I'm assuming you read the code?
On Mon, Nov 14, 2016 at 8:51 AM, wrote:
> I'd be shocked if the map function doesn't use a smart search scheme
> rather than check every item.
>
You're in for a bit of a shock then. It is a linear search :p Curious as to
what you think it should look like instead?
Getting back to the original question: these bots come from an IP space you can
look up via BGP tools, so I take the IP used and add it to my firewall blocking
table. I can go weeks before a new IP gets used.
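The firewall table itself is outside nginx, but if you would rather keep everything in nginx, a rough equivalent of an IP block list can be expressed with the geo module; note this blocks at the HTTP layer rather than in the packet filter like the approach above, and the prefixes shown are RFC 5737 documentation ranges, purely placeholders:

# nginx-level analogue of an IP block table (geo matches $remote_addr by default)
geo $blocked_net {
    default           0;
    192.0.2.0/24      1;
    198.51.100.0/24   1;
}

server {
    listen 80;
    server_name example.com;

    if ($blocked_net) {
        return 444;
    }
}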
Original Message
From: debilish99
Sent: Monday, November 14, 2016 7:04 AM
To: nginx@nginx.org
Reply To: nginx@nginx.org
Subject: Blocking Bad bots
Hello,
I have a server with several domains, in the configuration file of each
domain I have a line like this to block bad bots.
if ($http_user_agent ~*
(zealbot|MJ12bot|AhrefsBot|sogou|PaperLiBot|uipbot|DotBot|GetIntent|Cliqzbot|YandexBot|Nutch|TurnitinBot|IndeedBot)) {
    return 403;
}
This works