Benjamin Lees <[email protected]> wrote:

>> Is there an actual problem you're trying to solve here?  Is there any
indication that spam bots are affecting your site's performance?  If not,
worrying about this is probably a waste of your time.

Spambots driving up CPU usage is a known issue:
https://www.google.com/search?q=spam+bots+cpu&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a#hl=en&safe=off&client=firefox-a&hs=BbI&rls=org.mozilla:en-US%3Aofficial&sclient=psy-ab&q=spambots+cpu&oq=spambots+cpu&gs_l=serp.12...0.0.0.103881.0.0.0.0.0.0.0.0..0.0...0.0...1c..7.psy-ab.qbdbh4AqgLU&pbx=1&bav=on.2,or.r_qf.&bvm=bv.44442042,d.dmg&fp=742c113135fff5b3&biw=1920&bih=832

Is it a problem? Yes, they're constantly trying to break in, and that
increases CPU usage. I don't have any analysis to prove it, but many times
I've seen CPU usage go very high while traffic (Google Analytics) stayed
normal. I came from a shared server where I was actually asked to leave
because of the CPU usage, so I've had big problems with CPU.

I'm on a VPS now and it can still be a problem. Average CPU usage recently
went up from around 20% to 160% for a few hours (it goes over 100% because
the server is multi-core, or for some other reason), while Google Analytics
showed no change. Whether this is a malicious/DDoS bot or an advertising
bot, it is something that needs to be studied and dealt with. If I stay at
200% CPU usage on the VPS I may be asked to leave the server. So yes, I
have to keep a watch on CPU and explore all possible options to keep the
usage down. I'm using caching and nginx (earlier suggestions by people on
this list).

As to how to prevent genuine viewers from being blocked, that's problem #2
and it's something that can be improved.
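
To make the temporary-ban idea from my earlier mail (quoted further down)
concrete, here is a rough sketch of the bookkeeping only. The class name,
the threshold of 10 failures and the example IP are made up for
illustration; the 5-minute window and 24-hour ban are the numbers I
suggested. It does not touch htaccess or the firewall, it only decides
whether an IP should currently be treated as banned.

```python
import time
from collections import defaultdict, deque


class CaptchaBanList:
    """Temporarily ban an IP after too many failed captchas in a short window.

    All names and defaults here are hypothetical, not part of MediaWiki.
    """

    def __init__(self, max_failures=10, window=300, ban_seconds=86400,
                 clock=time.time):
        self.max_failures = max_failures    # assumed threshold, tune to taste
        self.window = window                # 5 minutes, in seconds
        self.ban_seconds = ban_seconds      # 24 hours, in seconds
        self.clock = clock                  # injectable so it can be tested
        self.failures = defaultdict(deque)  # ip -> timestamps of recent failures
        self.banned_until = {}              # ip -> time at which the ban expires

    def record_failure(self, ip):
        now = self.clock()
        recent = self.failures[ip]
        recent.append(now)
        # Forget failures that fell out of the window.
        while recent and now - recent[0] > self.window:
            recent.popleft()
        if len(recent) >= self.max_failures:
            self.banned_until[ip] = now + self.ban_seconds
            recent.clear()

    def is_banned(self, ip):
        expires = self.banned_until.get(ip)
        if expires is None:
            return False
        if self.clock() >= expires:
            # The ban has run out, so the visitor gets a clean slate.
            del self.banned_until[ip]
            return False
        return True
```

A higher threshold or a shorter ban would be the obvious knobs for making
this gentler on humans who simply mistype the captcha a few times.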

I'll try this, which Henny suggested:
http://danielwebb.us/software/bot-trap/

Anne wrote:
>> +1 - there is one well-known blog site that uses capture, and I've tried
as many as 10 times on a single submission, only to give up because I
simply couldn't get the captcha right.  Now I don't even try to comment
there.

They need a better captcha. But yes, you guys have reminded me that
whatever method is used, I need to make sure genuine visitors are not
affected. The link by Henny might take care of it, as the 'hidden' link is
not seen by humans.
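
As I understand the hidden-link idea, it boils down to something like the
sketch below. This is only my own illustration of the general technique,
not the actual code from the bot-trap page; the trap path, class name and
status codes are assumptions.

```python
TRAP_PATH = "/bot-trap/"  # hypothetical URL, linked invisibly from the pages


class HiddenLinkTrap:
    """Block any IP that requests a link human visitors never see."""

    def __init__(self, trap_path=TRAP_PATH):
        self.trap_path = trap_path
        self.blocked = set()  # IPs that have followed the hidden link

    def handle(self, ip, path):
        """Return the HTTP status code this request should get."""
        if path.startswith(self.trap_path):
            # Only something crawling the raw HTML ends up here.
            self.blocked.add(ip)
            return 403
        if ip in self.blocked:
            return 403
        return 200
```

A human who never sees the link never hits the trap path, so they are
never added to the blocked set. Well-behaved crawlers would presumably
need to be kept away from the trap path with a robots.txt Disallow rule.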

thanks
Dan




On Fri, Mar 29, 2013 at 1:29 PM, Benjamin Lees <[email protected]> wrote:

> On Thu, Mar 28, 2013 at 11:32 PM, Dan Fisher <[email protected]>
> wrote:
>
> > Here's one idea: If a
> > certain IP address fails the captchas a specified number of times in 5
> > minutes or so, it should be banned temporarily for say, 24 hours (through
> > htaccess or firewall etc).
>
> Humans regularly get CAPTCHAs wrong, and they often do so multiple times
> (if you have any elderly relatives, feel free to see how many tries it
> takes them to solve a reCAPTCHA one).  Blocking them from even viewing your
> site for a day seems a little extreme.
>
> Is there an actual problem you're trying to solve here?  Is there any
> indication that spam bots are affecting your site's performance?  If not,
> worrying about this is probably a waste of your time.
> _______________________________________________
> MediaWiki-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/mediawiki-l
>