I set my server to limit requests per hour from the same IP to slow them down, and I have code to detect bots and redirect their sessions to a low-impact catch page. It's not that hard to control, but lately I've noticed the old tricks no longer work as well. AI arms race. Then again, I always believed that publishing publicly would eventually cause the content to enter the public domain anyway.
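
For anyone curious, the general throttle-and-redirect idea is roughly the sketch below. This is only an illustration under assumptions of my own: a Python WSGI app, a 300-request hourly budget, a handful of example crawler user-agent strings, and a made-up /slow-lane.html catch page; none of these are claims about my actual setup.

    # Minimal sketch: per-IP hourly rate limit plus crude bot detection,
    # redirecting offenders to a cheap static catch page.
    # (Thresholds, UA hints, and the catch-page URL are illustrative.)
    import time
    from collections import defaultdict, deque

    HOURLY_LIMIT = 300                                   # assumed per-IP budget
    BOT_UA_HINTS = ("GPTBot", "CCBot", "ClaudeBot", "Bytespider")  # examples only

    hits = defaultdict(deque)                            # ip -> recent request timestamps

    def looks_like_bot(environ):
        ua = environ.get("HTTP_USER_AGENT", "")
        return any(hint in ua for hint in BOT_UA_HINTS)

    def over_limit(ip, now):
        window = hits[ip]
        while window and now - window[0] > 3600:         # drop entries older than 1 hour
            window.popleft()
        window.append(now)
        return len(window) > HOURLY_LIMIT

    def throttle(app):
        def wrapped(environ, start_response):
            ip = environ.get("REMOTE_ADDR", "unknown")
            if looks_like_bot(environ) or over_limit(ip, time.time()):
                # Serve a lightweight page instead of real content
                start_response("302 Found", [("Location", "/slow-lane.html")])
                return [b""]
            return app(environ, start_response)
        return wrapped

Wrap any WSGI application (throttle(app)) and suspect traffic gets the cheap page while normal visitors are untouched. A production setup would of course keep state out of process memory and use smarter heuristics than user-agent strings.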
On Tue, Sep 16, 2025 at 9:02 PM Cameron Kaiser via cctalk <[email protected]> wrote:
>
> > For those of you who run vintage computing-related info sites, have you
> > noticed all of the LLM scraper activity? AI services are using the LLM
> > scrapers to populate their knowledge bases.
>
> A massive, massive IP filter. There has been some collateral damage, but
> unfortunately I don't think this is avoidable. They're a plague.
>
> --
> ------------------------------------ personal: http://www.cameronkaiser.com/ --
>   Cameron Kaiser * Floodgap Systems * www.floodgap.com * [email protected]
> -- Time is an illusion. Lunch time, doubly so. -- Douglas Adams ---------------
