I set my server to limit requests per hour from the same IP to slow them down, and I have code to detect bots and redirect their sessions to a low-impact catch page. It's not that hard to control, but lately I've noticed the old tricks no longer work as well. AI arms race. Then again, I always believed that publishing publicly would eventually cause the content to enter the public domain anyway.
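
For anyone curious, the general throttle-and-redirect idea is roughly the sketch below. This is only an illustration under assumptions of my own: a Python WSGI app, a 300-request hourly budget, a handful of example crawler user-agent strings, and a made-up /slow-lane.html catch page; none of these are claims about my actual setup.

    # Minimal sketch: per-IP hourly rate limit plus crude bot detection,
    # redirecting offenders to a cheap static catch page.
    # (Thresholds, UA hints, and the catch-page URL are illustrative.)
    import time
    from collections import defaultdict, deque

    HOURLY_LIMIT = 300                                   # assumed per-IP budget
    BOT_UA_HINTS = ("GPTBot", "CCBot", "ClaudeBot", "Bytespider")  # examples only

    hits = defaultdict(deque)                            # ip -> recent request timestamps

    def looks_like_bot(environ):
        ua = environ.get("HTTP_USER_AGENT", "")
        return any(hint in ua for hint in BOT_UA_HINTS)

    def over_limit(ip, now):
        window = hits[ip]
        while window and now - window[0] > 3600:         # drop entries older than 1 hour
            window.popleft()
        window.append(now)
        return len(window) > HOURLY_LIMIT

    def throttle(app):
        def wrapped(environ, start_response):
            ip = environ.get("REMOTE_ADDR", "unknown")
            if looks_like_bot(environ) or over_limit(ip, time.time()):
                # Serve a lightweight page instead of real content
                start_response("302 Found", [("Location", "/slow-lane.html")])
                return [b""]
            return app(environ, start_response)
        return wrapped

Wrap any WSGI application (throttle(app)) and suspect traffic gets the cheap page while normal visitors are untouched. A production setup would of course keep state out of process memory and use smarter heuristics than user-agent strings.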
On Tue, Sep 16, 2025 at 9:02 PM Cameron Kaiser via cctalk <[email protected]> wrote:
>
> > For those of you who run vintage computing-related info sites, have you
> > noticed all of the LLM scraper activity? AI services are using the LLM
> > scrapers to populate their knowledge bases.
>
> A massive, massive IP filter. There has been some collateral damage, but
> unfortunately I don't think this is avoidable. They're a plague.
>
> --
> ------------------------------------ personal: http://www.cameronkaiser.com/ --
>   Cameron Kaiser * Floodgap Systems * www.floodgap.com * [email protected]
> -- Time is an illusion. Lunch time, doubly so. -- Douglas Adams ---------------
