On Tuesday 26 December 2006 22:44, Adam Hardy wrote:
> A large number of entries in the google results from the list are
> useless because they are the 'debian-user Jun 2006 by
> thread/author/whatever' monthly summaries.

> Judicious use of the robots.txt file on the mail archive server would
> be a simple way to remove the summaries from the search engines, for
> instance something like:

If we want to do this, I guess we will need something like the following 
to catch all the index types and also the overflow pages:

User-agent: *
Disallow: /**/threads.html
Disallow: /**/thrd*.html
Disallow: /**/author*.html
Disallow: /**/mail*.html
Disallow: /**/subject*.html

# We also need to catch the index.html symlinks
# This should catch most without excluding more generic pages
Disallow: /*/*/index.html
Disallow: /debian-*/index.html
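To sanity-check rules like these before deploying them, here is a small
sketch of Google-style wildcard matching (an assumption: the original 1994
robots.txt spec has no wildcards at all, and under Google's extension "*"
matches any character sequence including "/", so "**" behaves the same as
"*"). Note that Python's urllib.robotparser does plain prefix matching
only, so it cannot be used to test wildcard rules; the helper name
rule_matches below is hypothetical.

```python
import re

def rule_matches(pattern: str, path: str) -> bool:
    # Translate the Disallow pattern into a regex anchored at the start
    # of the URL path: "*" becomes ".*", everything else is literal.
    # Without a trailing "$" the rule matches any path that begins with
    # the pattern, which is the usual prefix semantics.
    regex = "^" + ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.match(regex, path) is not None

rules = [
    "/**/threads.html",
    "/**/thrd*.html",
    "/**/author*.html",
    "/**/mail*.html",
    "/**/subject*.html",
    "/*/*/index.html",
    "/debian-*/index.html",
]

# The index and summary pages should be caught ...
assert rule_matches("/**/threads.html", "/debian-user/2006/06/threads.html")
assert rule_matches("/*/*/index.html", "/debian-user/2006/06/index.html")
# ... while the individual messages stay crawlable.
assert not any(rule_matches(r, "/debian-user/2006/06/msg00123.html")
               for r in rules)
```

The archive path layout used in the checks above is also assumed; adjust
the sample paths to match the real lists.debian.org layout before trusting
the result.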

Fellow listmasters (especially those with actual robots.txt experience), 
please comment.
