2008/1/3, Don Armstrong <[EMAIL PROTECTED]> wrote:
> On Thu, 03 Jan 2008, Jason Spiro wrote:
> > Please allow search engines to index http://bugs.debian.org. This
> > can be done by deleting the file http://bugs.debian.org/robots.txt.
>
> Just for the record, the reason we disallow indexing is that the
> robots.txt specification isn't complete enough to specify a maximum
> scan rate for specific portions of the site, which we would need
> before we could allow bots to access it without degrading
> performance for other users.

http://en.wikipedia.org/wiki/Robots.txt#Crawl-delay_directive will
help. Yahoo and MSNBot both support the Crawl-delay directive, and I
bet other major bots do too. So we could allow Yahoo and MSNBot (plus
Googlebot, if it supports the directive as well) and block everyone
else.
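Roughly, something like this ought to work (a sketch only: the
10-second delay is a value I picked for illustration and would need
tuning; Slurp and msnbot are those crawlers' published user-agent
names):

    # Yahoo's crawler: allow everything, but throttle fetches
    # (Crawl-delay is in seconds between requests)
    User-agent: Slurp
    Crawl-delay: 10
    Disallow:

    # MSNBot likewise
    User-agent: msnbot
    Crawl-delay: 10
    Disallow:

    # All other bots stay blocked
    User-agent: *
    Disallow: /

(An empty Disallow line means "allow everything" for that bot.)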
> There are already mirrors which allow indexing, and you can use the
> BTS's own search engine, which is far superior to Google (or any
> other search engine that doesn't have access to internal metadata)
> in this regard.

Using my browser's default search engine is more convenient. :) Also,
most users assume that web search engines index everything, so they
may waste time searching the web before realizing that bugs.d.o is
unindexed.

Cheers,
--
Jason Spiro: corporate trainer, web developer, IT consultant.
I support Linux, UNIX, Windows, and more. Contact me to discuss your
needs and get a free estimate.
+1 (613) 668-6096 / Email: [EMAIL PROTECTED] / MSN: [EMAIL PROTECTED]