Thank you very much, Platonides. Your reply is clear and easy to follow.

PM Poon
On Sun, Feb 8, 2009 at 9:08 AM, Platonides <[email protected]> wrote:

> ekompute wrote:
> > Hi, can anyone help me with my robot.txt.
>
> The name is 'robots.txt'.
>
> > My contents for the page read as follows:
> >
> > User-agent: *
> > Disallow: /Help
> > Disallow: /MediaWiki
> > Disallow: /Template
> > Disallow: /skins/
> >
> > But it is blocking pages like:
> >
> > - http://www.dummipedia.org/Special:Protectedpages
> > - http://dummipedia.org/Special:Allpages
>
> Special pages try to autoprotect themselves.
> See how they have '<meta name="robots" content="noindex,nofollow" />'.
> A crawler traversing Special:Allpages would likely produce too much load.
>
> > and external pages like:
> >
> > - http://www.stumbleupon.com/
> > - http://www.searchtheweb.com/
>
> $wgNoFollowLinks = false;
>
> http://www.mediawiki.org/wiki/Manual:$wgNoFollowLinks
> http://www.mediawiki.org/wiki/Manual:$wgNoFollowDomainExceptions
>
> > As you can see, my robot.txt did not block these pages. Also, should I
> > block the print version to prevent what Google calls "duplicate content"?
> > If so, how?
>
> Disable /index.php (printable, edit...)
>
> > Response will be very much appreciated.
> >
> > PM Poon
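For anyone finding this in the archives, the two changes Platonides suggests could look roughly like the sketch below. This assumes the wiki serves articles from short URLs at the document root (as the dummipedia.org links above suggest) and keeps the script at /index.php; if your article URLs themselves go through /index.php?title=..., do not add that Disallow line or you will block the whole wiki.

robots.txt (a sketch, not the exact file):

    User-agent: *
    # Blocking index.php keeps edit, history and printable views out of
    # search engines, which addresses the "duplicate content" concern.
    Disallow: /index.php
    Disallow: /Help
    Disallow: /MediaWiki
    Disallow: /Template
    Disallow: /skins/

LocalSettings.php:

    # Stop MediaWiki from adding rel="nofollow" to external links in wikitext.
    $wgNoFollowLinks = false;

    # Alternatively, keep nofollow on by default and exempt chosen domains
    # (the domain here is just an example):
    # $wgNoFollowDomainExceptions = array( 'stumbleupon.com' );

Note that robots.txt cannot undo the <meta name="robots" content="noindex,nofollow" /> tag that special pages such as Special:Allpages add to themselves; as Platonides says, that self-protection is intentional.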
