Thank you very much, Platonides. Your reply is clear and easy to follow.

PM Poon
On Sun, Feb 8, 2009 at 9:08 AM, Platonides <[email protected]> wrote:

> ekompute wrote:
> > Hi, can anyone help me with my robot.txt.
>
> The name is 'robots.txt'.
>
> > My contents for the page read as follows:
> >
> > User-agent: *
> > Disallow: /Help
> > Disallow: /MediaWiki
> > Disallow: /Template
> > Disallow: /skins/
> >
> > But it is blocking pages like:
> >
> > - http://www.dummipedia.org/Special:Protectedpages
> > - http://dummipedia.org/Special:Allpages
>
> Special pages try to autoprotect themselves.
> See how they have '<meta name="robots" content="noindex,nofollow" />'.
> A crawler traversing Special:Allpages would likely produce too much load.
>
> > and external pages like:
> >
> > - http://www.stumbleupon.com/
> > - http://www.searchtheweb.com/
>
> $wgNoFollowLinks = false;
>
> http://www.mediawiki.org/wiki/Manual:$wgNoFollowLinks
> http://www.mediawiki.org/wiki/Manual:$wgNoFollowDomainExceptions
>
> > As you can see, my robot.txt did not block these pages. Also, should I
> > block the print version to prevent what Google calls "duplicate content"?
> > If so, how?
>
> Disable /index.php (printable, edit...)
>
> > Response will be very much appreciated.
> >
> > PM Poon
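For anyone finding this in the archives, the two changes Platonides suggests could look roughly like the sketch below. This assumes the wiki serves articles from short URLs at the document root (as the dummipedia.org links above suggest) and keeps the script at /index.php; if your article URLs themselves go through /index.php?title=..., do not add that Disallow line or you will block the whole wiki.

robots.txt (a sketch, not the exact file):

    User-agent: *
    # Blocking index.php keeps edit, history and printable views out of
    # search engines, which addresses the "duplicate content" concern.
    Disallow: /index.php
    Disallow: /Help
    Disallow: /MediaWiki
    Disallow: /Template
    Disallow: /skins/

LocalSettings.php:

    # Stop MediaWiki from adding rel="nofollow" to external links in wikitext.
    $wgNoFollowLinks = false;

    # Alternatively, keep nofollow on by default and exempt chosen domains
    # (the domain here is just an example):
    # $wgNoFollowDomainExceptions = array( 'stumbleupon.com' );

Note that robots.txt cannot undo the <meta name="robots" content="noindex,nofollow" /> tag that special pages such as Special:Allpages add to themselves; as Platonides says, that self-protection is intentional.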
