-----Original message-----
From: Gregory S. Youngblood <[email protected]>
Sent: Fri 15-02-2013 18:47
Subject: Re: [OpenIndiana-discuss] opensolaris.org shutting down next month
To: Discussion list for OpenIndiana <[email protected]>

> And it can go away at any time. If they change robots.txt to block spiders
> they will remove content. That happened to an old site I had in the 90s that
> they archived. I let domain go and new owners did that and archive blocked or
> purged my sites pages.
>
> --
> Sent from my Jelly Bean Galaxy Nexus
>
> Hugh McIntyre <[email protected]> wrote:
>
> >Is this going to be any different from www.archive.org, which already
> >exists and has a full archive of the Internet, including opensolaris.org?
> >
> >See http://web.archive.org/web/*/opensolaris.org.
> >
> >Of course this does not guarantee to include active content that rely on
> >server-side scripting, but then you won't get this with wget either.
> >
> >Hugh.

robots.txt at hub.opensolaris.org already forbids crawling the site:

User-agent: *
Disallow: /*!

If I remember correctly, they replaced the robots.txt since yesterday...
So the usual web archives are worthless in this case.

http://tinyurl.com/29dwvxf should work, anyway.

cu
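
PS: For anyone who wants to check this sort of thing before pointing a crawler at a site, here is a minimal sketch using Python's standard urllib.robotparser. The helper name is_crawl_allowed and the BLOCK_ALL text are only illustrations of mine. BLOCK_ALL uses a plain "Disallow: /" as a simplified stand-in for the rule quoted above, because the standard-library parser matches rules by path prefix. The rules are parsed from a string, so nothing is fetched from the network.

```python
from urllib.robotparser import RobotFileParser


def is_crawl_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no HTTP request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical stand-in for a "block everything" robots.txt.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"

if __name__ == "__main__":
    url = "http://hub.opensolaris.org/bin/view/Main/"
    print(is_crawl_allowed(BLOCK_ALL, url))
    print(is_crawl_allowed("", url))
```

With a blanket rule like this, a well-behaved spider has to skip every page on the host, which is why the usual archives are of no help here.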
