Three problems: (a) he wanted to read it locally; (b) crawling the open web is imperfect; (c) /browse needs to reach the files at the same URLs the uploader used.
For (a) and (b): try downloading the whole thing with wget. Its --convert-links option rewrites links to point at the downloaded files, and --mirror handles the recursion. Wget is great.
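A sketch of the invocation (the flags are standard GNU wget; the --wait value is just a polite guess, tune it to taste):

```shell
# Mirror the wiki and rewrite links so the local copy is self-contained:
#   --mirror            recursive download with timestamping
#   --convert-links     rewrite links to point at the downloaded files
#   --adjust-extension  save pages with an .html extension
#   --wait=2            pause between requests so we don't hammer the server
cmd="wget --mirror --convert-links --adjust-extension --wait=2 http://wiki.apache.org/solr/"
echo "$cmd"   # run this from the directory that should hold the mirror
```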
For (c): I have done this by parking my files behind a web server. You can use Tomcat (I recommend the XAMPP distro: http://www.apachefriends.org/en/xampp.html). Then use Erik's command to crawl that server, and use /browse to read it.
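Tomcat or XAMPP work fine, but any static file server will do for the crawl. A minimal sketch of "park the files behind a web server" using only the Python stdlib (the directory and file contents here are placeholders):

```python
# Serve a local directory over HTTP so post.jar (or wget) can crawl it.
# Tomcat/XAMPP do the same job; this is just the smallest possible stand-in.
import http.server
import os
import socketserver
import tempfile
import threading
import urllib.request

# Placeholder docroot with one page; in practice this is your wiki dump.
docroot = tempfile.mkdtemp()
with open(os.path.join(docroot, "index.html"), "w") as f:
    f.write("<html><body>Solr wiki mirror</body></html>")

os.chdir(docroot)  # SimpleHTTPRequestHandler serves the current directory
httpd = socketserver.TCPServer(("127.0.0.1", 0),
                               http.server.SimpleHTTPRequestHandler)
port = httpd.server_address[1]
threading.Thread(target=httpd.serve_forever, daemon=True).start()

# A crawler would fetch pages exactly like this:
page = urllib.request.urlopen(f"http://127.0.0.1:{port}/index.html").read()
print(page.decode())
httpd.shutdown()
```

Once the files are reachable over HTTP, Erik's post.jar command pointed at that host and port indexes them into Solr.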
Looking at this again, it should be possible to add a static-file handler to the etc/jetty.xml shipped with Solr's start.jar. I think I did this once. It would be a handy patch; in fact, this whole thing would make a great blog post.
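A rough sketch of what that jetty.xml addition might look like, assuming the Jetty 8 that ships with Solr 4.x and that the handler collection carries the id "Handlers" (the /files context path and the resource base are hypothetical; adjust class names to your Jetty version):

```xml
<!-- Hypothetical addition to example/etc/jetty.xml:
     serve a local directory of static files alongside Solr. -->
<Ref id="Handlers">
  <Call name="addHandler">
    <Arg>
      <New class="org.eclipse.jetty.server.handler.ContextHandler">
        <Set name="contextPath">/files</Set>
        <Set name="handler">
          <New class="org.eclipse.jetty.server.handler.ResourceHandler">
            <Set name="resourceBase">/path/to/local/wiki/dump</Set>
            <Set name="directoriesListed">true</Set>
          </New>
        </Set>
      </New>
    </Arg>
  </Call>
</Ref>
```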
On 12/30/2012 05:05 AM, Erik Hatcher wrote:
> Here's a geeky way to do it yourself: fire up Solr 4.x, then run this from example/exampledocs:
>
>     java -Ddata=web -Ddelay=2 -Drecursive=1 -jar post.jar http://wiki.apache.org/solr/
>
> (Although I do end up getting a bunch of 503's, so maybe this isn't very reliable yet?)
>
> Tada: http://localhost:8983/solr/collection1/browse
>
> :)
>
> Erik
>
> On Dec 29, 2012, at 16:54, d_k wrote:
>
>> Hello, I'm setting up Solr inside an intranet without internet access, and I was wondering if there is a way to obtain a data dump of the Solr wiki (http://wiki.apache.org/solr/) for offline viewing and searching.
>>
>> I understand MoinMoin has an export feature one can use (http://moinmo.in/MoinDump and http://moinmo.in/HelpOnMoinCommand/ExportDump), but I'm afraid it needs to be executed from within the MoinMoin server. Is there a way to obtain the result of that command? Is there another way to view the Solr wiki offline?
