Three problems:
a- he wanted to read it locally, offline.
b- crawling the open web is imperfect.
c- /browse needs to get at the files via the same URLs the uploader (post.jar) used, so those URLs have to resolve locally too.

a and b- Try downloading the whole thing with 'wget'. It has a 'make links point to the downloaded files' option (--convert-links). Wget is great.
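
Something along these lines should work (flags from memory, so double-check them against your wget version):

    wget --mirror --convert-links --adjust-extension --page-requisites \
         --no-parent --wait=2 http://wiki.apache.org/solr/

--mirror does the recursive fetch, --convert-links rewrites the links to point at the downloaded copies, and --wait is just there to be polite to the wiki.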

I have done this by parking my files behind a web server. You can use Tomcat (I recommend the XAMPP distro: http://www.apachefriends.org/en/xampp.html). Then use Erik's command to crawl that server, and use /browse to read it.
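
For example, if the downloaded wiki ends up being served at http://localhost/solr-wiki/ (a made-up path, adjust to wherever your server puts it), the crawl is just Erik's command pointed at the local copy:

    java -Ddata=web -Ddelay=0 -Drecursive=1 -jar post.jar http://localhost/solr-wiki/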

Looking at this again, it should be possible to add a static file-serving context to the etc/jetty.xml that ships with the Solr start.jar. I think I did this once. It would be a handy patch. In fact, this whole thing would make a great blog post.
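
Roughly, the idea is a static-file context alongside the Solr one. This is only a sketch (the paths are hypothetical and the exact class names and placement in the handler collection depend on the Jetty version bundled with your Solr release):

    <!-- sketch: serve a local wiki dump at /wiki from the Jetty instance that runs Solr -->
    <New class="org.eclipse.jetty.server.handler.ContextHandler">
      <Set name="contextPath">/wiki</Set>
      <Set name="handler">
        <New class="org.eclipse.jetty.server.handler.ResourceHandler">
          <Set name="resourceBase">/path/to/wiki-dump</Set>  <!-- hypothetical path -->
          <Set name="directoriesListed">true</Set>
        </New>
      </Set>
    </New>

With that in place, the post.jar crawl and /browse links can both use http://localhost:8983/wiki/... URLs, which answers problem c above.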

On 12/30/2012 05:05 AM, Erik Hatcher wrote:
Here's a geeky way to do it yourself:

Fire up Solr 4.x, run this from example/exampledocs:

    java -Ddata=web -Ddelay=2 -Drecursive=1 -jar post.jar http://wiki.apache.org/solr/

(although I do end up getting a bunch of 503's, so maybe this isn't very 
reliable yet?)

Tada: http://localhost:8983/solr/collection1/browse

:)

        Erik


On Dec 29, 2012, at 16:54 , d_k wrote:

Hello,

I'm setting up Solr inside an intranet without internet access, and
I was wondering if there is a way to obtain the data dump of the Solr
Wiki (http://wiki.apache.org/solr/) for offline viewing and searching.

I understand MoinMoin has an export feature one can use
(http://moinmo.in/MoinDump and
http://moinmo.in/HelpOnMoinCommand/ExportDump) but I'm afraid it needs
to be executed from within the MoinMoin server.

Is there a way to obtain the result of that command?
Is there another way to view the Solr wiki offline?
