Has there been any progress with that? I'd still love an offline package.
Regards,
Alex.
P.s. I found the 'package' command in the menu and got all excited. It let
me execute it, but the resulting zip is on the server and I cannot
download it. I hope I did not offend the Infra gods so much as to get banned.
I had partial success executing wget as follows:

wget --recursive --page-requisites --no-clobber --html-extension \
     --convert-links --restrict-file-names=windows \
     --domains wiki.apache.org http://wiki.apache.org/solr/ -w 10 -l 5

then configuring a web server to serve that location, and then indexing it.
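Roughly what I have in mind for those last two steps (the port, directory
name, and use of Python's built-in web server are all made up; post.jar's
web crawl mode appears further down this thread):

(cd wiki.apache.org && python -m SimpleHTTPServer 8000) &
cd solr-4.x/example/exampledocs
java -Ddata=web -Drecursive=1 -Ddelay=1 -jar post.jar http://localhost:8000/

Since --convert-links rewrites the links to be relative, the recursive
crawl should stay on localhost.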
I have permission to provide an export. Right now I'm thinking of it
being a one-off dump, without the user dir. If someone wants to research
how to make moin automate it, I at least promise to listen.
Upayavira
On Tue, Jan 1, 2013, at 08:10 AM, Alexandre Rafalovitch wrote:
That's why I think this could be a nice joint project with Apache Infra.
They provide the Moin export, we build a way to index it with Solr for
local usage. Start with our own project - Solr - then sell it to others
once it has been dog-fooded enough. Instant increased Solr exposure across
all Apache projects.
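For instance, if the export arrives as a flat directory of HTML pages, the
indexing half could be little more than pushing each file through the
extract handler that ships with the Solr example (just a sketch; the
directory name and id scheme are made up):

for f in moin-export/*.html; do
  curl "http://localhost:8983/solr/update/extract?literal.id=$f&commit=false" \
       -F "file=@$f"
done
curl "http://localhost:8983/solr/update?commit=true"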
3 problems:
a- he wanted to read it locally.
b- crawling the open web is imperfect.
c- /browse needs to get at the files with the same URL as the uploader.
For a and b: try downloading the whole thing with 'wget'. It has a 'make
links point to the downloaded files' option. Wget is great.
I have done this before.
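For instance (the link-rewriting flag is --convert-links; this is just a
shorter variant of the fuller command quoted earlier in this digest):

wget --mirror --convert-links --page-requisites --wait=2 http://wiki.apache.org/solr/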
Here's a geeky way to do it yourself:
Fire up Solr 4.x, run this from example/exampledocs:
java -Ddata=web -Ddelay=2 -Drecursive=1 -jar post.jar http://wiki.apache.org/solr/
(although I do end up getting a bunch of 503's, so maybe this isn't very
reliable yet?)
Tada: http://localhost:8983/
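To sanity-check the crawl, one could hit the example core's select handler
(a minimal sketch, assuming the stock collection1 setup):

curl "http://localhost:8983/solr/collection1/select?q=solr&wt=json&rows=5"

If the 503s persist, raising -Ddelay should throttle the crawl further.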
I can ask about this. If folks there are okay with it, I can produce the
dump, but it is likely to be a one-off rather than an ongoing service.
Upayavira
On Sun, Dec 30, 2012, at 06:34 AM, Otis Gospodnetic wrote:
Hi,
Sorry, by infra I meant ASF infrastructure people. There's a mailing list
and a JIRA project for infra stuff.
Otis
Solr & ElasticSearch Support
http://sematext.com/
On Dec 29, 2012 8:45 PM, "Alexandre Rafalovitch" wrote:
Sorry,
What's Infra? A mailing list? Demand is probably low for Solr alone, but
may be sufficient across all of Apache's individual projects. I guess one
way to check is to see in the Apache logs whether there are a lot of
scrapers running (by user agent).
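For example, something like this against the access logs (assuming
Apache's combined log format, where the user agent is the last quoted
field):

awk -F'"' '{print $6}' access.log | sort | uniq -c | sort -rn | head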
Anyway, for Solr specifically, an acceptable substitute could be a one-off
static dump.
I'd take it to Infra, although I think demand for this is so low...
Otis
Solr & ElasticSearch Support
http://sematext.com/
On Dec 29, 2012 8:14 PM, "Alexandre Rafalovitch" wrote:
Should that be set up as a public service then (like the Wikipedia dumps)?
Because I need one too, and I don't think it is a good idea to DDoS the
wiki with everybody's crawlers. And I bet there will be some 'challenges'
during scraping.
Regards,
Alex.
P.s. In fact, it would make an interesting example to have the wiki
content indexed by Solr itself.
Hi,
You can easily crawl it with wget to get a local copy.
Otis
Solr & ElasticSearch Support
http://sematext.com/
On Dec 29, 2012 4:54 PM, "d_k" wrote:
> Hello,
>
> I'm setting up Solr inside an intranet without internet access and
> I was wondering if there is a way to obtain a data dump of the wiki.