Re: Viewing the Solr MoinMoin wiki offline

2012-12-30 Thread Upayavira
I can ask this. If folks there are okay with it, I can produce the dump,
but it is likely to be a one-off rather than an ongoing service.

Upayavira

On Sun, Dec 30, 2012, at 06:34 AM, Otis Gospodnetic wrote:
> Hi,
> 
> Sorry, by infra I meant ASF infrastructure people. There's a mailing list
> and a JIRA project for infra stuff.
> 
> Otis
> Solr & ElasticSearch Support
> http://sematext.com/
> On Dec 29, 2012 8:45 PM, "Alexandre Rafalovitch" wrote:
> 
> > Sorry,
> >
> > What's Infra? A mailing list? Demand is probably low for Solr alone, but
> > may be sufficient across all of Apache's individual projects. I guess one
> > way to check is to see in the Apache logs whether a lot of scrapers are
> > running (by user agent).
> >
> > Anyway, for Solr specifically, an acceptable substitute could be the manual
> > version from Lucid Imagination:
> > http://lucidworks.lucidimagination.com/display/home/PDF+Versions
> >
> > Regards,
> > Alex.
> > P.S. I am getting the feeling that people from Lucid (and other commercial
> > companies) are not allowed to mention their products on this list.
> >
> > Personal blog: http://blog.outerthoughts.com/
> > LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
> > - Time is the quality of nature that keeps events from happening all at
> > once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)
> >
> >
> > On Sun, Dec 30, 2012 at 12:17 PM, Otis Gospodnetic <
> > otis.gospodne...@gmail.com> wrote:
> >
> > > I'd take it to Infra, although I think demand for this is so low...
> > >
> > > Otis
> > > Solr & ElasticSearch Support
> > > http://sematext.com/
> > > On Dec 29, 2012 8:14 PM, "Alexandre Rafalovitch" wrote:
> > >
> > > > Should that be set up as a public service then (like the Wikipedia
> > > > dumps)? Because I need one too, and I don't think it is a good idea to
> > > > DDoS the wiki with crawlers. And I bet there will be some 'challenges'
> > > > during scraping.
> > > >
> > > > Regards,
> > > > Alex.
> > > > P.S. In fact, it would make an interesting example to have an offline
> > > > copy with a Solr index, etc.
> > > >
> > > >
> > > > On Sun, Dec 30, 2012 at 9:15 AM, Otis Gospodnetic <
> > > > otis.gospodne...@gmail.com> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > You can easily crawl it with wget to get a local copy.
> > > > >
> > > > > Otis
> > > > > Solr & ElasticSearch Support
> > > > > http://sematext.com/
> > > > > On Dec 29, 2012 4:54 PM, "d_k"  wrote:
> > > > >
> > > > > > Hello,
> > > > > >
> > > > > > I'm setting up Solr inside an intranet without internet access, and
> > > > > > I was wondering if there is a way to obtain a data dump of the Solr
> > > > > > wiki (http://wiki.apache.org/solr/) for offline viewing and
> > > > > > searching.
> > > > > >
> > > > > > I understand MoinMoin has an export feature one can use
> > > > > > (http://moinmo.in/MoinDump and
> > > > > > http://moinmo.in/HelpOnMoinCommand/ExportDump), but I'm afraid it
> > > > > > needs to be executed from within the MoinMoin server.
> > > > > >
> > > > > > Is there a way to obtain the result of that command?
> > > > > > Is there another way to view the solr wiki offline?
> > > > > >
> > > > >
> > > >
> > >
> >


Re: Viewing the Solr MoinMoin wiki offline

2012-12-30 Thread Erik Hatcher
Here's a geeky way to do it yourself:

Fire up Solr 4.x, then run this from example/exampledocs:

   java -Ddata=web -Ddelay=2 -Drecursive=1 -jar post.jar http://wiki.apache.org/solr/

(although I do end up getting a bunch of 503s, so maybe this isn't very
reliable yet?)

Tada: http://localhost:8983/solr/collection1/browse

:)

Erik


On Dec 29, 2012, at 16:54 , d_k wrote:

> Hello,
> 
> I'm setting up Solr inside an intranet without internet access, and
> I was wondering if there is a way to obtain a data dump of the Solr
> wiki (http://wiki.apache.org/solr/) for offline viewing and searching.
> 
> I understand MoinMoin has an export feature one can use
> (http://moinmo.in/MoinDump and
> http://moinmo.in/HelpOnMoinCommand/ExportDump), but I'm afraid it needs
> to be executed from within the MoinMoin server.
> 
> Is there a way to obtain the result of that command?
> Is there another way to view the solr wiki offline?



Re: ZooKeeper ensemble behind load balancer

2012-12-30 Thread Marcin Rzewucki
Hi,
Thanks for your replies. I don't change ZK hosts often. I'm running on
Amazon, and if a ZK host becomes unavailable I start a new EC2 instance and
have to add it to the Solr configuration (restart required). With a load
balancer this would be much easier (no need to restart Solr).

Regards.


On 30 December 2012 04:06, Anirudha Jadhav  wrote:

> A ZooKeeper ensemble should already be fairly reliable, with enough
> machines (3+, typically 5, 7 or 9) to form a quorum.
> So adding a load balancer on top will just add a hop and
> decrease performance, and also add a failure point to the system.
>
> That being said, there needs to be a way for Solr to
> refresh its configuration without a restart.
>
> Solr takes a list of ZK hosts on startup and, if I am correct, uses one of
> them unless it fails, or round-robins between them.
>
> Why do your ZK hosts need to change a lot?
>
>
> On Sat, Dec 29, 2012 at 10:58 AM, Upayavira  wrote:
>
> > I would suggest asking this on the zookeeper user list.
> >
> > And let us know here what you find out, I'd be interested.
> >
> > Note, ZooKeeper, as I understand it, uses its own protocol, so to some
> > reasonable extent it probably depends on your load balancer. Also, as I
> > understand it, ZooKeeper maintains active connections to Solr hosts,
> > which is not a common scenario for load balancers.
> >
> > Upayavira
> >
> > On Fri, Dec 28, 2012, at 04:39 PM, Marcin Rzewucki wrote:
> > > Hi,
> > >
> > > Does Solr need a connection to all of the hosts in the ZK ensemble, or
> > > only to one of them at a time? I wonder if it is possible to use a load
> > > balancer for the ZK ensemble and use only one address as zkHost for Solr.
> > > Having a load balancer makes it easier to change ZK hosts while Solr
> > > still uses the same address (no need to restart Solr or change its
> > > configuration).
> > >
> > > Thanks in advance.
> > > Regards.
> >
>
>
>
> --
> Anirudha P. Jadhav
>


Re: ZooKeeper ensemble behind load balancer

2012-12-30 Thread Aloke Ghoshal
Hi Marcin,

Since you are thinking of this in the context of Amazon, I would suggest
taking a different route. Assign an Elastic IP (EIP) to each EC2 instance
running the ZK node & use the EIP in Solr. This way you could easily map
the EIP to a new EC2 instance subsequently, if required, and the changes
would get picked up by Solr automatically without a restart.
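
For illustration only, here is a minimal sketch of how the comma-separated
zkHost value could be assembled from a list of EIPs. The class and method
names are hypothetical, the addresses are placeholder documentation
addresses, and port 2181 assumes ZooKeeper's default client port; adjust
all of these to your own setup.

```java
import java.util.Arrays;
import java.util.List;

public class ZkHostString {

    // Join host addresses into the "host:port,host:port,..." form
    // used for a ZooKeeper connection string.
    public static String build(List<String> hosts, int port) {
        StringBuilder sb = new StringBuilder();
        for (String host : hosts) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(host).append(':').append(port);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Placeholder Elastic IPs (assumption, not real addresses).
        List<String> eips = Arrays.asList("203.0.113.10", "203.0.113.11", "203.0.113.12");
        // 2181 is assumed here as the ZooKeeper client port.
        System.out.println("-DzkHost=" + build(eips, 2181));
    }
}
```

Because the EIPs stay fixed, the string passed to Solr would not need to
change when an EC2 instance behind one of them is replaced.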

Regards,
Aloke


On Sun, Dec 30, 2012 at 8:29 PM, Marcin Rzewucki wrote:

> Hi,
> Thanks for your replies. I don't change ZK hosts a lot of times. I'm using
> Amazon platform and if a ZK host is not available I will start new EC2
> instance and will have to plug it to Solr configuration (restart required).
> In case of load balancer this would be much easier (no need to restart
> solr).
>
> Regards.


SolrJ | Add a date field to ContentStreamUpdateRequest

2012-12-30 Thread uwe72
Hi there,

How can I add a date field when indexing a PDF document?

   ContentStreamUpdateRequest up = new ContentStreamUpdateRequest("/update/extract");
   up.addFile(pdfFile, "application/octet-stream");
   up.setParam("literal." + SolrConstants.ID, solrPDFId);

Regards
Uwe



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrJ-Add-a-date-field-to-ContentStreamUpdateRequest-tp4029704.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: SolrJ | Add a date field to ContentStreamUpdateRequest

2012-12-30 Thread Yury Kats
On 12/30/2012 11:57 AM, uwe72 wrote:
> Hi there,
> 
> how can i add a date field to a pdf document?

The same way you add the ID field: using a literal parameter.

> 
>    ContentStreamUpdateRequest up = new ContentStreamUpdateRequest("/update/extract");
>    up.addFile(pdfFile, "application/octet-stream");
>    up.setParam("literal." + SolrConstants.ID, solrPDFId);
> 
> Regards
> Uwe



Re: ZooKeeper ensemble behind load balancer

2012-12-30 Thread Marcin Rzewucki
Right, that's a good idea. Thanks!

On 30 December 2012 17:41, Aloke Ghoshal  wrote:

> Hi Marcin,
>
> Since you are thinking of this in the context of Amazon, I would suggest
> taking a different route. Assign an Elastic IP (EIP) to each EC2 instance
> running the ZK node & use the EIP in Solr. This way you could easily map
> the EIP to a new EC2 instance subsequently, if required, and the changes
> would get picked up by Solr automatically without a restart.
>
> Regards,
> Aloke


Re: SolrJ | Add a date field to ContentStreamUpdateRequest

2012-12-30 Thread Yury Kats
On 12/30/2012 3:55 PM, uwe72 wrote:
> But I can only add String values. I want to add Date objects?!

You represent the Date as a String, in the format Solr uses for dates:
http://lucene.apache.org/solr/api-4_0_0-BETA/org/apache/solr/schema/DateField.html
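
As a rough sketch (one possible approach, not the only one), a
java.util.Date can be turned into the UTC form described on that page,
such as 1995-12-31T23:59:59Z, with SimpleDateFormat. The class and method
names here are hypothetical. The resulting string could then be passed
the same way as the ID, e.g. up.setParam("literal." + yourDateField,
value), where yourDateField stands for whatever date field your schema
defines.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

public class SolrDateFormat {

    // Format a Date as yyyy-MM-dd'T'HH:mm:ss'Z' in UTC.
    // The literal 'Z' is only correct because the time zone is forced to UTC.
    public static String toSolrDate(Date date) {
        SimpleDateFormat fmt =
            new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'", Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(date);
    }

    public static void main(String[] args) {
        // Prints the current time as a date string in UTC.
        System.out.println(toSolrDate(new Date()));
    }
}
```

A new SimpleDateFormat is created on each call because the class is not
thread-safe; this version also drops milliseconds.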



Re: old index not cleaned up on the slave

2012-12-30 Thread Jason
Hi Erick,
I didn't configure anything for index backup.
My ReplicationHandler configuration is below.
The other settings in solrconfig.xml are almost all defaults.
Is there a deletion policy for replication?
I know about the "maxNumberOfBackups" parameter, but that is for the master server.
Is there any configuration for index backups on the slave server?


   
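<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="enable">${solr.master.enable:false}</str>
    <str name="replicateAfter">optimize</str>
    <str name="replicateAfter">startup</str>
    <str name="confFiles">schema.xml,solrconfig.xml,db-data-config.xml,stopwords.txt,stopwords_en.txt</str>
  </lst>
  <lst name="slave">
    <str name="enable">${solr.slave.enable:false}</str>
    <str name="masterUrl">http://${solr.master.url}/${solr.context.name}/${solr.core.name}/replication</str>
  </lst>
</requestHandler>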
   




--
View this message in context: 
http://lucene.472066.n3.nabble.com/old-index-not-cleaned-up-on-the-slave-tp4029370p4029736.html
Sent from the Solr - User mailing list archive at Nabble.com.


Need Help in Delta DataImport Scheduler using stored procedure.

2012-12-30 Thread Pragyanshis Pattanaik
Hi,

Could someone explain how to schedule a delta data import using DIHS? I have
already followed the steps described at
http://wiki.apache.org/solr/DataImportHandler#Scheduling, but I still cannot
get it to work.

Can I pass the last Solr index time as a parameter to a stored procedure?

For example: deltaImportQuery="[SP_GetAccountDetails] '${dih.last_index_time}'"

Or is there another way to achieve this?

Thanks