SolrCloud: Collection API question and problem with core loading

2013-07-15 Thread Patrick Mi
Hi there,

I run 2 Solr instances (Tomcat 7, Solr 4.3.0, one shard), one external
Zookeeper instance, and have lots of cores.

I use the Collections API to create new cores dynamically after the
configuration for each core has been uploaded to Zookeeper, and it all works
fine.
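
For reference, the create call is just the standard Collections API request,
something like this (host/port and the collection/config names are
placeholders, not my real ones):

http://localhost:8080/solr/admin/collections?action=CREATE&name=mycollection&numShards=1&replicationFactor=2&collection.configName=myconf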

As there are so many cores it takes a very long time to load them all at
startup; I would like to start the server quickly and load the cores on demand.

When a core is created via the Collections API it is created with the default
parameter loadOnStartup="true" (this can be seen in solr.xml).
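
For what it's worth, the generated entry in solr.xml looks roughly like this
(core/collection names are placeholders; the point is the loadOnStartup
attribute):

<core name="mycollection_shard1_replica1" instanceDir="mycollection_shard1_replica1/" collection="mycollection" shard="shard1" loadOnStartup="true"/>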

Question: is there a way to specify this parameter so it can be set to 'false'
through the Collections API?

Problem: if I manually set loadOnStartup="false" for the core, I got the
exception below when I used CloudSolrServer to query it:
Error: org.apache.solr.client.solrj.SolrServerException: No live SolrServers
available to handle this request

It seems to me that CloudSolrServer will not trigger the core to be loaded.
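
For context, the query goes through SolrJ roughly as in this minimal sketch
(ZK address and collection name are placeholders):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class QueryCollection {
    public static void main(String[] args) throws Exception {
        // Placeholders: the real ZK host and collection name differ.
        CloudSolrServer server = new CloudSolrServer("zkhost1:2181");
        server.setDefaultCollection("mycollection");
        // This is the call that fails with "No live SolrServers available..."
        // when the target core has not been loaded.
        QueryResponse rsp = server.query(new SolrQuery("*:*"));
        System.out.println("Found " + rsp.getResults().getNumFound() + " docs");
        server.shutdown();
    }
}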

Is it possible to get the core loaded using CloudSolrServer?

Regards,
Patrick




RE: SolrCloud with Zookeeper ensemble : fail to restart master server

2013-04-16 Thread Patrick Mi
After a number of tests I found that running embedded Zookeeper isn't a good
idea, especially with only one Zookeeper instance. When the Solr instance
with ZooKeeper embedded gets rebooted, it gets confused about who should be
the leader and therefore will not start while the others (followers) are still
running. I now use a standalone Zookeeper instance and that works well.
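
In other words, every Solr node is now started with an external zkHost (the
host below is a placeholder; a comma-separated list works the same way for a
multi-node ensemble):

-DzkHost=zkhost1:2181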

Thanks Erick for giving the right direction, much appreciated!

Regards,
Patrick

-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Wednesday, 20 March 2013 2:57 a.m.
To: solr-user@lucene.apache.org
Subject: Re: SolrCloud with Zookeeper ensemble : fail to restart master
server

First, bootstrap_conf and numShards should only be specified the
_first_ time you start up your leader. bootstrap_conf's purpose is to push
the configuration files to Zookeeper. numShards is a one-time-only
parameter that you shouldn't specify more than once; it is ignored
afterwards, I think. Once the conf files are up in Zookeeper they
don't need to be pushed again until they change, and you can use the
command-line tools to do that.
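
For example, with the zkcli script that ships with Solr, something along
these lines (ZK address, conf directory and conf name are illustrative):

example/cloud-scripts/zkcli.sh -zkhost localhost:2181 -cmd upconfig -confdir /path/to/collection1/conf -confname myconf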

Terminology: we're trying to get away from master/slave and use
leader/replica in SolrCloud mode to distinguish it from the old replication
process, so just checking to be sure: you really mean leader/replica, right?

 Watch your admin/SolrCloud link as you bring machines up and down. That
page will show you the state of each of your machines. Normally there's no
trouble bringing the leader up and down, _except_ it sounds like you have
your zookeeper running embedded. A quorum of ZK nodes (in this case one)
needs to be running for SolrCloud to operate. Still, that shouldn't prevent
your machine running ZK from coming back up.

So I'm a bit puzzled, but let's straighten out the startup stuff and watch
your Solr log on your leader when you bring it up; that should generate some
more questions.

Best
Erick


On Mon, Mar 18, 2013 at 11:12 PM, Patrick Mi  wrote:

> Hi there,
>
> I have experienced some problems starting the master server.
>
> Solr4.2 under Tomcat 7 on Centos6.
>
> Configuration :
> 3 Solr instances running on different machines, one shard, 3 cores, 2
> replicas, using the Zookeeper that comes with Solr
>
> The master server A has the following run option: -Dbootstrap_conf=true
> -DzkRun -DnumShards=1,
> The slave servers B and C have : -DzkHost=masterServerIP:2181
>
> It works well for add/update/delete etc after I start up master and slave
> servers in order.
>
> When master A is up, stopping/starting slaves B and C is OK.
>
> When slaves B and C are running I couldn't restart master A. Only after I
> shut down B and C can I start master A.
>
> Is this a feature, a bug, or something I haven't configured properly?
>
> Thanks in advance for your help
>
> Regards,
> Patrick
>
>



OPENNLP current patch compiling problem for 4.x branch

2013-05-22 Thread Patrick Mi
Hi,

I checked out from here
http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_4_3_0 and
downloaded the latest patch LUCENE-2899-current.patch.

Applied the patch OK, but when I did 'ant compile' I got the following error:


==
[javac] /home/lucene_solr_4_3_0/lucene/analysis/opennlp/src/java/org/apache/lucene/analysis/opennlp/FilterPayloadsFilter.java:43: error: cannot find symbol
[javac]     super(Version.LUCENE_44, input);
[javac]                   ^
[javac]   symbol:   variable LUCENE_44
[javac]   location: class Version
[javac] 1 error
==

Compiled it on trunk without problem.

Is this patch supposed to work for 4.X?

Regards,
Patrick 



RE: OPENNLP current patch compiling problem for 4.x branch

2013-05-28 Thread Patrick Mi
Thanks Steve, that worked for branch_4x 

-Original Message-
From: Steve Rowe [mailto:sar...@gmail.com] 
Sent: Friday, 24 May 2013 3:19 a.m.
To: solr-user@lucene.apache.org
Subject: Re: OPENNLP current patch compiling problem for 4.x branch

Hi Patrick,

I think you should check out and apply the patch to branch_4x, rather than
the lucene_solr_4_3_0 tag:

http://svn.apache.org/repos/asf/lucene/dev/branches/branch_4x
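
Something along these lines should then work (the checkout/patch/build steps
below are just a sketch, using the patch file you already have):

svn checkout http://svn.apache.org/repos/asf/lucene/dev/branches/branch_4x
cd branch_4x
patch -p0 -i LUCENE-2899-current.patch
ant compile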

Steve

On May 23, 2013, at 2:08 AM, Patrick Mi 
wrote:

> Hi,
> 
> I checked out from here
> http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_4_3_0 and
> downloaded the latest patch LUCENE-2899-current.patch.
> 
> Applied the patch OK, but when I did 'ant compile' I got the following
> error:
> 
> 
> ==
> [javac] /home/lucene_solr_4_3_0/lucene/analysis/opennlp/src/java/org/apache/lucene/analysis/opennlp/FilterPayloadsFilter.java:43: error: cannot find symbol
> [javac]     super(Version.LUCENE_44, input);
> [javac]                   ^
> [javac]   symbol:   variable LUCENE_44
> [javac]   location: class Version
> [javac] 1 error
> ==
> 
> Compiled it on trunk without problem.
> 
> Is this patch supposed to work for 4.X?
> 
> Regards,
> Patrick 
> 



OPENNLP problems

2013-05-28 Thread Patrick Mi
Hi there,

Checked out branch_4x and applied the latest patch LUCENE-2899-current.patch;
however, I ran into two problems.

Followed the wiki page instructions and set up a field with this type, aiming
to keep only nouns and verbs and to facet on the field:
==
<fieldType name="text_opennlp" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.OpenNLPTokenizerFactory" tokenizerModel="opennlp/en-token.bin"/>
    <filter class="solr.OpenNLPFilterFactory" posTaggerModel="opennlp/en-pos-maxent.bin"/>
    <filter class="solr.FilterPayloadsFilterFactory" payloadList="NN,NNS,NNP,NNPS,VB,VBD,VBG,VBN,VBP,VBZ,FW"/>
    <filter class="solr.StripPayloadsFilterFactory"/>
  </analyzer>
</fieldType>
==

Struggled to get that going until I put the extra parameter
keepPayloads="true" in, as below:
    <filter class="solr.FilterPayloadsFilterFactory" keepPayloads="true" payloadList="NN,NNS,NNP,NNPS,VB,VBD,VBG,VBN,VBP,VBZ,FW"/>

Question: am I doing the right thing? Is this a mistake on the wiki?

Second problem:

Posted the document XML files one by one to Solr and the result was what I
expected:

<add>
<doc>
  <field name="id">1</field>
  <field name="text">check in the hotel</field>
</doc>
</add>

However, if I put multiple documents into the same XML file and post it in
one go, only the first document gets processed (only 'check' and 'hotel' show
up in the facet result):

<add>
<doc>
  <field name="id">1</field>
  <field name="text">check in the hotel</field>
</doc>
<doc>
  <field name="id">2</field>
  <field name="text">removes the payloads</field>
</doc>
<doc>
  <field name="id">3</field>
  <field name="text">retains only nouns and verbs</field>
</doc>
</add>

The same problem occurs when the data is updated via CSV upload.

Is that a bug or something I did wrong?

Thanks in advance!

Regards,
Patrick




RE: OPENNLP problems

2013-06-09 Thread Patrick Mi
Hi Lance,

I updated the source from branch_4x and applied the latest patch
LUCENE-2899-x.patch uploaded on 6 June, but I still had the same problem.


Regards,
Patrick

-Original Message-
From: Lance Norskog [mailto:goks...@gmail.com] 
Sent: Thursday, 6 June 2013 5:16 p.m.
To: solr-user@lucene.apache.org
Subject: Re: OPENNLP problems

Patrick-
I found the problem with multiple documents. The problem was that the 
API for the life cycle of a Tokenizer changed, and I only noticed part 
of the change. You can now upload multiple documents in one post, and 
the OpenNLPTokenizer will process each document.

You're right, the example on the wiki is wrong. The FilterPayloadsFilter 
default is to remove the given payloads, and needs keepPayloads="true" 
to retain them.
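
In the schema that corresponds to a filter line along these lines (factory
name assumed from the patch's naming convention, payload list copied from
your example):

<filter class="solr.FilterPayloadsFilterFactory" keepPayloads="true" payloadList="NN,NNS,NNP,NNPS,VB,VBD,VBG,VBN,VBP,VBZ,FW"/>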

The fixed patch is up as LUCENE-2899-x.patch. Again, thanks for trying it.

Lance

https://issues.apache.org/jira/browse/LUCENE-2899

On 05/28/2013 10:08 PM, Patrick Mi wrote:
> Hi there,
>
> Checked out branch_4x and applied the latest patch LUCENE-2899-current.patch;
> however, I ran into two problems.
>
> Followed the wiki page instructions and set up a field with this type,
> aiming to keep only nouns and verbs and to facet on the field:
> ==
> <fieldType name="text_opennlp" class="solr.TextField" positionIncrementGap="100">
>   <analyzer>
>     <tokenizer class="solr.OpenNLPTokenizerFactory" tokenizerModel="opennlp/en-token.bin"/>
>     <filter class="solr.OpenNLPFilterFactory" posTaggerModel="opennlp/en-pos-maxent.bin"/>
>     <filter class="solr.FilterPayloadsFilterFactory" payloadList="NN,NNS,NNP,NNPS,VB,VBD,VBG,VBN,VBP,VBZ,FW"/>
>     <filter class="solr.StripPayloadsFilterFactory"/>
>   </analyzer>
> </fieldType>
> ==
>
> Struggled to get that going until I put the extra parameter
> keepPayloads="true" in, as below:
>     <filter class="solr.FilterPayloadsFilterFactory" keepPayloads="true" payloadList="NN,NNS,NNP,NNPS,VB,VBD,VBG,VBN,VBP,VBZ,FW"/>
>
> Question: am I doing the right thing? Is this a mistake on the wiki?
>
> Second problem:
>
> Posted the document XML files one by one to Solr and the result was what I
> expected:
>
> <add>
> <doc>
>   <field name="id">1</field>
>   <field name="text">check in the hotel</field>
> </doc>
> </add>
>
> However, if I put multiple documents into the same XML file and post it in
> one go, only the first document gets processed (only 'check' and 'hotel'
> show up in the facet result):
> <add>
> <doc>
>   <field name="id">1</field>
>   <field name="text">check in the hotel</field>
> </doc>
> <doc>
>   <field name="id">2</field>
>   <field name="text">removes the payloads</field>
> </doc>
> <doc>
>   <field name="id">3</field>
>   <field name="text">retains only nouns and verbs</field>
> </doc>
> </add>
>
> The same problem occurs when the data is updated via CSV upload.
>
> Is that a bug or something I did wrong?
>
> Thanks in advance!
>
> Regards,
> Patrick
>
>




DataDirectory: relative path doesn't work

2013-02-25 Thread Patrick Mi
I am running Solr 4.0 / Tomcat 7 on CentOS 6.

According to this page http://wiki.apache.org/solr/SolrConfigXml, if
<dataDir> is not absolute, then it is relative to the instanceDir of the
SolrCore.

However, the index directory is always created under the directory from which
I start Tomcat (startup.sh) rather than under the instanceDir of the SolrCore.

Am I doing something wrong in configuration?

Regards,
Patrick



RE: DataDirectory: relative path doesn't work

2013-03-11 Thread Patrick Mi
Thanks for fixing the wiki page http://wiki.apache.org/solr/SolrConfigXml;
it now says this:
'If this directory is not absolute, then it is relative to the directory
you're in when you start SOLR.'

It would be nice if you could drop me a line here after you make a change to
the documentation ...
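
For anyone else hitting this: an absolute path in solrconfig.xml, e.g.
something like the following (the path itself is just illustrative), keeps
the index where you expect it:

<dataDir>/var/solr/cores/mycore/data</dataDir>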

-Original Message-
From: Patrick Mi [mailto:patrick...@touchpointgroup.com] 
Sent: Tuesday, 26 February 2013 5:49 p.m.
To: solr-user@lucene.apache.org
Subject: DataDirectory: relative path doesn't work 

I am running Solr 4.0 / Tomcat 7 on CentOS 6.

According to this page http://wiki.apache.org/solr/SolrConfigXml, if
<dataDir> is not absolute, then it is relative to the instanceDir of the
SolrCore.

However, the index directory is always created under the directory from which
I start Tomcat (startup.sh) rather than under the instanceDir of the SolrCore.

Am I doing something wrong in configuration?

Regards,
Patrick



SolrCloud with Zookeeper ensemble : fail to restart master server

2013-03-18 Thread Patrick Mi
Hi there,

I have experienced some problems starting the master server.

Solr4.2 under Tomcat 7 on Centos6.

Configuration : 
3 Solr instances running on different machines, one shard, 3 cores, 2
replicas, using the Zookeeper that comes with Solr

The master server A has the following run option: -Dbootstrap_conf=true
-DzkRun -DnumShards=1, 
The slave servers B and C have : -DzkHost=masterServerIP:2181 

It works well for add/update/delete etc after I start up master and slave
servers in order.

When master A is up, stopping/starting slaves B and C is OK.

When slaves B and C are running I couldn't restart master A. Only after I
shut down B and C can I start master A.

Is this a feature, a bug, or something I haven't configured properly?

Thanks in advance for your help

Regards,
Patrick