Re: Regarding Solr Cloud issue...

2013-10-16 Thread Shalin Shekhar Mangar
Chris, can you post your complete clusterstate.json? Do all shards have a
null range? Also, did you issue any core admin CREATE commands apart from
the create collection API?

Primoz, I was able to reproduce this, but only by doing an illegal operation.
Suppose I create a collection with numShards=5 and then I issue a core
admin create command such as:
http://localhost:8983/solr/admin/cores?action=CREATE&name=xyz&collection=mycollection51&shard=shard6

Then a "shard6" is added to the collection with a null range. This is a bug
because we should never allow such a core admin create to succeed anyway.
I'll open an issue.
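For reference, the zkcli round-trip Primoz describes below looks roughly like
this (a sketch; the zkhost matches the examples later in this thread, and the
paths are assumed):

cloud-scripts/zkcli.sh -zkhost localhost:9983 -cmd get /clusterstate.json
# save the JSON, fix the null range by hand, then push it back:
cloud-scripts/zkcli.sh -zkhost localhost:9983 -cmd putfile /clusterstate.json /tmp/clusterstate.json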



On Wed, Oct 16, 2013 at 11:49 AM,  wrote:

> I sometimes also do get null ranges when doing collections/cores API
> actions CREATE or/and UNLOAD, etc... In 4.4.0 that was not easily fixed
> because zkCli had problems with "putfile" command, but in 4.5.0 it works
> OK. All you have to do is "download" clusterstate.json from ZK ("get
> /clusterstate.json"), fix ranges to appropriate values and upload the file
> back to ZK with zkCli.
>
> But why those null ranges happen at all is beyond me :)
>
> Primoz
>
>
>
> From:   Shalin Shekhar Mangar 
> To: solr-user@lucene.apache.org
> Date:   16.10.2013 07:37
> Subject:Re: Regarding Solr Cloud issue...
>
>
>
> I'm sorry I am not able to reproduce this issue.
>
> I started 5 solr-4.4 instances.
> I copied example directory into example1, example2, example3 and example4
> cd example; java -Dbootstrap_confdir=./solr/collection1/conf
> -Dcollection.configName=myconf -DzkRun -DnumShards=1 -jar start.jar
> cd example1; java -Djetty.port=7574 -DzkHost=localhost:9983 -jar start.jar
> cd example2; java -Djetty.port=7575 -DzkHost=localhost:9983 -jar start.jar
> cd example3; java -Djetty.port=7576 -DzkHost=localhost:9983 -jar start.jar
> cd example4; java -Djetty.port=7577 -DzkHost=localhost:9983 -jar start.jar
>
> After that I invoked:
>
> http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection51&numShards=5&replicationFactor=1
>
>
> I can see all shards having non-null ranges in clusterstate.
>
>
> On Tue, Oct 15, 2013 at 8:47 PM, Chris  wrote:
>
> > Hi Shalin,
> >
> > Thank you for your quick reply. I appreciate all the help.
> >
> > I started the solr cloud servers first...with 5 nodes.
> >
> > then i issued a command like below to create the shards -
> >
> >
> >
>
> http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=5&replicationFactor=1
>
> > <
> >
>
> http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=3&replicationFactor=4
>
> > >
> >
> > Please advise.
> >
> > Regards,
> > Chris
> >
> >
> > On Tue, Oct 15, 2013 at 8:07 PM, Shalin Shekhar Mangar <
> > shalinman...@gmail.com> wrote:
> >
> > > How did you create these shards? Can you tell us how to reproduce the
> > > issue?
> > >
> > > Any shard in a collection with compositeId router should never have
> > > null ranges.
> > >
> > >
> > > On Tue, Oct 15, 2013 at 7:07 PM, Chris  wrote:
> > >
> > > > Hi,
> > > >
> > > > I am using solr 4.4 as cloud. while creating shards i see that the
> > > > last shard has range of "null". i am not sure if this is a bug.
> > > >
> > > > I am stuck with having null value for the range in clusterstate.json
> > > > (attached below)
> > > >
> > > > "shard5":{ "range":null, "state":"active",
> "replicas":{"core_node1":{
> > > > "state":"active", "core":"Web_shard5_replica1",
> > > > "node_name":"domain-name.com:1981_solr", "base_url":"
> > > > http://domain-name.com:1981/solr";, "leader":"true",
> > > > "router":"compositeId"},
> > > >
> > > > I tried to use zookeeper cli to change this, but it was not able to. I
> > > > tried to locate this file, but didn't find it anywhere.
> > > >
> > > > Can you please let me know how do i change the range from null to
> > > > something meaningful? i have the range that i need, so if i can find
> > > > the file, maybe i can change it manually.
> > > >
> > > > My next question is - can we have a catch all for ranges, i mean if
> > > > things don't match any other range then insert in this shard..is this
> > > > possible?
> > > >
> > > > Kindly advise.
> > > > Chris
> > > >
> > >
> > >
> > >
> > > --
> > > Regards,
> > > Shalin Shekhar Mangar.
> > >
> >
>
>
>
> --
> Regards,
> Shalin Shekhar Mangar.
>
>


-- 
Regards,
Shalin Shekhar Mangar.


Re: Regarding Solr Cloud issue...

2013-10-16 Thread primoz . skale
If I am not mistaken, the only way to create a new shard for a collection
in 4.4.0 was to use the cores API. That worked fine for me until I used
*other* cores API commands. Those usually produced null ranges.

In 4.5.0 this is fixed with newly added commands "createshard" etc. to the 
collections API, right?

Primoz



From:   Shalin Shekhar Mangar 
To: solr-user@lucene.apache.org
Date:   16.10.2013 09:06
Subject:Re: Regarding Solr Cloud issue...



Chris, can you post your complete clusterstate.json? Do all shards have a
null range? Also, did you issue any core admin CREATE commands apart from
the create collection API?

Primoz, I was able to reproduce this, but only by doing an illegal operation.
Suppose I create a collection with numShards=5 and then I issue a core
admin create command such as:
http://localhost:8983/solr/admin/cores?action=CREATE&name=xyz&collection=mycollection51&shard=shard6


Then a "shard6" is added to the collection with a null range. This is a 
bug
because we should never allow such a core admin create to succeed anyway.
I'll open an issue.



On Wed, Oct 16, 2013 at 11:49 AM,  wrote:

> I sometimes also do get null ranges when doing collections/cores API
> actions CREATE or/and UNLOAD, etc... In 4.4.0 that was not easily fixed
> because zkCli had problems with "putfile" command, but in 4.5.0 it works
> OK. All you have to do is "download" clusterstate.json from ZK ("get
> /clusterstate.json"), fix ranges to appropriate values and upload the file
> back to ZK with zkCli.
>
> But why those null ranges happen at all is beyond me :)
>
> Primoz
>
>
>
> From:   Shalin Shekhar Mangar 
> To: solr-user@lucene.apache.org
> Date:   16.10.2013 07:37
> Subject:Re: Regarding Solr Cloud issue...
>
>
>
> I'm sorry I am not able to reproduce this issue.
>
> I started 5 solr-4.4 instances.
> I copied example directory into example1, example2, example3 and example4
> cd example; java -Dbootstrap_confdir=./solr/collection1/conf
> -Dcollection.configName=myconf -DzkRun -DnumShards=1 -jar start.jar
> cd example1; java -Djetty.port=7574 -DzkHost=localhost:9983 -jar start.jar
> cd example2; java -Djetty.port=7575 -DzkHost=localhost:9983 -jar start.jar
> cd example3; java -Djetty.port=7576 -DzkHost=localhost:9983 -jar start.jar
> cd example4; java -Djetty.port=7577 -DzkHost=localhost:9983 -jar start.jar
>
> After that I invoked:
>
> http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection51&numShards=5&replicationFactor=1
>
>
> I can see all shards having non-null ranges in clusterstate.
>
>
> On Tue, Oct 15, 2013 at 8:47 PM, Chris  wrote:
>
> > Hi Shalin,
> >
> > Thank you for your quick reply. I appreciate all the help.
> >
> > I started the solr cloud servers first...with 5 nodes.
> >
> > then i issued a command like below to create the shards -
> >
> >
> >
>
> > http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=5&replicationFactor=1
>
> > <
> >
>
> > http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=3&replicationFactor=4
>
> > >
> >
> > Please advise.
> >
> > Regards,
> > Chris
> >
> >
> > On Tue, Oct 15, 2013 at 8:07 PM, Shalin Shekhar Mangar <
> > shalinman...@gmail.com> wrote:
> >
> > > How did you create these shards? Can you tell us how to reproduce the
> > > issue?
> > >
> > > Any shard in a collection with compositeId router should never have
> > > null ranges.
> > >
> > >
> > > On Tue, Oct 15, 2013 at 7:07 PM, Chris  wrote:
> > >
> > > > Hi,
> > > >
> > > > I am using solr 4.4 as cloud. while creating shards i see that the
> > > > last shard has range of "null". i am not sure if this is a bug.
> > > >
> > > > I am stuck with having null value for the range in clusterstate.json
> > > > (attached below)
> > > >
> > > > "shard5":{ "range":null, "state":"active",
> "replicas":{"core_node1":{
> > > > "state":"active", "core":"Web_shard5_replica1",
> > > > "node_name":"domain-name.com:1981_solr", "base_url":"
> > > > http://domain-name.com:1981/solr";, "leader":"true",
> > > > "router":"compositeId"},
> > > >
> > > > I tried to use zookeeper cli to change this, but it was not able to. I
> > > > tried to locate this file, but didn't find it anywhere.
> > > >
> > > > Can you please let me know how do i change the range from null to
> > > > something meaningful? i have the range that i need, so if i can find
> > > > the file, maybe i can change it manually.
> > > >
> > > > My next question is - can we have a catch all for ranges, i mean if
> > > > things don't match any other range then insert in this shard..is this
> > > > possible?
> > > >
> > > > Kindly advise.
> > > > Chris
> > > >
> > >
> > >
> > >
> > > --
> > > Regards,
> > > Shalin Shekhar Mangar.
> > >
> >
>
>
>
> --
> Regards,
> Shalin Shekhar Mangar.
>
>


-- 
Regards,
Shalin Shekhar Mangar.



Re: SPLITSHARD not working in SOLR-4.4.0

2013-10-16 Thread RadhaJayalakshmi
Shalin,
It is working for me. As you rightly pointed out, I had defined the UNIQUE_KEY field
in the schema but forgot to mention this field in the <uniqueKey> declaration.
After I added it, it started working.
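For the archives, the relevant bit of schema.xml looks roughly like this (a
sketch; the field type and attributes are assumed):

<field name="UNIQUE_KEY" type="string" indexed="true" stored="true" required="true"/>
<uniqueKey>UNIQUE_KEY</uniqueKey>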
Another question I have with regard to SPLITSHARD: we are not able to
control which tomcat nodes the split shards are created on.
While creating a collection, we can mention createNodeSet to set our
preference of tomcat nodes on which the collection's slices should be
created.
But I don't find that feature in the SPLITSHARD API. Would you know whether
it is a limitation in solr 4.4, or is there any other means by which we can
achieve this?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SPLITSHARD-not-working-in-SOLR-4-4-0-tp4095623p4095809.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: [Indexing XML files in Solr with DataImportHandler]

2013-10-16 Thread kujta1
It is not indexing; it says there are no files indexed.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Indexing-XML-files-in-Solr-with-DataImportHandler-tp4095628p4095811.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: [Indexing XML files in Solr with DataImportHandler]

2013-10-16 Thread Gora Mohanty
On 16 October 2013 13:06, kujta1  wrote:
> it is not indexing, it is saying there are no files indexed

If you expect answers on the mailing list, it might be best to provide
details here. From a quick glance at Stackoverflow, it looks like you
need a FileListEntityProcessor.

Searching Google turns up many examples of using a FileDataSource,
e.g., see:
http://java.dzone.com/news/data-import-handler-%E2%80%93-import
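A minimal data-config along those lines might look like this (a sketch only;
baseDir, forEach and the field mappings are assumptions to adapt):

<dataConfig>
  <dataSource type="FileDataSource"/>
  <document>
    <entity name="files" processor="FileListEntityProcessor"
            baseDir="/path/to/xml" fileName=".*\.xml" recursive="true"
            rootEntity="false" dataSource="null">
      <!-- each file found above is parsed by the inner XPath entity -->
      <entity name="doc" processor="XPathEntityProcessor"
              url="${files.fileAbsolutePath}" forEach="/doc">
        <field column="id" xpath="/doc/id"/>
        <field column="title" xpath="/doc/title"/>
      </entity>
    </entity>
  </document>
</dataConfig>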

Regards,
Gora


Re: Debugging update request

2013-10-16 Thread michael.boom
Thanks Erick!

The version is 4.4.0.

I'm posting batches of 100k docs every 30-40 sec from each indexing client,
and sometimes two or more clients post within a very small timeframe. That's
when I think the deadlock happens.

I'll try to replicate the problem and check the thread dump.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Debugging-update-request-tp4095619p4095821.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Debugging update request

2013-10-16 Thread Chris Geeringh
I ran an import last night, and this morning my cloud wouldn't accept
updates. I'm running the latest 4.6 snapshot. I was importing with the latest
solrj snapshot, using the javabin transport with CloudSolrServer.

The cluster had indexed ~1.3 million docs before no further updates were
accepted, querying still working.

I'll run jstack shortly and provide the results.

On Wednesday, October 16, 2013, michael.boom wrote:

> Thanks Erick!
>
> The version is 4.4.0.
>
> I'm posting 100k docs batches every 30-40 sec from each indexing client and
> sometimes two or more clients post in a very small timeframe. That's when i
> think the deadlock happens.
>
> I'll try to replicate the problem and check the thread dump.
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Debugging-update-request-tp4095619p4095821.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: SPLITSHARD not working in SOLR-4.4.0

2013-10-16 Thread Shalin Shekhar Mangar
Thanks for clearing that up.

The way it is implemented, shard splitting must create the leaders of the
sub-shards on the same node as the leader of the parent shard. The locations
of the other replicas of the sub-shards are chosen at random. Split shard
doesn't support a createNodeSet parameter yet, but it'd make for a nice
improvement. Can you please open a jira issue?
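For reference, a split is invoked along these lines (collection and shard
names assumed):

http://localhost:8983/solr/admin/collections?action=SPLITSHARD&collection=mycollection&shard=shard1

The resulting sub-shards are named shard1_0 and shard1_1, each covering half
of the parent's hash range.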


On Wed, Oct 16, 2013 at 1:00 PM, RadhaJayalakshmi <
rlakshminaraya...@inautix.co.in> wrote:

> Shalin,
> It is working for me. As you rightly pointed out, I had defined the
> UNIQUE_KEY field in the schema but forgot to mention this field in the
> <uniqueKey> declaration. After I added it, it started working.
> Another question I have with regard to SPLITSHARD: we are not able to
> control which tomcat nodes the split shards are created on.
> While creating a collection, we can mention createNodeSet to set our
> preference of tomcat nodes on which the collection's slices should be
> created.
> But I don't find that feature in the SPLITSHARD API. Would you know whether
> it is a limitation in solr 4.4, or is there any other means by which we can
> achieve this?
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/SPLITSHARD-not-working-in-SOLR-4-4-0-tp4095623p4095809.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>



-- 
Regards,
Shalin Shekhar Mangar.


Re: Regarding Solr Cloud issue...

2013-10-16 Thread Shalin Shekhar Mangar
If the initial collection was created with a numShards parameter (and hence
the compositeId router), then there was no way to create a new logical shard.
You can add replicas with the core admin API, but only to shards that already
exist. A new logical shard can only be created by splitting an existing one.

The "createshard" API also has the same limitation -- it cannot create a
shard for a collection with compositeId router. It is supposed to be used
for collections with custom sharding (i.e. "implicit" router). In such
collections, there is no concept of a hash range and routing is done
explicitly by the user using the "shards" parameter in the request or by
sending the request to the target core/node directly.

So, in summary, attempting to add a new logical shard to a collection with
compositeId router via CoreAdmin APIs is wrong, unsupported and should be
disallowed. Adding replicas to existing logical shards is okay though.
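For example, a custom-sharded collection is created and queried roughly like
this (names assumed; router.name is the 4.5 parameter):

http://localhost:8983/solr/admin/collections?action=CREATE&name=logs&router.name=implicit&shards=shardA,shardB&replicationFactor=1
http://localhost:8983/solr/logs/select?q=*:*&shards=shardA

The second request restricts the query to shardA explicitly instead of
relying on hash-range routing.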


On Wed, Oct 16, 2013 at 12:56 PM,  wrote:

> If I am not mistaken, the only way to create a new shard for a collection
> in 4.4.0 was to use the cores API. That worked fine for me until I used
> *other* cores API commands. Those usually produced null ranges.
>
> In 4.5.0 this is fixed with newly added commands "createshard" etc. to the
> collections API, right?
>
> Primoz
>
>
>
> From:   Shalin Shekhar Mangar 
> To: solr-user@lucene.apache.org
> Date:   16.10.2013 09:06
> Subject:Re: Regarding Solr Cloud issue...
>
>
>
> Chris, can you post your complete clusterstate.json? Do all shards have a
> null range? Also, did you issue any core admin CREATE commands apart from
> the create collection api.
>
> Primoz, I was able to reproduce this but by doing an illegal operation.
> Suppose I create a collection with numShards=5 and then I issue a core
> admin create command such as:
>
> http://localhost:8983/solr/admin/cores?action=CREATE&name=xyz&collection=mycollection51&shard=shard6
>
>
> Then a "shard6" is added to the collection with a null range. This is a
> bug
> because we should never allow such a core admin create to succeed anyway.
> I'll open an issue.
>
>
>
> On Wed, Oct 16, 2013 at 11:49 AM,  wrote:
>
> > I sometimes also do get null ranges when doing collections/cores API
> > actions CREATE or/and UNLOAD, etc... In 4.4.0 that was not easily fixed
> > because zkCli had problems with "putfile" command, but in 4.5.0 it works
> > OK. All you have to do is "download" clusterstate.json from ZK ("get
> > /clusterstate.json"), fix ranges to appropriate values and upload the
> file
> > back to ZK with zkCli.
> >
> > But why those null ranges happen at all is beyond me :)
> >
> > Primoz
> >
> >
> >
> > From:   Shalin Shekhar Mangar 
> > To: solr-user@lucene.apache.org
> > Date:   16.10.2013 07:37
> > Subject:Re: Regarding Solr Cloud issue...
> >
> >
> >
> > I'm sorry I am not able to reproduce this issue.
> >
> > I started 5 solr-4.4 instances.
> > I copied example directory into example1, example2, example3 and
> example4
> > cd example; java -Dbootstrap_confdir=./solr/collection1/conf
> > -Dcollection.configName=myconf -DzkRun -DnumShards=1 -jar start.jar
> > cd example1; java -Djetty.port=7574 -DzkHost=localhost:9983 -jar
> start.jar
> > cd example2; java -Djetty.port=7575 -DzkHost=localhost:9983 -jar
> start.jar
> > cd example3; java -Djetty.port=7576 -DzkHost=localhost:9983 -jar
> start.jar
> > cd example4; java -Djetty.port=7577 -DzkHost=localhost:9983 -jar
> start.jar
> >
> > After that I invoked:
> >
> >
>
> http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection51&numShards=5&replicationFactor=1
>
> >
> >
> > I can see all shards having non-null ranges in clusterstate.
> >
> >
> > On Tue, Oct 15, 2013 at 8:47 PM, Chris  wrote:
> >
> > > Hi Shalin,
> > >
> > > Thank you for your quick reply. I appreciate all the help.
> > >
> > > I started the solr cloud servers first...with 5 nodes.
> > >
> > > then i issued a command like below to create the shards -
> > >
> > >
> > >
> >
> >
>
> http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=5&replicationFactor=1
>
> >
> > > <
> > >
> >
> >
>
> http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=3&replicationFactor=4
>
> >
> > > >
> > >
> > > Please advise.
> > >
> > > Regards,
> > > Chris
> > >
> > >
> > > On Tue, Oct 15, 2013 at 8:07 PM, Shalin Shekhar Mangar <
> > > shalinman...@gmail.com> wrote:
> > >
> > > > How did you create these shards? Can you tell us how to reproduce the
> > > > issue?
> > > >
> > > > Any shard in a collection with compositeId router should never have
> > > > null ranges.
> > > >
> > > >
> > > > On Tue, Oct 15, 2013 at 7:07 PM, Chris  wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I am using solr 4.4 as cloud. while creating shards i see that the
> > > > > last shard has range of "null". i am not sure if this is a bug.
> > > > >
> > > > > I am stuck with having null value for the range in
> 

RE: ClusteringComponent under Tomcat 7

2013-10-16 Thread Lieberman, Ariel
Hi,

If I recall correctly, this problem relates to the class loader path.

Make sure that the ./lib (under solr home, where you've placed the jars) is not
also part of the Tomcat class loader path
(in other words, solr and Tomcat cannot share the same ./lib directories).

-Ariel

-Original Message-
From: ravi koshal [mailto:ravikosha...@gmail.com] 
Sent: Tuesday, October 15, 2013 10:10 AM
To: solr-user@lucene.apache.org
Subject: Re: ClusteringComponent under Tomcat 7

Hi Lieberman,
I am facing the same issue. Were you able to resolve this?
I am able to see the solr home, but the cores do not appear.
My stack trace is as follows:

org.apache.solr.common.SolrException: Error Instantiating SearchComponent, 
solr.clustering.ClusteringComponent failed to instantiate 
org.apache.solr.handler.component.SearchComponent
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:834)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:625)
at
org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:524)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:559)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:249)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:241)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source) Caused by: 
org.apache.solr.common.SolrException: Error Instantiating SearchComponent, 
solr.clustering.ClusteringComponent failed to instantiate 
org.apache.solr.handler.component.SearchComponent
at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:547)
at org.apache.solr.core.SolrCore.createInitInstance(SolrCore.java:582)
at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:2128)
at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:2122)
at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:2155)
at
org.apache.solr.core.SolrCore.loadSearchComponents(SolrCore.java:1177)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:762)
... 11 more
Caused by: java.lang.ClassCastException: class 
org.apache.solr.handler.clustering.ClusteringComponent
at java.lang.Class.asSubclass(Unknown Source)
at
org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:443)
at
org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:381)
at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:526)


Lieberman, Ariel  verint.com> writes:

> 
> Hi,
> 
> I'm trying to run Solr 4.3 (and 4.4) with 
> -Dsolr.clustering.enabled=true
> 
> I've copied all relevant jars to ./lib directory under the instance.
> 
> With jetty it runs OK! But, under Tomcat I receives the error 
> (exception)
below.
> 
> Any idea/help?
> 
> Thanks,
> 
> -Ariel
> 
> org.apache.solr.common.SolrException: Error Instantiating 
> SearchComponent, solr.clustering.ClusteringComponent failed to 
> instantiate
org.apache.solr.handler.component.SearchComponent
>  at org.apache.solr.core.SolrCore.<init>(SolrCore.java:835)
>  at org.apache.solr.core.SolrCore.<init>(SolrCore.java:629)
>  at
org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:622)
>  at org.apache.solr.core.CoreContainer.create(CoreContainer.java:657)
>  at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:364)
>  at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:356)
>  at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
>  at java.util.concurrent.FutureTask.run(Unknown Source)
>  at java.util.concurrent.Executors$RunnableAdapter.call(Unknown
Source)
>  at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
>  at java.util.concurrent.FutureTask.run(Unknown Source)
>  at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>  at java.lang.Thread.run(Unknown Source) Caused by: 
> org.apache.solr.common.SolrException: Error Instantiating
SearchComponent,
> solr.clustering.ClusteringComponent failed to instantiate
org.apache.solr.handler.component.SearchComponent
>  at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:551)
>  at
org.apache.solr.core.SolrCore.createInitInstance(SolrCore.java:586)
>  at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:2173)
>  at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:2167)
>  at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:2200)
>  at
org.apache.solr.core.SolrCore.loadSearchComponents(SolrCore.java:1231)
>  at org.apache.solr.core.SolrCore.<init>(SolrCore.java:766)
>  ... 13 more
> Caused by: java.lang.ClassC

req info : SOLRJ and TermVector

2013-10-16 Thread elfu
hi,

can i access TermVector information using solrj ?


thx,
elfu


Re: Regarding Solr Cloud issue...

2013-10-16 Thread primoz . skale
Yep, you are right - I only created extra replicas with the cores API. For a
new shard I had to use the "split shard" command.

My apologies.

Primož



From:   Shalin Shekhar Mangar 
To: solr-user@lucene.apache.org
Date:   16.10.2013 10:45
Subject:Re: Regarding Solr Cloud issue...



If the initial collection was created with a numShards parameter (and hence
the compositeId router), then there was no way to create a new logical shard.
You can add replicas with the core admin API, but only to shards that already
exist. A new logical shard can only be created by splitting an existing one.
one.

The "createshard" API also has the same limitation -- it cannot create a
shard for a collection with compositeId router. It is supposed to be used
for collections with custom sharding (i.e. "implicit" router). In such
collections, there is no concept of a hash range and routing is done
explicitly by the user using the "shards" parameter in the request or by
sending the request to the target core/node directly.

So, in summary, attempting to add a new logical shard to a collection with
compositeId router via CoreAdmin APIs is wrong, unsupported and should be
disallowed. Adding replicas to existing logical shards is okay though.


On Wed, Oct 16, 2013 at 12:56 PM,  wrote:

> If I am not mistaken, the only way to create a new shard for a collection
> in 4.4.0 was to use the cores API. That worked fine for me until I used
> *other* cores API commands. Those usually produced null ranges.
>
> In 4.5.0 this is fixed with newly added commands "createshard" etc. to the
> collections API, right?
>
> Primoz
>
>
>
> From:   Shalin Shekhar Mangar 
> To: solr-user@lucene.apache.org
> Date:   16.10.2013 09:06
> Subject:Re: Regarding Solr Cloud issue...
>
>
>
> Chris, can you post your complete clusterstate.json? Do all shards have a
> null range? Also, did you issue any core admin CREATE commands apart from
> the create collection API?
>
> Primoz, I was able to reproduce this but by doing an illegal operation.
> Suppose I create a collection with numShards=5 and then I issue a core
> admin create command such as:
>
> http://localhost:8983/solr/admin/cores?action=CREATE&name=xyz&collection=mycollection51&shard=shard6
>
>
> Then a "shard6" is added to the collection with a null range. This is a
> bug
> because we should never allow such a core admin create to succeed 
anyway.
> I'll open an issue.
>
>
>
> On Wed, Oct 16, 2013 at 11:49 AM,  wrote:
>
> > I sometimes also do get null ranges when doing collections/cores API
> > actions CREATE or/and UNLOAD, etc... In 4.4.0 that was not easily fixed
> > because zkCli had problems with "putfile" command, but in 4.5.0 it works
> > OK. All you have to do is "download" clusterstate.json from ZK ("get
> > /clusterstate.json"), fix ranges to appropriate values and upload the
> > file back to ZK with zkCli.
> >
> > But why those null ranges happen at all is beyond me :)
> >
> > Primoz
> >
> >
> >
> > From:   Shalin Shekhar Mangar 
> > To: solr-user@lucene.apache.org
> > Date:   16.10.2013 07:37
> > Subject:Re: Regarding Solr Cloud issue...
> >
> >
> >
> > I'm sorry I am not able to reproduce this issue.
> >
> > I started 5 solr-4.4 instances.
> > I copied example directory into example1, example2, example3 and example4
> > cd example; java -Dbootstrap_confdir=./solr/collection1/conf
> > -Dcollection.configName=myconf -DzkRun -DnumShards=1 -jar start.jar
> > cd example1; java -Djetty.port=7574 -DzkHost=localhost:9983 -jar start.jar
> > cd example2; java -Djetty.port=7575 -DzkHost=localhost:9983 -jar start.jar
> > cd example3; java -Djetty.port=7576 -DzkHost=localhost:9983 -jar start.jar
> > cd example4; java -Djetty.port=7577 -DzkHost=localhost:9983 -jar start.jar
> >
> > After that I invoked:
> >
> >
> > http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection51&numShards=5&replicationFactor=1

>
> >
> >
> > I can see all shards having non-null ranges in clusterstate.
> >
> >
> > On Tue, Oct 15, 2013 at 8:47 PM, Chris  wrote:
> >
> > > Hi Shalin,.
> > >
> > > Thank you for your quick reply. I appreciate all the help.
> > >
> > > I started the solr cloud servers first...with 5 nodes.
> > >
> > > then i issued a command like below to create the shards -
> > >
> > >
> > >
> >
> >
>
> > > http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=5&replicationFactor=1
>
> >
> > > <
> > >
> >
> >
>
> > > http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=3&replicationFactor=4
>
> >
> > > >
> > >
> > > Please advise.
> > >
> > > Regards,
> > > Chris
> > >
> > >
> > > On Tue, Oct 15, 2013 at 8:07 PM, Shalin Shekhar Mangar <
> > > shalinman...@gmail.com> wrote:
> > >
> > > > How did you create these shards? Can you tell us how to reproduce the
> > > > issue?
> > > >
> > > > Any shard in a collection with compositeId router should never have
> > > > null ranges.
> > > >
> > > >
> > >

Local Solr and Webserver-Solr act differently ("and" treated like "or")

2013-10-16 Thread Stavros Delisavas
Hello Solr-Experts,

I am currently having a strange issue with my solr queries. I am running
a small php/mysql website that uses Solr for faster text searches in
name lists, movie titles, etc. Recently I noticed that the results on my
local development environment differ from those on my webserver. Both
use the exact same mysql database with identical solr queries for
data import.
This is a sample query:

http://localhost:8080/solr/select/?q=title%3A%28into+AND+the+AND+wild*%29&version=2.2&start=0&rows=1000&indent=on&fl=titleid

It is autogenerated by a php script and 100% identical locally and on
my webserver. My local solr gives me the expected results: all entries
that have the words "into" AND "the" AND "wild*" in them.
But my webserver acts as if I was looking for "into" OR "the" OR
"wild*", even though the query is the same (as shown above). That's why I
get useless (too many) results on the webserver side.

I don't know what could be the issue. I have tried to check the
config files, but I don't really know what to look for, so it is
overwhelming for me to search through this big file without knowing.

What could be the problem, where can I check/find it and how can I solve
that problem?

In case additional information is needed, let me know please.

Thank you!

(Excuse my poor English, please. It's not my mother language.)


Re: Debugging update request

2013-10-16 Thread michael.boom
I got the trace from jstack.
I found references to "semaphore" but not sure if this is what you meant.
Here's the trace:
http://pastebin.com/15QKAz7U



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Debugging-update-request-tp4095619p4095847.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Debugging update request

2013-10-16 Thread Chris Geeringh
Here is my jstack output... Lots of blocked threads.

http://pastebin.com/1ktjBYbf


On 16 October 2013 10:28, michael.boom  wrote:

> I got the trace from jstack.
> I found references to "semaphore" but not sure if this is what you meant.
> Here's the trace:
> http://pastebin.com/15QKAz7U
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Debugging-update-request-tp4095619p4095847.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Boosting a field with defType:dismax --> No results at all

2013-10-16 Thread uwe72
Hi there,

I want to boost a field, see below.

If I add defType=dismax I don't get any results at all anymore.

What am I doing wrong?

Regards
Uwe



[requestHandler snippet from solrconfig.xml; the XML tags did not survive the
post. The remaining values indicate defaults of df=text, q.op=AND, spellcheck
options, defType=dismax, and qf=SignalImpl.baureihe^1011 text^0.1, with
spellcheck as a last-component.]
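For comparison, a minimal dismax setup looks roughly like this (a sketch;
only the qf value is taken from the post, the rest is assumed):

<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="qf">SignalImpl.baureihe^1011 text^0.1</str>
    <str name="q.op">AND</str>
  </lst>
</requestHandler>

Note that dismax does not parse fielded syntax such as field:value in q;
boosts belong in qf.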





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Boosting-a-field-with-defType-dismax-No-results-at-all-tp4095850.html
Sent from the Solr - User mailing list archive at Nabble.com.


Timeout Errors while using Collections API

2013-10-16 Thread RadhaJayalakshmi
Hi,
My setup is
Zookeeper ensemble - running with 3 nodes
Tomcats - 9 Tomcat instances are brought up, registering with zookeeper.

Steps :
1) I uploaded the solr configuration (db_data_config, solrconfig, schema
xmls) into zookeeper
2) Now, I am trying to create a collection with the collections API like
below:

http://miadevuser001.albridge.com:7021/solr/admin/collections?action=CREATE&name=Schwab_InvACC_Coll&numShards=1&replicationFactor=2&createNodeSet=localhost:7034_solr,localhost:7036_solr&collection.configName=InvestorAccountDomainConfig

Now, when I execute this command, I am getting the following error:
500 60015 createcollection the collection time out:60s
org.apache.solr.common.SolrException: createcollection the collection time out:60s
at
org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:175)
at
org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:156)
at
org.apache.solr.handler.admin.CollectionsHandler.handleCreateAction(CollectionsHandler.java:290)
at
org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:112)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at
org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:611)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:218)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
at
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
at
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
at
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
at
org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:947)
at
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
at
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
at
org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1009)
at
org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
at
org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)
500

Now after I got this error, I am not able to do any operation on these
instances with the collections API. It repeatedly gives the same timeout
error.
This setup was working fine 5 mins back; suddenly it started throwing these
exceptions. Any ideas please?






--
View this message in context: 
http://lucene.472066.n3.nabble.com/Timeout-Errors-while-using-Collections-API-tp4095852.html
Sent from the Solr - User mailing list archive at Nabble.com.


how does solr load plugins?

2013-10-16 Thread Liu Bo
Hi

I have written a plugin to index content, reusing our DAO layer, which is
developed using Spring.

What I am doing now is putting the plugin jar and all other depending jars
of DAO layer to shared lib folder under solr home.

In the log, I can see all the jars are loaded through SolrResourceLoader
like:

INFO  - 2013-10-16 16:25:30.611; org.apache.solr.core.SolrResourceLoader;
Adding 'file:/D:/apache-tomcat-7.0.42/solr/lib/spring-tx-3.1.0.RELEASE.jar'
to classloader


Then initialize the Spring context using:

ApplicationContext context = new
FileSystemXmlApplicationContext("/solr/spring/solr-plugin-bean-test.xml");


Then Spring will complain:

INFO  - 2013-10-16 16:33:57.432;
org.springframework.context.support.AbstractApplicationContext; Refreshing
org.springframework.context.support.FileSystemXmlApplicationContext@e582a85:
startup date [Wed Oct 16 16:33:57 CST 2013]; root of context hierarchy
INFO  - 2013-10-16 16:33:57.491;
org.springframework.beans.factory.xml.XmlBeanDefinitionReader; Loading XML
bean definitions from file
[D:\apache-tomcat-7.0.42\solr\spring\solr-plugin-bean-test.xml]
ERROR - 2013-10-16 16:33:59.944;
com.test.search.solr.spring.AppicationContextWrapper; Configuration
problem: Unable to locate Spring NamespaceHandler for XML schema namespace [
http://www.springframework.org/schema/context]
Offending resource: file
[D:\apache-tomcat-7.0.42\solr\spring\solr-plugin-bean-test.xml]

Spring context requires spring-tx-3.1.xsd which does exist
in spring-tx-3.1.0.RELEASE.jar under
"org\springframework\transaction\config\" package, but the program can't
find it even though it could load spring classes successfully.

The following won't work either.

ApplicationContext context = new
ClassPathXmlApplicationContext("classpath:spring/solr-plugin-bean-test.xml");
//the solr-plugin-bean-test.xml is packaged in plugin.jar as well.

But when I put all the jars under TOMCAT_HOME/webapp/solr/WEB-INF/lib and
use

ApplicationContext context = new
ClassPathXmlApplicationContext("classpath:spring/solr-plugin-bean-test.xml");

everything works fine: I can initialize the spring context and load DAO beans
to read data and then write it to the solr index. But isn't modifying
solr.war a bad practice?

It seems SolrResourceLoader only loads classes from plugin jars, but these
jars are NOT on the classpath. Please correct me if I am wrong.

Is there any way to use resources in plugin jars, such as configuration
files?

BTW, is there any difference between the SolrResourceLoader and the Tomcat
webapp classloader?
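One direction that may be worth trying (an untested sketch; whether it helps
depends on how Spring locates the XSDs): hand Spring the plugin's classloader
via the thread context classloader, which Spring's resource loading falls
back to by default:

// assumed entry point: somewhere in the plugin's init/inform hook,
// where "core" is the SolrCore handed to the plugin
ClassLoader pluginLoader = core.getResourceLoader().getClassLoader();
ClassLoader previous = Thread.currentThread().getContextClassLoader();
Thread.currentThread().setContextClassLoader(pluginLoader);
try {
    ApplicationContext context =
        new ClassPathXmlApplicationContext("spring/solr-plugin-bean-test.xml");
    // ... fetch the DAO beans from the context ...
} finally {
    Thread.currentThread().setContextClassLoader(previous);
}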

-- 
All the best

Liu Bo


SolrCloud Query Balancing

2013-10-16 Thread michael.boom
I have setup a SolrCloud system with: 3 shards, replicationFactor=3 on 3
machines along with 3 Zookeeper instances.

My web application makes queries to Solr specifying the hostname of one of
the machines. So that machine will always get the request and the other ones
will just serve as an aid.
So I would like to set up a load balancer that would fix that, balancing the
queries across all machines.
Maybe doing the same while indexing.

Would this be a good practice ? Any recommended tools for doing that?

Thanks!



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrCloud-Query-Balancing-tp4095854.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: SolrCloud Query Balancing

2013-10-16 Thread Chris Geeringh
If your web application is SolrJ/Java based, use a CloudSolrServer
instance with the zkHost string. It will take care of load balancing when
querying and indexing, and handle routing if a node goes down.
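A minimal sketch (the ZooKeeper addresses are placeholders for your three zk
instances):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

CloudSolrServer server = new CloudSolrServer("zk1:2181,zk2:2181,zk3:2181");
server.setDefaultCollection("collection1");
// queries are spread over healthy replicas; updates are routed to leaders
// (constructor and query throw checked exceptions - handle as appropriate)
QueryResponse rsp = server.query(new SolrQuery("*:*"));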


On 16 October 2013 10:52, michael.boom  wrote:

> I have setup a SolrCloud system with: 3 shards, replicationFactor=3 on 3
> machines along with 3 Zookeeper instances.
>
> My web application makes queries to Solr specifying the hostname of one of
> the machines. So that machine will always get the request and the other
> ones
> will just serve as an aid.
> So I would like to setup a load balancer that would fix that, balancing the
> queries to all machines.
> Maybe doing the same while indexing.
>
> Would this be a good practice ? Any recommended tools for doing that?
>
> Thanks!
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/SolrCloud-Query-Balancing-tp4095854.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: SolrCloud Query Balancing

2013-10-16 Thread michael.boom
Thanks!

I've read a lil' bit about that, but my app is php-based so I'm afraid I
can't use that.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrCloud-Query-Balancing-tp4095854p4095857.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Different document types in different collections OR same collection without sharing fields?

2013-10-16 Thread user 01
Can some expert users please leave a comment on this ?


On Sun, Oct 6, 2013 at 2:54 AM, user 01  wrote:

>  Using a single node Solr instance, I need to search for, lets say,
> electronics items & grocery items. But I never want to search both of them
> together. When I search for electronics I don't expect a grocery item ever
> & vice versa.
>
> Should I be defining both these document types within a single schema.xml
> or should I use different collection for each of these two(maintaining
> separate schema.xml & solrconfig.xml for each of two) ?
>
> I believe that if I add both to a single collection, without sharing
> fields among these two document types, I should be equally good as
> separating them in two collection(in terms of performance & all), as their
> indexes/filter caches would be totally independent of each other when they
> don't share fields?
>
>
> Also posted at SO: http://stackoverflow.com/q/19202882/530153
>


Re: SolrCloud Query Balancing

2013-10-16 Thread Henrik Ossipoff Hansen
What you could do (and what we do) is have a simple proxy in front of your
Solr instances. We, for example, run Nginx in front of all of our Tomcats,
and use Nginx's upstream capabilities as a simple load balancer for our
SolrCloud cluster.

http://wiki.nginx.org/HttpUpstreamModule

I'm sure other web servers have similar modules.
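For the archives, the upstream block can be as simple as this (hostnames and
ports assumed):

upstream solrcloud {
    server solr1.example.com:8983;
    server solr2.example.com:8983;
    server solr3.example.com:8983;
}
server {
    listen 80;
    location /solr/ {
        proxy_pass http://solrcloud;
    }
}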

Den 16/10/2013 kl. 12.08 skrev michael.boom 
mailto:my_sky...@yahoo.com>>:

Thanks!

I've read a lil' bit about that, but my app is php-based so I'm afraid I
can't use that.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrCloud-Query-Balancing-tp4095854p4095857.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Local Solr and Webserver-Solr act differently ("and" treated like "or")

2013-10-16 Thread Erik Hatcher
What does the debug output from debugQuery=true say between the two?
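(One common culprit when AND behaves like OR is a differing default operator,
e.g. <solrQueryParser defaultOperator="AND"/> in one schema.xml and "OR" in
the other - worth comparing, though the debug output will tell.)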



On Oct 16, 2013, at 5:16, Stavros Delisavas  wrote:

> Hello Solr-Experts,
> 
> I am currently having a strange issue with my solr querys. I am running
> a small php/mysql-website that uses Solr for faster text-searches in
> name-lists, movie-titles, etc. Recently I noticed that the results on my
> local development-environment differ from those on my webserver. Both
> use the 100% same mysql-database with identical solr-queries for
> data-import.
> This is a sample query:
> 
> http://localhost:8080/solr/select/?q=title%3A%28into+AND+the+AND+wild*%29&version=2.2&start=0&rows=1000&indent=on&fl=titleid
> 
> It is autogenerated by an php-script and 100% identical on local and on
> my webserver. My local solr gives me the expected results: all entries
> that have the words "into" AND "the" AND "wild*" in them.
> But my webserver acts as if I was looking for "into" OR "the" OR
> "wild*", eventhough the query is the same (as shown above). That's why I
> get useless (too many) results on the webserver-side.
> 
> I don't know what could be the issue. I have tried to check the
> config-files but I don't really know what to look for, so it is
> overwhelming for me to search through this big file without knowing.
> 
> What could be the problem, where can I check/find it and how can I solve
> that problem?
> 
> In case, additional informations are needed, let me know please.
> 
> Thank you!
> 
> (Excuse my poor english, please. It's not my mother-language.)


Solr Copy field append values ?

2013-10-16 Thread vishgupt
Hi ,
Schema like this (a rough sketch follows below):

external_id is a multivalued field.
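Roughly, with assumed field types:

<field name="upc" type="string" indexed="true" stored="true"/>
<field name="external_id" type="string" indexed="true" stored="true"
       multiValued="true"/>
<copyField source="upc" dest="external_id"/>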



I want to know whether the values of upc will be appended to the existing
values of external_id or override them.

For example if I send a document having values 

upc:131
external_id:423

for indexing in solr with the above-mentioned schema, what will be the value
of the external_id field - 131, or 131 and 423?

Thanks
Vishal





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Copy-field-append-values-tp4095862.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Different document types in different collections OR same collection without sharing fields?

2013-10-16 Thread shrikanth k
Hi,

Please refer below link for clarification on fields having null value.

http://stackoverflow.com/questions/7332122/solr-what-are-the-default-values-for-fields-which-does-not-have-a-default-value

Logically it is better to have different collections for different domain
data. Having 2 collections will improve the overall performance.

Currently I am holding 2 collections for different domain data. It eases
importing data and re-indexing.


regards,
Shrikanth



On Wed, Oct 16, 2013 at 3:48 PM, user 01  wrote:

> Can some expert users please leave a comment on this ?
>
>
> On Sun, Oct 6, 2013 at 2:54 AM, user 01  wrote:
>
> >  Using a single node Solr instance, I need to search for, lets say,
> > electronics items & grocery items. But I never want to search both of
> them
> > together. When I search for electrnoics I don't expect a grocery item
> ever
> > & vice versa.
> >
> > Should I be defining both these document types within a single schema.xml
> > or should I use different collection for each of these two(maintaining
> > separate schema.xml & solrconfig.xml for each of two) ?
> >
> > I believe that if I add both to a single collection, without sharing
> > fields among these two document types, I should be equally good as
> > separating them in two collection(in terms of performance & all), as
> their
> > indexes/filter caches would be totally independent of each other when
> they
> > don't share fields?
> >
> >
> > Also posted at SO: http://stackoverflow.com/q/19202882/530153
> >
>



--


Re: Regarding Solr Cloud issue...

2013-10-16 Thread Chris
Hi,

Please find the clusterstate.json below:

I have created a dev environment on one of my servers so that you can see
the issue live - http://64.251.14.47:1984/solr/

Also, there seems to be something wrong in zookeeper: when we try to add
documents using solrj, it works fine as long as the insert load is not high,
but once we start doing many inserts, it throws a lot of errors...

I am doing something like -

CloudSolrServer solrCoreCloud = new CloudSolrServer(cloudURL);
solrCoreCloud.setDefaultCollection("Image");
UpdateResponse up = solrCoreCloud.addBean(resultItem);
UpdateResponse upr = solrCoreCloud.commit();
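(As an aside, a commit per document is expensive under load; a batched
variant - a sketch, assuming the resultItem beans are collected into a list
first - would be:

List<Object> batch = new ArrayList<Object>();
// ... add resultItem beans to batch ...
solrCoreCloud.addBeans(batch);
// let autoCommit/commitWithin handle commits rather than committing per add
)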



clusterstate.json ---

{
  "collection1":{
"shards":{
  "shard2":{
"range":"b333-e665",
"state":"active",
"replicas":{"core_node4":{
"state":"active",
"core":"collection1",
"node_name":"64.251.14.47:1984_solr",
"base_url":"http://64.251.14.47:1984/solr";,
"leader":"true"}}},
  "shard3":{
"range":"e666-1998",
"state":"active",
"replicas":{"core_node5":{
"state":"active",
"core":"collection1",
"node_name":"64.251.14.47:1985_solr",
"base_url":"http://64.251.14.47:1985/solr";,
"leader":"true"}}},
  "shard4":{
"range":"1999-4ccb",
"state":"active",
"replicas":{
  "core_node2":{
"state":"active",
"core":"collection1",
"node_name":"64.251.14.47:1982_solr",
"base_url":"http://64.251.14.47:1982/solr"},
  "core_node6":{
"state":"active",
"core":"collection1",
"node_name":"64.251.14.47:1981_solr",
"base_url":"http://64.251.14.47:1981/solr";,
"leader":"true"}}},
  "shard5":{
"range":"4ccc-7fff",
"state":"active",
"replicas":{"core_node3":{
"state":"active",
"core":"collection1",
"node_name":"64.251.14.47:1983_solr",
"base_url":"http://64.251.14.47:1983/solr";,
"leader":"true",
"router":"compositeId"},
  "Web":{
"shards":{
  "shard1":{
"range":"8000-b332",
"state":"active",
"replicas":{"core_node2":{
"state":"active",
"core":"Web_shard1_replica1",
"node_name":"64.251.14.47:1983_solr",
"base_url":"http://64.251.14.47:1983/solr";,
"leader":"true"}}},
  "shard2":{
"range":"b333-e665",
"state":"active",
"replicas":{"core_node3":{
"state":"active",
"core":"Web_shard2_replica1",
"node_name":"64.251.14.47:1984_solr",
"base_url":"http://64.251.14.47:1984/solr";,
"leader":"true"}}},
  "shard3":{
"range":"e666-1998",
"state":"active",
"replicas":{"core_node4":{
"state":"active",
"core":"Web_shard3_replica1",
"node_name":"64.251.14.47:1982_solr",
"base_url":"http://64.251.14.47:1982/solr";,
"leader":"true"}}},
  "shard4":{
"range":"1999-4ccb",
"state":"active",
"replicas":{"core_node5":{
"state":"active",
"core":"Web_shard4_replica1",
"node_name":"64.251.14.47:1985_solr",
"base_url":"http://64.251.14.47:1985/solr";,
"leader":"true"}}},
  "shard5":{
"range":null,
"state":"active",
"replicas":{"core_node1":{
"state":"active",
"core":"Web_shard5_replica1",
"node_name":"64.251.14.47:1981_solr",
"base_url":"http://64.251.14.47:1981/solr";,
"leader":"true",
"router":"compositeId"},
  "Image":{
"shards":{
  "shard1":{
"range":"8000-b332",
"state":"active",
"replicas":{"core_node1":{
"state":"active",
"core":"Image_shard1_replica1",
"node_name":"64.251.14.47:1983_solr",
"base_url":"http://64.251.14.47:1983/solr";,
"leader":"true"}}},
  "shard2":{
"range":"b333-e665",
"state":"active",
"replicas":{"core_node2":{
"state":"active",
"core":"Image_shard2_replica1",
"node_name":"64.251.14.47:1985_solr",
"base_url":"http://64.251.14.47:1985/solr";,
"leader":"true"}}},
  "shard3":{
"range":"e666-1998",
"state":"active",
"replicas":{"core_node3":{
"state":"active",
"core":"Image_shard3_replica1",
"node_name":"64.251.14.47:1984_solr",
"base_url":"http://64.251.14.47:1984/solr";,
"leader":"true"}}},
  "shard4":{
"range":"1999-4ccb",
  

Re: Local Solr and Webserver-Solr act differently ("and" treated like "or")

2013-10-16 Thread Stavros Delisavas
My local solr gives me:
http://pastebin.com/Q6d9dFmZ

and my webserver this:
http://pastebin.com/q87WEjVA

I copied only the first few hundred lines (of more than 8000) because
the webserver output was too big even for pastebin.



On 16.10.2013 12:27, Erik Hatcher wrote:
> What does the debug output from debugQuery=true say between the two?
>
>
>
> On Oct 16, 2013, at 5:16, Stavros Delisavas  wrote:
>
>> Hello Solr-Experts,
>>
>> I am currently having a strange issue with my solr querys. I am running
>> a small php/mysql-website that uses Solr for faster text-searches in
>> name-lists, movie-titles, etc. Recently I noticed that the results on my
>> local development-environment differ from those on my webserver. Both
>> use the 100% same mysql-database with identical solr-queries for
>> data-import.
>> This is a sample query:
>>
>> http://localhost:8080/solr/select/?q=title%3A%28into+AND+the+AND+wild*%29&version=2.2&start=0&rows=1000&indent=on&fl=titleid
>>
>> It is autogenerated by an php-script and 100% identical on local and on
>> my webserver. My local solr gives me the expected results: all entries
>> that have the words "into" AND "the" AND "wild*" in them.
>> But my webserver acts as if I was looking for "into" OR "the" OR
>> "wild*", eventhough the query is the same (as shown above). That's why I
>> get useless (too many) results on the webserver-side.
>>
>> I don't know what could be the issue. I have tried to check the
>> config-files but I don't really know what to look for, so it is
>> overwhelming for me to search through this big file without knowing.
>>
>> What could be the problem, where can I check/find it and how can I solve
>> that problem?
>>
>> In case, additional informations are needed, let me know please.
>>
>> Thank you!
>>
>> (Excuse my poor english, please. It's not my mother-language.)



Re: Concurent indexing

2013-10-16 Thread Erick Erickson
Run jstack on the solr process (standard with Java) and
look for the word "semaphore". You should see your
servers blocked on this in the Solr code. That'll pretty
much nail it.
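A quick way to capture and check that (a sketch; the pgrep pattern assumes
the stock start.jar launcher):

PID=$(pgrep -f start.jar)
jstack -l $PID > solr-threads.txt
grep -i semaphore solr-threads.txt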

There's an open JIRA to fix the underlying cause, see
SOLR-5232, but that's currently slated for 4.6, which
won't be cut for a while.

Also, there's a patch that will fix this as a side effect,
assuming you're using SolrJ: see SOLR-4816, which is
available in 4.5.

Best,
Erick




On Tue, Oct 15, 2013 at 1:33 PM, michael.boom  wrote:

> Here's some of the Solr's last words (log content before it stoped
> accepting
> updates), maybe someone can help me interpret that.
> http://pastebin.com/mv7fH62H
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Concurent-indexing-tp4095409p4095642.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Regarding Solr Cloud issue...

2013-10-16 Thread Chris
Oops, the actual URL is - http://64.251.14.47:1981/solr/

Also, another issue that needs to be raised: the creation of cores from
the "core admin" section of the GUI doesn't really work well; it creates
files, but then they do not work (again, I am using 4.4).


On Wed, Oct 16, 2013 at 4:12 PM, Chris  wrote:

> Hi,
>
> Please find the clusterstate.json as below:
>
> I have created a dev environment on one of my servers so that you can see
> the issue live - http://64.251.14.47:1984/solr/
>
> Also, There seems to be something wrong in zookeeper, when we try to add
> documents using solrj, it works fine as long as load of insert is not much,
> but once we start doing many inserts, then it throws a lot of errors...
>
> I am doing something like -
>
> CloudSolrServer solrCoreCloud = new CloudSolrServer(cloudURL);
> solrCoreCloud.setDefaultCollection("Image");
> UpdateResponse up = solrCoreCloud.addBean(resultItem);
> UpdateResponse upr = solrCoreCloud.commit();
>
>
>
> clusterstate.json ---
>
> {
>   "collection1":{
> "shards":{
>   "shard2":{
> "range":"b333-e665",
> "state":"active",
> "replicas":{"core_node4":{
> "state":"active",
> "core":"collection1",
> "node_name":"64.251.14.47:1984_solr",
> "base_url":"http://64.251.14.47:1984/solr";,
> "leader":"true"}}},
>   "shard3":{
> "range":"e666-1998",
> "state":"active",
> "replicas":{"core_node5":{
> "state":"active",
> "core":"collection1",
> "node_name":"64.251.14.47:1985_solr",
> "base_url":"http://64.251.14.47:1985/solr";,
> "leader":"true"}}},
>   "shard4":{
> "range":"1999-4ccb",
> "state":"active",
> "replicas":{
>   "core_node2":{
> "state":"active",
> "core":"collection1",
> "node_name":"64.251.14.47:1982_solr",
> "base_url":"http://64.251.14.47:1982/solr"},
>   "core_node6":{
> "state":"active",
> "core":"collection1",
> "node_name":"64.251.14.47:1981_solr",
> "base_url":"http://64.251.14.47:1981/solr";,
> "leader":"true"}}},
>   "shard5":{
> "range":"4ccc-7fff",
> "state":"active",
> "replicas":{"core_node3":{
> "state":"active",
> "core":"collection1",
> "node_name":"64.251.14.47:1983_solr",
> "base_url":"http://64.251.14.47:1983/solr";,
> "leader":"true",
> "router":"compositeId"},
>   "Web":{
> "shards":{
>   "shard1":{
> "range":"8000-b332",
> "state":"active",
> "replicas":{"core_node2":{
> "state":"active",
> "core":"Web_shard1_replica1",
> "node_name":"64.251.14.47:1983_solr",
> "base_url":"http://64.251.14.47:1983/solr";,
> "leader":"true"}}},
>   "shard2":{
> "range":"b333-e665",
> "state":"active",
> "replicas":{"core_node3":{
> "state":"active",
> "core":"Web_shard2_replica1",
> "node_name":"64.251.14.47:1984_solr",
> "base_url":"http://64.251.14.47:1984/solr";,
> "leader":"true"}}},
>   "shard3":{
> "range":"e666-1998",
> "state":"active",
> "replicas":{"core_node4":{
> "state":"active",
> "core":"Web_shard3_replica1",
> "node_name":"64.251.14.47:1982_solr",
> "base_url":"http://64.251.14.47:1982/solr";,
> "leader":"true"}}},
>   "shard4":{
> "range":"1999-4ccb",
> "state":"active",
> "replicas":{"core_node5":{
> "state":"active",
> "core":"Web_shard4_replica1",
> "node_name":"64.251.14.47:1985_solr",
> "base_url":"http://64.251.14.47:1985/solr";,
> "leader":"true"}}},
>   "shard5":{
> "range":null,
> "state":"active",
> "replicas":{"core_node1":{
> "state":"active",
> "core":"Web_shard5_replica1",
> "node_name":"64.251.14.47:1981_solr",
> "base_url":"http://64.251.14.47:1981/solr";,
> "leader":"true",
> "router":"compositeId"},
>   "Image":{
> "shards":{
>   "shard1":{
> "range":"8000-b332",
> "state":"active",
> "replicas":{"core_node1":{
> "state":"active",
> "core":"Image_shard1_replica1",
> "node_name":"64.251.14.47:1983_solr",
> "base_url":"http://64.251.14.47:1983/solr";,
> "leader":"true"}}},
>   "shard2":{
> "range":"b333-e665",
> "state":"active",
> "replicas":{"core_node2":{
> "state":"active",
>  

Re: Regarding Solr Cloud issue...

2013-10-16 Thread Chris
Also, is there any easy way of upgrading to 4.5 without having to change most
of my plugins & configuration files?


On Wed, Oct 16, 2013 at 4:18 PM, Chris  wrote:

> oops, the actual url is -http://64.251.14.47:1981/solr/
>
> Also, another issue that needs to be raised: the creation of cores from
> the "core admin" section of the GUI doesn't really work well; it creates
> files, but then they do not work (again, I am using 4.4)
>
>
> On Wed, Oct 16, 2013 at 4:12 PM, Chris  wrote:
>
>> Hi,
>>
>> Please find the clusterstate.json as below:
>>
>> I have created a dev environment on one of my servers so that you can see
>> the issue live - http://64.251.14.47:1984/solr/
>>
>> Also, there seems to be something wrong in ZooKeeper: when we try to add
>> documents using solrj, it works fine as long as the insert load is light,
>> but once we start doing many inserts, it throws a lot of errors...
>>
>> I am doing something like -
>>
>> CloudSolrServer solrCoreCloud = new CloudSolrServer(cloudURL);
>> solrCoreCloud.setDefaultCollection("Image");
>> UpdateResponse up = solrCoreCloud.addBean(resultItem);
>> UpdateResponse upr = solrCoreCloud.commit();
>>
>>
>>
>> clusterstate.json ---
>>
>> {
>>   "collection1":{
>> "shards":{
>>   "shard2":{
>> "range":"b333-e665",
>> "state":"active",
>> "replicas":{"core_node4":{
>> "state":"active",
>> "core":"collection1",
>> "node_name":"64.251.14.47:1984_solr",
>> "base_url":"http://64.251.14.47:1984/solr",
>> "leader":"true"}}},
>>   "shard3":{
>> "range":"e666-1998",
>> "state":"active",
>> "replicas":{"core_node5":{
>> "state":"active",
>> "core":"collection1",
>> "node_name":"64.251.14.47:1985_solr",
>> "base_url":"http://64.251.14.47:1985/solr",
>> "leader":"true"}}},
>>   "shard4":{
>> "range":"1999-4ccb",
>> "state":"active",
>> "replicas":{
>>   "core_node2":{
>> "state":"active",
>> "core":"collection1",
>> "node_name":"64.251.14.47:1982_solr",
>> "base_url":"http://64.251.14.47:1982/solr"},
>>   "core_node6":{
>> "state":"active",
>> "core":"collection1",
>> "node_name":"64.251.14.47:1981_solr",
>> "base_url":"http://64.251.14.47:1981/solr",
>> "leader":"true"}}},
>>   "shard5":{
>> "range":"4ccc-7fff",
>> "state":"active",
>> "replicas":{"core_node3":{
>> "state":"active",
>> "core":"collection1",
>> "node_name":"64.251.14.47:1983_solr",
>> "base_url":"http://64.251.14.47:1983/solr",
>> "leader":"true",
>> "router":"compositeId"},
>>   "Web":{
>> "shards":{
>>   "shard1":{
>> "range":"8000-b332",
>> "state":"active",
>> "replicas":{"core_node2":{
>> "state":"active",
>> "core":"Web_shard1_replica1",
>> "node_name":"64.251.14.47:1983_solr",
>> "base_url":"http://64.251.14.47:1983/solr",
>> "leader":"true"}}},
>>   "shard2":{
>> "range":"b333-e665",
>> "state":"active",
>> "replicas":{"core_node3":{
>> "state":"active",
>> "core":"Web_shard2_replica1",
>> "node_name":"64.251.14.47:1984_solr",
>> "base_url":"http://64.251.14.47:1984/solr",
>> "leader":"true"}}},
>>   "shard3":{
>> "range":"e666-1998",
>> "state":"active",
>> "replicas":{"core_node4":{
>> "state":"active",
>> "core":"Web_shard3_replica1",
>> "node_name":"64.251.14.47:1982_solr",
>> "base_url":"http://64.251.14.47:1982/solr",
>> "leader":"true"}}},
>>   "shard4":{
>> "range":"1999-4ccb",
>> "state":"active",
>> "replicas":{"core_node5":{
>> "state":"active",
>> "core":"Web_shard4_replica1",
>> "node_name":"64.251.14.47:1985_solr",
>> "base_url":"http://64.251.14.47:1985/solr",
>> "leader":"true"}}},
>>   "shard5":{
>> "range":null,
>> "state":"active",
>> "replicas":{"core_node1":{
>> "state":"active",
>> "core":"Web_shard5_replica1",
>> "node_name":"64.251.14.47:1981_solr",
>> "base_url":"http://64.251.14.47:1981/solr",
>> "leader":"true",
>> "router":"compositeId"},
>>   "Image":{
>> "shards":{
>>   "shard1":{
>> "range":"8000-b332",
>> "state":"active",
>> "replicas":{"core_node1":{
>> "state":"active",
>> "core":"Image_shard1_replica1",
>> 
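
One likely contributor to the errors above is the commit() after every
addBean(): each commit opens a new searcher across the cluster, which gets
expensive under many concurrent inserts. A minimal sketch of a batching
feeder (the ZooKeeper string, batch size, and fetchItems() stub are
assumptions for illustration; commits could also be left entirely to
autoCommit in solrconfig.xml):

import java.util.ArrayList;
import java.util.List;
import org.apache.solr.client.solrj.impl.CloudSolrServer;

public class ImageFeeder {
    public static void main(String[] args) throws Exception {
        // ZooKeeper address is an assumption - use your own ensemble string
        CloudSolrServer solr = new CloudSolrServer("zkhost1:2181,zkhost2:2181,zkhost3:2181");
        solr.setDefaultCollection("Image");

        List<Object> batch = new ArrayList<Object>();
        for (Object resultItem : fetchItems()) {   // stand-in for the real data source
            batch.add(resultItem);
            if (batch.size() >= 500) {             // index in batches, not bean by bean
                solr.addBeans(batch);
                batch.clear();
            }
        }
        if (!batch.isEmpty()) {
            solr.addBeans(batch);                  // flush the remainder
        }
        solr.commit();                             // one commit at the end, not per document
        solr.shutdown();
    }

    private static List<Object> fetchItems() {
        return new ArrayList<Object>();            // placeholder for DB/crawl results
    }
}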

Re: Concurrent indexing

2013-10-16 Thread Chris Geeringh
Hi Erick, here is a paste from the other thread (debugging update request)
with my input, as I am seeing errors too:

I ran an import last night, and this morning my cloud wouldn't accept
updates. I'm running the latest 4.6 snapshot. I was importing with latest
solrj snapshot, and using java bin transport with CloudSolrServer.

The cluster had indexed ~1.3 million docs before no further updates were
accepted, querying still working.

I'll run jstack shortly and provide the results.

Here is my jstack output... Lots of blocked threads.

http://pastebin.com/1ktjBYbf



On 16 October 2013 11:46, Erick Erickson  wrote:

> Run jstack on the solr process (standard with Java) and
> look for the word "semaphore". You should see your
> servers blocked on this in the Solr code. That'll pretty
> much nail it.
>
> There's an open JIRA to fix the underlying cause, see:
> SOLR-5232, but that's currently slated for 4.6 which
> won't be cut for a while.
>
> Also, there's a patch that will fix this as a side effect,
> assuming you're using SolrJ: see SOLR-4816,
> which is available in 4.5.
>
> Best,
> Erick
>
>
>
>
> On Tue, Oct 15, 2013 at 1:33 PM, michael.boom  wrote:
>
> > Here are some of Solr's last words (log content before it stopped
> > accepting updates); maybe someone can help me interpret that.
> > http://pastebin.com/mv7fH62H
> >
> >
> >
> > --
> > View this message in context:
> >
> http://lucene.472066.n3.nabble.com/Concurent-indexing-tp4095409p4095642.html
> > Sent from the Solr - User mailing list archive at Nabble.com.
> >
>


Re: Different document types in different collections OR same collection without sharing fields?

2013-10-16 Thread user 01
@Shrikanth: how do you manage the multiple redundant configurations (that is
what they are, isn't it)? I thought the indexes would be separate when fields
aren't shared. I don't need to import any data or re-index, if those are the
only benefits of separate collections. I just index when a request comes in /
a new item is added to the DB.


On Wed, Oct 16, 2013 at 4:12 PM, shrikanth k wrote:

> Hi,
>
> Please refer to the link below for clarification on fields having a null value.
>
>
> http://stackoverflow.com/questions/7332122/solr-what-are-the-default-values-for-fields-which-does-not-have-a-default-value
>
> Logically it is better to have different collections for different domain
> data. Having 2 collections will improve the overall performance.
>
> Currently I am holding 2 collections for different domain data. It eases
> importing data and re-indexing.
>
>
> regards,
> Shrikanth
>
>
>
> On Wed, Oct 16, 2013 at 3:48 PM, user 01  wrote:
>
> > Can some expert users please leave a comment on this ?
> >
> >
> > On Sun, Oct 6, 2013 at 2:54 AM, user 01  wrote:
> >
> > >  Using a single node Solr instance, I need to search for, let's say,
> > > electronics items & grocery items. But I never want to search both of them
> > > together. When I search for electronics I don't expect a grocery item ever
> > > & vice versa.
> > >
> > > Should I be defining both these document types within a single
> schema.xml
> > > or should I use different collection for each of these two(maintaining
> > > separate schema.xml & solrconfig.xml for each of two) ?
> > >
> > > I believe that if I add both to a single collection, without sharing
> > > fields among these two document types, I should be equally as good as
> > > separating them into two collections (in terms of performance & all), as their
> > > indexes/filter caches would be totally independent of each other when they
> > > don't share fields?
> > >
> > >
> > > Also posted at SO: http://stackoverflow.com/q/19202882/530153
> > >
> >
>
>
>
> --
>


Re: Regarding Solr Cloud issue...

2013-10-16 Thread primoz . skale
>>> Also, another issue that needs to be raised: the creation of cores from
>>> the "core admin" section of the GUI doesn't really work well; it creates
>>> files, but then they do not work (again, I am using 4.4)

From my experience the "core admin" section of the GUI does not work well in
the SolrCloud domain. If I am not mistaken this was somehow fixed in 4.5.0,
which behaves much better.

I would use only HTTP requests (the cores and collections APIs) with
SolrCloud, and would use the GUI only for viewing the state of the cluster and
cores.

Primoz
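
For example, a sketch of creating a collection over HTTP with the Collections
API (host, collection name, and shard/replica counts are placeholders to
adapt):

http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=2&replicationFactor=2

RELOAD and DELETE work the same way, passing action=RELOAD or action=DELETE
and the collection's name.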




Re: req info : SOLRJ and TermVector

2013-10-16 Thread Koji Sekiguchi

(13/10/16 17:47), elfu wrote:

hi,

can I access TermVector information using solrj?


There is TermVectorComponent to get termVector info:

http://wiki.apache.org/solr/TermVectorComponent

So yes, you can access it using solrj.
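
A small sketch of doing that from solrj (it assumes Solr 4.x, a /tvrh request
handler wired to TermVectorComponent in solrconfig.xml, and a
term-vector-enabled field named "text" - all assumptions):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.util.NamedList;

public class TermVectorExample {
    public static void main(String[] args) throws Exception {
        HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr/collection1");
        SolrQuery q = new SolrQuery("id:doc1");
        q.setRequestHandler("/tvrh");   // handler backed by TermVectorComponent
        q.set("tv", true);
        q.set("tv.tf", true);           // ask for term frequencies
        q.set("tv.fl", "text");
        QueryResponse rsp = solr.query(q);
        // term vectors come back as a generic NamedList, keyed by document id
        NamedList<?> termVectors = (NamedList<?>) rsp.getResponse().get("termVectors");
        System.out.println(termVectors);
    }
}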

koji
--
http://soleami.com/blog/automatically-acquiring-synonym-knowledge-from-wikipedia.html


RE: Solr 4.4 - Master/Slave configuration - Replication Issue with Commits after deleting documents using Delete by ID

2013-10-16 Thread Akkinepalli, Bharat (ELS-CON)
Hi Otis,
Did you get a chance to look into the logs.  Please let me know if you need 
more information.  Thank you.

Regards,
Bharat Akkinepalli

-Original Message-
From: Akkinepalli, Bharat (ELS-CON) [mailto:b.akkinepa...@elsevier.com] 
Sent: Friday, October 11, 2013 2:16 PM
To: solr-user@lucene.apache.org
Subject: RE: Solr 4.4 - Master/Slave configuration - Replication Issue with 
Commits after deleting documents using Delete by ID

Hi Otis,
Thanks for the response.  The log files can be found here.  

Master log: http://pastebin.com/DPLKMPcF
Slave log: http://pastebin.com/DX9sV6Jx

One more point worth mentioning here is that when we issue the commit with 
expungeDeletes=true, then the delete by id replication is successful. i.e. 
http://localhost:8983/solr/annotation/update?commit=true&expungeDeletes=true

Regards,
Bharat Akkinepalli

-Original Message-
From: Otis Gospodnetic [mailto:otis.gospodne...@gmail.com]
Sent: Wednesday, October 09, 2013 6:35 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr 4.4 - Master/Slave configuration - Replication Issue with 
Commits after deleting documents using Delete by ID

Bharat,

Can you look at the logs on the Master when you issue the delete and the 
subsequent commits and share that?

Otis
--
Solr & ElasticSearch Support -- http://sematext.com/ Performance Monitoring -- 
http://sematext.com/spm



On Tue, Oct 8, 2013 at 3:57 PM, Akkinepalli, Bharat (ELS-CON) 
 wrote:
> Hi,
> We have recently migrated from Solr 3.6 to Solr 4.4.  We are using the 
> Master/Slave configuration in Solr 4.4 (not Solr Cloud).  We have noticed the 
> following behavior/defect.
>
> Configuration:
> ===
>
> 1.   The Hard Commit and Soft Commit are disabled in the configuration 
> (we control the commits from the application)
>
> 2.   We have 1 Master and 2 Slaves configured and the pollInterval is 
> configured to 10 Minutes.
>
> 3.   The Master is configured to have the "replicateAfter" as commit & 
> startup
>
> Steps to reproduce the problem:
> ==
>
> 1.   Delete a document in Solr  (using delete by id).  URL - 
> http://localhost:8983/solr/annotation/update with body as
> <delete><id>change.me</id></delete>
>
> 2.   Issue a commit in Master 
> (http://localhost:8983/solr/annotation/update?commit=true).
>
> 3.   The replication of the DELETE WILL NOT happen.  The master and slave 
> has the same Index version.
>
> 4.   If we try to issue another commit in Master, we see that it 
> replicates fine.
>
> Request you to please confirm if this is a known issue.  Thank you.
>
> Regards,
> Bharat Akkinepalli
>


Re: Find documents that are composed of % words

2013-10-16 Thread Aloke Ghoshal
Hi Shahzad,

Personally I am of the same opinion as others who have replied, that you
are better off going back to your clients at this stage itself, with all
the new found info/data points.

Further, to the questions that you put to me directly:

1) For option 1, as indicated earlier, you have to compute the
myfieldwordcount outside of Solr & push it in as any other field to Solr.
As far as I know, there is no filter that will do this for you out of the
box.

2) For option 2, you had to take a look at:
http://wiki.apache.org/solr/SolrRelevancyFAQ#index-time_boosts
Related links:
Function Query: http://wiki.apache.org/solr/FunctionQuery#norm
Norms:
http://lucene.apache.org/core/3_5_0/api/all/org/apache/lucene/search/Similarity.html#computeNorm(java.lang.String,
org.apache.lucene.index.FieldInvertState)
Changes to schema:
http://wiki.apache.org/solr/SchemaXml#Common_field_options (omitNorms
option)

For a field with the default boost (= 1), norm = lengthNorm (approximately
1/sqrt(numTerms)). The norm is multiplied in twice in the query to divide the
score (approximately) by numTerms.
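
As a quick worked example: with the default boost, a field with 16 terms gets
lengthNorm ~ 1/sqrt(16) = 0.25, and multiplying the norm in twice contributes
0.25 * 0.25 = 1/16, i.e. roughly a division by numTerms (only roughly, because
norms are stored in a lossy single-byte encoding).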

Hope that helps.

Regards,
Aloke


On Fri, Oct 11, 2013 at 5:36 PM, shahzad73  wrote:

> Aloke Ghoshal, I'm trying to work out your equation. I am using the standard
> schema provided by Nutch for Solr and am not aware of how to calculate
> myfieldwordcount in the first query. No idea where this count will come from.
> Is there any filter that will store the number of tokens generated for a
> specific field and store it as another field? That way we can use it.
> Not sure what norm does in the second equation; I tried to find information
> about this online and did not find any yet. Please explain.
>
>
> Shahzad
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Find-documents-that-are-composed-of-words-tp4094264p4094955.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Concurrent indexing

2013-10-16 Thread Chris Geeringh
Here's another jstack http://pastebin.com/8JiQc3rb


On 16 October 2013 11:53, Chris Geeringh  wrote:

> Hi Erick, here is a paste from the other thread (debugging update request)
> with my input, as I am seeing errors too:
>
> I ran an import last night, and this morning my cloud wouldn't accept
> updates. I'm running the latest 4.6 snapshot. I was importing with latest
> solrj snapshot, and using java bin transport with CloudSolrServer.
>
> The cluster had indexed ~1.3 million docs before no further updates were
> accepted, querying still working.
>
> I'll run jstack shortly and provide the results.
>
> Here is my jstack output... Lots of blocked threads.
>
> http://pastebin.com/1ktjBYbf
>
>
>
> On 16 October 2013 11:46, Erick Erickson  wrote:
>
>> Run jstack on the solr process (standard with Java) and
>> look for the word "semaphore". You should see your
>> servers blocked on this in the Solr code. That'll pretty
>> much nail it.
>>
>> There's an open JIRA to fix the underlying cause, see:
>> SOLR-5232, but that's currently slated for 4.6 which
>> won't be cut for a while.
>>
>> Also, there's a patch that will fix this as a side effect,
>> assuming you're using SolrJ: see SOLR-4816,
>> which is available in 4.5.
>>
>> Best,
>> Erick
>>
>>
>>
>>
>> On Tue, Oct 15, 2013 at 1:33 PM, michael.boom 
>> wrote:
>>
>> > Here are some of Solr's last words (log content before it stopped
>> > accepting updates); maybe someone can help me interpret that.
>> > http://pastebin.com/mv7fH62H
>> >
>> >
>> >
>> > --
>> > View this message in context:
>> >
>> http://lucene.472066.n3.nabble.com/Concurent-indexing-tp4095409p4095642.html
>> > Sent from the Solr - User mailing list archive at Nabble.com.
>> >
>>
>
>


Re: Switching indexes

2013-10-16 Thread Christopher Gross
Shawn,

It all makes sense, I'm just dealing with production servers here so I'm
trying to be very careful (shutting down one node at a time is OK, just
don't want to do something catastrophic.)

OK, so I should use that aliasing feature.

On index1 I have:
core1
core1new
core2

On index2 and index3 I have:
core1
core2

If I do the "alias" command on index1 and have "core1" alias "core1new":
1) Will that then get rid of the existing core1 and have "core1new" data be
used for queries?
2) Will that change make core1 instances on index2 and index3 update to
have "core1new" data?

Thanks again!



-- Chris


On Tue, Oct 15, 2013 at 7:30 PM, Shawn Heisey  wrote:

> On 10/15/2013 2:17 PM, Christopher Gross wrote:
>
>> I have 3 Solr nodes (and 5 ZK nodes).
>>
>> For #1, would I have to do that on all of them?
>> For #2, I'm not getting the auto-replication between node 1 and nodes 2 &
>> 3
>> for my new index.
>>
>> I have 2 indexes -- just call them "index" and "indexbk" (bk being the
>> backup containing the full data set) up and running on one node.
>> If I were to do a swap (via the Core Admin page), would that push the
>> changes for indexbk over to the other two nodes?  Would I need to do that
>> switch on the leader, or could that be done on one of the other nodes?
>>
>
> For #1, I don't know how you want to handle your sharding and/or
> replication.  I would assume that you probably have numShards=1 and
> replicationFactor=3, but I could be wrong. At any rate, where the
> collection lives is an implementation detail that's up to you.  SolrCloud
> keeps track of all your collections, whether they are on one server or all
> servers. Typically you can send requests (queries, API calls, etc) that
> deal with entire collections to any node in your cluster and they will be
> handled correctly.  If you need to deal with a specific core, that call
> needs to go to the correct node.
>
> For #2, when you create a core and want it to be a replica of something
> that already exists, you need to give it a name that's not in use on your
> cluster, such as index2_shard1_replica3.  You also tell it what collection
> it's part of, which for my example, would probably be index2.  Then you
> tell it what shard it will contain.  That will be shard1, shard2, etc.
>  Here's an example of a CREATE call:
>
> http://server:port/solr/admin/cores?action=CREATE&name=index2_shard1_replica3&collection=index2&shard=shard1
>
> For the rest of your message: Core swapping and SolrCloud do NOT get
> along.  If you are using SolrCloud, CoreAdmin features like that need to
> disappear from your toolset. Attempting a core swap will make bad things
> (tm) happen.
>
> Collection aliasing is the way in SolrCloud that you can now do what used
> to be done with swapping.  You have collections named index1, index2,
> index3, etc ... and you keep an alias called just "index" that points to
> one of those other collections, so that you don't have to change your
> application - you just repoint the alias and all the application queries
> going to "index" will go to the correct place.
>
> I hope I haven't made things more confusing for you!
>
> Thanks,
> Shawn
>
>


Re: Regarding Solr Cloud issue...

2013-10-16 Thread Chris
oh great. Thanks Primoz.

is there any simple way to do the upgrade to 4.5 without having to change
my configurations? update a few jar files etc?


On Wed, Oct 16, 2013 at 4:58 PM,  wrote:

> >>> Also, another issue that needs to be raised is the creation of cores
> from
> >>> the "core admin" section of the gui, doesnt really work well, it
> creates
> >>> files but then they do not work (again i am using 4.4)
>
> From my experience "core admin" section of the GUI does not work well in
> SolrCloud domain. If I am not mistaken this was somehow fixed in 4.5.0
> which acts much better.
>
> I would use only HTTP requests ("cores and collections API") with
> SolrCloud and would use GUI only for viewing the state of cluster and
> cores.
>
> Primoz
>
>
>


Error when i want to create a CORE

2013-10-16 Thread raige
I installed Solr 4.5 on Windows and launch the example with the Jetty web
server. I have no problem with the collection1 core, but when I want to
create my own core, the server sends me this error:
*
org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
Could not load config file C:\Documents and
Settings\r.lucas\Bureau\Moteur\solr-4.5.0\example\solr\index1\solrconfig.xml*

Could you help, please?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Error-when-i-want-to-create-a-CORE-tp4095894.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Regarding Solr Cloud issue...

2013-10-16 Thread primoz . skale
Hm, good question. I haven't really done any upgrading yet, because I just
reinstall and reindex everything. I would replace the jars with the new ones
(if needed - check the release notes for versions 4.4.0 and 4.5.0, where all
the versions of external tools [Tika, Maven, etc.] are stated) and deploy the
updated WAR file to the servlet container.

Primoz




From:   Chris 
To: solr-user 
Date:   16.10.2013 14:30
Subject: Re: Regarding Solr Cloud issue...



oh great. Thanks Primoz.

is there any simple way to do the upgrade to 4.5 without having to change
my configurations? update a few jar files etc?


On Wed, Oct 16, 2013 at 4:58 PM,  wrote:

> >>> Also, another issue that needs to be raised is the creation of cores
> from
> >>> the "core admin" section of the gui, doesnt really work well, it
> creates
> >>> files but then they do not work (again i am using 4.4)
>
> From my experience "core admin" section of the GUI does not work well in
> SolrCloud domain. If I am not mistaken this was somehow fixed in 4.5.0
> which acts much better.
>
> I would use only HTTP requests ("cores and collections API") with
> SolrCloud and would use GUI only for viewing the state of cluster and
> cores.
>
> Primoz
>
>
>



Re: Error when i want to create a CORE

2013-10-16 Thread primoz . skale
Can you try a directory path that contains *no* spaces?

Primoz



From:   raige 
To: solr-user@lucene.apache.org
Date:   16.10.2013 14:46
Subject: Error when i want to create a CORE



I installed Solr 4.5 on Windows and launch the example with the Jetty web
server. I have no problem with the collection1 core, but when I want to
create my own core, the server sends me this error:
*
org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
Could not load config file C:\Documents and
Settings\r.lucas\Bureau\Moteur\solr-4.5.0\example\solr\index1\solrconfig.xml*

Could you help, please?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Error-when-i-want-to-create-a-CORE-tp4095894.html

Sent from the Solr - User mailing list archive at Nabble.com.



Re: Boosting a field with defType:dismax --> No results at all

2013-10-16 Thread Jack Krupansky

Get rid of the newlines before and after the value of the qf parameter.

-- Jack Krupansky

-Original Message- 
From: uwe72

Sent: Wednesday, October 16, 2013 5:36 AM
To: solr-user@lucene.apache.org
Subject: Boosting a field with defType:dismax --> No results at all

Hi there,

I want to boost a field, see below.

If I add defType:dismax I don't get results at all anymore.

What am I doing wrong?

Regards
Uwe

   
   
   true
   text
   AND


   default

   true
   true
   1

   100
   true
   true
   1


   dismax
   
  SignalImpl.baureihe^1011 text^0.1
   




   
   
   spellcheck
   
   



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Boosting-a-field-with-defType-dismax-No-results-at-all-tp4095850.html
Sent from the Solr - User mailing list archive at Nabble.com. 



Re: Error when i want to create a CORE

2013-10-16 Thread michael.boom
Assuming that you are using the Admin UI:
The instanceDir must already exist (in your case, index1).
Inside it there should be a conf/ directory holding the configuration files.
In the config field, insert only the file name (like "solrconfig.xml"), which
should be found in the conf/ directory.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Error-when-i-want-to-create-a-CORE-tp4095894p4095900.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr Copy field append values ?

2013-10-16 Thread Jack Krupansky

Appended.

-- Jack Krupansky
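
(That is: with external_id multivalued, the copied upc value is added
alongside whatever external_id values the document already carries - e.g. both
423 and 131 in the example below - rather than replacing them.)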

-Original Message- 
From: vishgupt

Sent: Wednesday, October 16, 2013 6:25 AM
To: solr-user@lucene.apache.org
Subject: Solr Copy field append values ?

Hi,
My schema is like this (external_id is a multivalued field):

<copyField source="upc" dest="external_id"/>

I want to know whether the values of upc will be appended to the existing
values of external_id or will override them.

For example, if I send a document having the values

upc:131
external_id:423

for indexing in Solr with the above-mentioned schema, what will the value of
the external_id field be: 131 only, or 131 and 423?

Thanks
Vishal





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Copy-field-append-values-tp4095862.html
Sent from the Solr - User mailing list archive at Nabble.com. 



RE: Switching indexes

2013-10-16 Thread Garth Grimm
The alias applies to the entire cloud, not a single core.

So you'd have your indexing application point to a "collection alias" named 
'index'.  And that alias would point to core1.
You'd have your query applications point to a "collection alias" named 'query', 
and that would point to core1, as well.

Then use the Collection API to create core1new across the entire cloud.  Then 
update the 'index' alias to point to core1new.  Feed documents in, run warm-up 
scripts, run smoke tests, etc., etc.
When you're ready, point the 'query' alias to core1new.

You're now running completely on core1new, and can use the Collection API to 
delete core1 from the cloud.  Or keep it around as a backup to which you can 
restore simply by changing 'query' alias.

-Original Message-
From: Christopher Gross [mailto:cogr...@gmail.com] 
Sent: Wednesday, October 16, 2013 7:05 AM
To: solr-user
Subject: Re: Switching indexes

Shawn,

It all makes sense, I'm just dealing with production servers here so I'm trying 
to be very careful (shutting down one node at a time is OK, just don't want to 
do something catastrophic.)

OK, so I should use that aliasing feature.

On index1 I have:
core1
core1new
core2

On index2 and index3 I have:
core1
core2

If I do the "alias" command on index1 and have "core1" alias "core1new":
1) Will that then get rid of the existing core1 and have "core1new" data be 
used for queries?
2) Will that change make core1 instances on index2 and index3 update to have 
"core1new" data?

Thanks again!



-- Chris


On Tue, Oct 15, 2013 at 7:30 PM, Shawn Heisey  wrote:

> On 10/15/2013 2:17 PM, Christopher Gross wrote:
>
>> I have 3 Solr nodes (and 5 ZK nodes).
>>
>> For #1, would I have to do that on all of them?
>> For #2, I'm not getting the auto-replication between node 1 and nodes 
>> 2 &
>> 3
>> for my new index.
>>
>> I have 2 indexes -- just call them "index" and "indexbk" (bk being 
>> the backup containing the full data set) up and running on one node.
>> If I were to do a swap (via the Core Admin page), would that push the 
>> changes for indexbk over to the other two nodes?  Would I need to do 
>> that switch on the leader, or could that be done on one of the other nodes?
>>
>
> For #1, I don't know how you want to handle your sharding and/or 
> replication.  I would assume that you probably have numShards=1 and 
> replicationFactor=3, but I could be wrong. At any rate, where the 
> collection lives is an implementation detail that's up to you.  
> SolrCloud keeps track of all your collections, whether they are on one 
> server or all servers. Typically you can send requests (queries, API 
> calls, etc) that deal with entire collections to any node in your 
> cluster and they will be handled correctly.  If you need to deal with 
> a specific core, that call needs to go to the correct node.
>
> For #2, when you create a core and want it to be a replica of 
> something that already exists, you need to give it a name that's not 
> in use on your cluster, such as index2_shard1_replica3.  You also tell 
> it what collection it's part of, which for my example, would probably 
> be index2.  Then you tell it what shard it will contain.  That will be 
> shard1, shard2, etc.
>  Here's an example of a CREATE call:
>
> http://server:port/solr/admin/cores?action=CREATE&name=index2_shard1_replica3&collection=index2&shard=shard1
>
> For the rest of your message: Core swapping and SolrCloud do NOT get 
> along.  If you are using SolrCloud, CoreAdmin features like that need 
> to disappear from your toolset. Attempting a core swap will make bad 
> things
> (tm) happen.
>
> Collection aliasing is the way in SolrCloud that you can now do what 
> used to be done with swapping.  You have collections named index1, 
> index2, index3, etc ... and you keep an alias called just "index" that 
> points to one of those other collections, so that you don't have to 
> change your application - you just repoint the alias and all the 
> application queries going to "index" will go to the correct place.
>
> I hope I haven't made things more confusing for you!
>
> Thanks,
> Shawn
>
>


AW: Boosting a field with defType:dismax --> No results at all

2013-10-16 Thread uwe72
Perfect!!! THANKS A LOT

 

That was the mistake.

 

From: Jack Krupansky-2 [via Lucene]
[mailto:ml-node+s472066n409590...@n3.nabble.com]
Sent: Wednesday, October 16, 2013 2:55 PM
To: uwe72
Subject: Re: Boosting a field with defType:dismax --> No results at all

 

Get rid of the newlines before and after the value of the qf parameter. 

-- Jack Krupansky 

-Original Message- 
From: uwe72 
Sent: Wednesday, October 16, 2013 5:36 AM 
To: [hidden email] 
Subject: Boosting a field with defType:dismax --> No results at all 

Hi there, 

i want to boost a field, see below. 

If i add the defType:dismax i don't get results at all anymore. 

What i am doing wrong? 

Regards 
Uwe 

 
 
true 
text 
AND 


default 

true 
true 
1 

100 
true 
true 
1 


dismax 
 
   SignalImpl.baureihe^1011 text^0.1 
 




 
 
spellcheck 
 
 



-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Boosting-a-field-with-defType-dismax-No
-results-at-all-tp4095850.html
Sent from the Solr - User mailing list archive at Nabble.com. 




 NAML 





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Boosting-a-field-with-defType-dismax-No-results-at-all-tp4095850p4095906.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Regarding Solr Cloud issue...

2013-10-16 Thread Chris
Very well, I will try the same. Maybe an auto-update tool should also be added
somewhere down the line... just a thought.


On Wed, Oct 16, 2013 at 6:20 PM,  wrote:

> Hm, good question. I haven't really done any upgrading yet, because I just
> reinstall and reindex everything. I would replace jars with the new ones
> (if needed - check release notes for version 4.4.0 and 4.5.0 where all the
> versions of external tools [tika, maven, etc.] are stated) and deploy the
> updated WAR file to servlet container.
>
> Primoz
>
>
>
>
> From:   Chris 
> To: solr-user 
> Date:   16.10.2013 14:30
Subject: Re: Regarding Solr Cloud issue...
>
>
>
> oh great. Thanks Primoz.
>
> is there any simple way to do the upgrade to 4.5 without having to change
> my configurations? update a few jar files etc?
>
>
> On Wed, Oct 16, 2013 at 4:58 PM,  wrote:
>
> > >>> Also, another issue that needs to be raised is the creation of cores
> > from
> > >>> the "core admin" section of the gui, doesnt really work well, it
> > creates
> > >>> files but then they do not work (again i am using 4.4)
> >
> > From my experience "core admin" section of the GUI does not work well in
> > SolrCloud domain. If I am not mistaken this was somehow fixed in 4.5.0
> > which acts much better.
> >
> > I would use only HTTP requests ("cores and collections API") with
> > SolrCloud and would use GUI only for viewing the state of cluster and
> > cores.
> >
> > Primoz
> >
> >
> >
>
>


Re: SolrCloud Query Balancing

2013-10-16 Thread michael.boom
Thanks!

Could you provide some examples or details of the configuration you use?
I think this solution would suit me too.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrCloud-Query-Balancing-tp4095854p4095910.html
Sent from the Solr - User mailing list archive at Nabble.com.


How to retrieve the query for a boolean keyword?

2013-10-16 Thread Silvia Suárez
Dear all,

I am using solrj as the client for indexing and searching documents on the Solr
server.

My question:

How to retrieve the query for a boolean keyword?

For example:

I have this query:

text:(“vacuna” AND “esteve news”) OR text:(“vacuna”) OR text:(“esteve news”)

And searching in:

text--> Esteve news: Obtener una vacuna para frenar el...

Solr returns:

Esteve news: obtener una vacuna para frenar el ...

It is ok.

My question is:

Can I know from Solr whether a result like "Esteve news ... vacuna" was
matched by the clause of the query with the AND operator?

Is it possible to retrieve this with solrj?

Thanks a lot in advance,

Sil,
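
One way to get at this from solrj (a sketch only - the server URL and core are
assumptions): ask for debugQuery=true and read back the per-document score
explanations, which list the clauses that actually matched:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class ExplainExample {
    public static void main(String[] args) throws Exception {
        HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr/collection1");
        SolrQuery q = new SolrQuery(
            "text:(\"vacuna\" AND \"esteve news\") OR text:(\"vacuna\") OR text:(\"esteve news\")");
        q.set("debugQuery", "true");   // ask Solr for per-document score explanations
        QueryResponse rsp = solr.query(q);
        // keyed by unique id; the explanation text shows which clauses contributed
        for (java.util.Map.Entry<String, String> e : rsp.getExplainMap().entrySet()) {
            System.out.println(e.getKey() + " => " + e.getValue());
        }
    }
}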





Re: SolrCloud Query Balancing

2013-10-16 Thread Shawn Heisey
On 10/16/2013 3:52 AM, michael.boom wrote:
> I have setup a SolrCloud system with: 3 shards, replicationFactor=3 on 3
> machines along with 3 Zookeeper instances.
> 
> My web application makes queries to Solr specifying the hostname of one of
> the machines. So that machine will always get the request and the other ones
> will just serve as an aid.
> So I would like to setup a load balancer that would fix that, balancing the
> queries to all machines. 
> Maybe doing the same while indexing.

SolrCloud actually handles load balancing for you.  You'll find that
when you send requests to one server, they are actually being
re-directed across the entire cloud, unless you include a
"distrib=false" parameter on the request, but that would also limit the
search to one shard, which is probably not what you want.

The only thing that you don't get with a non-Java client is redundancy.
 If you can't build in failover capability yourself, which is a very
advanced programming technique, then you need a load balancer.

For my large non-Cloud Solr install, I use haproxy as a load balancer.
Most of the time, it doesn't actually balance the load, just makes sure
that Solr is always reachable even if part of it goes down.  The haproxy
program is simple and easy to use, but performs extremely well.  I've
got a pacemaker cluster making sure that the shared IP address, haproxy,
and other homegrown utility applications related to Solr are only
running on one machine.
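
For reference, a minimal sketch of such an haproxy setup (the addresses, port,
and timeouts below are placeholder assumptions, not the actual configuration
described above):

defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

frontend solr_in
    bind *:8983
    default_backend solr_nodes

backend solr_nodes
    balance roundrobin
    # take a node out of rotation if its ping handler stops responding
    option httpchk GET /solr/admin/ping
    server solr1 10.0.0.1:8983 check
    server solr2 10.0.0.2:8983 check
    server solr3 10.0.0.3:8983 check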

Thanks,
Shawn



Re: SolrCloud Query Balancing

2013-10-16 Thread Henrik Ossipoff Hansen
I did not actually realize this, I apologize for my previous reply!

Haproxy would definitely be the right choice then for the poster's setup for
redundancy.

On 16/10/2013 at 15:53, Shawn Heisey wrote:

> On 10/16/2013 3:52 AM, michael.boom wrote:
>> I have setup a SolrCloud system with: 3 shards, replicationFactor=3 on 3
>> machines along with 3 Zookeeper instances.
>> 
>> My web application makes queries to Solr specifying the hostname of one of
>> the machines. So that machine will always get the request and the other ones
>> will just serve as an aid.
>> So I would like to setup a load balancer that would fix that, balancing the
>> queries to all machines. 
>> Maybe doing the same while indexing.
> 
> SolrCloud actually handles load balancing for you.  You'll find that
> when you send requests to one server, they are actually being
> re-directed across the entire cloud, unless you include a
> "distrib=false" parameter on the request, but that would also limit the
> search to one shard, which is probably not what you want.
> 
> The only thing that you don't get with a non-Java client is redundancy.
> If you can't build in failover capability yourself, which is a very
> advanced programming technique, then you need a load balancer.
> 
> For my large non-Cloud Solr install, I use haproxy as a load balancer.
> Most of the time, it doesn't actually balance the load, just makes sure
> that Solr is always reachable even if part of it goes down.  The haproxy
> program is simple and easy to use, but performs extremely well.  I've
> got a pacemaker cluster making sure that the shared IP address, haproxy,
> and other homegrown utility applications related to Solr are only
> running on one machine.
> 
> Thanks,
> Shawn
> 



howto increase indexing speed?

2013-10-16 Thread Giovanni Bricconi
I have a small Solr setup, not even on a physical machine but a VMware
virtual machine with a single CPU, that reads data using DIH from a
database. The machine has no physical disks attached but stores data on a
NetApp NAS.

Currently this machine indexes 320 documents/sec - not bad, but we plan to
double the index and we would like to keep nearly the same rate.

Doing some basic checks during the indexing I have found with iostat that
disk usage is nearly 8% and the source database is running
fine; the virtual CPU, however, is 95% busy running Solr.

Now I can quite easily add another virtual CPU to the Solr box, but as far
as I know this won't help, because DIH doesn't work in parallel. Am I wrong?

What would you do? Rewrite the feeding process, dropping DIH and using solrj
to feed data in parallel? Or would you keep DIH and switch to a
sharded configuration?

Thank you for any hints

Giovanni


Re: howto increase indexing speed?

2013-10-16 Thread primoz . skale
I think DIH uses only one CPU core per instance. IMHO 300 docs/sec is quite
good. If you would like to use more CPU cores you need to use solrj, or maybe
more than one DIH instance and more Solr cores, of course.

Primoz
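
If you do go the solrj route, a rough sketch of a parallel feeder (the URL,
queue size, thread count, and the document-building loop are all assumptions
for illustration): ConcurrentUpdateSolrServer keeps several writer threads
busy against a single Solr instance:

import java.util.UUID;
import org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class ParallelFeeder {
    public static void main(String[] args) throws Exception {
        // buffers up to 10000 docs and drains them with 4 writer threads
        ConcurrentUpdateSolrServer solr =
            new ConcurrentUpdateSolrServer("http://localhost:8983/solr/collection1", 10000, 4);
        for (int i = 0; i < 1000000; i++) {   // stand-in for reading rows from the database
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", UUID.randomUUID().toString());
            doc.addField("text", "row " + i);
            solr.add(doc);                    // returns quickly; writer threads do the work
        }
        solr.blockUntilFinished();            // wait for the queue to drain
        solr.commit();
        solr.shutdown();
    }
}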



From:   Giovanni Bricconi 
To: solr-user 
Date:   16.10.2013 16:25
Subject: howto increase indexing speed?



I have a small solr setup, not even on a physical machine but a vmware
virtual machine with a single cpu that reads data using DIH from a
database. The machine has no phisical disks attached but stores data on a
netapp nas.

Currently this machine indexes 320 documents/sec, not bad but we plan to
double the index and we would like to keep nearly the same.

Doing some basic checks during the indexing I have found with iostat that
the usage of the disks is nearly 8% and the source database is running
fine, instead the  virtual cpu is 95% running on solr.

Now I can quite easily add another virtual cpu to the solr box, but as far
as I know this won't help because DIH doesn't work in parallel. Am I 
wrong?

What would you do? Rewrite the feeding process quitting dih and using 
solrj
to feed data in parallel? Would you instead keep DIH and switch to a
sharded configuration?

Thank you for any hints

Giovanni



Re: howto increase indexing speed?

2013-10-16 Thread Walter Underwood
You might consider local disks. I once ran Solr with the indexes on an 
NFS-mounted volume and the slowdown was severe.

wunder

On Oct 16, 2013, at 7:40 AM, primoz.sk...@policija.si wrote:

> I think DIH uses only one core per instance. IMHO 300 doc/sec is quite 
> good. If you would like to use more cores you need to use solrj. Or maybe 
> more than one DIH and more cores of course.
> 
> Primoz
> 
> 
> 
> From:   Giovanni Bricconi 
> To: solr-user 
> Date:   16.10.2013 16:25
> Subject: howto increase indexing speed?
> 
> 
> 
> I have a small solr setup, not even on a physical machine but a vmware
> virtual machine with a single cpu that reads data using DIH from a
> database. The machine has no phisical disks attached but stores data on a
> netapp nas.
> 
> Currently this machine indexes 320 documents/sec, not bad but we plan to
> double the index and we would like to keep nearly the same.
> 
> Doing some basic checks during the indexing I have found with iostat that
> the usage of the disks is nearly 8% and the source database is running
> fine, instead the  virtual cpu is 95% running on solr.
> 
> Now I can quite easily add another virtual cpu to the solr box, but as far
> as I know this won't help because DIH doesn't work in parallel. Am I 
> wrong?
> 
> What would you do? Rewrite the feeding process quitting dih and using 
> solrj
> to feed data in parallel? Would you instead keep DIH and switch to a
> sharded configuration?
> 
> Thank you for any hints
> 
> Giovanni
> 

--
Walter Underwood
wun...@wunderwood.org





Re: prepareCommit vs Commit

2013-10-16 Thread Phani Chaitanya
Thanks, Shalin. Will post it there too.



-
Phani Chaitanya
--
View this message in context: 
http://lucene.472066.n3.nabble.com/prepareCommit-vs-Commit-tp4095545p4095916.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr 4.4 - Master/Slave configuration - Replication Issue with Commits after deleting documents using Delete by ID

2013-10-16 Thread Shalin Shekhar Mangar
The only delete I see in the master logs is:

INFO  - 2013-10-11 14:06:54.793;
org.apache.solr.update.processor.LogUpdateProcessor; [annotation]
webapp=/solr path=/update params={}
{delete=[change.me(-1448623278425899008)]} 0 60

When you commit, we have the following:

INFO  - 2013-10-11 14:07:03.809;
org.apache.solr.update.DirectUpdateHandler2; start
commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
INFO  - 2013-10-11 14:07:03.813;
org.apache.solr.update.DirectUpdateHandler2; No uncommitted changes.
Skipping IW.commit.

That suggests that the id you are trying to delete never existed in the
first place, and hence there was nothing to commit, so replication was not
triggered. Am I missing something?


On Wed, Oct 16, 2013 at 5:06 PM, Akkinepalli, Bharat (ELS-CON) <
b.akkinepa...@elsevier.com> wrote:

> Hi Otis,
> Did you get a chance to look into the logs.  Please let me know if you
> need more information.  Thank you.
>
> Regards,
> Bharat Akkinepalli
>
> -Original Message-
> From: Akkinepalli, Bharat (ELS-CON) [mailto:b.akkinepa...@elsevier.com]
> Sent: Friday, October 11, 2013 2:16 PM
> To: solr-user@lucene.apache.org
> Subject: RE: Solr 4.4 - Master/Slave configuration - Replication Issue
> with Commits after deleting documents using Delete by ID
>
> Hi Otis,
> Thanks for the response.  The log files can be found here.
>
> MasterLog : http://pastebin.com/DPLKMPcF Slave Log:
> http://pastebin.com/DX9sV6Jx
>
> One more point worth mentioning here is that when we issue the commit with
> expungeDeletes=true, then the delete by id replication is successful. i.e.
> http://localhost:8983/solr/annotation/update?commit=true&expungeDeletes=true
>
> Regards,
> Bharat Akkinepalli
>
> -Original Message-
> From: Otis Gospodnetic [mailto:otis.gospodne...@gmail.com]
> Sent: Wednesday, October 09, 2013 6:35 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr 4.4 - Master/Slave configuration - Replication Issue
> with Commits after deleting documents using Delete by ID
>
> Bharat,
>
> Can you look at the logs on the Master when you issue the delete and the
> subsequent commits and share that?
>
> Otis
> --
> Solr & ElasticSearch Support -- http://sematext.com/ Performance
> Monitoring -- http://sematext.com/spm
>
>
>
> On Tue, Oct 8, 2013 at 3:57 PM, Akkinepalli, Bharat (ELS-CON) <
> b.akkinepa...@elsevier.com> wrote:
> > Hi,
> > We have recently migrated from Solr 3.6 to Solr 4.4.  We are using the
> Master/Slave configuration in Solr 4.4 (not Solr Cloud).  We have noticed
> the following behavior/defect.
> >
> > Configuration:
> > ===
> >
> > 1.   The Hard Commit and Soft Commit are disabled in the
> configuration (we control the commits from the application)
> >
> > 2.   We have 1 Master and 2 Slaves configured and the pollInterval
> is configured to 10 Minutes.
> >
> > 3.   The Master is configured to have the "replicateAfter" as commit
> & startup
> >
> > Steps to reproduce the problem:
> > ==
> >
> > 1.   Delete a document in Solr  (using delete by id).  URL -
> http://localhost:8983/solr/annotation/update with body as
> <delete><id>change.me</id></delete>
> >
> > 2.   Issue a commit in Master (
> http://localhost:8983/solr/annotation/update?commit=true).
> >
> > 3.   The replication of the DELETE WILL NOT happen.  The master and
> slave has the same Index version.
> >
> > 4.   If we try to issue another commit in Master, we see that it
> replicates fine.
> >
> > Request you to please confirm if this is a known issue.  Thank you.
> >
> > Regards,
> > Bharat Akkinepalli
> >
>



-- 
Regards,
Shalin Shekhar Mangar.


Re: AW: Boosting a field with defType:dismax --> No results at all

2013-10-16 Thread uwe72
We have just one more problem:

When we search explicitly, like *:* or partNumber:A32783627, we still don't get
any results.

What are we doing wrong here?




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Boosting-a-field-with-defType-dismax-No-results-at-all-tp4095850p4095918.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Switching indexes

2013-10-16 Thread Christopher Gross
Garth,

I think I get what you're saying, but I want to make sure.

I have 3 servers (index1, index2, index3), with Solr living on port 8080.

Each of those has 3 cores loaded with data:
core1 (old version)
core1new (new version)
core2 (unrelated to core1)

If I wanted to make it so that queries to core1 are really going to
core1new, I'd run:
http://index1:8080/solr/admin/cores?action=CREATEALIAS&name=core1&collections=core1new&shard=shard1

Correct?

-- Chris


On Wed, Oct 16, 2013 at 9:02 AM, Garth Grimm <
garthgr...@averyranchconsulting.com> wrote:

> The alias applies to the entire cloud, not a single core.
>
> So you'd have your indexing application point to a "collection alias"
> named 'index'.  And that alias would point to core1.
> You'd have your query applications point to a "collection alias" named
> 'query', and that would point to core1, as well.
>
> Then use the Collection API to create core1new across the entire cloud.
>  Then update the 'index' alias to point to core1new.  Feed documents in,
> run warm-up scripts, run smoke tests, etc., etc.
> When you're ready, point the 'query' alias to core1new.
>
> You're now running completely on core1new, and can use the Collection API
> to delete core1 from the cloud.  Or keep it around as a backup to which you
> can restore simply by changing 'query' alias.
>
> -Original Message-
> From: Christopher Gross [mailto:cogr...@gmail.com]
> Sent: Wednesday, October 16, 2013 7:05 AM
> To: solr-user
> Subject: Re: Switching indexes
>
> Shawn,
>
> It all makes sense, I'm just dealing with production servers here so I'm
> trying to be very careful (shutting down one node at a time is OK, just
> don't want to do something catastrophic.)
>
> OK, so I should use that aliasing feature.
>
> On index1 I have:
> core1
> core1new
> core2
>
> On index2 and index3 I have:
> core1
> core2
>
> If I do the "alias" command on index1 and have "core1" alias "core1new":
> 1) Will that then get rid of the existing core1 and have "core1new" data
> be used for queries?
> 2) Will that change make core1 instances on index2 and index3 update to
> have "core1new" data?
>
> Thanks again!
>
>
>
> -- Chris
>
>
> On Tue, Oct 15, 2013 at 7:30 PM, Shawn Heisey  wrote:
>
> > On 10/15/2013 2:17 PM, Christopher Gross wrote:
> >
> >> I have 3 Solr nodes (and 5 ZK nodes).
> >>
> >> For #1, would I have to do that on all of them?
> >> For #2, I'm not getting the auto-replication between node 1 and nodes
> >> 2 &
> >> 3
> >> for my new index.
> >>
> >> I have 2 indexes -- just call them "index" and "indexbk" (bk being
> >> the backup containing the full data set) up and running on one node.
> >> If I were to do a swap (via the Core Admin page), would that push the
> >> changes for indexbk over to the other two nodes?  Would I need to do
> >> that switch on the leader, or could that be done on one of the other
> nodes?
> >>
> >
> > For #1, I don't know how you want to handle your sharding and/or
> > replication.  I would assume that you probably have numShards=1 and
> > replicationFactor=3, but I could be wrong. At any rate, where the
> > collection lives is an implementation detail that's up to you.
> > SolrCloud keeps track of all your collections, whether they are on one
> > server or all servers. Typically you can send requests (queries, API
> > calls, etc) that deal with entire collections to any node in your
> > cluster and they will be handled correctly.  If you need to deal with
> > a specific core, that call needs to go to the correct node.
> >
> > For #2, when you create a core and want it to be a replica of
> > something that already exists, you need to give it a name that's not
> > in use on your cluster, such as index2_shard1_replica3.  You also tell
> > it what collection it's part of, which for my example, would probably
> > be index2.  Then you tell it what shard it will contain.  That will be
> shard1, shard2, etc.
> >  Here's an example of a CREATE call:
> >
> > http://server:port/solr/admin/cores?action=CREATE&name=index2_shard1_replica3&collection=index2&shard=shard1
> >
> > For the rest of your message: Core swapping and SolrCloud do NOT get
> > along.  If you are using SolrCloud, CoreAdmin features like that need
> > to disappear from your toolset. Attempting a core swap will make bad
> > things
> > (tm) happen.
> >
> > Collection aliasing is the way in SolrCloud that you can now do what
> > used to be done with swapping.  You have collections named index1,
> > index2, index3, etc ... and you keep an alias called just "index" that
> > points to one of those other collections, so that you don't have to
> > change your application - you just repoint the alias and all the
> > application queries going to "index" will go to the correct place.
> >
> > I hope I haven't made things more confusing for you!
> >
> > Thanks,
> > Shawn
> >
> >
>


RE: Switching indexes

2013-10-16 Thread Garth Grimm
I'd suggest using the Collections API:
http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=alias&collections=collection1,collection2...

See the Collections Aliases section of http://wiki.apache.org/solr/SolrCloud.

BTW, once you make the aliases, Zookeeper will have entries in /aliases.json 
that will tell you what aliases are defined and what they point to.

-Original Message-
From: Christopher Gross [mailto:cogr...@gmail.com] 
Sent: Wednesday, October 16, 2013 10:44 AM
To: solr-user
Subject: Re: Switching indexes

Garth,

I think I get what you're saying, but I want to make sure.

I have 3 servers (index1, index2, index3), with Solr living on port 8080.

Each of those has 3 cores loaded with data:
core1 (old version)
core1new (new version)
core2 (unrelated to core1)

If I wanted to make it so that queries to core1 are really going to core1new, 
I'd run:
http://index1:8080/solr/admin/cores?action=CREATEALIAS&name=core1&collections=core1new&shard=shard1

Correct?

-- Chris


On Wed, Oct 16, 2013 at 9:02 AM, Garth Grimm < 
garthgr...@averyranchconsulting.com> wrote:

> The alias applies to the entire cloud, not a single core.
>
> So you'd have your indexing application point to a "collection alias"
> named 'index'.  And that alias would point to core1.
> You'd have your query applications point to a "collection alias" named 
> 'query', and that would point to core1, as well.
>
> Then use the Collection API to create core1new across the entire cloud.
>  Then update the 'index' alias to point to core1new.  Feed documents 
> in, run warm-up scripts, run smoke tests, etc., etc.
> When you're ready, point the 'query' alias to core1new.
>
> You're now running completely on core1new, and can use the Collection 
> API to delete core1 from the cloud.  Or keep it around as a backup to 
> which you can restore simply by changing 'query' alias.
>
> -Original Message-
> From: Christopher Gross [mailto:cogr...@gmail.com]
> Sent: Wednesday, October 16, 2013 7:05 AM
> To: solr-user
> Subject: Re: Switching indexes
>
> Shawn,
>
> It all makes sense, I'm just dealing with production servers here so 
> I'm trying to be very careful (shutting down one node at a time is OK, 
> just don't want to do something catastrophic.)
>
> OK, so I should use that aliasing feature.
>
> On index1 I have:
> core1
> core1new
> core2
>
> On index2 and index3 I have:
> core1
> core2
>
> If I do the "alias" command on index1 and have "core1" alias "core1new":
> 1) Will that then get rid of the existing core1 and have "core1new" 
> data be used for queries?
> 2) Will that change make core1 instances on index2 and index3 update 
> to have "core1new" data?
>
> Thanks again!
>
>
>
> -- Chris
>
>
> On Tue, Oct 15, 2013 at 7:30 PM, Shawn Heisey  wrote:
>
> > On 10/15/2013 2:17 PM, Christopher Gross wrote:
> >
> >> I have 3 Solr nodes (and 5 ZK nodes).
> >>
> >> For #1, would I have to do that on all of them?
> >> For #2, I'm not getting the auto-replication between node 1 and 
> >> nodes
> >> 2 &
> >> 3
> >> for my new index.
> >>
> >> I have 2 indexes -- just call them "index" and "indexbk" (bk being 
> >> the backup containing the full data set) up and running on one node.
> >> If I were to do a swap (via the Core Admin page), would that push 
> >> the changes for indexbk over to the other two nodes?  Would I need 
> >> to do that switch on the leader, or could that be done on one of 
> >> the other
> nodes?
> >>
> >
> > For #1, I don't know how you want to handle your sharding and/or 
> > replication.  I would assume that you probably have numShards=1 and 
> > replicationFactor=3, but I could be wrong. At any rate, where the 
> > collection lives is an implementation detail that's up to you.
> > SolrCloud keeps track of all your collections, whether they are on 
> > one server or all servers. Typically you can send requests (queries, 
> > API calls, etc) that deal with entire collections to any node in 
> > your cluster and they will be handled correctly.  If you need to 
> > deal with a specific core, that call needs to go to the correct node.
> >
> > For #2, when you create a core and want it to be a replica of 
> > something that already exists, you need to give it a name that's not 
> > in use on your cluster, such as index2_shard1_replica3.  You also 
> > tell it what collection it's part of, which for my example, would 
> > probably be index2.  Then you tell it what shard it will contain.  
> > That will be
> shard1, shard2, etc.
> >  Here's an example of a CREATE call:
> >
> > http://server:port/solr/admin/cores?action=CREATE&name=index2_shard1_replica3&collection=index2&shard=shard1
> >
> > For the rest of your message: Core swapping and SolrCloud do NOT get 
> > along.  If you are using SolrCloud, CoreAdmin features like that 
> > need to disappear from your toolset. Attempting a core swap will 
> > make bad things
> > (tm) happen.
> >
> > Collection aliasing is the way in SolrCloud that you can 

RE: Solr 4.4 - Master/Slave configuration - Replication Issue with Commits after deleting documents using Delete by ID

2013-10-16 Thread Akkinepalli, Bharat (ELS-CON)
Hi Shalin,
I am not sure why the "No uncommitted changes" message appears in the log.  The
data is available in Solr at the time I perform the delete.

please find the below steps I have performed:
> Inserted a document in master (with id= change.me.1)
> issued a commit on master
> Triggered replication on slave
> Ensured that the document is replicated successfully.
> Issued a delete by ID.
> Issued a commit on master
> Replication did NOT happen.

The logs are as follows:
Master - http://pastebin.com/265CtCEp 
Slave - http://pastebin.com/Qx0xLwmK 

Regards,
Bharat Akkinepalli.

-Original Message-
From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com] 
Sent: Wednesday, October 16, 2013 11:28 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr 4.4 - Master/Slave configuration - Replication Issue with 
Commits after deleting documents using Delete by ID

The only delete I see in the master logs is:

INFO  - 2013-10-11 14:06:54.793;
org.apache.solr.update.processor.LogUpdateProcessor; [annotation] webapp=/solr 
path=/update params={} {delete=[change.me(-1448623278425899008)]} 0 60

When you commit, we have the following:

INFO  - 2013-10-11 14:07:03.809;
org.apache.solr.update.DirectUpdateHandler2; start 
commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
INFO  - 2013-10-11 14:07:03.813;
org.apache.solr.update.DirectUpdateHandler2; No uncommitted changes.
Skipping IW.commit.

That suggests that the id you are trying to delete never existed in the first 
place and hence there was nothing to commit. Hence replication was not 
triggered. Am I missing something?


On Wed, Oct 16, 2013 at 5:06 PM, Akkinepalli, Bharat (ELS-CON) < 
b.akkinepa...@elsevier.com> wrote:

> Hi Otis,
> Did you get a chance to look into the logs.  Please let me know if you 
> need more information.  Thank you.
>
> Regards,
> Bharat Akkinepalli
>
> -Original Message-
> From: Akkinepalli, Bharat (ELS-CON) 
> [mailto:b.akkinepa...@elsevier.com]
> Sent: Friday, October 11, 2013 2:16 PM
> To: solr-user@lucene.apache.org
> Subject: RE: Solr 4.4 - Master/Slave configuration - Replication Issue 
> with Commits after deleting documents using Delete by ID
>
> Hi Otis,
> Thanks for the response.  The log files can be found here.
>
> MasterLog : http://pastebin.com/DPLKMPcF Slave Log:
> http://pastebin.com/DX9sV6Jx
>
> One more point worth mentioning here is that when we issue the commit 
> with expungeDeletes=true, then the delete by id replication is successful. 
> i.e.
> http://localhost:8983/solr/annotation/update?commit=true&expungeDelete
> s=true
>
> Regards,
> Bharat Akkinepalli
>
> -Original Message-
> From: Otis Gospodnetic [mailto:otis.gospodne...@gmail.com]
> Sent: Wednesday, October 09, 2013 6:35 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr 4.4 - Master/Slave configuration - Replication Issue 
> with Commits after deleting documents using Delete by ID
>
> Bharat,
>
> Can you look at the logs on the Master when you issue the delete and 
> the subsequent commits and share that?
>
> Otis
> --
> Solr & ElasticSearch Support -- http://sematext.com/ Performance 
> Monitoring -- http://sematext.com/spm
>
>
>
> On Tue, Oct 8, 2013 at 3:57 PM, Akkinepalli, Bharat (ELS-CON) < 
> b.akkinepa...@elsevier.com> wrote:
> > Hi,
> > We have recently migrated from Solr 3.6 to Solr 4.4.  We are using 
> > the
> Master/Slave configuration in Solr 4.4 (not Solr Cloud).  We have 
> noticed the following behavior/defect.
> >
> > Configuration:
> > ===
> >
> > 1.   The Hard Commit and Soft Commit are disabled in the
> configuration (we control the commits from the application)
> >
> > 2.   We have 1 Master and 2 Slaves configured and the pollInterval
> is configured to 10 Minutes.
> >
> > 3.   The Master is configured to have the "replicateAfter" as commit
> & startup
> >
> > Steps to reproduce the problem:
> > ==
> >
> > 1.   Delete a document in Solr  (using delete by id).  URL -
> http://localhost:8983/solr/annotation/update with body as
>  <delete><id>change.me</id></delete>
> >
> > 2.   Issue a commit in Master (
> http://localhost:8983/solr/annotation/update?commit=true).
> >
> > 3.   The replication of the DELETE WILL NOT happen.  The master and
> slave has the same Index version.
> >
> > 4.   If we try to issue another commit in Master, we see that it
> replicates fine.
> >
> > Request you to please confirm if this is a known issue.  Thank you.
> >
> > Regards,
> > Bharat Akkinepalli
> >
>



--
Regards,
Shalin Shekhar Mangar.


Re: Regarding Solr Cloud issue...

2013-10-16 Thread Shawn Heisey
On 10/16/2013 4:51 AM, Chris wrote:
> Also, is there any easy way of upgrading to 4.5 without having to change most
> of my plugins & configuration files?

Upgrading is something that should be done carefully.  If you can, it's
always recommended that you try it out on dev hardware with your real
index data beforehand, so you can deal with any problems that arise
without causing problems for your production cluster.  Upgrading
SolrCloud is particularly tricky, because for a while you will be
running different versions on different machines in your cluster.

If you're using your own custom software to go with Solr, or you're
using third-party plugins that aren't included in the Solr download,
upgrading might take more effort than usual.  Also, if you are doing
anything in your config/schema that changes the format of the Lucene
index, you may find that it can't be upgraded without completely
rebuilding the index.  Examples of this are changing the postings format
or docValues format.  This is a very nasty complication with SolrCloud,
because those configurations affect the entire cluster.  In that case,
the whole index may need to be rebuilt without custom formats before
upgrading is attempted.

If you don't have any of the complications mentioned in the preceding
paragraph, upgrading is usually a very simple process:

*) Shut down Solr.
*) Delete the extracted WAR file directory.
*) Replace solr.war with the new war from dist/ in the download.
**) Usually it must actually be named solr.war, which means renaming it.
*) Delete and replace other jars copied from the download.
*) Change luceneMatchVersion in all solrconfig.xml files. **
*) Start Solr back up.

** With SolrCloud, you can't actually change the luceneMatchVersion
until all of your servers have been upgraded.
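
On the stock Solr 4.x example layout, those steps might look roughly like
this (the paths here are assumptions; adapt them to your own install):

  cd /opt/solr/example            # assumed install location
  # stop Solr however you normally run it
  rm -rf solr-webapp/webapp       # the extracted WAR directory
  cp ~/solr-4.5.0/dist/solr-4.5.0.war webapps/solr.war   # note the rename
  # replace the other jars you copied from the download, edit
  # luceneMatchVersion in each solrconfig.xml, then start Solr back up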

A full reindex is strongly recommended.  With SolrCloud, it normally
needs to wait until all servers are upgraded.  In situations where it
won't work at all without a reindex, upgrading SolrCloud can be very
challenging.

It's strongly recommended that you look over CHANGES.txt and compare the
new example config/schema with the example from the old version, to see
if there are any changes that you might want to incorporate into your
own config.  As with luceneMatchVersion, if you're running SolrCloud,
those changes might need to wait until you're fully upgraded.

Side note: When upgrading to a new minor version, config changes aren't
normally required.  They will usually be required when upgrading major
versions, such as 3.x to 4.x.

If you *do* have custom plugins that aren't included in the Solr
download, you may have to recompile them for the new version, or wait
for the vendor to create a new version before you upgrade.

This is only the tip of the iceberg, but a lot of the rest of it depends
greatly on your configurations.

Thanks,
Shawn



AW: Boosting a field with defType:dismax --> No results at all

2013-10-16 Thread uwe72
We have just one more problem:

When we search explicitly, like *:* or partNumber:A32783627, we still don't
get any results.

What are we doing wrong here?






Re: SolrCloud Query Balancing

2013-10-16 Thread Shawn Heisey
On 10/16/2013 8:01 AM, Henrik Ossipoff Hansen wrote:
> I did not actually realize this, I apologize for my previous reply!
> 
> Haproxy would definitely be the right choice then for the poster's setup for
> redundancy.

Any load balancer software, or even an appliance load balancer like
those made by F5, would probably work.  I don't think there's anything
wrong with nginx.  I've never used it, but I've heard it mentioned often
in a load balancer context, so it's probably great software.  The
original poster should use whatever they are comfortable with, and if
they have no experience with any particular solution, they can ask
advice from people who have used one or more of the possibilities.

Never be afraid to offer advice.  I've been wrong plenty of times in
what I've posted on this list, and I've learned a TON because of it.

Thanks,
Shawn



Re: Switching indexes

2013-10-16 Thread Shawn Heisey
On 10/16/2013 9:44 AM, Christopher Gross wrote:
> Garth,
> 
> I think I get what you're saying, but I want to make sure.
> 
> I have 3 servers (index1, index2, index3), with Solr living on port 8080.
> 
> Each of those has 3 cores loaded with data:
> core1 (old version)
> core1new (new version)
> core2 (unrelated to core1)
> 
> If I wanted to make it so that queries to core1 are really going to
> core1new, I'd run:
> http://index1:8080/solr/admin/cores?action=CREATEALIAS&name=core1&collections=core1new&shard=shard1

Alias is a *Collections* API concept, not a CoreAdmin API concept.

One question is this:  Do you have a *collection* named core1, or just a
*core* named core1?  I'm pretty sure that it's possible on a SolrCloud
system to have cores that are not participating in the cloud infrastructure.

Collections are made up of shards.  Shards have replicas.  Each replica
is a core.

I'd like to see whether you have configurations loaded into zookeeper.
In the admin UI, click on Cloud, then Tree.  Click the arrow to the left
of "/configs" to open it.  If you see folders underneath /configs, then
you do have at least one configuration in zookeeper, and you will have
the name(s) they are using.

You can also click the arrow next to /collections and see whether you
have any collections.

The Cloud->Graph page shows you a visual representation of your cloud.

Let us know what you find.  If you have anything there, I can give you
some API URL calls that will hopefully fully illustrate what I'm saying.

Thanks,
Shawn



Re: AW: Boosting a field with defType:dismax --> No results at all

2013-10-16 Thread Jack Krupansky

dismax doesn't support wildcard, fuzzy, or fielded terms. edismax does.
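
For example, switching the request to edismax should let the fielded term
through (a sketch; host, core, and field name are taken from the thread):

http://localhost:8983/solr/select?defType=edismax&q=partNumber:A32783627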

My e-book details differences between the query parsers.

-- Jack Krupansky

-Original Message- 
From: uwe72

Sent: Wednesday, October 16, 2013 12:26 PM
To: solr-user@lucene.apache.org
Subject: AW: Boosting a field with defType:dismax --> No results at all

We have just one more problem:

When we search explicitly, like *:* or partNumber:A32783627, we still don't
get any results.

What are we doing wrong here?








Re: How to retrieve the query for a boolean keyword?

2013-10-16 Thread Stavros Delsiavas
I believe it is not possible. But you can easily split this into two query
statements.


First one:

text:(“vacuna” AND “esteve news”)

and the second:

(text:(“vacuna”) OR text:(“esteve news”)) AND -text:(“vacuna” AND 
“esteve news”)


The minus "-" excludes all entries matched by the first statement. This is
important to ensure that you don't get entries twice. So the first query will
contain all entries with both words, and the second query all remaining
entries that contain exactly ONE of those words.
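
Since you are on solrj, here is a minimal sketch of the two-query approach
(the base URL and core name are assumptions; SolrJ 4.x API):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class TwoQueryExample {
  public static void main(String[] args) throws Exception {
    // Adjust the URL/core for your setup.
    HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");

    // Query 1: entries matching BOTH words (the AND case).
    SolrQuery both = new SolrQuery("text:(\"vacuna\" AND \"esteve news\")");
    QueryResponse bothResults = server.query(both);

    // Query 2: entries matching exactly ONE word (AND matches excluded).
    SolrQuery one = new SolrQuery(
        "(text:(\"vacuna\") OR text:(\"esteve news\")) AND -text:(\"vacuna\" AND \"esteve news\")");
    QueryResponse oneResults = server.query(one);

    System.out.println("AND matches: " + bothResults.getResults().getNumFound());
    System.out.println("single-word matches: " + oneResults.getResults().getNumFound());
  }
}

Any document in the first response is known to have matched the AND query;
anything in the second matched only one of the two words.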


I hope this helps.


On 16.10.2013 15:49, Silvia Suárez wrote:

Dear all,

I am using solrj as a client for indexing and searching documents on the Solr
server

My question:

How to retrieve the query for a boolean keyword?

For example:

I have this query:

text:(“vacuna” AND “esteve news”) OR text:(“vacuna”) OR text:(“esteve news”)

And searching in:

text--> Esteve news: Obtener una vacuna para frenar el...

Solr returns:

Esteve news: obtener una vacuna para frenar el ...

It is ok.

My question is:

Can I know with Solr whether the results (Esteve news ... vacuna)
were provided by the query with the AND operator?

Is it possible to retrieve this with solrj?

Thanks a lot in advance,

Sil,








Re: field "title_ngram" was indexed without position data; cannot run PhraseQuery

2013-10-16 Thread MC

Hello,
Thank you all for your help. There was indeed a property which was not 
set right in schema.xml:

omitTermFreqAndPositions="true"
After changing it to false, phrase lookup started working OK.
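
For reference, the corrected definition amounts to something like this in
schema.xml (the field and type names here are assumptions):

<field name="title_ngram" type="text_ngram" indexed="true" stored="true"
       omitTermFreqAndPositions="false"/>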
Thanks,

M


On 10/15/13 12:01 PM, Jack Krupansky wrote:

Show us the field and field type from your schema.

Likely you are "omitting" position info for the field, and the field
type has autoGeneratePhraseQueries="true" - the ngram analyzer
generates a sequence of terms for a single source term and then the
query parser generates a PhraseQuery for that sequence, but that
requires position info in the index, which you have omitted. That's
one theory.


So, if that theory is correct, either retain position info by getting 
rid of the "omit", or remove the autoGeneratePhraseQueries.


-- Jack Krupansky

-Original Message- From: Jason Hellman
Sent: Tuesday, October 15, 2013 11:19 AM
To: solr-user@lucene.apache.org
Subject: Re: field "title_ngram" was indexed without position data; 
cannot run PhraseQuery


If you consider what n-grams do, this should make sense to you. 
Consider the following piece of data:


White iPod

If the field is fed through a bigram filter (n-gram with size of 2) 
the resulting token stream would appear as such:


wh hi it te
ip po od

The usual use of n-grams is to match those partial tokens, essentially 
giving you a great deal of power in creating non-wildcard partial 
matches. How you use this is up to your imagination, but one easy use 
is in partial matches in autosuggest features.


I can't speak for the intent behind the way it's coded, but it makes a 
great deal of sense to me that positional data would be seen as 
unnecessary since the intent of n-grams typically doesn't collide with 
phrase searches.  If you need both behaviors it's far better to use 
copyField and have one field dedicated to standard tokenization and 
token filters, and another field for n-grams.
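
In schema.xml that pattern looks roughly like this (the field and type names
are assumptions, not from the original message):

<field name="title" type="text_general" indexed="true" stored="true"/>
<field name="title_ngram" type="text_ngram" indexed="true" stored="false"/>
<copyField source="title" dest="title_ngram"/>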


I hope that's useful to you.

On Oct 15, 2013, at 6:14 AM, MC  wrote:


Hello,

Could someone explain (or perhaps provide a documentation link) what
the following error means:
"field "title_ngram" was indexed without position data; cannot run 
PhraseQuery"


I'll do some more searching online, I was just wondering if anyone 
has encountered this error before, and what the possible solution 
might be. I've recently upgraded my version of solr from 3.6.0 to 
4.5.0, I'm not sure if this has any bearing or not.

Thanks,

M







Re: Switching indexes

2013-10-16 Thread Christopher Gross
Ok, so I think I was confusing the terminology (still in a 3.X mindset I
guess.)

From the Cloud->Tree, I do see that I have "collections" for what I was
calling "core1", "core2", etc.

So, to redo the above,
Servers: index1, index2, index3
Collections: (on each) coll1, coll2
Collection (core?) on index1: coll1new

Each Collection has 1 shard (too small to make sharding worthwhile).

So should I run something like this:
http://index1:8080/solr/admin/collections?action=CREATEALIAS&name=coll1&collections=coll1new

Or will I need coll1new to be on each of the index1, index2 and index3
instances of Solr?


-- Chris


On Wed, Oct 16, 2013 at 12:40 PM, Shawn Heisey  wrote:

> On 10/16/2013 9:44 AM, Christopher Gross wrote:
> > Garth,
> >
> > I think I get what you're saying, but I want to make sure.
> >
> > I have 3 servers (index1, index2, index3), with Solr living on port 8080.
> >
> > Each of those has 3 cores loaded with data:
> > core1 (old version)
> > core1new (new version)
> > core2 (unrelated to core1)
> >
> > If I wanted to make it so that queries to core1 are really going to
> > core1new, I'd run:
> >
> http://index1:8080/solr/admin/cores?action=CREATEALIAS&name=core1&collections=core1new&shard=shard1
>
> Alias is a *Collections* API concept, not a CoreAdmin API concept.
>
> One question is this:  Do you have a *collection* named core1, or just a
> *core* named core1?  I'm pretty sure that it's possible on a SolrCloud
> system to have cores that are not participating in the cloud
> infrastructure.
>
> Collections are made up of shards.  Shards have replicas.  Each replica
> is a core.
>
> I'd like to see whether you have configurations loaded into zookeeper.
> In the admin UI, click on Cloud, then Tree.  Click the arrow to the left
> of "/configs" to open it.  If you see folders underneath /configs, then
> you do have at least one configuration in zookeeper, and you will have
> the name(s) they are using.
>
> You can also click the arrow next to /collections and see whether you
> have any collections.
>
> The Cloud->Graph page shows you a visual representation of your cloud.
>
> Let us know what you find.  If you have anything there, I can give you
> some API URL calls that will hopefully fully illustrate what I'm saying.
>
> Thanks,
> Shawn
>
>


Re: AW: Boosting a field with defType:dismax --> No results at all

2013-10-16 Thread uwe72
Does it work like this?

edismax
SignalImpl.baureihe^1011 text^0.1
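
As request-handler defaults in solrconfig.xml those two settings would look
something like this (a sketch):

<str name="defType">edismax</str>
<str name="qf">SignalImpl.baureihe^1011 text^0.1</str>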

Another option:

How about just giving the desired fields a high boost factor while adding
them to the document, using Solr?

Can this work?






Re: Switching indexes

2013-10-16 Thread Shawn Heisey
On 10/16/2013 11:51 AM, Christopher Gross wrote:
> Ok, so I think I was confusing the terminology (still in a 3.X mindset I
> guess.)
> 
> From the Cloud->Tree, I do see that I have "collections" for what I was
> calling "core1", "core2", etc.
> 
> So, to redo the above,
> Servers: index1, index2, index3
> Collections: (on each) coll1, coll2
> Collection (core?) on index1: coll1new
> 
> Each Collection has 1 shard (too small to make sharding worthwhile).
> 
> So should I run something like this:
> http://index1:8080/solr/admin/collections?action=CREATEALIAS&name=coll1&collections=coll1new
> 
> Or will I need coll1new to be on each of the index1, index2 and index3
> instances of Solr?

I don't think you can create an alias if a collection already exists
with that name - so having a collection named core1 means you wouldn't
want an alias named core1.  I could be wrong, but just to keep things
clean, I wouldn't recommend it, even if it's possible.

That CREATEALIAS command will only work if coll1new shows up in
/collections and shows green on the cloud graph.  If it does, and you're
using an alias name that doesn't already exist as a collection, then
you're good.

Whether coll1new is living on one server, two servers, or all three
servers doesn't matter for CREATEALIAS, or for most other
collection-related topics.  Any query or update can be sent to any
server in the cloud and it will be routed to the correct place according
to the clusterstate.

Where things live and how many replicas there are *does* matter for a
discussion about redundancy.  Generally speaking, you're going to want
your shards to have at least two replicas, so that if a Solr instance
goes down, or is taken down for maintenance, your cloud remains fully
operational.  In your situation, you probably want three replicas - so
each collection lives on all three servers.

So my general advice:

Decide what name you want your application to use, make sure none of
your existing collections are using that name, and set up an alias with
that name pointing to whichever collection is current.  Then change your
application configurations or code to point at the alias instead of
directly at the collection.

When you want to do your reindex, first create a new collection using
the collections API.  Index to that new collection.  When it's ready to
go, use CREATEALIAS to update the alias, and your application will start
using the new index.
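
For example, the pair of calls might look like this (the collection and
alias names here are hypothetical, following the coll1 example above):

http://index1:8080/solr/admin/collections?action=CREATE&name=coll1-20131016&numShards=1&replicationFactor=3
http://index1:8080/solr/admin/collections?action=CREATEALIAS&name=coll1&collections=coll1-20131016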

Thanks,
Shawn



SolrCloud Performance Issue

2013-10-16 Thread shamik
Hi,

  I'm in the process of transitioning to SolrCloud from a conventional
Master-Slave model. I'm using Solr 4.4 and have set up 2 shards with 1
replica each. I have a 3-node zookeeper ensemble. All the nodes are running
on AWS EC2 instances. The shards are on m1.xlarge and share a zookeeper
instance (mounted on a separate volume). 6 GB of memory is allocated to each
Solr instance.

I have around 10 million documents in the index. With the previous standalone
model, queries averaged around 100 ms.  The SolrCloud query response has
been abysmal so far. The query response time is over 1000 ms, often reaching
2000 ms. I expected some surge due to additional servers, network latency,
etc., but this difference is really baffling. The hardware is similar in both
cases, except for the fact that a couple of the SolrCloud nodes are sharing
zookeeper as well. m1.xlarge I/O is high, so it shouldn't be a bottleneck
either.

The other difference from the old setup is that I'm using the new
CloudSolrServer class, which takes the 3 zookeeper references for load
balancing. But I don't think it has any major impact, as queries executed
from the Solr admin query panel confirm the slowness.

Here are some of my configuration setup:

 
3 
false 


 
1000 



1024










true

200

400





line
xref
draw




line
draw
linelanguage:english
lineSource2:documentation
lineSource2:CloudHelp
drawlanguage:english
drawSource2:documentation
drawSource2:CloudHelp



2


The custom request handler :



explicit
0.01
velocity
browse
text/html;charset=UTF-8 
  
layout
cloudhelp

edismax
*:*
15
id,url,Description,Source2,text,filetype,title,LastUpdateDate,PublishDate,ViewCount,TotalMessageCount,Solution,LastPostAuthor,Author,Duration,AuthorUrl,ThumbnailUrl,TopicId,score
text^1.5 title^2 IndexTerm^.9 
keywords^1.2
ADSKCommandSrch^2 ADSKContextId^1
Source2:CloudHelp^3 
Source2:youtube^0.85 
recip(ms(NOW,PublishDate),3.16e-11,1,1)^2.0 
text


on
1
100
language
Source2
DocumentationBook
ADSKProductDisplay
audience


true
text title
250
ShortDesc


true
default
true
false
false
1


spellcheck



One thing I've noticed is that the queryResultCache hit rate is really low;
I'm not sure our queries are really that unique. I'm using edismax and there's a
recip(ms(NOW,PublishDate),3.16e-11,1,1)^2.0 boost function, can this
contribute?

Sorry about the long post, but I'm struggling to nail down the issue here,
especially when queries are running fine in a master-slave environment with
similar hardware and network.

Any pointers will be highly appreciated.

Regards,
Shamik






Re: Switching indexes

2013-10-16 Thread Christopher Gross
Thanks Shawn, the explanations help bring me forward to the "SolrCloud"
mentality.

So it sounds like going forward that I should have a more complicated name
(ex: coll1-20131015) aliased to coll1, to make it easier to switch in the
future.

Now, if I already have an index (copied from one location to another), it
sounds like I should just remove my existing (bad/old data) coll1, create
the "replicated" one (calling it coll1-<date>), then alias coll1 to that
one.

This type of information would have been awesome to know before I got
started, but I can make do with what I've got going now.

Thanks again!


-- Chris


On Wed, Oct 16, 2013 at 2:40 PM, Shawn Heisey  wrote:

> On 10/16/2013 11:51 AM, Christopher Gross wrote:
> > Ok, so I think I was confusing the terminology (still in a 3.X mindset I
> > guess.)
> >
> > From the Cloud->Tree, I do see that I have "collections" for what I was
> > calling "core1", "core2", etc.
> >
> > So, to redo the above,
> > Servers: index1, index2, index3
> > Collections: (on each) coll1, coll2
> > Collection (core?) on index1: coll1new
> >
> > Each Collection has 1 shard (too small to make sharding worthwhile).
> >
> > So should I run something like this:
> >
> http://index1:8080/solr/admin/collections?action=CREATEALIAS&name=coll1&collections=coll1new
> >
> > Or will I need coll1new to be on each of the index1, index2 and index3
> > instances of Solr?
>
> I don't think you can create an alias if a collection already exists
> with that name - so having a collection named core1 means you wouldn't
> want an alias named core1.  I could be wrong, but just to keep things
> clean, I wouldn't recommend it, even if it's possible.
>
> That CREATEALIAS command will only work if coll1new shows up in
> /collections and shows green on the cloud graph.  If it does, and you're
> using an alias name that doesn't already exist as a collection, then
> you're good.
>
> Whether coll1new is living on one server, two servers, or all three
> servers doesn't matter for CREATEALIAS, or for most other
> collection-related topics.  Any query or update can be sent to any
> server in the cloud and it will be routed to the correct place according
> to the clusterstate.
>
> Where things live and how many replicas there are *does* matter for a
> discussion about redundancy.  Generally speaking, you're going to want
> your shards to have at least two replicas, so that if a Solr instance
> goes down, or is taken down for maintenance, your cloud remains fully
> operational.  In your situation, you probably want three replicas - so
> each collection lives on all three servers.
>
> So my general advice:
>
> Decide what name you want your application to use, make sure none of
> your existing collections are using that name, and set up an alias with
> that name pointing to whichever collection is current.  Then change your
> application configurations or code to point at the alias instead of
> directly at the collection.
>
> When you want to do your reindex, first create a new collection using
> the collections API.  Index to that new collection.  When it's ready to
> go, use CREATEALIAS to update the alias, and your application will start
> using the new index.
>
> Thanks,
> Shawn
>
>


Re: Local Solr and Webserver-Solr act differently ("and" treated like "or")

2013-10-16 Thread Shawn Heisey
On 10/16/2013 4:46 AM, Stavros Delisavas wrote:
> My local solr gives me:
> http://pastebin.com/Q6d9dFmZ
> 
> and my webserver this:
> http://pastebin.com/q87WEjVA
> 
> I copied only the first few hundred lines (of more than 8000) because
> the webserver output was too big even for pastebin.
> 
> 
> 
> On 16.10.2013 12:27, Erik Hatcher wrote:
>> What does the debug output from debugQuery=true say between the two?

What's really needed here is the first part of the debug section,
which has rawquerystring, querystring, parsedquery, and
parsedquery_toString.  The info from your local solr has this part, but
what you pasted from the webserver one didn't include those parts,
because it's further down than the first few hundred lines.

Thanks,
Shawn



Re: Local Solr and Webserver-Solr act differently ("and" treated like "or")

2013-10-16 Thread Stavros Delsiavas

Okay I understand,

here's the rawquerystring. It was at about line 3000:


<str name="rawquerystring">title:(into AND the AND wild*)</str>
<str name="querystring">title:(into AND the AND wild*)</str>
<str name="parsedquery">+title:wild*</str>
<str name="parsedquery_toString">+title:wild*</str>

At this place the debug output DOES differ from the one on my local 
system. But I don't understand why...

This is the local debug output:


<str name="rawquerystring">title:(into AND the AND wild*)</str>
<str name="querystring">title:(into AND the AND wild*)</str>
<str name="parsedquery">+title:into +title:the +title:wild*</str>
<str name="parsedquery_toString">+title:into +title:the +title:wild*</str>


Why is that? Any ideas?




On 16.10.2013 21:03, Shawn Heisey wrote:

On 10/16/2013 4:46 AM, Stavros Delisavas wrote:

My local solr gives me:
http://pastebin.com/Q6d9dFmZ

and my webserver this:
http://pastebin.com/q87WEjVA

I copied only the first few hundred lines (of more than 8000) because
the webserver output was too big even for pastebin.



On 16.10.2013 12:27, Erik Hatcher wrote:

What does the debug output from debugQuery=true say between the two?

What's really needed here is the first part of the debug section,
which has rawquerystring, querystring, parsedquery, and
parsedquery_toString.  The info from your local solr has this part, but
what you pasted from the webserver one didn't include those parts,
because it's further down than the first few hundred lines.

Thanks,
Shawn





Re: Local Solr and Webserver-Solr act differently ("and" treated like "or")

2013-10-16 Thread Jack Krupansky
So, the stopwords.txt file is different between the two systems - the first 
has stop words but the second does not. Did you expect stop words to be 
removed, or not?
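
For context, stopwords.txt is wired into the analysis chain in schema.xml
along these lines (the surrounding field type is an assumption):

<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>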


-- Jack Krupansky

-Original Message- 
From: Stavros Delsiavas

Sent: Wednesday, October 16, 2013 5:02 PM
To: solr-user@lucene.apache.org
Subject: Re: Local Solr and Webserver-Solr act differently ("and" treated 
like "or")


Okay I understand,

here's the rawquerystring. It was at about line 3000:


<str name="rawquerystring">title:(into AND the AND wild*)</str>
<str name="querystring">title:(into AND the AND wild*)</str>
<str name="parsedquery">+title:wild*</str>
<str name="parsedquery_toString">+title:wild*</str>

At this place the debug output DOES differ from the one on my local
system. But I don't understand why...
This is the local debug output:


<str name="rawquerystring">title:(into AND the AND wild*)</str>
<str name="querystring">title:(into AND the AND wild*)</str>
<str name="parsedquery">+title:into +title:the +title:wild*</str>
<str name="parsedquery_toString">+title:into +title:the +title:wild*</str>

Why is that? Any ideas?




On 16.10.2013 21:03, Shawn Heisey wrote:

On 10/16/2013 4:46 AM, Stavros Delisavas wrote:

My local solr gives me:
http://pastebin.com/Q6d9dFmZ

and my webserver this:
http://pastebin.com/q87WEjVA

I copied only the first few hundred lines (of more than 8000) because
the webserver output was too big even for pastebin.



On 16.10.2013 12:27, Erik Hatcher wrote:

What does the debug output from debugQuery=true say between the two?

What's really needed here is the first part of the debug section,
which has rawquerystring, querystring, parsedquery, and
parsedquery_toString.  The info from your local solr has this part, but
what you pasted from the webserver one didn't include those parts,
because it's further down than the first few hundred lines.

Thanks,
Shawn






Solr - Read sort data from external source

2013-10-16 Thread qrcde
Hello,

I am trying to write some code to read rank data from an external db. I saw
an example done using a database -
http://sujitpal.blogspot.com/2011/05/custom-sorting-in-solr-using-external.html,
where they fetch the whole database during index searcher creation and cache it.

But is there any way to pass a parameter or choose a different database in the
FieldComparator based on the query? Let's say I want to pass versions: the sort
order in version 1 will be different than the sort order in v2.

Or, if I use ExternalFileField, is there a way to load a different file based
on a query parameter?

Regards





Skipping caches on a /select

2013-10-16 Thread Tim Vaillancourt
Hey guys,

I am debugging some /select queries on my Solr tier and would like to see
if there is a way to tell Solr to skip the caches on a given /select query
if it happens to ALREADY be in the cache. Live queries are being inserted
and read from the caches, but I want my debug queries to bypass the cache
entirely.

I do know about the "cache=false" param (that causes the results of a
select to not be INSERTED into the cache), but what I am looking for
instead is a way to tell Solr to not read the cache at all, even if there
actually is a cached result for my query.

Is there a way to do this (without disabling my caches in solrconfig.xml),
or is this a feature request?

Thanks!

Tim Vaillancourt


Re: SolrCloud on SSL

2013-10-16 Thread Tim Vaillancourt
Not important, but I'm also curious why you would want SSL on Solr (adds
overhead, complexity, harder-to-troubleshoot, etc)?

To avoid the overhead, could you put Solr on a separate VLAN (with ACLs to
client servers)?

Cheers,

Tim


On 12 October 2013 17:30, Shawn Heisey  wrote:

> On 10/11/2013 9:38 AM, Christopher Gross wrote:
> > On Fri, Oct 11, 2013 at 11:08 AM, Shawn Heisey 
> wrote:
> >
> >> On 10/11/2013 8:17 AM, Christopher Gross wrote: 
> >>> Is there a spot in a Solr configuration that I can set this up to use
> >> HTTPS?
> >>
> >> From what I can tell, not yet.
> >>
> >> https://issues.apache.org/jira/browse/SOLR-3854
> >> https://issues.apache.org/jira/browse/SOLR-4407
> >> https://issues.apache.org/jira/browse/SOLR-4470
> >>
> >>
> > Dang.
>
> Christopher,
>
> I was just looking through Solr source code for a completely different
> issue, and it seems that there *IS* a way to do this in your configuration.
>
> If you were to use "https://hostname" or "https://ipaddress" as the
> "host" parameter in your solr.xml file on each machine, it should do
> what you want.  The parameter is described here, but not the behavior
> that I have discovered:
>
> http://wiki.apache.org/solr/SolrCloud#SolrCloud_Instance_Params
>
> Boring details: In the org.apache.solr.cloud package, there is a
> ZkController class.  The getHostAddress method is where I discovered
> that you can do this.
>
> If you could try this out and confirm that it works, I will get the wiki
> page updated and look into the Solr reference guide as well.
>
> Thanks,
> Shawn
>
>


Re: Skipping caches on a /select

2013-10-16 Thread Yonik Seeley
On Wed, Oct 16, 2013 at 6:18 PM, Tim Vaillancourt  wrote:
> I am debugging some /select queries on my Solr tier and would like to see
> if there is a way to tell Solr to skip the caches on a given /select query
> if it happens to ALREADY be in the cache. Live queries are being inserted
> and read from the caches, but I want my debug queries to bypass the cache
> entirely.
>
> I do know about the "cache=false" param (that causes the results of a
> select to not be INSERTED into the cache), but what I am looking for
> instead is a way to tell Solr to not read the cache at all, even if there
> actually is a cached result for my query.

Yeah, cache=false for "q" or "fq" should already not use the cache at
all (read or write).
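
For example, as local params on the query and/or filter (a sketch; the field
names are made up):

http://localhost:8983/solr/select?q={!cache=false}text:debug&fq={!cache=false}category:books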

-Yonik


Re: Solr 4.4 - Master/Slave configuration - Replication Issue with Commits after deleting documents using Delete by ID

2013-10-16 Thread Shalin Shekhar Mangar
Thanks Bharat. This is a bug. I've opened LUCENE-5289.

https://issues.apache.org/jira/browse/LUCENE-5289


On Wed, Oct 16, 2013 at 9:35 PM, Akkinepalli, Bharat (ELS-CON) <
b.akkinepa...@elsevier.com> wrote:

> Hi Shalin,
> I am not sure why the "No uncommitted changes" message appears in the log.
> The data is available in Solr at the time I perform the delete.
>
> Please find below the steps I performed:
> > Inserted a document in master (with id= change.me.1)
> > issued a commit on master
> > Triggered replication on slave
> > Ensured that the document is replicated successfully.
> > Issued a delete by ID.
> > Issued a commit on master
> > Replication did NOT happen.
>
> The logs are as follows:
> Master - http://pastebin.com/265CtCEp
> Slave - http://pastebin.com/Qx0xLwmK
>
> Regards,
> Bharat Akkinepalli.
>
> -Original Message-
> From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com]
> Sent: Wednesday, October 16, 2013 11:28 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr 4.4 - Master/Slave configuration - Replication Issue
> with Commits after deleting documents using Delete by ID
>
> The only delete I see in the master logs is:
>
> INFO  - 2013-10-11 14:06:54.793;
> org.apache.solr.update.processor.LogUpdateProcessor; [annotation]
> webapp=/solr path=/update params={} {delete=[change.me(-1448623278425899008)]}
> 0 60
>
> When you commit, we have the following:
>
> INFO  - 2013-10-11 14:07:03.809;
> org.apache.solr.update.DirectUpdateHandler2; start
> commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
> INFO  - 2013-10-11 14:07:03.813;
> org.apache.solr.update.DirectUpdateHandler2; No uncommitted changes.
> Skipping IW.commit.
>
> That suggests that the id you are trying to delete never existed in the
> first place and hence there was nothing to commit. Hence replication was
> not triggered. Am I missing something?
>
>
> On Wed, Oct 16, 2013 at 5:06 PM, Akkinepalli, Bharat (ELS-CON) <
> b.akkinepa...@elsevier.com> wrote:
>
> > Hi Otis,
> > Did you get a chance to look into the logs.  Please let me know if you
> > need more information.  Thank you.
> >
> > Regards,
> > Bharat Akkinepalli
> >
> > -Original Message-
> > From: Akkinepalli, Bharat (ELS-CON)
> > [mailto:b.akkinepa...@elsevier.com]
> > Sent: Friday, October 11, 2013 2:16 PM
> > To: solr-user@lucene.apache.org
> > Subject: RE: Solr 4.4 - Master/Slave configuration - Replication Issue
> > with Commits after deleting documents using Delete by ID
> >
> > Hi Otis,
> > Thanks for the response.  The log files can be found here.
> >
> > MasterLog : http://pastebin.com/DPLKMPcF Slave Log:
> > http://pastebin.com/DX9sV6Jx
> >
> > One more point worth mentioning here is that when we issue the commit
> > with expungeDeletes=true, then the delete by id replication is
> successful. i.e.
> > http://localhost:8983/solr/annotation/update?commit=true&expungeDelete
> > s=true
> >
> > Regards,
> > Bharat Akkinepalli
> >
> > -Original Message-
> > From: Otis Gospodnetic [mailto:otis.gospodne...@gmail.com]
> > Sent: Wednesday, October 09, 2013 6:35 PM
> > To: solr-user@lucene.apache.org
> > Subject: Re: Solr 4.4 - Master/Slave configuration - Replication Issue
> > with Commits after deleting documents using Delete by ID
> >
> > Bharat,
> >
> > Can you look at the logs on the Master when you issue the delete and
> > the subsequent commits and share that?
> >
> > Otis
> > --
> > Solr & ElasticSearch Support -- http://sematext.com/ Performance
> > Monitoring -- http://sematext.com/spm
> >
> >
> >
> > On Tue, Oct 8, 2013 at 3:57 PM, Akkinepalli, Bharat (ELS-CON) <
> > b.akkinepa...@elsevier.com> wrote:
> > > Hi,
> > > We have recently migrated from Solr 3.6 to Solr 4.4.  We are using
> > > the
> > Master/Slave configuration in Solr 4.4 (not Solr Cloud).  We have
> > noticed the following behavior/defect.
> > >
> > > Configuration:
> > > ===
> > >
> > > 1.   The Hard Commit and Soft Commit are disabled in the
> > configuration (we control the commits from the application)
> > >
> > > 2.   We have 1 Master and 2 Slaves configured and the pollInterval
> > is configured to 10 Minutes.
> > >
> > > 3.   The Master is configured to have the "replicateAfter" as
> commit
> > & startup
> > >
> > > Steps to reproduce the problem:
> > > ==
> > >
> > > 1.   Delete a document in Solr  (using delete by id).  URL -
> > http://localhost:8983/solr/annotation/update with body as
> >  <delete><id>change.me</id></delete>
> > >
> > > 2.   Issue a commit in Master (
> > http://localhost:8983/solr/annotation/update?commit=true).
> > >
> > > 3.   The replication of the DELETE WILL NOT happen.  The master and
> > slave has the same Index version.
> > >
> > > 4.   If we try to issue another commit in Master, we see that it
> > replicates fine.
> > >
> > > Request you to please confirm if this is a known issue.  Thank you.
> > >
> > > Regards,
> > > Bharat Akkinepalli

Re: Local Solr and Webserver-Solr act differently ("and" treated like "or")

2013-10-16 Thread Stavros Delsiavas
Unfortunately, I don't really know what stopwords are. I would like it to
not ignore any words of my query.

How/where can I change this stopwords behaviour?


On 16.10.2013 23:45, Jack Krupansky wrote:
So, the stopwords.txt file is different between the two systems - the 
first has stop words but the second does not. Did you expect stop 
words to be removed, or not?


-- Jack Krupansky

-Original Message- From: Stavros Delsiavas
Sent: Wednesday, October 16, 2013 5:02 PM
To: solr-user@lucene.apache.org
Subject: Re: Local Solr and Webserver-Solr act differently ("and" 
treated like "or")


Okay I understand,

here's the rawquerystring. It was at about line 3000:


<str name="rawquerystring">title:(into AND the AND wild*)</str>
<str name="querystring">title:(into AND the AND wild*)</str>
<str name="parsedquery">+title:wild*</str>
<str name="parsedquery_toString">+title:wild*</str>

At this place the debug output DOES differ from the one on my local
system. But I don't understand why...
This is the local debug output:


<str name="rawquerystring">title:(into AND the AND wild*)</str>
<str name="querystring">title:(into AND the AND wild*)</str>
<str name="parsedquery">+title:into +title:the +title:wild*</str>
<str name="parsedquery_toString">+title:into +title:the +title:wild*</str>

Why is that? Any ideas?




On 16.10.2013 21:03, Shawn Heisey wrote:

On 10/16/2013 4:46 AM, Stavros Delisavas wrote:

My local solr gives me:
http://pastebin.com/Q6d9dFmZ

and my webserver this:
http://pastebin.com/q87WEjVA

I copied only the first few hundred lines (of more than 8000) because
the webserver output was too big even for pastebin.



On 16.10.2013 12:27, Erik Hatcher wrote:
What does the debug output from debugQuery=true say between the two?

What's really needed here is the first part of the debug section,
which has rawquerystring, querystring, parsedquery, and
parsedquery_toString.  The info from your local solr has this part, but
what you pasted from the webserver one didn't include those parts,
because it's further down than the first few hundred lines.

Thanks,
Shawn







Re: SolrCloud Performance Issue

2013-10-16 Thread primoz . skale
The query result cache hit rate might be low due to using NOW in bf. NOW is
always translated to the current time, and that of course changes from
millisecond to millisecond... :)
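
One common mitigation (my suggestion, not something from the original post)
is to round NOW so the function, and thus the cache key, stays stable for a
day at a time:

recip(ms(NOW/DAY,PublishDate),3.16e-11,1,1)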

Primoz



From:   Shamik Bandopadhyay 
To: solr-user@lucene.apache.org
Date:   17.10.2013 00:14
Subject:SolrCloud Performance Issue



Hi,

  I'm in the process of transitioning to SolrCloud from a conventional
Master-Slave model. I'm using Solr 4.4 and have set up 2 shards with 1
replica each. I have a 3-node zookeeper ensemble. All the nodes are running
on AWS EC2 instances. The shards are on m1.xlarge and share a zookeeper
instance (mounted on a separate volume). 6 GB of memory is allocated to each
Solr instance.

I have around 10 million documents in the index. With the previous standalone
model, queries averaged around 100 ms.  The SolrCloud query response has
been abysmal so far. The query response time is over 1000 ms, often reaching
2000 ms. I expected some surge due to additional servers, network latency,
etc., but this difference is really baffling. The hardware is similar in both
cases, except for the fact that a couple of the SolrCloud nodes are sharing
zookeeper as well. m1.xlarge I/O is high, so it shouldn't be a bottleneck
either.

The other difference from the old setup is that I'm using the new
CloudSolrServer class, which takes the 3 zookeeper references for load
balancing. But I don't think it has any major impact, as queries executed
from the Solr admin query panel confirm the slowness.

Here are some of my configuration setup:


3
false



1000



1024










true

200

400





line
xref
draw




line
draw
linelanguage:english
lineSource2:documentation
lineSource2:CloudHelp
drawlanguage:english
drawSource2:documentation
drawSource2:CloudHelp



2


The custom request handler :



explicit
0.01
velocity
browse
text/html;charset=UTF-8
layout
cloudhelp

edismax
*:*
15
id,url,Description,Source2,text,filetype,title,LastUpdateDate,PublishDate,ViewCount,TotalMessageCount,Solution,LastPostAuthor,Author,Duration,AuthorUrl,ThumbnailUrl,TopicId,score
text^1.5 title^2 IndexTerm^.9
keywords^1.2 ADSKCommandSrch^2 ADSKContextId^1
Source2:CloudHelp^3
Source2:youtube^0.85
recip(ms(NOW,PublishDate),3.16e-11,1,1)^2.0
text


on
1
100
language
Source2
DocumentationBook
ADSKProductDisplay
audience


true
text title
250
ShortDesc


true
default
true
false
false
1


spellcheck



One thing I've noticed is that the queryResultCache hit rate is really low;
I'm not sure our queries are really that unique. I'm using edismax and there's
a recip(ms(NOW,PublishDate),3.16e-11,1,1)^2.0 boost function, can
this contribute?

Sorry about the long post, but I'm struggling to nail down the issue here,
especially when queries are running fine in a master-slave environment 
with
similar hardware and network.

Any pointers will be highly appreciated.

Regards,
Shamik