Specific cores/collections to specific nodes

2017-12-11 Thread Leo Prince
Hi,

I just wanted to dedicate specific Solr nodes to specific collections. As
per my understanding, when a collection is created in Solr 7, its cores are
automatically created in the backend. Since my cores are getting different
volumes of traffic, I wanted to dedicate specific collections to specific
nodes. One core is read-intensive with few writes and the other is
write-intensive with few reads, so I don't want to mix their IO together.
Earlier, in Solr 4, we were able to hard-code this in Tomcat's solr.xml;
similarly, is there any way we can do this in Solr 7?

Thanks in advance.
Leo Prince.


Solr :: How to trigger the DIH from SolrNet API with C# code

2017-12-11 Thread Karan Saini
Hi guys,

*Solr Version :: 6.6.1*
API :: SolrNet with C# based application

I wish to invoke or trigger the data import handler from C# code with
the help of SolrNet, but I am unable to locate any tutorial in the
SolrNet API.

Please suggest how I can invoke the data import action from the C#-based
application.

Regards,
Karan


Re: Learning to Rank (LTR) with grouping

2017-12-11 Thread Diego Ceccarelli (BLOOMBERG/ LONDON)
Hi Roopa,

If you look at the diff: 

https://github.com/apache/lucene-solr/pull/162/files

I didn't change much in SolrIndexSearcher, you can try to skip the file when 
applying the patch and redo the changes after.

Alternatively, the feature branch is available here: 

https://github.com/bloomberg/lucene-solr/commits/master-solr-8776

you could try to merge with that or cherry-pick my changes.
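The two options can be sketched as shell commands (a sketch only: filterdiff comes from the patchutils package, and the remote name "bloomberg" is an arbitrary label):

```shell
# Option 1: apply the patch but exclude SolrIndexSearcher.java, then
# redo those changes by hand afterwards.
filterdiff -x '*SolrIndexSearcher.java' 162.patch > 162-partial.patch
patch -p1 -i 162-partial.patch

# Option 2: fetch the feature branch and merge (or cherry-pick) from it.
git remote add bloomberg https://github.com/bloomberg/lucene-solr.git
git fetch bloomberg master-solr-8776
git merge bloomberg/master-solr-8776    # or: git cherry-pick <commit-sha>
```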

Are you interested in reranking the groups or also in reranking the documents 
inside each group? 

Cheers,
Diego


From: solr-user@lucene.apache.org At: 12/09/17 19:07:25 To:
solr-user@lucene.apache.org
Subject: Re: Learning to Rank (LTR) with grouping

Hi, I tried to apply JIRA SOLR-8776 as a patch, as this feature is
critical.

Here are the steps I took on my mac:

On branch branch_6_5

Your branch is up-to-date with 'origin/branch_6_5'

patch -p1 -i 162.patch --dry-run


I am getting failures for certain hunks.

Example:

patching file
solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java

Hunk #1 FAILED at 1471.


Could you please give your input on how to apply this ticket as a patch for
branch_6_5 ?


Thank you,

Roopa

On Fri, Dec 8, 2017 at 6:52 PM, Roopa Rao  wrote:

> Hi Diego,
>
> Thank you, I will look into this and see how I could patch this.
>
> Thank you for your quick response,
> Roopa
>
>
> On Fri, Dec 8, 2017 at 5:44 PM, Diego Ceccarelli <
> diego.ceccare...@gmail.com> wrote:
>
>> Hi Roopa,
>>
>> LTR is implemented using RankQuery, and at the moment grouping doesn't
>> support RankQuery.
>> I opened a JIRA item some time ago
>> (https://issues.apache.org/jira/browse/SOLR-8776) and I would be happy
>> to receive feedback on that.  You can find the code here
>> https://github.com/apache/lucene-solr/pull/162.
>>
>> Cheers,
>> diego
>>
>> On Fri, Dec 8, 2017 at 9:15 PM, Roopa Rao  wrote:
>> > Hi,
>> >
>> > I am using grouping and LTR together, and the results are not getting
>> > re-ranked as they do without grouping.
>> >
>> > I am passing &rq parameter.
>> >
>> > Does LTR work with grouping on?
>> > Solr version 6.5
>> >
>> > Thank you,
>> > Roopa
>>
>
>




solr.TrieDoubleField deprecated with 7.1.0 but wildcard "*" search behaviour is different with solr.DoublePointField

2017-12-11 Thread Torsten Krah
Hi,

a question about the new DoublePointField, which should be used
instead of the TrieDoubleField in 7.1:

https://lucene.apache.org/solr/guide/7_1/field-types-included-with-solr.html

If I am using the deprecated one, it's possible to get a match for a
double field like this:

test_d:*

even in 7.1.0.

But with the new DoublePointField, which you should use instead, you
won't get that match - you have to use e.g. [* TO *].

A short recipe can be found here if you want to have a look yourself:

https://stackoverflow.com/questions/47473188/solr-7-1-querying-double-field-for-any-value-not-possible-with-anymore/47752445

Is this an intended change in runtime/query behaviour, or a bug, or is
it possible to restore that behaviour with the new field too?

kind regards

Torsten




Solrcloud Drupal 7

2017-12-11 Thread Per Qvindesland
Hi All

I have a SolrCloud cluster with ZooKeeper 3.4.10 and Solr 5.4.1, running on
3 CentOS 7 instances in AWS. I intend to use the cluster for Drupal 7 search.

At the moment we have 2 instances running in production, each with its own
standalone Solr installation (no cloud), but I would like to improve the
redundancy and maybe even the performance. I followed
http://www.francelabs.com/blog/tutorial-deploying-solrcloud-6-on-amazon-ec2/ to
get the cloud up and running, but I need some guidance on how to add a new
shard/collection so I can point the Drupal instances to the new SolrCloud.
Does anyone have any information on how to do this? As you can see, I have no
experience with SolrCloud :)

Regards
Per





Re: Learning to Rank (LTR) with grouping

2017-12-11 Thread Roopa Rao
Hi Diego,

Thank you,

I am interested in reranking the documents inside one of the groups.

I will try the options you mentioned here.

Thank you,
Roopa

On Mon, Dec 11, 2017 at 6:57 AM, Diego Ceccarelli (BLOOMBERG/ LONDON) <
dceccarel...@bloomberg.net> wrote:

> Hi Roopa,
>
> If you look at the diff:
>
> https://github.com/apache/lucene-solr/pull/162/files
>
> I didn't change much in SolrIndexSearcher, you can try to skip the file
> when applying the patch and redo the changes after.
>
> Alternatively, the feature branch is available here:
>
> https://github.com/bloomberg/lucene-solr/commits/master-solr-8776
>
> you could try to merge with that or cherry-pick my changes.
>
> Are you interested in reranking the groups or also in reranking the
> documents inside each group?
>
> Cheers,
> Diego
>
>
> From: solr-user@lucene.apache.org At: 12/09/17 19:07:25 To:
> solr-user@lucene.apache.org
> Subject: Re: Learning to Rank (LTR) with grouping
>
> Hi, I tried to apply JIRA SOLR-8776 as a patch, as this feature is
> critical.
>
> Here are the steps I took on my mac:
>
> On branch branch_6_5
>
> Your branch is up-to-date with 'origin/branch_6_5'
>
> patch -p1 -i 162.patch --dry-run
>
>
> I am getting Failures for certain Hunks
>
> Example:
>
> patching file
> solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java
>
> Hunk #1 FAILED at 1471.
>
>
> Could you please give your input on how to apply this ticket as a patch for
> branch_6_5 ?
>
>
> Thank you,
>
> Roopa
>
> On Fri, Dec 8, 2017 at 6:52 PM, Roopa Rao  wrote:
>
> > Hi Diego,
> >
> > Thank you, I will look into this and see how I could patch this.
> >
> > Thank you for your quick response,
> > Roopa
> >
> >
> > On Fri, Dec 8, 2017 at 5:44 PM, Diego Ceccarelli <
> > diego.ceccare...@gmail.com> wrote:
> >
> >> Hi Roopa,
> >>
> >> LTR is implemented using RankQuery, and at the moment grouping doesn't
> >> support RankQuery.
> >> I opened a jira item time ago
> >> (https://issues.apache.org/jira/browse/SOLR-8776) and I would be happy
> >> to receive feedback on that.  You can find the code here
> >> https://github.com/apache/lucene-solr/pull/162.
> >>
> >> Cheers,
> >> diego
> >>
> >> On Fri, Dec 8, 2017 at 9:15 PM, Roopa Rao  wrote:
> >> > Hi,
> >> >
> >> > I am using grouping and LTR together, and the results are not getting
> >> > re-ranked as they do without grouping.
> >> >
> >> > I am passing &rq parameter.
> >> >
> >> > Does LTR work with grouping on?
> >> > Solr version 6.5
> >> >
> >> > Thank you,
> >> > Roopa
> >>
> >
> >
>
>
>


Re: Solrcloud Drupal 7

2017-12-11 Thread Fengtan
Hi,

Drupal can talk to a SolrCloud cluster the same way it talks to a
standalone Solr server, i.e. by using the Search API suite
https://www.drupal.org/project/search_api

You will have to create the collection yourself on Solr though (Drupal will
not do it for you).

If you want to take advantage of SolrCloud's fault tolerance capabilities
then you may have to:
* either use an external load balancer in front of your SolrCloud cluster
* or implement a smart client in Drupal -- I opened this ticket some time
ago https://www.drupal.org/project/search_api_solr/issues/2858645



On Mon, Dec 11, 2017 at 7:48 AM, Per Qvindesland  wrote:

> Hi All
>
> I have a solr cloud with zookeeper 3.4.10 and solr 5.4.1, running on 3 centos 7
> instances in AWS. I intend to use the cluster for Drupal 7 search.
>
> At the moment we have 2 instances running in production with each having
> their own solr installation (no cloud) but I would like to improve the
> redundancy and maybe even the performance, I followed
> http://www.francelabs.com/blog/tutorial-deploying-
> solrcloud-6-on-amazon-ec2/ to get the cloud up and running but I need
> some guidance on how to add in a new shard/collection so I can point the
> drupal instances to the new solr cloud, does anyone have any information on
> how to do this? as you can see i have no experience with solr cloud :)
>
> Regards
> Per
>
>
>
>


Re: Solr :: How to trigger the DIH from SolrNet API with C# code

2017-12-11 Thread Shawn Heisey
On 12/11/2017 4:40 AM, Karan Saini wrote:
> *Solr Version :: 6.6.1*
> API :: SolrNet with C# based application
>
> I wish to invoke or trigger the data import handler from C# code with
> the help of SolrNet, but I am unable to locate any tutorial in the
> SolrNet API.
>
> Please suggest how I can invoke the data import action from the C#-based
> application.

That would be a question for whoever wrote SolrNet.  It was *not* the
Solr project.

https://github.com/SolrNet/SolrNet/issues

If people involved with that project need help with what Solr expects
and what it sends back, they can come to this list.

Thanks,
Shawn
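For what it's worth, DIH is exposed as a plain HTTP request handler, so it can be triggered with any HTTP client independently of SolrNet. A minimal sketch, assuming the handler is registered at the default /dataimport path and using a hypothetical core name "products":

```shell
# Kick off a full import; DIH runs asynchronously.
curl "http://localhost:8983/solr/products/dataimport?command=full-import&clean=true&commit=true"

# Poll for progress/completion.
curl "http://localhost:8983/solr/products/dataimport?command=status"
```

In C#, the same two GET requests could be issued with HttpClient; SolrNet itself does not need to be involved.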



Re: Solrcloud Drupal 7

2017-12-11 Thread Per Qvindesland
Hi 

Thanks for the information.

Once the collection is created, how do you deploy files such as the schema.xml
to the folder locations? I have used rsync -av search_api_solr/solr-conf/5.x
/var/solr/data/search_shard2_replica1/ on one instance, but I can’t see the
files being replicated between the instances.

Regards
Per




> On 11 Dec 2017, at 14:57, Fengtan  wrote:
> 
> Hi,
> 
> Drupal can talk to a SolrCloud cluster the same way it talks to a
> standalone Solr server, i.e. by using the Search API suite
> https://www.drupal.org/project/search_api
> 
> You will have to create the collection yourself on Solr though (Drupal will
> not do it for you).
> 
> If you want to take advantage of SolrCloud's fault tolerance capabilities
> then you may have to:
> * either use an external load balancer in front of your SolrCloud cluster
> * or implement a smart client in Drupal -- I opened this ticket some time
> ago https://www.drupal.org/project/search_api_solr/issues/2858645
> 
> 
> 
> On Mon, Dec 11, 2017 at 7:48 AM, Per Qvindesland  wrote:
> 
>> Hi All
>> 
>> I have a solr cloud with zookeeper 3.4.10 and solr 5.4.1, running on 3 centos 7
>> instances in AWS. I intend to use the cluster for Drupal 7 search.
>> 
>> At the moment we have 2 instances running in production with each having
>> their own solr installation (no cloud) but I would like to improve the
>> redundancy and maybe even the performance, I followed
>> http://www.francelabs.com/blog/tutorial-deploying-
>> solrcloud-6-on-amazon-ec2/ to get the cloud up and running but I need
>> some guidance on how to add in a new shard/collection so I can point the
>> drupal instances to the new solr cloud, does anyone have any information on
>> how to do this? as you can see i have no experience with solr cloud :)
>> 
>> Regards
>> Per
>> 
>> 
>> 
>> 



Re: Specific cores/collections to specific nodes

2017-12-11 Thread Erick Erickson
Take a look at the Collections API CREATE command, especially
"createNodeSet". One variant lets you specify the nodes used to
distribute the collection, with limited control over what core goes
where through the createNodeSet.shuffle parameter.

Alternatively you can use "EMPTY" for the createNodeSet and Solr will
create the skeleton of the collection in ZK but not create any actual
replicas. You can then use ADDREPLICA to place each and every replica
exactly where you want it with the "node" parameter of ADDREPLICA.

In both cases, the "node" is the string you see under the "live_nodes" znode.
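A sketch of both variants with the Collections API (collection names, host names, and shard counts here are placeholders; the real node strings come from live_nodes):

```shell
# Variant 1: restrict the collection to a fixed set of nodes at creation.
curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=read_heavy&numShards=2&replicationFactor=1&createNodeSet=host1:8983_solr,host2:8983_solr"

# Variant 2: create an empty skeleton, then place each replica explicitly.
curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=write_heavy&numShards=1&replicationFactor=1&createNodeSet=EMPTY"
curl "http://localhost:8983/solr/admin/collections?action=ADDREPLICA&collection=write_heavy&shard=shard1&node=host3:8983_solr"
```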

Best,
Erick

On Mon, Dec 11, 2017 at 3:08 AM, Leo Prince
 wrote:
> Hi,
>
> I just wanted to dedicate specific Solr nodes to specific collections. As
> per my understanding, when a collection is created in Solr 7, its cores are
> automatically created in the backend. Since my cores are getting different
> volumes of traffic, I wanted to dedicate specific collections to specific
> nodes. One core is read-intensive with few writes and the other is
> write-intensive with few reads, so I don't want to mix their IO together.
> Earlier, in Solr 4, we were able to hard-code this in Tomcat's solr.xml;
> similarly, is there any way we can do this in Solr 7?
>
> Thanks in advance.
> Leo Prince.


Re: Solrcloud Drupal 7

2017-12-11 Thread Fengtan
One way to provide Solr with the config files is to upload them to
ZooKeeper
https://lucene.apache.org/solr/guide/6_6/using-zookeeper-to-manage-configuration-files.html#UsingZooKeepertoManageConfigurationFiles-UploadingConfigurationFilesusingbin_solrorSolrJ

This can be achieved by copying the config files from
search_api_solr/solr-conf/x.x to your Solr server and then running the
'bin/solr zk upconfig' utility to upload them into a new configset. Once
you have done that, you can create a new collection with that configset.

Not sure if the zk upconfig utility is available in Solr 5 though, so you
may want to have a look at the reference guide for Solr 5:
https://archive.apache.org/dist/lucene/solr/ref-guide/
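The steps above can be sketched as follows (ZooKeeper hosts, configset name, and collection name are placeholders; note that older Solr 5.x releases may only ship the zkcli.sh script rather than 'bin/solr zk'):

```shell
# Upload the Drupal-provided config files as a configset in ZooKeeper.
bin/solr zk upconfig -z zk1:2181,zk2:2181,zk3:2181 \
    -n drupal_conf -d search_api_solr/solr-conf/5.x

# Create a collection that references the uploaded configset.
curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=drupal&numShards=1&replicationFactor=3&collection.configName=drupal_conf"
```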



On Mon, Dec 11, 2017 at 11:06 AM, Per Qvindesland  wrote:

> Hi
>
> Thanks for the information.
>
> Once the collection is created, how do you deploy files such as the
> schema.xml to the folder locations? I have used rsync -av
> search_api_solr/solr-conf/5.x /var/solr/data/search_shard2_replica1/ on
> one instance, but I can’t see the files being replicated between the
> instances.
>
> Regards
> Per
>
>
>
>
> > On 11 Dec 2017, at 14:57, Fengtan  wrote:
> >
> > Hi,
> >
> > Drupal can talk to a SolrCloud cluster the same way it talks to a
> > standalone Solr server, i.e. by using the Search API suite
> > https://www.drupal.org/project/search_api
> >
> > You will have to create the collection yourself on Solr though (Drupal
> will
> > not do it for you).
> >
> > If you want to take advantage of SolrCloud's fault tolerance capabilities
> > then you may have to:
> > * either use an external load balancer in front of your SolrCloud cluster
> > * or implement a smart client in Drupal -- I opened this ticket some time
> > ago https://www.drupal.org/project/search_api_solr/issues/2858645
> >
> >
> >
> > On Mon, Dec 11, 2017 at 7:48 AM, Per Qvindesland  wrote:
> >
> >> Hi All
> >>
> >> I have a solr cloud with zookeeper 3.4.10 and solr 5.4.1, running on 3
> >> centos 7 instances in AWS. I intend to use the cluster for Drupal 7 search.
> >>
> >> At the moment we have 2 instances running in production with each having
> >> their own solr installation (no cloud) but I would like to improve the
> >> redundancy and maybe even the performance, I followed
> >> http://www.francelabs.com/blog/tutorial-deploying-
> >> solrcloud-6-on-amazon-ec2/ to get the cloud up and running but I need
> >> some guidance on how to add in a new shard/collection so I can point the
> >> drupal instances to the new solr cloud, does anyone have any
> information on
> >> how to do this? as you can see i have no experience with solr cloud :)
> >>
> >> Regards
> >> Per
> >>
> >>
> >>
> >>
>
>


Solr ssl issue while creating collection

2017-12-11 Thread Sundaram, Dinesh
Hi,


How do I change the protocol to https everywhere, including replicas?
NOTE: I have only one node, on port 8983. I started Solr using this command:
bin/solr start -cloud -p 8983 -noprompt

1. Configure SSL using
https://lucene.apache.org/solr/guide/7_1/enabling-ssl.html
2. Restart Solr
3. Validate Solr with the https URL https://localhost:8983/solr - works fine
4. Create a collection at https://localhost:8983/solr/#/~collections
5. Here is the response:
Connection to Solr lost
Please check the Solr instance.
6. In the server solr.log, notice that the replica call goes to the http
port instead of https:

2017-12-11 11:52:27.929 ERROR (OverseerThreadFactory-8-thread-1-processing-n:localhost:8983_solr) [ ] o.a.s.c.OverseerCollectionMessageHandler Error from shard: http://localhost:8983/solr
org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: http://localhost:8983/solr
at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:640)
at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:253)
at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:242)
at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1219)
at org.apache.solr.handler.component.HttpShardHandler.lambda$submit$0(HttpShardHandler.java:172)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.http.client.ClientProtocolException
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:187)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:525)
... 12 more
Caused by: org.apache.http.ProtocolException: The server failed to respond with a valid HTTP response
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:149)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:165)
at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
at org.apache.solr.util.stats.InstrumentedHttpRequestExecutor.execute(InstrumentedHttpRequestExecutor.java:118)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
... 15 more
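For reference, internal (replica-to-replica) requests follow the urlScheme cluster property stored in ZooKeeper, so after enabling SSL that property usually has to be set to https as well. A sketch of that step from the enabling-ssl guide (adjust the ZooKeeper connect string to your setup):

```shell
# Tell the cluster that nodes should be addressed via https.
server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:9983 \
    -cmd clusterprop -name urlScheme -val https

# Restart so the node re-registers in live_nodes with the https scheme.
bin/solr restart -cloud -p 8983
```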


Dinesh Sundaram
MBS Platform Engineering

Mastercard

CONFIDENTIALITY NOTICE This e-mail message and any attachments are only for the 
use of the intended recipient and may contain information that is privileged, 
confidential or exempt from disclosure under applicable law. If you are not the 
intended recipient, any disclosure, distribution or other use of this e-mail 
message or attachments is prohibited. If you have received this e-mail message 
in error, please delete and notify the sender immediately. Thank you.



Incremental Indexing on Solr 6.5.

2017-12-11 Thread Fiz Newyorker
Hi Team,

I am working on Solr 6.5 and indexing data from MongoDB 3.2.5. I want to
know the best practices for implementing incremental indexing.

For example: every 30 minutes, the updated data in MongoDB needs to be
indexed in Solr. How do I implement this? How would Solr know whenever there
is an update in MongoDB? Indexing should run automatically.

Right now I am doing the indexing manually.


Thanks
Fiz Ahmed


RE: FW: Need Help Configuring Solr To Work With Nutch

2017-12-11 Thread Mukhopadhyay, Aratrika
Thank you Erick Erickson and Rick Leir. My issue was permission-related: the
solr user was not running the indexing job through Nutch and therefore was
unable to write anything to Solr. I changed the ownership of Nutch's runtime
directory to solr and all is well and working. I thank you for your help.
Your tip about numDocs put me on the right track.

Aratrika 

-Original Message-
From: Rick Leir [mailto:rl...@leirtech.com] 
Sent: Saturday, December 09, 2017 10:25 AM
To: solr-user@lucene.apache.org
Subject: RE: FW: Need Help Configuring Solr To Work With Nutch

Ara
The config for soft commit would not be in schema.xml, please look in 
solrconfig.xml.

Look in solr.log for evidence of commits occurring. Explore the SolrAdmin 
console, what are the document counts?

You can post snippets from your config files here.
Cheers --Rick


On December 8, 2017 4:23:00 PM EST, "Mukhopadhyay, Aratrika" 
 wrote:
>Rick,
>Thanks for your reply. I do not see any errors or exceptions in the
>solr logs. I have read that my schema in nutch needs to match the
>schema in solr. When I change the schema in the config directory and
>restart solr, my changes are lost. Leaving the schema alone is the only
>way I can get the indexing job to run, but I can't query for the data in
>solr. Would you like me to send you specific configuration files? I
>can't seem to get this to work.
>
>Kind regards,
>Aratrika Mukhopadhyay
>
>-Original Message-
>From: Rick Leir [mailto:rl...@leirtech.com]
>Sent: Friday, December 08, 2017 4:06 PM
>To: solr-user@lucene.apache.org
>Subject: Re: FW: Need Help Configuring Solr To Work With Nutch
>
>Ara
>Softcommit might be the default in Solrconfig.xml, and if not then you 
>should probably make it so. Then you need to have a look in solr.log if 
>things are not working as you expect.
>Cheers -- Rick
>
>On December 8, 2017 3:23:35 PM EST, "Mukhopadhyay, Aratrika"
> wrote:
>>Erick,
>>Do I need to set softCommit = true and prepareCommit to true in my
>>solrconfig? I am still at a loss as to what is happening. Thanks again
>>for your help.
>>
>>Aratrika
>>
>>From: Mukhopadhyay, Aratrika
>>Sent: Friday, December 08, 2017 11:34 AM
>>To: solr-user 
>>Subject: RE: Need Help Configuring Solr To Work With Nutch
>>
>>
>>Hello Erick ,
>>
>>   This is what I see in the logs :
>>
>>[screenshot: Solr log output (image not included)]
>>
>>
>>
>>I am sorry, it's been a while since I worked with Solr. I did not do
>>anything to specifically commit the changes to the core. Thanks for
>>your prompt attention to this matter.
>>
>>
>>
>>Aratrika Mukhopadhyay
>>
>>
>>
>>-Original Message-
>>From: Erick Erickson [mailto:erickerick...@gmail.com]
>>Sent: Friday, December 08, 2017 11:06 AM
>>To: solr-user
>>mailto:solr-user@lucene.apache.org>>
>>Subject: Re: Need Help Configuring Solr To Work With Nutch
>>
>>
>>
>>1> do you see update messages in the Solr logs?
>>
>>2> did you issue a commit?
>>
>>
>>
>>Best,
>>
>>Erick
>>
>>
>>
>>On Fri, Dec 8, 2017 at 7:27 AM, Mukhopadhyay, Aratrika <
>>aratrika.mukhopadh...@mail.house.gov> wrote:
>>
>>> Good Morning,
>>>
>>> I am running nutch 2.3, hbase 0.98 and I am integrating
>>> nutch with solr 6.4. I have a successful crawl in nutch and I see
>>> that it is indexing the content into solr. However I cannot query and
>>> get any results.
>>> Its as if Nutch isn’t writing anything to solr at all. I am stuck and
>>> need someone who is familiar with solr/nutch to provide assistance.
>>> Can someone please help ?
>>>
>>> This is what I see when I index into solr. I see no errors.
>>>
>>> Regards,
>>>
>>> Aratrika Mukhopadhyay
>
>--
>Sorry for being brief. Alternate email is rickleir at yahoo dot com

--
Sorry for being brief. Alternate email is rickleir at yahoo dot com 


Solr suggester build takes a very long time

2017-12-11 Thread ruby
When the following command is run to build the Solr suggester:
?suggest.build=true

it takes a very long time to finish. I found out that this is because each
time the dictionary is built, it does not build a delta; it rebuilds the
entire dictionary.

Is there a way to speed up the suggester build time?

TIA




--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: solr.TrieDoubleField deprecated with 7.1.0 but wildcard "*" search behaviour is different with solr.DoublePointField

2017-12-11 Thread Chris Hostetter

AFAICT the behavior you're describing with Trie fields was never
intentionally supported/documented.

It appears that it only worked as a fluke side effect of how the default
implementation of FieldType.getPrefixQuery() was inherited by Trie fields
*and* because "indexed=true" TrieFields use Terms (just like StrField) ...
so a prefix of "" (the empty string) matched all of the Trie terms in a
field.

(note that the syntax you're describing does *not* work for Trie fields 
that are "indexed=false docValues=true")

In general, there seems to be a bit of a mess in terms of trying to
specify "prefix queries" (which is what "foo_d:*" really is under the
covers) or "wildcard" queries against numeric fields. I created a JIRA to
try and come to a consensus about how this should behave moving forward...

https://issues.apache.org/jira/browse/SOLR-11746

...but I would suggest you move away from depending on that syntax and use
the officially supported/documented range query syntax (foo_d:[* TO *])
instead.
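The two syntaxes side by side, as curl requests (core name "mycore" is a placeholder; the brackets and spaces are URL-encoded):

```shell
# Prefix-style query: matched Trie fields by accident, fails on Point fields.
curl "http://localhost:8983/solr/mycore/select?q=test_d:*"

# Supported existence check, i.e. test_d:[* TO *] - works for both families.
curl "http://localhost:8983/solr/mycore/select?q=test_d:%5B*%20TO%20*%5D"
```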




: some question about the new DoublePointField which should be used
: instead of the TrieDoubleField in 7.1.
...
: If i am using the deprecated one its possible to get a match for a
: double field like this:
: 
: test_d:*
: 
: even in 7.1.0.
: 
: But with the new DoublePointField, which you should use instead, you
: won't get that match - you have to use e.g. [* TO *].

: Is this an intended change in runtime / query behaviour or some bug or
: is it possible to restore that behaviour with the new field too?




-Hoss
http://www.lucidworks.com/


How to implement Incremental Indexing.

2017-12-11 Thread Fiz Newyorker
Hello Solr Group Team,

I am working on Solr 6.5 and indexing data from MongoDB 3.2.5. I want to
know the best practices to implement incremental indexing.

Every 30 minutes, the updated data in MongoDB needs to be indexed in Solr.
How do I implement this? How would Solr know whenever there is an update in
MongoDB? Indexing should run automatically. Should I set up any cron jobs?

 Please let me know how to proceed further on the above requirement.

Right now I am doing indexing manually.
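For reference, one common low-tech pattern (assuming a DataImportHandler delta-import keyed on a last-modified timestamp in MongoDB; the handler path and core name here are hypothetical) is a cron entry that hits the handler every 30 minutes:

```shell
# /etc/cron.d/solr-delta: trigger a DIH delta-import every 30 minutes.
*/30 * * * * solr curl -s "http://localhost:8983/solr/mycore/dataimport?command=delta-import&commit=true" > /dev/null
```

The alternative is a push model: have the application write to both MongoDB and Solr, or tail MongoDB's oplog and index changes as they happen.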

Thanks
Fiz Ahmed


Re: Recovery Issue - Solr 6.6.1 and HDFS

2017-12-11 Thread Joe Obernberger
Thank you Erick.  Perhaps it makes more sense not to use any replicas 
when using HDFS for storage (and having a very large index) since the data is 
already replicated.  It seems to me that if there were no replicas and 
a leader went down, another node could take over by just going 
through the regular startup cycle (replaying logs etc.), similar to the 
auto-add-replicas capability.  Not sure how one would handle a node 
coming back.


I think there could be a lot to be gained by taking advantage of a 
global file system with Solr.  Would be fun!


-Joe


On 12/9/2017 10:26 PM, Erick Erickson wrote:

The complications are things like this:

Say an update comes in and gets written to the tlog and indexed but
not committed. Now the leader goes down. How does the replica that
takes over leadership
1> understand the current state of the index, i.e. that there are
uncommitted updates
2> replay the updates from the tlog correctly?

Not to mention that during leader election one of the read-only
replicas must become a read/write replica when it takes over
leadership.

The current mechanism does, indeed, use Zk to elect a new leader, the
devil is in the details of how in-flight updates get handled properly.

There's no a-priori reason all those details couldn't be worked out,
it's just gnarly. Nobody has yet stepped up to commit the
time/resources to work them all out. My guess is that the cost of
having a bunch more disks is cheaper than the engineering time it
would take to changes this. The standard answer is "patches welcome"
;).

Best,
Erick

On Sat, Dec 9, 2017 at 1:02 PM, Hendrik Haddorp  wrote:

Ok, thanks for the answer. The leader election and update notification sound
like they should work using ZooKeeper (leader election recipe and a normal
watch) but I guess there are some details that make things more complicated.

On 09.12.2017 20:19, Erick Erickson wrote:

This has been bandied about on a number of occasions, it boils down to
nobody has stepped up to make it happen. It turns out there are a
number of tricky issues:


how does leadership change if the leader goes down?
the raw complexity of getting it right. Getting it wrong corrupts indexes
how do you resolve leadership in the first place so only the leader
writes to the index?
how would that affect performance if N replicas were autowarming at the
same time, thus reading from HDFS?
how do the read-only replicas know to open a new searcher?
I'm sure there are a bunch more.

So this is one of those things that everyone agrees is interesting,
but nobody is willing to code and it's not actually clear that it
makes sense in the Solr context. It'd be a pity to put in all the work
then discover that the performance issues prohibited using it.

If you _guarantee_ that the index doesn't change, there's the
NoLockFactory you could specify. That would allow you to share a
common index, woe be unto you if you start updating the index though.

Best,
Erick

On Sat, Dec 9, 2017 at 4:46 AM, Hendrik Haddorp 
wrote:

Hi,

for the HDFS case wouldn't it be nice if there was a mode in which the
replicas just read the same index files as the leader? I mean after all
the
data is already on a shared readable file system so why would one even
need
to replicate the transaction log files?

regards,
Hendrik


On 08.12.2017 21:07, Erick Erickson wrote:

bq: Will TLOG replicas use less network bandwidth?

No, probably more bandwidth. TLOG replicas work like this:
1> the raw docs are forwarded
2> the old-style master/slave replication is used

So what you do save is CPU processing on the TLOG replica in exchange
for increased bandwidth.

Since the only thing forwarded in NRT replicas (outside of recovery)
is the raw documents, I expect that TLOG replicas would _increase_
network usage. The deal is that TLOG replicas can take over leadership
if the leader goes down so they must have an
up-to-date-after-last-index-sync set of tlogs.

At least that's my current understanding...

Best,
Erick
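For what it's worth, on Solr 7.x the replica mix is chosen at collection-creation time via the Collections API. A rough sketch of building such a CREATE call (the collection name, config name, host, and the choice of counts are all illustrative assumptions, not a recommendation from the thread):

```python
from urllib.parse import urlencode

# CREATE accepts nrtReplicas / tlogReplicas / pullReplicas counts in 7.x.
params = {
    "action": "CREATE",
    "name": "bigindex",
    "numShards": 4,
    "nrtReplicas": 0,     # no NRT copies
    "tlogReplicas": 2,    # leader-eligible copies kept current via tlog
    "collection.configName": "bigindex_conf",
}
url = "http://localhost:8983/solr/admin/collections?" + urlencode(params)
print(url)
```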

On Fri, Dec 8, 2017 at 12:01 PM, Joe Obernberger
 wrote:

Anyone have any thoughts on this?  Will TLOG replicas use less network
bandwidth?

-Joe


On 12/4/2017 12:54 PM, Joe Obernberger wrote:

Hi All - this same problem happened again, and I think I partially
understand what is going on.  The part I don't know is what caused any
of
the replicas to go into full recovery in the first place, but once
they
do,
they cause network interfaces on servers to go fully utilized in both
in/out
directions.  It appears that when a solr replica needs to recover, it
calls
on the leader for all the data.  In HDFS, the data from the leader's
point
of view goes:

HDFS --> Solr Leader Process --> Network --> Replica Solr Process --> HDFS

Do I have this correct?  That poor network in the middle becomes a
bottleneck and causes other replicas to go into recovery, which causes
more
network traffic.  Perhaps going to TLOG replicas with 7.1 would be
better
with HDFS?  Would it be possible for the leader to send a mes

Re: How to implement Incremental Indexing.

2017-12-11 Thread Rick Leir
Fiz
Here is a blog article that seems to cover your plans
https://www.toadworld.com/platforms/nosql/b/weblog/archive/2017/02/03/indexing-mongodb-data-in-apache-solr

Also look at github, there are several projects which could do it for you.
Cheers -- Rick
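A minimal sketch of the timestamp-checkpoint idea behind most of those projects: each pass picks up only documents modified since the last run, then persists the new checkpoint. The "updated_at" field name and the toy data are made up for illustration:

```python
from datetime import datetime, timezone

def select_delta(docs, last_run):
    """Return (docs_to_index, new_checkpoint) for a timestamp-based pass."""
    changed = [d for d in docs if d["updated_at"] > last_run]
    new_checkpoint = max((d["updated_at"] for d in changed), default=last_run)
    return changed, new_checkpoint

# Toy data standing in for a MongoDB query on updated_at
t = lambda h: datetime(2017, 12, 11, h, tzinfo=timezone.utc)
docs = [{"id": 1, "updated_at": t(9)}, {"id": 2, "updated_at": t(12)}]
changed, ckpt = select_delta(docs, last_run=t(10))
print([d["id"] for d in changed])  # only the doc updated after the checkpoint
```

A cron job every 30 minutes would run one such pass and push `changed` to Solr before saving `ckpt`.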

On December 11, 2017 5:19:43 PM EST, Fiz Newyorker  wrote:
>Hello Solr Group Team,
>
>I am working on Solr 6.5 and indexing data from MongoDB 3.2.5. I want
>to
>know the best practices to implement incremental indexing.
>
>Every 30 mins the Updated Data in Mongo DB needs to indexed on Solr.
>How to
>implement this. ? How would Solr know whenever there is an update on
>Mongodb ?  Indexing should run automatically. Should I setup any crone
>Jobs
>?
>
> Please let me know how to proceed further on the above requirement.
>
>Right now I am doing indexing manually.
>
>Thanks
>Fiz Ahmed

-- 
Sorry for being brief. Alternate email is rickleir at yahoo dot com 

Re: Solr ssl issue while creating collection

2017-12-11 Thread Shawn Heisey
On 12/11/2017 12:24 PM, Sundaram, Dinesh wrote:
> 1. Configure SSL
> using https://lucene.apache.org/solr/guide/7_1/enabling-ssl.html
>
> 2. Restart solr 
> 3. Validate solr with https url https://localhost:8983/solr - works fine
> 4. Create a collection https://localhost:8983/solr/#/~collections
> 
> 5. here is the response : 
> Connection to Solr lost 
> Please check the Solr instance.
> 6.Server solr.log: here notice the replica call goes to http port
> instead of https
>
> 2017-12-11 11:52:27.929 ERROR
> (OverseerThreadFactory-8-thread-1-processing-n:localhost:8983_solr) [
> ] o.a.s.c.OverseerCollectionMessageHandler Error from
> shard: http://localhost:8983/solr
>

This acts like either you did not set the urlScheme cluster property in
zookeeper to https, or that you did not restart your Solr instances
after making that change.  Setting the property is described on the page
you referenced in the "SSL with SolrCloud" section.
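For reference, that cluster property can also be set through the Collections API CLUSTERPROP action; this sketch just builds the call (host name illustrative, and Solr still needs a restart afterwards):

```python
from urllib.parse import urlencode

# Set urlScheme=https so nodes register with https URLs in ZooKeeper.
params = {"action": "CLUSTERPROP", "name": "urlScheme", "val": "https"}
url = "https://localhost:8983/solr/admin/collections?" + urlencode(params)
print(url)
```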

Note that it also appears your Solr instances have registered themselves
with the "localhost" name instead of an actual IP address or a "real"
hostname.  This is going to be a problem if you ever run more than one
Solr machine in your cloud, or if you use a smart client (like
CloudSolrClient included with SolrJ) and access Solr from a different
machine.

Thanks,
Shawn



Solr Aggregation queries are way slower than Elastic Search

2017-12-11 Thread RAUNAK AGRAWAL
Hi,

We have a use case where there are 4-5 dimensions and around 3500 metrics
in a single document. We have indexed only 2 dimensions and made all the
metrics as doc_values so that we can run the aggregation queries.

We have 6 million such documents and we are using SolrCloud (6.6) on a 6
node cluster with 8 vCores and 24 GB RAM each.

On the same set of hardware in elastic search we were getting the response
in about 10ms whereas in solr we are getting response in around 300-400 ms.

This is how I am querying the data.

// Note: the generic type parameters were stripped in transit; the element
// types below are a best guess from how the lists are used.
private SolrQuery buildQuery(Integer variable1, List<Integer> groups,
                             List<String> metrics) {
    SolrQuery query = new SolrQuery();
    String groupQuery = " (" + groups.stream().map(g -> "group:" + g)
        .collect(Collectors.joining(" OR ")) + ")";
    String finalQuery = "variable1:" + variable1 + " AND " + groupQuery;
    query.set("q", finalQuery);
    query.setRows(0);  // aggregations only, no document rows
    // Ask the stats component for a sum over each doc-values metric
    metrics.forEach(
        metric -> query.setGetFieldStatistics("{!sum=true }" + metric)
    );
    return query;
}

Any help will be appreciated regarding this.


Thanks,

Raunak


Re: Solr Aggregation queries are way slower than Elastic Search

2017-12-11 Thread Yonik Seeley
I think the SolrJ below uses the old stats component.
Hopefully the JSON Facet API would be faster for this, but it's not
completely clear what the main query here looks like, and if it's the
source of any bottleneck rather than the aggregations.
What does the generated query string actually look like (it may be
easiest just to pull this from the logs).

-Yonik
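For anyone following along, a rough sketch of what an equivalent JSON Facet request body could look like (field names mirror the thread's description and are assumptions):

```python
import json

metrics = ["metric_0", "metric_1"]
groups = [1, 2]

# JSON request body: same filter as the SolrJ code, but sums computed
# through the JSON Facet API instead of the stats component.
body = {
    "query": "variable1:42 AND (" +
             " OR ".join("group:%d" % g for g in groups) + ")",
    "limit": 0,
    "facet": {m: "sum(%s)" % m for m in metrics},
}
payload = json.dumps(body)
print(payload)
```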


On Mon, Dec 11, 2017 at 7:32 PM, RAUNAK AGRAWAL
 wrote:
> Hi,
>
> We have a use case where there are 4-5 dimensions and around 3500 metrics
> in a single document. We have indexed only 2 dimensions and made all the
> metrics as doc_values so that we can run the aggregation queries.
>
> We have 6 million such documents and we are using solr cloud(6.6) on a 6
> node cluster with 8 Vcores and 24 GB RAM each.
>
> On the same set of hardware in elastic search we were getting the response
> in about 10ms whereas in solr we are getting response in around 300-400 ms.
>
> This is how I am querying the data.
>
> private SolrQuery buildQuery(Integer variable1, List groups,
> List metrics) {
> SolrQuery query = new SolrQuery();
> String groupQuery = " (" + groups.stream().map(g -> "group:" + g).collect
> (Collectors.joining(" OR ")) + ")";
> String finalQuery = "variable1:" + variable1 + " AND " + groupQuery;
> query.set("q", finalQuery);
> query.setRows(0);
> metrics.forEach(
> metric -> query.setGetFieldStatistics("{!sum=true }" + metric)
> );
> return query;
> }
>
> Any help will be appreciated regarding this.
>
>
> Thanks,
>
> Raunak


Re: Solr Aggregation queries are way slower than Elastic Search

2017-12-11 Thread RAUNAK AGRAWAL
Hi Yonik,

I will try the JSON Facet API and update here, but my hunch is that the querying
mechanism is not the problem; rather, the problem lies with the Solr
aggregations.

Thanks

On Tue, Dec 12, 2017 at 6:31 AM, Yonik Seeley  wrote:

> I think the SolrJ below uses the old stats component.
> Hopefully the JSON Facet API would be faster for this, but it's not
> completely clear what the main query here looks like, and if it's the
> source of any bottleneck rather than the aggregations.
> What does the generated query string actually look like (it may be
> easiest just to pull this from the logs).
>
> -Yonik
>
>
> On Mon, Dec 11, 2017 at 7:32 PM, RAUNAK AGRAWAL
>  wrote:
> > Hi,
> >
> > We have a use case where there are 4-5 dimensions and around 3500 metrics
> > in a single document. We have indexed only 2 dimensions and made all the
> > metrics as doc_values so that we can run the aggregation queries.
> >
> > We have 6 million such documents and we are using solr cloud(6.6) on a 6
> > node cluster with 8 Vcores and 24 GB RAM each.
> >
> > On the same set of hardware in elastic search we were getting the
> response
> > in about 10ms whereas in solr we are getting response in around 300-400
> ms.
> >
> > This is how I am querying the data.
> >
> > private SolrQuery buildQuery(Integer variable1, List groups,
> > List metrics) {
> > SolrQuery query = new SolrQuery();
> > String groupQuery = " (" + groups.stream().map(g -> "group:" +
> g).collect
> > (Collectors.joining(" OR ")) + ")";
> > String finalQuery = "variable1:" + variable1 + " AND " + groupQuery;
> > query.set("q", finalQuery);
> > query.setRows(0);
> > metrics.forEach(
> > metric -> query.setGetFieldStatistics("{!sum=true }" +
> metric)
> > );
> > return query;
> > }
> >
> > Any help will be appreciated regarding this.
> >
> >
> > Thanks,
> >
> > Raunak
>