Using SolrCloud 6 and the SolrJ client with HAProxy

2016-09-01 Thread Piyush Kunal
We are running a SolrCloud 6.1 cluster with ZooKeeper.
We have 6 nodes running in the cluster.
If I use the SolrJ client with ZooKeeper, it will round-robin across all the
servers and distribute equal load across them.

But I want to give priority to some nodes (with better configuration) so they
take more load.
Previously we used an HAProxy in front of all the nodes, which we could easily
configure to put higher load on some nodes.
But if ZooKeeper is doing the load balancing, is there some way we can
give more load to some nodes (by using HAProxy or something else)?
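
For illustration, node weighting of this kind in haproxy.cfg looks something
like the following (backend name and addresses are hypothetical):

backend solr_nodes
    balance roundrobin
    # higher weight = proportionally more of the traffic
    server solr1 10.0.0.1:8983 weight 30
    server solr2 10.0.0.2:8983 weight 30
    server solr3 10.0.0.3:8983 weight 10
    server solr4 10.0.0.4:8983 weight 10
    server solr5 10.0.0.5:8983 weight 10
    server solr6 10.0.0.6:8983 weight 10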



Can we get Field List (fl) along with suggestions in Solr?

2016-09-01 Thread Pradeep Chandra
Hi,

I am getting suggestions from Solr using the /suggest handler. I also want
some fields returned along with the suggestions. How can I get that? Any
suggestions? Thanks in advance.
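
For context, a typical suggest request looks like this (host, collection and
dictionary names are hypothetical):

http://localhost:8983/solr/mycollection/suggest?suggest=true&suggest.dictionary=mySuggester&suggest.q=mem

Out of the box the suggester response carries only the suggested terms and
their weights, plus an optional payload if the suggester is configured with a
payloadField.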

Thanks and Regards
M Pradeep Chandra


Re: Multiple rollups/facets in one streaming aggregation?

2016-09-01 Thread subramani.new
Hello,

  I am exploring Solr and its new features (the Parallel SQL Interface and
the Streaming API).

I have tried most of the APIs and they work fine, but I am facing an issue
with multivalued fields.

My JSON input has multivalued fields. I am trying to aggregate on those
fields but I am unable to.

Exception :
can not sort on multivalued field

My use case:

input:
{
  "id": 1,
  "field1": [1, 2, 3],
  "app.name": ["watsapp", "facebook", ... ]
}
{
  "id": 2,
  "field1": [1, 2, 3],
  "app.name": ["watsapp", "facebook", ... ]
}

Expected result:
watsapp: 2
facebook: 2

I have 2 TB of data and want to execute this with aggregationMode=map_reduce.
Any suggestions?







Re: Multiple rollups/facets in one streaming aggregation?

2016-09-01 Thread Alessandro Benedetti
It is not possible to sort on multivalued fields out of the box (indeed, if
you don't specify any logic, which value should be considered as the
reference for sorting?).
Are you trying to sort, or does this happen by default?
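
For context, a rough sketch of the rollup that a map_reduce aggregation
compiles to (collection name hypothetical) -- the inner sort on app.name is
what throws for a multivalued field:

rollup(
  search(collection1,
         q="*:*",
         fl="app.name",
         sort="app.name asc",
         qt="/export"),
  over="app.name",
  count(*))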

Cheers

On Thu, Sep 1, 2016 at 10:18 AM, subramani.new 
wrote:

> Hello,
>
>   I am exploring Solr and its new features (the Parallel SQL Interface and
> the Streaming API).
>
> I have tried most of the APIs and they work fine, but I am facing an issue
> with multivalued fields.
>
> My JSON input has multivalued fields. I am trying to aggregate on those
> fields but I am unable to.
>
> Exception :
> can not sort on multivalued field
>
> My use case:
>
> input:
> {
>   "id": 1,
>   "field1": [1, 2, 3],
>   "app.name": ["watsapp", "facebook", ... ]
> }
> {
>   "id": 2,
>   "field1": [1, 2, 3],
>   "app.name": ["watsapp", "facebook", ... ]
> }
>
> Expected result:
> watsapp: 2
> facebook: 2
>
> I have 2 TB of data and want to execute this with
> aggregationMode=map_reduce. Any suggestions?
>
>
>
>
>
>



-- 
--

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England


Re: Default Field Cache

2016-09-01 Thread Alessandro Benedetti
Are you looking for this?

org/apache/solr/core/SolrConfig.java:243

CacheConfig conf = CacheConfig.getConfig(this, "query/fieldValueCache");
if (conf == null) {
  Map<String, String> args = new HashMap<>();
  args.put(NAME, "fieldValueCache");
  args.put("size", "10000");
  args.put("initialSize", "10");
  args.put("showItems", "-1");
  conf = new CacheConfig(FastLRUCache.class, args, null);
}
fieldValueCacheConfig = conf;
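
For reference, those defaults are roughly equivalent to configuring the
following in solrconfig.xml (a sketch; the cache is created implicitly when
the element is absent):

<fieldValueCache class="solr.FastLRUCache"
                 size="10000"
                 initialSize="10"
                 showItems="-1" />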


Cheers


On Thu, Sep 1, 2016 at 2:41 AM, Rallavagu  wrote:

> But the configuration is commented out (disabled). As the comments section
> says,
>
> "The fieldValueCache is created by default even if not configured here"
>
> I would like to know the configuration of the default
> fieldValueCache that gets created.
>
>
> On 8/31/16 6:37 PM, Zheng Lin Edwin Yeo wrote:
>
>> If I didn't get your question wrong, what you have listed is already the
>> default configuration that comes with your version of Solr.
>>
>> Regards,
>> Edwin
>>
>> On 30 August 2016 at 07:49, Rallavagu  wrote:
>>
>> Solr 5.4.1
>>>
>>> 
>>> 
>>>
>>> Wondering what is the default configuration for "fieldValueCache".
>>>
>>>
>>


-- 
--

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England


Streaming expressions, delete

2016-09-01 Thread Markus Jelsma
Hi,

I've read up on the streaming expressions on the cwiki. The update decorator is
ideal for quick imports from various sources such as JDBC, and combined with
daemon it can be used for periodic delta imports too, which is very nice. But
besides the update function, I would also expect a delete streaming function.
If items, for example, are flagged as deleted in a JDBC source, it should be
possible to select those IDs and map them to Solr document IDs.

Did I miss something?
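
For context, the update-plus-daemon pattern described above looks roughly
like this (collection name, schedule, and connection details hypothetical):

daemon(id="delta-import",
       runInterval="60000",
       update(mycollection,
              batchSize=250,
              jdbc(connection="jdbc:mysql://localhost/db?user=u&password=p",
                   sql="SELECT id, title FROM docs ORDER BY id",
                   sort="id asc",
                   driver="com.mysql.jdbc.Driver")))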

Thanks,
Markus


Replication Index fetch failed

2016-09-01 Thread Arkadi Colson

Hi

Replication seems to be stuck in an endless loop. Does anybody have any idea?
See below for logs.

If you need more info, just let me know...

INFO  - 2016-09-01 14:30:42.563; [c:lvs s:shard1 r:core_node10 
x:lvs_shard1_replica1] org.apache.solr.core.SolrDeletionPolicy; 
SolrDeletionPolicy.onCommit: commits: num=2
commit{dir=NRTCachingDirectory(MMapDirectory@/var/solr/data/lvs_shard1_replica1/data/index.20160901140036922 
lockFactory=org.apache.lucene.store.NativeFSLockFactory@59509f2; 
maxCacheMB=48.0 maxMergeSizeMB=4.0),segFN=segments_vpo,generation=41100}
commit{dir=NRTCachingDirectory(MMapDirectory@/var/solr/data/lvs_shard1_replica1/data/index.20160901140036922 
lockFactory=org.apache.lucene.store.NativeFSLockFactory@59509f2; 
maxCacheMB=48.0 maxMergeSizeMB=4.0),segFN=segments_vpp,generation=41101}
INFO  - 2016-09-01 14:30:42.563; [c:lvs s:shard1 r:core_node10 
x:lvs_shard1_replica1] org.apache.solr.core.SolrDeletionPolicy; newest 
commit generation = 41101
INFO  - 2016-09-01 14:30:42.565; [c:lvs s:shard1 r:core_node10 
x:lvs_shard1_replica1] org.apache.solr.update.DirectUpdateHandler2; 
end_commit_flush
INFO  - 2016-09-01 14:30:42.565; [c:lvs s:shard1 r:core_node10 
x:lvs_shard1_replica1] org.apache.solr.update.DirectUpdateHandler2; 
start 
commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
INFO  - 2016-09-01 14:30:42.603; [c:lvs s:shard1 r:core_node10 
x:lvs_shard1_replica1] org.apache.solr.core.SolrDeletionPolicy; 
SolrDeletionPolicy.onCommit: commits: num=2
commit{dir=NRTCachingDirectory(MMapDirectory@/var/solr/data/lvs_shard1_replica1/data/index.20160901140036922 
lockFactory=org.apache.lucene.store.NativeFSLockFactory@59509f2; 
maxCacheMB=48.0 maxMergeSizeMB=4.0),segFN=segments_vpp,generation=41101}
commit{dir=NRTCachingDirectory(MMapDirectory@/var/solr/data/lvs_shard1_replica1/data/index.20160901140036922 
lockFactory=org.apache.lucene.store.NativeFSLockFactory@59509f2; 
maxCacheMB=48.0 maxMergeSizeMB=4.0),segFN=segments_vpq,generation=41102}
INFO  - 2016-09-01 14:30:42.603; [c:lvs s:shard1 r:core_node10 
x:lvs_shard1_replica1] org.apache.solr.core.SolrDeletionPolicy; newest 
commit generation = 41102
INFO  - 2016-09-01 14:30:42.664; [c:lvs s:shard1 r:core_node10 
x:lvs_shard1_replica1] org.apache.solr.search.SolrIndexSearcher; Opening 
[Searcher@3436f207[lvs_shard1_replica1] realtime]
INFO  - 2016-09-01 14:30:42.674; [c:lvs s:shard1 r:core_node10 
x:lvs_shard1_replica1] org.apache.solr.update.DirectUpdateHandler2; 
end_commit_flush
ERROR - 2016-09-01 14:30:43.653; [c:intradesk s:shard1 r:core_node5 
x:intradesk_shard1_replica1] org.apache.solr.common.SolrException; Index 
fetch failed :org.apache.solr.common.SolrException: Unable to download 
_6f46_cj.liv completely. Downloaded 0!=5596
at 
org.apache.solr.handler.IndexFetcher$FileFetcher.cleanup(IndexFetcher.java:1554)
at 
org.apache.solr.handler.IndexFetcher$FileFetcher.fetchFile(IndexFetcher.java:1437)
at 
org.apache.solr.handler.IndexFetcher.downloadIndexFiles(IndexFetcher.java:852)
at 
org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:428)
at 
org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:251)
at 
org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:388)
at 
org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:156)
at 
org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:408)
at 
org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:221)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)

at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$22(ExecutorUtil.java:229)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

at java.lang.Thread.run(Thread.java:745)

ERROR - 2016-09-01 14:30:43.654; [c:intradesk s:shard1 r:core_node5 
x:intradesk_shard1_replica1] org.apache.solr.common.SolrException; Error 
while trying to recover:org.apache.solr.common.SolrException: 
Replication for recovery failed.
at 
org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:159)
at 
org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:408)
at 
org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:221)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)

at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$22(ExecutorUtil.java:229)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

at java.l

Re: Using SolrCloud 6 and the SolrJ client with HAProxy

2016-09-01 Thread Shawn Heisey
On 9/1/2016 2:26 AM, Piyush Kunal wrote:
> But I want to give priority to some nodes (with better configuration)
> so they take more load. Previously we used an HAProxy in front of all the
> nodes, which we could easily configure to put higher load on some nodes.
> But if ZooKeeper is doing the load balancing, is there some way we
> can give more load to some nodes (by using HAProxy or something else)?

You can use the load balancer URL with HttpSolrClient instead of using
the zookeeper information with CloudSolrClient.  At that point, you will
be dependent on haproxy to balance the load.  It probably will not react
as fast to servers going down as SolrCloud's built-in load balancing,
but it should work well.

Indexing with CloudSolrClient will be faster, because CloudSolrClient is
aware of where the shard leaders are and can send directly to them.

Using the Http client for queries and the Cloud client for indexing
might give you the best of both.
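
A minimal SolrJ sketch of that mixed approach (6.x-era constructors; hosts,
ZooKeeper ensemble, and collection name are hypothetical):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrInputDocument;

public class MixedClients {
  public static void main(String[] args) throws Exception {
    // Queries go through HAProxy, so its weighting decides which node answers.
    HttpSolrClient queryClient =
        new HttpSolrClient("http://haproxy.example.com:8983/solr/mycollection");

    // Updates go through ZooKeeper-aware routing straight to the shard leaders.
    CloudSolrClient indexClient =
        new CloudSolrClient("zk1:2181,zk2:2181,zk3:2181/solr");
    indexClient.setDefaultCollection("mycollection");

    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "1");
    indexClient.add(doc);
    indexClient.commit();

    QueryResponse rsp = queryClient.query(new SolrQuery("*:*"));
    System.out.println("hits: " + rsp.getResults().getNumFound());

    queryClient.close();
    indexClient.close();
  }
}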

Thanks,
Shawn



Re: Using SolrCloud 6 and the SolrJ client with HAProxy

2016-09-01 Thread Shawn Heisey
On 9/1/2016 6:48 AM, Shawn Heisey wrote:
> You can use the load balancer URL with HttpSolrClient instead of using
> the zookeeper information with CloudSolrClient. At that point, you
> will be dependent on haproxy to balance the load. It probably will not
> react as fast to servers going down as SolrCloud's built-in load
> balancing, but it should work well. 

I need to tell you about one possible hiccup with this approach --
SolrCloud will *still* load balance the queries across the cloud, even
without CloudSolrClient, unless you tell it not to.

There is a parameter "preferLocalShards".  You can find documentation on
this parameter here:

https://cwiki.apache.org/confluence/display/solr/Distributed+Requests

The documentation covers when it is a good or bad idea to use the
parameter.  Pairing an external load balancer with SolrCloud is the
right time to use preferLocalShards.
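
For example (host and collection hypothetical):

http://haproxy.example.com:8983/solr/mycollection/select?q=*:*&preferLocalShards=true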

Thanks,
Shawn



How using fl in query affects query time

2016-09-01 Thread kshitij tyagi
Hi,


I have around 100 fields in a single document. I want to know: if I use fl to
get only a single field from a query, will that reduce the query time?

Or will getting all the fields and getting one field using fl in the query
take the same query time?


Re: How using fl in query affects query time

2016-09-01 Thread Alexandre Rafalovitch
I believe the enableLazyFieldLoading setting is supposed to help with the
partial-fields use case. It does not reduce the query time itself, but it
speeds up re-hydrating the stored fields to return, which I guess is part of
the query time from the user's point of view.

https://cwiki.apache.org/confluence/display/solr/Query+Settings+in+SolrConfig#QuerySettingsinSolrConfig-enableLazyFieldLoading
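
The setting lives in the <query> section of solrconfig.xml:

<enableLazyFieldLoading>true</enableLazyFieldLoading>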

Regards,
   Alex.

Newsletter and resources for Solr beginners and intermediates:
http://www.solr-start.com/


On 1 September 2016 at 20:04, kshitij tyagi  wrote:
> Hi,
>
>
> I have around 100 fields in a single document. I want to know: if I use fl
> to get only a single field from a query, will that reduce the query time?
>
> Or will getting all the fields and getting one field using fl in the query
> take the same query time?


Re: How using fl in query affects query time

2016-09-01 Thread kshitij tyagi
Thanks, Alex.

On Thu, Sep 1, 2016 at 6:54 PM, Alexandre Rafalovitch 
wrote:

> I believe the enableLazyFieldLoading setting is supposed to help with the
> partial-fields use case. It does not reduce the query time itself, but it
> speeds up re-hydrating the stored fields to return, which I guess is part
> of the query time from the user's point of view.
>
> https://cwiki.apache.org/confluence/display/solr/Query+
> Settings+in+SolrConfig#QuerySettingsinSolrConfig-enableLazyFieldLoading
>
> Regards,
>Alex.
> 
> Newsletter and resources for Solr beginners and intermediates:
> http://www.solr-start.com/
>
>
> On 1 September 2016 at 20:04, kshitij tyagi 
> wrote:
> > Hi,
> >
> >
> > I have around 100 fields in a single document. I want to know: if I use
> > fl to get only a single field from a query, will that reduce the query
> > time?
> >
> > Or will getting all the fields and getting one field using fl in the
> > query take the same query time?
>


Re: Default Field Cache

2016-09-01 Thread Rallavagu

Yes. Thanks.

On 9/1/16 4:53 AM, Alessandro Benedetti wrote:

Are you looking for this?

org/apache/solr/core/SolrConfig.java:243

CacheConfig conf = CacheConfig.getConfig(this, "query/fieldValueCache");
if (conf == null) {
  Map<String, String> args = new HashMap<>();
  args.put(NAME, "fieldValueCache");
  args.put("size", "10000");
  args.put("initialSize", "10");
  args.put("showItems", "-1");
  conf = new CacheConfig(FastLRUCache.class, args, null);
}
fieldValueCacheConfig = conf;


Cheers


On Thu, Sep 1, 2016 at 2:41 AM, Rallavagu  wrote:


But the configuration is commented out (disabled). As the comments section
says,

"The fieldValueCache is created by default even if not configured here"

I would like to know the configuration of the default
fieldValueCache that gets created.


On 8/31/16 6:37 PM, Zheng Lin Edwin Yeo wrote:


If I didn't get your question wrong, what you have listed is already the
default configuration that comes with your version of Solr.

Regards,
Edwin

On 30 August 2016 at 07:49, Rallavagu  wrote:

Solr 5.4.1





Wondering what is the default configuration for "fieldValueCache".









What to try next for a replica that will not stay up

2016-09-01 Thread Jon Hawkesworth
Hi

If anyone has any suggestions of things I could try to resolve my issue where
one replica on one of my SolrCloud 6.0.1 shards refuses to stay up, I'd love to
hear them.  In fact, I'll get you something off your Amazon wishlist, within
reason, if you can solve this puzzle.

Today we pruned the dead replica, restarted the machine where it ran, and once
the node had rejoined the cluster, we added a new replica.
The replica was marked as Active for about 10 minutes and then went down.

I've put some example logging below, but it looks much the same as last time.

There's a bunch of warnings about a checksum being different even though the
file size is the same, and then RecoveryStrategy
reports 'Could not publish as ACTIVE after succesful recovery'

I think I've found where that message comes from in the code here:
https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;a=blob;f=solr/core/src/java/org/apache/solr/cloud/RecoveryStrategy.java;h=abd00aef19a731b42b314f8b526cdb2d77baf89f;hb=refs/heads/master
(I am running 6.0.1 though, so this could have changed in the latest devel code.)

So it seems this chunk of code...

451 if (successfulRecovery) {
452   LOG.info("Registering as Active after recovery.");
453   try {
454 zkController.publish(core.getCoreDescriptor(), 
Replica.State.ACTIVE);
455   } catch (Exception e) {
456 LOG.error("Could not publish as ACTIVE after succesful 
recovery", e);
457 successfulRecovery = false;
458   }
459
 460   if (successfulRecovery) {
461 close = true;
462 recoveryListener.recovered();
463   }
464 }

results in this:

org.apache.solr.common.SolrException: Cannot publish state of core 
'documents_shard1_replica2' as active without recovering first!
   at 
org.apache.solr.cloud.ZkController.publish(ZkController.java:1141)
   at 
org.apache.solr.cloud.ZkController.publish(ZkController.java:1097)
   at 
org.apache.solr.cloud.ZkController.publish(ZkController.java:1093)
   at 
org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:457)
   at 
org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:224)
   at java.util.concurrent.Executors$RunnableAdapter.call(Unknown 
Source)
   at java.util.concurrent.FutureTask.run(Unknown Source)
   at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
   at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown 
Source)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown 
Source)
   at java.lang.Thread.run(Unknown Source)

I don't yet understand the interaction with ZooKeeper, but there's some
disagreement about whether recovery has happened or not (if it hadn't, from
Solr's point of view, the successfulRecovery boolean would presumably be false).

Should I raise a JIRA?  Is there any other useful information I could gather?

I haven't really had any similar problems with the other 3 shards, just shard1.

The nodes that it is running on are all pretty similar - all VMs built to the
same specification, and the deployment of Java and SolrCloud is automated, so
there shouldn't be any differences in the stack.

Many thanks,

Jon




Example log output below


WARN  IndexFetcher  9/1/2016, 12:37:06 PM
File _jnux.si did not match. expected checksum is 1186898951 and actual is checksum 1994281621. expected length is 417 and actual length is 417

WARN  IndexFetcher  9/1/2016, 12:37:06 PM
File _jnuy.nvd did not match. expected checksum is 2200422612 and actual is checksum 3635321041. expected length is 63 and actual length is 65

WARN  IndexFetcher  9/1/2016, 12:37:06 PM
File _jnuy.fdx did not match. expected checksum is 281622189 and actual is checksum 838341528. expected length is 84 and actual length is 84

WARN  IndexFetcher  9/1/2016, 12:37:06 PM
File _jnuy.nvm did not match. expected checksum is 1875012021 and actual is checksum 524812847. expected length is 108 and actual length is 108

WARN  IndexFetcher  9/1/2016, 12:37:06 PM
File _jnuy.fnm did not match. expected checksum is 1681449973 and actual is checksum 3351426142. expected length is 1265 and actual length is 1265

WARN  IndexFetcher  9/1/2016, 12:37:06 PM
File _jnuy_Lucene54_0.dvm did not match. expected checksum is 355987228 and actual is checksum 847034886. expected length is 380 and actual length is 404

WARN  IndexFetcher  9/1/2016, 12:37:06 PM
File _jnuy_Lucene50_0.pos did not match. expected checksum is 806636274 and actual is checksum 2272195325. expected length is 1059 and actual length is 1172

WARN  IndexFetcher  9/1/2016,
File _jnuy_Lucene50_0.doc did not match. expected checksum is 4041316671 and actual is checksum 3122885740. expected length is 212 and actual length is 281

Re: What to try next for a replica that will not stay up

2016-09-01 Thread John Bickerstaff
This may be too simplistic of course, but what if you totally wipe Solr off
that machine, re-install it from scratch, bring it up and let it join the
cluster, then add the replica?

If it were me (since I have the luxury of VMs) I'd turn it off and build a
new VM from the ground up, but I get that this is a luxury not everyone
has...

Possibly it will work, and if it doesn't, that would, to my mind, point to
something somewhere else and not on that machine.

Of course, if porting all the data over to the new replica is too expensive
in time or bandwidth, that wouldn't be a good option...

Feel free to ignore - I haven't read the entire thread carefully...

On Thu, Sep 1, 2016 at 9:45 AM, Jon Hawkesworth <
jon.hawkeswo...@medquist.onmicrosoft.com> wrote:

> Hi
>
>
>
> If anyone has any suggestions of things I could try to resolve my issue
> where one replica on one of my SolrCloud 6.0.1 shards refuses to stay up,
> I'd love to hear them.  In fact, I'll get you something off your amazon
> wishlist, within reason, if you can solve this puzzle.
>
>
>
> Today we pruned the dead replica, restarted the machine where it ran and
> once the node had rejoined the cluster, we added a new replica.
>
> The replica was marked as Active for about 10 minutes then went down
>
>
>
> I put some example logging from below, but it looks much the same as last
> time.
>
>
>
> There's a bunch of warnings about a checksum being different even though
> the file size is the same and then RecoveryStrategy
>
> reports 'Could not publish as ACTIVE after succesful recovery'
>
>
>
> I think I've found where that message comes from in the code here:
> https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;a=
> blob;f=solr/core/src/java/org/apache/solr/cloud/RecoveryStrategy.java;h=
> abd00aef19a731b42b314f8b526cdb2d77baf89f;hb=refs/heads/master
>
> (I am running 6.0.1 though so could have changed in latest devel).
>
>
>
> So it seems this chunk of code…
>
>
>
> 451 if (successfulRecovery) {
>
> 452   LOG.info("Registering as Active after recovery.");
>
> 453   try {
>
> 454 zkController.publish(core.getCoreDescriptor(),
> Replica.State.ACTIVE);
>
> 455   } catch (Exception e) {
>
> 456 LOG.error("Could not publish as ACTIVE after succesful
> recovery", e);
>
> 457 successfulRecovery = false;
>
> 458   }
>
> 459
>
>  460   if (successfulRecovery) {
>
> 461 close = true;
>
> 462 recoveryListener.recovered();
>
> 463   }
>
> 464 }
>
>
>
> results in this:
>
>
>
> org.apache.solr.common.SolrException: Cannot publish state of core
> 'documents_shard1_replica2' as active without recovering first!
>
>at org.apache.solr.cloud.ZkController.publish(
> ZkController.java:1141)
>
>at org.apache.solr.cloud.ZkController.publish(
> ZkController.java:1097)
>
>at org.apache.solr.cloud.ZkController.publish(
> ZkController.java:1093)
>
>at org.apache.solr.cloud.RecoveryStrategy.doRecovery(
> RecoveryStrategy.java:457)
>
>at org.apache.solr.cloud.RecoveryStrategy.run(
> RecoveryStrategy.java:224)
>
>at java.util.concurrent.Executors$RunnableAdapter.call(Unknown
> Source)
>
>at java.util.concurrent.FutureTask.run(Unknown Source)
>
>at org.apache.solr.common.util.ExecutorUtil$
> MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
>
>at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown
> Source)
>
>at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown
> Source)
>
>at java.lang.Thread.run(Unknown Source)
>
>
>
> I don't yet understand the interaction with zookeeper but there's some
> disagreement about whether recovery has happened or not (if it hadn't from
> solr's point of view the successfulRecovery boolean would presumably be
> false.
>
>
>
> Should I raise a JIRA?  Is there any other useful information I could
> gather?
>
>
>
> I haven't really had any similar problems with the other 3 shards, just
> shard1.
>
>
>
> The nodes that it is running on are all pretty similar - all vms built to
> the same specification and the deployment of java and solrcloud is
> automated so there shouldn't be any differences in the stack.
>
>
>
> Many thanks,
>
>
>
> Jon
>
>
>
>
>
>
>
>
>
> Example log output below
>
>
>
>
>
> WARN  IndexFetcher  9/1/2016, 12:37:06 PM
> File _jnux.si did not match. expected checksum is 1186898951 and actual is checksum 1994281621. expected length is 417 and actual length is 417
>
> WARN  IndexFetcher  9/1/2016, 12:37:06 PM
> File _jnuy.nvd did not match. expected checksum is 2200422612 and actual is checksum 3635321041. expected length is 63 and actual length is 65
>
> WARN  IndexFetcher
> File _jnuy.fdx did not match. expected che

RE: Modifying fl in QParser

2016-09-01 Thread Beale, Jim (US-KOP)
All,

Thank you all for your responses.

Our custom QParser identifies several of many dynamic fields for construction
of the actual Lucene query.

For instance, given a custom Solr request consisting of
"q={!xyzQP}scid:500247", a new query is to be constructed using information
from a SQL query which selects certain dynamic fields, e.g. p_001, q_004, r_007,
along the lines of "+p_001:abc +q_004:abc +r_007:abc".  The requirement is then
to configure fl to include only the stored fields psf_001, qsf_004 and
rsf_007 in the response, and not the many other stored fields that are not
relevant to the query.

What is the best way to accomplish this?  It would be convenient to be able to
modify fl in the QParser.

Also, note that the index is not sharded.
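
For what it's worth, a minimal sketch of the SearchComponent approach
suggested in the replies quoted below (class and field names are
hypothetical, and a real component would derive the field list from the
incoming query rather than hard-coding it):

import java.io.IOException;

import org.apache.solr.common.params.CommonParams;
import org.apache.solr.common.params.ModifiableSolrParams;
import org.apache.solr.handler.component.ResponseBuilder;
import org.apache.solr.handler.component.SearchComponent;

// Register in solrconfig.xml and list it in the handler's "first-components"
// so prepare() runs before QueryComponent reads fl.
public class FlRewriteComponent extends SearchComponent {

  @Override
  public void prepare(ResponseBuilder rb) throws IOException {
    ModifiableSolrParams params = new ModifiableSolrParams(rb.req.getParams());
    // Replace fl with only the stored fields relevant to this request.
    params.set(CommonParams.FL, "psf_001,qsf_004,rsf_007");
    rb.req.setParams(params);
  }

  @Override
  public void process(ResponseBuilder rb) throws IOException {
    // Nothing to do at process time; fl was rewritten in prepare().
  }

  @Override
  public String getDescription() {
    return "Rewrites fl based on dynamic fields derived from the query";
  }

  // Abstract in Solr 4.x/5.x; a harmless extra method on later versions.
  public String getSource() {
    return null;
  }
}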

Thanks!

Jim

-Original Message-
From: Rohit Kanchan [mailto:rohitkan2...@gmail.com] 
Sent: Wednesday, August 31, 2016 12:42 AM
To: solr-user@lucene.apache.org
Subject: Re: Modifying fl in QParser

We are dealing with the same thing: we have overridden QueryComponent (a type
of SearchComponent) and added a field to retrieve there. We set that same
field in the SolrParams from the query request. Depending on your business
logic, you will need to figure out how you can override QueryComponent. I hope
this helps in solving your problem.

Thanks
Rohit Kanchan


On Tue, Aug 30, 2016 at 5:11 PM, Erik Hatcher 
wrote:

> Personally, I don’t think a QParser(Plugin) is the right place to modify
> other parameters, only to create a Query object.   A QParser could be
> invoked from an fq, not just a q, and will get invoked on multiple 
> nodes in SolrCloud, for example - this is why I think it’s not a good 
> idea to do anything but return a Query.
>
> It is possible (in fact I’m dealing with this very situation with a client
> as we speak) to set parameters this way, but I don’t recommend it.   Create
> a SearchComponent to do this job instead.
>
> Erik
>
>
>
> > On Aug 9, 2016, at 10:23 AM, Beale, Jim (US-KOP) 
> > 
> wrote:
> >
> > Hi,
> >
> > Is it possible to modify the SolrParam, fl, to append selected 
> > dynamic
> fields, while rewriting a query in QParser.parse()?
> >
> > Thanks in advance!
> >
> >
> > Jim Beale
> > Senior Lead Developer
> > 2201 Renaissance Boulevard, King of Prussia, PA, 19406
> > Mobile: 610-220-3067
> >
> >
> >
>
>


ShardDoc.sortFieldValues are not exposed in v5.2.1

2016-09-01 Thread tedsolr
I'm attempting to perform my own merge of IDs with a MergeStrategy in v5.2.1.
I'm a bit hamstrung because the ShardFieldSortedHitQueue is not public. When
trying to build my own priority queue I found out that the field
sortFieldValues in ShardDoc is package restricted. Now, in v6.1 I see that
both the HitQueue and the field are public.

Would it be possible to patch 5.2.1, or maybe the latest v5, to expose these
very useful objects? I can't upgrade to v6 due to the Java 8 requirement.

thanks, Ted





RE: What to try next for a replica that will not stay up

2016-09-01 Thread Jon Hawkesworth
Well, thanks for your suggestion; it may well come to that.
My impression is that the problem is with the shard in some way, in that we
have created the replica on other nodes and it has done the same thing.  We
went from 2 nodes to 4 to add processing capacity to the cluster a couple of
weeks back, but that hasn't helped with this issue.
Jon

-Original Message-
From: John Bickerstaff [mailto:j...@johnbickerstaff.com] 
Sent: Thursday, September 1, 2016 5:06 PM
To: solr-user@lucene.apache.org
Subject: Re: what to try next for replica that will not stay up.

This may be too simplistic of course, but what if you totally wipe Solr off 
that machine, re-install it from scratch, bring it up and let it join the 
cluster, then add the replica?

If it was me (since I have the luxury of VM's) I'd turn it off and build a new 
VM from the ground up, but I get that this is a luxury not everyone has...

Possibly it will work, and if it doesn't, that would, to my mind, point to 
something somewhere else and not on that machine

Of course, if porting all the data ver to the new replica is too expensive in 
time or bandwidth, that wouldn't be a good option...

Feel free to ignore - I haven't read the entire thread carefully...

On Thu, Sep 1, 2016 at 9:45 AM, Jon Hawkesworth < 
jon.hawkeswo...@medquist.onmicrosoft.com> wrote:

> Hi
>
>
>
> If anyone has any suggestions of things I could try to resolve my 
> issue where one replica on one of my SolrCloud 6.0.1 shards refuses to
> stay up, I'd love to hear them.  In fact, I'll get you something off 
> your amazon wishlist, within reason, if you can solve this puzzle.
>
>
>
> Today we pruned the dead replica, restarted the machine where it ran 
> and once the node had rejoined the cluster, we added a new replica.
>
> The replica was marked as Active for about 10 minutes then went down
>
>
>
> I put some example logging from below, but it looks much the same as 
> last time.
>
>
>
> There's a bunch of warnings about a checksum being different even 
> though the file size is the same and then RecoveryStrategy
>
> reports 'Could not publish as ACTIVE after succesful recovery'
>
>
>
> I think I've found where that message comes from in the code here:
> https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;a=
> blob;f=solr/core/src/java/org/apache/solr/cloud/RecoveryStrategy.java;
> h= abd00aef19a731b42b314f8b526cdb2d77baf89f;hb=refs/heads/master
>
> (I am running 6.0.1 though so could have changed in latest devel).
>
>
>
> So it seems this chunk of code…
>
>
>
> 451 if (successfulRecovery) {
>
> 452   LOG.info("Registering as Active after recovery.");
>
> 453   try {
>
> 454 zkController.publish(core.getCoreDescriptor(),
> Replica.State.ACTIVE);
>
> 455   } catch (Exception e) {
>
> 456 LOG.error("Could not publish as ACTIVE after succesful
> recovery", e);
>
> 457 successfulRecovery = false;
>
> 458   }
>
> 459
>
>  460   if (successfulRecovery) {
>
> 461 close = true;
>
> 462 recoveryListener.recovered();
>
> 463   }
>
> 464 }
>
>
>
> results in this:
>
>
>
> org.apache.solr.common.SolrException: Cannot publish state of core 
> 'documents_shard1_replica2' as active without recovering first!
>
>at org.apache.solr.cloud.ZkController.publish(
> ZkController.java:1141)
>
>at org.apache.solr.cloud.ZkController.publish(
> ZkController.java:1097)
>
>at org.apache.solr.cloud.ZkController.publish(
> ZkController.java:1093)
>
>at org.apache.solr.cloud.RecoveryStrategy.doRecovery(
> RecoveryStrategy.java:457)
>
>at org.apache.solr.cloud.RecoveryStrategy.run(
> RecoveryStrategy.java:224)
>
>at 
> java.util.concurrent.Executors$RunnableAdapter.call(Unknown
> Source)
>
>at java.util.concurrent.FutureTask.run(Unknown Source)
>
>at org.apache.solr.common.util.ExecutorUtil$
> MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
>
>at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown
> Source)
>
>at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown
> Source)
>
>at java.lang.Thread.run(Unknown Source)
>
>
>
> I don't yet understand the interaction with zookeeper but there's some 
> disagreement about whether recovery has happened or not (if it hadn't 
> from solr's point of view the successfulRecovery boolean would 
> presumably be false.
>
>
>
> Should I raise a JIRA?  Is there any other useful information I could 
> gather?
>
>
>
> I haven't really had any similar problems with the other 3 shards, 
> just shard1.
>
>
>
> The nodes that it is running on are all pretty similar - all vms built 
> to the same specification and the deployment of java and solrcloud is 
> automated so there shouldn't be any differences in the stack.
>
>
>
> Many thanks,
>
>
>
>

Always add the marker when elevating documents

2016-09-01 Thread Alexandre Drouin
Hi,

I followed the instructions on the wiki 
(https://wiki.apache.org/solr/QueryElevationComponent) to add a 
QueryElevationComponent searchComponent in my Solr 4.10.2 server and it is 
working as expected.  

I saw in the documentation that it is possible to see which documents were
elevated by adding [elevated] to the fl parameter, and I would like to know if
there is a way to always have the [elevated] property in the results without
having to add it to the fl parameter.
If it is not possible, is it safe to use "fl=*,[elevated]" to specify all 
fields plus the elevated marker?
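
For reference, a full request using the marker might look like this (host and
collection hypothetical):

http://localhost:8983/solr/mycollection/select?q=ipod&enableElevation=true&fl=*,[elevated]

Each returned document then carries a boolean [elevated] field indicating
whether it was elevated.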

Thanks!
Alexandre Drouin


Feedback on Match Query Parser (for fixing multiterm synonyms and other things)

2016-09-01 Thread Doug Turnbull
I wanted to solicit feedback on my query parser, the match query parser (
https://github.com/o19s/match-query-parser). It's a work in progress, so
any thoughts from the community would be welcome.

The point of this query parser is that it's not a query parser!

Instead, it's a way of selecting any analyzer to apply to the query string. I
use it for all kinds of things: finely controlling a bigram phrase search, or
searching with stemmed vs. exact variants of the query.

But its biggest value to me is as a fix for multiterm synonyms, because
I'm not giving the user's query to any underlying query parser -- I'm
always just doing analysis. So I know my selected analyzer will not be
disrupted by whitespace-based query parsing prior to query analysis.

Those of you also in the Elasticsearch community may be familiar with the
match query (
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-match-query.html
). This is similar, except it also lets you select whether to turn the
resulting tokens into a term query body:(sea\ biscuit likes to fish) or a
phrase query body:"sea biscuit" likes to fish. See the examples above for
more.

It's also similar to Solr's field query parser. However, the field query
parser tries to turn the fully analyzed token stream into a phrase query.
Moreover, the field query parser can only use the field's own query-time
analyzer, while the match query parser lets you select an arbitrary
analyzer. So match has more bells and whistles and acts as a complement to
the field qp.

Thanks for any thoughts, feedback, or critiques

Best,
-Doug


Re: What to try next for a replica that will not stay up

2016-09-01 Thread Shalin Shekhar Mangar
My guess is that recovery is indeed successful but the leader is repeatedly
marking this replica as 'down' using what we call
Leader-Initiated-Recovery, or LIR. We need to understand why that is
happening. Are there GC issues on this new node?

Can we see the logs on the leader from a bit before and after a line that
should look like the following:
"Put replica core=documents_shard1_replica2 coreNodeName="...

Can we also see the complete cluster state?

On Thu, Sep 1, 2016 at 9:15 PM, Jon Hawkesworth <
jon.hawkeswo...@medquist.onmicrosoft.com> wrote:

> Hi
>
>
>
> If anyone has any suggestions of things I could try to resolve my issue
> where one replica on one of my SolrCloud 6.0.1 shards refuses to stay up,
> I'd love to hear them.  In fact, I'll get you something off your amazon
> wishlist, within reason, if you can solve this puzzle.
>
>
>
> Today we pruned the dead replica, restarted the machine where it ran and
> once the node had rejoined the cluster, we added a new replica.
>
> The replica was marked as Active for about 10 minutes then went down
>
>
>
> I put some example logging from below, but it looks much the same as last
> time.
>
>
>
> There's a bunch of warnings about a checksum being different even though
> the file size is the same and then RecoveryStrategy
>
> reports 'Could not publish as ACTIVE after succesful recovery'
>
>
>
> I think I've found where that message comes from in the code here:
> https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;a=
> blob;f=solr/core/src/java/org/apache/solr/cloud/RecoveryStrategy.java;h=
> abd00aef19a731b42b314f8b526cdb2d77baf89f;hb=refs/heads/master
>
> (I am running 6.0.1 though so could have changed in latest devel).
>
>
>
> So it seems this chunk of code…
>
>
>
> 451 if (successfulRecovery) {
>
> 452   LOG.info("Registering as Active after recovery.");
>
> 453   try {
>
> 454 zkController.publish(core.getCoreDescriptor(),
> Replica.State.ACTIVE);
>
> 455   } catch (Exception e) {
>
> 456 LOG.error("Could not publish as ACTIVE after succesful
> recovery", e);
>
> 457 successfulRecovery = false;
>
> 458   }
>
> 459
>
>  460   if (successfulRecovery) {
>
> 461 close = true;
>
> 462 recoveryListener.recovered();
>
> 463   }
>
> 464 }
>
>
>
> results in this:
>
>
>
> org.apache.solr.common.SolrException: Cannot publish state of core
> 'documents_shard1_replica2' as active without recovering first!
>
>at org.apache.solr.cloud.ZkController.publish(
> ZkController.java:1141)
>
>at org.apache.solr.cloud.ZkController.publish(
> ZkController.java:1097)
>
>at org.apache.solr.cloud.ZkController.publish(
> ZkController.java:1093)
>
>at org.apache.solr.cloud.RecoveryStrategy.doRecovery(
> RecoveryStrategy.java:457)
>
>at org.apache.solr.cloud.RecoveryStrategy.run(
> RecoveryStrategy.java:224)
>
>at java.util.concurrent.Executors$RunnableAdapter.call(Unknown
> Source)
>
>at java.util.concurrent.FutureTask.run(Unknown Source)
>
>at org.apache.solr.common.util.ExecutorUtil$
> MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
>
>at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown
> Source)
>
>at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown
> Source)
>
>at java.lang.Thread.run(Unknown Source)
>
>
>
> I don't yet understand the interaction with zookeeper but there's some
> disagreement about whether recovery has happened or not (if it hadn't from
> solr's point of view the successfulRecovery boolean would presumably be
> false.
>
>
>
> Should I raise a JIRA?  Is there any other useful information I could
> gather?
>
>
>
> I haven't really had any similar problems with the other 3 shards, just
> shard1.
>
>
>
> The nodes that it is running on are all pretty similar - all vms built to
> the same specification and the deployment of java and solrcloud is
> automated so there shouldn't be any differences in the stack.
>
>
>
> Many thanks,
>
>
>
> Jon
>
>
>
>
>
>
>
>
>
> Example log output below
>
>
>
>
>
> WARN  IndexFetcher  9/1/2016, 12:37:06 PM
> File _jnux.si did not match. expected checksum is 1186898951 and actual is checksum 1994281621. expected length is 417 and actual length is 417
>
> WARN  IndexFetcher  9/1/2016, 12:37:06 PM
> File _jnuy.nvd did not match. expected checksum is 2200422612 and actual is checksum 3635321041. expected length is 63 and actual length is 65
>
> WARN  IndexFetcher  9/1/2016, 12:37:06 PM
> File _jnuy.fdx did not match. expected checksum is 281622189 and actual is checksum 838341528. expected length is 84 and actual length is 84
>
> WARN  IndexFetcher
> File _jnuy.nvm did not match.

Re: ShardDoc.sortFieldValues are not exposed in v5.2.1

2016-09-01 Thread Shalin Shekhar Mangar
This was made public in https://issues.apache.org/jira/browse/SOLR-7968
which is released already in 5.5

On Fri, Sep 2, 2016 at 12:01 AM, tedsolr  wrote:

> I'm attempting to perform my own merge of IDs with a MergeStrategy in
> v5.2.1.
> I'm a bit hamstrung because the ShardFieldSortedHitQueue is not public.
> When
> trying to build my own priority queue I found out that the field
> sortFieldValues in ShardDoc is package restricted. Now, in v6.1 I see that
> both the HitQueue and the field are public.
>
> Would it be possible to patch 5.2.1, or maybe the latest v5, to expose
> these very useful objects? I can't upgrade to v6 due to the Java 8
> requirement.
>
> thanks, Ted
>
>
>
>



-- 
Regards,
Shalin Shekhar Mangar.


Re: Replication Index fetch failed

2016-09-01 Thread Shalin Shekhar Mangar
On Thu, Sep 1, 2016 at 6:05 PM, Arkadi Colson  wrote:

> ERROR - 2016-09-01 14:30:43.653; [c:intradesk s:shard1 r:core_node5
> x:intradesk_shard1_replica1] org.apache.solr.common.SolrException; Index
> fetch failed :org.apache.solr.common.SolrException: Unable to download
> _6f46_cj.liv completely. Downloaded 0!=5596
> at org.apache.solr.handler.IndexFetcher$FileFetcher.cleanup(
> IndexFetcher.java:1554)
> at org.apache.solr.handler.IndexFetcher$FileFetcher.fetchFile(
> IndexFetcher.java:1437)
> at org.apache.solr.handler.IndexFetcher.downloadIndexFiles(Inde
> xFetcher.java:852)
> at org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexF
> etcher.java:428)
> at org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexF
> etcher.java:251)
>


There should be another exception in the logs that looks like the following:
"Could not download file"...

That one will have a more useful stack trace. Can you please find it and
paste it in an email?

-- 
Regards,
Shalin Shekhar Mangar.