RE: Solr 7.7.1 Junit failure with ArrayIndexOutOfBoundsException on CoreContainer.createAndLoad call

2019-05-12 Thread Irfan Nagoo
Hi,



I got this issue fixed. The problem was an existing bug in the JaCoCo plugin 
we were using along with Gradle to execute the test cases. It looks like the 
version of JaCoCo we were using didn't handle Java 8's "default methods 
in an interface" feature well. Here are the bug details: 
https://github.com/jacoco/jacoco/issues/226



We were using JaCoCo version 0.7.0.201403182114, and after moving to version 
0.7.1.201405082137 (which has the fix), the issue was resolved.
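For reference, a minimal sketch of pinning the JaCoCo tool version in a Gradle build (the version string is the one mentioned above; the rest of the build file is assumed):

```groovy
// build.gradle -- pin JaCoCo to a release that handles Java 8 default methods.
apply plugin: 'jacoco'

jacoco {
    // Version from the fix described above; adjust for your build.
    toolVersion = '0.7.1.201405082137'
}
```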





Thanks, Irfan




From: Irfan Nagoo 
Sent: Saturday, May 11, 2019 5:28:27 PM
To: solr-user@lucene.apache.org
Subject: Solr 7.7.1 Junit failure with ArrayIndexOutOfBoundsException on 
CoreContainer.createAndLoad call



Hi,

We are getting an ArrayIndexOutOfBoundsException from the Solr core library 
when calling the CoreContainer.createAndLoad API while executing a JUnit test 
case through Gradle. We can see from the logs that solr.xml is getting 
initialized. Here is the code in the JUnit test that is causing this issue:

CoreContainer container = CoreContainer.createAndLoad(
        Paths.get(SOLR_HOME.getAbsolutePath()),
        Paths.get(solrConfig.getURI()));
EmbeddedSolrServer solrServer = new EmbeddedSolrServer(container, CORE_NAME);


Here is the exception stack trace we get when the test case is executed:

java.lang.ArrayIndexOutOfBoundsException: 1
    at org.apache.solr.api.ApiSupport.registerV2(ApiSupport.java:42)
    at org.apache.solr.core.PluginBag.put(PluginBag.java:215)
    at org.apache.solr.core.PluginBag.put(PluginBag.java:194)
    at org.apache.solr.core.CoreContainer.<init>(CoreContainer.java:314)
    at org.apache.solr.core.CoreContainer.<init>(CoreContainer.java:308)
    at org.apache.solr.core.CoreContainer.<init>(CoreContainer.java:300)
    at org.apache.solr.core.CoreContainer.<init>(CoreContainer.java:296)
    at org.apache.solr.core.CoreContainer.createAndLoad(CoreContainer.java:480)
    at org.company.alpha.solr.TestSolrCoreContainer.test(TestSolrCoreContainer.java:48)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)

Going through the source code of ApiSupport.registerV2, I see no array used 
anywhere. After some googling, I found it could be caused by an old version of 
the ASM jar on the classpath. I tried replacing the ASM jar with version 5.1, 
but that didn't work. Does anyone have any ideas about this issue?


Thanks, Irfan

IMPORTANT NOTICE: This e-mail message is intended to be received only by 
persons entitled to receive the confidential information it may contain. E-mail 
messages to clients of Ventiv Technology Inc., may contain information that is 
confidential and legally privileged. Please do not read, copy, forward, or 
store this message unless you are an intended recipient of it. If you have 
received this message in error, please forward it to the sender and delete it 
completely from your computer system.


Re: Performance of /export requests

2019-05-12 Thread Justin Sweeney
Thanks for the quick response. We are generally seeing exports from Solr 5
and 7 to be roughly the same, but I’ll check out Solr 8.

Joel - We are generally sorting on a tlong field, and the criteria can vary
from searching everything (*:*) to searching on a combination of a few tint
and string types.

All of our 16 fields are docvalues. Is there any performance degradation as
the number of docvalues fields increases or should that not have an impact?
Also, is the 30k sliding window configurable? In many cases we are
streaming back a few thousand, maybe up to 10k and then cutting off the
stream. If we could configure the size of that window, could that speed
things up some?

Thanks again for the info.

On Sat, May 11, 2019 at 2:38 PM Joel Bernstein  wrote:

> Can you share the sort criteria and search query? The main strategy for
> improving performance of the export handler is adding more shards. This is
> different than with typical distributed search, where deep paging issues
> get worse as you add more shards. With the export handler if you double the
> shards you double the pushing power. There are no deep paging drawbacks to
> adding more shards.
>
> On Sat, May 11, 2019 at 2:17 PM Toke Eskildsen  wrote:
>
> > Justin Sweeney  wrote:
> >
> > [Index: 10 shards, 450M docs]
> >
> > > We are creating a CloudSolrStream and when we call
> CloudSolrStream.open()
> > > we see that call being slower than we had hoped. For some queries, that
> > > call can take 800 ms. [...]
> >
> > As far as I can see in the code, CloudSolrStream.open() opens streams
> > against the relevant shards and checks if there is a result. The last
> step
> > is important as that means the first batch of tuples must be calculated
> in
> > the shards. Streaming works internally by having a sliding window of 30K
> > tuples through the result set in each shard, so open() results in (up to)
> > 30K tuples being calculated. On the other hand, getting the first 30K
> > tuples should be very fast after open().
> >
> > > We are currently using Solr 5, but we’ve also tried with Solr 7 and
> seen
> > > similar results.
> >
> > Solr 7 has a performance regression for export (or rather a regression
> for
> > DocValues that is very visible when using export. See
> > https://issues.apache.org/jira/browse/SOLR-13013), so I would expect it
> > to be slower than Solr 5. You could try with Solr 8 where this regression
> > should be mitigated somewhat.
> >
> > - Toke Eskildsen
> >
>


Re: Range query syntax on a polygon field is returning all documents

2019-05-12 Thread David Smiley
I answered in StackOverflow but will paste it here:

Geo3D requires that polygons adhere to the "right hand rule", and thus the
exterior ring must be in counter-clockwise order and holes must be
clockwise.  If you make this mistake then the meaning of the shape is
inverted, and thus that little rectangle in Alberta Canada represents the
inverse of that place.  Consequently most shapes will cover nearly the
entire globe!  There is certainly a documentation issue needed in Solr to
this effect.  Even I didn't know until I debugged this today!  It appears
some of the GIS industry is migrating to this rule as well:
http://mapster.me/right-hand-rule-geojson-fixer/
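A quick way to catch this before indexing is to check ring orientation with the shoelace formula. This is a standalone sketch in plain Java (not Solr or Geo3D API); class and method names are my own:

```java
// Sketch: check polygon ring orientation, since Geo3D expects exterior
// rings in counter-clockwise (CCW) order and holes in clockwise order.
public class RingOrientation {

    // Signed area of a ring given as {x, y} (lon, lat) pairs, closing
    // implicitly from the last vertex back to the first.
    // Positive => counter-clockwise, negative => clockwise.
    static double signedArea(double[][] ring) {
        double sum = 0;
        for (int i = 0; i < ring.length; i++) {
            double[] a = ring[i];
            double[] b = ring[(i + 1) % ring.length];
            sum += a[0] * b[1] - b[0] * a[1];
        }
        return sum / 2.0;
    }

    static boolean isCcw(double[][] ring) {
        return signedArea(ring) > 0;
    }

    public static void main(String[] args) {
        double[][] ccwSquare = {{0, 0}, {1, 0}, {1, 1}, {0, 1}};
        double[][] cwSquare  = {{0, 0}, {0, 1}, {1, 1}, {1, 0}};
        System.out.println(isCcw(ccwSquare)); // true: usable as exterior ring
        System.out.println(isCcw(cwSquare));  // false: reverse before indexing
    }
}
```

If a ring comes out clockwise, reversing its vertex order fixes the inverted-shape symptom described above.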

Separately: I would be very curious to see how Geo3D compares to JTS after
you get it working.  Additionally, you likely ought to use
solr.RptWithGeometrySpatialField instead of
solr.SpatialRecursivePrefixTreeFieldType to get the full accuracy of the
vector geometry instead of settling for a grid representation of shapes;
otherwise your queries might get false positives for merely being close to an
indexed shape.  Another thing to try is prefixTree="s2", a not-yet-documented
prefix tree that is reportedly much more efficient for Geo3D specifically.
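As a sketch of what those suggestions might look like in a schema (the field type name is illustrative and the exact attribute combination is an assumption, not a tested config):

```xml
<!-- Hypothetical sketch combining the suggestions above; verify against
     the Solr Reference Guide before using. -->
<fieldType name="geom" class="solr.RptWithGeometrySpatialField"
           spatialContextFactory="Geo3D"
           prefixTree="s2"/>
```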

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Wed, Mar 20, 2019 at 2:00 PM David Smiley 
wrote:

> Hi Mitchell,
>
> Seems like there's a bug based on what you've shown.
> * Can you please try RptWithGeometrySpatialField instead
> of SpatialRecursivePrefixTreeFieldType to see if the problem goes away?
> This could point to a precision issue; though still what you've seen is
> suspicious.
> * Can you try one other query syntax e.g. bbox query parser to see if the
> problem goes away?  I doubt this is it but you seem to point to the syntax
> being related.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Mon, Mar 18, 2019 at 12:24 AM Mitchell Bösecke <
> mitchell.bose...@forcorp.com> wrote:
>
>> Hi everyone,
>>
>> I'm trying to index geodetic polygons and then query them out using an
>> arbitrary rectangle. When using the Geo3D spatial context factory, the
>> data
>> indexes just fine but using a range query (as per the solr documentation)
>> does not seem to filter the results appropriately (I get all documents
>> back).
>>
>> When I switch it to JTS, everything works as expected. However, it
>> significantly slowed down the initial indexing time. A sample size of 3000
>> documents took 3 seconds with Geo3D and 50 seconds with JTS.
>>
>> I've documented my journey in detail on stack overflow:
>> https://stackoverflow.com/q/55212622/1017571
>>
>>1. Can I not use the range query syntax with Geo3D? I.e. am I
>>misreading the documentation?
>>2. Is it expected that using JTS will *significantly* slow down the
>>indexing time?
>>
>> Thanks for any insight.
>>
>> --
>> Mitchell Bosecke, B.Sc.
>> Senior Application Developer
>>
>> FORCORP
>> Suite 200, 15015 - 123 Ave NW,
>> Edmonton, AB, T5V 1J7
>> www.forcorp.com
>> (d) 780.733.0494
>> (o) 780.452.5878 ext. 263
>> (f) 780.453.3986
>>
>


Re: Performance of /export requests

2019-05-12 Thread Joel Bernstein
Your query and sort criteria sound like they should be fast.

In general, if you are cutting off the stream at 10K, don't use the /export
handler. Use the /select handler; it will be faster for sure. The reason for
the 30K sliding window was that it maximizes throughput over a long export
(many millions of documents). If you're not doing a long export, then the
export handler is likely not the most efficient approach.

Each field being exported slows down the export handler, and 16 is a lot of
fields to export. Again, the only way to increase the performance of exporting
16 fields is to add more shards.

Are you exporting with Streaming Expressions?
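As a rough illustration of that trade-off (plain Java, no Solr dependencies; the 30K window figure comes from the discussion above, the rest is a conceptual sketch, not Solr code):

```java
import java.util.ArrayList;
import java.util.List;

// Conceptual sketch: the export handler materializes tuples a window at a
// time (30K per shard), so cutting the stream off early still pays for the
// whole first window.
public class WindowedStreamSketch {
    static int tuplesComputed = 0;

    // Simulates fetching one window of sorted tuples from a shard.
    static List<Integer> nextWindow(int start, int window, int total) {
        List<Integer> batch = new ArrayList<>();
        for (int i = start; i < Math.min(start + window, total); i++) {
            batch.add(i);
            tuplesComputed++; // cost is paid per window, not per tuple read
        }
        return batch;
    }

    public static void main(String[] args) {
        int window = 30_000, total = 1_000_000, wanted = 10_000;
        List<Integer> first = nextWindow(0, window, total);
        List<Integer> consumed = first.subList(0, wanted); // client stops early
        System.out.println(consumed.size() + " read, "
                + tuplesComputed + " computed");
        // Reading 10,000 tuples still computed the full 30,000-tuple window,
        // which is why /select is the better fit for short reads.
    }
}
```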




On Sun, May 12, 2019 at 8:44 AM Justin Sweeney 
wrote:

> Thanks for the quick response. We are generally seeing exports from Solr 5
> and 7 to be roughly the same, but I’ll check out Solr 8.
>
> Joel - We are generally sorting a on tlong field and criteria can vary from
> searching everything (*:*) to searching on a combination of a few tint and
> string types.
>
> All of our 16 fields are docvalues. Is there any performance degradation as
> the number of docvalues fields increases or should that not have an impact?
> Also, is the 30k sliding window configurable? In many cases we are
> streaming back a few thousand, maybe up to 10k and then cutting off the
> stream. If we could configure the size of that window, could that speed
> things up some?
>
> Thanks again for the info.
>
> On Sat, May 11, 2019 at 2:38 PM Joel Bernstein  wrote:
>
> > Can you share the sort criteria and search query? The main strategy for
> > improving performance of the export handler is adding more shards. This
> is
> > different than with typical distributed search, where deep paging issues
> > get worse as you add more shards. With the export handler if you double
> the
> > shards you double the pushing power. There are no deep paging drawbacks
> to
> > adding more shards.
> >
> > On Sat, May 11, 2019 at 2:17 PM Toke Eskildsen  wrote:
> >
> > > Justin Sweeney  wrote:
> > >
> > > [Index: 10 shards, 450M docs]
> > >
> > > > We are creating a CloudSolrStream and when we call
> > CloudSolrStream.open()
> > > > we see that call being slower than we had hoped. For some queries,
> that
> > > > call can take 800 ms. [...]
> > >
> > > As far as I can see in the code, CloudSolrStream.open() opens streams
> > > against the relevant shards and checks if there is a result. The last
> > step
> > > is important as that means the first batch of tuples must be calculated
> > in
> > > the shards. Streaming works internally by having a sliding window of
> 30K
> > > tuples through the result set in each shard, so open() results in (up
> to)
> > > 30K tuples being calculated. On the other hand, getting the first 30K
> > > tuples should be very fast after open().
> > >
> > > > We are currently using Solr 5, but we’ve also tried with Solr 7 and
> > seen
> > > > similar results.
> > >
> > > Solr 7 has a performance regression for export (or rather a regression
> > for
> > > DocValues that is very visible when using export. See
> > > https://issues.apache.org/jira/browse/SOLR-13013), so I would expect
> it
> > > to be slower than Solr 5. You could try with Solr 8 where this
> regression
> > > should be mitigated somewhat.
> > >
> > > - Toke Eskildsen
> > >
> >
>


SolrCloud limitations?

2019-05-12 Thread Juergen Melzer (DPDHL IT Services)
Hi all

At the moment we have 6 servers doing the search. We want to go up to 12 or 
15 servers.
So my question is:
Are there any limitations in SolrCloud on the number of replicas?



Regards
Juergen



Re: Solr query takes a too much time in Solr 6.1.0

2019-05-12 Thread Bernd Fehling

Your "sort" parameter has "sort=id+desc,id+desc".
1. It doesn't make sense to sort on "id" in descending order twice.
2. Be aware that the id field has the highest cardinality.
3. To speed up sorting, use a separate field with docValues=true for sorting.


Regards
Bernd
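A sketch of such a docValues sort field (the field and type names below are illustrative; the original message's example was lost in the archive):

```xml
<!-- Illustrative sketch: a docValues-backed field dedicated to sorting. -->
<field name="id_sort" type="string" indexed="false" stored="false"
       docValues="true"/>
<copyField source="id" dest="id_sort"/>
```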


Am 10.05.19 um 15:32 schrieb vishal patel:

We have 2 shards and 2 replicas in our live environment, and we have multiple 
collections.
Sometimes a query takes a long time (QTime=52552), even though many documents 
are being indexed and searched within milliseconds.
When we execute the same query again from the admin panel, it does not take 
much time and completes within 20 milliseconds.

My Solr Logs :
2019-05-10 09:48:56.744 INFO  (qtp1239731077-128223) [c:actionscomments s:shard1 r:core_node1 
x:actionscomments] o.a.s.c.S.Request [actionscomments]  webapp=/solr path=/select 
params={q=%2Bproject_id:(2102117)%2Brecipient_id:(4642365)+%2Bentity_type:(1)+-action_id:(20+32)+%2Baction_status:(0)+%2Bis_active:(true)+%2B(is_formtype_active:true)+%2B(appType:1)&shards=s1.example.com:8983/solr/actionscomments|s1r1.example.com:8983/solr/actionscomments,s2.example.com:8983/solr/actionscomments|s2r1.example.com:8983/solr/actionscomments&indent=off&shards.tolerant=true&fl=id&start=0&sort=id+desc,id+desc&fq=&rows=1}
 hits=198 status=0 QTime=52552
2019-05-10 09:48:56.744 INFO  (qtp1239731077-127998) [c:actionscomments s:shard1 r:core_node1 
x:actionscomments] o.a.s.c.S.Request [actionscomments]  webapp=/solr path=/select 
params={q=%2Bproject_id:(2102117)%2Brecipient_id:(4642365)+%2Bentity_type:(1)+-action_id:(20+32)+%2Baction_status:(0)+%2Bis_active:(true)+%2Bdue_date:[2019-05-09T19:30:00Z+TO+2019-05-09T19:30:00Z%2B1DAY]+%2B(is_formtype_active:true)+%2B(appType:1)&shards=s1.example.com:8983/solr/actionscomments|s1r1.example.com:8983/solr/actionscomments,s2.example.com:8983/solr/actionscomments|s2r1.example.com:8983/solr/actionscomments&indent=off&shards.tolerant=true&fl=id&start=0&sort=id+desc,id+desc&fq=&rows=1}
 hits=0 status=0 QTime=51970
2019-05-10 09:48:56.746 INFO  (qtp1239731077-128224) [c:actionscomments s:shard1 r:core_node1 
x:actionscomments] o.a.s.c.S.Request [actionscomments]  webapp=/solr path=/select 
params={q=%2Bproject_id:(2121600+2115171+2104206)%2Brecipient_id:(2834330)+%2Bentity_type:(2)+-action_id:(20+32)+%2Baction_status:(0)+%2Bis_active:(true)+%2Bdue_date:[2019-05-10T00:00:00Z+TO+2019-05-10T00:00:00Z%2B1DAY]&shards=s1.example.com:8983/solr/actionscomments|s1r1.example.com:8983/solr/actionscomments,s2.example.com:8983/solr/actionscomments|s2r1.example.com:8983/solr/actionscomments&indent=off&shards.tolerant=true&fl=id&start=0&sort=id+desc,id+desc&fq=&rows=1}
 hits=98 status=0 QTime=51402


My schema fields below:
[schema field definitions were stripped from the archived message]

What could be the problem here? Why does the query take so much time at those moments?
