RE: Sending compressed (gzip) UpdateRequest with SolrJ

2021-01-08 Thread Gael Jourdan-Weil
You're right Matthew.

Jetty supports it for responses, but for requests it doesn't seem to be 
enabled by default.
However, I found an undocumented setting that needs to be set on the 
GzipHandler for it to work: inflateBufferSize.

With SolrJ it is still hacky to send gzip requests; it may be easier to use a 
regular HTTP call.
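
A minimal sketch of what such a regular HTTP call could look like, assuming 
Java 11+ and a Solr/Jetty instance whose GzipHandler has inflateBufferSize set 
as described above. The class name, method names and the update URL are 
illustrative only, not part of SolrJ.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

public class GzipUpdateRequest {

    /** Compresses the given bytes with gzip, entirely in memory. */
    public static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Closing the GZIPOutputStream writes the gzip trailer into the buffer.
        try (GZIPOutputStream gz = new GZIPOutputStream(buffer)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /**
     * Builds a POST request whose body is the gzip-compressed XML update.
     * updateUrl is a placeholder, e.g. "http://localhost:8983/solr/mycollection/update".
     */
    public static HttpRequest build(String updateUrl, String xml) {
        byte[] body = gzip(xml.getBytes(StandardCharsets.UTF_8));
        return HttpRequest.newBuilder(URI.create(updateUrl))
                .header("Content-Type", "application/xml")
                // Tells the server that the request body must be inflated first.
                .header("Content-Encoding", "gzip")
                .POST(HttpRequest.BodyPublishers.ofByteArray(body))
                .build();
    }
}
```

The request can then be sent with 
HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()). 
If the server is not configured to inflate requests, it will receive the raw 
gzip bytes and most likely reject the update.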

---

From: matthew sporleder 
Sent: Thursday, January 7, 2021 16:43
To: solr-user@lucene.apache.org 
Subject: Re: Sending compressed (gzip) UpdateRequest with SolrJ 
 
jetty supports http gzip and I've added it to solr before in my own
installs (and submitted patches to do so by default to solr) but I
don't know about the handling for solrj.

IME compression helps a little, sometimes a lot, and never hurts.
Even the admin interface benefits a lot from regular old http gzip

On Thu, Jan 7, 2021 at 8:03 AM Gael Jourdan-Weil
 wrote:
>
> Answering my own question on this one.
>
> Solr uses Jetty 9.x, which does not support compressed requests by itself, 
> meaning the application behind Jetty (that is, Solr) has to decompress them 
> itself, which is not the case for now.
> Thus, even without using SolrJ, sending gzip-compressed XML to Solr (with 
> cURL, for instance) is not possible for now.
>
> Seems quite surprising to me though.
>
> -
>
> Hello,
>
> I was wondering if anyone has ever needed to send compressed (gzip) update 
> requests (adding/deleting documents), especially using SolrJ.
>
> Somehow I expected it to be done by default, but I didn't find any 
> documentation about it, and looking at the code there seems to be no 
> option to do it. Or is javabin compressed by default?
> - 
> https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/client/solrj/impl/BinaryRequestWriter.java#L49
> - 
> https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/client/solrj/request/RequestWriter.java#L55
>  (if not using Javabin)
> - 
> https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/client/solrj/impl/Http2SolrClient.java#L587
>
> By the way, is there any documentation about javabin? I could only find one 
> on the "old wiki".
>
> Thanks,
> Gaël

Remote error message: empty String & null:java.lang.NumberFormatException: empty String

2021-01-08 Thread Doss
We have a 12-node SolrCloud cluster with a 3-node ZooKeeper ensemble.
RAM: 80 CPU: 40 Heap: 16GB Records: 4 million

We do real-time updates and deletes (by ID), and we use in-place updates
for 4 fields.

We have one index with 4 shards, each shard on 3 nodes.

We often get the following errors:

1. *2021-01-08 17:11:14.305 ERROR (qtp1720891078-7429) [c:profilesindex s:shard4 r:core_node42 x:profilesindex_shard4_replica_n41] o.a.s.s.HttpSolrCall null:org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException: Async exception during distributed update: Error from server at http://171.0.0.145:8983/solr/profilesindex_shard3_replica_n49/: null*

request: http://171.0.0.145:8983/solr/profilesindex_shard3_replica_n49/
Remote error message: empty String
at org.apache.solr.update.processor.DistributedZkUpdateProcessor.doDistribFinish(DistributedZkUpdateProcessor.java:1193)
at org.apache.solr.update.processor.DistributedUpdateProcessor.finish(DistributedUpdateProcessor.java:1125)
at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:78)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:214)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2606)
at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:812)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:588)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:415)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1596)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:545)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:590)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1610)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1300)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:485)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1580)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1215)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:221)
at org.eclipse.jetty.server.handler.InetAccessHandler.handle(InetAccessHandler.java:177)
at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:322)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at org.eclipse.jetty.server.Server.handle(Server.java:500)
at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:383)
at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:547)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:375)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:273)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)
at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:375)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:806)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:938)
at java.base/java.lang.Thread.run(Thread.java:832)

2. *2021-01-08 17:15:17.849 ERROR (qtp1720891078-7120) [c:profilesindex s:shard4 r:core_node42 x:profilesindex_shard4_replica_n41] o.a.s.s.HttpSolrCall null:java.lang.NumberFormatException: empty String*
at java.base/jdk.internal.math.Flo
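
For reference, the message "empty String" is what the JDK's floating-point 
parsing reports when given an empty value, as far as I know. The small sketch 
below only reproduces that message in isolation; the class and method names 
are made up for illustration, and it is an assumption, not something the 
truncated trace confirms, that an empty value sent to a float or double field 
is what triggers the error above.

```java
public class EmptyFloatDemo {

    /**
     * Returns the NumberFormatException message produced when parsing the
     * value as a float, or null if the value parses without error.
     */
    public static String parseError(String value) {
        try {
            Float.parseFloat(value);
            return null;
        } catch (NumberFormatException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(parseError(""));    // message for an empty value
        System.out.println(parseError("1.5")); // null, since the value parses fine
    }
}
```

If that is indeed the cause, checking the update documents for empty values in 
numeric fields before sending them would be a reasonable first step.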

Re: Sending compressed (gzip) UpdateRequest with SolrJ

2021-01-08 Thread Walter Underwood
Years ago, working on the Ultraseek spider, we did a bunch of tests on 
compressed HTTP. I expected it to be a big win, but the results were really 
inconclusive. Sometimes it was faster, sometimes it was slower. We left it 
turned off.

It is an absolute win for serving already-compressed static content with 
Apache or whatever. For dynamic content, it adds some delay because the 
content is compressed before sending. If the content already fits in one or 
two packets, it is just extra overhead. For really large data, it helps with 
transmission time, but the processing time for large data probably 
overwhelms the network time.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)




FST building precaution

2021-01-08 Thread gnandre
Hi,

The following comment appears in
https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/util/fst/package-info.java:


 "Input values (keys). These must be provided to Builder in Unicode code
point (UTF8 or UTF32) sorted order. Note that sorting by Java's
String.compareTo, which is UTF16 sorted order, is not correct and can lead
to exceptions while building the FST"

Can someone please suggest how to achieve this?
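
One possible approach, sketched in plain Java 9+ under the assumption that the 
keys are well-formed Unicode strings (no unpaired surrogates): encode each key 
as UTF-8 and compare the bytes as unsigned values. For well-formed strings, 
unsigned byte order of UTF-8 is the same as code point order. The class and 
method names below are hypothetical and not part of Lucene.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class CodePointOrder {

    /** Orders strings by their UTF-8 bytes compared as unsigned values. */
    public static final Comparator<String> UTF8_ORDER = (a, b) ->
            Arrays.compareUnsigned(
                    a.getBytes(StandardCharsets.UTF_8),
                    b.getBytes(StandardCharsets.UTF_8));

    /** Returns a copy of the keys sorted in code point order. */
    public static List<String> sortedForFst(List<String> keys) {
        List<String> copy = new ArrayList<>(keys);
        copy.sort(UTF8_ORDER);
        return copy;
    }

    public static void main(String[] args) {
        // U+1F600 is stored as a surrogate pair starting with 0xD83D, so
        // String.compareTo places it before U+FB01; code point order does not.
        List<String> keys = Arrays.asList("\uD83D\uDE00", "\uFB01", "a");
        System.out.println(sortedForFst(keys));
    }
}
```

The two orders differ only when supplementary characters (above U+FFFF) are 
mixed with BMP characters in the range U+E000 to U+FFFF, which is why sorting 
with String.compareTo often appears to work until such keys show up.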