I really do not expect it to make anything faster. I think you are wasting your 
time. Compression also adds some latency because the compression happens before 
data is sent out. 
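
A rough way to check this on your own data is to time gzip on a representative response body and compare that with the transfer time saved. The sketch below is only an illustration. The payload is synthetic with made-up field names, `bandwidth_bytes_per_s` is an assumed link speed, and it uses Python's standard gzip module rather than Jetty's gzip handler, so the absolute numbers will differ from what Solr does.

```python
import gzip
import json
import time


def gzip_tradeoff(body: bytes, bandwidth_bytes_per_s: float, level: int = 6) -> dict:
    """Compare time spent compressing with transfer time saved on the wire."""
    start = time.perf_counter()
    compressed = gzip.compress(body, compresslevel=level)
    compress_s = time.perf_counter() - start

    saved_bytes = len(body) - len(compressed)
    saved_transfer_s = saved_bytes / bandwidth_bytes_per_s
    return {
        "raw_bytes": len(body),
        "gzip_bytes": len(compressed),
        "compress_s": compress_s,
        "saved_transfer_s": saved_transfer_s,
        # Positive means compression paid for itself on this link.
        "net_gain_s": saved_transfer_s - compress_s,
    }


# Synthetic stand-in for a Solr JSON response (hypothetical field names).
docs = [{"id": f"doc-{i}", "title": f"Example title {i}", "score": 1.0} for i in range(1000)]
body = json.dumps({"response": {"numFound": len(docs), "docs": docs}}).encode("utf-8")

# Assumed 1 Gbit/s link between client and Solr, roughly 125 MB/s.
result = gzip_tradeoff(body, bandwidth_bytes_per_s=125_000_000)
print(result)
```

This ignores decompression time on the client and any round-trip latency, which compression does not reduce.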

If your CPUs are idle, that is a red flag for performance. In every one of our 
clusters, CPU is the limiting factor in both latency and throughput. Our 
largest production cluster is 32 nodes, each with 36 CPUs.

Where is the bottleneck? Are the processes waiting on disk? If they are, you 
need more RAM. Do you have magnetic disks? Get SSDs.
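
One way to see whether processes are waiting on disk is the iowait counter on Linux. This is a minimal sketch, assuming a Linux host where /proc/stat is available; the function names are mine, not part of any tool.

```python
def iowait_fraction(cpu_line: str) -> float:
    """Return the share of CPU time spent in iowait from a /proc/stat 'cpu' line."""
    fields = cpu_line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line from /proc/stat")
    ticks = [int(v) for v in fields[1:]]
    # Field order: user nice system idle iowait irq softirq steal guest guest_nice
    total = sum(ticks[:8])  # guest times are already counted in user/nice
    return ticks[4] / total if total else 0.0


def read_iowait(path: str = "/proc/stat") -> float:
    """Read the aggregate cpu line (the first line of /proc/stat) and parse it."""
    with open(path) as f:
        return iowait_fraction(f.readline())


# Example with a hand-written sample line instead of a live system.
sample = "cpu  100 0 50 800 50 0 0 0 0 0"
print(iowait_fraction(sample))
```

The counters are cumulative since boot, so for current behavior you would sample twice and compare the differences. Tools like iostat or vmstat report the same thing.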

You should have enough RAM to hold the index in memory, after allowing for the 
Solr JVM, kernel, and other processes.
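
As back-of-the-envelope arithmetic, that check looks like the sketch below. All the numbers are made-up examples, and the function name is hypothetical.

```python
def page_cache_headroom_gb(total_ram_gb: float, jvm_heap_gb: float,
                           os_and_other_gb: float, index_size_gb: float) -> float:
    """RAM left over after the JVM, kernel/other processes, and a fully cached index.

    A negative result means the index cannot fit in the OS page cache.
    """
    available_for_cache = total_ram_gb - jvm_heap_gb - os_and_other_gb
    return available_for_cache - index_size_gb


# Example values only: 64 GB node, 8 GB Solr heap, 4 GB for kernel and other processes.
print(page_cache_headroom_gb(64, 8, 4, index_size_gb=40))  # fits, with room to spare
print(page_cache_headroom_gb(32, 8, 4, index_size_gb=40))  # negative: does not fit
```

The JVM uses somewhat more than its configured heap, so leave a margin on that figure.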

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Feb 27, 2019, at 2:44 AM, Luthien Dulk <maike.d...@europeana.eu> wrote:
> 
> Hi Walter and Jörn,
> 
> thanks for your suggestions! I will keep them in mind.
> 
> According to our sysadmin, the CPUs on the Solr nodes are “doing basically 
> nothing”, so that’s a plentiful resource in our case. We’re most interested 
> in reducing the response time of the whole chain, which (for search API 
> requests) involves a round trip to the Solr cluster hosted at another 
> location. 
> 
> I don’t expect all that much from HTTP compression either, but I am 
> nonetheless interested to see what happens in a performance test on one node, 
> with and without gzip enabled, using a copy of the full dataset. 
> I’ll share the results of that. 
> 
> We did manage to figure out how to enable the compression by piecing 
> together bits found here and there, with a fair bit of trial, teeth-gnashing 
> and error. 
> Here’s how to do it; maybe it will save someone else some time: 
> 
> (using Solr 6.6.5)
> 
> 1) add a gzip configuration file called jetty-gzip.xml in /server/etc/ 
> Default values provided in the standalone Jetty installation are OK, but make 
> sure to only include properties listed under “gzip configuration” of the 
> appropriate version (e.g.: 
> https://www.eclipse.org/jetty/documentation/9.3.25.v20180904/gzip-filter.html).
>  
> The jetty-gzip.xml taken from my Jetty v.9.4.6 installation contained some 
> fields that prevented Solr’s embedded Jetty v.9.3.14 from starting
> 
> 2) add a file called gzip.mod to /server/modules
> Our sysadmin had provided me with this one; the only lines that are not 
> commented out in there are:
> 
> - - - - - - -
> 
> [depend]
> server
> 
> [xml]
> etc/jetty-gzip.xml
> 
> - - - - - - -
> 
> 3) from the Solr root:
>     java -jar server/start.jar --list-modules
> shows the installed modules and whether they are enabled
> 
>     java -jar server/start.jar --add-to-start=gzip
> activates the gzip module
> 
> Start Solr in the usual way:
>     bin/solr start
> 
> and the response then contains the Content-Encoding: gzip header.
> I don’t know yet whether that compression setting persists across Solr 
> restarts; it doesn’t feel very solid. But for this test it'll do.
> 
> Thanks,
> Lúthien
> 
> 
>> On 21 Feb 2019, at 15:38, Walter Underwood <wun...@wunderwood.org> wrote:
>> 
>> Years ago we did some testing with HTTP compression for search results with 
>> the Ultraseek search engine. It wasn’t faster. It was sometimes slower.
>> 
>> Once you have enough RAM, search is a CPU-limited problem. HTTP compression 
>> uses more CPU to save network bandwidth. But search isn’t limited by network 
>> bandwidth, so this uses more of the bottleneck resource (CPU) to reduce 
>> usage of a plentiful resource (network bandwidth).
>> 
>> Look at the amount of data going in and out of your nodes. I bet it is far 
>> below the maximum.
>> 
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/  (my blog)
>> 
>>> On Feb 21, 2019, at 6:07 AM, Jörn Franke <jornfra...@gmail.com> wrote:
>>> 
>>> You could also change the response writer from JSON to javabin to improve 
>>> performance, or increase network bandwidth. Also, people often fetch more 
>>> from Solr than they need, so there is a huge potential saving there. 
>>> Increasing the number of cores available for HTTPS encryption can 
>>> sometimes help.
>>> 
>>> Compression also leads to other issues (performance, but potentially also 
>>> security).
