Shawn, Mikhail, Chris,

Thank you all for your feedback. Unfortunately, I cannot try your recommendations right away because this week is busy. I will post my results here next week.

Regards,
Denis

On Tue, Apr 24, 2018 at 11:33 AM Shawn Heisey <apa...@elyograg.org> wrote:

> On 4/24/2018 6:30 AM, Chris Ulicny wrote:
> > I haven't worked with AWS, but recently we tried to move some of our solr
> > instances to a cloud in Google's Cloud offering, and it did not go well.
> > All of our problems ended up stemming from the fact that the I/O is
> > throttled. Any complicated enough query would require too many disk reads
> > to return the results in a reasonable time when being throttled. SSDs were
> > better but not a practical cost and not as performant as our own bare metal.
>
> If there's enough memory installed beyond what is required for the Solr
> heap, then Solr will rarely need to actually read the disk to satisfy a
> query. That is the secret to stellar performance. If switching to
> faster disks made a big difference in query performance, adding memory
> would yield an even greater improvement.
>
> https://wiki.apache.org/solr/SolrPerformanceProblems#RAM
>
> > When we were doing the initial indexing, the indexing processes would get
> > to a point where the updates were taking minutes to complete and the cause
> > was throttled write ops.
>
> Indexing speed is indeed affected by disk speed, and adding memory can't
> fix that particular problem. Using a storage controller with a large
> amount of battery-backed cache memory can improve it.
>
> > -- set the max threads and max concurrent merges of the mergeScheduler to
> > be 1 (or very low). This prevented excessive IO during indexing.
>
> The max threads should be at 1 in the merge scheduler, but the max
> merges should actually be *increased*. I use a value of 6 for that.
> With SSD disks, the max threads can be increased, but I wouldn't push it
> very high.
>
> Thanks,
> Shawn
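
For reference, the merge scheduler settings Shawn describes would normally go in the <indexConfig> section of solrconfig.xml. The Python sketch below only builds and prints that fragment so the two values are easy to see side by side. The helper name merge_scheduler_element is hypothetical, and the values (maxMergeCount 6, maxThreadCount 1) are simply the ones from his reply, not tuned recommendations for any particular hardware.

```python
import xml.etree.ElementTree as ET


def merge_scheduler_element(max_merge_count: int, max_thread_count: int) -> ET.Element:
    """Build a <mergeScheduler> element for the <indexConfig> section of solrconfig.xml.

    Hypothetical helper for illustration; it only produces XML text and does
    not talk to Solr.
    """
    # Lucene's ConcurrentMergeScheduler rejects more threads than allowed merges,
    # so fail early here instead of at core load time.
    if max_thread_count > max_merge_count:
        raise ValueError("maxThreadCount should be <= maxMergeCount")

    scheduler = ET.Element(
        "mergeScheduler",
        {"class": "org.apache.lucene.index.ConcurrentMergeScheduler"},
    )
    for name, value in (
        ("maxMergeCount", max_merge_count),
        ("maxThreadCount", max_thread_count),
    ):
        child = ET.SubElement(scheduler, "int", {"name": name})
        child.text = str(value)
    return scheduler


# Values from Shawn's reply: a single merge thread, but up to six pending merges.
element = merge_scheduler_element(max_merge_count=6, max_thread_count=1)
snippet = ET.tostring(element, encoding="unicode")
print(snippet)
```

The printed element can be pasted inside <indexConfig>. On SSDs the thread count could be raised a little, as noted above, while keeping it at or below the merge count.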