Re: Info dump, small heap comparison, G1GC vs. ZGC

2022-11-29 Thread David Smiley
Thanks for sharing your analysis!
For setups where users do indexing on certain nodes and querying on other
nodes, I could imagine choosing different collectors on these nodes.
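
To check which collector a given node is actually running, and how much collection time it has accumulated, something like the following could be run on each node. This is only a minimal sketch using the standard `java.lang.management` API; the class name `GcReport` is made up for illustration, and what `getCollectionTime()` counts (pauses only, or whole cycles) differs between collectors and JDK versions, so the numbers are not directly comparable between G1GC and ZGC.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Solr.
final class GcReport {
    /** Returns one line per collector bean registered in this JVM. */
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Count and time are -1 when the collector does not report them.
            lines.add(String.format("%s: count=%d, timeMs=%d",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return lines;
    }

    public static void main(String[] args) {
        describeCollectors().forEach(System.out::println);
    }
}
```

The bean names in the output reveal the active collector, which makes it easy to confirm that indexing and query nodes really picked up different settings.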

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Sat, Nov 26, 2022 at 5:25 PM Shawn Heisey wrote:

> On 11/26/22 14:41, Shawn Heisey wrote:
> > The GC log analyses below cover a full index rebuild on OpenJDK 11:
> >
> > G1GC:
> > https://www.dropbox.com/s/rvw27xlanlmydry/gc_analysis_g1gc.png?dl=0
> >
> > ZGC:
> > https://www.dropbox.com/s/rl80tnf4x1x9wjh/gc_analysis_zgc.png?dl=0
>
> Focusing on one part of the full GC reports above:
>
> https://www.dropbox.com/s/qorl9x0doywxqhy/total_time_g1gc.png?dl=0
> https://www.dropbox.com/s/qs4r3aznspb4pub/total_time_zgc.png?dl=0
>
> For this tiny index, G1GC spends a lot less time doing concurrent tasks
> than ZGC, but has about twice as much pause time.  The difference in
> concurrent time on this system is VERY significant. This instance only
> has two CPUs, so there's not a lot of CPU power to handle concurrent
> threads.  Having over a minute of concurrent GC time (compared to about
> 3 seconds with G1GC) resulted in the rebuild time increasing by three
> minutes.
>
> I think that with a very large heap and/or a large number of CPU cores,
> ZGC will completely trounce G1GC.   Further testing on large installs
> that I don't have is required.
>
> Thanks,
> Shawn
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org
> For additional commands, e-mail: dev-h...@solr.apache.org
>
>


Re: Replication opens new SegmentReader for all segments which deem unnecessary

2022-11-29 Thread David Smiley
It could be interesting to explore optimizing "openNewSearcher" when there
is an existing searcher open over some of the same segments.
For such older segments, after replication, are they at the same exact file
path or does replication create a new path?  I forget this detail; I recall
seeing some index data path shuffling going on.  ReplicationHandler etc. is
overly complex.
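
To make the idea concrete, here is a toy model of reusing per-segment readers across reopens. It is a sketch only: `SegmentReaderReuseSketch` and `ToyReader` are hypothetical names, this is plain Java and not Lucene or Solr code, and it assumes a segment with the same name is unchanged, which ignores deletes and doc-values updates.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only; none of these types exist in Lucene or Solr.
final class SegmentReaderReuseSketch {
    /** Stand-in for an expensive per-segment reader. */
    static final class ToyReader {
        final String segmentName;
        ToyReader(String segmentName) { this.segmentName = segmentName; }
    }

    private Map<String, ToyReader> current = new LinkedHashMap<>();
    private int opens = 0;

    /** Builds the reader set for a new commit, reusing readers of segments still present. */
    Map<String, ToyReader> reopen(List<String> segmentNames) {
        Map<String, ToyReader> next = new LinkedHashMap<>();
        for (String name : segmentNames) {
            ToyReader reader = current.get(name);
            if (reader == null) {
                // Only segments not seen in the previous commit pay the open cost.
                reader = new ToyReader(name);
                opens++;
            }
            next.put(name, reader);
        }
        current = next; // readers of dropped segments are simply forgotten here
        return next;
    }

    int openCount() { return opens; }

    public static void main(String[] args) {
        SegmentReaderReuseSketch pool = new SegmentReaderReuseSketch();
        pool.reopen(List.of("_0", "_1", "_2"));
        pool.reopen(List.of("_0", "_1", "_2", "_3"));
        System.out.println("readers opened: " + pool.openCount()); // prints "readers opened: 4"
    }
}
```

Without reuse the second reopen would open four readers instead of one, which is the kind of extra work the profiling in this thread points at.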

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Wed, Nov 23, 2022 at 1:50 PM Patson Luk wrote:

> Hi all!
>
> We are testing multiple replica setup here (1 NRT + 1 PULL) and noticed
> that CPU consumption for replication is unreasonably high. Profiling shows
> that `SolrCore#openNewSearcher` triggered from `IndexFetcher` takes much
> more CPU time than the same method triggered from regular commits.
>
> Debugging shows that when `SolrCore#openNewSearcher` is triggered from
> `IndexFetcher`, it opens a new `SegmentReader` for every single segment
> of the updated collection. This is because a new `IndexWriter`, which
> keeps a `ReaderPool`, is instantiated for each replication; that pool is
> not reused, nor are the previous segment readers carried over.
>
> Details in this ticket https://issues.apache.org/jira/browse/SOLR-16560.
>
> Since I'm pretty new to this area, I would love to get some thoughts from
> the community!
>
> Many thanks!
> Patson
>


Re: Info dump, small heap comparison, G1GC vs. ZGC

2022-11-29 Thread Shawn Heisey

On 11/26/22 14:41, Shawn Heisey wrote:
> Java has other issues with heaps 32GB and larger, so the general
> recommendation we give is to keep the heap size below 32GB. That won't
> really matter with EXTREMELY large heaps well beyond 64GB, but most
> users will never need a heap that large.


One additional tidbit related to this:  ZGC always uses 64-bit pointers, 
doesn't support Compressed OOPs, and isn't available on 32-bit Java.  So 
there is no advantage to choosing a 31GB heap compared to 32GB, as there 
is with G1.
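
For anyone who wants to see what their own JVM is doing, the sketch below asks the running JVM whether Compressed OOPs are enabled. It assumes a HotSpot-based JVM, since `com.sun.management.HotSpotDiagnosticMXBean` is HotSpot-specific; the class name `CompressedOopsCheck` is made up for illustration, and the code falls back to "unavailable" when the flag does not exist (as I would expect on a 32-bit JVM).

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper; assumes a HotSpot-based JVM.
final class CompressedOopsCheck {
    /** Returns "true" or "false", or "unavailable" when the JVM has no such flag. */
    static String compressedOops() {
        try {
            HotSpotDiagnosticMXBean hotspot =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (hotspot == null) {
                return "unavailable";
            }
            return hotspot.getVMOption("UseCompressedOops").getValue();
        } catch (IllegalArgumentException e) {
            // Thrown when the bean or the named VM option does not exist.
            return "unavailable";
        }
    }

    public static void main(String[] args) {
        long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("max heap (MB): " + maxHeapMb);
        System.out.println("UseCompressedOops: " + compressedOops());
    }
}
```

Running it once with a heap just under 32GB and once at 32GB or more, with each collector, should show where the flag flips.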


32-bit hardware support in software is slowly but surely disappearing.  
Ubuntu no longer builds 32-bit installers.  Debian still does, but I 
imagine that it won't be very many years until they drop it as well.  
32-bit Java is hard to find.


TL;DR:  I recently built a VM running 32-bit Debian.  Their entire 
software catalog is available for all hardware architectures that they 
support, so it was trivial to install a 32-bit Java with "apt install 
openjdk-11-jdk".  With that VM, I was able to check whether a piece of 
Java software would run on 32-bit.  The 2GB heap size limitation is a 
problem for most users, but everything still works if that limit is 
acceptable.


Thanks,
Shawn





Re: Replication opens new SegmentReader for all segments which deem unnecessary

2022-11-29 Thread Noble Paul
> For such older segments, after replication, are they at the same exact
> file path or does replication create a new path?

Replication only downloads the delta. If the replica already has a file,
that file is not downloaded again.
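
As a rough illustration of what "only the delta" means, the sketch below compares a leader file list against a local one and returns the files that would need fetching. It is plain Java, not Solr's `IndexFetcher` code; the class and method names are hypothetical, and matching files by name plus a checksum value is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only; not how Solr's replication code is structured.
final class ReplicationDeltaSketch {
    /** Names of leader files the follower lacks or holds with a different checksum. */
    static List<String> filesToFetch(Map<String, Long> leader, Map<String, Long> follower) {
        List<String> toFetch = new ArrayList<>();
        // TreeMap gives a stable, sorted iteration order for the result.
        for (Map.Entry<String, Long> entry : new TreeMap<>(leader).entrySet()) {
            Long localChecksum = follower.get(entry.getKey());
            if (localChecksum == null || !localChecksum.equals(entry.getValue())) {
                toFetch.add(entry.getKey());
            }
        }
        return toFetch;
    }

    public static void main(String[] args) {
        Map<String, Long> leader = Map.of("_0.cfs", 11L, "_1.cfs", 22L, "segments_3", 33L);
        Map<String, Long> follower = Map.of("_0.cfs", 11L, "segments_2", 30L);
        System.out.println(filesToFetch(leader, follower)); // prints [_1.cfs, segments_3]
    }
}
```

In this example `_0.cfs` is already present locally with the same checksum, so only the new segment file and the new commit point would be transferred.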


On Wed, Nov 30, 2022 at 4:41 AM David Smiley wrote:

> It could be interesting to explore optimizing "openNewSearcher" when there
> is an existing searcher open over some of the same segments.
> For such older segments, after replication, are they at the same exact file
> path or does replication create a new path?  I forget this detail; I recall
> seeing some index data path shuffling going on.  ReplicationHandler etc. is
> overly complex.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Wed, Nov 23, 2022 at 1:50 PM Patson Luk  wrote:
>
> > Hi all!
> >
> > We are testing multiple replica setup here (1 NRT + 1 PULL) and noticed
> > that CPU consumption for replication is unreasonably high. Profiling shows
> > that `SolrCore#openNewSearcher` triggered from `IndexFetcher` takes much
> > more CPU time than the same method triggered from regular commits.
> >
> > Debugging shows that when `SolrCore#openNewSearcher` is triggered from
> > `IndexFetcher`, it opens a new `SegmentReader` for every single segment
> > of the updated collection. This is because a new `IndexWriter`, which
> > keeps a `ReaderPool`, is instantiated for each replication; that pool is
> > not reused, nor are the previous segment readers carried over.
> >
> > Details in this ticket https://issues.apache.org/jira/browse/SOLR-16560.
> >
> > Since I'm pretty new to this area, I would love to get some thoughts from
> > the community!
> >
> > Many thanks!
> > Patson
> >
>


-- 
Noble Paul