Can you please submit this as a wishlist item to bugzilla? it is
easier to keep track of there. You could also submit your threads
based suggestion there, again to keep it easier to keep track of and
possibly get back to in the future.

I will have a look at your approach when I get a chance, but I am
exploring a different approach to avoid scanning old generations that
may be simpler.

Best,

luke

On Wed, 3 Nov 2021, Andreas Kersting wrote:

Hi,

In https://stat.ethz.ch/pipermail/r-devel/2021-October/081147.html I proposed 
to speed up the CHARSXP cache maintenance during GC using threading. This was 
rejected by Luke in 
https://stat.ethz.ch/pipermail/r-devel/2021-October/081172.html.

Here I want to propose an alternative approach to significantly speed up 
CHARSXP cache maintenance during partial GCs. A patch which passes `make 
check-devel` is attached. Compared to R devel (revision 81110) I get the 
following performance improvements on my system:

Elapsed time for five non-full gc in a session after

x <- as.character(runif(5e7))[]
gc(full = TRUE)

+20sec -> ~1sec.


This patch introduces (theoretical) overheads to mkCharLenCE() and full GCs. 
However, I did not measure dramatic differences:

y <- "old_CHARSXP"

after

x <- "old_CHARSXP"; gc(); gc()

takes a median 32 nanoseconds with and without the patch.


gc(full = TRUE)

in a new session takes a median 16 milliseconds with and 14 without the patch.


The basic idea is to maintain the CHARSXP cache using subtables in R_StringHash, one for 
each of the (NUM_GC_GENERATIONS := NUM_OLD_GENERATIONS + 1) GC generations. New CHARSXPs 
are added by mkCharLenCE() to the subtable of the youngest generation. After a partial 
GC, only the chains anchored at the subtables of the youngest (num_old_gens_to_collect + 
1) generations need to be searched for and cleaned of unmarked nodes. Afterwards, these 
chains need to be merged into those of the respective next generation, if any. This 
approach relies on the fact that an object/CHARSXP can never become younger again. It is 
OK though if an object/CHARSXP "skips" a GC generation.

R_StringHash, which is now of length (NUM_GC_GENERATIONS * char_hash_size), is 
structured such that the chains for the same hashcode but for different 
generations are anchored at slots of R_StringHash which are next to each other 
in memory. This is because we often need to access two or more (i.e. currently 
all three) of them for one operation and this avoids cache misses.

HASHPRI, i.e. the number of occupied primary slots, is computed and stored as 
NUM_GC_GENERATIONS times the number of slots which are occupied in at least one 
of the subtables. This is done because in mkCharLenCE() we need to iterate 
through one or more chains if and only if there is a chain for the particular 
hashcode in at least one subtable.

I tried to keep the patch as minimal as possible. In particular, I did not add 
long vector support to R_StringHash. I rather reduced the max value of 
char_hash_size from 2^30 to 2^29, assuming that NUM_OLD_GENERATIONS is (not 
larger than) 2. I also did not yet adjust do_show_cache() and do_write_cache(), 
but I could do so if the patch is accepted.

Thanks for your consideration and feedback.

Regards,
Andreas


P.S. I had a hard time to get the indentation right in the patch due the mix of 
tabs and spaces. Sorry, if I screwed this up.

--
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:   luke-tier...@uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Reply via email to