Currently, substring defaults to last=1000000L, which strongly suggests the intent is to default to "nchar(x)" without having to compute/allocate that up front.
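
A minimal sketch of what that fixed default means in practice, assuming substring() still has last = 1000000L as described above. The variable names are illustrative only. The string is built one character wider than the default, and the width of the result is compared with and without an explicit last:

```r
# Assumption: substring()'s default is last = 1000000L, as described above.
x <- strrep("a", 1000001L)   # one character wider than the default 'last'

# Relies on the default 'last'; anything past position 1000000 is not returned.
n_default <- nchar(substring(x, 1L))

# Passes the true width explicitly, so the whole string comes back.
n_explicit <- nchar(substring(x, 1L, nchar(x)))

print(c(full = nchar(x), default = n_default, explicit = n_explicit))
```

If the assumption holds, the "default" width comes out one short of the other two, with no warning or error.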
Unfortunately, this default makes no sense for "very large" strings, which may exceed 1000000L in "width". The max width of a string is .Machine$integer.max-1:

  # works
  x = strrep(" ", .Machine$integer.max-1L)
  # fails
  x = strrep(" ", .Machine$integer.max)
  Error in strrep(" ", .Machine$integer.max) :
    'Calloc' could not allocate memory (18446744071562067968 of 1 bytes)

(See also the comment in src/main/character.c: "Character strings in R are less than 2^31-1 bytes, so we use int not size_t.")

So it seems to me either .Machine$integer.max or .Machine$integer.max-1L would be a more sensible default. Am I missing something?

Mike C

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel