>>>>> Stephanie Evert
>>>>>     on Wed, 15 Jan 2025 13:18:03 +0100 writes:

    > Well, the real issue then seems to be that .roman2numeric uses
    > an invalid regular expression:

    >>> grepl("^M{,3}D?C{,4}L?X{,4}V?I{,4}$", cc)
    >> [1] TRUE TRUE TRUE TRUE TRUE

    > or

    >>> grepl("^I{,2}$", c("II", "III", "IIII"))
    >> [1] TRUE TRUE FALSE

    > Both the TRE and the PCRE specification only allow repetition
    > quantifiers of the form

    > {a}
    > {a,b}
    > {a,}

    > https://laurikari.net/tre/documentation/regex-syntax/
    > https://www.pcre.org/original/doc/html/pcrepattern.html#SEC17

    > {,2} and {,4} are thus invalid and seem to result in undefined
    > behaviour (which PCRE and TRE fill in different ways, but
    > consistently not what was intended).

    >> > grepl("^I{,2}$", c("II", "III", "IIII"))
    >> [1] TRUE TRUE FALSE
    >> > grepl("^I{,2}$", c("II", "III", "IIII"), perl=TRUE)
    >> [1] FALSE FALSE FALSE

    > Fix thus is easy: {,4} => {0,4}

    > Best,
    > Stephanie

Thanks a lot, Stephanie -- indeed, I think I would not have
searched in this direction at all.
(To me it seemed "obvious" that if {3,} is well defined,
 {,3} would be so, too ...)

But I was *wrong*, and actually I also understand that {,3} is
not needed and {0,3} is clearer, whereas {3,} is not easy to
re-express ('{0,inf}' or similar would make the code
considerably more complicated and probably slower).

Actually, to remain backward compatible (see Jani's original
report: he'd like "IIIII" to work, as it did for many/most of
us), we should replace {,4} by {0,5}.

But there's more: our current help page
   https://search.r-project.org/R/refmans/utils/html/roman.html
says

  > Only numbers between 1 and 3999 have a unique representation
  > as roman numbers, and hence others result in as.roman(NA).

which is really not quite true, in more than one sense:

1. as.roman(3899:3999)  # works fine, not producing any NA

2. I think, e.g., "MMMM" is a pretty unique representation of 4000.

Also, one other piece of (online) software,
   https://www.rapidtables.com/convert/number/date-to-roman-numerals.html
does convert _dates_ up to the year 4999; see
   https://www.rapidtables.com/convert/number/date-to-roman-numerals.html?msel=January&dsel=1&year=4999&fmtsel=MM.DD.YYYY
giving MMMMCMXCIX for 4999.

Hence, I also think we should enlarge the valid range from the
current {1 .. 3999} to {1 .. 4999}.

Martin
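
PS: For illustration only, here is a minimal sketch of the quantifier
fix discussed above. It is not the actual code of .roman2numeric; the
names pat_fixed and pat_compat are made up for this example, and the
patterns only cover purely additive strings (no "IV", "IX", ...).
With explicit lower bounds, TRE and PCRE should give the same answer:

```r
## Illustrative patterns (hypothetical names, not taken from the R sources):
## the pattern quoted above with {,n} written as {0,n} ...
pat_fixed  <- "^M{0,3}D?C{0,4}L?X{0,4}V?I{0,4}$"
## ... and the variant with {,4} replaced by {0,5}, so "IIIII" is accepted
pat_compat <- "^M{0,3}D?C{0,5}L?X{0,5}V?I{0,5}$"

x <- c("II", "IIII", "IIIII", "IIIIII", "MMXXV")

res_fixed  <- grepl(pat_fixed,  x)   # TRUE TRUE FALSE FALSE TRUE
res_compat <- grepl(pat_compat, x)   # TRUE TRUE TRUE  FALSE TRUE

## Both regex engines agree once the quantifiers are valid
same_fixed  <- identical(res_fixed,  grepl(pat_fixed,  x, perl = TRUE))
same_compat <- identical(res_compat, grepl(pat_compat, x, perl = TRUE))

print(rbind(fixed = res_fixed, compat = res_compat))
print(c(same_fixed, same_compat))
```

Whether {0,4} or {0,5} is the right upper bound is a policy choice;
the sketch only shows that either form behaves predictably.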