Hi all,

I'm getting the following error from substr:

> substr("<I>Jens Oehlschl\xe4gel-Akiyoshi", 1, 100)
Error in substr("<I>Jens Oehlschl\xe4gel-Akiyoshi", 1, 100) :
  invalid multibyte string at '<e4>gel-A<6b>iyoshi'

Is that normal / intended? I've tried setting the Encoding/locale to Latin-1/UTF-8, but that does not help.

nchar gives me something similar:

> nchar("<I>Jens Oehlschl\xe4gel-Akiyoshi")
Error in nchar("<I>Jens Oehlschl\xe4gel-Akiyoshi") :
  invalid multibyte string, element 1

I find it strange that substr/nchar give an error, but regexpr works for telling me the length:

> regexpr(".*", "<I>Jens Oehlschl\xe4gel-Akiyoshi")
[1] 1
attr(,"match.length")
[1] 29

Is that inconsistency normal / intended?

By the way, this example comes from our very own list:

> readLines("https://stat.ethz.ch/pipermail/r-devel/1999-November/author.html")[28]
[1] "<I>Jens Oehlschl\xe4gel-Akiyoshi"

Best,
Toby
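
For anyone trying to reproduce this, here is a minimal sketch of inspecting the same string at the byte level. It assumes a UTF-8 session and that the archive page is Latin-1 encoded; neither is verified above. The names `x` and `y` are only illustrative.

```r
# The string from the examples above; \xe4 is a single raw byte.
x <- "<I>Jens Oehlschl\xe4gel-Akiyoshi"

# Counting bytes does not require decoding the string.
nchar(x, type = "bytes")

# Check whether the bytes form valid UTF-8.
validUTF8(x)

# Look at the raw byte at position 17.
charToRaw(x)[17]

# ASSUMPTION: the bytes are Latin-1. Re-encode them to UTF-8 explicitly,
# then apply the character-based functions to the converted copy.
y <- iconv(x, from = "latin1", to = "UTF-8")
nchar(y)
substr(y, 1, 100)
```

The byte count here is 29, the same as the match.length that regexpr reported, and validUTF8 returns FALSE for the original string. Whether the iconv step is appropriate depends on the Latin-1 assumption holding for the source page.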