Simon,

Thanks for the clear and direct answer. I was prepared for that to be the case :-).

Steve

> On Jan 19, 2025, at 7:48 PM, Simon Urbanek <simon.urba...@r-project.org> wrote:
>
> Steven,
>
>
>> On 19 Jan 2025, at 05:37, Steven Jenkins <sjenk...@studioj.us> wrote:
>>
>> Longtime R user, first-time packager. Devtools are great; it’s pretty easy.
>>
>> I’m trying to submit a package (https://github.com/jsjuni/massProps) that
>> calculates mass properties and uncertainties for assembly trees. Very
>> standard mechanical engineering stuff. The data are represented in a data
>> frame and a tree.
>>
>> It is customary in this and other fields to name an uncertainty parameter
>> with “σ” subscripted by the symbol of the parameter it characterizes. My
>> code assumes a data frame with column names including σ_mass, σ_Cx, σ_Cy,
>> σ_Ixx, etc. (If you can’t see it in your locale, it’s GREEK SMALL LETTER
>> SIGMA, U+03C3.)
>>
>> At least in my locale, these are valid names in R, so I used expressions
>> like result$σ_mass and input[row, "σ_mass"] liberally. Everything works.
>>
>> Unfortunately, CRAN doesn’t like that. check() complains about Unicode
>> characters. Easy enough to fix. Putting escapes in the R names made the
>> computation code ugly, so I replaced all the internals as, e.g.,
>> result$sigma_mass. Read and update operations on the data frame are
>> isolated, so it was fairly straightforward to fix those as input[row,
>> "\u03c3_mass"].
>>
>> Again, all good. All unit tests pass, check() produces no errors, no
>> warnings, no notes. Doxygen contents documenting the columns (which retain
>> the Unicode character) render perfectly.
>>
>> Then I get to check(manual = TRUE). LaTeX issues many complaints:
>>
>> ! LaTeX Error: Unicode character σ (U+03C3)
>> not set up for use with LaTeX.
>>
>> After a good bit of searching, I can't find a fix. Bookdown suggests setting
>> the LaTeX engine to “xelatex”, but I don’t know whether that’s applicable
>> (or possible) here.
>>
>
>
> No, it is not, it has to work with pdflatex.
>
>
>> So, two questions: (1) Is it bad practice to name columns like this in
>> external serialization?
>
>
> Yes. See R-exts 1.6.3:
> "First, consider carefully if you really need non-ASCII text. Some users of R
> will only be able to view correctly text in their native language group (e.g.
> Western European, Eastern European, Simplified Chinese) and ASCII. Other
> characters may not be rendered at all, rendered incorrectly, or cause your R
> code to give an error."
>
> > m
>      sigma_mass
> [1,]  -2.173881
> [2,]   1.184118
> [3,]  -1.295566
>
> is a lot more readable and usable than
>
> > m
>      <U+03C3>_mass
> [1,]     -2.173881
> [2,]      1.184118
> [3,]     -1.295566
>
>
>> Obviously, I can just use “sigma_” everywhere, but I prefer it the way it is.
>
>
> I suspect you may be in a very small minority, as non-ASCII characters are
> really problematic in code: they require special input methods, making your
> symbols mostly unusable. Unicode has its place in data as text, e.g. when
> rendering names or words in other languages, but even there they require
> careful handling (e.g., still won't work in LaTeX output). Row names are fine
> as long as they reflect real data (e.g. names/words), but in your case it’s
> just an unnecessarily awkward use of a symbol so it does not fall into that
> category.
>
> Cheers,
> Simon
>
>
>
>> (2) Is there some way to change either the Doxygen input or the LaTeX
>> configuration to correct the problem? (Escaping the Doxygen input doesn’t
>> work.)
>>
>> Thanks in advance for your guidance.
>>
>> Steve
>>
>> ______________________________________________
>> R-package-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-package-devel
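
For reference, here is a minimal sketch of the workaround described in the quoted message: ASCII names in the computation code, with the Unicode spelling confined to the data-frame boundary. It is not the actual massProps code. The helper names to_internal() and to_external(), the SIGMA constant, and the sample columns are all hypothetical. The sigma is written as the escape "\u03c3" so that the source file itself stays ASCII-only.

```r
# Hypothetical helpers, not part of massProps: translate between external
# column names that start with U+03C3 and ASCII-only internal names.
SIGMA <- "\u03c3"  # GREEK SMALL LETTER SIGMA, spelled as an ASCII escape

to_internal <- function(df) {
  # "<sigma>_mass" becomes "sigma_mass"; other names are left alone
  names(df) <- sub(paste0("^", SIGMA, "_"), "sigma_", names(df))
  df
}

to_external <- function(df) {
  # "sigma_mass" becomes "<sigma>_mass"; other names are left alone
  names(df) <- sub("^sigma_", paste0(SIGMA, "_"), names(df))
  df
}

# Made-up sample input using the external (Unicode) column name
input <- data.frame(mass = c(1.0, 2.5), x = c(0.1, 0.2))
input[[paste0(SIGMA, "_mass")]] <- c(0.01, 0.02)

work <- to_internal(input)
work$sigma_mass <- work$sigma_mass * 2  # computation code stays ASCII-only
result <- to_external(work)
```

If the columns are simply named sigma_mass and so on in the external data as well, as recommended in the quoted reply, neither helper is needed and the Rd/LaTeX problem does not arise for the column names.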