Re: [R-pkg-devel] LaTeX Error: Unicode character not set up for use with LaTeX

Steven Jenkins Sun, 19 Jan 2025 19:14:54 -0800

Simon,

Thanks for the clear and direct answer. I was prepared for that to be the case 
:-).


Steve

> On Jan 19, 2025, at 7:48 PM, Simon Urbanek <simon.urba...@r-project.org> 
> wrote:
> 
> Steven,
> 
> 
>> On 19 Jan 2025, at 05:37, Steven Jenkins <sjenk...@studioj.us> wrote:
>> 
>> Longtime R user, first-time packager. Devtools are great; it’s pretty easy.
>> 
>> I’m trying to submit a package (https://github.com/jsjuni/massProps) that 
>> calculates mass properties and uncertainties for assembly trees. Very 
>> standard mechanical engineering stuff. The data are represented in a data 
>> frame and a tree.
>> 
>> It is customary in this and other fields to name an uncertainty parameter 
>> with “σ” subscripted by the symbol of the parameter it characterizes. My 
>> code assumes a data frame with column names including σ_mass, σ_Cx, σ_Cy, 
>> σ_Ixx, etc. (If you can’t see it in your locale, it’s GREEK SMALL LETTER 
>> SIGMA, U+03C3.)
>> 
>> At least in my locale, these are valid names in R, so I used expressions 
>> like result$σ_mass and input[row, "σ_mass"] liberally. Everything works.
>> 
>> Unfortunately, CRAN doesn’t like that. check() complains about Unicode 
>> characters. Easy enough to fix. Putting escapes in the R names made the 
>> computation code ugly, so I replaced all the internals as, e.g., 
>> result$sigma_mass. Read and update operations on the data frame are 
>> isolated, so it was fairly straightforward to fix those as input[row, 
>> "\u03c3_mass"].
>> 
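The boundary translation described above can be sketched roughly as follows. This is a minimal illustration, not the massProps code: the helper names `to_internal` and `to_external` and the toy data frame are hypothetical; only the `\u03c3` escape and the `sigma_` prefix come from the message.

```r
# Hypothetical helpers: keep ASCII names inside the package and translate
# only where the data frame is read or written.
SIGMA <- "\u03c3"  # GREEK SMALL LETTER SIGMA, written as an ASCII-only escape

# Rename columns such as "<sigma>_mass" to "sigma_mass" on input.
to_internal <- function(df) {
  names(df) <- sub(paste0("^", SIGMA, "_"), "sigma_", names(df))
  df
}

# Restore the external (Unicode) names on output.
to_external <- function(df) {
  names(df) <- sub("^sigma_", paste0(SIGMA, "_"), names(df))
  df
}

# Toy input with one value column and its uncertainty column.
input <- data.frame(1.5, 0.1, check.names = FALSE)
names(input) <- c("mass", paste0(SIGMA, "_mass"))

work <- to_internal(input)
work$sigma_mass <- work$sigma_mass * 2   # computation code stays ASCII-only
result <- to_external(work)
```

With this split, the source files contain only ASCII, and the Unicode column names exist only in the data passed in and out.
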
>> Again, all good. All unit tests pass, check() produces no errors, no 
>> warnings, no notes. roxygen2 documentation for the columns (which retain 
>> the Unicode character) renders perfectly.
>> 
>> Then I get to check(manual = TRUE). LaTeX issues many complaints:
>> 
>> ! LaTeX Error: Unicode character σ (U+03C3)
>>                not set up for use with LaTeX.
>> 
>> After a good bit of searching, I can't find a fix. Bookdown suggests setting 
>> the LaTeX engine to “xelatex”, but I don’t know whether that’s applicable 
>> (or possible) here.
>> 
> 
> 
> No, it is not; it has to work with pdflatex.
> 
> 
>> So, two questions: (1) Is it bad practice to name columns like this in 
>> external serialization?
> 
> 
> Yes. See R-exts 1.6.3:
> "First, consider carefully if you really need non-ASCII text. Some users of R 
> will only be able to view correctly text in their native language group (e.g. 
> Western European, Eastern European, Simplified Chinese) and ASCII. Other 
> characters may not be rendered at all, rendered incorrectly, or cause your R 
> code to give an error."
> 
> > m
>      sigma_mass
> [1,]  -2.173881
> [2,]   1.184118
> [3,]  -1.295566
> 
> is a lot more readable and usable than 
> 
> > m
>      <U+03C3>_mass
> [1,]     -2.173881
> [2,]      1.184118
> [3,]     -1.295566
> 
> 
>> Obviously, I can just use “sigma_” everywhere, but I prefer it the way it is.
> 
> 
> I suspect you may be in a very small minority, as non-ASCII characters are 
> really problematic in code as they require special input methods, making your 
> symbols mostly unusable. Unicode has its place in data as text, e.g. when 
> rendering names or words in other languages, but even there they require 
> careful handling (e.g., still won't work in LaTeX output). Row names are fine 
> as long as they reflect real data (e.g. names/words), but in your case it’s 
> just an unnecessarily awkward use of a symbol so it does not fall into that 
> category.
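
One way to see which names would be affected is to test them for non-ASCII characters. The sketch below uses only base R; the vector `cols` is a made-up example, not taken from the package.

```r
# Hypothetical column names, two of them containing U+03C3.
cols <- c("mass", "\u03c3_mass", "Cx", "\u03c3_Cx")

# iconv() returns NA for any string that cannot be represented in ASCII.
non_ascii <- is.na(iconv(cols, from = "UTF-8", to = "ASCII"))

# The names that would need an ASCII replacement such as "sigma_".
cols[non_ascii]
```
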
> 
> Cheers,
> Simon
> 
> 
> 
>> (2) Is there some way to change either the roxygen2 input or the LaTeX 
>> configuration to correct the problem? (Escaping the roxygen2 input doesn’t 
>> work.)
>> 
>> Thanks in advance for your guidance.
>> 
>> Steve
>> 
>> ______________________________________________
>> R-package-devel@r-project.org <mailto:R-package-devel@r-project.org> mailing 
>> list
>> https://stat.ethz.ch/mailman/listinfo/r-package-devel


