I agree with Bert that this question belongs on r-package-devel, but I want to 
point out that you have no control over the locale in which your users will 
try to display the encoded strings in your data. I am no expert in this, but 
you will need to become one in order to understand your own problem and any 
solutions you are given on r-package-devel. You will likely benefit from 
reading Kevin Ushey's writeup: 
https://kevinushey.github.io/blog/2018/02/21/string-encoding-and-r/
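
On the narrower question of printing the reference back: the <U+00E1> form 
that iconv() produces with sub = "Unicode" can be reversed in base R. The 
following is only a minimal sketch, not a vetted solution. unescape_unicode 
is a hypothetical helper name of my own, and whether the restored characters 
display correctly still depends on the user's locale.

```r
# Hypothetical helper (not from any package): turn "<U+00E1>"-style escapes,
# as produced by iconv(x, "UTF-8", "ASCII", "Unicode"), back into characters.
unescape_unicode <- function(x) {
  pattern <- "<U\\+([0-9A-Fa-f]{4,8})>"
  m <- gregexpr(pattern, x)
  regmatches(x, m) <- lapply(regmatches(x, m), function(tokens) {
    # Keep only the hex digits, e.g. "<U+00E1>" -> "00E1"
    hex <- sub(pattern, "\\1", tokens)
    # Convert each code point to a UTF-8 string
    vapply(strtoi(hex, 16L), intToUtf8, character(1))
  })
  x
}

escaped <- "Hern<U+00E1>ndez-Montoya, V., P<U+00E1>ez, V.P."
restored <- unescape_unicode(escaped)
```

Applied to the escaped reference, this should give back the original UTF-8 
string, provided the text contains no literal "<U+....>" sequences of its own.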

On September 16, 2021 9:17:05 AM PDT, Bert Gunter <bgunter.4...@gmail.com> 
wrote:
>This should not be posted here. Post on the R-package-devel list instead.
>
>Bert Gunter
>
>"The trouble with having an open mind is that people keep coming along
>and sticking things into it."
>-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>On Thu, Sep 16, 2021 at 9:13 AM Marc Girondot via R-help
><r-help@r-project.org> wrote:
>>
>> Hello everyone,
>>
>> I am a bit stuck on how to include a database with UTF-8 strings in a
>> package. When I submit it to CRAN, the checks report NOTEs on several
>> Unix systems, and I am trying to find a solution (if one exists) that
>> avoids these NOTEs.
>>
>> The database has references, and some names contain non-ASCII characters.
>>
>> * First I don't agree at all with the solution proposed here:
>>
>> https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Encoding-issues
>>
>> "First, consider carefully if you really need non-ASCII text."
>>
>> If a language has non-ASCII characters, it is not just to make the
>> writing nicer or more complex; it is because they change the pronunciation.
>>
>> * Then I tried to find a solution that avoids these NOTEs.
>>
>> For example, here is a reference with UTF-8 characters:
>>
>> > DatabaseTSD$Reference[211]
>> [1] Hernández-Montoya, V., Páez, V.P. & Ceballos, C.P. (2017) Effects of
>> temperature on sex determination and embryonic development in the
>> red-footed tortoise, Chelonoidis carbonarius. Chelonian Conservation and
>> Biology 16, 164-171.
>>
>> When I convert the characters into Unicode escapes, I do indeed get only
>> ASCII characters. Perfect.
>>
>> >  iconv(DatabaseTSD$Reference[211], "UTF-8", "ASCII", "Unicode")
>> [1] "Hern<U+00E1>ndez-Montoya, V., P<U+00E1>ez, V.P. & Ceballos, C.P.
>> (2017) Effects of temperature on sex determination and embryonic
>> development in the red-footed tortoise, Chelonoidis carbonarius.
>> Chelonian Conservation and Biology 16, 164-171."
>>
>> Then I get no NOTEs when I check the package with the database on Unix...
>> but how can I print the reference back with the original characters?
>>
>> Thanks a lot for pointing me to best practices for including databases with
>> non-ASCII characters without getting NOTEs when submitting a package to CRAN.
>>
>> Marc
>>
>>
>> ______________________________________________
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

-- 
Sent from my phone. Please excuse my brevity.
