On Fri, 05 Jun 2015 13:35:24 +0200, Mathieu Roy wrote:

> However, here:
>
> $ cat test.pl
> #!/usr/bin/perl
>
> use HTML::Entities;
> $input = "vis-à-vis Beyoncé's naïve\npapier-mâché résumé";
> print encode_entities($input), "\n"
>
> # EOF
>
> $ perl test.pl
> vis-&Atilde;&nbsp;-vis Beyonc&Atilde;&copy;&#39;s na&Atilde;&macr;ve
> papier-m&Atilde;&cent;ch&Atilde;&copy; r&Atilde;&copy;sum&Atilde;&copy;
Oh, fun with encodings in general and UTF-8 in particular again.

This works:

% cat test.pl
#!/usr/bin/perl

use utf8;
use HTML::Entities;
$input = "vis-à-vis Beyoncé's naïve\npapier-mâché résumé";
print encode_entities($input), "\n"

% perl test.pl
vis-&agrave;-vis Beyonc&eacute;&#39;s na&iuml;ve
papier-m&acirc;ch&eacute; r&eacute;sum&eacute;

> Where do these &Atilde; come from?

From perl not knowing that the script is UTF-8-encoded and taking it as
Latin-1 or something: each accented character is two bytes in UTF-8, and
perl treats every byte as a character of its own.

So, I'm not sure there is actually a bug somewhere. With "use utf8;"
this works, and perl needs to be told about the encoding ...

> Plus, as a side bug (require a report on its own?),
> man HTML::Entities prints
>
>   For example, this:
>
>     $input = "vis-a-vis Beyonce's naieve\npapier-mache resume";
>     print encode_entities($input), "\n"
>
>   Prints this out:
>
> [...]
>
> Yes, the man page example is actually stripped of entities to encode!

Ouch, ugly. Yes, please report a separate bug.

Cheers,
gregor

-- 
 .''`.  Homepage: http://info.comodo.priv.at/ - OpenPGP key 0xBB3A68018649AA06
 : :' : Debian GNU/Linux user, admin, and developer  -  https://www.debian.org/
 `. `'  Member of VIBE!AT & SPI, fellow of the Free Software Foundation Europe
   `-   NP: Penelope Swales: Lost & Found
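
PS: To make the bytes-vs-characters explanation concrete, here is a
minimal sketch. It is only an illustration, not something taken from the
HTML::Entities documentation; the variable names $bytes and $chars are
made up for the example, and the UTF-8 bytes of "à" are written as
escapes so the file itself stays pure ASCII and needs no "use utf8;".

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);
use HTML::Entities qw(encode_entities);

# "à" is the two bytes C3 A0 in UTF-8 (illustrative input).
my $bytes = "vis-\xC3\xA0-vis";

# Undecoded: each byte is taken as one Latin-1 character,
# i.e. U+00C3 (Atilde) followed by U+00A0 (no-break space).
print encode_entities($bytes), "\n";   # vis-&Atilde;&nbsp;-vis

# Decoded: the two bytes become the single character U+00E0.
my $chars = decode('UTF-8', $bytes);
print encode_entities($chars), "\n";   # vis-&agrave;-vis
```

For string literals in the script itself, "use utf8;" does that decoding
for you; for data read from files or STDIN, an explicit decode (or an
I/O layer) is needed instead.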