Control: tag -1 confirmed

Hi Dominique,

Quoting Dominique Dumont (2020-02-20 17:15:29)
> While packaging nqp, I've noticed a discrepancy in licensecheck output:
>
> licensecheck correctly reports the absence of information when
> scanning nqp/115-nums.t file from nqp directory:
>
> $ licensecheck --encoding utf8 --copyright --machine --recursive nqp | grep 115
> nqp/115-nums.t UNKNOWN *No copyright*
>
> licensecheck correctly reports garbage when scanning nqp/115-nums.t
> file from current directory:
>
> $ licensecheck --encoding utf8 --copyright --machine --recursive . | grep 115
> ./nqp/115-nums.t UNKNOWN ೨೪, ೫e-೩೨೪, '6e-324 denormal
> equates to 5e-324 denormal (Uni)'); / ೨೪, ೫e-೩೨೪, '5e-324
> denormal equates to 5e-324 denormal (Uni)'); / ೨೪, à³§e-೩೨೩,
> '9e-324 denormal is 1e-323 (Uni)'); / ೨೪, ೦e೦, 'denormal 5e-324 is
> recognized and is not 0 (Uni)'); / ೨೪, ೦e೦, '2e-324 denormal is 0e0
> (Uni)'); / e-೩೨೪, ೫e-೩೨೪, '2e-324 denormal equates to 5e-324
> denormal (Uni)');
>
> The mis-decoded file contains © character hence the mojibake garbage.
>
> I would expect --encoding utf8 option to be used to read all files.

Thanks for an excellently framed bug report!

The cause for the difference in output is revealed in --verbose mode:

$ licensecheck --encoding utf8 --copyright --machine --recursive --verbose . | grep '115\|cannot be read'
file moar/05-decoder.t cannot be read with App::Licensecheck=HASH(0x563009ec7500)->encoding; encoding, will try latin-1:
----- nqp/115-nums.t header -----
./nqp/115-nums.t UNKNOWN ೨೪, ೫e-೩೨೪, '6e-324 denormal equates to 5e-324 denormal (Uni)'); / ೨೪, ೫e-೩೨೪, '5e-324 denormal equates to 5e-324 denormal (Uni)'); / ೨೪, à³§e-೩೨೩, '9e-324 denormal is 1e-323 (Uni)'); / ೨೪, ೦e೦, 'denormal 5e-324 is recognized and is not 0 (Uni)'); / ೨೪, ೦e೦, '2e-324 denormal is 0e0 (Uni)'); / e-೩೨೪, ೫e-೩೨೪, '2e-324 denormal equates to 5e-324 denormal (Uni)');

Licensecheck chokes on moar/05-decoder.t and re-reads it as latin-1.
...but then licensecheck _continues_ to read all subsequent files as latin-1, which is wrong.

(Enabling --verbose also reveals that licensecheck wrongly treats Encode objects as strings, as seen with the HASH string in the warning message.)

 - Jonas

-- 
 * Jonas Smedegaard - idealist & Internet-arkitekt
 * Tlf.: +45 40843136  Website: http://dr.jones.dk/

 [x] quote me freely  [ ] ask before reusing  [ ] keep private
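
As a rough illustration of the failure mode described above, here is a minimal Python sketch. It is not licensecheck's actual Perl code: the function names and the sample byte strings are hypothetical, and it only models the suspected pattern, where a fallback encoding chosen for one unreadable file leaks into the handling of every later file.

```python
# Hypothetical model of the suspected bug; not licensecheck's real code.

def decode_all_sticky(blobs, encoding="utf-8"):
    """Decode each blob, but let a latin-1 fallback persist (the suspected bug)."""
    current = encoding
    results = []
    for raw in blobs:
        try:
            results.append(raw.decode(current))
        except UnicodeDecodeError:
            # The fallback overwrites shared state, so every later
            # blob is decoded as latin-1 as well.
            current = "latin-1"
            results.append(raw.decode(current))
    return results


def decode_all_per_file(blobs, encoding="utf-8"):
    """Decode each blob, scoping the latin-1 fallback to the failing blob only."""
    results = []
    for raw in blobs:
        try:
            results.append(raw.decode(encoding))
        except UnicodeDecodeError:
            # Fallback applies to this blob only; the requested
            # encoding is tried again for the next one.
            results.append(raw.decode("latin-1"))
    return results


# First blob is not valid UTF-8 (stand-in for moar/05-decoder.t);
# second blob is valid UTF-8 with non-ASCII digits (stand-in for nqp/115-nums.t).
blobs = [b"\xff\xfe not valid utf-8", "೫e-೩೨೪".encode("utf-8")]

sticky = decode_all_sticky(blobs)
per_file = decode_all_per_file(blobs)
```

In this sketch the second blob comes out as mojibake from `decode_all_sticky` but intact from `decode_all_per_file`, which matches the difference seen between scanning `.` and scanning `nqp` alone.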