Hi,

I came across the UTF16 bug in an older version of binutils and noticed
it has been fixed in the current tree. However, I do not know if the
semantics of the fix are correct. I think it works on all little-endian
machines now, but it is likely to fail on big-endian machines.
Unfortunately, I don't have a BE machine to test with.

The reason is that the input file is converted to UTF16LE and then run
through the lexer. The previous bug occurred because the lexer expected
to work with integers and, when fed big-endian UTF16, treated 000A as
0A00. A big-endian machine working on UTF16LE input is going to have the
same problem, just in the other direction. Since the output binaries
should be UTF16LE regardless of host, I suppose the best solution is to
read the input byte by byte.
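
To illustrate what I mean by byte-reading, here is a minimal sketch (not
the actual patch; the names are hypothetical, not from mclex.c) of
assembling a UTF-16LE code unit from individual bytes, which gives the
same result on LE and BE hosts:

#include <stdio.h>

/* Return the next UTF-16LE code unit, or -1 on EOF/truncation.  */
static int
read_utf16le_unit (FILE *fp)
{
  int lo = getc (fp);
  int hi = getc (fp);

  if (lo == EOF || hi == EOF)
    return -1;

  /* Assemble from bytes; never reinterpret a host integer.  */
  return lo | (hi << 8);
}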

I've included a patch, but I can't test it on a big-endian machine; it
seems to work on a little-endian one. A better approach might be to
convert the file to UTF8, lex and parse it in that form, and only emit
UTF16 strings on output, but that would require some redesign of the
parser and lexer.
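
For the output side of that alternative, something like the following
sketch would do (again hypothetical names, not binutils code): the
lexer and parser would work on UTF8 internally, and only the writer
would serialize code points as UTF-16LE bytes, so host endianness never
enters the picture.

#include <stdio.h>

/* Write one code point (BMP or supplementary) as UTF-16LE bytes.  */
static void
put_utf16le (unsigned long cp, FILE *fp)
{
  if (cp >= 0x10000)
    {
      /* Encode as a surrogate pair.  */
      unsigned long v = cp - 0x10000;
      unsigned int hi = 0xD800 | (v >> 10);
      unsigned int lo = 0xDC00 | (v & 0x3FF);
      putc (hi & 0xFF, fp);
      putc (hi >> 8, fp);
      putc (lo & 0xFF, fp);
      putc (lo >> 8, fp);
    }
  else
    {
      putc (cp & 0xFF, fp);
      putc ((cp >> 8) & 0xFF, fp);
    }
}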

-- 
John Duncan

Attachment: mclex-c-endian.patch
Description: Binary data

