Hi,
A Rust source file can start with a UTF-8 BOM sequence (EF BB
BF). This simply indicates that the file is encoded as UTF-8 (all Rust
input is interpreted as a sequence of Unicode code points encoded in
UTF-8), so it can be skipped before real lexing starts.
It isn't necessary to keep track of th
The very first thing in a Rust source file might be the optional UTF-8
BOM. This is the 3 bytes 0xEF 0xBB 0xBF. They can simply be skipped,
since they just mark the file as UTF-8. Add some testcases to show we now
handle such files.
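
As a rough illustration of the skipping described above (a standalone
sketch, not the code in rust-lex.cc; the helper name skip_utf8_bom and
the use of std::string as the input buffer are assumptions made for this
example), the check could look like this:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not the gccrs implementation: returns the byte
// offset at which lexing should start, i.e. 3 if the input begins with
// the UTF-8 BOM and 0 otherwise.
std::size_t
skip_utf8_bom (const std::string &input)
{
  // The UTF-8 BOM is exactly the three bytes EF BB BF.
  static const unsigned char bom[] = {0xEF, 0xBB, 0xBF};

  // Inputs shorter than three bytes cannot contain a complete BOM.
  if (input.size () < 3)
    return 0;

  // Compare as unsigned char so the result does not depend on whether
  // plain char is signed on the host.
  for (std::size_t i = 0; i < 3; i++)
    if (static_cast<unsigned char> (input[i]) != bom[i])
      return 0;

  return 3;
}
```

Only a complete three-byte sequence at the very start of the input is
treated as a BOM here; a partial sequence is left for the lexer to handle
like any other input.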
---
gcc/rust/lex/rust-lex.cc | 13
The lexer deals with the UTF-8 BOM and the parser cannot detect
whether there is or isn't a BOM at the start of a file. The flag isn't
relevant or useful in the AST and HIR Crate classes.
---
gcc/rust/ast/rust-ast-full-test.cc | 3 ---
gcc/rust/ast/rust-ast.h | 11 +++