Hi,

A Rust source file can start with a UTF-8 BOM sequence (EF BB BF). This simply indicates that the file is encoded as UTF-8. Since all Rust input is interpreted as a sequence of Unicode code points encoded in UTF-8 anyway, the BOM carries no extra information and can be skipped before real lexing starts.
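
To illustrate the idea only (this is not the code from the patches, which change the gccrs lexer itself), skipping the BOM amounts to dropping the three bytes EF BB BF when they appear at the very start of the input. Here is a minimal Rust sketch; the names `UTF8_BOM` and `skip_utf8_bom` are made up for this example:

```rust
/// The UTF-8 encoding of U+FEFF, the byte order mark.
const UTF8_BOM: [u8; 3] = [0xEF, 0xBB, 0xBF];

/// Hypothetical helper: return the input without a leading UTF-8 BOM.
/// Input that does not start with a BOM is returned unchanged.
fn skip_utf8_bom(input: &[u8]) -> &[u8] {
    input.strip_prefix(&UTF8_BOM).unwrap_or(input)
}

fn main() {
    let with_bom: &[u8] = b"\xEF\xBB\xBFfn main() {}";
    let without_bom: &[u8] = b"fn main() {}";

    // Both inputs hand the same bytes to the lexer.
    assert_eq!(skip_utf8_bom(with_bom), without_bom);
    assert_eq!(skip_utf8_bom(without_bom), without_bom);

    // Only a leading BOM is skipped; the same bytes later in the file are kept.
    let bom_inside: &[u8] = b"a\xEF\xBB\xBF";
    assert_eq!(skip_utf8_bom(bom_inside), bom_inside);
}
```

Because the check only looks at the start of the input, nothing needs to be remembered about the BOM once it has been skipped.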

It isn't necessary to keep track of the BOM in the AST or HIR Crate classes, so I removed the has_utf8bom flag. Also included are a couple of simple tests to show that we now handle the BOM correctly.

[PATCH 1/2] Handle UTF-8 BOM in lexer
[PATCH 2/2] Remove has_utf8bom flag from AST and HIR Crate classes

Cheers,

Mark

--
Gcc-rust mailing list
Gcc-rust@gcc.gnu.org
https://gcc.gnu.org/mailman/listinfo/gcc-rust