Hi,

A Rust source file can start with a UTF-8 BOM sequence (EF BB
BF). This simply indicates that the file is encoded as UTF-8 (all Rust
input is interpreted as a sequence of Unicode code points encoded in
UTF-8), so it can be skipped before real lexing starts.
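
To illustrate the idea only (this is not the code from the patches, and
the names UTF8_BOM and skip_utf8_bom are made up for this sketch),
skipping a leading BOM before lexing could look roughly like this:

```rust
/// The UTF-8 encoding of U+FEFF, the byte order mark.
const UTF8_BOM: [u8; 3] = [0xEF, 0xBB, 0xBF];

/// Hypothetical helper, for illustration only: returns the input with a
/// leading UTF-8 BOM removed. Input without a leading BOM is returned
/// unchanged, and a BOM anywhere else in the input is left alone.
fn skip_utf8_bom(input: &[u8]) -> &[u8] {
    input.strip_prefix(&UTF8_BOM[..]).unwrap_or(input)
}
```

A lexer built along these lines would call the helper once on the raw
file contents and tokenize the returned slice, so nothing after that
point needs to know whether a BOM was present.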

It isn't necessary to keep track of the BOM in the AST or HIR Crate
classes, so I removed the has_utf8bom flag.

Also included are a couple of simple tests to show we handle the BOM
correctly now.

 [PATCH 1/2] Handle UTF-8 BOM in lexer
 [PATCH 2/2] Remove has_utf8bom flag from AST and HIR Crate classes

Cheers,

Mark
-- 
Gcc-rust mailing list
Gcc-rust@gcc.gnu.org
https://gcc.gnu.org/mailman/listinfo/gcc-rust
