Re: UTF-8 BOM handling

2021-07-06 Thread Marc via Gcc-rust
Mark Wielaard writes:
> Hi,
>
> A rust source file can start with a UTF-8 BOM sequence (EF BB
> BF). This simply indicates that the file is encoded as UTF-8 (all rust
> input is interpreted as a sequence of Unicode code points encoded in
> UTF-8) so can be skipped before starting real lexing.
>

UTF-8 BOM handling

2021-07-05 Thread Mark Wielaard
Hi, A rust source file can start with a UTF-8 BOM sequence (EF BB BF). This simply indicates that the file is encoded as UTF-8 (all rust input is interpreted as a sequence of Unicode code points encoded in UTF-8) so can be skipped before starting real lexing. It isn't necessary to keep track of th
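
The skip described above could look roughly like the following. This is only a minimal sketch in Rust, not the code from the patch under discussion: the helper name `skip_utf8_bom` and the constant `UTF8_BOM` are made up for illustration, and the sketch assumes the whole source is already in memory as a byte slice.

```rust
/// The three bytes (EF BB BF) that encode U+FEFF in UTF-8.
const UTF8_BOM: [u8; 3] = [0xEF, 0xBB, 0xBF];

/// Hypothetical helper: returns the input without a leading UTF-8 BOM.
/// At most one BOM is dropped; input without a BOM comes back unchanged.
fn skip_utf8_bom(source: &[u8]) -> &[u8] {
    if source.starts_with(&UTF8_BOM) {
        // Step past the BOM so the lexer sees the first real byte.
        &source[UTF8_BOM.len()..]
    } else {
        source
    }
}
```

A lexer would call something like this once, before reading the first token. A partial prefix such as EF BB alone is not a BOM and is left for the lexer to handle as ordinary input.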