https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105544
--- Comment #10 from ibuclaw at gcc dot gnu.org ---
(In reply to Fabian Vogt from comment #9)
> (In reply to ibuclaw from comment #8)
> > (In reply to Fabian Vogt from comment #6)
> > > I had a quick debugging session: The DMD lexer code doesn't really care
> > > about the size of the buffer and instead runs until it encounters either
> > > a 0 or 0x1A byte. The stdin read loop in d_parse_file doesn't explicitly
> > > 0-terminate the buffer, which means that it works randomly...
> > >
> >
> > OK, so the suggestion would be to zero the padding at the end of the input
> > buffer then.
> >
> > --- a/gcc/d/d-lang.cc
> > +++ b/gcc/d/d-lang.cc
> > @@ -1072,6 +1072,10 @@ d_parse_file (void)
> >                                   global.params.doHdrGeneration);
> >           modules.push (m);
> >
> > +         /* Zero the padding past the end of the buffer so the D lexer has a
> > +            sentinel.  The lexer only reads up to 4 bytes at a time.  */
> > +         memset (buffer + len, '\0', 16);
> > +
> >           /* Overwrite the source file for the module, the one created by
> >              Module::create would have a forced a `.d' suffix.  */
> >           m->src.length = len;
>
> Yep, that should work. Though I wonder why 16B of padding and not just a
> single byte for the 0. FWICT the lexer reads a single byte at a time only
> (utf8_t is an unsigned char), so it should stop at the first 0.
>
> The comment above explaining the padding mentions a "final '\n'" which
> should probably be adjusted with the change to \0.

The lexer scans spaces 4 bytes at a time (*cast(uint*)p == 0x20202020). So
should zero at least 4 bytes to avoid asan complaining about reading
uninitialized memory.
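
To illustrate why a single 0 byte is not enough, here is a minimal C++ sketch of a sentinel-terminated scanner with a 4-bytes-at-a-time space check. It is not the DMD lexer: make_padded_buffer, skip_spaces and count_words are hypothetical names invented for this example, and the token rule (runs of non-space bytes) is a simplification. Only the 0 / 0x1A terminators, the 0x20202020 comparison and the 16-byte padding come from the discussion above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: copy `src` into a buffer followed by `padding` zero
// bytes, mirroring `memset (buffer + len, '\0', 16)` from the patch.
std::vector<unsigned char> make_padded_buffer(const std::string &src,
                                              std::size_t padding = 16)
{
    // The 4-byte read in skip_spaces needs at least 4 readable bytes
    // starting at the sentinel position.
    assert(padding >= 4);
    std::vector<unsigned char> buf(src.size() + padding);
    std::memcpy(buf.data(), src.data(), src.size());
    std::memset(buf.data() + src.size(), 0, padding);
    return buf;
}

// Skip spaces, first four at a time, then one at a time.  The 4-byte load
// can touch up to 3 bytes past the first 0 byte, which is why the padding
// has to be zeroed and not just the first byte after the input.
const unsigned char *skip_spaces(const unsigned char *p)
{
    for (;;)
    {
        std::uint32_t word;
        std::memcpy(&word, p, sizeof word); // reads p[0] .. p[3]
        if (word != 0x20202020u)
            break;
        p += 4;
    }
    while (*p == ' ')
        ++p;
    return p;
}

// Count runs of non-space bytes.  There is no length parameter: scanning
// stops only at a 0 or 0x1A byte, as described in comment #6.
std::size_t count_words(const unsigned char *p)
{
    std::size_t n = 0;
    for (;;)
    {
        p = skip_spaces(p);
        if (*p == 0 || *p == 0x1A)
            return n;
        ++n;
        while (*p != ' ' && *p != 0 && *p != 0x1A)
            ++p;
    }
}
```

With a buffer built by make_padded_buffer, count_words always finds its terminator inside initialized memory. Without the zeroed padding, both the terminator check and the 4-byte load depend on whatever bytes happen to follow the input.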