https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105544

--- Comment #11 from Fabian Vogt <fab...@ritter-vogt.de> ---
(In reply to ibuclaw from comment #10)
> (In reply to Fabian Vogt from comment #9)
> > (In reply to ibuclaw from comment #8)
> > > (In reply to Fabian Vogt from comment #6)
> > > > I had a quick debugging session: The DMD lexer code doesn't really care
> > > > about the size of the buffer and instead runs until it encounters either a 0
> > > > or 0x1A byte. The stdin read loop in d_parse_file doesn't explicitly
> > > > 0-terminate the buffer, which means that it works randomly...
> > > > 
> > > 
> > > OK, so the suggestion would be to zero the padding at the end of the input
> > > buffer then.
> > >
> > > --- a/gcc/d/d-lang.cc
> > > +++ b/gcc/d/d-lang.cc
> > > @@ -1072,6 +1072,10 @@ d_parse_file (void)
> > >                                 global.params.doHdrGeneration);
> > >     modules.push (m);
> > >  
> > > +   /* Zero the padding past the end of the buffer so the D lexer has a
> > > +      sentinel.  The lexer only reads up to 4 bytes at a time.  */
> > > +   memset (buffer + len, '\0', 16);
> > > +
> > >     /* Overwrite the source file for the module, the one created by
> > >        Module::create would have a forced a `.d' suffix.  */
> > >     m->src.length = len;
> > 
> > Yep, that should work. Though I wonder why 16B of padding and not just a
> > single byte for the 0. FWICT the lexer reads a single byte at a time only
> > (utf8_t is an unsigned char), so it should stop at the first 0.
> > 
> > The comment above explaining the padding mentions a "final '\n'" which
> > should probably be adjusted with the change to \0.
> 
> The lexer scans spaces 4 bytes at a time (*cast(uint*)p == 0x20202020). So
> we should zero at least 4 bytes to avoid asan complaining about reading
> uninitialized memory.

Indeed, that's the case with GCC 12. I've been looking at the code from GCC 11,
where the lexer doesn't scan 4 bytes at a time yet (and the frontend is still
in C++).
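
For illustration, here is a minimal standalone C++ sketch of the two behaviours
discussed in this thread: a scanner that stops only at a 0 or 0x1A sentinel, and
a space skipper that reads 4 bytes at a time. It is not the actual DMD lexer or
d_parse_file code; the function names and the kPadding constant are made up for
the example, and the 16 simply mirrors the size used in the proposed patch.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical padding size, mirroring the 16 bytes zeroed in the patch.
constexpr std::size_t kPadding = 16;

// Copy `len` bytes of source into a buffer followed by zeroed padding.
// The buffer is first filled with 0xAA to simulate stale memory, so the
// explicit memset is what provides the sentinel.
std::vector<unsigned char> make_padded_buffer(const char *src, std::size_t len)
{
    std::vector<unsigned char> buf(len + kPadding, 0xAA);
    std::memcpy(buf.data(), src, len);
    std::memset(buf.data() + len, 0, kPadding);
    return buf;
}

// Skip spaces 4 bytes at a time, then one byte at a time.  The 4-byte read
// may reach up to 3 bytes past the last source byte, which is why a single
// terminating 0 is not enough for this style of scanning.
const unsigned char *skip_spaces(const unsigned char *p)
{
    for (;;) {
        std::uint32_t word;
        std::memcpy(&word, p, sizeof word);  // alignment-safe 4-byte load
        if (word != 0x20202020u)
            break;
        p += 4;
    }
    while (*p == ' ')
        ++p;
    return p;
}

// Count bytes up to the first 0 or 0x1A sentinel, ignoring any known length.
std::size_t scan_to_sentinel(const unsigned char *p)
{
    const unsigned char *start = p;
    while (*p != 0 && *p != 0x1A)
        ++p;
    return static_cast<std::size_t>(p - start);
}
```

With an input such as "ab" followed by six spaces, the second 4-byte load in
skip_spaces covers two spaces plus two padding bytes, so those padding bytes
need to be initialized even though the byte-wise scan would stop at the first 0.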
