https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105544

--- Comment #10 from ibuclaw at gcc dot gnu.org ---
(In reply to Fabian Vogt from comment #9)
> (In reply to ibuclaw from comment #8)
> > (In reply to Fabian Vogt from comment #6)
> > > I had a quick debugging session: The DMD lexer code doesn't really care
> > > about the size of the buffer and instead runs until it encounters either
> > > a 0 or 0x1A byte. The stdin read loop in d_parse_file doesn't explicitly
> > > 0-terminate the buffer, which means that it works randomly...
> > > 
> > 
> > OK, so the suggestion would be to zero the padding at the end of the input
> > buffer then.
> >
> > --- a/gcc/d/d-lang.cc
> > +++ b/gcc/d/d-lang.cc
> > @@ -1072,6 +1072,10 @@ d_parse_file (void)
> >                                   global.params.doHdrGeneration);
> >       modules.push (m);
> >  
> > +     /* Zero the padding past the end of the buffer so the D lexer has a
> > +        sentinel.  The lexer only reads up to 4 bytes at a time.  */
> > +     memset (buffer + len, '\0', 16);
> > +
> >       /* Overwrite the source file for the module, the one created by
> >          Module::create would have a forced a `.d' suffix.  */
> >       m->src.length = len;
> 
> Yep, that should work. Though I wonder why 16B of padding and not just a
> single byte for the 0. FWICT the lexer reads a single byte at a time only
> (utf8_t is an unsigned char), so it should stop at the first 0.
> 
> The comment above explaining the padding mentions a "final '\n'" which
> should probably be adjusted with the change to \0.

The lexer scans spaces 4 bytes at a time (*cast(uint*)p == 0x20202020), so we
should zero at least 4 bytes past the end of the buffer to avoid asan
complaining about reading uninitialized memory.
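
To illustrate why a single terminating byte is not enough, here is a minimal
standalone sketch in C. It is not the DMD lexer or the d-lang.cc code: the
names make_padded_buffer, skip_spaces and PAD are hypothetical, and the scan
loop only approximates the 4-byte space check quoted above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical constant: the widest read the scanner below performs.  */
enum { PAD = 4 };

/* Copy LEN bytes of SRC into a fresh buffer and zero PAD bytes past the
   end, so that a 4-byte read starting at the terminator stays inside
   initialized memory.  The caller frees the result.  */
static char *
make_padded_buffer (const char *src, size_t len)
{
  char *buf = malloc (len + PAD);
  assert (buf != NULL);
  memcpy (buf, src, len);
  memset (buf + len, '\0', PAD);
  return buf;
}

/* Skip spaces four at a time, then one at a time.  Every word-sized
   check reads p[0..3], even when p already points at the terminator.  */
static const char *
skip_spaces (const char *p)
{
  for (;;)
    {
      uint32_t word;
      memcpy (&word, p, sizeof word);
      if (word != UINT32_C (0x20202020))
        break;
      p += 4;
    }
  while (*p == ' ')
    p++;
  return p;
}
```

With input consisting of exactly four spaces, the second word-sized check
starts at the terminator itself and reads three bytes beyond it. With only
one zeroed byte those three bytes would be whatever the allocator left there.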
