O errors

matthijs at stdin dot nl Wed, 23 Sep 2020 09:01:24 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80753


--- Comment #4 from Matthijs Kooijman <matthijs at stdin dot nl> ---
I looked a bit at the source, and it seems the the problem is (not
surprisingly) that `__has_include` causes the header filename to be put into
the cache, and an error message is only generated when the entire include path
has been processed without resolving the entry from the cache (essentially this
means that an error is only triggered when the entry is put into the cache,
*except* when it happens as part of `__has_include`).

Relevant function is _cpp_find_file:

https://github.com/gcc-mirror/gcc/blob/da13b7737662da11f8fefb28eaf4ed7c50c51767/libcpp/files.c#L506

This is called with kind = _cpp_FFK_HAS_INCLUDE for `__has_include` which
prevents an error here:

https://github.com/gcc-mirror/gcc/blob/da13b7737662da11f8fefb28eaf4ed7c50c51767/libcpp/files.c#L591-L592

And the cache entry is immediately returned on subsequent calls here:

https://github.com/gcc-mirror/gcc/blob/da13b7737662da11f8fefb28eaf4ed7c50c51767/libcpp/files.c#L523-L526

It seems there are continuous lookups in the cache and it took me a while to
realize how the cache actually works (I initially thought that maybe files that
were *not* found) were not actually put in the cache, but AFAIU the
`pfile->file_hash` cache works like this:
 - Hash slots are indexed by filename
 - Each slot contains a linked list of entries.
 - Each entry contains the directory lookups start from. This can be the
start_dir, but also the first quote_include or bracket_include dir. Iow, the
cache entry is only valid for a lookup that starts at that directory, or has
progressed to that directory, to allow a "" include to prime the cache for a <>
include as well.
 - Once a file is found, or the search path is exhausted, the result is stored
in the cache for start_dir and for "" and <> includes if the start dir for
those has been passed.

Anyway, this means that a failed lookup based from __has_include() will cause
the failed result to be put into the cache for one or more directories without
emitting an error and always be returned for subsequent includes without
emitting an error.


An obvious fix would be to simply not put the result in the cache for
_cpp_FFK_HAS_INCLUDE, but that would be a missed cache opportunity.

An alternative would be to add a boolean "error_emitted" to each _cpp_file* in
the cache (cannot add it to the cache entry itself, since the same _cpp_file*
might end up in different cache entries), that defaults to false and is set to
true by open_file_failed. When kind is not _cpp_FFK_HAS_INCLUDE, and returning
a cache entry that has "error_emitted" set to false and has an errno call
open_file_failed to emit an error. This would require that open_file_failed to
emit the right output in this case (i.e. the cached _cpp_file* should not rely
on the context in which it was generated), but AFAICS this would be the case.

I'm not quite familiar with building and patching gcc (and also really need to
get this yak hair back to my actual work), so I probably won't be providing a
patch here. Maybe my above analysis enables someone else to do so? :-)

[Bug preprocessor/80753] __has_include and __has_include_next taints subsequent I/O errors

Reply via email to