Re: On the subject of module consumer diagnostics.

2024-09-06 Thread Ben Boeckel via Gcc
On Tue, Sep 03, 2024 at 16:53:43 +0100, Iain Sandoe wrote:
> I think that might be a misunderstanding on the part of the author;
> AFAIU both GCC and MSVC _do_ require access to the sources at BMI
> consume-time to give decent diagnostics.   I think that there might be
> confusion because the compilation would suceed on those toolchains
> without the sources - but with poorer diagnostic quality?

Does this have (additional) implications for caching tools and modules?
They cache diagnostic output, but if these other paths showing up or
disappearing affects the output, the cache key should incorporate that
as well. Should there be a way for such tools to get this information
somehow? Ideally the paths would only matter if reported diagnostics
*would* look at the files, not just "there's a BMI that mentions a
source path X" kind of inspection.

--Ben


Re: #pragma once behavior

2024-09-06 Thread Ben Boeckel via Gcc
On Fri, Sep 06, 2024 at 00:03:23 -0500, Jeremy Rifkin wrote:
> Hello,
> 
> I'm looking at #pragma once behavior among the major C/C++ compilers as
> part of a proposal paper for standardizing #pragma once. (This is
> apparently a very controversial topic)
> 
> To put my question up-front: Would GCC ever be open to altering its
> #pragma once behavior to bring it more in-line with behavior from other
> compilers and possibly more in-line with what users expect?
> 
> To elaborate more:
> 
> Design decisions for #pragma once essentially boil down to a file-based
> definitions vs a content-based definition of "same file".
> 
> A file-based definition is easier to reason about and more in-line with
> what users expect, however, distinct copies of headers can't be handled
> and multiple mount points are problematic.
> 
> A content-based definition works for distinct copies, multiple mount
> points, and is completely sufficient 99% of the time, however, it could
> potentially break in hard-to-debug ways in a few notable cases (more
> information later).
> 
> Currently the three major C/C++ compilers treat #pragma once very differently:
> - GCC uses file mtime + file contents
> - Clang uses inodes
> - MSVC uses file path
> 
> None of the major compilers have documented their #pragma once semantics.
> 
> In practice all three of these approaches work pretty well most of the
> time (which is why people feel comfortable using #pragma once). However,
> they can each break in their own ways.
> 
> As mentioned earlier, clang and MSVC's file-based definitions of "same
> file" break for multiple mount points and multiple copies of the same
> header. MSVC's approach breaks for symbolic links and hard links.
> 
> GCC's hybrid approach can break in surprising ways. I have three
> examples to share:
> 
> Example 1:
> 
> Consider a scenario such as:
> 
> usr/
>   include/
> library_a/
>   library_main.hpp
>   foo.hpp
> library_b/
>   library_main.hpp
>   foo.hpp
> src/
>   main.cpp
> 
> main.cpp:
> #include "library_a/library_main.hpp"
> #include "library_b/library_main.hpp"
> 
> And both library_main.hpp's have:
> #pragma once
> #include "foo.hpp"

Could a "uses the relative search path" fact be used to mix into the
file's identity? This way the `once` key would see "this content looked
for things in directory `library_a`" and would see that
`library_b/library_main.hpp`, despite the same content (and mtime) is
actually a different context and actually perform the inclusions?

Of course, this fails if `#include "../common/foo.hpp"` is used in each
location as that *would* then want to elide the second inclusion. I
don't know how this problem is avoided without actually reading the
contents again. But the "I read this file" can remember what relative
paths were searched (since the contents are the same at least).

> Example 2:
> 
> namespace v1 {
> #include "library_v1.hpp"
> }
> namespace v2 {
> #include "library_v2.hpp"
> }
> 
> Where both library headers include their own copy of a shared header
> using #pragma once.

Again, the context of the inclusion matters, so "is wrapped in a scope"
can modify the "onceness" (`extern "C"` is probably the more common
instance).

> Example 3:
> 
> usr/
>   include/
> library/
>   library.hpp
>   vendored-dependency.hpp
> src/
>   main.cpp
>   vendored-dependency.hpp
> 
> main.cpp:
> #include "vendored-dependency.hpp"
> #include 
> 
> library.hpp:
> #pragma once
> #include "vendored-dependency.hpp"

This is basically the same as Example 1 as far as context goes.

Note that context cannot include `#define` state because `#once` is
defined to be the first thing in the file and a file that is intended to
be included multiple times (e.g., Boost.PP shenanigans) in different
states cannot, in good faith, use `#once`.

Hrm…though if we are doing `otherdir/samecontent`, the different
preprocessor state *might* change that "what relative files did we look
for?" state… Nothing is easy :( .

--Ben


Re: On the subject of module consumer diagnostics.

2024-09-06 Thread David Malcolm via Gcc
On Fri, 2024-09-06 at 08:44 -0400, Ben Boeckel via Gcc wrote:
> On Tue, Sep 03, 2024 at 16:53:43 +0100, Iain Sandoe wrote:
> > I think that might be a misunderstanding on the part of the author;
> > AFAIU both GCC and MSVC _do_ require access to the sources at BMI
> > consume-time to give decent diagnostics.   I think that there might
> > be
> > confusion because the compilation would suceed on those toolchains
> > without the sources - but with poorer diagnostic quality?
> 
> Does this have (additional) implications for caching tools and
> modules?
> They cache diagnostic output, but if these other paths showing up or
> disappearing affects the output, the cache key should incorporate
> that
> as well.

What kinds of caching tools are you thinking of?

I'm curious about caching of diagnostics, and how the diagnostics are
represented in the cache.

FWIW, SARIF has a way of storing the source associated with a
diagnostic (and/or hashes of the source), and GCC's SARIF output uses
this to capture the source of any file referred to by path by a
diagnostic in the SARIF output (but we don't yet capture hashes of
source).

Dave

>  Should there be a way for such tools to get this information
> somehow? Ideally the paths would only matter if reported diagnostics
> *would* look at the files, not just "there's a BMI that mentions a
> source path X" kind of inspection.
> 
> --Ben
> 



Re: On the subject of module consumer diagnostics.

2024-09-06 Thread Ben Boeckel via Gcc
On Fri, Sep 06, 2024 at 09:30:26 -0400, David Malcolm wrote:
> On Fri, 2024-09-06 at 08:44 -0400, Ben Boeckel via Gcc wrote:
> > Does this have (additional) implications for caching tools and
> > modules?
> > They cache diagnostic output, but if these other paths showing up or
> > disappearing affects the output, the cache key should incorporate
> > that
> > as well.
> 
> What kinds of caching tools are you thinking of?

`ccache`, `sccache`, etc. These tools try to detect if the compilation
would be the same and place the object in its output location and report
the cached output on stdout/stderr as performed in the original compile
so that it acts "just like the compiler"'s execution.

> I'm curious about caching of diagnostics, and how the diagnostics are
> represented in the cache.

I know `sccache` just stores it as a text blob; `ccache` is probably the
same, but I haven't been in its code myself to know.

--Ben


Re: On the subject of module consumer diagnostics.

2024-09-06 Thread Jason Merrill via Gcc

On 9/6/24 9:41 AM, Ben Boeckel wrote:

On Fri, Sep 06, 2024 at 09:30:26 -0400, David Malcolm wrote:

On Fri, 2024-09-06 at 08:44 -0400, Ben Boeckel via Gcc wrote:

Does this have (additional) implications for caching tools and
modules?
They cache diagnostic output, but if these other paths showing up or
disappearing affects the output, the cache key should incorporate
that
as well.


What kinds of caching tools are you thinking of?


`ccache`, `sccache`, etc. These tools try to detect if the compilation
would be the same and place the object in its output location and report
the cached output on stdout/stderr as performed in the original compile
so that it acts "just like the compiler"'s execution.


I'm curious about caching of diagnostics, and how the diagnostics are
represented in the cache.


I know `sccache` just stores it as a text blob; `ccache` is probably the
same, but I haven't been in its code myself to know.


Certainly these tools are complicated when the preprocessor output isn't 
enough to reproduce the compilation.  It might be nice to have some 
combined preprocessed form of the primary translation unit and any 
interface units it depends on...


Jason



Proposed new pass to optimise mode register assignments

2024-09-06 Thread Andrew Carlotti via Gcc
Hi,

I'm working on optimising assignments to the AArch64 Floating-point Mode
Register (FPMR), as part of our FP8 enablement work.  Claudio has already
implemented FPMR as a hard register, with the intention that FP8 intrinsic
functions will compile to a combination of an fpmr register set, followed by an
FP8 operation that takes fpmr as an input operand.

It would clearly be inefficient to retain an explicit FPMR assignment prior to
each FP8 instruction (especially in the common case where every assignment uses
the same FPMR value).  I think the best way to optimise this would be to
implement a new pass that can optimise assignments to individual hard registers.

There are a number of existing passes that do similar optimisations, but which
I believe are unsuitable for this scenario for various reasons.  For example:

- cse1 can already optimise FPMR assignments within an extended basic block,
  but can't handle broader optimisations.
- pre (in gcse.c) doesn't work with assigning constant values, which would miss
  many potential usages.  It also has limits on how far code can be moved,
  based around ideas of register pressure that don't apply to the context of a
  single hard register that shouldn't be used by the register allocator for
  anything else.  Additionally, it doesn't run at -Os.
- hoist (also using gcse.c) only handles constant values, and only runs when
  optimising for size.  It also has the rest of the issues that pre does.
- mode_sw only handles a small finite set of modes.  The mode requirements are
  determined solely by the instructions that require the specific mode, so mode
  switches don't depend on the output of previous instructions.


My intention would be for the new pass to reuse ideas, and hopefully some of
the existing code, from the mode-switching and gcse passes.  In particular,
gcse.c (or it's dependencies) has code that could identify when values assigned
to the FPMR are known to be the same (although we may not need the full CSE
capabilities of gcse.c), and mode-switching.cc knows how to globally optimise
mdoe assignments (and unlike gcse.c, doesn't use cautious heuristics to avoid
excessively increasing register pressure).

Initially the new pass would only apply to the AArch64 FPMR register, but in
future it could also be used for other hard registers with similar properties.

Does anyone have any comments on this approach, before I start writing any
code?

Thanks,
Andrew




Re: Proposed CHOST change for the 64bit time_t transition

2024-09-06 Thread Arsen Arsenović via Gcc
Paul Eggert  writes:

> One possible improvement would be to append "t32" if you want 32-bit time_t,
> instead of appending "t64" for 64-bit time_t. That way, people wouldn't be
> stuck with appending that confusing "t64" for the foreseeable future, and only
> specialists concerned with 32-bit time_t would need to know about the issue.

But that'd change semantics in non-obvious ways.  The intention behind
this suggestion is to have a mechanism to communicate to packages and
the toolchain alike that "yes, this system is Y2038-proof".  There is
currently no mechanism to do that.  There isn't even a mechanism to
guess based on your dependencies whether you should also enable LFS and
T64 (and there can't be a general one - you'd need to detect what
libraries are doing what if they have time_t or other system integers on
ABI boundaries, which is not generally possible).  Not that the latter
would suffice - even if we changed all packages we can to use such a
mechanism, there would be plenty of packages that don't (think of all
the hand-rolled makefiles..).

An alternative that I pondered was to teach the linker about some notion
of "compatibility strings" that it would compare and reject if
different, plus teaching the compiler how to emit those, plus teaching
glibc to tell the compiler to emit those..  We could have key-value
pairs in some section.  For each key K, we could have the linker check
that, for each (shared or otherwise) object either does not contain K or
contains K with the same value as all the other ones, and produce an
error otherwise.  On the resulting object, the KV pairs would be the
union of all KV pairs of all constituent objects.

... but this is for i?86, a CPU family I haven't used in ~15 years (and
I suspect many also have not..), and there are other things eating my
time.  And it'd still require a world rebuild.

> Personally, I hope backward-compatibility concerns don't require this sort of
> thing. I'd rather just switch, as Debian has.

The "status quo" of some packages enabling it of their own volition, and
some not, leads to various subtle breakages (example:
https://bugs.gentoo.org/828001).  I think switching like that would not
be much different.

I do not know what approach Debian took, but if it is one of altering
the toolchain, then this is a sure way to introduce subtle divergences
between distros (this is why I've suggested we CC the GCC and binutils
MLs); if it is one of teaching debhelper (is that the right tool?  not
sure) about it, then this will break user-compiled packages (so,
./configure && make && make install, or moral equivalent).  If it is to
alter libc, then, can we do libc.so.7?  ;)

The only actually solid approach I see today is to /somehow/ communicate
to the system to not use 32-bit time, ever (and consequently, to enable
LFS).  I think that the "least effort" path to do that is through the
tuple.

There's precedent for this also, AFAIK, in the 32-bit ARM world
(gnueabi/gnueabihf, whatever that means).

config.guess would need to be altered a little bit.  My preference is
for [[ $os = *-*-gnu*t64* ]] informing glibc to completely ignore
_FILE_OFFSET_BITS, _LARGEFILE_SOURCE, _LARGEFILE64_SOURCE, and
_TIME_BITS and just presume 64 for all of those system integers.  This
means that config.guess could undef those (in case a toolchain sets
those) and include some libc file, then check for sizeof (time_t), or
just have glibc define something if on a gnut64 target.

> I felt the same way about the 64-bit off_t back in the 1990s. It was obvious 
> to
> me even at the time that we would have been significantly better off making
> off_t 64-bit, while keeping 32-bit off_t in the ABI for backward 
> compatibility;
> this is what NetBSD did with time_t in 2012. Although I realize others felt
> differently, I never fully understood their concerns.

That is history now I fear; I also wish that time_t was made
64-bit a long ago ;)

> And here I am, three decades later, still having to make changes[1] to
> Autoconf's AC_SYS_LARGEFILE macro to continue to support that 30-year-old 
> off_t3
> mistake, and now with 64-bit time_t interacting with 64-off_t in 
> non-orthogonal
> ways.

Indeed, and the "best" part is that, whatever you do in autoconf, unless
a program exists in isolation only interfacing with libc, it will break
some consumer (or will be broken by some dependency) because there's no
mechanism to signal the time_t size across ABI boundaries.
-- 
Arsen Arsenović


signature.asc
Description: PGP signature


gcc-13-20240906 is now available

2024-09-06 Thread GCC Administrator via Gcc
Snapshot gcc-13-20240906 is now available on
  https://gcc.gnu.org/pub/gcc/snapshots/13-20240906/
and on various mirrors, see https://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 13 git branch
with the following options: git://gcc.gnu.org/git/gcc.git branch 
releases/gcc-13 revision 8ad345a8387a5449eee7223b9b3ab8d68a0e6c2e

You'll find:

 gcc-13-20240906.tar.xz   Complete GCC

  SHA256=f64f8c3e3117250ff6f88926ca4b300b5f09cc5d3c700300e12a58032a02cbef
  SHA1=5f9ea8677de1ac0cb1b6cf638112e69abfa485f9

Diffs from 13-20240830 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-13
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: Proposed CHOST change for the 64bit time_t transition

2024-09-06 Thread Bruno Haible
Paul Eggert wrote:
> I'd rather just switch, as Debian has.

I'd go one step further, and not only
  make the ABI transition without changing the canonical triplet,
but also
  make gcc and clang define -D_FILE_OFFSET_BITS=64 -D_TIME_BITS=64
  among their predefines.

Rationale:

  * We want that a user of a distro with the new ABI can build
packages in the usual way:
  - ./configure; make; make install (when using Autoconf), or
  - make; make install  (when there is just a Makefile).
This *requires* that gcc and clang get patched, as indicated
above.
(Only changing Debian-specific files or variables won't do it.)

  * Once this has been done, is there a need for a triplet change?
Not in the toolchain, and not in the packages either.
Needs that have been mentioned in [1][2]:

  - Users would like to know in which ABI they / their distro lives.
This can be done through a property in /etc/os-release.

  - "risks incompatibility with other distributions" [2]
What is the problem? Do we expect users to build binaries
on 32-bit distro X and try to run them on 32-bit distro Y?
Or do we expect binary package distributors (like Mozilla,
videolan.org) to do so?
It was my impression that this approach is doomed anyway,
because so many shared libraries have different major version
in distro X than in distro Y.
And that such binary package distributors use flatpak, AppImage, etc.
precisely to get out of this dilemma.

  - Building gcc and glibc might need some particular options.
Such options can be documented without requiring a new triplet.

References:
[1] https://wiki.debian.org/ReleaseGoals/64bit-time
[2] https://wiki.gentoo.org/wiki/Project:Toolchain/time64_migration





Re: Proposed CHOST change for the 64bit time_t transition

2024-09-06 Thread Bruno Haible
Arsen Arsenović wrote:
> An alternative that I pondered was to teach the linker about some notion
> of "compatibility strings" that it would compare and reject if
> different, plus teaching the compiler how to emit those, plus teaching
> glibc to tell the compiler to emit those..  We could have key-value
> pairs in some section.  For each key K, we could have the linker check
> that, for each (shared or otherwise) object either does not contain K or
> contains K with the same value as all the other ones, and produce an
> error otherwise.  On the resulting object, the KV pairs would be the
> union of all KV pairs of all constituent objects.

This sounds much like the arm eabi attributes: If a .s file does not
start with
.eabi_attribute 20, 1
.eabi_attribute 21, 1
.eabi_attribute 23, 3
.eabi_attribute 24, 1
.eabi_attribute 25, 1
.eabi_attribute 26, 2
.eabi_attribute 30, 2
.eabi_attribute 34, 0
.eabi_attribute 18, 4
the resulting .o file cannot be linked with other .o files on the system.

Is it a hassle even for packages that don't use time_t of off_t (such as
GNU libffcall or libffi).

Yes, it would be useful to have a way to have the linker warn if a binary
that depends on 32-bit time_t and a binary that depends on 64-bit time_t
get linked together. But PLEASE implement this in a way that is a no-op
when time_t is not used by either of the two binaries.

Bruno





Re: #pragma once behavior

2024-09-06 Thread Jeremy Rifkin
Thanks Andrew, I appreciate the context and links. It looks like the
prior implementation failed to handle links due to being based on file
path, given cpp_simplify_pathname. Do you have thoughts on the use if
device ID + inode as a way to also accommodate symbolic links and hard
links without the fickleness of mtime?

Cheers,
Jeremy

On Sep 6 2024, at 12:25 am, Andrew Pinski  wrote:

> On Thu, Sep 5, 2024 at 10:04 PM Jeremy Rifkin  wrote:
>>  
>> Hello,
>>  
>> I'm looking at #pragma once behavior among the major C/C++ compilers as
>> part of a proposal paper for standardizing #pragma once. (This is
>> apparently a very controversial topic)
>>  
>> To put my question up-front: Would GCC ever be open to altering its
>> #pragma once behavior to bring it more in-line with behavior from other
>> compilers and possibly more in-line with what users expect?
>>  
>> To elaborate more:
>>  
>> Design decisions for #pragma once essentially boil down to a file-based
>> definitions vs a content-based definition of "same file".
>>  
>> A file-based definition is easier to reason about and more in-line with
>> what users expect, however, distinct copies of headers can't be handled
>> and multiple mount points are problematic.
>>  
>> A content-based definition works for distinct copies, multiple mount
>> points, and is completely sufficient 99% of the time, however, it could
>> potentially break in hard-to-debug ways in a few notable cases (more
>> information later).
>>  
>> Currently the three major C/C++ compilers treat #pragma once very 
>> differently:
>> - GCC uses file mtime + file contents
>> - Clang uses inodes
>> - MSVC uses file path
>  
> See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52566#c2 .
> Note this was changed specifically in GCC 3.4 to fix the issue around
> symlinks and hard links.
> See https://gcc.gnu.org/pipermail/gcc-patches/2003-July/111203.html
> for more information on the fixes.
>  
> In fact `#pragma once` was deprecated before GCC 3.4 because it would
> do incorrectly what clang and MSVC are doing and that was considered
> wrong.
> So GCC behavior has been this way before clang was even written.
>  
> Thanks,
> Andrew
>  
>>  
>> None of the major compilers have documented their #pragma once semantics.
>>  
>> In practice all three of these approaches work pretty well most of the
>> time (which is why people feel comfortable using #pragma once). However,
>> they can each break in their own ways.
>>  
>> As mentioned earlier, clang and MSVC's file-based definitions of "same
>> file" break for multiple mount points and multiple copies of the same
>> header. MSVC's approach breaks for symbolic links and hard links.
>>  
>> GCC's hybrid approach can break in surprising ways. I have three
>> examples to share:
>>  
>> Example 1:
>>  
>> Consider a scenario such as:
>>  
>> usr/
>>   include/
>> library_a/
>>   library_main.hpp
>>   foo.hpp
>> library_b/
>>   library_main.hpp
>>   foo.hpp
>> src/
>>   main.cpp
>>  
>> main.cpp:
>> #include "library_a/library_main.hpp"
>> #include "library_b/library_main.hpp"
>>  
>> And both library_main.hpp's have:
>> #pragma once
>> #include "foo.hpp"
>>  
>> Example 2:
>>  
>> namespace v1 {
>> #include "library_v1.hpp"
>> }
>> namespace v2 {
>> #include "library_v2.hpp"
>> }
>>  
>> Where both library headers include their own copy of a shared header
>> using #pragma once.
>>  
>> Example 3:
>>  
>> usr/
>>   include/
>> library/
>>   library.hpp
>>   vendored-dependency.hpp
>> src/
>>   main.cpp
>>   vendored-dependency.hpp
>>  
>> main.cpp:
>> #include "vendored-dependency.hpp"
>> #include 
>>  
>> library.hpp:
>> #pragma once
>> #include "vendored-dependency.hpp"
>>  
>> Assuming the same contents byte-for-byte of vendored-dependency.hpp, and
>> it uses #pragma once.
>>  
>> Each of these examples are plausible scenarios where two files with the
>> same contents could be #included. In each example, on GCC, the code can
>> work or break based on mtime:
>> - Example 1: Breaks if mtimes for library_main.hpp happen to be the same
>> - Example 2: Breaks if mtimes for the shared dependency copies happen to
>> be the same
>> - Example 3: Only works if mtimes are the same
>>  
>> File mtimes can happen to match sometimes, e.g. in a fresh git clone.
>> However, this is a rather fickle criteria to rely on and could easily
>> diverge in the middle of development. Notably, Example 2 was shared with
>> me as an example where #pragma once worked great in development and
>> broke in CI.
>>  
>> Additionally, while GCC's approach might be able to handle multiple
>> mounts better than other approaches, it can still break under multiple
>> mounts if mtime resolution differs.
>>  
>> Obviously there is no silver bullet for making #pragma once work
>> perfectly all the time, however, I think it's easier to provide clear
>> guarantees for #pragma once behavior when the definition of "same file"
>> is based on file identity on d

Re: #pragma once behavior

2024-09-06 Thread Jeremy Rifkin
Thanks Martin,
There's some context on N2896 in the meeting minutes: 
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2941.pdf

I think the key thing about N2896 is that it left unqualified #once
implementation-defined, which is no better than the current state of
affairs. I'm trying to approach this with a focused paper on just
providing a clear set of mechanics for #pragma once. I do recognize the
uphill battle and legitimate concern about "blessing" an unreliable
feature. My thesis is that #pragma once is unreliable today because
every implementation approaches it differently and none of them document
what they do. While no silver bullet is possible, my hope is that the
current state of affairs could at least be made much better by a clear
set of semantics. This would enables a more clear understanding of where
#pragma once may fall over. I'd love to hear your thoughts.

Cheers,
Jeremy

On Sep 6 2024, at 12:58 am, Martin Uecker  wrote:

> 
> There was a recent related proposal for C23.
> 
> https://www9.open-std.org/JTC1/SC22/WG14/www/docs/n2896.htm
> 
> See also the email by Linus Torvalds referenced in
> this paper.
> 
> Note that this proposal was not adopted for ISO C23.
> I can't find when it was discussed,  but IIRC the general
> criticism was that the regular form is not reliable and
> difficult to standardize any specific rules and that the
> form with ID does not add much value over traditional
> include guards.
> 
> Martin
> 
> 
> 
> Am Freitag, dem 06.09.2024 um 00:03 -0500 schrieb Jeremy Rifkin:
>> Hello,
>> 
>> I'm looking at #pragma once behavior among the major C/C++ compilers as
>> part of a proposal paper for standardizing #pragma once. (This is
>> apparently a very controversial topic)
>> 
>> To put my question up-front: Would GCC ever be open to altering its
>> #pragma once behavior to bring it more in-line with behavior from other
>> compilers and possibly more in-line with what users expect?
>> 
>> To elaborate more:
>> 
>> Design decisions for #pragma once essentially boil down to a file-based
>> definitions vs a content-based definition of "same file".
>> 
>> A file-based definition is easier to reason about and more in-line with
>> what users expect, however, distinct copies of headers can't be handled
>> and multiple mount points are problematic.
>> 
>> A content-based definition works for distinct copies, multiple mount
>> points, and is completely sufficient 99% of the time, however, it could
>> potentially break in hard-to-debug ways in a few notable cases (more
>> information later).
>> 
>> Currently the three major C/C++ compilers treat #pragma once very 
>> differently:
>> - GCC uses file mtime + file contents
>> - Clang uses inodes
>> - MSVC uses file path
>> 
>> None of the major compilers have documented their #pragma once semantics.
>> 
>> In practice all three of these approaches work pretty well most of the
>> time (which is why people feel comfortable using #pragma once). However,
>> they can each break in their own ways.
>> 
>> As mentioned earlier, clang and MSVC's file-based definitions of "same
>> file" break for multiple mount points and multiple copies of the same
>> header. MSVC's approach breaks for symbolic links and hard links.
>> 
>> GCC's hybrid approach can break in surprising ways. I have three
>> examples to share:
>> 
>> Example 1:
>> 
>> Consider a scenario such as:
>> 
>> usr/
>>   include/
>> library_a/
>>   library_main.hpp
>>   foo.hpp
>> library_b/
>>   library_main.hpp
>>   foo.hpp
>> src/
>>   main.cpp
>> 
>> main.cpp:
>> #include "library_a/library_main.hpp"
>> #include "library_b/library_main.hpp"
>> 
>> And both library_main.hpp's have:
>> #pragma once
>> #include "foo.hpp"
>> 
>> Example 2:
>> 
>> namespace v1 {
>> #include "library_v1.hpp"
>> }
>> namespace v2 {
>> #include "library_v2.hpp"
>> }
>> 
>> Where both library headers include their own copy of a shared header
>> using #pragma once.
>> 
>> Example 3:
>> 
>> usr/
>>   include/
>> library/
>>   library.hpp
>>   vendored-dependency.hpp
>> src/
>>   main.cpp
>>   vendored-dependency.hpp
>> 
>> main.cpp:
>> #include "vendored-dependency.hpp"
>> #include 
>> 
>> library.hpp:
>> #pragma once
>> #include "vendored-dependency.hpp"
>> 
>> Assuming the same contents byte-for-byte of vendored-dependency.hpp, and
>> it uses #pragma once.
>> 
>> Each of these examples are plausible scenarios where two files with the
>> same contents could be #included. In each example, on GCC, the code can
>> work or break based on mtime:
>> - Example 1: Breaks if mtimes for library_main.hpp happen to be the same
>> - Example 2: Breaks if mtimes for the shared dependency copies happen to
>> be the same
>> - Example 3: Only works if mtimes are the same
>> 
>> File mtimes can happen to match sometimes, e.g. in a fresh git clone.
>> However, this is a rather fickle criteria to rely on and could easily
>> diverge in the middle of development. Notably, Example 2 was s

Re: #pragma once behavior

2024-09-06 Thread Andrew Pinski via Gcc
On Fri, Sep 6, 2024 at 5:49 PM Jeremy Rifkin  wrote:
>
> Thanks Andrew, I appreciate the context and links. It looks like the
> prior implementation failed to handle links due to being based on file
> path, given cpp_simplify_pathname. Do you have thoughts on the use if
> device ID + inode as a way to also accommodate symbolic links and hard
> links without the fickleness of mtime?

Not always. because inodes are not always stable on some file systems.
And also does not work with multi-mounted devices too.
The whole definition of what is the same file is really up for debate here.
I say if the file has the same content, then it is the same file and
GCC uses that definition. While clang says it is based on if it is the
same inode which is not always true because of file systems which
don't use an inode number. While MSVC says it is based on the path but
what is the canonical path to a file, is a hard link to the same file
the same file or not; what about symbolic links? How about overlays
and mounted directories are they the same then?
GCC definition is the only one which supports all issues described
here dealing with inodes (sometimes being non-stable), canonical paths
and both kinds of links and even re-mounted file systems.

What does the other implementations say about changing their
definition of what "the same file is"? Have you asked clang and MSVC
folks?
Anyways GCC has an optimization already for #ifdef/#define/#endif (and
that is documented here:
https://gcc.gnu.org/onlinedocs/cppinternals/Guard-Macros.html) so does
it make sense to really standardize `#pramga once` here or just push
other implementations to add a similar optimization instead?

Thanks,
Andrew Pinski

>
> Cheers,
> Jeremy
>
> On Sep 6 2024, at 12:25 am, Andrew Pinski  wrote:
>
> > On Thu, Sep 5, 2024 at 10:04 PM Jeremy Rifkin  wrote:
> >>
> >> Hello,
> >>
> >> I'm looking at #pragma once behavior among the major C/C++ compilers as
> >> part of a proposal paper for standardizing #pragma once. (This is
> >> apparently a very controversial topic)
> >>
> >> To put my question up-front: Would GCC ever be open to altering its
> >> #pragma once behavior to bring it more in-line with behavior from other
> >> compilers and possibly more in-line with what users expect?
> >>
> >> To elaborate more:
> >>
> >> Design decisions for #pragma once essentially boil down to a file-based
> >> definitions vs a content-based definition of "same file".
> >>
> >> A file-based definition is easier to reason about and more in-line with
> >> what users expect, however, distinct copies of headers can't be handled
> >> and multiple mount points are problematic.
> >>
> >> A content-based definition works for distinct copies, multiple mount
> >> points, and is completely sufficient 99% of the time, however, it could
> >> potentially break in hard-to-debug ways in a few notable cases (more
> >> information later).
> >>
> >> Currently the three major C/C++ compilers treat #pragma once very 
> >> differently:
> >> - GCC uses file mtime + file contents
> >> - Clang uses inodes
> >> - MSVC uses file path
> >
> > See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52566#c2 .
> > Note this was changed specifically in GCC 3.4 to fix the issue around
> > symlinks and hard links.
> > See https://gcc.gnu.org/pipermail/gcc-patches/2003-July/111203.html
> > for more information on the fixes.
> >
> > In fact `#pragma once` was deprecated before GCC 3.4 because it would
> > do incorrectly what clang and MSVC are doing and that was considered
> > wrong.
> > So GCC behavior has been this way before clang was even written.
> >
> > Thanks,
> > Andrew
> >
> >>
> >> None of the major compilers have documented their #pragma once semantics.
> >>
> >> In practice all three of these approaches work pretty well most of the
> >> time (which is why people feel comfortable using #pragma once). However,
> >> they can each break in their own ways.
> >>
> >> As mentioned earlier, clang and MSVC's file-based definitions of "same
> >> file" break for multiple mount points and multiple copies of the same
> >> header. MSVC's approach breaks for symbolic links and hard links.
> >>
> >> GCC's hybrid approach can break in surprising ways. I have three
> >> examples to share:
> >>
> >> Example 1:
> >>
> >> Consider a scenario such as:
> >>
> >> usr/
> >>   include/
> >> library_a/
> >>   library_main.hpp
> >>   foo.hpp
> >> library_b/
> >>   library_main.hpp
> >>   foo.hpp
> >> src/
> >>   main.cpp
> >>
> >> main.cpp:
> >> #include "library_a/library_main.hpp"
> >> #include "library_b/library_main.hpp"
> >>
> >> And both library_main.hpp's have:
> >> #pragma once
> >> #include "foo.hpp"
> >>
> >> Example 2:
> >>
> >> namespace v1 {
> >> #include "library_v1.hpp"
> >> }
> >> namespace v2 {
> >> #include "library_v2.hpp"
> >> }
> >>
> >> Where both library headers include their own copy of a shared header
> >> using #pragma once.
> >>
> >> Example 3:
> >>
> >> usr/
> >>   i

Re: #pragma once behavior

2024-09-06 Thread Jeremy Rifkin
> Could a "uses the relative search path" fact be used to mix into the
> file's identity? This way the `once` key would see "this content looked
> for things in directory `library_a`" and would see that
> `library_b/library_main.hpp`, despite the same content (and mtime) is
> actually a different context and actually perform the inclusions?

I think I see what you're getting at. I am having a hard time imagining
an implementation that doesn't lead to a lot of complexity and I think
handling hard links would also be out of the question.

Jeremy

On Sep 6 2024, at 8:26 am, Ben Boeckel  wrote:

> On Fri, Sep 06, 2024 at 00:03:23 -0500, Jeremy Rifkin wrote:
>> Hello,
>>  
>> I'm looking at #pragma once behavior among the major C/C++ compilers as
>> part of a proposal paper for standardizing #pragma once. (This is
>> apparently a very controversial topic)
>>  
>> To put my question up-front: Would GCC ever be open to altering its
>> #pragma once behavior to bring it more in-line with behavior from other
>> compilers and possibly more in-line with what users expect?
>>  
>> To elaborate more:
>>  
>> Design decisions for #pragma once essentially boil down to a file-based
>> definitions vs a content-based definition of "same file".
>>  
>> A file-based definition is easier to reason about and more in-line with
>> what users expect, however, distinct copies of headers can't be handled
>> and multiple mount points are problematic.
>>  
>> A content-based definition works for distinct copies, multiple mount
>> points, and is completely sufficient 99% of the time, however, it could
>> potentially break in hard-to-debug ways in a few notable cases (more
>> information later).
>>  
>> Currently the three major C/C++ compilers treat #pragma once very 
>> differently:
>> - GCC uses file mtime + file contents
>> - Clang uses inodes
>> - MSVC uses file path
>>  
>> None of the major compilers have documented their #pragma once semantics.
>>  
>> In practice all three of these approaches work pretty well most of the
>> time (which is why people feel comfortable using #pragma once). However,
>> they can each break in their own ways.
>>  
>> As mentioned earlier, clang and MSVC's file-based definitions of "same
>> file" break for multiple mount points and multiple copies of the same
>> header. MSVC's approach breaks for symbolic links and hard links.
>>  
>> GCC's hybrid approach can break in surprising ways. I have three
>> examples to share:
>>  
>> Example 1:
>>  
>> Consider a scenario such as:
>>  
>> usr/
>>   include/
>> library_a/
>>   library_main.hpp
>>   foo.hpp
>> library_b/
>>   library_main.hpp
>>   foo.hpp
>> src/
>>   main.cpp
>>  
>> main.cpp:
>> #include "library_a/library_main.hpp"
>> #include "library_b/library_main.hpp"
>>  
>> And both library_main.hpp's have:
>> #pragma once
>> #include "foo.hpp"
>  
> Could a "uses the relative search path" fact be used to mix into the
> file's identity? This way the `once` key would see "this content looked
> for things in directory `library_a`" and would see that
> `library_b/library_main.hpp`, despite the same content (and mtime) is
> actually a different context and actually perform the inclusions?
>  
> Of course, this fails if `#include "../common/foo.hpp"` is used in each
> location as that *would* then want to elide the second inclusion. I
> don't know how this problem is avoided without actually reading the
> contents again. But the "I read this file" can remember what relative
> paths were searched (since the contents are the same at least).
>  
>> Example 2:
>>  
>> namespace v1 {
>> #include "library_v1.hpp"
>> }
>> namespace v2 {
>> #include "library_v2.hpp"
>> }
>>  
>> Where both library headers include their own copy of a shared header
>> using #pragma once.
>  
> Again, the context of the inclusion matters, so "is wrapped in a scope"
> can modify the "onceness" (`extern "C"` is probably the more common
> instance).
>  
>> Example 3:
>>  
>> usr/
>>   include/
>> library/
>>   library.hpp
>>   vendored-dependency.hpp
>> src/
>>   main.cpp
>>   vendored-dependency.hpp
>>  
>> main.cpp:
>> #include "vendored-dependency.hpp"
>> #include 
>>  
>> library.hpp:
>> #pragma once
>> #include "vendored-dependency.hpp"
>  
> This is basically the same as Example 1 as far as context goes.
>  
> Note that context cannot include `#define` state because `#once` is
> defined to be the first thing in the file and a file that is intended to
> be included multiple times (e.g., Boost.PP shenanigans) in different
> states cannot, in good faith, use `#once`.
>  
> Hrm…though if we are doing `otherdir/samecontent`, the different
> preprocessor state *might* change that "what relative files did we look
> for?" state… Nothing is easy :( .
>  
> --Ben
>


Re: #pragma once behavior

2024-09-06 Thread Jeremy Rifkin
Hi Andrew,
Thanks for the thoughts and quick reply.

> Not always. because inodes are not always stable on some file systems.
> And also does not work with multi-mounted devices too.

Unusual filesystems and multiple mounts are indeed the failing. As I
mentioned, there's no silver bullet; they each have pitfalls. I do,
however, think this is a less surprising failure mode than GCC's which
rears its head in surprising and inconsistent cases.

> I say if the file has the same content, then it is the same file and
> GCC uses that definition.

GCC doesn't use this definition, really. It's relying primarily on the
mtime check and only falling back to contents in case of collision.

The point on same contents contents == same file is well received. When
I wrote the first draft of my paper I wrote it proposing this, however,
I have become convinced this isn't the right approach based on examples
where you could intend to include two files with the same contents that
actually mean different things (such as Example 1).

GCC's approach is hybrid, half relying on something from the filesystem
and half relying on the contents. As far as I can tell this can lead to
a worst of both worlds.

> GCC definition is the only one which supports all issues described
> here dealing with inodes (sometimes being non-stable), canonical paths
> and both kinds of links and even re-mounted file systems.

I'd initially been thinking of a content-based solution in order to
avoid any filesystem reliance and support multiple mounts etc. The
problem currently is even GCC's approach, which has the best chance of
working on multiple mounts, doesn't work consistently due to potential
differences in mtime resolution.  

> What does the other implementations say about changing their
> definition of what "the same file is"? Have you asked clang and MSVC
> folks?

I've not yet asked. If I proceed with a proposal paper what I'll most
likely be proposing is what Clang does, worded in terms of same device
same location. I started here since GCC's approach is least similar to
that than what MSVC does. It's also easier to reach out to developers on
open source projects.


Thanks,
Jeremy


On Sep 6 2024, at 8:16 pm, Andrew Pinski  wrote:

> On Fri, Sep 6, 2024 at 5:49 PM Jeremy Rifkin  wrote:
>>  
>> Thanks Andrew, I appreciate the context and links. It looks like the
>> prior implementation failed to handle links due to being based on file
>> path, given cpp_simplify_pathname. Do you have thoughts on the use if
>> device ID + inode as a way to also accommodate symbolic links and hard
>> links without the fickleness of mtime?
>  
> Not always. because inodes are not always stable on some file systems.
> And also does not work with multi-mounted devices too.
> The whole definition of what is the same file is really up for debate here.
> I say if the file has the same content, then it is the same file and
> GCC uses that definition. While clang says it is based on if it is the
> same inode which is not always true because of file systems which
> don't use an inode number. While MSVC says it is based on the path but
> what is the canonical path to a file, is a hard link to the same file
> the same file or not; what about symbolic links? How about overlays
> and mounted directories are they the same then?
> GCC definition is the only one which supports all issues described
> here dealing with inodes (sometimes being non-stable), canonical paths
> and both kinds of links and even re-mounted file systems.
>  
> What does the other implementations say about changing their
> definition of what "the same file is"? Have you asked clang and MSVC
> folks?
> Anyways GCC has an optimization already for #ifdef/#define/#endif (and
> that is documented here:
> https://gcc.gnu.org/onlinedocs/cppinternals/Guard-Macros.html) so does
> it make sense to really standardize `#pramga once` here or just push
> other implementations to add a similar optimization instead?
>  
> Thanks,
> Andrew Pinski
>  
>>  
>> Cheers,
>> Jeremy
>>  
>> On Sep 6 2024, at 12:25 am, Andrew Pinski  wrote:
>>  
>> > On Thu, Sep 5, 2024 at 10:04 PM Jeremy Rifkin  wrote:
>> >>
>> >> Hello,
>> >>
>> >> I'm looking at #pragma once behavior among the major C/C++
>> compilers as
>> >> part of a proposal paper for standardizing #pragma once. (This is
>> >> apparently a very controversial topic)
>> >>
>> >> To put my question up-front: Would GCC ever be open to altering its
>> >> #pragma once behavior to bring it more in-line with behavior from other
>> >> compilers and possibly more in-line with what users expect?
>> >>
>> >> To elaborate more:
>> >>
>> >> Design decisions for #pragma once essentially boil down to a file-based
>> >> definitions vs a content-based definition of "same file".
>> >>
>> >> A file-based definition is easier to reason about and more in-line with
>> >> what users expect, however, distinct copies of headers can't be handled
>> >> and multiple mount points are problematic.
>> >>
>>

Re: #pragma once behavior

2024-09-06 Thread Andrew Pinski via Gcc
On Fri, Sep 6, 2024, 7:42 PM Jeremy Rifkin  wrote:

> Hi Andrew,
> Thanks for the thoughts and quick reply.
>
> > Not always. because inodes are not always stable on some file systems.
> > And also does not work with multi-mounted devices too.
>
> Unusual filesystems and multiple mounts are indeed the failing. As I
> mentioned, there's no silver bullet; they each have pitfalls. I do,
> however, think this is a less surprising failure mode than GCC's which
> rears its head in surprising and inconsistent cases.
>
> > I say if the file has the same content, then it is the same file and
> > GCC uses that definition.
>
> GCC doesn't use this definition, really. It's relying primarily on the
> mtime check and only falling back to contents in case of collision.
>
> The point on same contents contents == same file is well received. When
> I wrote the first draft of my paper I wrote it proposing this, however,
> I have become convinced this isn't the right approach based on examples
> where you could intend to include two files with the same contents that
> actually mean different things (such as Example 1).
>
> GCC's approach is hybrid, half relying on something from the filesystem
> and half relying on the contents. As far as I can tell this can lead to
> a worst of both worlds.
>
> > GCC definition is the only one which supports all issues described
> > here dealing with inodes (sometimes being non-stable), canonical paths
> > and both kinds of links and even re-mounted file systems.
>
> I'd initially been thinking of a content-based solution in order to
> avoid any filesystem reliance and support multiple mounts etc. The
> problem currently is even GCC's approach, which has the best chance of
> working on multiple mounts, doesn't work consistently due to potential
> differences in mtime resolution.
>
> > What does the other implementations say about changing their
> > definition of what "the same file is"? Have you asked clang and MSVC
> > folks?
>
> I've not yet asked. If I proceed with a proposal paper what I'll most
> likely be proposing is what Clang does, worded in terms of same device
> same location. I started here since GCC's approach is least similar to
> that than what MSVC does. It's also easier to reach out to developers on
> open source projects.
>

Except the clang solution does not work for some file systems and is broken
when used on them. Maybe those file systems are not in use as they once
were and that is why clang didn't run into folks asking to fix it.

Early 2000s vs now have a different landscape when it comes to file
systems. This is why I said what is the a same file if you can't rely on
inodes working?

Thanks,
Andrew




>
> Thanks,
> Jeremy
>
>
> On Sep 6 2024, at 8:16 pm, Andrew Pinski  wrote:
>
> > On Fri, Sep 6, 2024 at 5:49 PM Jeremy Rifkin  wrote:
> >>
> >> Thanks Andrew, I appreciate the context and links. It looks like the
> >> prior implementation failed to handle links due to being based on file
> >> path, given cpp_simplify_pathname. Do you have thoughts on the use if
> >> device ID + inode as a way to also accommodate symbolic links and hard
> >> links without the fickleness of mtime?
> >
> > Not always. because inodes are not always stable on some file systems.
> > And also does not work with multi-mounted devices too.
> > The whole definition of what is the same file is really up for debate
> here.
> > I say if the file has the same content, then it is the same file and
> > GCC uses that definition. While clang says it is based on if it is the
> > same inode which is not always true because of file systems which
> > don't use an inode number. While MSVC says it is based on the path but
> > what is the canonical path to a file, is a hard link to the same file
> > the same file or not; what about symbolic links? How about overlays
> > and mounted directories are they the same then?
> > GCC definition is the only one which supports all issues described
> > here dealing with inodes (sometimes being non-stable), canonical paths
> > and both kinds of links and even re-mounted file systems.
> >
> > What does the other implementations say about changing their
> > definition of what "the same file is"? Have you asked clang and MSVC
> > folks?
> > Anyways GCC has an optimization already for #ifdef/#define/#endif (and
> > that is documented here:
> > https://gcc.gnu.org/onlinedocs/cppinternals/Guard-Macros.html) so does
> > it make sense to really standardize `#pramga once` here or just push
> > other implementations to add a similar optimization instead?
> >
> > Thanks,
> > Andrew Pinski
> >
> >>
> >> Cheers,
> >> Jeremy
> >>
> >> On Sep 6 2024, at 12:25 am, Andrew Pinski  wrote:
> >>
> >> > On Thu, Sep 5, 2024 at 10:04 PM Jeremy Rifkin 
> wrote:
> >> >>
> >> >> Hello,
> >> >>
> >> >> I'm looking at #pragma once behavior among the major C/C++
> >> compilers as
> >> >> part of a proposal paper for standardizing #pragma once. (This is
> >> >> apparently a very controversial topic)
>

Re: #pragma once behavior

2024-09-06 Thread Jeremy Rifkin
> This is why I said what is the a same file if you can't rely on inodes 
> working? 

I don't have a good answer for such a case. Of course, no matter how one
approaches #pragma once there will be cases that aren't handled.

The criteria to optimize for, imo, is which has the most clear failure
mode. Contents happening match could occur naturally without realizing,
which is hard to triage. Mtimes colliding could easily happen without
realizing, which is also hard to triage and reproduce. Path issues pop
up as real build systems use links. Mtime can fail on multiple mounts,
path certainly will. In my opinion, the failure modes for contents and
mtime are very sub-ideal. Path isn't adequate, it seems clear supporting
links is an important goal.

To level-set: I don't think it's reasonable to expect #pragma once to
handle multiple distinct copies of the same file. Especially given that
contents isn't an option.

The failure mode of inodes, however, is a lot clearer. It breaks with
things like multiple mounts and filesystems that don't have inodes. The
way I see it, advice to users becomes clear since it's much clearer
exactly how and why #pragma once might break.

> Early 2000s vs now have a different landscape when it comes to file systems.

Given the landscape today, could it make sense to re-evaluate mtime + content?

Cheers
Jeremy

On Sep 6 2024, at 10:29 pm, Andrew Pinski  wrote:

>> On Fri, Sep 6, 2024, 7:42 PM Jeremy Rifkin  wrote:
>>  
>>> Hi Andrew,
>>> Thanks for the thoughts and quick reply.
>>>  
 Not always. because inodes are not always stable on some file systems.
 And also does not work with multi-mounted devices too.
>>>  
>>> Unusual filesystems and multiple mounts are indeed the failing. As I
>>> mentioned, there's no silver bullet; they each have pitfalls. I do,
>>> however, think this is a less surprising failure mode than GCC's which
>>> rears its head in surprising and inconsistent cases.
>>>  
 I say if the file has the same content, then it is the same file and
 GCC uses that definition.
>>>  
>>> GCC doesn't use this definition, really. It's relying primarily on the
>>> mtime check and only falling back to contents in case of collision.
>>>  
>>> The point on same contents contents == same file is well received. When
>>> I wrote the first draft of my paper I wrote it proposing this, however,
>>> I have become convinced this isn't the right approach based on examples
>>> where you could intend to include two files with the same contents that
>>> actually mean different things (such as Example 1).
>>>  
>>> GCC's approach is hybrid, half relying on something from the filesystem
>>> and half relying on the contents. As far as I can tell this can lead to
>>> a worst of both worlds.
>>>  
 GCC definition is the only one which supports all issues described
 here dealing with inodes (sometimes being non-stable), canonical paths
 and both kinds of links and even re-mounted file systems.
>>>  
>>> I'd initially been thinking of a content-based solution in order to
>>> avoid any filesystem reliance and support multiple mounts etc. The
>>> problem currently is even GCC's approach, which has the best chance of
>>> working on multiple mounts, doesn't work consistently due to potential
>>> differences in mtime resolution. 
>>>  
 What does the other implementations say about changing their
 definition of what "the same file is"? Have you asked clang and MSVC
 folks?
>>>  
>>> I've not yet asked. If I proceed with a proposal paper what I'll most
>>> likely be proposing is what Clang does, worded in terms of same device
>>> same location. I started here since GCC's approach is least similar to
>>> that than what MSVC does. It's also easier to reach out to
>>> developers on
>>> open source projects.
>  
> Except the clang solution does not work for some file systems and is
> broken when used on them. Maybe those file systems are not in use as
> they once were and that is why clang didn't run into folks asking to
> fix it.
>  
> Early 2000s vs now have a different landscape when it comes to file
> systems. This is why I said what is the a same file if you can't rely
> on inodes working? 
>  
> Thanks,
> Andrew
>  
>  
>  
>  
>>  
>>>  
>>>  
>>> Thanks,
>>> Jeremy
>>>  
>>>  
>>> On Sep 6 2024, at 8:16 pm, Andrew Pinski  wrote:
>>>  
 On Fri, Sep 6, 2024 at 5:49 PM Jeremy Rifkin  wrote:
>  
> Thanks Andrew, I appreciate the context and links. It looks like the
> prior implementation failed to handle links due to being based on file
> path, given cpp_simplify_pathname. Do you have thoughts on the use if
> device ID + inode as a way to also accommodate symbolic links and hard
> links without the fickleness of mtime?
  
 Not always. because inodes are not always stable on some file systems.
 And also does not work with multi-mounted devices too.
 The whole definition of what is the same file is really up for
 debate