On Tue, Oct 12, 2021 at 9:44 PM Martin Sebor <mse...@gmail.com> wrote: > > On 10/12/21 12:52 AM, Richard Biener wrote: > > On Mon, Oct 11, 2021 at 11:25 PM Martin Sebor <mse...@gmail.com> wrote: > >> > >> The attached change extends GCC's warnings for out-of-bounds > >> stores to cover atomic (and __sync) built-ins. > >> > >> Rather than hardcoding the properties of these built-ins just > >> for the sake of the out-of-bounds detection, on the assumption > >> that it might be useful for future optimizations as well, I took > >> the approach of extending class attr_fnspec to express their > >> special property that they encode the size of the access in their > >> name. > >> > >> I also took the liberty of making attr_fnspec assignable (something > >> the rest of my patch relies on), and updating some comments for > >> the characters the class uses to encode function properties, based > >> on my understanding of their purpose. > >> > >> Tested on x86_64-linux. > > > > Hmm, so you place 'A' at an odd place (where the return value is specified), > > but you do not actually specify the behavior on the return value. Shoudln't > > > > + 'A' specifies that the function atomically accesses a constant > > + 1 << N bytes where N is indicated by character 3+2i > > > > maybe read > > > > 'A' specifies that the function returns the memory pointed to > > by argument > > one of size 1 << N bytes where N is indicated by > > character 3 +2i accessed atomically > > > > ? > > I didn't think the return value would be interesting because in > general (parallel accesses) it's not related (in an observable > way) to the value of the dereferenced operand. Not all > the built-ins also return a value (e.g., atomic_store), and > whether or not one does return the argument would need to be > encoded somehow because it cannot be determined from the return > type (__atomic_compare_exchange and __atomic_test_and_set return > bool that's not necessarily the value of the operand). Also, > since the functions return the operand value either before or > after the update, we'd need another letter to describe that. > (This alone could be dealt with simply by using 'A' and 'a', > but that's not enough for the other cases.) > > So with all these possibilities I don't think encoding > the return value at this point is worthwhile. If/when this > enhancement turns out to be used for optimization and we think > encoding the return value would be helpful, I'd say let's > revisit it then. The accessor APIs should make it a fairly > straightforward exercise.
I though it would be useful for points-to analysis since knowing how the return value is composed improves the points-to result for it. Note that IPA mod-ref now synthesizes fn-spec and might make use of 'A' if it were not narrowly defined. Sure it's probably difficult to fully specify the RMW cycle that's eventually done but since we have a way to specify a non-constant size of accesses as passed by a parameter it would be nice to allow specifying a constant size anyhow. It just occured to me we could use "fake" parameters to encode those, so for void foo (int *); use like ". R2c4" saying that parameter 1 is read with the size specified by (non-existing) parameter 2 which is specified as 'c'onstant 1 << 4. Alternatively a constant size specification could use alternate encoding 'a' to 'f'. That said, if 'A' is not suppose to specify the return value it shouldn't be in the return value specification... > > I also wonder if it's necessary to constrain this to 'atomic' accesses > > for the purpose of the patch and whether that detail could be omitted to > > eventually make more use of it? > > I pondered the same question but I couldn't think of any other > built-ins with similar semantics (read-write-modify, return > a result either pre- or post-modification), so I opted for > simplicity. I am open to generalizing it if/when there is > a function I could test it with, although I'm not sure > the current encoding scheme has enough letters and letter > positions to describe the effects in their full generality. > > > > > Likewise > > > > + '0'...'9' specifies the size of value written/read is given either > > + by the specified argument, or for atomic functions, by > > + 2 ^ N where N is the constant value denoted by the character > > > > should mention (excluding '0') for the argument position. > > Sure, I'll update the comment if you think this change is worth > pursuing. > > > > > /* length of the fn spec string. */ > > - const unsigned len; > > + unsigned len; > > > > why that? > > The const member is what prevents the struct from being assignable, > which is what the rest of the patch depends on. > > > > > + /* Return true of the function is an __atomic or __sync built-in. */ > > > > you didn't specify that for 'A' ... > > > > + bool > > + atomic_p () const > > + { > > + return str[0] == 'A'; > > + } > > > > +attr_fnspec > > +atomic_builtin_fnspec (tree callee) > > +{ > > + switch (DECL_FUNCTION_CODE (callee)) > > + { > > +#define BUILTIN_ACCESS_SIZE_FNSPEC(N, lgsz) \ > > + BUILT_IN_ATOMIC_LOAD_ ## N: \ > > + return "Ap" "R" lgsz; > > > > note that doing this for atomics makes those no longer a compiler barrier > > for (aliased) loads and stores which means they are no longer a reliable > > way to implement locks. That's a reason why I never pushed a > > PTA/alias patch I have to open-code this. > > > > Thus, do we really want to do this? > > That's my question to you :) If you don't think this attr_fnspec > extension would be useful for optimization I'll drop this part > of the patch and open-code it separately only for the out-of-bounds > diagnostics. I'd started out that way but the fnspec class made > the code cleaner and if it could be used elsewhere so much > the better. Please let me know. Well - as said, we're relying on the property of a function call being a barrier for global/escaped memory for things like pthread_mutex_lock and at least some of the atomic builtins might be used as locking primitives and most definitely that would break. For example int cnt; int lock; void foo(int n) { for (int i = 0; i < n; ++i) { int val = 0; int ret; if (__atomic_compare_exchange (&lock, &val, 1, false, __ATOMIC_ACQ)) { cnt++; } } } will see PRE applied to the load of 'cnt' with your patch(?), that is, the lock is not a barrier for protected loads and stores. Of course you have to provide some incentive to perform an invalid optimization and the trivial one like initialization before the locked area that can be CSEd is probably invalid since the init then has data races. So yes, I think we should be very careful here. The reason why I chickened out myself doing the obvious improvement for the atomic builtins... Any opinions from others? Richard. > > Martin