https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100363
--- Comment #19 from rguenther at suse dot de <rguenther at suse dot de> ---
On Tue, 4 May 2021, vgupta at synopsys dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100363
>
> --- Comment #18 from Vineet Gupta <vgupta at synopsys dot com> ---
> (In reply to Richard Biener from comment #9)
> > (In reply to Linus Torvalds from comment #8)
> > > (In reply to Alexander Monakov from comment #7)
> > > >
> > > > Most likely the issue is that sout/sfrom are misaligned at runtime, while
> > > > the vectorized code somewhere relies on them being sufficiently aligned for
> > > > a 'short'.
> > >
> > > They absolutely are.
> > >
> > > And we build the kernel with -Wno-strict-aliasing exactly to make sure the
> > > compiler doesn't think that "oh, I can make aliasing decisions based on type
> > > information".
> > >
> > > Because we have those kinds of issues all over, and we know which
> > > architectures support unaligned loads etc, and all the tricks with
> > > "memcpy()" and unions make for entirely unreadable code.
> > >
> > > So please fix the aliasing logic to not be type-based when people explicitly
> > > tell you not to do that.
> > >
> > > Linus
> >
> > Note alignment has nothing to do with strict-aliasing (-fno-strict-aliasing
> > you mean btw).
> >
> > One thing we do is (I'm not 50% sure this explains the observed issue) assume
> > that if you have two accesses with type 'short' and they are aligned
> > according to this type then they will not partly overlap. Note this has
> > nothing to do with C strict aliasing rules but is basic pointer math when
> > you know lower zero bits.
>
> OK, given that source code has type short, they will assume these things are
> short aligned and thus won't overlap for short accesses. But then the code
> actually generated by loop vectorizer assumes they are 8 bytes apart - since
> that is what it is generating.

That's guarded by a runtime check, but this check again assumes the accesses
are aligned as short and thus will fail if they are not.
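
To make the pointer math concrete, here is a minimal sketch in C. It is not
the check GCC actually emits; the function names are made up for illustration,
and the 8-byte width is taken from the "8 bytes apart" remark quoted above.
Addresses are modelled as plain integers, so no misaligned memory is touched.
If both addresses are known to be multiples of 2, their distance is even, so
"distance < 8" can be simplified to "distance <= 6". That simplification gives
the wrong answer once one address is odd:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Exact test: do the byte ranges [a, a+len) and [b, b+len) intersect? */
static bool ranges_overlap(uintptr_t a, uintptr_t b, uintptr_t len)
{
    uintptr_t dist = a > b ? a - b : b - a;
    return dist < len;
}

/* Hypothetical simplified test for two 8-byte accesses. It is only valid
   when a and b are both multiples of 2: the distance is then even, so
   "dist < 8" and "dist <= 6" are the same condition. */
static bool overlap_assuming_short_aligned(uintptr_t a, uintptr_t b)
{
    uintptr_t dist = a > b ? a - b : b - a;
    return dist <= 6;
}

/* Self-check using integers that stand in for addresses. */
static void check_examples(void)
{
    /* Both addresses even, 8 bytes apart: both tests agree (no overlap). */
    assert(!ranges_overlap(0x1000, 0x1008, 8));
    assert(!overlap_assuming_short_aligned(0x1000, 0x1008));

    /* One address odd, 7 bytes apart: the ranges share one byte, but the
       simplified test reports no overlap. */
    assert(ranges_overlap(0x1001, 0x1008, 8));
    assert(!overlap_assuming_short_aligned(0x1001, 0x1008));
}
```

Calling check_examples() from a main() passes all four assertions. The last
pair is the failure mode described above: a guard derived under the alignment
assumption lets the vectorized path run on data that does in fact overlap.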