On Fri, Jun 27, 2014 at 12:02 PM, Thomas Preud'homme <thomas.preudho...@arm.com> wrote: >> From: Richard Biener [mailto:richard.guent...@gmail.com] >> Sent: Friday, June 27, 2014 5:32 PM >> > 4) Cast value stored in a according to the size of b by removing bits of >> > highest weight (via VIEW_CONVERT_EXPR). >> >> Fact is that 4) already is not working. >> >> /* We are at the end of walk, see if we can view convert the >> result. */ >> if (!AGGREGATE_TYPE_P (TREE_TYPE (ctor)) && !offset >> /* VIEW_CONVERT_EXPR is defined only for matching sizes. */ >> && operand_equal_p (TYPE_SIZE (type), >> TYPE_SIZE (TREE_TYPE (ctor)), 0)) >> >> while it's true we don't check precision in the gimple-verifier >> VIEW_CONVERT_EXPR isn't supposed to be applied to types >> with same size but differing precision. VIEW_CONVERT_EXPR >> is supposed to _not_ strip any excess bits! (it's supposed to do >> "nothing"). > > Right but since ctor is an integer constant for a bitfield, this would check > the size of the integer type qualifier of the bitfield, which is the source > of the bug. > >> >> Hmm. I wonder if the code behaves correctly for >> >> union { >> struct { a : 10; b : 6 } a; >> struct { a : 8; b : 8 } b; >> } u = { .a = { .... } }; >> >> and reading u.b.a or u.b.b? >> >> That is, is fold_nonarray_ctor_reference doing the right thing with >> consecutive bitfield fields? > > With my current approach it would for u.b.a but not for u.b.b. The reason > is that fold_ctor_reference will call fold_nonarray_ctor_reference as > long as the constructor is an aggregate type. So for u.b.a it will find the > constructor for u.a.a which completely contains u.b.a and it would work. > > This indeed suggest going towards the support for aggregate int > native_encode_expr and native_interpret_expr. > >> >> Is the current code behaving correctly for size % BITS_PER_UNIT == 0? > > For primitive type (a bitfield could respect this constraint) It does > because the check in fold_ctor_reference would work correctly. > >> If so that might be a good fix for branches (who cares about >> constant folding bitfield constants ... not many, given the bug you >> show). > > The problem is not really that folding doesn't happen, but that it > happens and gives wrong result. Is wrong code generation (even for > bitfield) not serious enough?
Yes - I was specifically looking for an early out to avoid wrong-code. So, is size % BITS_PER_UNIT != 0 a working fix to avoid the wrong-code issues with bitfields (on big-endian targets)? >> > However this double conversion between host and target memory >> > layout is unelegant and unefficient. I could do just one conversion >> > completely in fold_nonarray_ctor_reference but somehow I feel it is >> > not the right place to do it. As anybody a better advice? >> >> I wonder if you have a patch to show us how unelegant it is? > > Not finished yet. I have improved native_interpret_integer but not > native_encode_integer yet. It took me a while to reason about > the various endianness (byte and word) of both target and host. > Hopefully doing native_encode_integer should be easier now. > > However, since I have some other task with higher priority to > handle, I think I'll push something that disables folding for bitfield on > big endian target on Monday and continue working on this at some > later time. I'd rather disable it for all endianesses - gives us more incentive to fix it properly. >> Btw, another idea would be to support native_encode for the >> whole CONSTRUCTOR and then use native_interpret with >> offset/size. Of course for large CONSTRUCTORs that may >> be prohibitive (but native_encode could get a start offset as well, >> to skip uninteresting bytes). >> >> Being able to native encode a CONSTRUCTOR could be >> factored and re-used from asm output. >> >> And it could be used to more efficiently store large aggregate >> constant initializers (native encode it and compress it instead >> of keeping CONSTRUCTOR). > > That would also allow to fold example such as the one you > mentioned above. However that's more work than the current > plan so that would be a second step. Yeah (the code is already there in varasm.c:output_constructor - it just uses a different "buffer"). >> >> So any incremental improvement to native_{encode/interpret}_expr >> is good. > > Yeah, also my improvement also should fix some confusion between > CHAR_BIT and BITS_PER_UNIT in the current code (not done yet but > I'll definitely fix that before I propose the patch for review). Thanks, Richard. > Best regards, > > Thomas > >