On November 12, 2015 8:10:05 PM GMT+01:00, Senthil Kumar Selvaraj
<[email protected]> wrote:
>Hi,
>
> When analyzing code size differences between a 4.9.x compiler and
> trunk for the AVR target, I found quite a few cases where extra
> registers were being used to hold smaller types (two 8 bit registers
> for a uint8_t, for example).
>
>On deeper analysis, I found that the VRP pass (gcc/tree-vrp.c) was the
>point
> at which the dumps start to diverge.
>
> For code like this
>
>#include <stdint.h>
>
>uint16_t get(const uint16_t wvalue)
>{
> const uint8_t type = (wvalue >> 8);
> uint16_t size = 0;
>
> if (type == 1)
> {
> size = 20;
> }
> else if (type == 2)
> {
> size = 32;
> }
> return size;
>}
>
> VRP substitutes wvalue for the type variable in the conditionals
> (simplify_cond_using_ranges:9506), as it figures out that the range
> of wvalue is the same as a uint8_t). That is, code goes from
>
><bb 2>:
>_3 = wvalue_2(D) >> 8;
>type_4 = (const uint8_t) _3;
>if (type_4 == 1)
> goto <bb 5>;
>else
> goto <bb 3>;
>
>to
>
><bb 2>:
>_3 = wvalue_2(D) >> 8;
>type_4 = (const uint8_t) _3;
>if (_3 == 1)
> goto <bb 5>;
>else
> goto <bb 3>;
>
> This "widening" causes later passes to allocate extra registers
> (holding zeros for the extra bits) and generate extra comparisons
> for the AVR target.
>
> I found that in the 4.9.2 compiler I was benchmarking against, VRP
> didn't figure out the range for wvalue and therefore the folding
> didn't happen, which was why the code was better.
>
> With the native compiler on my machine (gcc 5.2 x86_64) - replacing
> uint8_t by uint32_t and uint16_t by uint64_t, and shifting right by
> 32 bits instead of 8 shows the same effect - the generated code uses
> rdi instead of just di to hold the type variable.
>
> Is this a bug? Should the folding happen only if the type
> conversion was from a smaller type to a bigger one? Or is the backend
> supposed to detect this pattern and deal with it?
We should probably avoid widening beyond word_mode (or sth else if there is sth
suitable).
Richard.
>Regards
>Senthil
>
>
>details-raw vrp1 dump
>
>;; Function get (get, funcdef_no=0, decl_uid=1522, cgraph_uid=0,
>symbol_order=0)
>
>;; 1 loops found
>;;
>;; Loop 0
>;; header 0, latch 1
>;; depth 0, outer -1
>;; nodes: 0 1 2 3 4 5
>;; 2 succs { 5 3 }
>;; 3 succs { 4 5 }
>;; 4 succs { 5 }
>;; 5 succs { 1 }
>
>ASSERT_EXPRs to be inserted
>
>Assertions to be inserted for type_4
> if (type_4 == 1)
>
> BB #3
> EDGE 2->3 2 [55.9%] (FALSE_VALUE,EXECUTABLE)
> PREDICATE: type_4 ne_expr 1
>
>
>
>
>Updating SSA:
>Registering new PHI nodes in block #2
>Updating SSA information for statement type_4 = (const uint8_t) _3;
>Updating SSA information for statement if (type_4 == 1)
>Registering new PHI nodes in block #3
>Updating SSA information for statement type_6 = ASSERT_EXPR <type_4,
>type_4 != 1>;
>Updating SSA information for statement if (type_4 == 2)
>Registering new PHI nodes in block #4
>Registering new PHI nodes in block #5
>
>SSA replacement table
>N_i -> { O_1 ... O_j } means that N_i replaces O_1, ..., O_j
>
>type_6 -> { type_4 }
>Incremental SSA update started at block: 2
>Number of blocks in CFG: 6
>Number of blocks to update: 2 ( 33%)
>Affected blocks: 2 3
>
>
>
>SSA form after inserting ASSERT_EXPRs
>get (const uint16_t wvalue, const uint8_t windex, const void * * const
>address)
>{
> uint16_t size;
> const uint8_t type;
> unsigned int _3;
>
> <bb 2>:
> _3 = wvalue_2(D) >> 8;
> type_4 = (const uint8_t) _3;
> if (type_4 == 1)
> goto <bb 5>;
> else
> goto <bb 3>;
>
> <bb 3>:
> type_6 = ASSERT_EXPR <type_4, type_4 != 1>;
> if (type_6 == 2)
> goto <bb 4>;
> else
> goto <bb 5>;
>
> <bb 4>:
>
> <bb 5>:
> # size_1 = PHI <20(2), 0(3), 32(4)>
> return size_1;
>
>}
>
>
>Immediate_uses:
>
>size_1 : --> single use.
>return size_1;
>
>wvalue_2(D) : --> single use.
>_3 = wvalue_2(D) >> 8;
>
>_3 : --> single use.
>type_4 = (const uint8_t) _3;
>
>type_4 : -->3 uses.
>type_6 = ASSERT_EXPR <type_4, type_4 != 1>;
>type_6 = ASSERT_EXPR <type_4, type_4 != 1>;
>if (type_4 == 1)
>
>.MEM_5(D) : --> single use.
># VUSE <.MEM_5(D)>
>return size_1;
>
>type_6 : --> single use.
>if (type_6 == 2)
>
>Adding destination of edge (0 -> 2) to worklist
>
>Simulating block 2
>
>Visiting statement:
>_3 = wvalue_2(D) >> 8;
>Intersecting
> [0, 255]
>and
> [0, +INF]
>to
> [0, 255]
>Found new range for _3: [0, 255]
>interesting_ssa_edges: adding SSA use in type_4 = (const uint8_t) _3;
>marking stmt to be not simulated again
>
>Visiting statement:
>type_4 = (const uint8_t) _3;
>Found new range for type_4: [0, +INF]
>interesting_ssa_edges: adding SSA use in type_6 = ASSERT_EXPR <type_4,
>type_4 != 1>;
>interesting_ssa_edges: adding SSA use in if (type_4 == 1)
>marking stmt to be not simulated again
>
>Visiting statement:
>if (type_4 == 1)
>
>Visiting conditional with predicate: if (type_4 == 1)
>
>With known ranges
> type_4: [0, +INF]
>
>Predicate evaluates to: DON'T KNOW
>Adding destination of edge (2 -> 5) to worklist
>Adding destination of edge (2 -> 3) to worklist
>
>Simulating block 3
>
>Visiting statement:
>type_6 = ASSERT_EXPR <type_4, type_4 != 1>;
>Intersecting
> ~[1, 1] EQUIVALENCES: { type_4 } (1 elements)
>and
> [0, +INF]
>to
> ~[1, 1] EQUIVALENCES: { type_4 } (1 elements)
>Found new range for type_6: ~[1, 1]
>interesting_ssa_edges: adding SSA use in if (type_6 == 2)
>marking stmt to be not simulated again
>
>Visiting statement:
>if (type_6 == 2)
>
>Visiting conditional with predicate: if (type_6 == 2)
>
>With known ranges
> type_6: ~[1, 1] EQUIVALENCES: { type_4 } (1 elements)
>
>Predicate evaluates to: DON'T KNOW
>Adding destination of edge (3 -> 4) to worklist
>
>Simulating block 4
>
>Simulating block 5
>
>Visiting PHI node: size_1 = PHI <20(2), 0(3), 32(4)>
> Argument #0 (2 -> 5 executable)
> 20: [20, 20]
> Argument #1 (3 -> 5 executable)
> 0: [0, 0]
>Meeting
> [20, 20]
>and
> [0, 0]
>to
> [0, 20]
> Argument #2 (4 -> 5 executable)
> 32: [32, 32]
>Meeting
> [0, 20]
>and
> [32, 32]
>to
> [0, 32]
>Intersecting
> [0, 32]
>and
> [0, +INF]
>to
> [0, 32]
>Found new range for size_1: [0, 32]
>interesting_ssa_edges: adding SSA use in return size_1;
>marking stmt to be not simulated again
>
>Visiting statement:
>return size_1;
>
>Value ranges after VRP:
>
>size_1: [0, 32]
>wvalue_2(D): VARYING
>_3: [0, 255]
>type_4: [0, +INF]
>type_6: ~[1, 1] EQUIVALENCES: { type_4 } (1 elements)
>
>
>
>Substituting values and folding statements
>
>Folding statement: _3 = wvalue_2(D) >> 8;
>Not folded
>Folding statement: type_4 = (const uint8_t) _3;
>Not folded
>Folding statement: if (type_4 == 1)
>Folded into: if (_3 == 1)
>
>Folding statement: if (type_6 == 2)
>Not folded
>Folding PHI node: size_1 = PHI <20(2), 0(3), 32(4)>
>No folding possible
>Folding statement: return size_1;
>Not folded
>get (const uint16_t wvalue, const uint8_t windex, const void * * const
>address)
>{
> uint16_t size;
> const uint8_t type;
> unsigned int _3;
>
> <bb 2>:
> _3 = wvalue_2(D) >> 8;
> type_4 = (const uint8_t) _3;
> if (_3 == 1)
> goto <bb 5>;
> else
> goto <bb 3>;
>
> <bb 3>:
> if (type_4 == 2)
> goto <bb 4>;
> else
> goto <bb 5>;
>
> <bb 4>:
>
> <bb 5>:
> # size_1 = PHI <20(2), 0(3), 32(4)>
> return size_1;
>
>}