On May 21, 2015 5:28:14 PM GMT+02:00, Jakub Jelinek <ja...@redhat.com> wrote:
>Hi!
>
>We ICE on the following testcase at -O3 on x86_64-linux, because
>gimple folding attempts to simplify FLOAT_EXPR conversion of
>signed V4SI to V4SF feeding FIX_TRUNC_EXPR to unsigned V4SI
>into a FIX_TRUNC_EXPR with unsigned V4SI lhs and signed V4SI rhs1,
>which is invalid GIMPLE.
>All the other simplifications in the same iterator block don't
>optimize anything for vector types, and I can't find out any case
>where something like this would be beneficial for vector types.
>These days we represent source level casts of vectors to same sized
>integers as VIEW_CONVERT_EXPR, which isn't handled in here,
>and *_prec doesn't really mean what it tests for vector types
>(it is log2 of number of elements), vector integer or float widening
>is not represented using convert/float/fix_trunc, but using
>VEC_PERM_EXPR, VEC_UNPACK*_{LO,HI}_EXPR etc.
>I've bootstrapped/regtested with a logging variant and if
>(inside_vec || inter_vec || final_vec) is true, we (mis)optimize
>anything only on the testcase included in the patch and on
>gfortran.dg/stfunc_4.f90 testcase, in both cases it is
>V4SI -> V4SF -> V4SI, which we really shouldn't be optimizing,
>because SF mode obviously can't represent all integers exactly.
>
>So, this patch disables optimizing vectors.
>Ok for trunk/5.2 if bootstrap/regtest succeeds?

OK.

Thanks,
Richard.

>For 4.9/4.8 a similar patch will be needed, but to
>fold-const.c/tree-ssa-forwprop.c instead of match.pd.
>
>2015-05-21  Jakub Jelinek  <ja...@redhat.com>
>
>	PR tree-optimization/66233
>	* match.pd (ocvt (icvt@1 @0)): Don't handle vector types.
>	Simplify.
>
>	* gcc.c-torture/execute/pr66233.c: New test.
>
>--- gcc/match.pd.jj	2015-05-19 15:53:43.000000000 +0200
>+++ gcc/match.pd	2015-05-21 16:21:35.627916502 +0200
>@@ -730,16 +730,12 @@ (define_operator_list inverted_tcc_compa
> 	 (for integers).  Avoid this if the final type is a pointer since
> 	 then we sometimes need the middle conversion.  Likewise if the
> 	 final type has a precision not equal to the size of its mode.  */
>-     (if (((inter_int && inside_int)
>-	   || (inter_float && inside_float)
>-	   || (inter_vec && inside_vec))
>+     (if (((inter_int && inside_int) || (inter_float && inside_float))
>+	  && (final_int || final_float)
> 	  && inter_prec >= inside_prec
>-	  && (inter_float || inter_vec
>-	      || inter_unsignedp == inside_unsignedp)
>-	  && ! (final_prec != GET_MODE_PRECISION (element_mode (type))
>-		&& element_mode (type) == element_mode (inter_type))
>-	  && ! final_ptr
>-	  && (! final_vec || inter_prec == inside_prec))
>+	  && (inter_float || inter_unsignedp == inside_unsignedp)
>+	  && ! (final_prec != GET_MODE_PRECISION (TYPE_MODE (type))
>+		&& TYPE_MODE (type) == TYPE_MODE (inter_type)))
>       (ocvt @0))
>
>     /* If we have a sign-extension of a zero-extended value, we can
>--- gcc/testsuite/gcc.c-torture/execute/pr66233.c.jj	2015-05-21 17:13:32.639713225 +0200
>+++ gcc/testsuite/gcc.c-torture/execute/pr66233.c	2015-05-21 17:10:57.000000000 +0200
>@@ -0,0 +1,22 @@
>+/* PR tree-optimization/66233 */
>+
>+unsigned int v[8];
>+
>+__attribute__((noinline, noclone)) void
>+foo (void)
>+{
>+  int i;
>+  for (i = 0; i < 8; i++)
>+    v[i] = (float) i;
>+}
>+
>+int
>+main ()
>+{
>+  unsigned int i;
>+  foo ();
>+  for (i = 0; i < 8; i++)
>+    if (v[i] != i)
>+      __builtin_abort ();
>+  return 0;
>+}
>
>	Jakub
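
For readers who want to see the "SF mode can't represent all integers
exactly" point from the quoted message in scalar form, here is a minimal
sketch. It is not part of the patch or the testsuite. The helper names
round_trip_s32 and survives_float_round_trip are made up for illustration,
and the comments assume float is IEEE 754 binary32 with the default
round-to-nearest mode, which holds on x86_64-linux.

```c
#include <stdint.h>

/* Hypothetical helper, not from the patch: convert a signed 32-bit
   integer to float and back, the scalar analogue of
   V4SI -> V4SF -> V4SI.  The volatile store keeps the intermediate
   value in real float precision.  Callers should stay well inside
   the int32_t range, since converting an out-of-range float back to
   an integer is undefined.  */
static int32_t
round_trip_s32 (int32_t x)
{
  volatile float f = (float) x;
  return (int32_t) f;
}

/* Hypothetical helper: nonzero if X comes back unchanged, i.e. if
   dropping the intermediate float conversion would be harmless for
   this particular value.  */
static int
survives_float_round_trip (int32_t x)
{
  return round_trip_s32 (x) == x;
}
```

Under those assumptions a float has a 24-bit significand, so every value
in the testcase (0..7) survives, while (1 << 24) + 1 = 16777217 comes back
as 16777216. That is why the int -> float -> int chain cannot simply be
folded away.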