Could we use VIEW_CONVERT_EXPR to build ADDR_EXPR ?
Hi,

I am working on bug 45260 and found that the problem is related to VIEW_CONVERT_EXPR. In the prefetching pass, we generate the base address for the prefetch:

tree-ssa-loop-prefetch.c (issue_prefetch_ref):
    addr_base = build_fold_addr_expr_with_type (ref->mem, ptr_type_node);
  + gcc_assert (is_gimple_address (addr_base));

Here ref->mem is a COMPONENT_REF that contains a VIEW_CONVERT_EXPR. When I put an assert after build_fold_addr_expr_with_type, I found that addr_base is not a gimple address at all. The direct reason is that the TREE_OPERAND of the VIEW_CONVERT_EXPR is an SSA_NAME.

My questions are:
(1) Can we generate an address expression for a COMPONENT_REF that contains a VIEW_CONVERT_EXPR (is it legal to do so)?
(2) The assert in the bug actually occurs in verify_expr in tree-cfg.c. Is this assert valid?

I need to understand whether the bug is in the VIEW_CONVERT_EXPR generation or in build_fold_addr_expr_with_type.

Thanks for your input.

Changpeng
RE: Could we use VIEW_CONVERT_EXPR to build ADDR_EXPR ?
>No you should not generate addresses for VCEs that contain a SSA_NAME.
>I think you should check if get_base_address is a
>is_gimple_addressable inside gather_memory_references_ref.

There, TREE_CODE (get_base_address (ref)) == SSA_NAME, and get_base_address (ref) satisfies is_gimple_addressable.

However, an address expression containing an SSA_NAME is NOT considered a gimple address.

Thanks,
Changpeng
RE: Could we use VIEW_CONVERT_EXPR to build ADDR_EXPR ?
> >No you should not generate addresses for VCEs that contain a SSA_NAME.
> >I think you should check if get_base_address is a
> >is_gimple_addressable inside gather_memory_references_ref.
>
> There, TREE_CODE ( get_base_address (ref)) == SSA_NAME
> and get_base_address (ref) is is_gimple_addressable.
>
> However, address expression containing SSA_NAME is NOT considered
> as a gimple address.

>You simply can't take an address of such thing. Look at IVOPTs,
>it has measures to avoid this stuff.

Thanks, Richard. I have a fix based on this suggestion:
http://gcc.gnu.org/ml/gcc-patches/2010-08/msg01625.html

Changpeng
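The point of this exchange, that the address of a reference should not be taken when its base object is an SSA_NAME, can be pictured with a small toy model. The sketch below is plain C and not GCC internals: every toy_* type and function is a hypothetical stand-in invented for illustration, and it is not the patch linked above.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for a few GCC tree codes. These are NOT GCC's definitions. */
enum toy_code {
    TOY_SSA_NAME,
    TOY_VAR_DECL,
    TOY_COMPONENT_REF,
    TOY_VIEW_CONVERT_EXPR
};

struct toy_tree {
    enum toy_code code;
    const struct toy_tree *op0;  /* inner operand; NULL for leaf nodes */
};

/* Strip reference wrappers to reach the base object, loosely in the
   spirit of get_base_address (hypothetical simplification). */
const struct toy_tree *toy_base(const struct toy_tree *ref)
{
    assert(ref != NULL);
    while (ref->code == TOY_COMPONENT_REF
           || ref->code == TOY_VIEW_CONVERT_EXPR)
        ref = ref->op0;
    return ref;
}

/* Return 1 if this toy model allows taking the address of REF,
   0 if the base is an SSA name (a value that has no address). */
int toy_may_take_address(const struct toy_tree *ref)
{
    return toy_base(ref)->code != TOY_SSA_NAME;
}
```

In this model, a COMPONENT_REF wrapping a VIEW_CONVERT_EXPR of an SSA name is rejected, while the same shape over a variable declaration is accepted. A pass would skip the rejected reference instead of building an ADDR_EXPR for it.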
How to avoid auto-vectorization for this loop (rolls at most 3 times)
It seems the auto-vectorizer cannot recognize that this loop will roll at most 3 times, and it generates quite messy code.

int a[1024], b[1024];
void foo (int n)
{
  int i;
  for (i = (n/4)*4; i < n; i++)
    a[i] = a[i] + b[i];
}

How can we correctly estimate the number of iterations for this case and pass this information to the vectorizer?

Thanks,
Changpeng
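To check the "at most 3 times" claim for the loop above, a small helper can count the iterations directly. This is only a sketch: tail_trip_count is a hypothetical name, not anything from GCC. For non-negative n the count equals n % 4, and for negative n the loop does not run at all, because C integer division truncates toward zero and so (n/4)*4 >= n.

```c
#include <assert.h>

/* Count how many times the body of the loop from this thread would
   execute for a given n. Hypothetical helper for illustration only. */
int tail_trip_count(int n)
{
    int count = 0;
    for (int i = (n / 4) * 4; i < n; i++)
        count++;
    /* The bound one would like the compiler to derive by itself. */
    assert(count >= 0 && count <= 3);
    return count;
}
```

The start value is n rounded toward zero to a multiple of 4, so the distance to n is the remainder of n divided by 4, which is where the bound of 3 comes from.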
RE: How to avoid auto-vectorization for this loop (rolls at most 3 times)
>> It seems the auto-vectorizer could not recognize that this loop will
>> roll at most 3 times.
>> And it will generate quite messy code.
>>
>> int a[1024], b[1024];
>> void foo (int n)
>> {
>>   int i;
>>   for (i = (n/4)*4; i< n; i++)
>>     a[i] = a[i] + b[i];
>> }
>>
>> How can we correctly estimate the number of iterations for this case
>> and use this info for the vectorizer?

>Does it recognise it if you rewrite the loop as follows:
>for (i = n&~0x3; i< n; i++)
>a[i] = a[i] + b[i];

No. But it is OK for the following case:

for (i = n-3; i < n; i++)
  a[i] = a[i] + b[i];

It seems to fail in the "unknown but small" case. Anyway, this mostly affects compilation time and code size, and has limited impact on performance.

For

for (i = n&~0x3; i < n; i++)
  a[i] = a[i] + b[i];

the attached foo-O3-no-tree-vectorize.s is what we expect from the optimizer; foo-O3.s is far worse.

Thanks,
Changpeng

Attachments: foo-O3-no-tree-vectorize.s, foo-O3.s
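One possible way to get the non-vectorized code for just this function, without changing the flags for the whole file, is GCC's optimize function attribute. This is a sketch and was not proposed in the thread: the NO_VECTORIZE macro name is made up here, the attribute is GCC-specific (hence the guard), and GCC's documentation describes it as meant for debugging rather than production use, so treat it as an experiment.

```c
#include <assert.h>

/* Hypothetical convenience macro: ask GCC to compile one function as if
   -fno-tree-vectorize had been given. Expands to nothing elsewhere. */
#if defined(__GNUC__) && !defined(__clang__)
#define NO_VECTORIZE __attribute__((optimize("no-tree-vectorize")))
#else
#define NO_VECTORIZE
#endif

int a[1024], b[1024];

NO_VECTORIZE void foo(int n)
{
    /* Indices stay below n, so n must not exceed the array size. */
    assert(n <= 1024);
    for (int i = (n / 4) * 4; i < n; i++)
        a[i] = a[i] + b[i];
}
```

Whether the resulting assembly matches the -fno-tree-vectorize output should be checked by comparing the output of -S for both builds.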
What loop optimizations could increase the code size significantly?
Hi,

I am looking for ways to reduce code size. What loop optimizations could increase the code size significantly? The optimizations I know of are unswitching, vectorization, prefetching, and unrolling. We should not perform these optimizations if the loop rolls only a few iterations.

In addition, what loop optimizations could generate pre- and/or post-loops? For example, vectorization and unrolling.

Thanks,
Changpeng
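As an illustration of how a post-loop adds code, here is a hand-written picture of roughly what 4x unrolling does to a simple loop. It is a source-level sketch, not what GCC actually emits, and add_unrolled is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Hand-unrolled version of: for (i = 0; i < n; i++) a[i] += b[i];
   Illustration only; a compiler's real output will differ. */
void add_unrolled(int *a, const int *b, int n)
{
    assert(n <= 0 || (a != NULL && b != NULL));
    int i = 0;

    /* Main loop: four copies of the body per iteration. */
    for (; i + 3 < n; i += 4) {
        a[i]     += b[i];
        a[i + 1] += b[i + 1];
        a[i + 2] += b[i + 2];
        a[i + 3] += b[i + 3];
    }

    /* Post-loop: handles the remaining 0 to 3 elements. */
    for (; i < n; i++)
        a[i] += b[i];
}
```

The body now appears five times instead of once. If the loop only ever rolls a few iterations, the main loop rarely or never runs and the extra code is pure size overhead.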