This looks like a 64bit scalar-evolution issue: when I compile on
powerpc-linux with -m64, I also get the
"vect4.f:4: note: not consecutive access"
message.
The problem looks very similar to PR18403, which was resolved a while
ago.
When compiling for 32bit, we get the following representation for the loop:
# i_2 = PHI <i_25(11), i_41(14)>;
<L12>:;
D.505_38 = i_2 + -1;
D.506_39 = (*b_14)[D.505_38];
(*a_9)[D.505_38] = D.506_39;
i_41 = i_2 + 1;
if (i_2 == D.489_27) goto <L26>; else goto <L27>;
When compiling for 64bit, there is an extra cast:
# i_2 = PHI <i_27(11), i_45(14)>;
<L12>:;
D.691_41 = (int8) i_2;
D.692_42 = D.691_41 + -1;
D.693_43 = (*b_16)[D.692_42];
(*a_10)[D.692_42] = D.693_43;
i_45 = i_2 + 1;
if (i_2 == D.674_29) goto <L26>; else goto <L27>;
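For readers more used to C than to GIMPLE dumps, the 64bit loop body
corresponds roughly to the sketch below. It is only an illustration: the
function name copy_range and the fixed-width C types are assumptions made
here, with int32_t standing in for the default Fortran INTEGER and int64_t
for the int8 type in the dump.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C rendering of: DO I = ISTART, ISTOP / A(I) = B(I) / ENDDO.
   a and b point at element 1 of the Fortran arrays.
   Assumes istop < INT32_MAX so that i++ cannot overflow. */
void copy_range(float *a, const float *b, int32_t istart, int32_t istop)
{
    assert(istart >= 1);               /* the arrays are 1-based */
    for (int32_t i = istart; i <= istop; i++) {
        int64_t widened = (int64_t)i;  /* D.691_41 = (int8) i_2    */
        int64_t idx = widened + -1;    /* D.692_42 = D.691_41 + -1 */
        a[idx] = b[idx];               /* load from b, store to a  */
    }
}
```

In the 32bit dump above the widening step is absent and the index is
computed directly as i + -1.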
From the vectorizer dumps for the 32bit code, it looks like the
access-function computed for the index to array b is quite simple and the
dataref analyzer can easily analyze it (i.e. extract the step - 4B - and
conclude that the access is consecutive):
Created dr for (*b_14)[D.708_38]
base_address: b_14
offset from base address: (<unnamed type>) (i_25 * 4 + -4)
constant offset from base address: 0
base_object: *b_14
step: 4B
base aligned 0
misalignment from base:
memtag: TMT.11
In the 64bit case, however, the vectorizer dumps show that the
access-function returned for the index to array b is much more complicated
- the dataref analyzer doesn't seem to be able to extract the
evolution/step in this case, and concludes that the access is
non-consecutive:
Created dr for (*b_16)[D.692_42]
base_address: b_16
offset from base address: (<unnamed type>) ((int8) {i_27, +, 1}_2 * 4 + -4)
constant offset from base address: 0
base_object: *b_16
step: 0B
base aligned 0
misalignment from base:
memtag: TMT.11
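As a sanity check on the arithmetic (not on what the analyzer does
internally), the offset expression from the 64bit dump can be evaluated
for successive values of i. The sketch below assumes i does not wrap
within the loop; the names byte_offset_widened and constant_step are made
up for this illustration and are not GCC functions.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset for index i, computed in the same order as the 64bit dump:
   widen the 32-bit index first, then scale by the 4-byte element size. */
int64_t byte_offset_widened(int32_t i)
{
    return (int64_t)i * 4 + -4;
}

/* Returns the common difference between successive offsets for
   i = first .. last, or 0 if the differences are not all equal. */
int64_t constant_step(int32_t first, int32_t last)
{
    assert(first < last);
    int64_t step = byte_offset_widened(first + 1) - byte_offset_widened(first);
    for (int32_t i = first + 1; i < last; i++) {
        if (byte_offset_widened(i + 1) - byte_offset_widened(i) != step)
            return 0;
    }
    return step;
}
```

Under that no-wrap assumption the offsets advance by 4 bytes per
iteration, the same step reported in the 32bit dump, which is what one
would want the analyzer to recover instead of 0B.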
Should we reopen PR18403?
thanks,
dorit
Toon Moene <[EMAIL PROTECTED]> wrote on 22/10/2005 12:18:57:
> This one gets vectorized for me, on powerpc-linux:
>
> ~/mainline_cvs/bin/gfortran -O3 -ftree-vectorize -maltivec
> -ftree-vectorizer-verbose=4 -S hilaram4.f90
>
> hilaram4.f90:4: note: Alignment of access forced using peeling.
> hilaram4.f90:4: note: Vectorizing an unaligned access.
> hilaram4.f90:4: note: LOOP VECTORIZED.
> hilaram4.f90:7: note: vectorized 1 loops in function.
>
> dorit
>
>
> > L.S.,
> >
> > This code:
> >
> > SUBROUTINE S(N)
> > DIMENSION A(N), B(N)
> > READ*,ISTART,ISTOP,B
> > DO I = ISTART, ISTOP
> > A(I) = B(I)
> > ENDDO
> > PRINT*,A
> > END
> >
> > when compiled thusly:
> >
> > $ gfortran -g -S -O3 -ftree-vectorize -ftree-vectorizer-verbose=2 -msse2 vect4.f
> >
> > draws the following "not vectorized" message:
> >
> > vect4.f:4: note: not vectorized: complicated access pattern.
> > vect4.f:4: note: vectorized 0 loops in function.
>
> I get the following, using the autovect branch compiler on
> x86_64-unknown-linux-gnu:
>
> vect4.f:4: note: ===== analyze_loop_nest =====
> vect4.f:4: note: === vect_analyze_loop_form ===
> vect4.f:4: note: split exit edge.
> vect4.f:4: note: === get_loop_niters ===
> vect4.f:4: note: ==> get_loop_niters:(<unnamed type>) (D.813_29 - i_27) + 1
> vect4.f:4: note: Symbolic number of iterations is (<unnamed type>)
> (D.813_29 - i_27) + 1
> vect4.f:4: note: === vect_analyze_data_refs ===
> vect4.f:4: note: get vectype with 4 units of type real4
> vect4.f:4: note: vectype: vector real4
> vect4.f:4: note: get vectype with 4 units of type real4
> vect4.f:4: note: vectype: vector real4
> vect4.f:4: note: === vect_analyze_scalar_cycles ===
> vect4.f:4: note: Analyze phi: HEAP.49_363 = PHI <HEAP.49_380(5),
> HEAP.49_381(7)>;
> vect4.f:4: note: virtual phi. skip.
> vect4.f:4: note: Analyze phi: HEAP.42_259 = PHI <HEAP.42_268(5),
> HEAP.42_268(7)>;
> vect4.f:4: note: virtual phi. skip.
> vect4.f:4: note: Analyze phi: HEAP.35_60 = PHI <HEAP.35_312(5),
> HEAP.35_312(7)>;
> vect4.f:4: note: virtual phi. skip.
> vect4.f:4: note: Analyze phi: i_1 = PHI <i_27(5), i_45(7)>;
> vect4.f:4: note: Access function of PHI: {i_27, +, 1}_2
> vect4.f:4: note: step: 1, init: i_27
> vect4.f:4: note: Detected induction.
> vect4.f:4: note: === vect_pattern_recog ===
> vect4.f:4: note: === vect_mark_stmts_to_be_vectorized ===
> vect4.f:4: note: init: phi relevant? HEAP.49_363 = PHI <HEAP.49_380(5),
> HEAP.49_381(7)>;
> vect4.f:4: note: init: phi relevant? HEAP.42_259 = PHI <HEAP.42_268(5),
> HEAP.42_268(7)>;
> vect4.f:4: note: init: phi relevant? HEAP.35_60 = PHI <HEAP.35_312(5),
> HEAP.35_312(7)>;
> vect4.f:4: note: init: phi relevant? i_1 = PHI <i_27(5), i_45(7)>;
> vect4.f:4: note: init: stmt relevant? <L14>:
> vect4.f:4: note: init: stmt relevant? D.830_39 = (int8) i_1
> vect4.f:4: note: init: stmt relevant? D.831_40 = D.830_39 + -1
> vect4.f:4: note: init: stmt relevant? D.832_43 = (*b_10)[D.831_40]
> vect4.f:4: note: init: stmt relevant? (*a_16)[D.831_40] = D.832_43
> vect4.f:4: note: vec_stmt_relevant_p: stmt has vdefs.
> vect4.f:4: note: mark relevant 2, live 0.
> vect4.f:4: note: init: stmt relevant? i_45 = i_1 + 1
> vect4.f:4: note: init: stmt relevant? if (i_1 == D.813_29) goto
> <L35>; else goto <L29>;
> vect4.f:4: note: init: stmt relevant? <L29>:
> vect4.f:4: note: worklist: examine stmt: (*a_16)[D.831_40] = D.832_43
> vect4.f:4: note: vect_is_simple_use: operand D.832_43
> vect4.f:4: note: def_stmt: D.832_43 = (*b_10)[D.831_40]
> vect4.f:4: note: type of def: 2.
> vect4.f:4: note: worklist: examine use 2: D.832_43
> vect4.f:4: note: mark relevant 2, live 0.
> vect4.f:4: note: worklist: examine stmt: D.832_43 = (*b_10)[D.831_40]
> vect4.f:4: note: === vect_analyze_data_refs_alignment ===
> vect4.f:4: note: vect_compute_data_ref_alignment:
> vect4.f:4: note: Unknown alignment for access: *b_10
> vect4.f:4: note: vect_compute_data_ref_alignment:
> vect4.f:4: note: Unknown alignment for access: *a_16
> vect4.f:4: note: === vect_determine_vectorization_factor ===
> vect4.f:4: note: ==> examining statement: <L14>:
> vect4.f:4: note: skip.
> vect4.f:4: note: ==> examining statement: D.830_39 = (int8) i_1
> vect4.f:4: note: skip.
> vect4.f:4: note: ==> examining statement: D.831_40 = D.830_39 + -1
> vect4.f:4: note: skip.
> vect4.f:4: note: ==> examining statement: D.832_43 = (*b_10)[D.831_40]
> vect4.f:4: note: vectype: vector real4
> vect4.f:4: note: nunits = 4
> vect4.f:4: note: ==> examining statement: (*a_16)[D.831_40] = D.832_43
> vect4.f:4: note: vectype: vector real4
> vect4.f:4: note: nunits = 4
> vect4.f:4: note: ==> examining statement: i_45 = i_1 + 1
> vect4.f:4: note: skip.
> vect4.f:4: note: ==> examining statement: if (i_1 == D.813_29) goto
> <L35>; else goto <L29>;
> vect4.f:4: note: skip.
> vect4.f:4: note: ==> examining statement: <L29>:
> vect4.f:4: note: skip.
> vect4.f:4: note: === vect_determine_vectorization_factor ===
> vect4.f:4: note: ==> examining statement: <L14>:
> vect4.f:4: note: skip.
> vect4.f:4: note: ==> examining statement: D.830_39 = (int8) i_1
> vect4.f:4: note: skip.
> vect4.f:4: note: ==> examining statement: D.831_40 = D.830_39 + -1
> vect4.f:4: note: skip.
> vect4.f:4: note: ==> examining statement: D.832_43 = (*b_10)[D.831_40]
> vect4.f:4: note: vectype: vector real4
> vect4.f:4: note: nunits = 4
> vect4.f:4: note: ==> examining statement: (*a_16)[D.831_40] = D.832_43
> vect4.f:4: note: vectype: vector real4
> vect4.f:4: note: nunits = 4
> vect4.f:4: note: ==> examining statement: i_45 = i_1 + 1
> vect4.f:4: note: skip.
> vect4.f:4: note: ==> examining statement: if (i_1 == D.813_29) goto
> <L35>; else goto <L29>;
> vect4.f:4: note: skip.
> vect4.f:4: note: ==> examining statement: <L29>:
> vect4.f:4: note: skip.
> vect4.f:4: note: === vect_analyze_dependences ===
> vect4.f:4: note: === vect_analyze_data_ref_accesses ===
> vect4.f:4: note: not consecutive access
> ^^^^^^^^^^^^^^^^^^^^^^ This is incorrect. The accesses *are*
> consecutive; it's just that there is a "jump" at the beginning.
> vect4.f:4: note: not vectorized: complicated access pattern.
> vect4.f:4: note: bad data access.
>
> Hope this helps.
>
> Kind regards,
>
> --
> Toon Moene - e-mail: [EMAIL PROTECTED] - phone: +31 346 214290
> Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
> A maintainer of GNU Fortran 95: http://gcc.gnu.org/fortran/