Hello.
I am trying to run a simple program with SVE instructions on gem5. However, the
output with the ExecAll debug flag suggests there is an issue with the decoder.
Here is the test code:
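#include <stdio.h>

#define STREAM_ARRAY_SIZE 16

/* The array declarations were left out of the listing; global int arrays are assumed. */
int A[STREAM_ARRAY_SIZE], B[STREAM_ARRAY_SIZE];

int add(int * restrict p, int * restrict q);

int main(void)
{
    for (int j=0; j<STREAM_ARRAY_SIZE; j++)
    {
        A[j]=3; B[j]=2;
    }
    int x=add(A,B);
    printf("return %d \n",A[3]); // should print 6, does not in gem5
    return 0;
}

int add(int * restrict p, int * restrict q)
{
    for (int i=0; i<STREAM_ARRAY_SIZE; i+=1)
    {
        *(p+i)=*(q+i)+4;
    }
    printf("dummy %d %d \n", *(p+3), *(q+3)); // should print 6 and 2, does not in gem5
    return *(p+3);
}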
I compiled it with the GCC cross-compiler for Arm using the following command:
aarch64-linux-gnu-gcc-11 -O3 -static -mcpu=a64fx+sve2 -msve-vector-bits=512 -o test test.c
Without -mcpu=a64fx+sve2, SVE instructions are not generated.
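Because -msve-vector-bits=512 makes GCC generate code that assumes 512-bit vectors, the binary only behaves correctly if the simulated CPU actually implements that vector length. A minimal sketch of a probe that could be added to the test to check this is given here. The function names runtime_vl_bytes and vl_matches_build are hypothetical, and the inline assembly assumes an AArch64 build with SVE enabled (on any other target the probe simply returns 0).

```c
#include <assert.h>

/* Vector length in bytes as seen by the running code, or 0 when the
 * binary was not built for AArch64 with SVE enabled. */
unsigned long runtime_vl_bytes(void)
{
#if defined(__aarch64__) && defined(__ARM_FEATURE_SVE)
    unsigned long vl;
    /* RDVL Xd, #1 writes (1 * vector length in bytes) to Xd. */
    __asm__ volatile("rdvl %0, #1" : "=r"(vl));
    return vl;
#else
    return 0;
#endif
}

/* Returns 1 when the runtime vector length equals the width the binary
 * was compiled for (e.g. 512 for -msve-vector-bits=512), else 0. */
int vl_matches_build(unsigned long vl_bytes, unsigned long build_bits)
{
    assert(build_bits % 128 == 0); /* SVE lengths are multiples of 128 bits */
    return vl_bytes * 8 == build_bits;
}
```

Printing runtime_vl_bytes() from main under both simulators would show whether they agree on the vector length; a value of 16 would correspond to 128-bit vectors and 64 to the 512 bits the binary was built for.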
Here is the gem5 command I used:
./build/ARM/gem5.opt ./configs/deprecated/example/se.py --cpu-type=ArmO3CPU --caches --cacheline_size=64 --mem-size=8GB --arm-iset=aarch64 -c ./test
I have also used "./configs/example/arm/starter_se.py", but the results are
the same.
When I use --debug-flags=ExecAll, I see the following issues:
1) 12589500: system.cpu: A0 T0 : 0x400524 @main+4 : ptrue p0, VL64
: SimdPredAlu
: D=[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0] FetchSeq=14292 CPSeq=4962 flags=()
The D=[] should not be all zeros.
2)
12591000: system.cpu: A0 T0 : 0x400550 @main+48 : st1 {z1}, p0/z, ,
[x19] : MemWrite :
A=0x491040 FetchSeq=14305 CPSeq=4975 flags=(IsInteger|IsVector|IsStore)
12591000: system.cpu: A0 T0 : 0x400554 @main+52 : st1 {z0}, p0/z, ,
[x19, #1, mul vl] : MemWrite : A=0x491050 FetchSeq=14306 CPSeq=4976
flags=(IsInteger|IsVector|IsStore)
The second address (A=) should be 0x491080, not 0x491050.
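As a rough cross-check of issue 1) above, the sketch below models how many elements PTRUE activates for a fixed "VLn" pattern such as VL64. The function name and parameters are hypothetical and not taken from gem5, and byte-sized elements are assumed. The architectural rule for these patterns is that n elements become active only if the vector holds at least n elements; otherwise none do. Under that rule an all-zero predicate would be consistent with a vector length shorter than 64 bytes, though this is only one possible explanation and not a confirmed diagnosis.

```c
#include <assert.h>

/* Number of active elements produced by PTRUE with a fixed "VLn" pattern.
 * vl_bytes:    vector length in bytes (hypothetical parameter name)
 * esize_bytes: element size in bytes (1 for .b, 4 for .s)
 * n:           element count requested by the pattern (64 for VL64)
 * If the vector holds fewer than n elements, no element is active. */
unsigned ptrue_vl_active(unsigned vl_bytes, unsigned esize_bytes, unsigned n)
{
    assert(esize_bytes > 0 && vl_bytes % esize_bytes == 0);
    unsigned elements = vl_bytes / esize_bytes;
    return elements >= n ? n : 0;
}
```

For example, ptrue_vl_active(64, 1, 64) evaluates to 64 (a 512-bit vector), while ptrue_vl_active(16, 1, 64) evaluates to 0 (a 128-bit vector).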
I have run the same test on the RIKEN simulator, which is built on top of gem5
for the Fujitsu A64FX.
Here are the same instructions as seen in the RIKEN simulator:
1) 15322000: system.cpu A0 T0 : @main+4 : ptrue p0, VL64 :
SimdPredAlu :
D=0b[0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_0000_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111]
FetchSeq=18146 CPSeq=5254 flags=()
As you can see, my data arrays are 64 bytes and the appropriate bits in the
predicate register are set to 1.
2)
15323000: system.cpu A0 T0 : @main+48 : st1 {z1}, p0/z, , [x19] :
SveMemWrite :
A=0x491040 FetchSeq=18159 CPSeq=5267
flags=(IsInteger|IsVector|IsMemRef|IsStore)
15323000: system.cpu A0 T0 : @main+52 : st1 {z0}, p0/z, , [x19, #1, mul
vl] : SveMemWrite :
A=0x491080 FetchSeq=18160 CPSeq=5268
The second address is calculated as 0x491080, which is the correct result for
[x19, #1, mul vl], as vl=64.
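The address arithmetic above can be sketched as follows. This is an illustrative model for a full-width contiguous store, not code from either simulator, and the function name sve_mul_vl_addr is hypothetical. It shows that a 64-byte vector length gives 0x491080, while a 16-byte (128-bit) vector length gives 0x491050, the value seen in the current gem5 trace.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address for the "[Xn, #imm, mul vl]" addressing form:
 * base + imm * (vector length in bytes). */
uint64_t sve_mul_vl_addr(uint64_t base, int64_t imm, uint64_t vl_bytes)
{
    /* SVE vector lengths are multiples of 128 bits (16 bytes). */
    assert(vl_bytes >= 16 && vl_bytes % 16 == 0);
    return base + (uint64_t)(imm * (int64_t)vl_bytes);
}
```

With base 0x491040 and imm 1, sve_mul_vl_addr returns 0x491080 for vl_bytes = 64 and 0x491050 for vl_bytes = 16.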
I tried to compare the files in src/arch/arm/ISA from the RIKEN simulator with
those in current gem5. Since the RIKEN simulator is based on an older gem5,
there are obvious syntax differences. Other than that, I have found two things:
1) In ArmISA.py in the RIKEN simulator, there is this:
id_aa64pfr0_el1 = Param.UInt64(0x0000000100000022, "AArch64 Processor Feature Register 0")
I did not find anything similar in current gem5. I did find id_aa64pfr0_el1 in
arch/arm/regs/misc.hh, but its value wasn't set anywhere.
2) In ArmISA.py in current gem5, there is a "FEAT_SVE" extension in class
ArmDefaultSERelease. However, this is for Armv8.2, and I don't know how to
specify this architecture on the command line.
What I am trying to find out is whether (a) I am missing runtime flags that
would enable proper SVE execution in gem5, (b) the problem comes from my
compile-time flags, since I am setting -mcpu to a64fx (setting -march to
armv8.2-a+sve or similar does not produce SVE instructions; it has to be
-mcpu=a64fx+sve), or (c) this is a bug in current gem5 itself. Any suggestions
would be appreciated.
Thank you.