On 12/16/22 18:31, 钟居哲 wrote:
Register allocation (RA) doesn't affect the assembler checks, since I relax the registers in them.
Each assembler check has its own goal. For example:

Take code like this:

+void foo2 (void * restrict in, void * restrict out, int n)
+{
+  for (int i = 0; i < n; i++)
+    {
+      vuint16mf4_t v = *(vuint16mf4_t*)(in + i);
+      *(vuint16mf4_t*)(out + i) = v;
+    }
+}

Assembler check:

scan-assembler-times 
{vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*zero,\s*e16,\s*mf4,\s*t[au],\s*m[au]\s+\.L[0-9]\:\s+vle16\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\((?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7])\

I don't care which vector register is used, since I relax the register in the
assembler check: (?:v[0-9]|v[1-2][0-9]|v3[0-1]) matches any vector register
v0-v31.

I also relax the scalar register: (?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),
so it can be any of x1-x31 by ABI name (everything except zero).
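As a quick sanity check of the two relaxed register patterns, here is a minimal Python sketch. It is not part of the testsuite; the helper name accepts and the name lists are made up for illustration. Note that it uses a full match on each register name, which is stricter than the unanchored search scan-assembler-times performs. Under that assumption, the vector alternation accepts exactly v0-v31, and the scalar alternation accepts the ABI names of x1-x31 but not zero.

```python
import re

# Alternations copied from the assembler check above.
VREG = re.compile(r"(?:v[0-9]|v[1-2][0-9]|v3[0-1])")
XREG = re.compile(r"(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7])")

# ABI names for v0-v31 and for x1-x31 (fp is an alias of s0).
VECTOR_NAMES = [f"v{i}" for i in range(32)]
SCALAR_NAMES = (
    ["ra", "sp", "gp", "tp", "fp"]
    + [f"t{i}" for i in range(7)]
    + [f"s{i}" for i in range(12)]
    + [f"a{i}" for i in range(8)]
)


def accepts(pattern, name):
    """Return True if the alternation matches the whole register name."""
    return pattern.fullmatch(name) is not None


if __name__ == "__main__":
    print(all(accepts(VREG, n) for n in VECTOR_NAMES))
    print(all(accepts(XREG, n) for n in SCALAR_NAMES))
    print(accepts(XREG, "zero"), accepts(VREG, "v32"))
```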

The only strict check is that the vsetvl is hoisted outside the loop,
meaning the vsetvl must appear before the label .L[0-9]:

vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*zero,\s*e16,\s*mf4,\s*t[au],\s*m[au]\s+\.L[0-9]

You can see that the pattern ends with \s+\.L[0-9], which makes sure the VSETVL
pass successfully hoists the vsetvl instruction outside the loop.
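To illustrate what that prefix accepts and rejects, here is a small Python sketch. The two assembly snippets are hand-written for illustration and are not actual compiler output; the function name vsetvl_before_label is likewise hypothetical. The regexp is the prefix quoted above, unchanged.

```python
import re

# Prefix of the assembler check: a vsetvli followed only by whitespace
# and then a local label, i.e. the vsetvli sits just before the loop.
HOIST_RE = re.compile(
    r"vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),"
    r"\s*zero,\s*e16,\s*mf4,\s*t[au],\s*m[au]\s+\.L[0-9]"
)

# Hand-written examples, not real GCC output.
HOISTED = (
    "\tvsetvli\ta5,zero,e16,mf4,ta,ma\n"
    ".L3:\n"
    "\tvle16.v\tv24,0(a0)\n"
    "\tvse16.v\tv24,0(a1)\n"
    "\tbne\ta0,a4,.L3\n"
)
NOT_HOISTED = (
    ".L3:\n"
    "\tvsetvli\ta5,zero,e16,mf4,ta,ma\n"
    "\tvle16.v\tv24,0(a0)\n"
    "\tvse16.v\tv24,0(a1)\n"
    "\tbne\ta0,a4,.L3\n"
)


def vsetvl_before_label(asm):
    """Return True if a vsetvli is immediately followed by a .L label."""
    return HOIST_RE.search(asm) is not None


if __name__ == "__main__":
    print(vsetvl_before_label(HOISTED))
    print(vsetvl_before_label(NOT_HOISTED))
```

Changing a5 or v24 to any other allowed register does not change the result; only the position of the vsetvli relative to the label does.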

I tried to use check-function-bodies, but it fails since it cannot recognize the
label, which is the most important part in such cases.

Ah, I should have looked at those regexps more closely. Understood about the checking for hoisting the vsetvl. Though it makes me wonder if we'd be better off dumping information out of the vsetvl pass.

In the case of hoisting we could dump the loop nest of the original evaluation block and the loop nest of the new vsetvl location, then we scan for those in the vsetvl pass dump. While it doesn't check the assembly code, it's probably just as good if not better.

Consider that as an alternative. But I'm not going to insist on it. I just know we've had a lot of trouble through the years where assembly code changes slightly, causing test fails. So I try to avoid too much assembly scanning if it can be avoided. Often the easiest way to get the same basic effect is to dump at the transformation point and scan for those markers in the dump file.
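To make the idea concrete, here is a rough Python sketch of scanning a dump for such a marker. The marker text ("hoist vsetvl: from bb ... to bb ...") is purely an assumed format for illustration; the vsetvl pass does not necessarily print anything like it, and in the testsuite this would be a dump-scanning directive rather than Python.

```python
import re

# Hypothetical dump marker; the real vsetvl pass dump may look different.
MARKER_RE = re.compile(
    r"hoist vsetvl: from bb (\d+) \(loop depth (\d+)\) "
    r"to bb (\d+) \(loop depth (\d+)\)"
)

# Hand-written sample dump, not real pass output.
SAMPLE_DUMP = (
    ";; Function foo2\n"
    "hoist vsetvl: from bb 4 (loop depth 1) to bb 2 (loop depth 0)\n"
)


def hoisted_out_of_loop(dump_text):
    """Return True if any recorded hoist moved to a shallower loop depth."""
    for m in MARKER_RE.finditer(dump_text):
        old_depth, new_depth = int(m.group(2)), int(m.group(4))
        if new_depth < old_depth:
            return True
    return False


if __name__ == "__main__":
    print(hoisted_out_of_loop(SAMPLE_DUMP))
```

A check like this does not mention registers, label names, or instruction order at all, which is why it tends to survive small changes in the generated assembly.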

Jeff
