On 7/3/19 12:11 PM, Richard Henderson wrote:
> On 7/1/19 6:35 AM, Jan Bobek wrote:
>> +sub write_mov_rr($$)
>> +{
>> + my ($r1, $r2) = @_;
>> +
>> + my %insn = (opcode => X86OP_MOV,
>> + modrm => {mod => MOD_DIRECT,
>> + reg => ($r1 & 0x7),
>> + rm => ($r2 & 0x7)});
>> +
>> + $insn{rex}{w} = 1 if $is_x86_64;
>> + $insn{rex}{r} = 1 if $r1 >= 8;
>> + $insn{rex}{b} = 1 if $r2 >= 8;
>
> This is where maybe it's better to leave rex.[rb] to risugen_x86_asm, and just
> leave $modrm{reg} and $modrm{rm} as 4-bit quantities.

That's what I have in v3, stay tuned!

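A minimal sketch of the suggestion above, assuming callers pass 4-bit register numbers and the assembler layer derives REX.R/REX.B itself. The helper names `split_modrm_regs` and `rex_byte` are hypothetical and are not taken from risugen_x86_asm.

```perl
use strict;
use warnings;

# Hypothetical helper: split 4-bit register numbers into the 3-bit
# ModR/M fields plus the REX.R / REX.B extension bits.
sub split_modrm_regs {
    my ($reg, $rm) = @_;
    my %modrm = (reg => $reg & 0x7, rm => $rm & 0x7);
    my %rex;
    $rex{r} = 1 if $reg >= 8;   # high bit of ModR/M.reg
    $rex{b} = 1 if $rm  >= 8;   # high bit of ModR/M.rm
    return (\%modrm, \%rex);
}

# Assemble a REX prefix byte, laid out as 0100WRXB.
sub rex_byte {
    my (%rex) = @_;
    return 0x40
        | (($rex{w} ? 1 : 0) << 3)
        | (($rex{r} ? 1 : 0) << 2)
        | (($rex{x} ? 1 : 0) << 1)
        |  ($rex{b} ? 1 : 0);
}
```

With this split, a caller such as write_mov_rr would only pass $r1 and $r2 through unmasked, and set rex.w itself where a 64-bit operand size is wanted.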
>> +sub write_mov_reg_imm($$)
>> +{
>> + my ($reg, $imm) = @_;
>> + my %insn;
>> +
>> + if (0 <= $imm && $imm <= 0xffffffff) {
>
> Should include !$is_x86_64 here,
>
>> + %insn = (opcode => {value => 0xB8 | ($reg & 0x7), len => 1},
>> + imm => {value => $imm, len => 4});
>> + } elsif (-0x80000000 <= $imm && $imm <= 0x7fffffff) {
>> + %insn = (opcode => {value => 0xC7, len => 1},
>> + modrm => {mod => MOD_DIRECT,
>> + reg => 0, rm => ($reg & 0x7)},
>> + imm => {value => $imm, len => 4});
>> +
>> + $insn{rex}{w} = 1 if $is_x86_64;
>
> making this unconditional.

Doesn't B8 (without REX.W) work for x86_64, too? A 32-bit write zeroes
the upper 32 bits of the destination, so it's effectively zero-extending,
and it's at least one byte shorter than C7 (no ModR/M byte needed).

That being said, I moved most of this function to risugen_x86_asm and
included a bunch of comments regarding different cases, so it should
be easier to understand.

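To make the two cases concrete, here is a rough sketch of how the encoding choice could look. It is not the v3 code: `encode_mov_reg_imm` is a hypothetical name, it returns raw bytes instead of calling write_insn, and it ignores 64-bit immediates entirely.

```perl
use strict;
use warnings;

# Hypothetical helper: encode "mov reg, imm" for immediates that fit in
# 32 bits. $reg is a 4-bit register number; returns a list of bytes.
sub encode_mov_reg_imm {
    my ($reg, $imm, $is_x86_64) = @_;
    my @bytes;

    die "immediate not handled by this sketch\n"
        unless -0x80000000 <= $imm && $imm <= 0xffffffff;
    die "registers 8..15 need 64-bit mode\n"
        if !$is_x86_64 && $reg >= 8;

    if (!$is_x86_64 || $imm >= 0) {
        # B8+rd imm32: in 64-bit mode a 32-bit write zero-extends into
        # the full 64-bit register, so REX.W is not needed.
        push @bytes, 0x41 if $reg >= 8;              # REX.B
        push @bytes, 0xB8 | ($reg & 0x7);
    } else {
        # REX.W C7 /0 imm32: the immediate is sign-extended to 64 bits.
        push @bytes, 0x48 | ($reg >= 8 ? 0x01 : 0);  # REX.W (+ REX.B)
        push @bytes, 0xC7;
        push @bytes, 0xC0 | ($reg & 0x7);            # mod=11, reg=0, rm
    }

    # 32-bit little-endian immediate.
    push @bytes, unpack('C4', pack('V', $imm & 0xffffffff));
    return @bytes;
}
```

For a low register the B8 form is 5 bytes and the REX.W C7 form is 7 bytes; for registers 8..15 both need a REX prefix, so the difference shrinks to the ModR/M byte.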
>> +sub write_random_ymmdata()
>> +{
>> + my $ymm_cnt = $is_x86_64 ? 16 : 8;
>> + my $ymm_len = 32;
>> + my $datalen = $ymm_cnt * $ymm_len;
>> +
>> + # Generate random data blob
>> + write_random_datablock($datalen);
>> +
>> + # Load the random data into YMM regs.
>> + for (my $ymm_reg = 0; $ymm_reg < $ymm_cnt; $ymm_reg++) {
>> + write_insn(vex => {l => VEX_L_256, p => VEX_P_DATA16,
>> + r => !($ymm_reg >= 8)},
>
> Again, vex.r should be handled in vex_encode.

As I said, there will be more high-level instruction-assembling
functions exported by risugen_x86_asm in v3, which take care of this.

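As an illustration of what handling vex.r inside the encoder could mean, the sketch below builds a 2-byte VEX prefix from a 4-bit register number. The name `vex2_prefix` and its argument layout are assumptions for this example, not the interface of vex_encode.

```perl
use strict;
use warnings;

# Hypothetical helper: build a 2-byte VEX prefix (C5 xx) from a 4-bit
# destination register number. VEX.R and VEX.vvvv are stored inverted.
sub vex2_prefix {
    my (%args) = @_;                        # reg (0..15), l (0|1), pp (0..3)
    my $r_inv    = $args{reg} >= 8 ? 0 : 1; # inverted high bit of ModR/M.reg
    my $vvvv_inv = 0xF;                     # 1111b: no extra source operand
    my $byte = ($r_inv << 7)
             | ($vvvv_inv << 3)
             | (($args{l} ? 1 : 0) << 2)
             | ($args{pp} & 0x3);
    return (0xC5, $byte);
}
```

The caller then only supplies the full register number, and the `!($ymm_reg >= 8)` inversion lives in one place.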
>> + opcode => X86OP_VMOVAPS,
>> + modrm => {mod => MOD_INDIRECT_DISP32,
>> + reg => ($ymm_reg & 0x7),
>> + rm => REG_EAX},
>> + disp => {value => $ymm_reg * $ymm_len,
>> + len => 4});
>> + }
>
> So... this now generates code that cannot run without AVX2.
>
> Which is probably fine for testing right now, since we do
> want to be able to notice effects of SSE/AVX insns on the
> high bits of the registers.
>
> But we'll probably need to have the same --xsave=foo
> command-line option that we have for risu itself.
>
> That would let you initialize only 16-bytes here, or
> for avx512 initialize 64-bytes, plus the k-registers.

Ah yes, indeed.

-Jan
