On Tue, Oct 01, 2024 at 10:02:58PM GMT, Daniel Henrique Barboza wrote:
...
> +/*
> + * RISCV IOMMU Address Translation Lookup - Page Table Walk
> + *
> + * Note: Code is based on get_physical_address() from
> target/riscv/cpu_helper.c
> + * Both implementation can be merged into single helper function in future.
> + * Keeping them separate for now, as error reporting and flow specifics are
> + * sufficiently different for separate implementation.
> + *
> + * @s : IOMMU Device State
> + * @ctx : Translation context for device id and process address space
> id.
> + * @iotlb : translation data: physical address and access mode.
> + * @return : success or fault cause code.
> + */
> +static int riscv_iommu_spa_fetch(RISCVIOMMUState *s, RISCVIOMMUContext *ctx,
> + IOMMUTLBEntry *iotlb)
> +{
> + dma_addr_t addr, base;
> + uint64_t satp, gatp, pte;
> + bool en_s, en_g;
> + struct {
> + unsigned char step;
> + unsigned char levels;
> + unsigned char ptidxbits;
> + unsigned char ptesize;
> + } sc[2];
> + /* Translation stage phase */
> + enum {
> + S_STAGE = 0,
> + G_STAGE = 1,
> + } pass;
> + MemTxResult ret;
> +
> + satp = get_field(ctx->satp, RISCV_IOMMU_ATP_MODE_FIELD);
> + gatp = get_field(ctx->gatp, RISCV_IOMMU_ATP_MODE_FIELD);
> +
> + en_s = satp != RISCV_IOMMU_DC_FSC_MODE_BARE;
> + en_g = gatp != RISCV_IOMMU_DC_IOHGATP_MODE_BARE;
> +
> + /*
> + * Early check for MSI address match when IOVA == GPA. This check
> + * is required to ensure MSI translation is applied in case
> + * first-stage translation is set to BARE mode. In this case IOVA
> + * provided is a valid GPA. Running translation through page walk
> + * second stage translation will incorrectly try to translate GPA
> + * to host physical page, likely hitting IOPF.
> + */
Why was this comment expanded from the simple
"Early check for MSI address match when IOVA == GPA."
The comment is now incorrect since the check is required even when
first-stage translation is not BARE. I just skimmed the spec again trying
to figure out if the removal of '!en_s' is a hack or a fix, and I'm
inclined to say "fix", but it's an incomplete fix. I found a sentence that
says
"If the virtual memory scheme selected for first-stage is Bare but the
scheme for the second-stage is not Bare then the IOVA is a GPA."
which, in a way, defines a GPA to only be a GPA when second-stage is used
(and all MSI translation specifications refer to GPAs). However, maybe I
missed it, but I couldn't find any actual reason that the MSI table can't
be used when first-stage is not BARE and second-stage is (and, of course,
it makes no difference for single-stage translations to call IOVAs GPAs
or not).
Now, I also see
"If the virtual memory scheme selected for neither stage is Bare then the
IOVA is a VA. Two-stage address translation is in effect. The first-stage
translates the VA to a GPA and the second-stage translates the GPA to a
SPA."
in the spec, which means we should probably change the removal of '!en_s'
to '!(en_s && en_g)'. VFIO+irqbypass would still work with that and, when
Linux learns to support two-stage translation, we wouldn't incorrectly try
to check for a GVA in the MSI table.
Thanks,
drew