On 20.07.19 21:25, Julien Grall wrote:
Hi,
Hi, Julien.
Apologies for the late answer. I have been traveling for the past few
weeks.
Absolutely no problem. Thank you for your review.
On 6/26/19 11:30 AM, Oleksandr Tyshchenko wrote:
From: Oleksandr Tyshchenko <[email protected]>
The IPMMU-VMSA is VMSA-compatible I/O Memory Management Unit (IOMMU)
which provides address translation and access protection functionalities
to processing units and interconnect networks.
Do you have a link to the specification?
All I have is a TRM. Unfortunately, I can't share it.
My knowledge about the IPMMU is quite limited, so for now, I will
mostly look at Xen specific code. It would be good if someone with a
better knowledge of the driver could have a look.
I understand your point. Makes sense. I will CC a person who is familiar
with the H/W.
[...]
diff --git a/xen/drivers/passthrough/Kconfig b/xen/drivers/passthrough/Kconfig
index a3c0649..e57aa29 100644
--- a/xen/drivers/passthrough/Kconfig
+++ b/xen/drivers/passthrough/Kconfig
@@ -12,4 +12,17 @@ config ARM_SMMU
Say Y here if your SoC includes an IOMMU device implementing the
ARM SMMU architecture.
+
+config IPMMU_VMSA
+ bool "Renesas IPMMU-VMSA found in R-Car Gen3 SoCs"
+ default y
I would prefer this to default to N for the time being.
ok
+ depends on ARM_64
+ ---help---
+ Support for implementations of the Renesas IPMMU-VMSA found
+ in R-Car Gen3 SoCs.
+
+ Say Y here if you are using newest R-Car Gen3 SoCs revisions which IPMMU
What is new now will be old soon ;). So it would be best if you give a
precise revision here.
yes ;). Will clarify.
+ hardware supports stage 2 translation table format and is able to use
+ CPU's P2M table as is.
+
endif
diff --git a/xen/drivers/passthrough/arm/Makefile b/xen/drivers/passthrough/arm/Makefile
index b3efcfd..40ac7a9 100644
--- a/xen/drivers/passthrough/arm/Makefile
+++ b/xen/drivers/passthrough/arm/Makefile
@@ -1,2 +1,3 @@
obj-y += iommu.o
obj-$(CONFIG_ARM_SMMU) += smmu.o
+obj-$(CONFIG_IPMMU_VMSA) += ipmmu-vmsa.o
diff --git a/xen/drivers/passthrough/arm/ipmmu-vmsa.c b/xen/drivers/passthrough/arm/ipmmu-vmsa.c
new file mode 100644
index 0000000..5091c61
--- /dev/null
+++ b/xen/drivers/passthrough/arm/ipmmu-vmsa.c
@@ -0,0 +1,1487 @@
+/*
+ * xen/drivers/passthrough/arm/ipmmu-vmsa.c
+ *
+ * Driver for the Renesas IPMMU-VMSA found in R-Car Gen3 SoCs.
+ *
+ * The IPMMU-VMSA is VMSA-compatible I/O Memory Management Unit (IOMMU)
+ * which provides address translation and access protection functionalities
+ * to processing units and interconnect networks.
+ *
+ * Please note, current driver is supposed to work only with newest Gen3 SoCs
+ * revisions which IPMMU hardware supports stage 2 translation table format and
+ * is able to use CPU's P2M table as is.
+ *
+ * Based on Linux's IPMMU-VMSA driver from Renesas BSP:
+ * drivers/iommu/ipmmu-vmsa.c
What are the major differences compared to the Linux driver?
Well, the major differences are:
1. Stage 1/Stage 2 translation. The Linux driver supports Stage 1
translation only (with the Stage 1 translation table format) and manages
the page tables by itself. The Xen driver supports Stage 2 translation (with
the Stage 2 translation table format) in order to share the page table
with the CPU. Stage 1 translation is always bypassed in the Xen driver.
So, the Xen driver is supposed to be used only with the newest Gen3 SoC
revisions (H3 ES3.0, M3 ES3.0, etc.) whose IPMMU hardware does support the
stage 2 translation table format.
2. AArch64 support. The Linux driver uses VMSAv8-32 mode, while the Xen driver
enables Armv8 VMSAv8-64 mode to cover up to a 40-bit input address.
3. Context bank (set of page tables) usage. In Xen, each context bank is
mapped to one Xen domain, so all devices passed through to the same Xen
domain share the same context bank.
+ * you can find at:
+ * url: git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas-bsp.git
+ * branch: v4.14.75-ltsi/rcar-3.9.2
+ * commit: a5266d298124874c2c06b8b13d073f6ecc2ee355
Is there any reason to use the BSP driver and not the one provided by
Linux directly?
I was thinking the BSP driver is a *little bit* more up to date than the
Linux one. At some point there was a big difference between the mainline
and BSP drivers, but now the difference is not big and is mostly in
DDR_BACKUP and WHITELIST support. I looked at the mainline driver as well
when implementing the Xen driver.
[...]
+#define dev_archdata(dev) ((struct ipmmu_vmsa_xen_device *)dev->archdata.iommu)
+
+/* Root/Cache IPMMU device's information */
+struct ipmmu_vmsa_device
+{
AFAICT, this file was converted to Xen coding style. If so, the style
for struct is
struct ... {
...
};
Yes, will correct everywhere in this file.
+ struct device *dev;
+ void __iomem *base;
+ struct ipmmu_vmsa_device *root;
+ struct list_head list;
+ unsigned int num_utlbs;
+ unsigned int num_ctx;
+ spinlock_t lock; /* Protects ctx and domains[] */
+ DECLARE_BITMAP(ctx, IPMMU_CTX_MAX);
+ struct ipmmu_vmsa_domain *domains[IPMMU_CTX_MAX];
+};
+
+/*
+ * Root/Cache IPMMU domain's information
+ *
+ * Root IPMMU device is assigned to Root IPMMU domain while Cache IPMMU device
+ * is assigned to Cache IPMMU domain. Master devices are connected to Cache
+ * IPMMU devices through specific ports called micro-TLBs.
+ * All Cache IPMMU devices, in turn, are connected to Root IPMMU device
+ * which manages IPMMU context.
+ */
+struct ipmmu_vmsa_domain
+{
Ditto.
ok
+ /*
+ * IPMMU device assigned to this IPMMU domain.
+ * Either Root device which is located at the main memory bus domain or
+ * Cache device which is located at each hierarchy bus domain.
+ */
+ struct ipmmu_vmsa_device *mmu;
+
+ /* Context used for this IPMMU domain */
+ unsigned int context_id;
+
+ /* Xen domain associated with this IPMMU domain */
+ struct domain *d;
+
+ /* The fields below are used for Cache IPMMU domain only */
+
+ /*
+ * Used to keep track of the master devices which are attached to this
+ * IPMMU domain (domain users). Master devices behind the same IPMMU device
+ * are grouped together by putting into the same IPMMU domain.
+ * Only when the refcount reaches 0 this IPMMU domain can be destroyed.
+ */
+ int refcount;
If the refcount cannot be negative, then I would prefer an unsigned int here.
It is always >= 0.
+ /* Used to link this IPMMU domain for the same Xen domain */
+ struct list_head list;
+};
[...]
+/* Read/Write Access */
+static u32 ipmmu_read(struct ipmmu_vmsa_device *mmu, unsigned int offset)
If you are going to use Xen coding style, then this should be
"uint32_t". The same is valid everywhere in this file, I am not going
to mention all of them :).
Yes, will change everywhere in this file.
+{
+ return readl(mmu->base + offset);
+}
+
+static void ipmmu_write(struct ipmmu_vmsa_device *mmu, unsigned int offset,
+ u32 data)
+{
+ writel(data, mmu->base + offset);
+}
+
+static u32 ipmmu_ctx_read_root(struct ipmmu_vmsa_domain *domain,
+ unsigned int reg)
+{
+ return ipmmu_read(domain->mmu->root,
+ domain->context_id * IM_CTX_SIZE + reg);
+}
+
+static void ipmmu_ctx_write_root(struct ipmmu_vmsa_domain *domain,
+ unsigned int reg, u32 data)
+{
+ ipmmu_write(domain->mmu->root,
+ domain->context_id * IM_CTX_SIZE + reg, data);
+}
+
+static void ipmmu_ctx_write_cache(struct ipmmu_vmsa_domain *domain,
+ unsigned int reg, u32 data)
+{
+ ASSERT(reg == IMCTR);
What's the rationale for passing reg as a parameter if it can only be
equal to IMCTR?
Good question. I tried to retain the same interface as for
ipmmu_ctx_write_root(_all) for consistency.
The Cache IPMMU device has context registers other than IMCTR, but they
are not used by this driver.
Shall I drop the reg parameter? (A possible variant is sketched after the
quoted function below.)
+
+ /* Mask fields which are implemented in IPMMU-MM only. */
+ if ( !ipmmu_is_root(domain->mmu) )
+ ipmmu_write(domain->mmu, domain->context_id * IM_CTX_SIZE + reg,
+ data & IMCTR_COMMON_MASK);
+}
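For clarity, dropping it would shrink the helper to something like this
(just a sketch, assuming IMCTR stays the only context register the driver
ever writes on the Cache side):

/* Sketch only: IMCTR is the only Cache-side context register we touch. */
static void ipmmu_ctx_write_cache(struct ipmmu_vmsa_domain *domain,
                                  uint32_t data)
{
    /* Mask fields which are implemented in IPMMU-MM only. */
    if ( !ipmmu_is_root(domain->mmu) )
        ipmmu_write(domain->mmu, domain->context_id * IM_CTX_SIZE + IMCTR,
                    data & IMCTR_COMMON_MASK);
}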
[...]
+/* Enable MMU translation for the micro-TLB. */
+static void ipmmu_utlb_enable(struct ipmmu_vmsa_domain *domain,
+ unsigned int utlb)
+{
+ struct ipmmu_vmsa_device *mmu = domain->mmu;
+
+ /*
+ * TODO: Reference-count the micro-TLB as several bus masters can be
+ * connected to the same micro-TLB.
+ */
What would be the exact problem if this is not handled? Could a utlb
be shared between multiple domains?
Actually, I tried to retain this TODO, as well as the whole "TLB and
micro-TLB Management" code, as close to the original driver as possible.
Most devices have a unique micro-TLB connection, but there are cases when
they don't. At the moment, I don't see any problem if such devices (which
share the same utlb) are assigned to the same Xen domain.
In this case we would just enable the same utlb more than once at domain
creation time (assign a device) and disable the same utlb more than once
at domain destruction time (deassign a device). A possible refcounting
scheme is sketched after the quoted function below.
The worst case scenario would be when these devices are assigned to
different Xen domains. So, I think, the same utlb *can't* be shared
between multiple Xen domains, since it points to the context bank to use
for the page walk.
+ ipmmu_write(mmu, IMUASID(utlb), 0);
+ ipmmu_write(mmu, IMUCTR(utlb), ipmmu_read(mmu, IMUCTR(utlb)) |
+ IMUCTR_TTSEL_MMU(domain->context_id) | IMUCTR_MMUEN);
+}
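Just to show what the TODO could turn into, here is a minimal refcounting
sketch (the utlb_refcount[] array in struct ipmmu_vmsa_device and the
disable counterpart are my assumptions, not part of the posted patch):

/* Sketch only: per-micro-TLB reference counting. */
static void ipmmu_utlb_enable(struct ipmmu_vmsa_domain *domain,
                              unsigned int utlb)
{
    struct ipmmu_vmsa_device *mmu = domain->mmu;

    /* Program the micro-TLB only for its first user. */
    if ( mmu->utlb_refcount[utlb]++ == 0 )
    {
        ipmmu_write(mmu, IMUASID(utlb), 0);
        ipmmu_write(mmu, IMUCTR(utlb), ipmmu_read(mmu, IMUCTR(utlb)) |
                    IMUCTR_TTSEL_MMU(domain->context_id) | IMUCTR_MMUEN);
    }
}

static void ipmmu_utlb_disable(struct ipmmu_vmsa_domain *domain,
                               unsigned int utlb)
{
    struct ipmmu_vmsa_device *mmu = domain->mmu;

    /* Disable translation only when the last user goes away. */
    if ( --mmu->utlb_refcount[utlb] == 0 )
        ipmmu_write(mmu, IMUCTR(utlb), 0);
}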
[...]
+static int ipmmu_domain_init_context(struct ipmmu_vmsa_domain *domain)
+{
+ u64 ttbr;
s/u64/uint64_t/
ok
+ u32 tsz0;
+ int ret;
+
+ /* Find an unused context. */
+ ret = ipmmu_domain_allocate_context(domain->mmu->root, domain);
+ if ( ret < 0 )
+ return ret;
+
+ domain->context_id = ret;
+
+ /*
+ * TTBR0
+ * Use P2M table for this Xen domain.
+ */
+ ASSERT(domain->d != NULL);
+ ttbr = page_to_maddr(domain->d->arch.p2m.root);
+
+ dev_info(domain->mmu->root->dev, "d%d: Set IPMMU context %u (pgd 0x%"PRIx64")\n",
We introduced a format specifier to print a domain, so you can use %pd
in combination with just domain->d.
Such a convenient thing. Will use.
+ domain->d->domain_id, domain->context_id, ttbr);
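For reference, with %pd the call becomes roughly:

dev_info(domain->mmu->root->dev, "%pd: Set IPMMU context %u (pgd 0x%"PRIx64")\n",
         domain->d, domain->context_id, ttbr);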
+
+ ipmmu_ctx_write_root(domain, IMTTLBR0, ttbr & IMTTLBR0_TTBR_MASK);
+ ipmmu_ctx_write_root(domain, IMTTUBR0, (ttbr >> 32) & IMTTUBR0_TTBR_MASK);
+
+ /*
+ * TTBCR
+ * We use long descriptors with inner-shareable WBWA tables and allocate
+ * the whole "p2m_ipa_bits" IPA space to TTBR0. Use 4KB page granule.
+ * Start page table walks at first level. Bypass stage 1 translation
+ * when only stage 2 translation is performed.
I am not sure I understand the last sentence. You only use stage 2,
right? So shouldn't it be "Always bypass stage 1 translation"?
Yes, you are right. Will rephrase.
+ */
+ tsz0 = (64 - p2m_ipa_bits) << IMTTBCR_TSZ0_SHIFT;
+ ipmmu_ctx_write_root(domain, IMTTBCR, IMTTBCR_EAE | IMTTBCR_PMB |
+ IMTTBCR_SH0_INNER_SHAREABLE | IMTTBCR_ORGN0_WB_WA |
+ IMTTBCR_IRGN0_WB_WA | IMTTBCR_SL0_LVL_1 | tsz0);
[...]
+/* Fault Handling */
+static void ipmmu_domain_irq(struct ipmmu_vmsa_domain *domain)
+{
+ const u32 err_mask = IMSTR_MHIT | IMSTR_ABORT | IMSTR_PF | IMSTR_TF;
+ struct ipmmu_vmsa_device *mmu = domain->mmu;
+ u32 status;
+ u64 iova;
+
+ status = ipmmu_ctx_read_root(domain, IMSTR);
+ if ( !(status & err_mask) )
+ return;
+
+ iova = ipmmu_ctx_read_root(domain, IMELAR) |
+ ((u64)ipmmu_ctx_read_root(domain, IMEUAR) << 32);
+
+ /*
+ * Clear the error status flags. Unlike traditional interrupt flag
+ * registers that must be cleared by writing 1, this status register
+ * seems to require 0. The error address register must be read before,
+ * otherwise its value will be 0.
+ */
+ ipmmu_ctx_write_root(domain, IMSTR, 0);
+
+ /* Log fatal errors. */
+ if ( status & IMSTR_MHIT )
+ dev_err_ratelimited(mmu->dev, "d%d: Multiple TLB hits
@0x%"PRIx64"\n",
Similar remark for d%d here ...
ok
+ domain->d->domain_id, iova);
+ if ( status & IMSTR_ABORT )
+ dev_err_ratelimited(mmu->dev, "d%d: Page Table Walk Abort
@0x%"PRIx64"\n",
... here and ...
ok
+ domain->d->domain_id, iova);
+
+ /* Return if it is neither Permission Fault nor Translation Fault. */
+ if ( !(status & (IMSTR_PF | IMSTR_TF)) )
+ return;
+
+ /* Flush the TLB as required when IPMMU translation error occurred. */
+ ipmmu_tlb_invalidate(domain);
+
+ dev_err_ratelimited(mmu->dev, "d%d: Unhandled fault: status
0x%08x iova 0x%"PRIx64"\n",
... here.
ok
+ domain->d->domain_id, status, iova);
+}
+
+static void ipmmu_irq(int irq, void *dev, struct cpu_user_regs *regs)
+{
+ struct ipmmu_vmsa_device *mmu = dev;
+ unsigned int i;
+ unsigned long flags;
+
+ spin_lock_irqsave(&mmu->lock, flags);
+
+ /*
+ * When interrupt arrives, we don't know the context it is related to.
+ * So, check interrupts for all active contexts to locate a context
+ * with status bits set.
+ */
+ for ( i = 0; i < mmu->num_ctx; i++ )
+ {
+ if ( !mmu->domains[i] )
+ continue;
+ ipmmu_domain_irq(mmu->domains[i]);
+ }
+
+ spin_unlock_irqrestore(&mmu->lock, flags);
+}
+
+/* Master devices management */
+static int ipmmu_attach_device(struct ipmmu_vmsa_domain *domain,
+ struct device *dev)
+{
+ struct ipmmu_vmsa_master_cfg *cfg = dev_archdata(dev)->cfg;
+ struct ipmmu_vmsa_device *mmu = cfg->mmu;
+ unsigned int i;
+
+ if ( !mmu )
+ {
+ dev_err(dev, "Cannot attach to IPMMU\n");
+ return -ENXIO;
+ }
+
+ if ( !domain->mmu )
So you read domain->mmu here and ...
+ {
+ /* The domain hasn't been used yet, initialize it. */
+ domain->mmu = mmu;
+
+ /*
+ * We have already enabled context for Root IPMMU assigned to this
+ * Xen domain in ipmmu_domain_init_context().
+ * Enable the context for Cache IPMMU only. Flush the TLB as required
+ * when modifying the context registers.
+ */
+ ipmmu_ctx_write_cache(domain, IMCTR,
+ ipmmu_ctx_read_root(domain, IMCTR) | IMCTR_FLUSH);
+
+ dev_info(dev, "Using IPMMU context %u\n", domain->context_id);
+ }
+ else if ( domain->mmu != mmu )
... here. What actually promises that domain->mmu can't change in
parallel?
ipmmu_attach_device() is always called with xen_domain->lock held (a rough
sketch of the calling context is after the quoted function below).
+ {
+ /*
+ * Something is wrong, we can't attach two master devices using
+ * different IOMMUs to the same IPMMU domain.
+ */
+ dev_err(dev, "Can't attach IPMMU %s to domain on IPMMU %s\n",
+ dev_name(mmu->dev), dev_name(domain->mmu->dev));
+ return -EINVAL;
+ }
+ else
+ dev_info(dev, "Reusing IPMMU context %u\n",
domain->context_id);
+
+ for ( i = 0; i < cfg->num_utlbs; ++i )
+ ipmmu_utlb_enable(domain, cfg->utlbs[i]);
+
+ return 0;
+}
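To make the locking claim above a bit more concrete, the calling pattern
is roughly the following (a sketch of the non-quoted assign path, not the
exact code; xen_domain is the per-Xen-domain bookkeeping structure
mentioned above):

/* Sketch: the assign path serializes all attaches for a Xen domain. */
spin_lock(&xen_domain->lock);

/*
 * domain->mmu is only ever read and written with xen_domain->lock held,
 * so the read at the top of ipmmu_attach_device() cannot race with the
 * assignment further down.
 */
ret = ipmmu_attach_device(domain, dev);

spin_unlock(&xen_domain->lock);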
[...]
+static void ipmmu_protect_masters(struct ipmmu_vmsa_device *mmu)
+{
+ struct dt_device_node *node;
+
+ dt_for_each_device_node( dt_host, node )
+ {
+ if ( mmu->dev->of_node != dt_parse_phandle(node, "iommus", 0) )
+ continue;
+
+ /* Let Xen know that the master device is protected by an IOMMU. */
+ dt_device_set_protected(node);
+
+ dev_info(mmu->dev, "Found master device %s\n",
dt_node_full_name(node));
+ }
+}
I don't much like this. You are going to go through the whole DT quite
a few times.
+1
The IOMMU interface in Xen has not been designed with the new IOMMU
bindings in mind. I would prefer if we looked into extending the
add_device callback to support platform devices.
This would allow probing the device later on and therefore avoid going
through the device tree multiple times.
I completely agree with you that the current implementation is not optimal
and should be reworked in order not to scan the whole DT many times, but
I don't completely understand what we should do and how exactly.
Could you please add more details? A rough sketch of my current
understanding is below.
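If I understood the idea correctly, it would be something along these
lines (purely a sketch: the hook name, its signature and the point where
common code would invoke it are all my assumptions):

/*
 * Hypothetical per-device hook, called by common code for each DT device,
 * instead of the driver rescanning the whole device tree itself in
 * ipmmu_protect_masters().
 */
static int ipmmu_add_device(struct dt_device_node *node)
{
    struct dt_device_node *ipmmu_node = dt_parse_phandle(node, "iommus", 0);

    /* The device is not behind an IPMMU at all. */
    if ( !ipmmu_node )
        return -ENODEV;

    /*
     * Here the driver would look up the ipmmu_vmsa_device instance matching
     * ipmmu_node (e.g. in the existing list of probed IPMMUs) and record
     * the micro-TLB configuration for this master.
     */

    /* Let Xen know that the master device is protected by an IOMMU. */
    dt_device_set_protected(node);

    printk(XENLOG_INFO "ipmmu: master device %s is protected by %s\n",
           dt_node_full_name(node), dt_node_full_name(ipmmu_node));

    return 0;
}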
+
+static void ipmmu_device_reset(struct ipmmu_vmsa_device *mmu)
+{
+ unsigned int i;
+
+ /* Disable all contexts. */
+ for ( i = 0; i < mmu->num_ctx; ++i )
+ ipmmu_write(mmu, i * IM_CTX_SIZE + IMCTR, 0);
+}
+
+/*
+ * This function relies on the fact that Root IPMMU device is being probed
+ * the first. If not the case, it denies further Cache IPMMU device probes
+ * (returns the -ENODEV) until the Root IPMMU device has been registered
+ * for sure.
Can we look at handling -EDEFER in Xen instead?
I am not sure this is something we should implement at this stage (while
only the IPMMU driver would be a user). I have already resolved that
possible issue by trying to locate the Root IPMMU device and probe it
first, to avoid the case described above. So now we don't depend on how
the IPMMU devices are ordered in the DT. Please see ipmmu_init(); a rough
sketch of the relevant part is below. So, I tend to live with this for
some time.
+ */
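The relevant part of ipmmu_init() is roughly the following (simplified
sketch from the non-quoted code; it assumes, as in the Linux DT binding,
that Cache IPMMU nodes carry a "renesas,ipmmu-main" phandle back to the
Root IPMMU, and the compatible string below is just one example):

/* Sketch: find the Root IPMMU node so it is probed before the Cache ones. */
static struct dt_device_node *ipmmu_find_root_node(void)
{
    struct dt_device_node *node;

    dt_for_each_device_node( dt_host, node )
    {
        if ( !dt_device_is_compatible(node, "renesas,ipmmu-r8a7795") )
            continue;

        /* The Root IPMMU is the one without a "renesas,ipmmu-main" link. */
        if ( !dt_find_property(node, "renesas,ipmmu-main", NULL) )
            return node;
    }

    return NULL;
}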
[...]
+static int __must_check ipmmu_map_page(struct domain *d, dfn_t dfn, mfn_t mfn,
+ unsigned int flags,
+ unsigned int *flush_flags)
The function is exactly the same as for the SMMU driver. Could we
implement common helpers in a separate file?
Sounds indeed reasonable. I will look at it (a rough sketch of such a
helper is below, after the unmap part).
+{
+ p2m_type_t t;
+
+ /*
+ * Grant mappings can be used for DMA requests. The dev_bus_addr
+ * returned by the hypercall is the MFN (not the IPA). For device
+ * protected by an IOMMU, Xen needs to add a 1:1 mapping in the domain
+ * p2m to allow DMA request to work.
+ * This is only valid when the domain is directed mapped. Hence this
+ * function should only be used by gnttab code with gfn == mfn == dfn.
+ */
+ BUG_ON(!is_domain_direct_mapped(d));
+ BUG_ON(mfn_x(mfn) != dfn_x(dfn));
+
+ /* We only support readable and writable flags */
+ if ( !(flags & (IOMMUF_readable | IOMMUF_writable)) )
+ return -EINVAL;
+
+ t = (flags & IOMMUF_writable) ? p2m_iommu_map_rw : p2m_iommu_map_ro;
+
+ /*
+ * The function guest_physmap_add_entry replaces the current mapping
+ * if there is already one...
+ */
+ return guest_physmap_add_entry(d, _gfn(dfn_x(dfn)), _mfn(dfn_x(dfn)), 0, t);
+}
+
+static int __must_check ipmmu_unmap_page(struct domain *d, dfn_t dfn,
+ unsigned int *flush_flags)
Same here.
ok
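Something along these lines is what I have in mind (sketch only: the
helper name and file are made up, the body is the one already quoted above
and would be shared by the SMMU and IPMMU drivers):

/* Hypothetical common helper, e.g. xen/drivers/passthrough/arm/iommu_helpers.c */
int __must_check arm_iommu_map_page(struct domain *d, dfn_t dfn, mfn_t mfn,
                                    unsigned int flags,
                                    unsigned int *flush_flags)
{
    p2m_type_t t;

    /* Only valid for direct-mapped domains; used by gnttab code: dfn == mfn. */
    BUG_ON(!is_domain_direct_mapped(d));
    BUG_ON(mfn_x(mfn) != dfn_x(dfn));

    /* We only support readable and writable flags. */
    if ( !(flags & (IOMMUF_readable | IOMMUF_writable)) )
        return -EINVAL;

    t = (flags & IOMMUF_writable) ? p2m_iommu_map_rw : p2m_iommu_map_ro;

    /* guest_physmap_add_entry() replaces the current mapping if there is one. */
    return guest_physmap_add_entry(d, _gfn(dfn_x(dfn)), _mfn(dfn_x(dfn)), 0, t);
}

Both drivers' map_page callbacks could then simply point at such a helper.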
----------
Julien, what should we do about the fact that the IPMMU supports only a
3-level page table?
I left a TODO regarding that, but we need to work out some usable
solution if possible.
/*
* As 4-level translation table is not supported in IPMMU, we need
* to check IPA size used for P2M table beforehand to be sure it is
* 3-level and the IPMMU will be able to use it.
*
* In case of using 4KB page granule we should use two concatenated
* translation tables at level 1 in order to support 40 bit IPA
* with 3-level translation table.
*
* TODO: Probably, when determining the "pa_range" in setup_virt_paging()
* we should take into account the IPMMU ability as well.
*/
if ( IPMMU_MAX_P2M_IPA_BITS < p2m_ipa_bits )
{
dev_err(&node->dev, "P2M IPA size is not supported (P2M=%u
IPMMU=%u)!\n",
p2m_ipa_bits, IPMMU_MAX_P2M_IPA_BITS);
return -EOPNOTSUPP;
}
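For the record, the arithmetic behind the 40-bit/3-level statement in that
comment, as I understand it:

/*
 * With a 4KB granule and 3 levels, a single level-1 table resolves
 * 9 + 9 + 9 + 12 = 39 bits of IPA. Concatenating two level-1 tables at
 * the root (which stage 2 allows) adds one more bit, giving the 40-bit
 * IPA space used by the P2M without needing a level-0 (4-level) walk.
 */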
--
Regards,
Oleksandr Tyshchenko
_______________________________________________
Xen-devel mailing list
[email protected]
https://lists.xenproject.org/mailman/listinfo/xen-devel