Ping ..
Any updates please on this?
It will be great to have the fix upstreamed (properly of course).
Right now, the patch contains the change as suggested, of
explicitly/properly clearing out dma-mappings when unmap is called.
Please let me know in whatever way I can help, including
testing/debugging for other approaches if required.
Many thanks to Alex and Lu for their continued support on the issue.
P.S. :
I might have missed mentioning the information about the device that
causes flooding.
Please find it below :
######################################
sudo lspci -vvv
0a:00.0 SD Host controller: O2 Micro, Inc. OZ600FJ0/OZ900FJ0/OZ600FJS
SD/MMC Card Reader Controller (rev 05) (prog-if 01)
Subsystem: Dell OZ600FJ0/OZ900FJ0/OZ600FJS SD/MMC Card Reader Controller
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 17
IOMMU group: 14
Region 0: Memory at e2c20000 (32-bit, non-prefetchable) [size=512]
Capabilities: [a0] Power Management version 3
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA
PME(D0+,D1+,D2+,D3hot+,D3cold+)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [48] MSI: Enable- Count=1/1 Maskable+ 64bit+
Address: 0000000000000000 Data: 0000
Masking: 00000000 Pending: 00000000
Capabilities: [80] Express (v1) Endpoint, MSI 00
DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <4us, L1 <64us
ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
SlotPowerLimit 10.000W
DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit
Latency L0s <512ns, L1 <64us
ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp-
LnkCtl: ASPM L0s Enabled; RCB 64 bytes, Disabled- CommClk-
ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s (ok), Width x1 (ok)
TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
Capabilities: [100 v1] Virtual Channel
Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
Arb: Fixed- WRR32- WRR64- WRR128-
Ctrl: ArbSelect=Fixed
Status: InProgress-
VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
Status: NegoPending- InProgress-
Capabilities: [200 v1] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt-
RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt-
RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt-
RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
AERCap: First Error Pointer: 00, ECRCGenCap- ECRCGenEn-
ECRCChkCap- ECRCChkEn-
MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
HeaderLog: 00000000 00000000 00000000 00000000
Kernel driver in use: sdhci-pci
Kernel modules: sdhci_pci
######################################
Thanks and Regards,
Ajay
On Tue, Oct 12, 2021 at 7:27 PM Ajay Garg <[email protected]> wrote:
>
> Origins at :
> https://lists.linuxfoundation.org/pipermail/iommu/2021-October/thread.html
>
> === Changes from v1 => v2 ===
>
> a)
> Improved patch-description.
>
> b)
> A more root-level fix, as suggested by
>
> 1.
> Alex Williamson <[email protected]>
>
> 2.
> Lu Baolu <[email protected]>
>
>
>
> === Issue ===
>
> Kernel-flooding is seen, when an x86_64 L1 guest (Ubuntu-21) is booted in
> qemu/kvm
> on a x86_64 host (Ubuntu-21), with a host-pci-device attached.
>
> Following kind of logs, along with the stacktraces, cause the flood :
>
> ......
> DMAR: ERROR: DMA PTE for vPFN 0x428ec already set (to 3f6ec003 not 3f6ec003)
> DMAR: ERROR: DMA PTE for vPFN 0x428ed already set (to 3f6ed003 not 3f6ed003)
> DMAR: ERROR: DMA PTE for vPFN 0x428ee already set (to 3f6ee003 not 3f6ee003)
> DMAR: ERROR: DMA PTE for vPFN 0x428ef already set (to 3f6ef003 not 3f6ef003)
> DMAR: ERROR: DMA PTE for vPFN 0x428f0 already set (to 3f6f0003 not 3f6f0003)
> ......
>
>
>
> === Current Behaviour, leading to the issue ===
>
> Currently, when we do a dma-unmapping, we unmap/unlink the mappings, but
> the pte-entries are not cleared.
>
> Thus, following sequencing would flood the kernel-logs :
>
> i)
> A dma-unmapping makes the real/leaf-level pte-slot invalid, but the
> pte-content itself is not cleared.
>
> ii)
> Now, during some later dma-mapping procedure, as the pte-slot is about
> to hold a new pte-value, the intel-iommu checks if a prior
> pte-entry exists in the pte-slot. If it exists, it logs a kernel-error,
> along with a corresponding stacktrace.
>
> iii)
> Step ii) runs in abundance, and the kernel-logs run insane.
>
>
>
> === Fix ===
>
> We ensure that as part of a dma-unmapping, each (unmapped) pte-slot
> is also cleared of its value/content (at the leaf-level, where the
> real mapping from a iova => pfn mapping is stored).
>
> This completes a "deep" dma-unmapping.
>
>
>
> Signed-off-by: Ajay Garg <[email protected]>
> ---
> drivers/iommu/intel/iommu.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index d75f59ae28e6..485a8ea71394 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -5090,6 +5090,8 @@ static size_t intel_iommu_unmap(struct iommu_domain
> *domain,
> gather->freelist = domain_unmap(dmar_domain, start_pfn,
> last_pfn, gather->freelist);
>
> + dma_pte_clear_range(dmar_domain, start_pfn, last_pfn);
> +
> if (dmar_domain->max_addr == iova + size)
> dmar_domain->max_addr = iova;
>
> --
> 2.30.2
>
_______________________________________________
iommu mailing list
[email protected]
https://lists.linuxfoundation.org/mailman/listinfo/iommu