Another piece of information:
The observations are the same if the current pci-device (sd/mmc
controller) is detached and another pci-device (sound controller) is
attached to the guest.
So, it looks like we can rule out any (pci-)device-specific issue.
For completeness, here are the details of the other pci-device I tried with:
###############################################
sudo lspci -vvv
00:1b.0 Audio device: Intel Corporation 6 Series/C200 Series Chipset Family High Definition Audio Controller (rev 04)
    DeviceName: Onboard Audio
    Subsystem: Dell 6 Series/C200 Series Chipset Family High Definition Audio Controller
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 31
    IOMMU group: 5
    Region 0: Memory at e2e60000 (64-bit, non-prefetchable) [size=16K]
    Capabilities: [50] Power Management version 2
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=55mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
        Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [60] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Address: 00000000fee00358 Data: 0000
    Capabilities: [70] Express (v1) Root Complex Integrated Endpoint, MSI 00
        DevCap: MaxPayload 128 bytes, PhantFunc 0
            ExtTag- RBE- FLReset+
        DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
            RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop- FLReset-
            MaxPayload 128 bytes, MaxReadReq 128 bytes
        DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend-
    Capabilities: [100 v1] Virtual Channel
        Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
        Arb: Fixed- WRR32- WRR64- WRR128-
        Ctrl: ArbSelect=Fixed
        Status: InProgress-
        VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
            Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
            Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=01
            Status: NegoPending- InProgress-
        VC1: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
            Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
            Ctrl: Enable+ ID=1 ArbSelect=Fixed TC/VC=22
            Status: NegoPending- InProgress-
    Capabilities: [130 v1] Root Complex Link
        Desc: PortNumber=0f ComponentID=00 EltType=Config
        Link0: Desc: TargetPort=00 TargetComponent=00 AssocRCRB- LinkType=MemMapped LinkValid+
            Addr: 00000000fed1c000
    Kernel driver in use: snd_hda_intel
    Kernel modules: snd_hda_intel
###############################################
On Fri, Oct 22, 2021 at 11:03 PM Ajay Garg <[email protected]> wrote:
>
> Ping ...
>
> Any updates on this, please?
>
> It would be great to have the fix upstreamed (properly, of course).
>
> Right now, the patch contains the change as suggested, of
> explicitly/properly clearing out dma-mappings when unmap is called.
> Please let me know in whatever way I can help, including
> testing/debugging for other approaches if required.
>
>
> Many thanks to Alex and Lu for their continued support on the issue.
>
>
>
> P.S. :
>
> I might have missed mentioning the information about the device that
> causes flooding.
> Please find it below :
>
> ######################################
> sudo lspci -vvv
>
> 0a:00.0 SD Host controller: O2 Micro, Inc. OZ600FJ0/OZ900FJ0/OZ600FJS SD/MMC Card Reader Controller (rev 05) (prog-if 01)
>     Subsystem: Dell OZ600FJ0/OZ900FJ0/OZ600FJS SD/MMC Card Reader Controller
>     Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
>     Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>     Latency: 0, Cache Line Size: 64 bytes
>     Interrupt: pin A routed to IRQ 17
>     IOMMU group: 14
>     Region 0: Memory at e2c20000 (32-bit, non-prefetchable) [size=512]
>     Capabilities: [a0] Power Management version 3
>         Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
>         Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
>     Capabilities: [48] MSI: Enable- Count=1/1 Maskable+ 64bit+
>         Address: 0000000000000000 Data: 0000
>         Masking: 00000000 Pending: 00000000
>     Capabilities: [80] Express (v1) Endpoint, MSI 00
>         DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <4us, L1 <64us
>             ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
>             SlotPowerLimit 10.000W
>         DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
>             RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
>             MaxPayload 128 bytes, MaxReadReq 512 bytes
>         DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
>         LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <512ns, L1 <64us
>             ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp-
>         LnkCtl: ASPM L0s Enabled; RCB 64 bytes, Disabled- CommClk-
>             ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
>         LnkSta: Speed 2.5GT/s (ok), Width x1 (ok)
>             TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
>     Capabilities: [100 v1] Virtual Channel
>         Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
>         Arb: Fixed- WRR32- WRR64- WRR128-
>         Ctrl: ArbSelect=Fixed
>         Status: InProgress-
>         VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
>             Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
>             Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
>             Status: NegoPending- InProgress-
>     Capabilities: [200 v1] Advanced Error Reporting
>         UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>         UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>         UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
>         CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
>         CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
>         AERCap: First Error Pointer: 00, ECRCGenCap- ECRCGenEn- ECRCChkCap- ECRCChkEn-
>             MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
>         HeaderLog: 00000000 00000000 00000000 00000000
>     Kernel driver in use: sdhci-pci
>     Kernel modules: sdhci_pci
> ######################################
>
>
>
> Thanks and Regards,
> Ajay
>
> On Tue, Oct 12, 2021 at 7:27 PM Ajay Garg <[email protected]> wrote:
> >
> > Origins at :
> > https://lists.linuxfoundation.org/pipermail/iommu/2021-October/thread.html
> >
> > === Changes from v1 => v2 ===
> >
> > a)
> > Improved patch-description.
> >
> > b)
> > A more root-level fix, as suggested by
> >
> > 1.
> > Alex Williamson <[email protected]>
> >
> > 2.
> > Lu Baolu <[email protected]>
> >
> >
> >
> > === Issue ===
> >
> > Kernel-log flooding is seen when an x86_64 L1 guest (Ubuntu-21) is booted
> > in qemu/kvm on an x86_64 host (Ubuntu-21), with a host-pci-device attached.
> >
> > Logs of the following kind, along with the stacktraces, cause the flood :
> >
> > ......
> > DMAR: ERROR: DMA PTE for vPFN 0x428ec already set (to 3f6ec003 not 3f6ec003)
> > DMAR: ERROR: DMA PTE for vPFN 0x428ed already set (to 3f6ed003 not 3f6ed003)
> > DMAR: ERROR: DMA PTE for vPFN 0x428ee already set (to 3f6ee003 not 3f6ee003)
> > DMAR: ERROR: DMA PTE for vPFN 0x428ef already set (to 3f6ef003 not 3f6ef003)
> > DMAR: ERROR: DMA PTE for vPFN 0x428f0 already set (to 3f6f0003 not 3f6f0003)
> > ......
> >
> >
> >
> > === Current Behaviour, leading to the issue ===
> >
> > Currently, when we do a dma-unmapping, we unmap/unlink the mappings, but
> > the ptes are not cleared.
> >
> > Thus, the following sequence floods the kernel-logs :
> >
> > i)
> > A dma-unmapping makes the real/leaf-level pte-slot invalid, but the
> > pte-content itself is not cleared.
> >
> > ii)
> > Later, during a dma-mapping, when the pte-slot is about to hold a new
> > pte-value, intel-iommu checks whether a prior pte exists in the
> > pte-slot. If one exists, it logs a kernel-error, along with the
> > corresponding stacktrace.
> >
> > iii)
> > Step ii) happens over and over, and the kernel-logs get flooded.
> >
> >
> >
> > === Fix ===
> >
> > We ensure that, as part of a dma-unmapping, each (unmapped) pte-slot
> > is also cleared of its value/content (at the leaf-level, where the
> > actual iova => pfn mapping is stored).
> >
> > This completes a "deep" dma-unmapping.
> >
> >
> >
> > Signed-off-by: Ajay Garg <[email protected]>
> > ---
> > drivers/iommu/intel/iommu.c | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> > index d75f59ae28e6..485a8ea71394 100644
> > --- a/drivers/iommu/intel/iommu.c
> > +++ b/drivers/iommu/intel/iommu.c
> > @@ -5090,6 +5090,8 @@ static size_t intel_iommu_unmap(struct iommu_domain *domain,
> >  	gather->freelist = domain_unmap(dmar_domain, start_pfn,
> >  					last_pfn, gather->freelist);
> >
> > +	dma_pte_clear_range(dmar_domain, start_pfn, last_pfn);
> > +
> >  	if (dmar_domain->max_addr == iova + size)
> >  		dmar_domain->max_addr = iova;
> >
> >
> > --
> > 2.30.2
> >
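
For anyone following along, the sequence described under "Current
Behaviour" in the quoted patch can be modelled in userspace roughly as
below. This is only a toy sketch, not the intel-iommu code: the flat
pte[] array and the names toy_map, toy_unmap_shallow, toy_unmap_deep and
toy_run are made up for illustration, and the "shallow" unmap simply
assumes (as the description does) that slot contents are left behind.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define NSLOTS 8

/* Toy leaf-level table: one 64-bit value per vPFN (hypothetical model). */
static uint64_t pte[NSLOTS];
static int complaints;

/* Install a mapping; complain if the slot still holds an old value. */
static void toy_map(unsigned vpfn, uint64_t val)
{
    assert(vpfn < NSLOTS);
    if (pte[vpfn] != 0) {
        complaints++;
        printf("toy: PTE for vPFN 0x%x already set (to %llx not %llx)\n",
               vpfn, (unsigned long long)pte[vpfn],
               (unsigned long long)val);
    }
    pte[vpfn] = val;
}

/* "Shallow" unmap: bookkeeping only, slot contents are left behind. */
static void toy_unmap_shallow(unsigned first, unsigned last)
{
    (void)first;
    (void)last;
}

/* "Deep" unmap: additionally zero every slot in the range. */
static void toy_unmap_deep(unsigned first, unsigned last)
{
    assert(first <= last && last < NSLOTS);
    for (unsigned i = first; i <= last; i++)
        pte[i] = 0;
}

/* Map vPFNs 0..3, unmap them, map them again; return the complaint count. */
static int toy_run(int deep)
{
    memset(pte, 0, sizeof(pte));
    complaints = 0;

    for (unsigned i = 0; i < 4; i++)
        toy_map(i, ((uint64_t)(0x3f6ec + i) << 12) | 3);

    if (deep)
        toy_unmap_deep(0, 3);
    else
        toy_unmap_shallow(0, 3);

    for (unsigned i = 0; i < 4; i++)
        toy_map(i, ((uint64_t)(0x3f6ec + i) << 12) | 3);

    return complaints;
}
```

Calling toy_run(0) re-maps over stale slots and complains once per slot,
while toy_run(1) clears the slots first and stays silent, which is the
effect the extra clearing step in the patch is after.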