Hi Casey,
On Thu, May 07, 2015 at 12:34:51AM +0000, Casey Leedom wrote:
> Hello,
>
> I've included both the Network and PCI Development mailing lists because
> this crosses both domains. If this violates protocols, I apologise in
> advance. I believe that this ~probably~ will be a "PCI Development" issue
> but since it affects a Network Device, I figured input from that quarter
> should be solicited as well.
>
> We have a Network Device, T5, which unfortunately has a PCI Compliance bug
> in it[1]. Doubly unfortunately, the PCI SIG 3.0 Compliance Tests which we
> dutifully ranb before releasing the silicon two years ago didn't include a
> vector for this, so it wasn't discovered till very recently. In order to
> solve this bug, we need to modify the Root Complex's PCI Express Capability
> Device Control register in order to turn off the Enable No Snoop and Enable
> Relaxed Ordering features.
>
> I've cons'ed up demonstration code within the Device Driver for T5, cxgb4,
> to traverse the PCI graph to the Root Complex for the T5 Device and perform
> this modification[2]. This demonstrates the "workability" of doing the
> operation, but I'm betting that this is _not_ where either the Network or PCI
> Development teams would like such code to go. My expectation is that this
> code needs to go into drivers/pci/quirks.c.
>
> So, this mail is a request for advice on how the Network and PCI
> Development teams would like this handled. Once I receive this input, I will
> submit a patch to the correct team for inclusion in the kernel. Thank you
> for your time and attention.
There are a lot of fixups in drivers/pci/quirks.c. For things that have to
be worked around either before a driver claims the device or if there is no
driver at all, the fixup *has* to go in drivers/pci/quirks.c
But for things like this, where the problem can only occur after a driver
claims the device, I think it makes more sense to put the fixup in the
driver itself. The only wrinkle here is that the fixup has to be done on a
separate device, not the device claimed by the driver. But I think it
probably still makes sense to put this fixup in the driver.
> Casey
>
> [1] Chelsio T5 PCI-E Compliance Bug:
>
> The bug is that when the Root Complex send a Transaction Layer Packet
> (TLP)
> Request downstream to a Device,the TLP may contain Attributes. The PCI
> Specification states that two of these Attributes, No Snoop and Relaxed
> Ordering, must be included in the Device's TLP Response. Further, the PCI
> Specification "encourages" Root Complexes to drop TLP Responses which
> are out of compliance with this rule.
Can you include a pointer to the relevant part of the spec?
> [2] Demonstration Code for clearing Root Complex No Snoop and Relaxed
> Ordering:
>
> --- a/drivers/net/ethernet/chelsio/cxgb4_main.c Mon Apr 06 09:27:21
> 2015 -0700
> +++ b/drivers/net/ethernet/chelsio/cxgb4_main.c Tue Apr 07 13:39:05
> 2015 -0700
> @@ -9956,6 +9956,36 @@ static void enable_pcie_relaxed_ordering
> pcie_capability_set_word(dev, PCI_EXP_DEVCTL, PCI_EXP_DEVCTL_RELAX_EN);
> }
>
> +/*
> + * Find the highest PCI-Express bridge above a PCI Device. If found, that's
> + * the Root Complex PCI-PCI Bridge for the PCI Device. If we find the Root
> + * Comples, clear the Enable Relaxed Ordering and Enable No Snoop bits in
> that
s/Comples/Complex/, but the Root Complex itself does not appear as a PCI
device, so we'll never actually find *it*. But I think we should *always*
find a Root Port. Your code and text suggests that it's possible we
wouldn't (since you say "*If* found, ..."). Is there a case you're
thinking of where we wouldn't find a Root Port?
> + * bridge's PCI-E Capability Device Control register. This will prevent the
> + * Root Complex from setting those attributes in the Transaction Layer
> Packets
> + * of the Requests which it sends down stream to the PCI Device.
> + */
> +static void clear_root_complex_tlp_attributes(struct pci_dev *pdev)
> +{
> + struct pci_bus *bus = pdev->bus;
> + struct pci_dev *highest_pcie_bridge = NULL;
> +
> + while (bus) {
> + struct pci_dev *bridge = bus->self;
> +
> + if (!bridge || !bridge->pcie_cap)
> + break;
> + highest_pcie_bridge = bridge;
> + bus = bus->parent;
> + }
Can you use pci_upstream_bridge() here? There are a couple places where we
want to find the Root Port, so we might factor that out someday. It'll be
easier to find all those places if they use with pci_upstream_bridge().
> +
> + if (highest_pcie_bridge)
> + pcie_capability_clear_and_set_word(highest_pcie_bridge,
> + PCI_EXP_DEVCTL,
> + PCI_EXP_DEVCTL_RELAX_EN |
> + PCI_EXP_DEVCTL_NOSNOOP_EN,
> + 0);
Please include a dmesg note here, especially since the driver is changing
the config of a device other than its own.
> +}
> +
> static int init_one(struct pci_dev *pdev,
> const struct pci_device_id *ent)
> {
> @@ -9973,6 +10003,19 @@ static int init_one(struct pci_dev *pdev
> ++version_printed;
> }
>
> + /*
> + * T5 has a PCI-E Compliance bug in it where it doesn't copy the
> + * Transaction Layer Packet Attributes from downstream Requests into
> + * it's upstream Responses. Most Root Complexes are fine with this
s/it's/its/
> + * but a few get prissy and drop the non-compliant T5 Responses
> + * leading to endless Device Timeouts when TLP Attributes are set. So
> + * if we're a T5, attempt to clear our Root Complex's enable bits for
> + * TLP Attributes ...
> + */
> + if (CHELSIO_PCI_ID_VER(pdev->device) == CHELSIO_T5 ||
> + CHELSIO_PCI_ID_VER(pdev->device) == CHELSIO_T5_FPGA)
> + clear_root_complex_tlp_attributes(pdev);
> +
> err = pci_request_regions(pdev, KBUILD_MODNAME);
> if (err) {
> /* Just info, some other driver may have claimed the device.
> */--
Bjorn
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html