On 18.12.2018 14:25, Chris Chiu wrote:
> On Tue, Dec 18, 2018 at 3:08 AM Heiner Kallweit <hkallwe...@gmail.com> wrote:
>>
>> On 17.12.2018 14:25, Chris Chiu wrote:
>>> On Fri, Dec 14, 2018 at 3:37 PM Heiner Kallweit <hkallwe...@gmail.com> wrote:
>>>>
>>>> On 14.12.2018 04:33, Chris Chiu wrote:
>>>>> On Thu, Dec 13, 2018 at 10:20 AM Chris Chiu <c...@endlessm.com> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>     We got an Acer laptop which has a problem with Ethernet networking
>>>>>> after resuming from S3. The Ethernet controller is the popular Realtek
>>>>>> r8168. lspci shows the following.
>>>>>> 02:00.1 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 12)
>>>>>>
>>>> Helpful would be a "dmesg | grep r8169", especially chip name + XID.
>>>>
>>> [   22.362774] r8169 0000:02:00.1 (unnamed net_device) (uninitialized): mac_version = 0x2b
>>> [   22.365580] libphy: r8169: probed
>>> [   22.365958] r8169 0000:02:00.1 eth0: RTL8411, 00:e0:b8:1f:cb:83, XID 5c800800, IRQ 38
>>> [   22.365961] r8169 0000:02:00.1 eth0: jumbo features [frames: 9200 bytes, tx checksumming: ko]
>>>
>> Thanks for the info.
>>
>>>>>>     The problem is that Ethernet is not accessible after resume. Pinging
>>>>>> via Ethernet always returns `Destination Host Unreachable`. However,
>>>>>> the interesting part is that when I run tcpdump to monitor the
>>>>>> problematic Ethernet interface, the network comes back to life, but
>>>>>> it's dead again after I stop tcpdump.
>>>>>> One more thing: if I ping the problematic machine from another host, it
>>>>>> has the same effect as tcpdump above. Maybe it's about the register
>>>>>> settings for the RX path?
>>>>>>
>>>> You could compare the register dumps (ethtool -d) before and after S3 sleep
>>>> to find out whether there's a difference.
>>>>
>>>
>>> Actually, I just found that I was leading in the wrong direction. The S3
>>> suspend does help to reproduce the problem, but it's not necessary. All I
>>> need to do is ping for around 5 minutes and the network connection fails.
>>> I also found one interesting thing: disabling the MSI-X interrupt, as in
>>> commit [d49c88d7677ba737e9d2759a87db0402d5ab2607], fixes this problem,
>>> although I don't understand the root cause. Anything I can do to help?
>>>
>> This is indeed very, very weird. You say switching from MSI-X to MSI fixes
>> the issue, but also pinging the machine from outside brings back the network.
>> Both actions affect totally different corners.
>>
>> The commit and related issue you mention was a workaround in the driver,
>> the root cause was an MSI-X-related issue with certain Intel chipsets deep
>> in the PCI core. After this was fixed we removed the workaround again.
>> This shouldn't be related to your issue.
>>
>> Hard to say for now is whether the issue is:
>> - a driver issue
>> - a hardware issue in the RTL8411
>> - an issue with the chipset on your mainboard
>>
>> According to your description it doesn't take a special scenario to trigger
>> the issue, so most likely also other users of Acer notebooks with RTL8411
>> should be affected (after briefly checking this should be at least Aspire
>> F15, V15, V7). Therefore I wonder why there aren't more reports.
>>
>> This commit added MSI-X support: 6c6aa15fdea5 ("r8169: improve interrupt handling")
>> So you could test this revision and the one before.
>>
>> Eventually, if the issue really should be caused by a side effect of using
>> MSI-X, then the question is whether we need to disable MSI-X for RTL8411
>> in general or just for RTL8411 and a certain subsystem id.
>>
> 
> I tried the kernel with HEAD at 6c6aa15fdea5 ("r8169: improve interrupt
> handling"), and the problem is still there. Then I reverted to the previous
> revision, and the problem goes away. So I think it's pretty much a side
> effect of MSI-X. However, as you mentioned that you didn't hit this problem,
> I'll ask the vendor to verify whether this problem also happens on other
> machines with the same chip. Then we can decide whether to disable MSI-X
> for a specific MAC version or just for a certain subsystem ID.
> 
>>>>>>     I tried the latest 4.20 rc version but the problem is still there. I
>>>>>> also tried some hw_reset or init changes in the resume path, but with
>>>>>> no effect. Any suggestions for this?
>>>>>> Thanks
>>>>>>
>>>> Did previous kernel versions work? If it's a regression, a bisect would be
>>>> appreciated, because with the chip versions I've got I can't reproduce the
>>>> issue.
>>>>
>>>>>> Chris
>>>>>
>>>>> Gentle ping. Any additional information required?
>>>>>
>>>>> Chris
>>>>>
>>>> Heiner
>>>
>>
> 

As an additional note:
I found that the rtsx_pci driver currently doesn't support MSI-X.
The following patch adds MSI-X support (it's compile-tested only,
because I don't have a system with RTL8411).
It would be interesting to see whether it makes a difference if both
components on this combo chip use MSI-X.
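
To illustrate what the switch to PCI_IRQ_ALL_TYPES is meant to change:
pci_alloc_irq_vectors() tries MSI-X first, then MSI, then legacy INTx,
whereas the old code only tried MSI and otherwise stayed on INTx. Below
is a minimal userspace sketch of that selection order. It is not kernel
code: the enum values, struct model_dev and model_pick_irq_type() are
hypothetical names made up for this illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's PCI_IRQ_* flags (illustrative values). */
enum irq_type {
	IRQ_TYPE_NONE = 0,
	IRQ_TYPE_INTX = 1 << 0,
	IRQ_TYPE_MSI  = 1 << 1,
	IRQ_TYPE_MSIX = 1 << 2,
	IRQ_TYPE_ALL  = IRQ_TYPE_INTX | IRQ_TYPE_MSI | IRQ_TYPE_MSIX,
};

/* Hypothetical device model: bitmask of interrupt types the device can provide. */
struct model_dev {
	unsigned int supported;
};

/*
 * Pick the interrupt type for a single vector, preferring MSI-X over MSI
 * over INTx. Only types that are both allowed by the caller and supported
 * by the device are considered. Returns IRQ_TYPE_NONE if nothing matches.
 */
enum irq_type model_pick_irq_type(const struct model_dev *dev,
				  unsigned int allowed)
{
	static const enum irq_type order[] = {
		IRQ_TYPE_MSIX, IRQ_TYPE_MSI, IRQ_TYPE_INTX
	};
	size_t i;

	assert(dev != NULL);

	for (i = 0; i < sizeof(order) / sizeof(order[0]); i++) {
		if ((allowed & order[i]) && (dev->supported & order[i]))
			return order[i];
	}

	return IRQ_TYPE_NONE;
}
```

In this model the old behaviour corresponds to calling the helper with
IRQ_TYPE_MSI | IRQ_TYPE_INTX, and the patched behaviour to IRQ_TYPE_ALL,
so a device that supports MSI-X would move from MSI to MSI-X.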

---
 drivers/misc/cardreader/rtsx_pcr.c | 51 ++++++++++--------------------
 include/linux/rtsx_pci.h           |  1 -
 2 files changed, 16 insertions(+), 36 deletions(-)

diff --git a/drivers/misc/cardreader/rtsx_pcr.c b/drivers/misc/cardreader/rtsx_pcr.c
index da445223f..d1349c248 100644
--- a/drivers/misc/cardreader/rtsx_pcr.c
+++ b/drivers/misc/cardreader/rtsx_pcr.c
@@ -35,10 +35,6 @@
 
 #include "rtsx_pcr.h"
 
-static bool msi_en = true;
-module_param(msi_en, bool, S_IRUGO | S_IWUSR);
-MODULE_PARM_DESC(msi_en, "Enable MSI");
-
 static DEFINE_IDR(rtsx_pci_idr);
 static DEFINE_SPINLOCK(rtsx_pci_lock);
 
@@ -1049,22 +1045,21 @@ static irqreturn_t rtsx_pci_isr(int irq, void *dev_id)
 
 static int rtsx_pci_acquire_irq(struct rtsx_pcr *pcr)
 {
-       pcr_dbg(pcr, "%s: pcr->msi_en = %d, pci->irq = %d\n",
-                       __func__, pcr->msi_en, pcr->pci->irq);
+       int ret;
 
-       if (request_irq(pcr->pci->irq, rtsx_pci_isr,
-                       pcr->msi_en ? 0 : IRQF_SHARED,
-                       DRV_NAME_RTSX_PCI, pcr)) {
-               dev_err(&(pcr->pci->dev),
-                       "rtsx_sdmmc: unable to grab IRQ %d, disabling device\n",
-                       pcr->pci->irq);
-               return -1;
-       }
+       ret = pci_alloc_irq_vectors(pcr->pci, 1, 1, PCI_IRQ_ALL_TYPES);
+       if (ret < 0)
+               goto err;
 
-       pcr->irq = pcr->pci->irq;
-       pci_intx(pcr->pci, !pcr->msi_en);
+       ret = pci_request_irq(pcr->pci, 0, rtsx_pci_isr, NULL, pcr,
+                             DRV_NAME_RTSX_PCI);
+       if (ret)
+               goto err;
 
        return 0;
+err:
+       pci_err(pcr->pci, "rtsx_sdmmc: unable to grab interrupt\n");
+       return ret;
 }
 
 static void rtsx_enable_aspm(struct rtsx_pcr *pcr)
@@ -1496,19 +1491,11 @@ static int rtsx_pci_probe(struct pci_dev *pcidev,
        INIT_DELAYED_WORK(&pcr->carddet_work, rtsx_pci_card_detect);
        INIT_DELAYED_WORK(&pcr->idle_work, rtsx_pci_idle_work);
 
-       pcr->msi_en = msi_en;
-       if (pcr->msi_en) {
-               ret = pci_enable_msi(pcidev);
-               if (ret)
-                       pcr->msi_en = false;
-       }
-
        ret = rtsx_pci_acquire_irq(pcr);
        if (ret < 0)
-               goto disable_msi;
+               goto free_dma;
 
        pci_set_master(pcidev);
-       synchronize_irq(pcr->irq);
 
        ret = rtsx_pci_init_chip(pcr);
        if (ret < 0)
@@ -1528,10 +1515,8 @@ static int rtsx_pci_probe(struct pci_dev *pcidev,
        return 0;
 
 disable_irq:
-       free_irq(pcr->irq, (void *)pcr);
-disable_msi:
-       if (pcr->msi_en)
-               pci_disable_msi(pcr->pci);
+       pci_free_irq(pcr->pci, 0, pcr);
+free_dma:
        dma_free_coherent(&(pcr->pci->dev), RTSX_RESV_BUF_LEN,
                        pcr->rtsx_resv_buf, pcr->rtsx_resv_buf_addr);
 unmap:
@@ -1568,9 +1553,7 @@ static void rtsx_pci_remove(struct pci_dev *pcidev)
 
        dma_free_coherent(&(pcr->pci->dev), RTSX_RESV_BUF_LEN,
                        pcr->rtsx_resv_buf, pcr->rtsx_resv_buf_addr);
-       free_irq(pcr->irq, (void *)pcr);
-       if (pcr->msi_en)
-               pci_disable_msi(pcr->pci);
+       pci_free_irq(pcr->pci, 0, pcr);
        iounmap(pcr->remap_addr);
 
        pci_release_regions(pcidev);
@@ -1664,9 +1647,7 @@ static void rtsx_pci_shutdown(struct pci_dev *pcidev)
        rtsx_pci_power_off(pcr, HOST_ENTER_S1);
 
        pci_disable_device(pcidev);
-       free_irq(pcr->irq, (void *)pcr);
-       if (pcr->msi_en)
-               pci_disable_msi(pcr->pci);
+       pci_free_irq(pcr->pci, 0, pcr);
 }
 
 #else /* CONFIG_PM */
diff --git a/include/linux/rtsx_pci.h b/include/linux/rtsx_pci.h
index e964bbd03..10abfe7f2 100644
--- a/include/linux/rtsx_pci.h
+++ b/include/linux/rtsx_pci.h
@@ -1190,7 +1190,6 @@ struct rtsx_pcr {
        /* pci resources */
        unsigned long                   addr;
        void __iomem                    *remap_addr;
-       int                             irq;
 
        /* host reserved buffer */
        void                            *rtsx_resv_buf;
-- 
2.20.0
