On 1/30/2021 20:12, Jakub Kicinski wrote:
On Sat, 30 Jan 2021 16:00:06 +0200 Neftin, Sasha wrote:
On 1/30/2021 08:22, Jakub Kicinski wrote:
On Thu, 28 Jan 2021 13:38:48 -0800 Tony Nguyen wrote:
From: Kai-Heng Feng <kai.heng.f...@canonical.com>
Similar to commit 165ae7a8feb5 ("igb: Report speed and duplex as unknown
when device is runtime suspended"), if we try to read speed and duplex
sysfs while the device is runtime suspended, igc will complain and
stops working:
The more generic approach will be wrap get_link_ksettings() with begin()
and complete() callbacks, and calls runtime resume and runtime suspend
routine respectively. However, igc is like igb, runtime resume routine
uses rtnl_lock() which upper ethtool layer also uses.
So to prevent a deadlock on rtnl, take a different approach, use
pm_runtime_suspended() to avoid reading register while device is runtime
suspended.
Is someone working on the full fix to how PM operates?
There is another rd32(IGC_STATUS) in this file which I don't think
is protected either.
What is another rd32(IGC_STATUS) you meant? in igc_ethtool_get_regs?
Yes.
While the device in D3 state there is no configuration space registers
access.
That's to say similar stack trace will be generated to the one fixed
here, if someone runs ethtool -d, correct? I don't see anything
checking runtime there either.
yes.
This problem crosses many drivers. (not only igb, igc,...)
specific to this one (igc), can we check 'netif_running at begin of the
_get_regs method:
if (!netif_running(netdev))
return;
what do you think? (only OS can put device to the D3)
To be clear I'm not asking for this to be addressed in this series.
Rather for a strong commitment that PM handling will be restructured.
It seems to me you should depend on refcounting / locking that the PM
subsystem does more rather than involving rtnl_lock.