>>>>> Gosh. Can we also replace this BUG() with something less aggressive?
>>>>
>>>>
>>>> There are currently 5 of these WARN() + BUG() constructs and 1 BUG()-only
>>>> for the 'default' TPACKET version spread all over af_packet, so it
>>>> probably makes sense to make all of them less aggressive.
>>>>
>>>>
>>>
>>> Very few consumers actually go looking in the kernel logs to see the
>>> error-warnings and report them back here.
>>>
>>> This severity will get them to report the incident, which in this case
>>> got fixed?
>>
>> But BUG_ONs in the datapath can cause outages in real production
>> environments. This should not happen for recoverable failures. For
>> users who cannot be bothered to check their logs, there is sysctl
>> kernel.panic_on_warn.
>
>
> Completely understand (and you should have failover to handle these
> outages).

Not for correlated failures where all systems can hit the same path.
This is especially dangerous when remote packets or untrusted
local users can trigger a BUG-enabled path.

> But then are you ok giving incorrect info to the
> application?

No, we should certainly signal an error. For instance, returning
TP_STATUS_WRONG_FORMAT instead of TP_STATUS_AVAILABLE.
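
Roughly what I have in mind, as a userspace sketch rather than a patch
against af_packet. All sketch_* names are hypothetical stand-ins: the
enums only mimic the uapi TPACKET_V* and TP_STATUS_* constants, and
sketch_warn_once() stands in for WARN_ONCE(). The point is only the
shape of the default branch: warn, report an error status, carry on.

```c
#include <stdio.h>

/* Hypothetical stand-ins for the uapi constants; for illustration only. */
enum sketch_version {
	SKETCH_TPACKET_V1,
	SKETCH_TPACKET_V2,
	SKETCH_TPACKET_V3,
};

enum sketch_status {
	SKETCH_STATUS_AVAILABLE    = 0,
	SKETCH_STATUS_WRONG_FORMAT = 1 << 2,
};

/* Minimal stand-in for a ring frame header. */
struct sketch_frame {
	unsigned int tp_status;
};

static int sketch_warned;

/* Userspace stand-in for WARN_ONCE(): log the first occurrence only. */
static void sketch_warn_once(const char *msg)
{
	if (!sketch_warned) {
		sketch_warned = 1;
		fprintf(stderr, "WARNING: %s\n", msg);
	}
}

/*
 * Read the frame status for a given ring version. An unknown version
 * no longer halts the machine: it warns once and signals an error
 * status to the caller instead of claiming the frame is available.
 */
static unsigned int sketch_get_status(int version,
				      const struct sketch_frame *frame)
{
	switch (version) {
	case SKETCH_TPACKET_V1:
	case SKETCH_TPACKET_V2:
	case SKETCH_TPACKET_V3:
		return frame->tp_status;
	default:
		sketch_warn_once("TPACKET version not supported");
		return SKETCH_STATUS_WRONG_FORMAT;
	}
}
```

With this shape, a caller that passes an unsupported version gets the
error status back and one line in the log, and the machine stays up.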

> For this specific bug: it is so basic that you should hit it the first
> time, every time, when you are adding support for or porting a new
> header. Correct?

Agreed, but that is small consolation if an unprivileged user (say, in
a namespace) finds out that they can trigger the codepath.

But I agree that this particular BUG_ON is one of the easier ones to
reason about.
