On 11.05.2020 17:14, Andrew Cooper wrote:
> On 04/05/2020 14:20, Jan Beulich wrote:
>> On 02.05.2020 00:58, Andrew Cooper wrote:
>>> --- a/xen/arch/x86/traps.c
>>> +++ b/xen/arch/x86/traps.c
>>> @@ -774,10 +774,27 @@ static void do_reserved_trap(struct cpu_user_regs 
>>> *regs)
>>>            trapnr, vec_name(trapnr), regs->error_code);
>>>  }
>>>  
>>> +static bool exception_fixup(struct cpu_user_regs *regs, bool print)
>>> +{
>>> +    unsigned long fixup = search_exception_table(regs);
>>> +
>>> +    if ( unlikely(fixup == 0) )
>>> +        return false;
>>> +
>>> +    /* Can currently be triggered by guests.  Make sure we ratelimit. */
>>> +    if ( IS_ENABLED(CONFIG_DEBUG) && print )
>> I didn't think we consider dprintk()-s a possible security issue.
>> Why would we consider so a printk() hidden behind
>> IS_ENABLED(CONFIG_DEBUG)? IOW I think one of XENLOG_GUEST and
>> IS_ENABLED(CONFIG_DEBUG) wants dropping.
> 
> Who said anything about a security issue?

The need to rate limit is (among other aspects) to prevent a
(logspam) security issue, isn't it?

> I'm deliberately not using dprintk() for the reasons explained in the
> commit message, so IS_ENABLED(CONFIG_DEBUG) is to cover that.

Which is fine, and which I understood.

> XENLOG_GUEST is because everything (other than the boot path) hitting
> this caused directly by a guest action, and needs rate-limiting.  This
> was not consistent before.

But why do we need both CONFIG_DEBUG and XENLOG_GUEST? In a debug
build I think we'd better know of such events under the control
of the general log level setting, not under that of guest specific
messages.

>>> @@ -1466,12 +1468,11 @@ void do_page_fault(struct cpu_user_regs *regs)
>>>          if ( pf_type != real_fault )
>>>              return;
>>>  
>>> -        if ( likely((fixup = search_exception_table(regs)) != 0) )
>>> +        if ( likely(exception_fixup(regs, false)) )
>>>          {
>>>              perfc_incr(copy_user_faults);
>>>              if ( unlikely(regs->error_code & PFEC_reserved_bit) )
>>>                  reserved_bit_page_fault(addr, regs);
>>> -            regs->rip = fixup;
>> I'm afraid this modification can't validly be pulled ahead -
>> show_execution_state(), as called from reserved_bit_page_fault(),
>> wants to / should print the original RIP, not the fixed up one.
> 
> This path is bogus to begin with.  Any RSVD pagefault (other than the
> Shadow MMIO fastpath, handled earlier) is a bug in Xen and should be
> fatal rather than just a warning on extable-tagged instructions.

I could see reasons to convert to a fatal exception (without looking
up fixups), but then in a separate change with suitable justification.
I'd like to point out though that this would convert a logspam kind
of DoS to a Xen crash one, in case such a PTE would indeed be created
anywhere by mistake.

However, it is not helpful to the diagnosis of such a problem if the
logged address points at the fixup code rather than the faulting one.

Jan

Reply via email to