On Wed, Jul 20, 2016 at 3:07 PM, Andrew Pinski <pins...@gmail.com> wrote:
> On Wed, Jul 20, 2016 at 2:57 PM, H.J. Lu <hjl.to...@gmail.com> wrote:
>> On Wed, Jul 20, 2016 at 2:20 PM, Jeff Law <l...@redhat.com> wrote:
>>> On 07/20/2016 03:09 PM, Andrew Pinski wrote:
>>>>
>>>> On Wed, Jul 20, 2016 at 2:02 PM, Andrew Pinski <pins...@gmail.com> wrote:
>>>>>
>>>>> On Wed, Jul 20, 2016 at 1:32 PM, Jeff Law <l...@redhat.com> wrote:
>>>>>>
>>>>>> On 07/20/2016 02:21 PM, Segher Boessenkool wrote:
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Jul 20, 2016 at 12:48:09PM +0530, Senthil Kumar Selvaraj wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I see this for some of the larger C frontend tests with lots of
>>>>>>>>> expected errors/warnings as well.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I also see this for tests with small output, but it happens more
>>>>>>> often for tests with big output.
>>>>>>>
>>>>>>>> Are you guys getting this every time or is it sporadic?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Not always, but usually.  It seems related to how busy the system is,
>>>>>>> but that could be my imagination.
>>>>>>>
>>>>>>> It usually happens for a bunch of tests at the same time.
>>>>>>>
>>>>>>> The output is not always just cut short: sometimes random characters
>>>>>>> are inserted (in the individual log files; the consolidated log file
>>>>>>> does not have them). They look like pointers.
>>>>>>
>>>>>>
>>>>>> Hmm, there was a kernel bug a while back which had similar behavior.
>>>>>> What kernel version are you running?
>>>>>
>>>>>
>>>>> I am running with 4.2.  Let me find the old email and see whether I
>>>>> should include it in our tree (I thought we did).
>>>>
>>>>
>>>> Found it.  It looks like it was not put in yet:
>>>> https://bugzilla.kernel.org/show_bug.cgi?id=96311
>>>>
>>>> I know Jakub asked about a year ago, saying it was not in 4.1-rc1 yet.
>>>> Can someone look to see if it even made it into newer kernels?
>>>
>>> I know "a" fix is in modern Fedora kernels and I thought it came from
>>> upstream.
>>>
>>> Jeff
>>>
>>
>> commit 1a48632ffed61352a7810ce089dc5a8bcd505a60
>> Author: Peter Hurley <pe...@hurleysoftware.com>
>> Date:   Mon Apr 13 13:24:34 2015 -0400
>>
>>     pty: Fix input race when closing
>>
>>     A read() from a pty master may mistakenly indicate EOF (errno == -EIO)
>>     after the pty slave has closed, even though input data remains to be
>>     read.
>>     For example,
>>
>
>
> I definitely have this patch in the kernel I am using and am still
> seeing failures.
> I wonder if the memory barriers are still wrong, since I see this on
> aarch64, where the memory ordering is weak with respect to stores.
>
> Thanks,
> Andrew
>>
>> --
>> H.J.

There may be another fix:

commit 0f40fbbcc34e093255a2b2d70b6b0fb48c3f39aa
Author: Brian Bloniarz <brian.bloni...@gmail.com>
Date:   Sun Mar 6 13:16:30 2016 -0800

    Fix OpenSSH pty regression on close

    OpenSSH expects the (non-blocking) read() of pty master to return
    EAGAIN only if it has received all of the slave-side output after
    it has received SIGCHLD. This used to work on pre-3.12 kernels.

    This fix effectively forces non-blocking read() and poll() to
    block for parallel i/o to complete for all ttys. It also unwinds
    these changes:

    1) f8747d4a466ab2cafe56112c51b3379f9fdb7a12
       tty: Fix pty master read() after slave closes

    2) 52bce7f8d4fc633c9a9d0646eef58ba6ae9a3b73
       pty, n_tty: Simplify input processing on final close

    3) 1a48632ffed61352a7810ce089dc5a8bcd505a60
       pty: Fix input race when closing


-- 
H.J.
