On Jun 7, 2016, at 10:22 PM, Oleg Drokin wrote:
>
> On Jun 7, 2016, at 8:03 PM, Jeff Layton wrote:
>
>>>> That said, this code is quite subtle. I'd need to look over it in more
>>>> detail before I offer up any fixes. I'd also appreciate it if anyone
>>>> else wants to sanity check my analysis there.
>>>>
>> Yeah, I think you're right. It's fine since r/w opens have a distinct
>> slot, even though the refcounting just tracks the number of read and
>> write references. So yeah, the leak probably is in an error path
>> someplace, or maybe a race someplace.
>
> So I noticed that set_access is always called locked, but clear_access is not,
> this does not sound right.
>
> So I placed this strategic WARN_ON:
> @@ -3991,6 +4030,7 @@ static __be32 nfs4_get_vfs_file(struct svc_rqst *rqstp,
> struct nfs4_file *fp,
> goto out_put_access;
> spin_lock(&fp->fi_lock);
> if (!fp->fi_fds[oflag]) {
> +WARN_ON(!test_access(open->op_share_access, stp));
> fp->fi_fds[oflag] = filp;
> filp = NULL;
>
> This is right in the place where nfsd set the access flag already, discovered
> that the file is not opened and went on to open it, yet some parallel thread
> came in and cleared the flag by the time we got the file opened.
> It did trigger (but there are 30 minutes left till test finish, so I don't
> know yet if this will correspond to the problem at hand yet, so below is
> speculation).
Duh, I looked for a warning, but did not cross reference, and it was not this
one that
hit yet.
Though apparently I am hitting some of the "impossible" warnings, so you might
want to
look into that anyway.
status = nfsd4_process_open2(rqstp, resfh, open);
WARN(status && open->op_created,
"nfsd4_process_open2 failed to open newly-created file!
status=%u\n",
be32_to_cpu(status));
and
filp = find_readable_file(fp);
if (!filp) {
/* We should always have a readable file here */
WARN_ON_ONCE(1);
locks_free_lock(fl);
return -EBADF;
}