Hi Marek,

On Fri, Jun 07, 2019 at 03:55:05PM +0200, Marek Szyprowski wrote:
> On 2019-06-07 15:40, Hans Verkuil wrote:
> > On 6/7/19 2:47 PM, Hans Verkuil wrote:
> >> On 6/7/19 2:23 PM, Hans Verkuil wrote:
> >>> On 6/7/19 2:14 PM, Marek Szyprowski wrote:
> >>>> On 2019-06-07 14:01, Hans Verkuil wrote:
> >>>>> On 6/7/19 1:16 PM, Laurent Pinchart wrote:
> >>>>>> Thank you for the patch.
> >>>>>>
> >>>>>> On Fri, Jun 07, 2019 at 10:45:31AM +0200, Hans Verkuil wrote:
> >>>>>>> The __prepare_userptr() function made the incorrect assumption that 
> >>>>>>> if the
> >>>>>>> same user pointer was used as the last one for which memory was 
> >>>>>>> acquired, then
> >>>>>>> there was no need to re-acquire the memory. This assumption was never 
> >>>>>>> properly
> >>>>>>> tested, and after doing that it became clear that this was in fact 
> >>>>>>> wrong.
> >>>>>> Could you explain in the commit message why the assumption is not
> >>>>>> correct ?
> >>>>> You can free the memory, then allocate it again and you can get the 
> >>>>> same pointer,
> >>>>> even though it is not necessarily using the same physical pages for the 
> >>>>> memory
> >>>>> that the kernel is still using for it.
> >>>>>
> >>>>> Worse, you can free the memory, then allocate only half the memory you 
> >>>>> need and
> >>>>> get back the same pointer. vb2 wouldn't notice this. And it seems to 
> >>>>> work (since
> >>>>> the original mapping still remains), but this can corrupt userspace 
> >>>>> memory
> >>>>> causing the application to crash. It's not quite clear to me how the 
> >>>>> memory can
> >>>>> get corrupted. I don't know enough of those low-level mm internals to 
> >>>>> understand
> >>>>> the sequence of events.
> >>>>>
> >>>>> I have test code for v4l2-compliance available if someone wants to test 
> >>>>> this.
> >>>> I'm interested, I would really like to know what happens in the mm
> >>>> subsystem in such case.
> >>> Here it is:
> >>>
> >>> diff --git a/utils/v4l2-compliance/v4l2-test-buffers.cpp 
> >>> b/utils/v4l2-compliance/v4l2-test-buffers.cpp
> >>> index be606e48..9abf41da 100644
> >>> --- a/utils/v4l2-compliance/v4l2-test-buffers.cpp
> >>> +++ b/utils/v4l2-compliance/v4l2-test-buffers.cpp
> >>> @@ -797,7 +797,7 @@ int testReadWrite(struct node *node)
> >>>           return 0;
> >>>   }
> >>>
> >>> -static int captureBufs(struct node *node, const cv4l_queue &q,
> >>> +static int captureBufs(struct node *node, cv4l_queue &q,
> >>>                   const cv4l_queue &m2m_q, unsigned frame_count, int 
> >>> pollmode,
> >>>                   unsigned &capture_count)
> >>>   {
> >>> @@ -962,6 +962,21 @@ static int captureBufs(struct node *node, const 
> >>> cv4l_queue &q,
> >>>                                   buf.s_flags(V4L2_BUF_FLAG_REQUEST_FD);
> >>>                                   buf.s_request_fd(buf_req_fds[req_idx]);
> >>>                           }
> >>> +                 if (v4l_type_is_capture(buf.g_type()) && q.g_memory() 
> >>> == V4L2_MEMORY_USERPTR) {
> >>> +                         printf("\nidx: %d", buf.g_index());
> >>> +                         for (unsigned p = 0; p < q.g_num_planes(); p++) 
> >>> {
> >>> +                                 printf(" old buf[%d]: %p ", p, 
> >>> buf.g_userptr(p));
> >>> +                                 fflush(stdout);
> >>> +                                 free(buf.g_userptr(p));
> >>> +                                 void *m = calloc(1, q.g_length(p)/2);
> >>> +
> >>> +                                 fail_on_test(m == NULL);
> >>> +                                 q.s_userptr(buf.g_index(), p, m);
> >>> +                                 printf("new buf[%d]: %p", p, m);
> >>> +                                 buf.s_userptr(m, p);
> >>> +                         }
> >>> +                         printf("\n");
> >>> +                 }
> >>>                           fail_on_test(buf.qbuf(node, q));
> >>>                           fail_on_test(buf.g_flags() & 
> >>> V4L2_BUF_FLAG_DONE);
> >>>                           if (buf.g_flags() & V4L2_BUF_FLAG_REQUEST_FD) {
> >>>
> >>>
> >>>
> >>> Load the vivid driver and just run 'v4l2-compliance -s10' and you'll see:
> >>>
> >>> ...
> >>> Streaming ioctls:
> >>>          test read/write: OK
> >>>          test blocking wait: OK
> >>>          test MMAP (no poll): OK
> >>>          test MMAP (select): OK
> >>>          test MMAP (epoll): OK
> >>>          Video Capture: Frame #000
> >>> idx: 0 old buf[0]: 0x7f71c6e7c010 new buf[0]: 0x7f71c6eb4010
> >>>          Video Capture: Frame #001
> >>> idx: 1 old buf[0]: 0x7f71c6e0b010 new buf[0]: 0x7f71c6e7b010
> >>>          Video Capture: Frame #002
> >>> idx: 0 old buf[0]: 0x7f71c6eb4010 free(): invalid pointer
> >>> Aborted
> >> To clarify: two full size buffers are allocated and queued (that happens 
> >> in setupUserPtr()),
> >> then streaming starts and captureBufs is called which basically just calls 
> >> dqbuf
> >> and qbuf.
> >>
> >> Tomasz pointed out that all the pointers in this log are actually 
> >> different. That's
> >> correct, but here is a log where the old and new buf ptr are the same:
> >>
> >> Streaming ioctls:
> >>          test read/write: OK
> >>          test blocking wait: OK
> >>          test MMAP (no poll): OK
> >>          test MMAP (select): OK
> >>          test MMAP (epoll): OK
> >>          Video Capture: Frame #000
> >> idx: 0 old buf[0]: 0x7f1094e16010 new buf[0]: 0x7f1094e4e010
> >>          Video Capture: Frame #001
> >> idx: 1 old buf[0]: 0x7f1094da5010 new buf[0]: 0x7f1094e15010
> >>          Video Capture: Frame #002
> >> idx: 0 old buf[0]: 0x7f1094e4e010 new buf[0]: 0x7f1094e4e010
> >>          Video Capture: Frame #003
> >> idx: 1 old buf[0]: 0x7f1094e15010 free(): invalid pointer
> >> Aborted
> >>
> >> It's weird that the first log fails that way: if the pointers are 
> >> different,
> >> then vb2 will call get_userptr and it should discover that the buffer isn't
> >> large enough, causing qbuf to fail. That doesn't seem to happen.
> > I think that the reason for this corruption is that the memory pool used
> > by glibc is now large enough for vb2 to think it can map the full length
> > of the user pointer into memory, even though only the first half is actually
> > from the buffer that's allocated. When you capture a frame you just 
> > overwrite
> > a random part of the application's memory pool, causing this invalid 
> > pointer.
> >
> > But that's a matter of garbage in, garbage out. So that's not the issue 
> > here.
> >
> > The real question is what happens when you free the old buffer, allocate a
> > new buffer, end up with the same userptr, but it's using one or more 
> > different
> > pages for its memory compared to the mapping that the kernel uses.
> >
> > I managed to reproduce this with v4l2-ctl:
> >
> > diff --git a/utils/v4l2-ctl/v4l2-ctl-streaming.cpp 
> > b/utils/v4l2-ctl/v4l2-ctl-streaming.cpp
> > index 28b2b3b9..8f2ed9b5 100644
> > --- a/utils/v4l2-ctl/v4l2-ctl-streaming.cpp
> > +++ b/utils/v4l2-ctl/v4l2-ctl-streaming.cpp
> > @@ -1422,6 +1422,24 @@ static int do_handle_cap(cv4l_fd &fd, cv4l_queue &q, 
> > FILE *fout, int *index,
> >              * has the size that fits the old resolution and might not
> >              * fit to the new one.
> >              */
> > +           if (q.g_memory() == V4L2_MEMORY_USERPTR) {
> > +                   printf("\nidx: %d", buf.g_index());
> > +                   for (unsigned p = 0; p < q.g_num_planes(); p++) {
> > +                           unsigned *pb = (unsigned *)buf.g_userptr(p);
> > +                           printf(" old buf[%d]: %p first pixel: 0x%x", p, 
> > buf.g_userptr(p), *pb);
> > +                           fflush(stdout);
> > +                           free(buf.g_userptr(p));
> > +                           void *m = calloc(1, q.g_length(p));
> > +
> > +                           if (m == NULL)
> > +                                   return QUEUE_ERROR;
> > +                           q.s_userptr(buf.g_index(), p, m);
> > +                           if (m == buf.g_userptr(p))
> > +                                   printf(" identical new buf");
> > +                           buf.s_userptr(m, p);
> > +                   }
> > +                   printf("\n");
> > +           }
> >             if (fd.qbuf(buf) && errno != EINVAL) {
> >                     fprintf(stderr, "%s: qbuf error\n", __func__);
> >                     return QUEUE_ERROR;
> >
> >
> > Load vivid, setup a pure white test pattern:
> >
> > v4l2-ctl -c test_pattern=6
> >
> > Now run v4l2-ctl --stream-user and you'll see:
> >
> > idx: 0 old buf[0]: 0x7f91551cb010 first pixel: 0x80ea80ea identical new buf
> > <
> > idx: 1 old buf[0]: 0x7f915515a010 first pixel: 0x80ea80ea identical new buf
> > <
> > idx: 2 old buf[0]: 0x7f91550e9010 first pixel: 0x80ea80ea identical new buf
> > <
> > idx: 3 old buf[0]: 0x7f9155078010 first pixel: 0x80ea80ea identical new buf
> > <
> > idx: 0 old buf[0]: 0x7f91551cb010 first pixel: 0x0 identical new buf
> > <
> > idx: 1 old buf[0]: 0x7f915515a010 first pixel: 0x0 identical new buf
> > < 5.00 fps
> >
> > idx: 2 old buf[0]: 0x7f91550e9010 first pixel: 0x0 identical new buf
> > <
> > idx: 3 old buf[0]: 0x7f9155078010 first pixel: 0x0 identical new buf
> >
> > The first four dequeued buffers are filled with data, after that the
> > returned buffer is empty because vivid is actually writing to different
> > memory pages.
> >
> > With this patch the first pixel is always non-zero.
> 
> Good catch. The question is weather we treat that as undefined behavior 
> and keep the optimization for 'good applications' or assume that every 
> broken userspace code has to be properly handled.

Given how long we've been saying that USERPTR should be replaced by
DMABUF, I would consider that any userspace code using USERPTR is broken
:-) One could however question whether we were effective at getting that
message across...

> The good thing is that 
> there is still imho no security issue. The physical pages gathered by 
> vb2 in worst case belongs to noone else (vb2 is their last user, they 
> are not yet returned to free pages pool).
> 
> > I wonder if it isn't possible to just check the physical address of
> > the received user pointer with the physical address of the previous
> > user pointer. Or something like that. I'll dig around a bit more.
> 
> Such check won't be so simple. Pages contiguous in the virtual memory 
> won't map to pages contiguous in the physical memory, so you would need 
> to check every single memory page. Make no sense. It is better to 
> reacquire buffer on every queue operation. This indeed show how broken 
> the USERPTR related part of v4l2 API is.

-- 
Regards,

Laurent Pinchart

Reply via email to