On Wed, Jun 03, 2020 at 05:52:05PM +0100, Al Viro wrote:
> On Wed, Jun 03, 2020 at 01:29:00AM -0400, Michael S. Tsirkin wrote:
> > On Wed, Jun 03, 2020 at 02:48:15AM +0100, Al Viro wrote:
> > > On Tue, Jun 02, 2020 at 04:45:05AM -0400, Michael S. Tsirkin wrote:
> > > > So vhost needs to poke at userspace *a lot* in a quick succession. It
> > > > is thus benefitial to enable userspace access, do our thing, then
> > > > disable. Except access_ok has already been pre-validated with all the
> > > > relevant nospec checks, so we don't need that. Add an API to allow
> > > > userspace access after access_ok and barrier_nospec are done.
> > >
> > > BTW, what are you going to do about vq->iotlb != NULL case? Because
> > > you sure as hell do *NOT* want e.g. translate_desc() under STAC.
> > > Disable it around the calls of translate_desc()?
> > >
> > > How widely do you hope to stretch the user_access areas, anyway?
> >
> > So ATM I'm looking at adding support for the packed ring format.
> > That does something like:
> >
> > get_user(flags, desc->flags)
> > smp_rmb()
> > if (flags & VALID)
> > 	copy_from_user(&adesc, desc, sizeof adesc);
> >
> > this would be a good candidate I think.
>
> Perhaps, once we get stac/clac out of raw_copy_from_user() (coming cycle,
> probably).
That sounds good. Presumably raw_copy_from_user will be smart enough
to optimize aligned 2-byte accesses for the flags above?

> BTW, how large is the structure and how is it aligned?

It's a batch of 16-byte structures, aligned to 16 bytes. At the moment
I use a batch size of 64, which seems enough.

Ideally we'd actually read the whole batch like this without stac/clac
back and forth. E.g.:

	struct vring_desc adesc[64] = {};

	stac();
	for (i = 0; i < 64; ++i) {
		get_user(flags, &desc[i].flags);
		smp_rmb();
		if (!(flags & VALID))
			break;
		copy_from_user(&adesc[i], desc + i, sizeof adesc[i]);
	}
	clac();

(A fuller sketch of this, using user_access_begin()/unsafe_get_user(),
is appended at the end of this mail.)

> > > BTW, speaking of possible annotations: looks like there's a large
> > > subset of call graph that can be reached only from vhost_worker()
> > > or from several ioctls, with all uaccess limited to that subgraph
> > > (thankfully). Having that explicitly marked might be a good idea...
> >
> > Sure. What's a good way to do that though? Any examples to follow?
> > Or do you mean code comments?
>
> Not sure... FWIW, the part of call graph from "known to be only
> used by vhost_worker" (->handle_kick/vhost_work_init callback/
> vhost_poll_init callback) and "part of ->ioctl()" to actual uaccess
> primitives is fairly large - the longest chain is
> 	handle_tx_net ->
> 	handle_tx ->
> 	handle_tx_zerocopy ->
> 	get_tx_bufs ->
> 	vhost_net_tx_get_vq_desc ->
> 	vhost_tx_batch ->
> 	vhost_net_signal_used ->
> 	vhost_add_used_and_signal_n ->
> 	vhost_signal ->
> 	vhost_notify ->
> 	vhost_get_avail_flags ->
> 	vhost_get_avail ->
> 	vhost_get_user ->
> 	__get_user()
> i.e. 14 levels deep and the graph doesn't factorize well...
>
> Something along the lines of "all callers of thus annotated function
> must be annotated the same way themselves, any implicit conversion
> of pointers to such functions to anything other than boolean yields
> a warning, explicit cast is allowed only with __force", perhaps?
> Then slap such annotations on vhost_{get,put,copy_to,copy_from}_user(),
> on ->handle_kick(), a force-cast in the only caller of ->handle_kick()
> and force-casts in the 3 callers in ->ioctl().
>
> And propagate the annotations until the warnings stop, basically...
>
> Shouldn't be terribly hard to teach sparse that kind of stuff and it
> might be useful elsewhere. It would act as a qualifier on function
> pointers, with syntax ultimately expanding to __attribute__((something)).
> I'll need to refresh my memories of the parser, but IIRC that shouldn't
> require serious modifications. Most of the work would be in
> evaluate_call(), just before calling evaluate_symbol_call()...
> I'll look into that; not right now, though.

Thanks, that does sound useful!

> BTW, __vhost_get_user() might be better off expanded in both callers -
> that would get their structure similar to vhost_copy_{to,from}_user(),
> especially if you expand __vhost_get_user_slow() as well.

I agree, that does sound like a good cleanup.

> Not sure I understand what's going with ->meta_iotlb[] - what are the
> lifetime rules for struct vhost_iotlb_map and what prevents the pointers
> from going stale?

The pointers can be zeroed at any point; we just try to call
__vhost_vq_meta_reset() whenever anything can go stale.
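
For context, the reset itself just drops the cached pointers; roughly
(paraphrased, not a verbatim copy of drivers/vhost/vhost.c):

	static void __vhost_vq_meta_reset(struct vhost_virtqueue *vq)
	{
		int j;

		/* forget every cached iotlb lookup for this vq */
		for (j = 0; j < VHOST_NUM_ADDRS; j++)
			vq->meta_iotlb[j] = NULL;
	}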
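
And going back to the descriptor batching earlier in this mail, here is a
rough sketch of reading a whole batch under a single
user_access_begin()/user_access_end() pair, i.e. one stac/clac round trip.
The struct layout, the AVAIL flag value and the function name below are
made up for illustration; only user_access_begin(), unsafe_get_user(),
user_access_end() and smp_rmb() are the real primitives, and the
access_ok()/nospec checks are assumed to be folded into
user_access_begin():

	#include <linux/types.h>
	#include <linux/uaccess.h>

	/* Illustrative stand-ins, not the real packed ring definitions. */
	#define PSEUDO_DESC_F_AVAIL	(1 << 7)

	struct pseudo_desc {
		u64 addr;
		u32 len;
		u16 id;
		u16 flags;
	};

	/*
	 * Read up to n descriptors with a single STAC/CLAC round trip.
	 * Returns the number of available descriptors copied, or -EFAULT.
	 */
	static int fetch_desc_batch(const struct pseudo_desc __user *udesc,
				    struct pseudo_desc *out, int n)
	{
		int i;

		if (!user_access_begin(udesc, n * sizeof(*udesc)))
			return -EFAULT;

		for (i = 0; i < n; i++) {
			u16 flags;

			unsafe_get_user(flags, &udesc[i].flags, efault);
			/* pairs with the barrier that published the descriptor */
			smp_rmb();
			if (!(flags & PSEUDO_DESC_F_AVAIL))
				break;
			unsafe_get_user(out[i].addr, &udesc[i].addr, efault);
			unsafe_get_user(out[i].len, &udesc[i].len, efault);
			unsafe_get_user(out[i].id, &udesc[i].id, efault);
			out[i].flags = flags;
		}

		user_access_end();
		return i;

	efault:
		user_access_end();
		return -EFAULT;
	}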