Hi Luca,

Thanks for the details. Now I understand why the "content_t" needs to be allocated dynamically: it's just like the control block behind std::shared_ptr<>.
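For anyone following along, my mental model of that block (pieced together from the zmq_msg_init_data() code I quote further down; a simplified sketch, not the verbatim libzmq definition) is roughly:

    #include <zmq.h>      // zmq_free_fn
    #include <atomic>
    #include <cstddef>

    // Heap-allocated once per user-buffer message and shared by every stack
    // copy of the message, much like a std::shared_ptr control block.
    struct content_t
    {
        void *data;                    // user-supplied buffer
        std::size_t size;              // buffer size
        zmq_free_fn *ffn;              // user deallocation callback
        void *hint;                    // opaque pointer handed back to ffn
        std::atomic<unsigned> refcnt;  // zmq::atomic_counter_t in the real code;
                                       // when the last copy is closed,
                                       // ffn(data, hint) is invoked
    };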
And you're right: I'm not sure how much gain there is in removing 100% of malloc operations from my TX path... I would still be curious to find out, but right now it seems I would need to patch the ZMQ source code to achieve that. Anyway, I wonder whether it would be possible to expose in the public API a method like "zmq::msg_t::init_external_storage()" which, AFAICS, allows creating a non-shared zero-copy long message... it appears to be used only by the v2 decoder internally right now... Is there a specific reason why that's not accessible from the public API?

Thanks,
Francesco

On Thu, Jul 4, 2019 at 8:25 PM Luca Boccassi <[email protected]> wrote:

> Another reason for that small struct to be on the heap is so that it
> can be shared among all the copies of the message (eg: a pub socket has
> N copies of the message on the stack, one for each subscriber). The
> struct has an atomic counter in it, so that when all the copies of the
> message on the stack have been closed, the userspace buffer
> deallocation callback can be invoked. If the atomic counter were on the
> stack, inlined in the message, this wouldn't work.
> So even if room were to be found, a malloc would still be needed.
>
> If you _really_ are worried about it, and testing shows it makes a
> difference, then one option could be to pre-allocate a set of these
> metadata structures at startup, and just assign them when the message
> is created. It's possible, but it increases complexity quite a bit, so
> it needs to be worth it.
>
> On Thu, 2019-07-04 at 17:42 +0100, Luca Boccassi wrote:
> > The second malloc cannot be avoided, but it's tiny and fixed in size
> > at compile time, so the compiler and glibc will be able to optimize
> > it to death.
> >
> > The reason for that is that there's not enough room in the 64 bytes
> > to store that structure, and increasing the message allocation on the
> > stack past 64 bytes means it will no longer fit in a single cache
> > line, which will incur a performance penalty far worse than the small
> > malloc (I tested this some time ago). That is of course unless you
> > are running on s390 or a POWER with a 256-byte cache line, but given
> > it's part of the ABI it would be a bit of a mess for the benefit of
> > very few users, if any.
> >
> > So I'd recommend to just go with the second plan, and compare what
> > the result is when passing a deallocation function vs not passing it
> > (yes, it will leak the memory, but it's just for the test). My bet is
> > that the difference will not be that large.
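That's the comparison I plan to run. For reference, the two variants I have in mind look roughly like this (just a sketch, error handling omitted, helper names are mine; "buf" stands for one of my pre-allocated 2 kB buffers and "sock" for an already-connected socket):

    #include <zmq.h>
    #include <cstddef>

    // Passed as ffn: forces the small content_t malloc inside libzmq and is
    // called once the last copy of the message has been closed.
    static void noop_free (void *data_, void *hint_)
    {
        (void) data_;
        (void) hint_;
    }

    static void send_with_ffn (void *sock, void *buf, std::size_t len)
    {
        zmq_msg_t msg;
        zmq_msg_init_data (&msg, buf, len, noop_free, NULL); // mallocs content_t
        zmq_msg_send (&msg, sock, 0);
    }

    static void send_without_ffn (void *sock, void *buf, std::size_t len)
    {
        zmq_msg_t msg;
        zmq_msg_init_data (&msg, buf, len, NULL, NULL); // no malloc, but also no
                                                        // way to know when the
                                                        // buffer can be reused
        zmq_msg_send (&msg, sock, 0);
    }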
> > On Thu, 2019-07-04 at 16:30 +0200, Francesco wrote:
> > > Hi Stephan, Hi Luca,
> > >
> > > thanks for your hints. However, I inspected
> > > https://github.com/dasys-lab/capnzero/blob/master/capnzero/src/Publisher.cpp
> > > and I don't think it avoids the malloc()... see my point 2) below.
> > >
> > > Indeed I realized that the current ZMQ API probably does not allow
> > > me to achieve 100% of what I intended to do.
> > > Let me rephrase my target. My target is to be able to:
> > > - memory pool creation: do one large memory allocation of, say, 1M
> > >   zmq_msg_t only at the start of my program; let's say I create all
> > >   these zmq_msg_t with a size of 2k bytes each (let's assume this is
> > >   the max message size possible in my app);
> > > - during the application lifetime: call zmq_msg_send() at any time,
> > >   always avoiding malloc() operations (just picking the first
> > >   available unused zmq_msg_t from the memory pool).
> > >
> > > Initially I thought that was possible, but I think I have identified
> > > 2 blocking issues:
> > > 1) If I try to recycle a zmq_msg_t directly: in this case I will
> > >    fail, because I cannot really change only the "size" member of a
> > >    zmq_msg_t without reallocating it... so I'm forced (in my
> > >    example) to always send 2k bytes out (!!)
> > > 2) If I create only a memory pool of 2k-byte buffers and then wrap
> > >    the first available buffer inside a zmq_msg_t (allocated on the
> > >    stack, not on the heap): in this case I need to know when the
> > >    internals of ZMQ have finished using the zmq_msg_t, and thus when
> > >    I can mark that buffer as available again in my memory pool.
> > >    However, I see that the zmq_msg_init_data() ZMQ code contains:
> > >
> > >     //  Initialize constant message if there's no need to deallocate
> > >     if (ffn_ == NULL) {
> > >         ...
> > >         _u.cmsg.data = data_;
> > >         _u.cmsg.size = size_;
> > >         ...
> > >     } else {
> > >         ...
> > >         _u.lmsg.content =
> > >           static_cast<content_t *> (malloc (sizeof (content_t)));
> > >         ...
> > >         _u.lmsg.content->data = data_;
> > >         _u.lmsg.content->size = size_;
> > >         _u.lmsg.content->ffn = ffn_;
> > >         _u.lmsg.content->hint = hint_;
> > >         new (&_u.lmsg.content->refcnt) zmq::atomic_counter_t ();
> > >     }
> > >
> > > So I skip the malloc() operation only if I pass ffn_ == NULL. The
> > > problem is that if I pass ffn_ == NULL, then I have no way to know
> > > when the internals of ZMQ have finished using the zmq_msg_t...
> > >
> > > Any way to work around either issue 1) or issue 2)?
> > >
> > > I understand that the malloc is just of sizeof(content_t) ~= 40
> > > bytes... but I'd still like to avoid it...
> > >
> > > Thanks!
> > > Francesco
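For the archives: the closest I can get to issue 2) with today's public API is to accept the small content_t malloc and use the ffn/hint pair to hand the buffer back to my pool once ZMQ has finished with it. Roughly like this (a sketch only; the pool class and all names are mine, not part of libzmq):

    #include <zmq.h>
    #include <cstddef>
    #include <mutex>
    #include <vector>

    // Hypothetical fixed-size buffer pool: 2k slots handed out for sending and
    // returned from the libzmq deallocation callback when the message is done.
    // The callback may run on a libzmq I/O thread, hence the mutex.
    class BufferPool
    {
      public:
        static constexpr std::size_t SLOT_SIZE = 2048;

        explicit BufferPool (std::size_t slots) : _storage (slots * SLOT_SIZE)
        {
            for (std::size_t i = 0; i < slots; ++i)
                _free.push_back (&_storage[i * SLOT_SIZE]);
        }

        void *acquire ()
        {
            std::lock_guard<std::mutex> lock (_mutex);
            if (_free.empty ())
                return nullptr;            // pool exhausted
            void *buf = _free.back ();
            _free.pop_back ();
            return buf;
        }

        void release (void *buf)
        {
            std::lock_guard<std::mutex> lock (_mutex);
            _free.push_back (static_cast<unsigned char *> (buf));
        }

      private:
        std::vector<unsigned char> _storage;
        std::vector<unsigned char *> _free;
        std::mutex _mutex;
    };

    // ffn: libzmq calls this (with our pool as the hint) once the last copy of
    // the message has been closed, i.e. when the buffer can be reused.
    static void return_to_pool (void *data_, void *hint_)
    {
        static_cast<BufferPool *> (hint_)->release (data_);
    }

    // Send 'len' bytes already written into a pool slot: no buffer copy and no
    // buffer malloc, but still one small content_t malloc inside libzmq.
    static int send_from_pool (void *sock, BufferPool &pool, void *buf,
                               std::size_t len)
    {
        zmq_msg_t msg;
        if (zmq_msg_init_data (&msg, buf, len, return_to_pool, &pool) != 0)
            return -1;
        return zmq_msg_send (&msg, sock, 0);
    }

This still pays the ~40-byte content_t malloc per message, which is exactly the part I was hoping to avoid.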
> > > On Thu, Jul 4, 2019 at 2:58 PM Stephan Opfer <[email protected]> wrote:
> > > > On 04.07.19 14:29, Luca Boccassi wrote:
> > > > > How users make use of these primitives is up to them though, I
> > > > > don't think anything special was shared before, as far as I
> > > > > remember.
> > > >
> > > > Some examples can be found here:
> > > > https://github.com/dasys-lab/capnzero/tree/master/capnzero/src
> > > >
> > > > The classes Publisher and Subscriber should replace the publisher
> > > > and subscriber in a former Robot-Operating-System-based system. I
> > > > hope that the subscriber is actually using the method Luca is
> > > > talking about on the receiving side.
> > > >
> > > > The message data here is a Cap'n Proto container that we "simply"
> > > > serialize and send via ZeroMQ -> therefore the name Cap'nZero ;-)
>
> --
> Kind regards,
> Luca Boccassi
_______________________________________________
zeromq-dev mailing list
[email protected]
https://lists.zeromq.org/mailman/listinfo/zeromq-dev
