On Thu, 2016-12-08 at 00:35 +0000, Duyck, Alexander H wrote:
> Well there ends up being a few aspects to it. First we don't need the
> precision of a full 64b inverse multiplication, that is why we can get
> away with multiply by 85 and shift. The assumption is we should never
> see a buffer larger than 64K for a TSO frame. That being the case we
> can do the same thing without having to use a 64b value which isn't an
> option on 32b architectures.
>
> So basically what it comes down to is dealing with the "optimized for
> size" kernel option, and 32b architectures not being able to do this.
> Arguably both are corner cases but better to deal with them than take
> a performance hit we don't have to.

ok ok ;)

Too bad the 65536 value is accepted (is it?), otherwise

unsigned int foo(unsigned short size)
{
	return size / 0x3000;
}

-> generates the same kind of instructions, with maybe a better precision.

foo:
	movzwl	%di, %eax
	imull	$43691, %eax, %eax
	shrl	$29, %eax
	ret