I've also observed this performance regression. The minimal fix for me is removing the two "if (unlikely(len > (unsigned long)ctx))" checks added in 680557c.
After digging a little more, the reason those checks can fail appears to be that add_recvbuf_mergeable sometimes adds a hole at the end of the buffer, and that hole is counted in len but not in ctx. I'd send a patch removing those conditions, but I'm not certain whether "truesize" in receive_mergeable should also be changed back to the max of len/ctx, or should remain as-is.

- Euan
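
To make the len/ctx mismatch concrete, here is a simplified userspace model of the behaviour described above. It is a sketch, not the kernel code: frag_model, posted_buf, post_buffer and check_rejects are hypothetical names, headroom and tailroom are ignored, and the only assumption carried over from the description is that the size stored in ctx is fixed before the trailing hole is added to the posted length.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the page fragment that buffers are carved from. */
struct frag_model {
	size_t size;   /* total bytes in the fragment */
	size_t offset; /* bytes already handed out */
};

/* What gets recorded for one posted receive buffer in this model. */
struct posted_buf {
	size_t ctx_truesize; /* size stored in ctx, fixed before the hole is added */
	size_t posted_len;   /* length offered to the device, hole included */
};

/* Model of posting one buffer of the requested length from the fragment. */
static struct posted_buf post_buffer(struct frag_model *frag, size_t len)
{
	struct posted_buf out;
	size_t hole;

	assert(frag->offset + len <= frag->size);

	out.ctx_truesize = len; /* ctx is computed first */
	frag->offset += len;

	/* If the remainder cannot hold another buffer, fold it into this one. */
	hole = frag->size - frag->offset;
	if (hole < len) {
		len += hole;
		frag->offset += hole;
	}

	out.posted_len = len;
	return out;
}

/* Model of the added check: nonzero if the used length exceeds ctx. */
static int check_rejects(const struct posted_buf *buf, size_t used_len)
{
	return used_len > buf->ctx_truesize;
}
```

With a 4096-byte fragment and 1536-byte buffers, the second buffer absorbs a 1024-byte hole in this model, so its posted length is 2560 while ctx still says 1536. A device that legitimately fills, say, 2000 bytes of it would then trip the check.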