On 26/10/18 00:42 +0200, Marc Glisse wrote:
On Fri, 26 Oct 2018, Ville Voutilainen wrote:
I would rather not introduce a behavioral difference between us and
libc++.
Why not? There are already several, and it helps find bugs. Maybe you
could convince libc++ to change as well if you want to keep the
behavior the same?
For the libc++ string zeroing the length of a small string happens to
be faster. See Howard's answer at the stackoverflow link I gave. He
says:
'So in summary, the setting of the source to an empty string is necessary in "long
mode" to transfer ownership of the pointer, and also necessary in short mode for
performance reasons to avoid a broken pipeline.'
That's not the case for our SSO string, which wasn't designed from the
ground up to make this move constructor optimal. We can choose whether
to leave moved-from strings empty or not.
I'm not persuaded that preserving portable behaviour between
implementations is useful here. This is unspecified behaviour. I agree
with Marc that introducing a difference helps find bugs, and teaches
people not to rely on unspecified properties.
I haven't been able to measure any performance difference, and in many
cases the writes to the rvalue will be dead stores if the object is
about to be destroyed anyway. But the code does look slightly better
(caveat: I don't know what I'm talking about).
Before:
leaq 16(%rdi), %rdx
movq %rdi, %rax
movq %rdx, (%rdi)
movq (%rsi), %rcx
leaq 16(%rsi), %rdx
cmpq %rdx, %rcx
je .L5
movq %rcx, (%rdi)
movq 16(%rsi), %rcx
movq %rcx, 16(%rdi)
.L3:
movq 8(%rsi), %rcx
movq %rdx, (%rsi)
movq $0, 8(%rsi)
movq %rcx, 8(%rax)
movb $0, 16(%rsi)
ret
.L5:
movdqu 16(%rsi), %xmm0
movups %xmm0, 16(%rdi)
jmp .L3
After:
leaq 16(%rdi), %rdx
movq %rdi, %rax
movq %rdx, (%rdi)
movq 8(%rsi), %rdx
movq (%rsi), %rcx
movq %rdx, 8(%rdi)
leaq 16(%rsi), %rdx
cmpq %rdx, %rcx
je .L5
movq %rcx, (%rdi)
movq 16(%rsi), %rcx
movq %rdx, (%rsi)
movq %rcx, 16(%rdi)
movq $0, 8(%rsi)
movb $0, 16(%rsi)
ret
.L5:
movdqu 16(%rsi), %xmm0
movups %xmm0, 16(%rdi)
ret
It does slightly concern me that some users might
actually semantically expect a moved-from string to be empty, even
though that's not guaranteed, although for non-SSO cases
it *is* guaranteed.
Is it? In debug mode, I'd be tempted to leave the string as "moved"
(size 5, short string so there is no allocation).
That's not a bad idea.