On Tue, Jun 20, 2017 at 12:18 PM, Richard Biener
<richard.guent...@gmail.com> wrote:
> On Tue, Jun 20, 2017 at 10:03 AM, Uros Bizjak <ubiz...@gmail.com> wrote:
>> On Mon, Jun 19, 2017 at 7:51 PM, Jakub Jelinek <ja...@redhat.com> wrote:
>>> On Mon, Jun 19, 2017 at 11:45:13AM -0600, Jeff Law wrote:
>>>> On 06/19/2017 11:29 AM, Jakub Jelinek wrote:
>>>> >
>>>> > Also, on i?86 orq $0, (%rsp) or orl $0, (%esp) is used to probe the
>>>> > stack.  While it is shorter, is it actually faster than, or just as
>>>> > slow as, movq $0, (%rsp) or movl $0, (%esp)?
>>>> Florian raised this privately to me as well.  There are a couple of issues.
>>>>
>>>> 1. Is there a performance penalty/gain for sub-word operations?  If not,
>>>>    we can improve things slightly there.  Even if it's performance
>>>>    neutral we can probably do better on code size.
>>>
>>> CCing Uros and Honza here, I believe there are at least on x86 penalties
>>> for 2-byte, maybe for 1-byte and then sometimes some stalls when you
>>> write or read in a different size from a recent write or read.
>>
>> Don't use orq $0, (%rsp), as this is a high latency RMW insn.
>
> Well, but _maybe_ it's optimized because oring 0 never changes anything?
> At least it would be nice if it would only trigger the page-fault side-effect
> and then not consume other CPU resources.

It doesn't look that way:

--cut here--
void
__attribute__ ((noinline))
test_or (void)
{
  volatile int a;
  unsigned int n;

  for (n = 0; n < (unsigned) -1; n++)
    asm ("orl $0, %0" : "+m" (a));
}

void
__attribute__ ((noinline))
test_movb (void)
{
  volatile int a;
  unsigned int n;

  for (n = 0; n < (unsigned) -1; n++)
    asm ("movb $0, %0" : "+m" (a));
}

void
__attribute__ ((noinline))
test_movl (void)
{
  volatile int a;
  unsigned int n;

  for (n = 0; n < (unsigned) -1; n++)
    asm ("movl $0, %0" : "+m" (a));
}

int main()
{
  test_or ();
  test_movb ();
  test_movl ();
  return 0;
}
--cut here--

  74,99%  a.out    a.out          [.] test_or
  12,50%  a.out    a.out          [.] test_movb
  12,50%  a.out    a.out          [.] test_movl
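
For anyone who wants to cross-check these percentages without a profiler,
here is a rough sketch that times the same three probe instructions with
clock_gettime.  The names probe_or, probe_movb, probe_movl and time_probe
are made up for illustration, and the plain C fallbacks for non-x86
targets are an assumption, not equivalents of the x86 insns.  The call
through a function pointer adds overhead to every iteration, so expect
the ratios to be less pronounced than in the test above.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <time.h>

/* Hypothetical helpers: one candidate stack-probe insn per function.  */
#if defined(__x86_64__) || defined(__i386__)
static void probe_or (volatile int *p)
{
  /* Read-modify-write: loads the slot, ORs in 0, stores it back.  */
  __asm__ volatile ("orl $0, %0" : "+m" (*p));
}

static void probe_movb (volatile int *p)
{
  /* Byte-sized store into the first byte of the slot.  */
  __asm__ volatile ("movb $0, %0" : "=m" (*(volatile char *) p));
}

static void probe_movl (volatile int *p)
{
  /* Full 32-bit store.  */
  __asm__ volatile ("movl $0, %0" : "=m" (*p));
}
#else
/* Assumed fallbacks so the sketch still builds on other targets.  */
static void probe_or (volatile int *p) { *p |= 0; }
static void probe_movb (volatile int *p) { *(volatile char *) p = 0; }
static void probe_movl (volatile int *p) { *p = 0; }
#endif

/* Return the wall-clock seconds taken by ITERS calls of PROBE on a
   local stack slot.  */
static double
time_probe (void (*probe) (volatile int *), unsigned long iters)
{
  volatile int slot = 0;
  struct timespec t0, t1;

  assert (iters > 0);
  clock_gettime (CLOCK_MONOTONIC, &t0);
  for (unsigned long n = 0; n < iters; n++)
    probe (&slot);
  clock_gettime (CLOCK_MONOTONIC, &t1);

  return (double) (t1.tv_sec - t0.tv_sec)
	 + (double) (t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Calling time_probe (probe_or, N) and time_probe (probe_movl, N) with the
same large N from main and printing both values gives a direct
comparison.  Absolute numbers will vary with the CPU and compiler flags.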

Uros.
