https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84719

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |NEW

--- Comment #4 from H.J. Lu <hjl.tools at gmail dot com> ---
I compared __builtin_memcpy one size at a time.  Here are the results in
cycles:

clang 1 bytes: 17193410146
gcc   1 bytes: 15440244966
clang 2 bytes: 8997535880
gcc   2 bytes: 8147449530
clang 3 bytes: 6002276628
gcc   3 bytes: 5430387704
clang 4 bytes: 4497121282
gcc   4 bytes: 4069604454
clang 5 bytes: 3644879742
gcc   5 bytes: 3258094970
clang 6 bytes: 3045612708
gcc   6 bytes: 2728410608
clang 7 bytes: 2574110178
gcc   7 bytes: 2330365680
clang 8 bytes: 969894432
gcc   8 bytes: 6436950208

GCC is faster at every size except 8 bytes, where it takes about 6.6
times as many cycles as clang.
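
The benchmark source is not included in this comment.  As a rough
illustration only, a per-size comparison like this could be set up as
sketched below.  Everything in it is an assumption rather than the
harness that produced the numbers above: the names (TOTAL, ticks,
bench_N, fill_src, copies_match), the buffer size, the use of __rdtsc
as the cycle counter, and the idea of copying a fixed total number of
bytes in N-byte chunks.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Hypothetical cycle source: TSC on x86, clock() elsewhere. */
#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>
static uint64_t ticks(void) { return __rdtsc(); }
#else
static uint64_t ticks(void) { return (uint64_t)clock(); }
#endif

/* Assumed total number of bytes copied per measurement. */
#define TOTAL ((size_t)1 << 16)

static char src[TOTAL], dst[TOTAL];

/* Fill the source buffer with a simple repeating pattern. */
static void fill_src(void)
{
    for (size_t i = 0; i < TOTAL; i++)
        src[i] = (char)(i * 31u + 7u);
}

/* Define bench_N(): copy TOTAL bytes in N-byte chunks with a
   compile-time-constant size, so the compiler expands each
   __builtin_memcpy inline.  The empty asm is a compiler barrier that
   keeps the loop from being merged into one large copy.  */
#define DEFINE_BENCH(N)                                          \
static uint64_t bench_##N(void)                                  \
{                                                                \
    uint64_t start = ticks();                                    \
    for (size_t off = 0; off + (N) <= TOTAL; off += (N)) {       \
        __builtin_memcpy(dst + off, src + off, (N));             \
        __asm__ volatile("" : : "r"(dst + off) : "memory");      \
    }                                                            \
    return ticks() - start;                                      \
}

DEFINE_BENCH(1)
DEFINE_BENCH(8)

/* Return 1 if the destination matches the source, else 0. */
static int copies_match(void)
{
    return memcmp(dst, src, TOTAL) == 0;
}
```

Building the same file with each compiler at the same optimization
level and printing bench_1() and bench_8() from main would give one
pair of numbers per size.  A single pass over a small buffer is noisy,
so a real measurement would repeat each function many times.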
