https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109484
--- Comment #2 from 。 <570070308 at qq dot com> ---
(In reply to Richard Biener from comment #1)
> but you clobber 'temp' early and fail to indicate that so GCC allocates the
> same register as part of the "+m" output.

The requirements you describe are not reflected in the documentation. The documentation only says that `GCC assumes that the assembler code consumes its inputs before producing outputs`, and this code fits that assumption. First it reads the input from %1, then it writes the output to %0, and then it writes the output to %1. No output is written before the input has been read.
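
For reference, here is a minimal sketch of the pattern under discussion. It is not the code from this report: `swap_in` is a hypothetical helper, and the asm assumes x86-64 with AT&T syntax (other targets fall back to plain C). The sequence reads %1, writes %0, then writes %1, so the address of the "+m" operand is still needed after %0 has been written. The '&' modifier on %0 is what tells GCC not to place it in a register that forms part of that address.

```c
/* Hypothetical helper, not taken from the bug report: returns the old
   value of *p and stores val in its place. */
static long swap_in(long *p, long val)
{
    long temp;
#if defined(__GNUC__) && defined(__x86_64__)
    __asm__ ("mov %1, %0\n\t"   /* read the memory operand, write temp */
             "mov %2, %1"       /* write the memory operand afterwards */
             : "=&r" (temp),    /* '&': written before %1 is last used */
               "+m" (*p)        /* read and written; address may need a register */
             : "r" (val));
#else
    /* Portable fallback so the sketch builds on other targets. */
    temp = *p;
    *p = val;
#endif
    return temp;
}
```

With a plain "=r" on `temp`, GCC would be allowed to pick the register that holds `p` for %0, and the second `mov` would then store through a clobbered address.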