>> Uros noted that STV with !TImode isn't supposed to run before combine. >> The following adjusts things accordingly and now the pass runs twice >> for TARGET_64BIT. I've also adjusted another gpr->xmm move to >> use (vec_merge (vec_duplicate..)) style rather than using a subreg. >> This isn't strictly neccesary to fix the bug though and my previous >> needs to do this might have been caused by the pass running too early. >> >> So - with or without this consistency part?
The true STV pass (SI/DImode) is intended to run after combine, so memory operands are merged to the instruction together with other scalar simplications combine can perform. As noted by HJ, there is intended pessimization of moves betwen register sets that was introduced before IRA and preffered_for_{speed,size} attribute. >> Bootstrap / regtest running on x86_64-unknown-linux-gnu, OK? IMO, the patch is OK, the regression should be solved by retuning iter-regset move cost. We should not trick the compiler by hiding the move using tricks where we make the move RTX more complex, this is fragile and will regress as soon as the compiler learns corresponding simplification. One example is the referred failing minmax-6.c case. Uros. > It regresses > > FAIL: gcc.target/i386/minmax-6.c scan-assembler-not rsp > > Have to investigate that again then (it caused the change to > use (vec_merge (vec_duplicate..)) for conversion in the first place) > > Richard. > >> Thanks, >> Richard. >> >> 2019-08-19 Richard Biener <rguent...@suse.de> >> >> PR target/91154 >> * config/i386/i386-features.c (general_scalar_chain::convert_op): >> Use (vec_merge (vec_duplicate..)) style vector from scalar move. >> (convert_scalars_to_vector): Add timode_p parameter and use it >> to guard TImode-only operation. >> (pass_stv::gate): Adjust so STV runs twice for TARGET_64BIT. >> (pass_stv::execute): Pass down timode_p. >> >> * gcc.target/i386/minmax-7.c: New testcase.