When vectorising a comparison between N-bit integers or N-bit floats, we want the boolean result to use the vector mask type for N-bit elements. On most targets this is a vector of N-bit integers, but for SVE it's a vector predicate and on AVX512 it's a scalar integer mask.
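As a rough illustration of the two kinds of scalar input involved, here is a minimal C sketch. The function names are hypothetical and are not taken from the patches or from PR 92596. Which vector type the vectoriser actually picks for each "&" depends on the target, as described above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: both inputs to the "&" come from comparisons
   of 32-bit elements, so the boolean values would ideally use the
   vector mask type for 32-bit elements.  */
void
and_of_compares (int32_t *restrict out, const int32_t *restrict a,
		 const int32_t *restrict b, size_t n)
{
  for (size_t i = 0; i < n; ++i)
    out[i] = (a[i] < b[i]) & (a[i] != 0);
}

/* Hypothetical example: both inputs to the "&" are booleans loaded
   from memory, so they are ordinary integer data of sizeof (bool)
   bytes rather than comparison results.  */
void
and_of_loads (bool *restrict out, const bool *restrict x,
	      const bool *restrict y, size_t n)
{
  for (size_t i = 0; i < n; ++i)
    out[i] = x[i] & y[i];
}
```

The scalar semantics are the same in both functions; only the preferred vector representation of the boolean operands differs.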
On the other hand, when loading or storing an M-byte boolean, we want to treat it like any other M-byte integer type.

This difference leads to some complicated handling. E.g.:

- boolean logic ops fed by two N-bit comparisons should use a vector mask for N-bit elements.

- but boolean logic ops fed by two M-byte data loads should use normal M-byte integer vectors.

- boolean logic ops fed by an N-bit comparison and an M-bit comparison need to convert one of the inputs first (handled via pattern stmts).

- boolean logic ops fed by an N-bit comparison and a load are not yet supported.

- etc.

Historically we've tried to make this choice on the fly. This has two major downsides:

(a) search_type_for_mask has to use a worklist to find the mask type for a particular operation. The results are not cached between calls, so this is a potential source of quadratic behavior.

(b) we can only choose the vector type for a boolean result once we know the vector types of the inputs. So both the loop and SLP vectorisers make another pass for boolean types.

The second example in PR 92596 is another case in which (b) causes problems. I tried various non-invasive ways of working around it, but although they worked for the testcase and testsuite, it was easy to see that they were flaky and would probably cause problems later.

In the end I think the best fix is to stop trying to make this decision on the fly and record it in the stmt_vec_info instead. Obviously it's not ideal to be doing something like this in stage 3, but it is a bug fix and I think it will make bool-related problems easier to handle in future.

Each patch tested individually on aarch64-linux-gnu and the series as a whole on x86_64-linux-gnu. OK to install?

Richard