https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56829
vincenzo Innocente <vincenzo.innocente at cern dot ch> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Feature request: "generic"  |Feature request: "generic"
                   |builtin for "movemask"      |builtin to support control
                   |                            |flow in vectorized code
                   |                            |("movemask",
                   |                            |"vec_any/all_*")

--- Comment #1 from vincenzo Innocente <vincenzo.innocente at cern dot ch> ---
As GCC 4.9 is now out, I would like to come back to this request.

In further support of it, I have found this interesting talk:
http://llvm.org/devmtg/2012-04-12/Slides/Ralf_Karrenberg.pdf
From slide 17 onward it addresses the issue of "divergent control flow" and
its implementation on CPUs (in the context of OpenCL, though the argument is
fully valid for other types of implementation), including praise for "a way
to express predication in IR" on slide 25.

For a general discussion and an implementation, see also
http://www.mcs.anl.gov/publication/introducing-control-flow-vectorized-code
and the references therein.

My preference is still for a builtin that converts a mask into an integer
(movemask behavior). One can then use __builtin_popcount, __builtin_ctz, etc.
to "cast" it to a bool.

For AltiVec, GCC implements the vec_any_"cmp" and vec_all_"cmp" sets of
functions, which combine the comparison and the mask->int conversion. This is
a possible alternative syntax.

My understanding is that NEON does not support any form of predication in its
instruction set (see, for instance,
http://stackoverflow.com/questions/11870910/sse-mm-movemask-epi8-equivalent-method-for-arm-neon).
This is an even more compelling reason for the compiler to provide a "generic"
builtin!
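
To make the request concrete, here is a minimal sketch of the mask-to-integer idiom, assuming GCC's generic vector extensions (vector_size, element-wise comparison, and subscripting). The name mask_to_int and the helpers built on it are hypothetical: they stand in for the requested builtin and are not existing GCC functions. The scalar loop only pins down the intended semantics; a real builtin would be expected to map to something like a movemask instruction where the target has one.

```c
#include <stdint.h>

/* Four 32-bit lanes, declared with GCC's generic vector extension. */
typedef int32_t v4si __attribute__((vector_size(16)));

/*
 * Hypothetical stand-in for the requested "generic" builtin.
 * A vector comparison yields -1 (all bits set) in true lanes and 0 in
 * false lanes; this packs the sign bit of lane i into bit i of the result.
 */
static inline unsigned mask_to_int(v4si m)
{
    unsigned bits = 0;
    for (int i = 0; i < 4; ++i)
        bits |= (unsigned)(m[i] < 0) << i;
    return bits;
}

/* "vec_any"-style test: is a[i] < b[i] true in at least one lane? */
static inline int any_lt(v4si a, v4si b)
{
    return mask_to_int(a < b) != 0;
}

/* "vec_all"-style test: is a[i] < b[i] true in every lane? */
static inline int all_lt(v4si a, v4si b)
{
    return mask_to_int(a < b) == 0xFu;
}

/* Number of lanes where a[i] < b[i], via __builtin_popcount. */
static inline int count_lt(v4si a, v4si b)
{
    return __builtin_popcount(mask_to_int(a < b));
}

/* Index of the first lane where a[i] < b[i], or -1 if there is none.
 * __builtin_ctz is undefined for 0, hence the explicit check. */
static inline int first_lt(v4si a, v4si b)
{
    unsigned m = mask_to_int(a < b);
    return m ? __builtin_ctz(m) : -1;
}
```

With a = {1, 5, 3, 9} and b = {2, 4, 4, 8}, the comparison is true in lanes 0 and 2, so mask_to_int(a < b) is 5 and a branch such as "if (any_lt(a, b))" can steer control flow around the vectorized body.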