------- Comment #4 from adam at consulting dot net dot nz  2010-01-05 04:17 -------
/* Workaround discovered! */

void test_int_vectors_containing_fp_data_using_local_reg_var_overlay() {
  //create local register variables of the required floating point type
  //(for the same global register variables)
  register xmm_2f64_t local_xmm_c __asm__("xmm6");
  register xmm_2f64_t local_xmm_d __asm__("xmm7");

  //same calculation upon the local register variables. No casts are required.
  local_xmm_c = local_xmm_c + local_xmm_d;

  //the local changes above will be optimised away unless the global register
  //variables are updated. The casts below should be a no-op as the local
  //register variables are aliased to the global register variables.
  xmm_c=(xmm_2i64_t) local_xmm_c;
  xmm_d=(xmm_2i64_t) local_xmm_d;
}

With this workaround, the generated code is still optimal when the global
register variables have an integer vector type:

0000000000400550 <test_int_vectors_containing_fp_data_using_local_reg_var_overlay>:
  400550:       66 0f 58 f7             addpd  xmm6,xmm7
  400554:       c3                      ret


-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42596
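
The claim that the vector casts are a no-op can be checked in isolation,
without any register variables. The sketch below is only an illustration: the
typedefs are an assumption (the comment uses the names xmm_2i64_t and
xmm_2f64_t but does not show their definitions), and add_fp_in_int_vectors,
pack_doubles and unpack_double are hypothetical helpers that do not appear in
the bug report. It relies on the GCC vector extension rule that a cast between
vector types of the same size reinterprets the bits rather than converting
values.

```c
#include <string.h>

/* Assumed definitions: the comment uses these type names, but their
   typedefs are not shown in it. */
typedef long long xmm_2i64_t __attribute__((vector_size(16)));
typedef double    xmm_2f64_t __attribute__((vector_size(16)));

/* Hypothetical helper: add two double vectors whose bits are stored in
   integer vectors. Same-size vector casts reinterpret the bits, so no
   integer<->floating point value conversion takes place. */
xmm_2i64_t add_fp_in_int_vectors(xmm_2i64_t c, xmm_2i64_t d)
{
    xmm_2f64_t sum = (xmm_2f64_t) c + (xmm_2f64_t) d;
    return (xmm_2i64_t) sum;
}

/* Hypothetical helper: store the bit patterns of two doubles in an
   integer vector. */
xmm_2i64_t pack_doubles(double lo, double hi)
{
    double tmp[2] = { lo, hi };
    xmm_2i64_t v;
    memcpy(&v, tmp, sizeof v);
    return v;
}

/* Hypothetical helper: read one lane of an integer vector back as a double. */
double unpack_double(xmm_2i64_t v, int lane)
{
    double tmp[2];
    memcpy(tmp, &v, sizeof tmp);
    return tmp[lane];
}
```

If the casts converted values instead of reinterpreting bits, the lanes read
back with unpack_double would not be the floating point sums of the inputs.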