[valgrind] [Bug 360035] New: POWER PC instruction bcdadd and bcdsubtract generate result with non-zero shadow bits

Carl Love via KDE Bugzilla Thu, 03 Mar 2016 09:19:45 -0800

https://bugs.kde.org/show_bug.cgi?id=360035


            Bug ID: 360035
           Summary: POWER PC instruction bcdadd and bcdsubtract generate
                    result with non-zero shadow bits
           Product: valgrind
           Version: 3.12 SVN
          Platform: Other
                OS: Linux
            Status: UNCONFIRMED
          Severity: normal
          Priority: NOR
         Component: vex
          Assignee: jsew...@acm.org
          Reporter: c...@us.ibm.com

The Power BCD ADD and BCD SUB instructions generate a result with shadow bits
set.    
This then results in valgrind generating a message:                             

==8979== Use of uninitialised value of size 8                                   
==8979==    at 0x40F8BD8: _itoa_word (_itoa.c:180)                              
==8979==    by 0x40FB05B: vfprintf@@GLIBC_2.17 (vfprintf.c:1641)                
==8979==    by 0x1000084F: main (test_vector_misc.c:83                          

The shadow bits for both of the source operands are all zeros.                  
The vbits should all be set to zero as a result of executing the instruction,   
However, two of the 128 shadow bits are set to 1 causing Valgrind to generate   
the above error message.                                                        

The gdb interface to print the contents of the Power VR registers and its       
shadow bits doesn't work.   See bugzilla 360008.  If you edit                   
coregrind/m_gdbserver/valgrind-low-ppc64.c to put in debug prints that          
show the contents of the VSR32 and VSR33 as follows:                            

   case 72:  *mod = False; break; // GDBTD???? VEX { "trap", 4512, 64 },        
   case 73:  VG_(transfer) (&ppc64->guest_VSR32, buf, dir, size, mod);          
      VG_(printf)("\nCARLL, VSR32 = 0x%x %x %x %x\n",                           
                 ppc64->guest_VSR32[3], ppc64->guest_VSR32[2],                  
                 ppc64->guest_VSR32[1], ppc64->guest_VSR32[0]);                 
      break;                                                                    

   case 74:  VG_(transfer) (&ppc64->guest_VSR33, buf, dir, size, mod);          
      VG_(printf)("CARLL, VSR33 = 0x%x %x %x %x\n",                             
                  ppc64->guest_VSR33[3], ppc64->guest_VSR33[2],                 
                  ppc64->guest_VSR33[1], ppc64->guest_VSR33[0]);                
      break;                                                                    

You can see the correct values in the registers.                                

The P9/Tests/test_vector_convert test case demonstrates this.  Run the test     
case under valgrind with the option --vgdb-shadow-registers=yes to display      
the register and shadow register contents.   The command is:                    

   valgrind --vex-iropt-register-updates=allregs-at-each-insn --vgdb=yes        
               --vgdb-error=0 --vgdb-shadow-registers=yes ./test_vector_misc    

Stop at function test_vector_convert.  Put gdb into assembly mode with the      
command "layout asm".  Do stepi to get to the bcdadd instruction.            

0x10000894 <test_vector_convert+28>     nop                                     
   │0x10000898 <test_vector_convert+32>     addi    r9,r2,-32624                
   │0x1000089c <test_vector_convert+36>     lxvd2x  vs0,0,r9                    
   │0x100008a0 <test_vector_convert+40>     xxswapd vs0,vs0                     
   │0x100008a4 <test_vector_convert+44>     nop                                 
   │0x100008a8 <test_vector_convert+48>     addi    r9,r2,-32560                
   │0x100008ac <test_vector_convert+52>     lxvd2x  vs12,0,r9                   
   │0x100008b0 <test_vector_convert+56>     xxswapd vs12,vs12                   
   │0x100008b4 <test_vector_convert+60>     xxlor   vs32,vs0,vs0                
   │0x100008b8 <test_vector_convert+64>     xxlor   vs33,vs12,vs12              
   │0x100008bc <test_vector_convert+68>     bcdadd. v0,v0,v1,0                  
   │0x100008c0 <test_vector_convert+72>     xxlor   vs0,vs32,vs32               

Note, the Power VS register 0 to 31 map to the VSR32 to VSR64 registers.  The   
floating point registers to the upper 64-bits of the VSR0 to VSR31 registers.  
   The instruction
is using vr0 and vr1 which correspond to VSR32 and VSR33.    

When you stop at the bcdadd instruction the register contents are:              

register  contents                                                              
CARLL, VSR32 = 0xa0 0 0 8                                                       
CARLL, VSR33 = 0x0 99999999 99999999 9999959c                                   
shadow s1                                                                       
CARLL, VSR32 = 0x0 0 0 0                                                        
CARLL, VSR33 = 0x0 0 0 0                                                        
shadow s2
CARLL, VSR32 = 0x0 0 0 0                                                        
CARLL, VSR33 = 0x0 0 0 0                                                        

on gdb use "info registers all" to have gdb display the register contents:      

vr0            {uint128 = 0x00000000000000000000000000000000, v4_float = {0x0,
0x0, 0x0,  
    0x0}, v4_int32 = {0x0, 0x0, 0x0, 0x0}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0,
0x0,      
    0x0, 0x0}, v16_int8 = {0x0 <repeats 16 times>}}                             
vr1            {uint128 = 0x00000000000000000000000000000000, v4_float = {0x0,
0x0, 0x0,  
    0x0}, v4_int32 = {0x0, 0x0, 0x0, 0x0}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0,
0x0,      

...                                                                             
vr0s1          {uint128 = 0x00000000000000000000000000000000, v4_float = {0x0,
0x0, 0x0,  
    0x0}, v4_int32 = {0x0, 0x0, 0x0, 0x0}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0,
0x0,      
    0x0, 0x0}, v16_int8 = {0x0 <repeats 16 times>}}                             
vr1s1          {uint128 = 0x00000000000000000000000000000000, v4_float = {0x0,
0x0, 0x0,  
    0x0}, v4_int32 = {0x0, 0x0, 0x0, 0x0}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0,
0x0,      
    0x0, 0x0}, v16_int8 = {0x0 <repeats 16 times>}}

Not that the contents are all zeros according to gdb but the debug prints say
otherwise.                                                                      

Stepi one more time to execute the bcd instructiion and you get:                

register contents                                                               
CARLL, VSR32 = 0x100 99999999 99999999 9999959c                                 
CARLL, VSR33 = 0x0 99999999 99999999 9999959c                                   
the s1 shadow contents
CARLL, VSR32 = 0x0 0 0 c                                                        
CARLL, VSR33 = 0x0 0 0 0                                                        
the s2 shadow contents
CARLL, VSR32 = 0x0 0 0 0                                                        
CARLL, VSR33 = 0x0 0 0 0                                                        

  gdb gives the following

vr0            {uint128 = 0x00000000000000000000000000000000, v4_float = {0x0,
0x0, 0x0,        
    0x0}, v4_int32 = {0x0, 0x0, 0x0, 0x0}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0,
0x0,           
    0x0, 0x0}, v16_int8 = {0x0 <repeats 16 times>}}                             
vr1            {uint128 = 0x00000000000000000000000000000000, v4_float = {0x0,
0x0, 0x0,       
    0x0}, v4_int32 = {0x0, 0x0, 0x0, 0x0}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0,
0x0,           

...                                                                             
vr0s1          {uint128 = 0x00000000000000000000000000000000, v4_float = {0x0,
0x0, 0x0,       
    0x0}, v4_int32 = {0x0, 0x0, 0x0, 0x0}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0,
0x0,           
    0x0, 0x0}, v16_int8 = {0x0 <repeats 16 times>}}                             
vr1s1          {uint128 = 0x00000000000000000000000000000000, v4_float = {0x0,
0x0, 0x0,       
    0x0}, v4_int32 = {0x0, 0x0, 0x0, 0x0}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0,
0x0,           
    0x0, 0x0}, v16_int8 = {0x0 <repeats 16 times>}}

The value VSR32 = 0x100 99999999 99999999 9999959c is the correct result.  It   
will be printed by the test case.  Note, the shadow bits are not all zeros,     
that is the bug!!!                                                              

So we see that the source register VBits were all valid before the instruction  
but the result of the bcdadd instruction has two of the shadow register         
vbits set indicating the bits are uninitialized.                                

The vbits are set by the function do_shadow_store() in                          
memcheck/MC_translate.c, line 5018.  We are doing 128-bit                       
operations so in the " else if (UNLIKELY(ty == Ity_V128)) {"                    
block we have a call to MC_(helperc_STOREV64le) which calls                     
mc_storev64() do the actual vbit store.  mc_storev64() is defined in            
memcheck/mc_main.c at about line 4646.  If you put an if statement              
in the code:                                                                    

/*------------------------------------------------------------*/                
/*--- STOREV64                                             ---*/                
/*------------------------------------------------------------*/                

static INLINE                                                                   
void mc_STOREV64 ( Addr a, ULong vbits64, Bool isBigEndian )                    
{                                                                               
   PROF_EVENT(MCPE_STOREV64);                                                   

#ifndef PERF_FAST_STOREV                                                        
   // XXX: this slow case seems to be marginally faster than the fast case!     
   // Investigate further.                                                      
   mc_STOREVn_slow( a, 64, vbits64, isBigEndian );                              
#else                                                                           
   {                                                                            
      UWord   sm_off16, vabits16;                                               
      SecMap* sm;                                                               

      if (vbits64 == 0xC)                                                       
         VG_(printf)("CARLL, write bad vbits\n");                               

      if (UNLIKELY( UNALIGNED_OR_HIGH(a,64) )) {                                
         PROF_EVENT(MCPE_STOREV64_SLOW1);                                       
         mc_STOREVn_slow( a, 64, vbits64, isBigEndian );                        
         return;                                                                
      }    

      sm       = get_secmap_for_reading_low(a);                                 
      sm_off16 = SM_OFF_16(a);                                                  
      vabits16 = ((UShort*)(sm->vabits8))[sm_off16];                            

...                                                                             
}                                                                               

You can get it to print the message "write bad vbits" for this particular case
which corresponds to the valgrind generated messages:                           

==8979== Use of uninitialised value of size 8                                   
==8979==    at 0x40F8BD8: _itoa_word (_itoa.c:180)                              
==8979==    by 0x40FB05B: vfprintf@@GLIBC_2.17 (vfprintf.c:1641)                
==8979==    by 0x1000084F: main (test_vector_misc.c:83                          

What I have not been able to do is to find exactly where the vbits64 value      
gets generated when the bcdadd instruction executes.  I don't know if it is     
done via generated assembly code or a call out to a C function like the         
mc_STOREV64() function that stores the vgit value.  Need help to figure out why 
the vbits64 value is calculation is wrong.                         

Reproducible: Always

Steps to Reproduce:
1. given in the details.  Attached the test program and a script to compile it.
2.
3.

-- 
You are receiving this mail because:
You are watching all bug changes.

[valgrind] [Bug 360035] New: POWER PC instruction bcdadd and bcdsubtract generate result with non-zero shadow bits

Reply via email to