On Mon, Feb 23, 2009 at 10:39 AM, Jiri Olsa <olsaj...@gmail.com> wrote:
> On Mon, Feb 23, 2009 at 7:35 PM, H.J. Lu <hjl.to...@gmail.com> wrote:
>> On Mon, Feb 23, 2009 at 10:05 AM, Jiri Olsa <olsaj...@gmail.com> wrote:
>>> Hi,
>>>
>>> my shared library crashes on a movaps instruction that accesses
>>> unaligned memory.
>>>
>>> Since the shared library function is called from the dynamic linker, which
>>> basically prepares the memory location, I'm not sure whose side the issue
>>> is on.
>>>
>>> I have the following function in C:
>>>
>>> typedef float La_x86_64_xmm __attribute__ ((__vector_size__ (16)));
>>>
>>> typedef struct La_x86_64_retval
>>> {
>>>   uint64_t lrv_rax;
>>>   uint64_t lrv_rdx;
>>>   La_x86_64_xmm lrv_xmm0;
>>>   La_x86_64_xmm lrv_xmm1;
>>>   long double lrv_st0;
>>>   long double lrv_st1;
>>> } La_x86_64_retval;
>>>
>>> unsigned int la_x86_64_gnu_pltexit (Elf64_Sym *__sym,
>>>         unsigned int __ndx, uintptr_t *__refcook,
>>>         uintptr_t *__defcook,
>>>         const La_x86_64_regs *__inregs,
>>>         La_x86_64_retval *__outregs, const char *symname)
>>> {
>>>   La_x86_64_xmm b __attribute__ ((aligned(16)));
>>>   b = __outregs->lrv_xmm0;
>>>   return 0;
>>> }
>>>
>>> this ends up as the following assembly:
>>>
>>> 00000000000007d7 <la_x86_64_gnu_pltexit>:
>>>  7d7:   55                      push   %rbp
>>>  7d8:   48 89 e5                mov    %rsp,%rbp
>>>  7db:   48 89 7d e8             mov    %rdi,-0x18(%rbp)
>>>  7df:   89 75 e4                mov    %esi,-0x1c(%rbp)
>>>  7e2:   48 89 55 d8             mov    %rdx,-0x28(%rbp)
>>>  7e6:   48 89 4d d0             mov    %rcx,-0x30(%rbp)
>>>  7ea:   4c 89 45 c8             mov    %r8,-0x38(%rbp)
>>>  7ee:   4c 89 4d c0             mov    %r9,-0x40(%rbp)
>>>  7f2:   48 8b 45 c0             mov    -0x40(%rbp),%rax
>>>  7f6:   0f 28 40 10             movaps 0x10(%rax),%xmm0
>>>  7fa:   0f 29 45 f0             movaps %xmm0,-0x10(%rbp)
>>>  7fe:   b8 00 00 00 00          mov    $0x0,%eax
>>>  803:   c9                      leaveq
>>>  804:   c3                      retq
>>>
>>>
>>> Looks like the xmm0 register is being used to transfer the data. However,
>>> the structure is not aligned to 16 bytes, so it crashes.
>>>
>>
>> Where exactly does it crash? Which structure isn't aligned at 16 bytes?
>>
>> --
>> H.J.
>>
>
> sorry, it crashes on this one:
>
>  7f6:   0f 28 40 10             movaps 0x10(%rax),%xmm0
>
> This structure/argument is not aligned at 16 bytes:
> La_x86_64_retval *__outregs
>
> '__outregs->lrv_xmm0' is at byte offset 16 of the structure...
>
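The faulting movaps needs a 16-byte-aligned memory operand, and lrv_xmm0
sits at offset 16 in the structure, so the load can only succeed when
__outregs itself is 16-byte aligned. A minimal sketch for checking this,
assuming GCC or Clang on x86-64. The helper names (is_aligned,
retval_alignment, xmm0_offset, check_outregs) are hypothetical and are not
part of the audit interface:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same layout as the structure quoted above. */
typedef float La_x86_64_xmm __attribute__ ((__vector_size__ (16)));

typedef struct La_x86_64_retval
{
  uint64_t lrv_rax;
  uint64_t lrv_rdx;
  La_x86_64_xmm lrv_xmm0;
  La_x86_64_xmm lrv_xmm1;
  long double lrv_st0;
  long double lrv_st1;
} La_x86_64_retval;

/* Hypothetical helper: nonzero if p is a multiple of align
   (align must be a power of two).  */
static int
is_aligned (const void *p, size_t align)
{
  return ((uintptr_t) p & (align - 1)) == 0;
}

/* Alignment the compiler requires for the whole structure.  */
static size_t
retval_alignment (void)
{
  return _Alignof (La_x86_64_retval);
}

/* Byte offset of lrv_xmm0, i.e. the 0x10 in "movaps 0x10(%rax),%xmm0".  */
static size_t
xmm0_offset (void)
{
  return offsetof (La_x86_64_retval, lrv_xmm0);
}

/* Hypothetical debugging aid: called at the top of la_x86_64_gnu_pltexit,
   it aborts with an assertion message instead of faulting on the movaps.  */
static void
check_outregs (const La_x86_64_retval *outregs)
{
  assert (is_aligned (outregs, retval_alignment ()));
}
```

With these compilers on x86-64, both retval_alignment () and xmm0_offset ()
should evaluate to 16, which is why the compiler considers movaps safe here.
If check_outregs () fires, the caller handed over a pointer that does not
meet the structure's own alignment requirement.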
Why isn't __outregs aligned to 16 bytes? According to the x86-64 psABI,
La_x86_64_retval should be 16-byte aligned.

--
H.J.