Control: tags -1 patch

Hi all,

On Monday, 08.06.2015, at 11:45 +0200, Fabian Greffrath wrote:
> So, in absence of a better approach, [...]

I think I have found another, maybe even the final, fix for this issue.

Remember that the operands in SSE functions must be aligned on 16-byte boundaries. In the init_xrpow_core_sse() function these operands are on the stack. However, when the code is called from the OCaml bindings, the stack is allocated by OCaml, which does not adhere to the 16-byte boundary rule and thus causes the code to crash.

So what we really need here is a means for the init_xrpow_core_sse() function to maintain its own stack and align it according to its needs. Now, guess what compiler flag I found yesterday? ;)

    -mstackrealign
        Realign the stack at entry. On the x86, the -mstackrealign
        option generates an alternate prologue and epilogue that
        realigns the run-time stack if necessary. This supports mixing
        legacy codes that keep 4-byte stack alignment with modern codes
        that keep 16-byte stack alignment for SSE compatibility. See
        also the attribute force_align_arg_pointer, applicable to
        individual functions.

This flag applies per-file. If it is added to liblamevectorroutines_la_CFLAGS (next to the -msse flag) in libmp3lame/vector/Makefile.am, the crash does not occur anymore.

There is also a very similar per-function variant in the form of the force_align_arg_pointer attribute. In the case at hand, however, all functions in the libmp3lame/vector/xmm_quantize_sub.c file call SSE-related code, so I think it is safe to apply this flag file-wide.

I'll be glad to read that this flag fixes the issue for you as well, and to read your opinions about the per-function or per-file variants.

Best regards,
Fabian

--- a/libmp3lame/vector/xmm_quantize_sub.c
+++ b/libmp3lame/vector/xmm_quantize_sub.c
@@ -52,6 +52,7 @@ static const FLOAT costab[TRI_SIZE * 2]
+__attribute__((force_align_arg_pointer))
 void
 init_xrpow_core_sse(gr_info * const cod_info, FLOAT xrpow[576], int upper, FLOAT * sum)
 {

--- a/libmp3lame/vector/Makefile.am
+++ b/libmp3lame/vector/Makefile.am
@@ -20,6 +20,7 @@ xmm_sources = xmm_quantize_sub.c
 if WITH_XMM
 liblamevectorroutines_la_SOURCES = $(xmm_sources)
+liblamevectorroutines_la_CFLAGS = -msse -mstackrealign
 endif
 noinst_HEADERS = lame_intrin.h
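
For anyone who wants to try the per-function variant outside of LAME, here is a minimal sketch of the idea. It is not taken from the LAME sources: the REALIGN_STACK macro, is_aligned16() and sum4_checked() are hypothetical names made up for illustration, and the plain C loop merely stands in for real SSE code. The function keeps a 16-byte-aligned operand on its own stack, which is the situation that fails when the caller hands over a stack that is only 4-byte aligned, and the attribute asks the compiler to realign the stack on entry.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper macro (not part of LAME): expands to the GCC
 * attribute on x86 targets and to nothing everywhere else. */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#define REALIGN_STACK __attribute__((force_align_arg_pointer))
#else
#define REALIGN_STACK
#endif

/* Returns 1 if p lies on a 16-byte boundary, 0 otherwise.
 * The volatile copy keeps the compiler from assuming the answer. */
static int is_aligned16(const void *p)
{
    volatile uintptr_t addr = (uintptr_t)p;
    return (addr & (uintptr_t)15) == 0;
}

/* Hypothetical stand-in for an SSE routine: it keeps a 16-byte-aligned
 * operand on its own stack, as init_xrpow_core_sse() is described to do.
 * Returns 0 on success, -1 if the stack operand ended up misaligned. */
REALIGN_STACK
int sum4_checked(const float in[4], float *sum)
{
    _Alignas(16) float tmp[4]; /* an aligned SSE load would require this */
    size_t i;

    if (!is_aligned16(tmp))
        return -1;

    *sum = 0.0f;
    for (i = 0; i < 4; i++) {
        tmp[i] = in[i];
        *sum += tmp[i];
    }
    return 0;
}
```

Calling sum4_checked() from a caller with a correctly aligned stack should return 0 either way; the attribute only matters when the caller (such as a foreign runtime) enters the function with a stack that is not 16-byte aligned. Building the whole file with -mstackrealign should have the same effect for every function in it without touching the source, which is the trade-off between the two patches above.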