https://bugs.kde.org/show_bug.cgi?id=369053

            Bug ID: 369053
           Summary: AMD64 fma4 instructions missing 256 bit support
           Product: valgrind
           Version: 3.9.0
          Platform: Other
                OS: Linux
            Status: UNCONFIRMED
          Severity: normal
          Priority: NOR
         Component: vex
          Assignee: jsew...@acm.org
          Reporter: m...@redhat.com
                CC: andreas.boer...@w84u.org, bellamy.b...@gmail.com,
                    dra...@shaw.ca, m...@redhat.com, p4pl...@gmail.com,
                    smj...@gmail.com, t...@compton.nu
        Depends on: 369000
            Blocks: 339596

Let's clone this once again. We now have fma4 support, but only for the
128-bit/xmm cases. There are already tests (in none/tests/amd64/fma4.c) for the
256-bit/ymm cases, but these are disabled for now because those cases aren't
implemented yet. Implementing them would need to check getVexL(pfx) and extend
the operations to the full 256 bits.
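
To make the 128-bit versus 256-bit distinction concrete, here is a rough
scalar model in plain C. It is only a sketch and not VEX IR code: the Ymm type
and model_fmadd_ps() are hypothetical names invented for this illustration, and
the plain a*b+c below does not model the single rounding of a real fused
multiply-add. It assumes the usual VEX rule that the 128-bit form zeroes the
upper half of the destination register.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical model of a YMM register as eight single-precision lanes. */
typedef struct { float lane[8]; } Ymm;

/* Model of a packed single-precision multiply-add, dst = a*b + c.
   vex_l == 0: 128-bit form, lanes 0..3 computed, lanes 4..7 zeroed.
   vex_l == 1: 256-bit form, all eight lanes computed.
   Note: a*b + c is not guaranteed to round once like a real FMA. */
Ymm model_fmadd_ps(int vex_l, Ymm a, Ymm b, Ymm c)
{
    Ymm dst;
    int active;

    assert(vex_l == 0 || vex_l == 1);
    active = vex_l ? 8 : 4;

    /* Start from all zeroes so the inactive upper lanes end up cleared. */
    memset(&dst, 0, sizeof dst);
    for (int i = 0; i < active; i++)
        dst.lane[i] = a.lane[i] * b.lane[i] + c.lane[i];
    return dst;
}
```

In this model the only difference between the two forms is the lane count
selected by the L bit, which is roughly the shape of the change described
above.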

+++ This bug was initially created as a clone of Bug #369000 +++

Let's split the fma4 and xop instructions into separate bugs & patches.
(I have already looked at the fma4 ones, but haven't had time for the xop
instructions.)

+++ This bug was initially created as a clone of Bug #339596 +++

When running valgrind, I immediately run into an illegal instruction at
program startup. My first action was to try the latest source from SVN, but
this did not help. Below are a few extra details on the instruction in
question, my system, and anything else I can think of.

vex amd64->IR: unhandled instruction bytes: 0x8F 0xE8 0x78 0xCD 0xC1 0x4 0xC5 0xF9
vex amd64->IR:   REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR:   VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
vex amd64->IR:   PFX.66=0 PFX.F2=0 PFX.F3=0
==16432== valgrind: Unrecognised instruction at address 0x5adb623.
==16432==    at 0x5ADB623: findChar(QChar const*, int, QChar, int, Qt::CaseSensitivity) (in /usr/lib64/libQt5Core.so.5.4.0)
==16432==    by 0x5AEC0C9: QString::split(QChar, QString::SplitBehavior, Qt::CaseSensitivity) const (in /usr/lib64/libQt5Core.so.5.4.0)
==16432==    by 0x5BC8E61: QStandardPaths::standardLocations(QStandardPaths::StandardLocation) (in /usr/lib64/libQt5Core.so.5.4.0)
==16432==    by 0x5B7D9B7: QStandardPaths::locate(QStandardPaths::StandardLocation, QString const&, QFlags<QStandardPaths::LocateOption>) (in /usr/lib64/libQt5Core.so.5.4.0)
==16432==    by 0x5BB6028: QLoggingRegistry::init() (in /usr/lib64/libQt5Core.so.5.4.0)
==16432==    by 0x5C2BC8A: QCoreApplication::init() (in /usr/lib64/libQt5Core.so.5.4.0)
==16432==    by 0x5C2BEF5: QCoreApplication::QCoreApplication(QCoreApplicationPrivate&) (in /usr/lib64/libQt5Core.so.5.4.0)
==16432==    by 0x5560718: QGuiApplication::QGuiApplication(QGuiApplicationPrivate&) (in /usr/lib64/libQt5Gui.so.5.4.0)
==16432==    by 0x4F8F23C: QApplication::QApplication(int&, char**, int) (in /usr/lib64/libQt5Widgets.so.5.4.0)
==16432==    by 0x40D62D: main (main.cpp:37)

I've run it through GDB and grabbed a disassembly of the faulting code (more
can be provided if requested):
   0x0000000005adb619 <+329>:    mov    %r8,%rax
   0x0000000005adb61c <+332>:    vmovups (%rax),%xmm0
   0x0000000005adb620 <+336>:    mov    %rcx,%r8
=> 0x0000000005adb623 <+339>:    vpcomw $0x4,%xmm1,%xmm0,%xmm0
   0x0000000005adb629 <+345>:    vpmovmskb %xmm0,%esi
   0x0000000005adb62d <+349>:    test   %si,%si
   0x0000000005adb630 <+352>:    je     0x5adb610 <_ZL8findCharPK5QChariS_iN2Qt15CaseSensitivityE+320>
   0x0000000005adb632 <+354>:    bsf    %esi,%esi
   0x0000000005adb635 <+357>:    sub    %rdi,%rax
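
For reference, the first four unhandled bytes from the vex output (0x8F 0xE8
0x78 0xCD) can be picked apart by hand to see how they line up with the vpcomw
shown by GDB. The following is a minimal sketch in C, assuming the standard
three-byte XOP prefix layout (escape 0x8F, then inverted R/X/B plus a 5-bit
map, then W, inverted vvvv, L and pp). The XopFields struct and
decode_xop_prefix() are hypothetical names for this illustration and are not
part of Valgrind. The sketch also does not distinguish XOP from the POP
instruction that shares the 0x8F byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the fields of a three-byte XOP prefix. */
typedef struct {
    unsigned r, x, b;   /* register extension bits, already un-inverted */
    unsigned map;       /* mmmmm: opcode map select */
    unsigned w;         /* W bit */
    unsigned vvvv;      /* extra source register, already un-inverted */
    unsigned l;         /* L bit: 0 = 128-bit, 1 = 256-bit */
    unsigned pp;        /* implied legacy prefix selector */
    uint8_t  opcode;    /* opcode byte following the prefix */
} XopFields;

/* Decode the prefix fields and opcode from bytes[0..3].
   Simplification: only checks for the 0x8F escape byte. */
XopFields decode_xop_prefix(const uint8_t *bytes, size_t len)
{
    XopFields f;

    assert(len >= 4 && bytes[0] == 0x8F);

    /* Byte 1: R, X, B are stored inverted in bits 7..5; map in bits 4..0. */
    f.r   = ((bytes[1] >> 7) & 1) ^ 1;
    f.x   = ((bytes[1] >> 6) & 1) ^ 1;
    f.b   = ((bytes[1] >> 5) & 1) ^ 1;
    f.map = bytes[1] & 0x1F;

    /* Byte 2: W in bit 7, inverted vvvv in bits 6..3, L in bit 2, pp in 1..0. */
    f.w    = (bytes[2] >> 7) & 1;
    f.vvvv = (~(unsigned)(bytes[2] >> 3)) & 0xF;
    f.l    = (bytes[2] >> 2) & 1;
    f.pp   = bytes[2] & 3;

    f.opcode = bytes[3];
    return f;
}
```

Fed the bytes from the report, this gives map 8, L=0 (a 128-bit operation),
vvvv=0 (xmm0) and opcode 0xCD. The next two bytes, 0xC1 and 0x4, would then be
the ModRM byte and the $0x4 immediate, and the trailing 0xC5 0xF9 look like the
start of the following vpmovmskb.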

My system CPU is an AMD FX-8150, and Qt (where the instruction seems to
originate) is compiled with GCC 4.8.  I am running a Gentoo-based system and
have used -march=native, -O2, and -fomit-frame-pointer as my only three default
CFLAGS.

If it helps, I can temporarily set up a VM with SSH access to aid in
testing/debugging of the problem.  However, for sanity's sake I will only do
this for Valgrind developers.

My final observation is that the XOP and FMA4 instructions introduced in the
Bulldozer generation of AMD processors seem to cause the most trouble.  But
that may be beyond the scope of this bug report.

Reproducible: Always
