Control: tag -1 confirmed

Good day,

having taken a closer look at kmc: simde is set up, and it looks
like enabling -march=x86-64-v2 leads through a buggy build path.
Some parts of the source code are designed to build only against
specific combinations of machine-specific flags.  In the present
case, if I inject -march=x86-64-v2 and some diagnostic output,
a mismatch appears: the file targeting the sse2 cpu capability
ends up defining the sse4.1 function name, because sse4.1 is
made available by the build options:
                                                  vvvv
        In file included from kmer_counter/raduls_sse2.cpp:12:
        kmer_counter/raduls_impl.h:734:2: warning: #warning "RADULS_RADIX_SORT_FUNNAME=RadixSortMSD_SSE41" [-Wcpp]
                                                                                                    ^^^^^
which explains the error reported by Matthias:
> ./kmer_counter/kmc.h:1132: undefined reference to `void 
> RadulsSort::RadixSortMSD_SSE2<CKmer<2u> >(CKmer<2u>*, CKmer<2u>*, unsigned 
> long long, unsigned int, unsigned int, CMemoryPool*)'
> collect2: error: ld returned 1 exit status
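
To make the mechanism concrete, here is a reduced sketch of how I
understand the name selection.  This is not the actual content of
raduls_impl.h: the DEMO_* macros, the namespace and the helper functions
are made up for illustration.  Only __SSE4_1__ is real; gcc and clang
predefine it whenever the -march setting enables sse4.1.

```cpp
#include <cstring>

// Hypothetical reduction of the name selection, not the real raduls_impl.h.
// The function name follows whatever the compiler flags make available,
// regardless of which .cpp file includes the header.
#if defined(__SSE4_1__)
#  define DEMO_FUNNAME RadixSortMSD_SSE41
#else
#  define DEMO_FUNNAME RadixSortMSD_SSE2
#endif

// Two-step stringification, so that the macro is expanded before quoting.
#define DEMO_STR2(x) #x
#define DEMO_STR(x) DEMO_STR2(x)

namespace sketch_mismatch {

// Stand-in for the routine that the sse2 file is expected to provide.
// It only reports the name it was compiled under.
inline const char* DEMO_FUNNAME() { return DEMO_STR(DEMO_FUNNAME); }

// Name actually defined by the file that is meant to hold the sse2 variant.
inline const char* name_defined_by_sse2_file() { return DEMO_FUNNAME(); }

// True only when that file really provides the sse2 symbol.
inline bool sse2_symbol_is_provided() {
    return std::strcmp(name_defined_by_sse2_file(), "RadixSortMSD_SSE2") == 0;
}

} // namespace sketch_mismatch
```

Built with baseline flags, the helper reports the sse2 name; built with
-march=x86-64-v2, it reports the sse4.1 name, and a caller that asks for
the sse2 symbol by name has nothing to link against.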

More broadly, this is bound to fail with additional symptoms
when using machine type x86-64-v3, as it adds various avx
support, not to mention the various combinations thrown up by
-march=native.  I see two options to mitigate this:
 1. either disable the build of raduls_sse2.cpp, which is unneeded
    in an x86-64-v2 context, since it is overridden by the sse4.1
    implementation raduls_sse41.cpp anyway;
 2. or cheat with the C preprocessor macro definitions, to get
    back the missing function in its sse2-only context.
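
As a rough idea of what point 2 could look like, here is a minimal
sketch.  It is not the patch itself: PINNED_FORCE_SSE2, PINNED_FUNNAME,
the namespace and the helper are hypothetical, and std::sort merely
stands in for the real radix sort.  The assumption is that the sse2 file
pins its own function name before the flag-driven selection takes place.

```cpp
#include <algorithm>

// Hypothetical: would be defined only while compiling the sse2 file,
// so that its symbol name no longer depends on the -march setting.
#define PINNED_FORCE_SSE2 1

#if defined(PINNED_FORCE_SSE2)
#  define PINNED_FUNNAME RadixSortMSD_SSE2
#elif defined(__SSE4_1__)
#  define PINNED_FUNNAME RadixSortMSD_SSE41
#else
#  define PINNED_FUNNAME RadixSortMSD_SSE2
#endif

namespace sketch_pinned {

// Stand-in body: std::sort replaces the actual radix sort here.
template <typename T>
void PINNED_FUNNAME(T* data, unsigned long long n) {
    std::sort(data, data + n);
}

// Caller that refers to the sse2 variant by its explicit name, the way
// the undefined reference in the linker error does.
inline bool call_sse2_variant() {
    int v[] = {3, 1, 2};
    RadixSortMSD_SSE2(v, 3ULL);
    return v[0] == 1 && v[1] == 2 && v[2] == 3;
}

} // namespace sketch_pinned
```

With the name pinned, the explicit call resolves whether or not sse4.1
is enabled by the build flags.  The sketch leaves aside whether the
pinned file should also be restricted to sse2 instructions.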

Point 1 looks cleaner to me, but so far I have only been able to
implement point 2 successfully, without causing baseline
violations in normal builds or defeating the purpose of the
optimisations, and with support for more flag combinations
than just the baseline or -march=x86-64-v2.  I will push the
changes to Salsa this evening, most likely.

Have a nice day,  :)
-- 
Étienne Mollier <emoll...@emlwks999.eu>
Fingerprint:  8f91 b227 c7d6 f2b1 948c  8236 793c f67e 8f0d 11da
Sent from /dev/pts/6, please excuse my verbosity.
