This is just one example, but it illustrates a general technique that I find really useful and have used many times with instructions other than SCAS. If there is a way to achieve this without using a naked function, please advise.
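Keeping the __asm syntax, I'd be surprised if this did not work:

    template<typename T>
    int find_first_nonzero_scas(T* x, int cnt)
    {
        int result = 0;
        __asm {
            xor eax, eax        // value to compare against: zero
            mov edi, x          // start of the buffer
            mov ecx, cnt        // maximum number of elements to scan
        }
        // sizeof(T) is a compile-time constant, so only one branch is taken.
        // Each instruction sits on its own line because ';' starts a comment
        // inside an __asm block.
        if (sizeof(T) == 1) __asm {
            repe scasb          // scan while the byte equals al
            mov result, edi
        }
        if (sizeof(T) == 2) __asm {
            repe scasw          // scan while the word equals ax
            mov result, edi
        }
        if (sizeof(T) == 4) __asm {
            repe scasd          // scan while the dword equals eax
            mov result, edi
        }
        // edi now points one element past the first mismatch
        result -= reinterpret_cast<int>(x);
        result /= sizeof(T);
        return --result;
    }

Paolo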
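To check the inline-assembly version, it may help to compare it against a plain C++ loop that computes the same index. The sketch below is only a reference for testing; the name find_first_nonzero_ref is made up here and is not part of the code above. Returning cnt when every element is zero is also a choice made for this sketch. As far as I can tell, the assembly version returns cnt - 1 in that case, the same value it returns when the last element is the first nonzero one.

```cpp
// Hypothetical reference helper (not from the original post): returns the
// index of the first nonzero element of x, or cnt if all cnt elements are zero.
template <typename T>
int find_first_nonzero_ref(const T* x, int cnt)
{
    for (int i = 0; i < cnt; ++i) {
        if (x[i] != 0) {
            return i;   // first element that differs from zero
        }
    }
    return cnt;         // assumption of this sketch: "not found" is reported as cnt
}
```

Running both functions over the same buffers of 1-, 2- and 4-byte elements, with at least one nonzero element in each, should give identical results.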