2016-08-29 20:53 GMT+02:00 Fons Adriaensen <f...@linuxaudio.org>: Hi Fons,
>> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=832095
>>
>> Actually it is a patch to speed up zita-resampler processing about 3x
>> on SSE machines.
>> I am reluctant to apply it without your approval.
>
> Thanks for your message.
>
> Apparently Steinar didn't get my reply to his latest message,
> so I'm adding him in CC.
>
> What I wrote a few weeks ago is basically this:
>
> I will not accept the patch in its current form, but OTOH the
> code is too good to just be ignored, so I will integrate it
> in another way.
>
> For the next release of zita-resampler I will reorganise the
> code a bit, so it will be possible to have separate optimised
> Resampler1,2,4 classes (for 1,2,4 channels respectively) using
> the SSE code, and without too much code duplication. Same for
> Vresampler.
>
> So Steinar, could you provide optimised 1 and 4 chan versions
> as well ? Even better would be if the latter could handle any
> multiple of 4 channels. In all cases you may assume (hlen % 4 == 0).
>
> There is one comment in your patch which I don't understand:
>
> + // Writes two bytes more than we want, but this is fine since out_count >= 2.
> + _mm_storeu_ps (out_data, s);
>
> What are those two extra bytes ? Doesn't this instruction just write
> four floats ?
>
>
> P.S. I will indeed be on holiday as of tomorrow :-)
> I'll keep an eye on my mailbox, but it will have
> low priority so expect some delay.

Thank you for clarifying.
I will look forward to the next release!
Enjoy your holidays!

Best regards,
mira
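
For reference, regarding the question about `_mm_storeu_ps` quoted above: as far as I know the instruction always stores all four floats (16 bytes) of the register, so with two output channels the "extra" data would be two floats rather than two bytes. Below is a minimal sketch to check this. The helper names are hypothetical and not taken from the patch or from zita-resampler, and `_mm_storel_pi` is shown only as one possible way to store exactly two floats, not as what the patch does.

```cpp
#include <xmmintrin.h>  // SSE intrinsics: _mm_set_ps, _mm_storeu_ps, _mm_storel_pi
#include <cstddef>

// Hypothetical helper: fill a buffer with a sentinel, do one unaligned
// 128-bit store, and count how many floats were overwritten.
std::size_t floats_written_by_storeu()
{
    const float sentinel = -1.0f;
    float buf[8];
    for (float &f : buf) f = sentinel;

    // _mm_set_ps takes its arguments high lane first: lanes 0..3 = 1, 2, 3, 4.
    __m128 s = _mm_set_ps(4.0f, 3.0f, 2.0f, 1.0f);
    _mm_storeu_ps(buf, s);  // stores all four lanes, no alignment required

    std::size_t n = 0;
    for (float f : buf) if (f != sentinel) ++n;
    return n;
}

// Hypothetical alternative: store only the two low lanes (8 bytes),
// leaving the rest of the buffer untouched.
std::size_t floats_written_by_storel()
{
    const float sentinel = -1.0f;
    float buf[8];
    for (float &f : buf) f = sentinel;

    __m128 s = _mm_set_ps(4.0f, 3.0f, 2.0f, 1.0f);
    _mm_storel_pi(reinterpret_cast<__m64 *>(buf), s);  // writes buf[0], buf[1] only

    std::size_t n = 0;
    for (float f : buf) if (f != sentinel) ++n;
    return n;
}
```

If the two-channel code keeps the four-float store, the output buffer needs room for two floats beyond the current frame, which is presumably what the `out_count >= 2` remark in the patch is meant to guarantee.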