--- Andrew Pinski <[EMAIL PROTECTED]> wrote:
> <[EMAIL PROTECTED]> wrote:
> > The PPC has a very fast dcbz (data cache block zero) to clear memory,
> > and also dcbi (data cache block invalidate) which permit to have a
> > cached line caching an address without reading first the memory (when
> > you plan to write the whole line).
> > The code in opensolaris.org doesn't seem to handle that.
>
> Except dcbz does not work for caching-inhibited memory (as it will
> cause an alignment exception) so ...

Just for info, I reran the test that clears/presets and then tests
256 Mbytes of memory on this proprietary PPC target (at boot time):

 - memset takes 1172 ms; the memset code is basically:

     unsigned val = ch;
     val |= (val << 8);
     val |= (val << 16);
     len /= 4;
     dst = (void *)((unsigned)dst - sizeof (unsigned));
     asm volatile (
         " 1: stu %2,4(%0) ; bdnz+ 1b "
         : "+b" (dst), "+c" (len)
         : "r" (val)
         : "memory"
     );
     // bdnz+ or bdnz- or bdnz gives same execution time

 - processor internal DMA writing to 256 Mbytes: 657 ms
 - clearing the memory by dcbz: 154 ms

Reading/testing the 256 Mbytes takes 720 ms.

Given the very short time of dcbz, it may be worth having bzero() test
whether the memory is cached. On PPC a tlbsx / tlbre should be enough
(if the whole range of memory does not need to be tested). Intel is
another story, and there it is probably best done inside the processor:
when ecx is big enough on a rep, each transfer should be a full cache
line, so that the data about to be overwritten is not read first.

Etienne.
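
To make the dcbz idea concrete, here is a minimal sketch of a line-by-line
clear. It is not the code that was timed above. The names bzero_by_line and
zero_line and the line parameter are hypothetical. The caller is assumed to
pass the real data cache block size of the target (a power of two) and to
have checked already that the range is cacheable, since dcbz on
caching-inhibited memory raises an alignment exception. On non-PPC hosts the
helper falls back to memset so the sketch still builds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Zero one cache line.  On PowerPC this issues dcbz; elsewhere it falls
   back to memset so the sketch compiles and runs on any host. */
static inline void zero_line(void *p, size_t line)
{
#if defined(__powerpc__) || defined(__PPC__)
    (void)line;
    __asm__ volatile ("dcbz 0,%0" : : "r" (p) : "memory");
#else
    memset(p, 0, line);
#endif
}

/* Hypothetical helper: clear len bytes at dst.
   Assumptions: line is the actual data cache block size of the CPU and
   the whole range is cacheable memory. */
static void bzero_by_line(void *dst, size_t len, size_t line)
{
    unsigned char *p = dst;
    unsigned char *end = p + len;

    assert(line != 0 && (line & (line - 1)) == 0);  /* power of two */

    /* Head: ordinary stores up to the first line boundary. */
    size_t head = (size_t)(-(uintptr_t)p & (line - 1));
    if (head > len)
        head = len;
    memset(p, 0, head);
    p += head;

    /* Body: one whole line per iteration, old contents never read. */
    while ((size_t)(end - p) >= line) {
        zero_line(p, line);
        p += line;
    }

    /* Tail: ordinary stores for the remaining partial line. */
    memset(p, 0, (size_t)(end - p));
}
```

The head and tail are handled with ordinary stores because dcbz always
zeroes a whole block. If the line value passed in is smaller than the real
block size, dcbz would clear bytes outside the requested range, so that
value has to come from the hardware and not from a guess.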