---Andrew Pinski <[EMAIL PROTECTED]> wrote:
> <[EMAIL PROTECTED]> wrote:
> >  The PPC has a very fast dcbz (data cache block zero) to clear memory,
> > and also dcbi (data cache block invalidate), which make it possible to
> > have a cache line hold an address without first reading the memory
> > (when you plan to write the whole line).
> >  The code in opensolaris.org doesn't seem to handle that.
> 
> Except dcbz does not work for caching-inhibited memory (as it will
> cause an alignment exception) so ...

 Just for info, I redid the test that clears/presets and then tests
256 Mbytes of memory on this proprietary PPC target (at boot time):
- memset takes 1172 ms; the memset code is basically:
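        /* replicate the fill byte into all four bytes of a word */
        unsigned val = ch;
        val |= (val << 8);
        val |= (val << 16);
        /* word count: assumes len is a nonzero multiple of 4 */
        len /= 4;
        /* pre-decrement, because the store below updates dst first */
        dst = (void *)((unsigned)dst - sizeof (unsigned));
        /* stu (store word with update) at 4(dst); bdnz loops on CTR */
        asm volatile (" 1: stu  %2,4(%0) ; bdnz+        1b "
                : "+b" (dst), "+c" (len) : "r" (val) : "memory" );
         // bdnz+, bdnz- and bdnz all give the same execution time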
- processor internal DMA writing to 256 Mbytes: 657 ms
- clearing the memory by dcbz: 154 ms
 Reading/testing the 256 Mbytes takes 720 ms.
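
To put the timings above on a common scale, they can be converted to
throughput. The helper below is only an illustrative sketch (the name
mbytes_per_sec is made up for this example); it uses nothing beyond the
256 Mbytes and the millisecond figures quoted above.

```c
#include <stdio.h>

/* Throughput in Mbytes per second for `mbytes` processed in `ms`
 * milliseconds. Hypothetical helper, not part of any library. */
double mbytes_per_sec(unsigned mbytes, unsigned ms)
{
    return (double)mbytes * 1000.0 / (double)ms;
}

/* Print the four measurements from the list above as throughput. */
void print_throughput(void)
{
    printf("memset loop : %7.1f Mbytes/s\n", mbytes_per_sec(256, 1172));
    printf("internal DMA: %7.1f Mbytes/s\n", mbytes_per_sec(256, 657));
    printf("dcbz        : %7.1f Mbytes/s\n", mbytes_per_sec(256, 154));
    printf("read/test   : %7.1f Mbytes/s\n", mbytes_per_sec(256, 720));
}
```

That works out to roughly 218 Mbytes/s for the store loop, 390 for the
DMA, 1660 for dcbz and 356 for the read pass, so dcbz is about 7.6
times faster than the store loop on this target.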

 The very short dcbz time may make a cacheability test worthwhile in
 bzero(): on PPC a tlbsx / tlbre should be enough (if the whole range of
 memory does not need to be tested). Intel is another story, and there
 it is probably best done inside the processor: when ecx is big enough
 on a rep, each transfer should be a full cache line, so that the
 processor stops reading the data it is about to overwrite.
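
As a rough sketch of what such a bzero() could look like: the code
below is not taken from any existing libc. The name bzero_lines, the
`cacheable` flag (standing in for the result of the tlbsx / tlbre
lookup, which is not shown) and the 32-byte CACHE_LINE are all
assumptions made for illustration. CACHE_LINE must match the block
size dcbz really clears on the target, otherwise the loop writes past
the requested range. On non-PPC compilers the dcbz is replaced by a
plain memset so the logic can still be exercised.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHE_LINE 32u  /* ASSUMPTION: dcbz block size of the target */

/* Zero one aligned cache line without reading it from memory first. */
static void zero_line(void *p)
{
#if defined(__powerpc__) || defined(__PPC__)
    __asm__ volatile ("dcbz 0,%0" : : "r" (p) : "memory");
#else
    memset(p, 0, CACHE_LINE);   /* portable stand-in for dcbz */
#endif
}

/* Hypothetical bzero variant. `cacheable` must be nonzero only when
 * the caller knows the whole range is cacheable; dcbz on
 * caching-inhibited memory raises an alignment exception. */
void bzero_lines(void *dst, size_t len, int cacheable)
{
    unsigned char *p = dst;
    unsigned char *end = p + len;
    size_t head;

    if (!cacheable || len < 2 * CACHE_LINE) {
        memset(dst, 0, len);    /* safe path: ordinary stores */
        return;
    }

    /* bytes up to the next cache-line boundary */
    head = (size_t)(-(uintptr_t)p & (CACHE_LINE - 1));
    memset(p, 0, head);
    p += head;

    /* whole lines: one dcbz each */
    for (; (size_t)(end - p) >= CACHE_LINE; p += CACHE_LINE)
        zero_line(p);

    /* partial line at the end */
    memset(p, 0, (size_t)(end - p));
}
```

A caller would pass cacheable = 0 whenever the lookup fails or is
skipped, which keeps the behaviour identical to a plain memset.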

Etienne.


      