On Apr 22, 2010, at 7:12 AM, Matthew Dowle wrote:

> Is this a thumbs up for memcpy for DUPLICATE_ATOMIC_VECTOR at least ?
>
> If there is further specific testing then let me know, happy to help, but
> you seem to have beaten me to it.
I was not volunteering to do anything - I was just looking at whether it
makes sense to bother at all and pointing out the bugs in your code ;).
I have a sufficiently long list of TODOs already :P

Cheers,
Simon

> "Simon Urbanek" <simon.urba...@r-project.org> wrote in message
> news:65d21b93-a737-4a94-bdf4-ad7e90518...@r-project.org...
>>
>> On Apr 21, 2010, at 2:15 PM, Seth Falcon wrote:
>>
>>> On 4/21/10 10:45 AM, Simon Urbanek wrote:
>>>> Won't that miss the last incomplete chunk? (And please don't use
>>>> DATAPTR on INTSXP even though the effect is currently the same.)
>>>>
>>>> In general it seems that it depends on nt whether this is
>>>> efficient or not, since calls to memcpy on short blocks are
>>>> expensive (very small nt, that is).
>>>>
>>>> I ran some empirical tests to compare memcpy vs for() (x86_64, OS X)
>>>> and the results were encouraging - depending on the size of the
>>>> copied block the difference could be quite big:
>>>> - tiny block (ca. n = 32 or less): for() is faster
>>>> - small block (n ~ 1k): memcpy is ca. 8x faster
>>>> - as the size increases the gap closes (presumably due to RAM
>>>>   bandwidth limitations), so for n = 512M it is ~30%.
>>>>
>>>> Of course this is contingent on the implementation of memcpy,
>>>> compiler, architecture etc. And it will only matter if copying is
>>>> what you do most of the time ...
>>>
>>> Copying of vectors is something that I would expect to happen fairly
>>> often in many applications of R.
>>>
>>> Is for() faster on small blocks by enough that one would want to
>>> branch based on size?
>>
>> Good question. Given that the branching itself adds overhead, possibly
>> not. In the best case for() can be ~40% faster (for single-digit n),
>> but that means billions of copies to make a difference (since the
>> operation itself is so fast). The break-even point on my test machine
>> is n = 32, and when I added the branching it took a 20% hit, so I
>> guess it's simply not worth it.
>> The only case that may be worth branching is n = 1, since that is
>> likely a fairly common use. (The branching penalty in copy routines is
>> lower than in the memcpy/for comparison above, since the branch can be
>> taken once, before the outer for loop - so this may vary
>> case-by-case.)
>>
>> Cheers,
>> Simon

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
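[The n = 1 branching idea from the last quoted paragraph can be sketched like this. duplicate_ints is a hypothetical helper, not R's API; the point is only that the branch is taken once, before any copy loop, so the common scalar case pays a single compare instead of a memcpy call.]

```c
/* Sketch: branch once on length so the common n == 1 (scalar) case
 * avoids the overhead of a memcpy call. Hypothetical function name. */
#include <string.h>

static void duplicate_ints(int *dst, const int *src, size_t n) {
    if (n == 1)
        dst[0] = src[0];                /* scalar: direct store, no call */
    else
        memcpy(dst, src, n * sizeof(int)); /* bulk: let libc do the copy */
}
```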