Re: [Rd] suggestion how to use memcpy in duplicate.c

2010-04-24 Thread Simon Urbanek
Herve,

I think your code just confirms what I said -- for small nt for() wins, 
otherwise memcpy wins. Taking your measurements (they are a bit crude since 
they measure overhead as well):

ginaz:sandbox$ time ./hmc.mc 1

real 0m7.294s
user 0m7.239s
sys  0m0.054s
ginaz:sandbox$ time ./hmc 1

real 0m3.773s
user 0m3.746s
sys  0m0.024s

so for() is about 2x faster

ginaz:sandbox$ time ./hmc 3

real 0m4.751s
user 0m4.718s
sys  0m0.023s
ginaz:sandbox$ time ./hmc.mc 3

real 0m3.098s
user 0m3.051s
sys  0m0.045s

memcpy is about 50% faster.

It also proves me right when I said we should only special-case the common case 
of scalar recycling and use memcpy for everything else.

Cheers,
Simon
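
The proposal above (special-case scalar recycling, use memcpy for
everything else) could be sketched roughly as follows. This is only an
illustration under stated assumptions: copy_ints_recycled is a
hypothetical name, it handles int vectors only, and it is not the code
in R's duplicate.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from R's duplicate.c): fill dest, which holds
 * dest_len ints, by recycling the src_len ints in src. */
void copy_ints_recycled(int *dest, size_t dest_len,
                        const int *src, size_t src_len)
{
    size_t i, n;

    assert(src_len > 0);  /* recycling an empty source is undefined here */

    if (src_len == 1) {
        /* Scalar recycling: a plain loop avoids calling memcpy once
         * per element. */
        const int v = src[0];
        for (i = 0; i < dest_len; i++)
            dest[i] = v;
        return;
    }

    /* General case: copy whole blocks of src, then the partial tail. */
    for (i = 0; i < dest_len; i += n) {
        n = dest_len - i;
        if (n > src_len)
            n = src_len;
        memcpy(dest + i, src, n * sizeof(int));
    }
}
```

For example, recycling the 3-element source {1, 2, 3} into a destination
of length 7 takes the memcpy path: two full blocks and then a single
trailing element. Whether a threshold larger than 1 would pay off depends
on the platform and would need measuring.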




On Apr 23, 2010, at 9:21 PM, Hervé Pagès wrote:

> Follow up...
> 
> Hervé Pagès wrote:
>> Hi Matthew,
>> Matthew Dowle wrote:
>>> Just to add some clarification, the suggestion wasn't motivated by speeding 
>>> up a length 3 vector being recycled 3.3 million times.  But it's a good 
>>> point that any change should not make that case slower.  I don't know how 
>>> much copyVector is called really,  DUPLICATE_ATOMIC_VECTOR seems more 
>>> significant, which doesn't recycle, and already had the FIXME next to it.
>>> 
>>> Where copyVector is passed a large source though, then memcpy should be 
>>> faster than any of the methods using a for loop through each element 
>>> (whether recycling or not),  allowing for the usual caveats. What are the 
>>> timings like if you repeat the for loop 100 times to get a more robust 
>>> timing ?  It needs to be a repeat around the for loop only, not the 
>>> allocVector whose variance looks to be included in those timings below. 
>>> Then increase the size of the source vector,  and compare to memcpy.
>> On my system (DELL LATITUDE laptop with 64-bit 9.04 Ubuntu):
>> #include <stdio.h>
>> #include <stdlib.h>
>> #include <string.h>
>> void *memcpy2(char *dest, const char *src, size_t n)
>> {
>>   int i;
>>   for (i = 0; i < n; i++) *(dest++) = *(src++);
>>   return dest;
>> }
>> int main()
>> {
>>   int n, kmax, k;
>>   char *x, *y;
>>   n = 2500;
>>   kmax = 100;
>>   x = (char *) malloc(n);
>>   y = (char *) malloc(n);
>>   for (k = 0; k < kmax; k++)
>>     //memcpy2(y, x, n);
>>     memcpy(y, x, n);
>>   return 0;
>> }
>> Benchmarks:
>> n = 2500, kmax = 100, memcpy2:
>>  real 0m8.123s
>>  user 0m8.077s
>>  sys  0m0.040s
>> n = 2500, kmax = 100, memcpy:
>>  real 0m1.076s
>>  user 0m1.004s
>>  sys  0m0.060s
>> n = 25000, kmax = 10, memcpy2:
>>  real 0m8.033s
>>  user 0m8.005s
>>  sys  0m0.012s
>> n = 25000, kmax = 10, memcpy:
>>  real 0m0.353s
>>  user 0m0.352s
>>  sys  0m0.000s
>> n = 25, kmax = 1, memcpy2:
>>  real 0m8.351s
>>  user 0m8.313s
>>  sys  0m0.008s
>> n = 25, kmax = 1, memcpy:
>>  real 0m0.628s
>>  user 0m0.624s
>>  sys  0m0.004s
>> So depending on the size of the memory area to copy, GNU memcpy() is
>> between 7.5x and 22x faster than using a for() loop. You can reasonably
>> expect that the authors of memcpy() have done their best to optimize
>> the code for most platforms they support, for big and small memory
>> areas, and that if there was a need to branch based on the size of the
>> area, that's already done *inside* memcpy() (I'm just speculating here,
>> I didn't look at memcpy's source code).
> 
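
The timing advice quoted above (repeat only the copy, and keep the
allocation outside the timed region) could be followed with a small
harness along these lines. It is a sketch only: copy_fn, copy_loop,
copy_memcpy and time_copies are hypothetical names, and clock() measures
CPU time at coarse resolution, so kmax must be large enough for the
result to be meaningful.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Common signature so both copy strategies can be timed the same way. */
typedef void (*copy_fn)(char *dest, const char *src, size_t n);

/* Element-by-element copy, in the spirit of the for() loop variant. */
void copy_loop(char *dest, const char *src, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        dest[i] = src[i];
}

/* Thin wrapper so memcpy matches the copy_fn signature. */
void copy_memcpy(char *dest, const char *src, size_t n)
{
    memcpy(dest, src, n);
}

/* Return the CPU seconds spent in kmax calls of f. The caller allocates
 * dest and src beforehand, so allocation cost is not part of the
 * measurement. */
double time_copies(copy_fn f, char *dest, const char *src,
                   size_t n, int kmax)
{
    clock_t start, end;
    int k;

    assert(kmax > 0);
    start = clock();
    for (k = 0; k < kmax; k++)
        f(dest, src, n);
    end = clock();
    return (double) (end - start) / CLOCKS_PER_SEC;
}
```

A caller would malloc both buffers once, then call
time_copies(copy_loop, y, x, n, kmax) and
time_copies(copy_memcpy, y, x, n, kmax) and compare the two results.
An optimizing compiler may simplify or drop repeated identical copies,
so it is worth checking the generated code or the destination buffer
before trusting the numbers.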
> So for copying a vector of integer (with recycling of the source),
> yes, a memcpy-based implementation is much faster, for long and small
> vectors (even for a length 3 vector being recycled 3.3 million
> times ;-) ), at least on my system:
> 
> nt = 3; ns = 1000; kmax = 100; copy_ints:
> 
>  real 0m1.206s
>  user 0m1.168s
>  sys  0m0.040s
> 
> nt = 3; ns = 1000; kmax = 100; copy_ints2:
> 
>  real 0m6.326s
>  user 0m6.264s
>  sys  0m0.052s
> 
> 
> Code:
> ===
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
> 
> void memcpy_with_recycling_of_src(char *dest, size_t dest_nblocks,
>                                   const char *src, size_t src_nblocks,
>                                   size_t blocksize)
> {
>   int i, imax, q;
>   size_t src_size;
> 
>   imax = dest_nblocks - src_nblocks;
>   src_size = src_nblocks * blocksize;
>   for (i = 0; i <= imax; i += src_nblocks) {
>     memcpy(dest, src, src_size);
>     dest += src_size;
>   }
>   q = dest_nblocks - i;
>   if (q > 0)
>     memcpy(dest, src, q * blocksize);
>   return;
> }
> 
> void copy_ints(int *dest, int dest_length,
>                const int *src, int src_length)
> {
>   memcpy_with_recycling_of_src((char *) dest, dest_length,
>                                (char *) src, src_length,
>                                sizeof(int));
> }
> 
> /* the copyVector() way */
> void copy_ints2(int

[Rd] S4 Inheritance of environments

2010-04-24 Thread Christopher Brown
I looked through the documentation and the mailing lists and could not
find an answer to this.  My apologies if it has already been answered.
 If it has, a pointer to the relevant discussion would be greatly
appreciated.

Creating S4 classes containing environments exhibits unexpected
behavior/features.  These differ in two ways:

1) slotName for the data: ".xData" instead of ".Data" and do not respond to the
2) Response to the is.* function seems to indicate that the object
does not know of its inheritance.  ( Notably, the inherits function
works as expected. )

Here is a working illustration:

> #  LIST
> setClass( 'inheritList', contains='list')
[1] "inheritList"
> inList <- new( 'inheritList' )
> class( inList )
[1] "inheritList"
attr(,"package")
[1] ".GlobalEnv"
> is.list( inList )  # TRUE
[1] TRUE
> slotNames(inList)  # ".Data"
[1] ".Data"
> inherits(inList, 'list' )  # TRUE
[1] TRUE
>
>
> # ENVIRONMENT
> setClass( 'inheritEnv', contains='environment' )
Defining type "environment" as a superclass via class ".environment"
[1] "inheritEnv"
> inEnv <- new( 'inheritEnv' )
> class(inEnv)
[1] "inheritEnv"
attr(,"package")
[1] ".GlobalEnv"
> is.environment(inEnv) # FALSE
[1] FALSE
> slotNames(inEnv)  # ".xData"
[1] ".xData"
> inherits(inEnv, 'environment' )   # TRUE
[1] TRUE

My question is whether this behavior is a bug or by design, and whether
there is a workaround.

Thanks kindly for your reply,

Chris


the Open Data Group
 http://www.opendatagroup.com
 http://blog.opendatagroup.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] S4 Inheritance of environments

2010-04-24 Thread Roger Peng
I think using 'is(inEnv, "environment")' produces the answer you
expect. Can't explain the other anomalies though.

-roger

On Sat, Apr 24, 2010 at 1:15 PM, Christopher Brown wrote:
> I looked through the documentation and the mailing lists and could not
> find an answer to this.  My apologies if it has already been answered.
>  If it has, a pointer to the relevant discussion would be greatly
> appreciated.
>
> Creating S4 classes containing environments exhibits unexpected
> behavior/features.  These differ in two ways:
>
> 1) slotName for the data: ".xData" instead of ".Data" and do not respond to 
> the
> 2) Response to the is.* function seems to indicate that the object
> does not know of its inheritance.  ( Notably, the inherits function
> works as expected. )
>
> Here is a working illustration:
>
>> #  LIST
>> setClass( 'inheritList', contains='list')
> [1] "inheritList"
>> inList <- new( 'inheritList' )
>> class( inList )
> [1] "inheritList"
> attr(,"package")
> [1] ".GlobalEnv"
>> is.list( inList )          # TRUE
> [1] TRUE
>> slotNames(inList)          # ".Data"
> [1] ".Data"
>> inherits(inList, 'list' )  # TRUE
> [1] TRUE
>>
>>
>> # ENVIRONMENT
>> setClass( 'inheritEnv', contains='environment' )
> Defining type "environment" as a superclass via class ".environment"
> [1] "inheritEnv"
>> inEnv <- new( 'inheritEnv' )
>> class(inEnv)
> [1] "inheritEnv"
> attr(,"package")
> [1] ".GlobalEnv"
>> is.environment(inEnv)             # FALSE
> [1] FALSE
>> slotNames(inEnv)                  # ".xData"
> [1] ".xData"
>> inherits(inEnv, 'environment' )   # TRUE
> [1] TRUE
>
> My question is whether this behavior is a bug or by design, and whether
> there is a workaround.
>
> Thanks kindly for your reply,
>
> Chris
>
>
> the Open Data Group
>  http://www.opendatagroup.com
>  http://blog.opendatagroup.com



-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/
