Re: [R] apply on large arrays

Erich Neuwirth Thu, 14 Feb 2008 10:47:47 -0800

 > system.time({
+   tab2 <- tab1 <- with(pisa1, table(CNT,GENDER,ISCOF,ISCOM))
+ tab2[] <- 0
+ tab2[which(tab1 == 1, arr.ind = TRUE)] <- 1
+ tab3 <- rowSums(tab2)
+ })
    user  system elapsed
    3.17    0.99    4.17
 >
 > system.time({
+   tab4 <- rowSums(tab1 == 1)
+ })
    user  system elapsed
    1.02    0.18    1.20
 >
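
A small sketch of why the one-liner in the second timing is enough, using made-up counts rather than the PISA table (the array `counts` and its values are purely illustrative): `tab1 == 1` is a logical array of the same shape as `tab1`, and `rowSums()` with its default `dims = 1` sums over every dimension except the first, counting each TRUE as 1.

```r
# Toy 3 x 2 x 2 array of counts (invented values, not the PISA data).
# R fills arrays with the first index varying fastest.
counts <- array(c(1, 0, 2,
                  1, 1, 0,
                  0, 1, 1,
                  3, 1, 1), dim = c(3, 2, 2))

# counts == 1 is a logical array; rowSums() collapses all dimensions
# except the first, so the result has one entry per level of dimension 1.
ones_per_level <- rowSums(counts == 1)
print(ones_per_level)
# 2 3 2
```

So no intermediate 0/1 copy of the table is needed; the comparison itself already produces the indicator array.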



And yes,
the results were identical.
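
For anyone without the PISA data, the comparison can be reproduced along these lines. This is only a sketch: `sim` is a simulated stand-in for `pisa1` (same column names as in the thread, random values, far fewer factor levels), so the timings will not match the ones above.

```r
set.seed(1)
n <- 5000

# Hypothetical stand-in for pisa1: column names as in the thread, values simulated.
sim <- data.frame(
  CNT    = factor(sample(paste0("C", 1:6), n, replace = TRUE)),
  GENDER = factor(sample(c("F", "M"), n, replace = TRUE)),
  ISCOF  = factor(sample(1:20, n, replace = TRUE)),
  ISCOM  = factor(sample(1:20, n, replace = TRUE))
)

tab1 <- with(sim, table(CNT, GENDER, ISCOF, ISCOM))

# 1. Original approach: apply() visits every cell individually.
tab2a     <- apply(tab1, 1:4, function(x) ifelse(sum(x) == 1, 1, 0))
res_apply <- apply(tab2a, 1, sum)

# 2. Indexing approach: zero out a copy, then mark the cells equal to 1.
tab2b   <- tab1
tab2b[] <- 0
tab2b[which(tab1 == 1, arr.ind = TRUE)] <- 1
res_index <- rowSums(tab2b)

# 3. Direct approach: sum the logical array.
res_direct <- rowSums(tab1 == 1)

# Compare the values only (names and attributes are dropped first).
same <- isTRUE(all.equal(unname(res_apply), unname(res_index))) &&
        isTRUE(all.equal(unname(res_index), unname(res_direct)))
print(same)
```

Wrapping each of the three variants in `system.time({ ... })` gives a rough timing comparison on the simulated table.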


[EMAIL PROTECTED] wrote:
> Was the answer the same as the one you were getting with the original
> code?
> 
> How long did the original code take compared to these two versions?
> 
> Cheers,
> Bill V. 
> 
> 
> Bill Venables
> CSIRO Laboratories
> PO Box 120, Cleveland, 4163
> AUSTRALIA
> Office Phone (email preferred): +61 7 3826 7251
> Fax (if absolutely necessary):  +61 7 3826 7304
> Mobile:                         +61 4 8819 4402
> Home Phone:                     +61 7 3286 7700
> mailto:[EMAIL PROTECTED]
> http://www.cmis.csiro.au/bill.venables/ 
> 
> -----Original Message-----
> From: Erich Neuwirth [mailto:[EMAIL PROTECTED] 
> Sent: Thursday, 14 February 2008 5:08 PM
> To: Venables, Bill (CMIS, Cleveland)
> Subject: Re: [R] apply on large arrays
> 
> Thanks, this version is definitely faster than the first one.
> system.time gives 0.13 instead of 0.79 seconds.
> 
> 
> 
> [EMAIL PROTECTED] wrote:
>> Hmm.  I think this could be faster still:
>>
>>      tab1 <- with(pisa1, table(CNT,GENDER,ISCOF,ISCOM))
>>      tab3 <- rowSums(tab1 == 1)
>>
>> but check it...
>>
>> Bill Venables
>>
>> -----Original Message-----
>> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
>> On Behalf Of Venables, Bill (CMIS, Cleveland)
>> Sent: Thursday, 14 February 2008 10:30 AM
>> To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
>> Subject: Re: [R] apply on large arrays
>>
>> Your code is
>>
>>
>>      tab1 <- with(pisa1, table(CNT,GENDER,ISCOF,ISCOM))
>>      tab2 <- apply(tab1, 1:4, 
>>                      function(x) ifelse(sum(x) == 1, 1, 0))
>>      tab3 <- apply(tab2, 1, sum)
>>
>> As far as I can see, step 2 (the problematic one) merely replaces any
>> entries in tab1 that are not equal to one with zeros.  I think this would
>> do the same job a bit faster:
>>
>>      tab2 <- tab1 <- with(pisa1, table(CNT,GENDER,ISCOF,ISCOM))
>>      tab2[] <- 0
>>      tab2[which(tab1 == 1, arr.ind = TRUE)] <- 1
>>      tab3 <- rowSums(tab2)
>>
>> If you don't need to keep tab1, you would make things even better by
>> removing it.
>>
>> Bill Venables.
>>
>> -----Original Message-----
>> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
>> On Behalf Of Erich Neuwirth
>> Sent: Thursday, 14 February 2008 9:52 AM
>> To: r-help
>> Subject: [R] apply on large arrays
>>
>> I have a big contingency table, of approximate size 60*2*500*500,
>> and I need to count the number of cells containing a count of 1 for each
>> of the factor values defining the first dimension.
>> Here is my attempt:
>>
>> tab1<-with(pisa1,table(CNT,GENDER,ISCOF,ISCOM))
>> tab2<-apply(tab1,1:4,function(x)ifelse(sum(x)==1,1,0))
>> tab3<-apply(tab2,1,sum)
>>
>> Computing tab2 is very slow.
>> Is there a faster and/or more elegant way of doing this?
> 

-- 
Erich Neuwirth, University of Vienna
Faculty of Computer Science
Computer Supported Didactics Working Group
Visit our SunSITE at http://sunsite.univie.ac.at
Phone: +43-1-4277-39464 Fax: +43-1-4277-39459

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
