Re: [Rd] data frame subset patch, take 2

2006-12-19 Thread Vladimir Dergachev
On Saturday 16 December 2006 4:41 pm, Martin Maechler wrote: > > Correction: the problems show on both platforms; > > one is in mgcv, gam(), an error in [[ <- -- pretty clearly linked to your > changes but not reproducible when tried isolatedly > interactively, > > the other one is a seg.fault "me

Re: [Rd] data frame subset patch, take 2

2006-12-16 Thread Martin Maechler
> "MM" == Martin Maechler <[EMAIL PROTECTED]> > on Sat, 16 Dec 2006 22:31:21 +0100 writes: > "Vladimir" == Vladimir Dergachev <[EMAIL PROTECTED]> > on Wed, 13 Dec 2006 13:03:21 -0500 writes: Vladimir> On Wednesday 13 December 2006 6:01 am, Martin Maechler wrote: >>

Re: [Rd] data frame subset patch, take 2

2006-12-16 Thread Martin Maechler
> "Vladimir" == Vladimir Dergachev <[EMAIL PROTECTED]> > on Wed, 13 Dec 2006 13:03:21 -0500 writes: Vladimir> On Wednesday 13 December 2006 6:01 am, Martin Maechler wrote: >> >> - Vladimir, have you verified your 'take2' against recent versions >> of R-devel? Vlad

Re: [Rd] data frame subset patch, take 2

2006-12-13 Thread Robert Gentleman
Robert Gentleman wrote: > Hi, > We had the "names" discussion and, AFAIR, the idea that someone might > misinterpret the output as suggesting that one could index by number, > seemed to kill it. A more reasonable argument against is that names<- is > problematic. > > You can use $, [[ (wit

Re: [Rd] data frame subset patch, take 2

2006-12-13 Thread Robert Gentleman
Hi, We had the "names" discussion and, AFAIR, the idea that someone might misinterpret the output as suggesting that one could index by number, seemed to kill it. A more reasonable argument against is that names<- is problematic. You can use $, [[ (with character subscripts), and yes ls does

Re: [Rd] data frame subset patch, take 2

2006-12-13 Thread Vladimir Dergachev
On Wednesday 13 December 2006 1:23 pm, Marcus G. Daniels wrote: > Vladimir Dergachev wrote: > > 2. It would be nice to have true hashed arrays in R (i.e. O(1) access > > times). So far I have used named lists for this, but they are O(n): > > new.env(hash=TRUE) with get/assign/exists works ok.

Re: [Rd] data frame subset patch, take 2

2006-12-13 Thread Marcus G. Daniels
Vladimir Dergachev wrote: > 2. It would be nice to have true hashed arrays in R (i.e. O(1) access > times). So far I have used named lists for this, but they are O(n): > new.env(hash=TRUE) with get/assign/exists works ok. But I suspect it's just too easy to use named lists because it is e

Re: [Rd] data frame subset patch, take 2

2006-12-13 Thread Vladimir Dergachev
On Wednesday 13 December 2006 6:01 am, Martin Maechler wrote: > > - Vladimir, have you verified your 'take2' against recent versions > of R-devel? Yes. > > - If they still work, could you re-post them to R-devel, this > time using a proper MIME type, > i.e. most probably one of > appl

Re: [Rd] data frame subset patch, take 2

2006-12-13 Thread Jason Barnhart
I think the efficiency gain is worthwhile. Thx. -jason - Original Message - From: "Martin Maechler" <[EMAIL PROTECTED]> To: "Marcus G. Daniels" <[EMAIL PROTECTED]> Cc: ; "Vladimir Dergachev" <[EMAIL PROTECTED]> Sent: Tuesday, Decembe

Re: [Rd] data frame subset patch, take 2

2006-12-13 Thread Tony Plate
Martin Maechler wrote: > [snip] > Note however that some of these changes are backward > incompatible. I do hope that the changes gaining efficiency > for such large data frames are worth some adaption of > current/old R source code.. > > Feedback on this topic is very welcome! Martin, my f

Re: [Rd] data frame subset patch, take 2

2006-12-12 Thread Robert Gentleman
Hi, I tried take 1, and it failed. I have been traveling (and with Martin's changes also waiting for things to stabilize) before trying take 2, probably later this week and I will send an email if it goes in. Anyone wanting to try it and run R through check and check-all is welcome to do so a

Re: [Rd] data frame subset patch, take 2

2006-12-12 Thread Marcus G. Daniels
Hi Martin, Conventions for optimizing away long, useless row name vectors sound very useful. Nice timings too! I've noticed that before, and not been sure quite what to do. e.g. the hdf5 module just gives up past a certain threshold as the long vectors cause performance problems and HDF5 doesn

Re: [Rd] data frame subset patch, take 2

2006-12-12 Thread Martin Maechler
> "Marcus" == Marcus G Daniels <[EMAIL PROTECTED]> > on Tue, 12 Dec 2006 09:05:15 -0700 writes: Marcus> Vladimir Dergachev wrote: >> Here is the second iteration of data frame subset patch. >> It now passes make check on both 2.4.0 and 2.5.0 (svn as >> of a few days ago

Re: [Rd] data frame subset patch, take 2

2006-12-12 Thread Marcus G. Daniels
Vladimir Dergachev wrote: > Here is the second iteration of data frame subset patch. > It now passes make check on both 2.4.0 and 2.5.0 (svn as of a few days ago). > Same speedup as before. > Hi, I was wondering if this patch would make it into the next release. I don't see it in SVN, but it'

[Rd] data frame subset patch, take 2

2006-12-06 Thread Vladimir Dergachev
Hi Robert, Here is the second iteration of data frame subset patch. It now passes make check on both 2.4.0 and 2.5.0 (svn as of a few days ago). Same speedup as before. Changes: * Introduced two new functions .subassign2 and .subassign that are complementary to .subset2 and .subset.

[Rd] data frame subset patch

2006-11-28 Thread Vladimir Dergachev
Hi all, Here is a patch that significantly speeds up the `[.data.frame` operator. It applies cleanly to both 2.4.0 and svn trunk. Make check was OK for 2.4.0 (for svn trunk it fails even without this patch). What it does - we get rid of class and attr statements that modify incoming data