aggregate package is available at
http://www.timhesterberg.net/r-packages
Here is the inst/doc/missingValues.txt file from that package:
--
Copyright 2012 Google Inc. All Rights Reserved.
Author: Tim Hesterberg
Distributed under GPL 2 or later.
I suggest adding this to R_HOME/doc/KEYWORDS.db:
Programming|testing: Software testing
and add a corresponding entry in R_HOME/doc/KEYWORDS.
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailma
n/src/R-3-0-branch/src/main/memory.c:2478
> #3 0x7790bedf in growData () at gram.y:3391
>
>and the memory allocations are from these lines in the parser gram.y
>
> PROTECT( bigger = allocVector( INTSXP, data_size * DATA_ROWS ) ) ;
> PROTECT( biggertext
I did some benchmarking of data frame code, and
it appears that R 3.0.0 is far worse than earlier versions of R
in terms of how many large objects it allocates space for,
for data frame operations - creation, subscripting, subscript replacement.
For a data frame with n rows, it makes either 2 or 4
>On Sep 11, 2012, at 16:02 , Warnes, Gregory wrote:
>
>>
>> On 9/7/12 2:42 PM, "peter dalgaard" wrote:
>>
>>>
>>> On Sep 7, 2012, at 17:16 , Tim Hesterberg wrote:
>>>
>>>> I suggest adding a 'pivot' argument to qr.
I suggest adding a 'pivot' argument to qr.R, to obtain columns in the
same order as the original x, so that
a <- qr(x)
qr.Q(a) %*% qr.R(a, pivot=TRUE)
returns x.
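As a sketch of what the proposed argument would do, the same result can be had today by undoing the pivot by hand. The helper name qrRUnpivoted is made up for illustration and is not part of base R; it assumes only that qr() stores its column permutation in a$pivot.

```r
# Hypothetical helper (not in base R): return R with its columns put back
# in the original column order of x, mimicking the proposed pivot=TRUE.
qrRUnpivoted <- function(a, complete = FALSE) {
  R <- qr.R(a, complete = complete)
  R[, order(a$pivot), drop = FALSE]
}

set.seed(1)
x <- matrix(rnorm(20), nrow = 5)
x[, 2] <- x[, 1]   # duplicated column, which should make the default qr() pivot
a <- qr(x)
xRebuilt <- qr.Q(a) %*% qrRUnpivoted(a)
all.equal(x, xRebuilt)
```

Without the reordering, qr.Q(a) %*% qr.R(a) gives x[, a$pivot] rather than x.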
--
# File src/library/base/R/qr.R
qr.R <- function(qr, complete = FALSE, pivot = F
When creating a package, I would like a way to tell R that
a function with a period in its name is not a method.
I'm writing a package now with a modified version of qr.R.
R CMD check gives warnings:
* checking S3 generic/method consistency ... WARNING
qr:
function(x, ...)
qr.R:
function(qr,
I've been playing with passing arguments to .C(), and found that replacing
as.double(x)
with
if(is.double(x)) x else as.double(x)
saves time and avoids one copy, in the case that x is already double.
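A minimal sketch of that replacement, wrapped in a helper so it can be reused. The name asDoubleIfNeeded is an assumption for illustration, and whether a copy is actually saved depends on the R version.

```r
# Hypothetical helper: convert only when x is not already double.
asDoubleIfNeeded <- function(x) if (is.double(x)) x else as.double(x)

x <- c(1.5, 2.5, 4)   # already double: returned as-is
i <- 1:3              # integer: converted

is.double(asDoubleIfNeeded(i))
identical(asDoubleIfNeeded(x), x)
```

In a build of R with memory profiling, calling tracemem(x) before the .C() call is one way to check whether a copy is made.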
I suggest modifying as.double to avoid the extra copy and just
return x, when x is already
In base/R/tabulate.R, tabulate() calls .C("R_tabulate";
I suggest adding DUP = FALSE to that call.
information.
Tim Hesterberg
--
% File src/library/base/man/qraux.Rd
% Part of the R package, http://www.R-project.org
% Copyright 1995-2007 R Core Development Team
% Distributed under GPL 2 or later
\name{QR.Auxiliaries}
I've written a "dataframe" package that replaces existing methods for
data frame creation and subscripting with versions that use less
memory. For example, as.data.frame(a vector) makes 4 copies of the
data in R 2.9.2, and 1 copy with the package. There is a small speed
gain.
I and others have b
I also favor deprecating mean.data.frame.
One possible exception would be for a single-column data frame.
But even here I'd say no, lest people expect the same behavior for
median, var, ...
Pat's suggestion of using stop() would work nicely for mean.
(but omit paste - stop handles t
For consistency with rowSums, colSums, rowMeans, etc., the names should be
colMins colMaxs
rowMins rowMaxs
This is also consistent with S+.
FYI, the rowSums naming convention was chosen to avoid conflict
with rowsum (which computes column sums!).
Tim Hesterberg
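To make the naming point concrete, here is a rough sketch of what functions with those names might look like. These apply()-based definitions are assumptions for illustration, not the proposed implementation, and they are not meant to be fast.

```r
# Hypothetical definitions, named to match rowSums/colSums.
colMins <- function(x, na.rm = FALSE) apply(x, 2, min, na.rm = na.rm)
colMaxs <- function(x, na.rm = FALSE) apply(x, 2, max, na.rm = na.rm)
rowMins <- function(x, na.rm = FALSE) apply(x, 1, min, na.rm = na.rm)
rowMaxs <- function(x, na.rm = FALSE) apply(x, 1, max, na.rm = na.rm)

m <- matrix(c(3, 1, 4, 1, 5, 9), nrow = 2)
colMins(m)
rowMaxs(m)

# rowsum() adds up rows within groups, so with a single group it
# produces the column sums, unlike rowSums():
rowsum(m, group = c(1, 1))
colSums(m)
```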
>> A well-de
Having aperm() return an object of the same class is dangerous, there
are undoubtedly classes for which that is not appropriate, producing an
illegal object for that class or quietly giving incorrect results.
Three alternatives are to:
* add the keep.class option but with default FALSE
* make aper
erg/articles/JSM04-bootknife.pdf
All three are undefined for samples of size 1. You need to go to some
other bootstrap, e.g. a parametric bootstrap with variability estimated
from other data.
Tim Hesterberg
ion (x0, y0, x1 = x0, y1 = y0, col = par("fg"), lty = par("lty"),
---
> function (x0, y0, x1, y1, col = par("fg"), lty = par("lty"),
Arrows:
< function (x0, y0, x1 = x0, y1 = y0, length = 0.25, angle = 30, code = 2,
---
> function (x0, y0, x1, y1,
A number of as.data.frame methods do
names(x) <- NULL
Replacing that with
if(!is.null(names(x)))
names(x) <- NULL
appears to save making one copy of the data
(based on tracemem and Rprofmem in a copy of R compiled
with --enable-memory-profiling)
and gives a modest but consistent b
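The guarded assignment can be sketched as a small helper. The name dropNames is hypothetical, and whether a copy is actually saved is something to confirm with tracemem or Rprofmem on the R version in question.

```r
# Hypothetical helper: only assign NULL names when there are names to drop.
dropNames <- function(x) {
  if (!is.null(names(x)))
    names(x) <- NULL
  x
}

named <- c(a = 1, b = 2)
plain <- 1:3
names(dropNames(named))
identical(dropNames(plain), plain)
```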
Any data frames with more than this have dup.row.names default to 2.
The name 'dup.row.names' is for consistency with S+; there the options
are NULL, F or T.
Tim Hesterberg
t this doesn't
go far enough; subscripting and other operations sometimes convert the
automatic names to real names, and check/enforce uniqueness, which is
a big waste of time when working with large data frames. I'll comment
more on this in a new thread.
Tim Hesterberg
Others have commented on why this holds.
There is an alternative, 'ifelse1', part of the splus2R package, that
does what you'd like here.
Tim Hesterberg
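For illustration, here is a sketch of the idea behind a scalar-test alternative. The function ifelseScalar is a made-up stand-in, not the splus2R source; it assumes test is a single TRUE or FALSE and returns the chosen argument unchanged.

```r
# Hypothetical stand-in for a scalar-test ifelse: returns 'yes' or 'no'
# whole, instead of filling a result the same length as 'test'.
ifelseScalar <- function(test, yes, no) {
  stopifnot(length(test) == 1, !is.na(test))
  if (test) yes else no
}

vectorized <- ifelse(TRUE, character(0), "")   # result has length(test) == 1
scalar <- ifelseScalar(TRUE, character(0), "") # result is character(0) itself
length(vectorized)
length(scalar)
```

ifelse() always returns something shaped like test, which is why a zero-length yes cannot come back as zero-length.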
>I find it slightly surprising, that
> ifelse(TRUE, character(0), "")
>returns NA instead of char
length(i) < 2 ||
+ (is.numeric(i) && min(i, 0, na.rm=TRUE) < 0)
||
+ (!any(is.na(i)) && all(i[-length(i)]
wrote:
> >>>>> "TH" == Tim Hesterberg <[EMAIL PROTECTED]>
> >>
ict,
any(x[-1] >= x[-n]),
any(x[-1] > x[-n]))
} else { # check for sort in increasing order
ifelse1(strict,
any(x[-1] <= x[-n]),
any(x[-1] < x[-n]))
}
}
On Tue, Jul 1, 2008 at 3:23 PM, Tim Hesterberg <[EMAIL PROTECTED]>
ng=FALSE, check for sort in increasing order
# If strict=TRUE, ties correspond to not being sorted
n <- length(x)
if(n < 2)
return(FALSE)
if(!is.atomic(x) || (!na.rm && any(is.na(x))))
return(NA)
if(na.rm && any(ii <- is.na(x)))
x <- x[!ii]
Below is a version of [.data.frame that is faster
for subscripting rows of large data frames; it avoids calling
duplicated(rows)
if there is no need to check for duplicate row names, when:
i is logical
attr(x, "dup.row.names") is not NULL (S+ compatibility)
i is numeric and negative
By whitespace, I mean either a space or tab (preceding the newline).
I'm using ESS:
ess-version's value is "5.3.6"
GNU Emacs 21.4.1 (i486-pc-linux-gnu, X toolkit, Xaw3d scroll bars) of
2007-08-28 on terranova, modified by Debian
I have the following in my .emacs:
(load "ess-5.3.6/lisp/ess-site")
Hi Oleg,
If there is a class to inherit from, then my point about an S4 class
requiring lots of methods is moot. I think it would come down then to
whether one prefers flexibility (advantage S3) or a definite structure
for use with C/C++ (advantage S4).
Tim
>well, I am not arguing that there ar
>Tim Hesterberg wrote:
>> It depends on what the object is to be used for.
>>
>> If you want users to be able to operate with the object as if it
>> were a normal vector, to do things like mean(x), cos(x), etc.
>> then the list would be very long indeed; for exam
LUS), plus additional
methods defined for inheriting classes.
In cases like this you might prefer using an S3 class, using
attributes rather than slots for auxiliary information, so that
you don't need to write so many methods.
Tim Hesterberg
>I am defining a new class. Shortly, I wil
mVector <- (1:size - runif(1))/size
* observation i is selected if cprob[i-1] < uniformVector[j] <= cprob[i]
for any j
In the case (size*max(prob) > 1), the number of times the observation
is selected is the number of j's for which the inequalities hold.
* the selected observat
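The selection rule in the first two bullets can be sketched as follows. This is only an illustration under assumptions: systematicSample is a made-up name, cprob is taken to be the cumulative sum of the normalized prob, and cprob[0] is treated as 0.

```r
# Hypothetical sketch of the selection rule described above.
# Requires findInterval(..., left.open = TRUE), available since R 3.3.0.
systematicSample <- function(prob, size) {
  n <- length(prob)
  cprob <- cumsum(prob / sum(prob))
  cprob[n] <- 1                      # guard against rounding in cumsum()
  uniformVector <- (1:size - runif(1)) / size
  # findInterval() with left.open = TRUE returns k such that
  # cprob[k] < u <= cprob[k + 1], so k + 1 is the i with
  # cprob[i-1] < u <= cprob[i].
  findInterval(uniformVector, cprob, left.open = TRUE) + 1L
}

set.seed(42)
prob <- c(0.1, 0.2, 0.3, 0.4)
picked <- systematicSample(prob, size = 8)
counts <- tabulate(picked, nbins = length(prob))
counts
```

Here size*max(prob) > 1, so some observations appear more than once; counts[i] is the number of j's that fall in observation i's interval.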
ut not (yet) "freq", including
mean median ppoints tabulate. Other functions like lm
have always had a weights argument.
--
Tim Hesterberg
Disclaimer - my own opinions, not Insightful's.
realized that in some cases we wanted to add a "call" attribute or
component/slot so that update() would work. If it had been an S3
object we could have done so, but as an S4 object we would have broken
existing objects of the class.
Tim Hesterberg
Disclaimer - this is my personal opini
In S-PLUS, is() does catch parent S3 classes. It does not
require a setOldClass definition to do so. I would prefer that
R work the same way, to make porting code easier.
I use is() in S-PLUS for both S3 and S4 classes because it is faster
than inherits(). I use inherits() only for testing a ve
e, and methods for data frames and other classes.
The code below seems to presume a list, and would be very slow for vectors.
For reasons of consistency between S-PLUS and R, I would ask that an R
function be called anyMissing rather than hasNA or anyNA.
Tim Hesterberg
>is there a hasNA() /
ing thing I've found
about using R. As I anticipate using R a lot in the future, I would
appreciate very much if it is changed. I spent a fair amount of time
trying to see if I could change it myself, but gave up.
Tim Hesterberg
Andy Liaw wrote:
>If I'm not mistaken, this works as
>>day 20
>>>svn rev 36812
>>>language R
>>
>> I responded:
>>>You can open them in R. On Windows, File:Open Script,
>>>change "Files of type" to "All Files", then open the .ssc file.
>>
>> So there is a workaroun