[moving this from R-help to R-devel]
Hi,
Right, so when you call `[`, the dispatch is made internally:
> d <- data.frame( x = 1:5, y = rnorm(5), z = rnorm(5) )
> trace( `[.data.frame` )
> d[ , 1:2] # ensures that 1:2 is passed as j and that i is passed as missing
Tracing `[.data.frame`(d, , 1:2) on entry
x y
1 1 0.98946922
2 2 0.05323895
3 3 -0.21803664
4 4 -0.47607043
5 5 1.23366151
> d[ 1:2] # only one argument, so it goes in i
Tracing `[.data.frame`(d, 1:2) on entry
x y
1 1 0.98946922
2 2 0.05323895
3 3 -0.21803664
4 4 -0.47607043
5 5 1.23366151
But that does not explain why this is happening:
> d[ i = 1:2]
Tracing `[.data.frame`(d, i = 1:2) on entry
x y
1 1 0.98946922
2 2 0.05323895
3 3 -0.21803664
4 4 -0.47607043
5 5 1.23366151
> d[ j = 1:2]
Tracing `[.data.frame`(d, j = 1:2) on entry
x y z
1 1 0.98946922 -0.5233134
2 2 0.05323895 1.3646683
3 3 -0.21803664 -0.4998344
4 4 -0.47607043 -1.8849618
5 5 1.23366151 0.6723562
Arguments are dispatched to `[.data.frame` with their names, and
`[.data.frame` gets confused. I'm not suggesting that named arguments be
allowed, because passing them already works. What does not work is how
`[.data.frame` treats them. That needs to be changed; this is a bug.
Romain
> version
_
platform i686-pc-linux-gnu
arch i686
os linux-gnu
system i686, linux-gnu
status Under development (unstable)
major 2
minor 9.0
year 2009
month 03
day 09
svn rev 48093
language R
version.string R version 2.9.0 Under development (unstable) (2009-03-09 r48093)
baptiste auguie wrote:
Hi,
I got an off-line clarification from Martin Morgan which makes me
believe it's not a bug (admittedly, I was close to suggesting it before).
Basically, "[" is a .Primitive, for which the help page says,
The advantage of .Primitive over .Internal functions
is the potential efficiency of argument passing. However, this is
done by ignoring argument names and using positional matching of
arguments (unless arranged differently for specific primitives such
as rep), so this is discouraged for functions of more than one argument.
This explains why in my tests the argument names i and j were
completely ignored and only the number and order of arguments changed
the result.
I've learnt my lesson here, but I wonder what could be done to make
this discovery easier for others:
- add a note in the documentation of each .Primitive function (at
least a link to ?.Primitive)
- add such an example in lapply (all examples are for named arguments)
- echo a warning if trying to pass named arguments to a .Primitive
- allow for named arguments as you suggest
I'm not sure the last two would be possible without some cost in
efficiency.
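In the meantime, one way to sidestep the question of names is to wrap the indexing in an anonymous function, so that i and j are passed by position. This is only a minimal sketch: the list `d` below is made-up data in the spirit of the original question (with a third column z added so that column 3 exists), and `first_rows` and `two_cols` are names chosen for illustration.

```r
# Hypothetical data: a list of four small data frames
set.seed(1)
d <- lapply(1:4, function(k) data.frame(x = rnorm(5), y = rnorm(5), z = rnorm(5)))

# i and j are given by position inside the wrapper, so no argument
# names ever reach `[`
first_rows <- lapply(d, function(df) df[1:2, ])      # rows 1 and 2, all columns
two_cols   <- lapply(d, function(df) df[, c(1, 3)])  # all rows, columns x and z
```

The wrapper costs a few more characters than passing "[" directly, but the intent (rows versus columns) is visible in the call itself.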
Many thanks,
baptiste
On 26 Mar 2009, at 07:46, Romain Francois wrote:
Hi,
This is a bug I think. [.data.frame treats its arguments differently
depending on the number of arguments.
d <- data.frame(x = rnorm(5), y = rnorm(5), z = rnorm(5) )
d[, 1:2]
x y
1 0.45141341 0.03943654
2 -0.87954548 1.83690210
3 -0.91083710 0.22758584
4 0.06924279 1.26799176
5 -0.20477052 -0.25873225
base:::`[.data.frame`( d, j=1:2)
x y z
1 0.45141341 0.03943654 -0.8971957
2 -0.87954548 1.83690210 0.9083281
3 -0.91083710 0.22758584 -0.3104906
4 0.06924279 1.26799176 1.2625699
5 -0.20477052 -0.25873225 0.5228342
but also:
d[ j=1:2]
x y z
1 0.45141341 0.03943654 -0.8971957
2 -0.87954548 1.83690210 0.9083281
3 -0.91083710 0.22758584 -0.3104906
4 0.06924279 1.26799176 1.2625699
5 -0.20477052 -0.25873225 0.5228342
`[.data.frame` is called with only two arguments in the second case, so
the following condition is true:
if(Narg < 3L) { # list-like indexing or matrix indexing
The function then assumes that the argument it has been passed is i, and
eventually calls NextMethod("[") which I think calls
`[.listof`(x,i,...). Since i is missing in `[.data.frame`, it is not
passed to `[.listof`, so you have something equivalent to as.list(d)[].
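The counting can be made visible with a toy function. This is a sketch only: `count_args` is a hypothetical stand-in with the same formals x, i, j, not the real `[.data.frame`. It reports what nargs() and missing() see for each calling style.

```r
# count_args is hypothetical; it mimics only the bookkeeping done at the
# top of `[.data.frame`, not the indexing itself
count_args <- function(x, i, j) {
  n     <- nargs()       # counts supplied arguments, including blank positional ones
  has_i <- !missing(i)
  has_j <- !missing(j)
  c(Narg = n, has.i = has_i, has.j = has_j)
}

d <- data.frame(x = 1:5, y = rnorm(5), z = rnorm(5))

a  <- count_args(d, , 1:2)    # like d[, 1:2]    : Narg 3, i missing, j present
b  <- count_args(d, 1:2)      # like d[1:2]      : Narg 2, i present, j missing
cc <- count_args(d, j = 1:2)  # like d[j = 1:2]  : Narg 2, i missing, j present
```

The third call is the problematic one: the argument count is the same as for d[1:2], and only has.j distinguishes the two.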
I think we can replace the condition with this one:
if(Narg < 3L && !has.j) { # list-like indexing or matrix indexing
or this:
if(Narg < 3L) { # list-like indexing or matrix indexing
if(has.j) i <- j
`[.data.frame`(d, j=1:2)
x y
1 0.45141341 0.03943654
2 -0.87954548 1.83690210
3 -0.91083710 0.22758584
4 0.06924279 1.26799176
5 -0.20477052 -0.25873225
However, we would still have this, which is expected (same as d[1:2] ):
`[.data.frame`(d, i=1:2)
x y
1 0.45141341 0.03943654
2 -0.87954548 1.83690210
3 -0.91083710 0.22758584
4 0.06924279 1.26799176
5 -0.20477052 -0.25873225
Romain
baptiste auguie wrote:
Dear all,
Trying to extract a few rows for each element of a list of
data.frames, I'm puzzled by the following behaviour,
d <- lapply(1:4, function(i) data.frame(x=rnorm(5), y=rnorm(5)))
str(d)
lapply(d, "[", i= c(1)) # fine, this extracts the first column
lapply(d, "[", j= c(1, 3)) # does nothing?!
library(plyr)
llply(d, "[", j= c(1, 3)) # same
Am I misinterpreting the meaning of "j", which I thought was an
argument of the method "[.data.frame"?
args(`[.data.frame`)
function (x, i, j, drop = if (missing(i)) TRUE else length(cols) == 1)
Many thanks,
baptiste
_____________________________
Baptiste Auguié
School of Physics
University of Exeter
Stocker Road,
Exeter, Devon,
EX4 4QL, UK
Phone: +44 1392 264187
http://newton.ex.ac.uk/research/emag
--
Romain Francois
Independent R Consultant
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel