Re: [R] Subscripting problem with is.na()

Ivan Calandra Thu, 23 Jun 2016 08:15:20 -0700

Thank you Bert for this clarification. It is indeed an important point.


Ivan

--
Ivan Calandra, PhD
Scientific Mediator
University of Reims Champagne-Ardenne
GEGENAA - EA 3795
CREA - 2 esplanade Roland Garros
51100 Reims, France
+33(0)3 26 77 36 89
[email protected]
--
https://www.researchgate.net/profile/Ivan_Calandra
https://publons.com/author/705639/

On 23/06/2016 at 17:06, Bert Gunter wrote:

Sorry, Ivan, your statement is incorrect:

"When you use a single bracket on a list with only one argument in
between, then R extracts "elements", i.e. columns in the case of a
data.frame. This explains your errors. "

e.g.

ex <- data.frame(a = 1:3, b = letters[1:3])
a <- 1:3
identical(ex[1], a)

[1] FALSE

class(ex[1])

[1] "data.frame"

class(a)

[1] "integer"

Compare:

identical(ex[[1]], a)

[1] TRUE

Why? Single bracket extraction on a list results in a list; double
bracket extraction results in the element of the list (a "column" in
the case of a data frame, which is a specific kind of list). The
relevant sections of ?Extract are:

"Indexing by [ is similar to atomic vectors and selects a **list** of
the specified element(s).

Both [[ and $ select a **single element of the list**. "
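
The same distinction shows up on a plain list. A minimal sketch, assuming a made-up list `lst` that is not part of the original example:

```r
# Hypothetical list, used only for illustration
lst <- list(a = 1:3, b = letters[1:3])

sub1 <- lst[1]     # single bracket: a list of length 1 holding element "a"
sub2 <- lst[[1]]   # double bracket: the element itself, an integer vector
sub3 <- lst$a      # $ selects a single element by name, like [[

class(sub1)            # "list"
class(sub2)            # "integer"
identical(sub2, sub3)  # TRUE
```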


Hope this clarifies this often-confused issue.


Cheers,
Bert
Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)


On Thu, Jun 23, 2016 at 7:34 AM, Ivan Calandra
<[email protected]> wrote:

My statement "Using a single bracket '[' on a data.frame does the same as
for matrices: you need to specify rows and columns" was not correct.


When you use a single bracket on a list with only one argument in between,
then R extracts "elements", i.e. columns in the case of a data.frame. This
explains your errors.

But it is possible to use a single bracket on a data.frame with 2 arguments
(rows, columns) separated by a comma, as with matrices. This is the solution
you received.
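
To make the difference between the one-argument and two-argument forms concrete, here is a small sketch. The data frame `df` is invented for illustration and is not the `ds_test` from the original question:

```r
# Hypothetical data frame, used only for illustration
df <- data.frame(var1 = c(1, NA, 3), var2 = c(NA, 5, 6))

one_arg  <- df[1]                  # one argument: selects column 1, result is still a data.frame
two_args <- df[is.na(df$var1), ]   # two arguments: rows where var1 is NA, all columns

class(one_arg)   # "data.frame"
dim(one_arg)     # 3 1
dim(two_args)    # 1 2
```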

Ivan



On 23/06/2016 at 16:27, Ivan Calandra wrote:

Dear Georg,

You need to learn a bit more about the subsetting methods, depending on
the object structure you're trying to subset.

More specifically, when you run this: ds_test[is.na(ds_test$var1)]
you get this error: "Error in `[.data.frame`(ds_test, is.na(ds_test$var1))
: undefined columns selected"

This means that R does not understand which column you're trying to
select. But you're actually trying to select rows.

Using a single bracket '[' on a data.frame does the same as for matrices:
you need to specify rows and columns, like this:
ds_test[is.na(ds_test$var1), ] ## notice the last comma
## the next line works on all columns because you didn't specify any
## after the comma
ds_test[is.na(ds_test$var1), ] <- 0

If you want it only for "var1", then you need to specify the column:
ds_test[is.na(ds_test$var1), "var1"] <- 0
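
Two other common ways to recode NAs, sketched on made-up data frames (`df_all` and `df_one` are hypothetical names, not from the original post). The first relies on `is.na()` returning a logical matrix when applied to a data frame, which `[<-` accepts as an index:

```r
# Hypothetical data frames, used only for illustration
df_all <- data.frame(var1 = c(1, NA, 3), var2 = c(NA, 5, 6))
df_one <- df_all

# Recode every NA in the whole data frame:
# is.na(df_all) is a logical matrix with the same dimensions as df_all
df_all[is.na(df_all)] <- 0

# Recode NAs in one column only, treating that column as a plain vector
df_one$var1[is.na(df_one$var1)] <- 0

df_all   # var1: 1 0 3, var2: 0 5 6
df_one   # var1: 1 0 3, var2: NA 5 6
```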

It's the same problem with your 2nd and 4th tries (4th one has other
problems). Your 3rd try does not change ds_test at all.
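
For the 2nd try in particular, a quick check of what `is.na("var1")` evaluates to may help. This is only a sketch on an invented data frame `df`:

```r
# is.na() is applied to the character string "var1", not to the column,
# and a non-missing string is never NA
flag <- is.na("var1")
flag   # FALSE

# Hypothetical data frame, used only for illustration
df <- data.frame(var1 = c(1, NA, 3), var2 = c(NA, 5, 6))

# Equivalent to df[FALSE] <- 0: no column is selected, so nothing is replaced
df[is.na("var1")] <- 0

any(is.na(df$var1))   # TRUE: the NA is still there
```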

HTH,
Ivan


On 23/06/2016 at 15:57, [email protected] wrote:

Hi All,

I would like to recode my NAs to 0. Using a single vector everything is
fine.

But if I use a data.frame things go wrong:

-- cut --

var1 <- c(1:3, NA, 5:7, NA, 9:10)
var2 <- c(1:3, NA, 5:7, NA, 9:10)
ds_test <-
    data.frame(var1, var2)

test <- var1
test[is.na(test)] <- 0
test  # NA recoded OK

# First try
ds_test[is.na(ds_test$var1)] <- 0  # duplicate subscripts WRONG

# Second try
ds_test[is.na("var1")] <- 0
ds_test$var1  # not recoded WRONG

# Third try: to me the most intuitive approach
is.na(ds_test["var1"]) <- 0
# attempt to select less than one element in integerOneIndex WRONG

# Fourth try
ds_test[is.na(var1)] <- 0  # duplicate subscripts for columns WRONG

-- cut --
How can I do it correctly?

Where could I have found something about it?

Kind regards

Georg

______________________________________________
[email protected] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

