Re: [R] Error in .subset(x, j) : only 0's may be mixed with negative subscripts

David Winsemius Tue, 23 Jun 2009 14:11:57 -0700


On Jun 23, 2009, at 1:18 PM, Russell Ivory wrote:

I have a data set called datastep4  with 211484 rows and 95 columns

WHY ALL OF THE UNNEEDED EMPTY LINES???

dim(datastep4)


[1] 211484     95

The first few column names are given below, note the first one is
"RESPONDED"

names(datastep4)[1:5]


[1] "RESPONDED" "VAR_30"    "VAR_31"    "VAR_32"    "VAR_33"

A table of RESPONDED shows mostly zeros

table(datastep4$RESPONDED)


    0      1

210582    902

I reduce the data set by pulling out the RESPONDED column, then verify
all is well

test <- datastep4[,-datastep4$RESPONDED]

It may have "worked", but perhaps not for the reasons you thought it should. Take a careful look at this:


> str(data2)
'data.frame':   300 obs. of  5 variables:
 $ x1 : num  0.0592 0.3976 0.9512 0.675 0.7129 ...
 $ x2 : num  0.625 0.328 0.721 0.779 0.233 ...
 $ y  : num  0.685 0.694 1.589 1.461 0.921 ...
 $ grp: Factor w/ 3 levels "A","B","C": 1 1 1 1 1 1 1 1 1 1 ...
 $ one: num  1 1 1 1 1 1 1 1 1 1 ...

> str(data2[,-data2$one])
'data.frame':   300 obs. of  4 variables:
 $ x2 : num  0.625 0.328 0.721 0.779 0.233 ...
 $ y  : num  0.685 0.694 1.589 1.461 0.921 ...
 $ grp: Factor w/ 3 levels "A","B","C": 1 1 1 1 1 1 1 1 1 1 ...
 $ one: num  1 1 1 1 1 1 1 1 1 1 ...

Notice that the "one" column was _not_ removed.
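To see why, it may help to look at what the index expression actually evaluates to. The following is only a minimal sketch on a made-up data frame (the name `toy` and its columns are hypothetical, not the poster's data): negating a column of 1s gives a vector of -1s, and repeated -1s all mean "drop column 1".

```r
# Hypothetical toy data frame; names and values are illustrative only.
toy <- data.frame(x1  = c(0.1, 0.2, 0.3),
                  x2  = c(1.5, 2.5, 3.5),
                  one = c(1, 1, 1))

# The column index is built from the negated *values* of the column,
# not from the column's position or name.
idx <- -toy$one
print(idx)                       # one -1 per row

# Every -1 means "drop column 1", so x1 is removed and "one" survives.
by_values   <- toy[, idx]
by_position <- toy[, -1]
print(names(by_values))
print(identical(by_values, by_position))
```

The same reasoning would apply to a 0/1 column such as RESPONDED: the zeros are ignored and the -1s drop the first column, whatever that column happens to be.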

dim(test)


[1] 211484     94

names(test)[1:5]


[1] "VAR_30" "VAR_31" "VAR_32" "VAR_33" "VAR_34"

class(test)


[1] "data.frame"

test[1:10,1:10]

VAR_30 VAR_31 VAR_32 VAR_33 VAR_34 VAR_37 VAR_38 VAR_42 VAR_45 VAR_46


1       0      0      0      0  15198      0      0      6     NA

3       0      0      0      0   8491      0      0      4     NA

4       0      0      0      0      0      0      0      0     NA

5       0      0      0      0  67671      0      0      7     NA

7       0      0      0      0   1334      0      0      1     NA

9       0      0      0      0      0      0      0      2     NA

10      0      0      0      0  24169      0      0     10     NA

11      0      0      0      0    438      0      0      3     NA

12      0      0      0      0   2158      0      0      1     NA

13      0      0      0      0  18804      0      0      4     NA

If I reduce the data frame datastep4 by removing a few records where the

variable G102 is not 1, and removing the column named "G102" (which is
column 84),

I end up with a smaller set called datastep5 with 192701 rows and 94
columns

datastep5 <- datastep4[datastep4$G102 != 1,-84]

This code does the _opposite_ of what you stated. It keeps only those records where G102 is not equal to 1. (And if that is not an integer-type column, the results are even less predictable, since exact equality tests on floating-point values are unreliable.)
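A small sketch of the difference, using a hypothetical data frame `toy` with a made-up G102 column (the values, including the NA, are assumptions for illustration). It also shows a general R behaviour worth knowing here: a comparison against NA yields NA, and an NA in a logical row index produces an all-NA row unless the index is wrapped in which().

```r
# Hypothetical data; G102 values (including the NA) are made up.
toy <- data.frame(id = 1:6, G102 = c(1, 1, 2, NA, 1, 3))

# What the posted code does: keep rows where G102 is NOT 1.
not_one <- toy[toy$G102 != 1, ]

# What the description asks for: keep rows where G102 IS 1.
is_one <- toy[toy$G102 == 1, ]

# Both results contain an all-NA row, because the comparison with NA is NA.
print(not_one)
print(is_one)

# which() drops the NA positions, so only genuine matches are kept.
is_one_clean <- toy[which(toy$G102 == 1), ]
print(is_one_clean)

# For a non-integer column, a tolerance-based test is a safer comparison.
near_one <- toy[which(abs(toy$G102 - 1) < 1e-8), ]
print(identical(near_one, is_one_clean))
```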

dim(datastep5)


[1] 192701     94

names(datastep5)[1:5]


[1] "RESPONDED" "VAR_30"    "VAR_31"    "VAR_32"    "VAR_33"

table(datastep5$RESPONDED)


     0      1

141096    584

Now, if I want to reduce this data set by removing the RESPONDED column

as was done for datastep4, it blows up

test <- datastep5[,-datastep5$RESPONDED]

I am guessing that the first element of datastep5$RESPONDED is now a zero. You are abusing the indexing conventions. Try instead either:


test <- datastep5[,-1]

Or if you want to imagine that you cannot remember the column number of "RESPONDED" then this will "work":


test <- datastep5[ , -which(names(datastep5)=="RESPONDED")]

Error in .subset(x, j) : only 0's may be mixed with negative subscripts
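One possible reading of the numbers above, offered as an assumption rather than a diagnosis: the table for datastep5 adds up to 141096 + 584 = 141680, while dim() reports 192701 rows, which would be consistent with NA values in RESPONDED after the row subsetting. A negated NA is still NA, and an NA cannot be combined with negative subscripts. The sketch below reproduces that situation on a hypothetical data frame `toy` and shows a name-based way to drop the column that does not depend on the column's values at all.

```r
# Hypothetical data; the NA in RESPONDED is an assumption for illustration.
toy <- data.frame(RESPONDED = c(0, 1, NA, 0),
                  VAR_30    = c(5, 6, 7, 8),
                  VAR_31    = c(9, 10, 11, 12))

# With only 0s and 1s the negated values are 0s and -1s: this happens to work.
clean <- toy[c(1, 2, 4), ]
accidental <- clean[, -clean$RESPONDED]
print(names(accidental))

# With an NA present the index mixes NA with -1, and subsetting fails.
attempt <- tryCatch(toy[, -toy$RESPONDED], error = function(e) e)
print(inherits(attempt, "error"))

# Dropping by name avoids looking at the column's values altogether.
by_name <- toy[, setdiff(names(toy), "RESPONDED"), drop = FALSE]
print(names(by_name))
```

Selecting by name with setdiff() also behaves sensibly when the name is absent, whereas -which(...) on a name that matches nothing gives a zero-length index and selects no columns.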

Merrick Bank confidentiality trailer elided


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
