Hi,

I am having trouble with a large dataset that I am importing from SPSS.
I have to merge two datasets (which seems to work OK) and then select
rows based on their attributes. One column contains either blank cells,
B or E, and I want to select all rows with E. Other columns hold
numerical data, which I will then analyse.

data[column==" E"] does not work. I use " E" rather than "E" because
levels(column) returns " " " " " B" " E".

Any help on what I am doing wrong is much appreciated. I'm getting  
quite stressed as I have 10 files with approx 100,000 records in each  
to analyse so manipulating data becomes a pain.

Here is the code, though I am not sure it makes much sense without
seeing the dataset:


library(foreign)  # provides read.spss()
library(gplots)   # provides plotmeans()
chaff<-read.spss("/Users/Kat/Desktop/papers in progress/btopaper/edited BTO data/fatnewchaff.sav", to.data.frame=TRUE)
chafffat<-read.spss("/Users/Kat/Desktop/papers in progress/btopaper/edited BTO data/fatmethods.sav")
chaffmerge2<-merge(chaff, chafffat, by.x=c("RINGNO", "FAT", "FATMTD"),
                   by.y=c("RINGNO", "FAT", "FATMTD"), all=TRUE)
attach(chaffmerge2)
chaffhabfactor<-factor(chaffmerge2$HYBRID_A)
levels(chaffhabfactor)
Echaff<-chaffmerge2[FATMTD==" E",]
attach(Echaff)
names(Echaff)
plotmeans(Echaff$FAT~Echaff$HYBRID_A)
chaffFat<-factor(Echaff$FAT)
levels(chaffFat)
chaffzeros<-table(chaffFat, Echaff$HYBRID_A)
chaffzeros

****
chaffFat    1    2    3    4    5
       0  261  354  345 1003  235
       1   38   23   17    6    2
       2   19    0    4    2    0
       3    7    0    1    0    1
       4    2    0    0    0    0
       5  145   34  123  100   60
       8    0    0    0    0    0
      10  202  141  248  279  101
      15   73   12   79   51    9
      20   84   60   64  133   19
      25   14    6   20   22    3
      30   30   25   22   54   13
      35    3    0    7    4    4
      40    7   10    2   12    5
      45    2    0    3    1    0
      50    1    0    0    2    1
      60    0    1    0    1    1
****
The chaffFat values 1, 2, 3, 4 and 5 above correspond to "B", which
should have been removed!
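
To make the intended selection concrete, here is a minimal sketch on
made-up data. The data frame toy and all of its values are hypothetical;
only the column names RINGNO, FATMTD and FAT are borrowed from the code
above. It assumes the levels really are padded with a single leading
space, as levels() suggests, and it indexes the data frame explicitly
instead of relying on attach(). The trimws() variant is an alternative
in case the padding differs between files.

```r
# Made-up data: FATMTD is a factor whose levels carry a leading space,
# mimicking what levels() reports for the real column.
toy <- data.frame(
  RINGNO = c("A1", "A2", "A3", "A4", "A5"),
  FATMTD = factor(c(" E", " B", " ", " E", " B")),
  FAT    = c(0, 3, 1, 10, 2)
)

# Refer to the column through the data frame (toy$FATMTD) and keep the
# trailing comma so that rows, not columns, are selected.
e_rows <- toy[toy$FATMTD == " E", ]

# Alternative: strip surrounding whitespace before comparing, so the
# test does not depend on the exact padding.
e_rows_trim <- toy[trimws(as.character(toy$FATMTD)) == "E", ]

# Drop factor levels that no longer occur in the subset.
e_rows <- droplevels(e_rows)
print(e_rows)
```

If the same pattern on the merged data still returns rows that should
be "B", comparing table(chaffmerge2$FATMTD) before and after the subset
may help show where they come from.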


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
