On 25/05/2011 9:20 a.m., Lutz Fischer wrote:
Hi,
I have a bit of a problem with as.numeric or as.double.
I read in an excel-file (either xlsx::read.xlsx2 or gdata::read.xls).
Select a subset and then try to make it numeric:
# read in the excel-file
alldata<-read.xlsx2("input.xls",1)
# select the subset
s<-subset(alldata, select=c("cI","cII","cIII","cIV","cV"))
# unluckily we have "n/a" for missing values in the file - so we turn it
into "proper" missing values
s[s == "n/a"]<-NA
n<-data.matrix(s);
The problem I have is that it does not convert the date the way I would
expect.
just as an example:
> s[1,2]
[1] 30.94346629
3136 Levels: 0.026307482 0.028239812 0.02849896 0.029054564 0.029540352
0.030248034 0.030841352 0.032966308 ... n/a
turned into:
> n[1,2]
[1] 3020
And I would like to have there 30.94346629 as well. I assume that has to
do with the "Levels" attribute - but not sure what to make of these in
the first place.
I also tried to convert each value on its own:
#make some space that holds the actual numeric data
n <- array(dim=c(length(s[,1]),length(s)))
# now turn everything into doubles
for (c in 1:length(s)) {
for (r in 1:length(s[,1])) {
n[r,c]<-as.double(s[r,c])
}
}
but that gave the same result - just a lot slower.
Thanks
Lutz
Your problem is the conversion to factors when the data is read. Use
options(stringsAsFactors = FALSE)
before you read the data, then the mixed columns of numeric and missing
will be read as character data and the conversion to numeric will go as
you expect. (But I haven't tested this.)
David Scott
--
_________________________________________________________________
David Scott Department of Statistics
The University of Auckland, PB 92019
Auckland 1142, NEW ZEALAND
Phone: +64 9 923 5055, or +64 9 373 7599 ext 85055
Email: d.sc...@auckland.ac.nz, Fax: +64 9 373 7018
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.