My bad, Bert 😉

My point is that my function/framework makes very minimal assumptions about the 
source data (mostly, that it is a rectangular table of values separated by some 
separator) and has no a priori knowledge of what the first, second, etc. 
columns in the data files must contain. So while it would be possible to pass 
down a vector of classes to be used as the colClasses argument to read.table, 
it is not necessarily reasonable in the context of the overall framework.
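
One way to avoid per-column knowledge, sketched here only as an illustration: 
colClasses = "character" is recycled across all columns, so every field can be 
read as a string first and converted afterwards. The helper name read_keep_nan 
is hypothetical and not part of the framework discussed in this thread; the 
rule "leave a column as character if it contains the literal string NaN" is an 
assumption about what the framework might want.

```r
# Hypothetical helper (an assumption, not from the thread): read every column
# as character, then convert only columns that contain no literal "NaN".
read_keep_nan <- function(...) {
  # A single "character" class is recycled over all columns, so no
  # knowledge of the column layout is needed.
  raw <- read.table(..., colClasses = "character", na.strings = "")
  raw[] <- lapply(raw, function(col) {
    if (any(col == "NaN", na.rm = TRUE)) {
      col                               # keep "NaN" as a string
    } else {
      type.convert(col, as.is = TRUE)   # usual type guessing otherwise
    }
  })
  raw
}

tmp <- read_keep_nan(text = "A,B\n1,NaN\nNA,2", header = TRUE, sep = ",")
str(tmp)
```

With this sketch, column A is still converted to a number type (with "NA" 
becoming a missing value at the type.convert step), while column B stays 
character because it contains "NaN".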

I guess I was surprised that read.table interprets NaN in an input file as the 
internal "Not a number" value rather than as a string; there is nothing in 
?read.table about that.
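
As background, and assuming the documented behavior of base R: when colClasses 
is not specified, read.table reads the columns as character and then converts 
them with type.convert, and it is the ?type.convert help page that lists NaN 
among the strings treated as numeric. A minimal sketch of that conversion step 
on its own:

```r
# type.convert() is the base R function that guesses column types.
# The string "NaN" is treated as a numeric constant, not as text.
x <- type.convert(c("NaN", "2"), as.is = TRUE)
class(x)    # "numeric"
is.nan(x)   # TRUE FALSE

# Changing na.strings does not alter this: "NaN" is still parsed as a number.
y <- type.convert(c("NaN", "2"), na.strings = "", as.is = TRUE)
class(y)    # "numeric"
```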

Anyways, as I said, I need to think more about this in the context of the 
framework where this function operates...

Thanks for the input


________________________________
From: Bert Gunter <bgunter.4...@gmail.com>
Sent: Thursday, October 24, 2019 10:39
To: Sebastien Bihorel <sebastien.biho...@cognigencorp.com>
Cc: r-help@r-project.org <r-help@r-project.org>
Subject: Re: [R] read.table and NaN

Not so. Read ?read.table carefully. You can use "NA" as a default. Moreover, 
you **specified** that you want NaN read as character, which means that any 
column containing NaN **must** be character. That's part of the specification 
for data frames (each column must be of a single data type). So either change 
your specification or change your data structure.

And, incidentally, my first name is "Bert" .

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and 
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Thu, Oct 24, 2019 at 6:43 AM Sebastien Bihorel 
<sebastien.biho...@cognigencorp.com> wrote:
Thanks Gunter

It seems that one has to know the structure of the data and adapt the 
read.table call accordingly. I am working on a framework that is meant to 
process data files with unknown structure, so I have to think a bit more about 
that...
________________________________
From: Bert Gunter <bgunter.4...@gmail.com>
Sent: Thursday, October 24, 2019 00:08
To: Sebastien Bihorel <sebastien.biho...@cognigencorp.com>
Cc: r-help@r-project.org <r-help@r-project.org>
Subject: Re: [R] read.table and NaN

Like this?

> con <- textConnection(object = 'A,B\n1,NaN\nNA,2')
> tmp <- read.table(con, header = TRUE, sep = ',', na.strings = '',
+                   stringsAsFactors = FALSE,
+                   colClasses = c("numeric", "character"))
> close.connection(con)
> tmp
   A   B
1  1 NaN
2 NA   2
> class(tmp[,1])
[1] "numeric"
> class(tmp[,2])
[1] "character"
> tmp[,2]
[1] "NaN" "2"


Bert Gunter

"The trouble with having an open mind is that people keep coming along and 
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Wed, Oct 23, 2019 at 6:31 PM Sebastien Bihorel via R-help 
<r-help@r-project.org> wrote:
Hi,

Is there a way to make read.table consider NaN as a string of characters rather 
than the internal NaN? Changing the na.strings argument does not seem to have 
any effect on how R interprets the NaN string (while it does affect the NA 
string).

con <- textConnection(object = 'A,B\n1,NaN\nNA,2')
tmp <- read.table(con, header = TRUE, sep = ',', na.strings = '',
                  stringsAsFactors = FALSE)
close.connection(con)
tmp
class(tmp[,1])
class(tmp[,2])


______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
