Suppose I have the following data sitting in a fixed-width file, 'foo.txt'. I
would like to ask the group how to properly read the value "1e-20" in this
pseudo-data using the read_fwf function in the readr package.
11e-201043
1712201043
1912201055
First, suppose I do it this way, where in this case "D" is used for double
precision.
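library(readr)
myFile <- 'foo.txt'
# Columns occupy positions 1, 2-6 and 7-10
pos <- fwf_positions(c(1,2,7), c(1,6,10))
type <- c('N','D','N')
types <- paste0(type, collapse = '')
# Lower-case the codes to get readr's compact col_types string
types <- chartr('NCD', 'ncd', types)
read_fwf(file = myFile, col_positions = pos, col_types = types)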
# A tibble: 3 x 3
     X1       X2    X3
  <dbl>    <dbl> <dbl>
1     1 1.00e-20  1043
2     1 7.12e+ 4  1043
3     1 9.12e+ 4  1055
This seems to work and properly captures the value. However, suppose I instead
tell the function that *all* of my columns are numeric, by replacing the
'type' line above with this one:
type <- c('N','N','N')
# A tibble: 3 x 3
     X1    X2    X3
  <dbl> <dbl> <dbl>
1     1     1  1043
2     1 71220  1043
3     1 91220  1055
This read is not correct: the value 1e-20 in X2 comes back as 1. Here is the
pragmatic issue. I have a legacy program that spits out the layout structure of
the fwf file (start and end positions) and also indicates the column types. The
layout file we receive always uses a column type of numeric (N) for every
numeric column (including the column holding values such as 1e-20).
This layout file will not change, so I need to solve the problem within my
read-in program. I suppose one option is to manually change any values of "N"
to "D" in my R code. That seems to work, but I am not sure whether it is the
"right" way to solve this issue.
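To make the workaround concrete, here is a minimal sketch of what I mean by changing "N" to "D" in code. The helper name layout_to_col_types and the temporary file are only illustrative assumptions, not part of the legacy layout program. The sketch assumes that treating every "N" column as a double is acceptable for this data.

```r
# Hypothetical helper: translate the layout file's type codes into readr's
# compact col_types string. 'N' is mapped to 'd' (col_double) rather than
# 'n' (col_number), so that values such as "1e-20" are parsed as doubles.
layout_to_col_types <- function(type) {
  paste0(chartr("NCD", "dcd", type), collapse = "")
}

types <- layout_to_col_types(c("N", "N", "N"))  # "ddd"

# Recreate the three sample records in a temporary file (stand-in for foo.txt).
foo <- tempfile(fileext = ".txt")
writeLines(c("11e-201043", "1712201043", "1912201055"), foo)

# Only run the read if readr is installed.
if (requireNamespace("readr", quietly = TRUE)) {
  pos <- readr::fwf_positions(c(1, 2, 7), c(1, 6, 10))
  dat <- readr::read_fwf(file = foo, col_positions = pos, col_types = types)
  print(dat)
}
```

Doing the translation in one place means the layout file itself stays untouched, and any "C" columns would still be read as character.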
Thanks
Harold