Hum...
This boils down to
> as.numeric("1.23e")
[1] 1.23
> as.numeric("1.23e-")
[1] 1.23
> as.numeric("1.23e+")
[1] 1.23
which in turn comes from this code in src/main/util.c (function R_strtod)
    if (*p == 'e' || *p == 'E') {
        int expsign = 1;
        switch(*++p) {
        case '-': expsign = -1;
        case '+': p++;
        default: ;
        }
        for (n = 0; *p >= '0' && *p <= '9'; p++)
            n = (n < MAX_EXPONENT_PREFIX) ? n * 10 + (*p - '0') : n;
        expn += expsign * n;
    }
which adds zero to the exponent, without signalling any error, even if the for
loop terminates immediately because no digit follows the 'e' (or its sign).
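
To make the difference concrete, here is a minimal sketch (not R's actual code)
of the exponent step as a standalone helper. The name parse_exponent, the
require_digits flag and the cap EXP_PREFIX_CAP are made up for illustration,
and the value 9999 for the cap is an assumption. With require_digits = 0 it
mimics the fragment above; with require_digits = 1 it backs off and leaves the
'e' unconsumed when no digit follows, roughly what a stricter parser would do.

```c
#include <stddef.h>

/* Assumed cap on the accumulated exponent; illustration only. */
#define EXP_PREFIX_CAP 9999

/* Hypothetical helper, not part of R. Parses an optional exponent part
   starting at p. Stores the signed exponent in *expn and returns a pointer
   just past the characters consumed. If require_digits is non-zero and no
   digit follows the 'e'/'E' (and optional sign), nothing is consumed. */
const char *parse_exponent(const char *p, int require_digits, int *expn)
{
    const char *start = p;
    int expsign = 1, n = 0, ndigits = 0;

    *expn = 0;
    if (*p != 'e' && *p != 'E') return start;   /* no exponent part at all */
    p++;
    if (*p == '-') { expsign = -1; p++; }
    else if (*p == '+') p++;

    for (; *p >= '0' && *p <= '9'; p++, ndigits++)
        n = (n < EXP_PREFIX_CAP) ? n * 10 + (*p - '0') : n;

    /* Strict mode: "e", "e-" and "e+" are not exponents, so back off. */
    if (require_digits && ndigits == 0) return start;

    *expn = expsign * n;
    return p;
}
```

For the tail "e-", the lenient mode consumes both characters and reports
exponent 0, whereas the strict mode consumes nothing, so a caller could treat
the leftover characters as trailing garbage.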
This might qualify as a bug, as it differs from the C function strtod which
accepts
"A sequence of digits, optionally containing a decimal-point character (.),
optionally followed by an exponent part (an e or E character followed by an
optional sign and a sequence of digits)."
[Of course, there would be nothing to stop e.g. "1433E1" from being converted
to numeric.]
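
For comparison, a small sketch of how the C library function itself treats
these strings. The helper name converts_fully is hypothetical; it only wraps
the standard strtod() and checks, via the end pointer, whether the whole
string was consumed. On a conforming C library in the default "C" locale I
would expect "1.23e", "1.23e-", "1.23e+" and "1433E" to be rejected by this
check (strtod stops before the 'e'), while "1433E1" goes through as 14330.

```c
#include <stdlib.h>

/* Hypothetical helper: returns 1 and stores the value in *out if strtod()
   consumes all of s as a number; returns 0 if nothing was converted or if
   unconsumed characters remain. */
int converts_fully(const char *s, double *out)
{
    char *end;
    double v = strtod(s, &end);

    if (end == s || *end != '\0') return 0;  /* empty match or trailing text */
    *out = v;
    return 1;
}
```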
-pd
> On 16 Apr 2024, at 12:46 , jing hua zhao <[email protected]> wrote:
>
> Dear R-developers,
>
> I came across a somewhat unexpected behaviour of read.csv() which is trivial
> but worth noting -- my data involves a protein named "1433E", but to save
> space I dropped the quotes, so it becomes,
>
> Gene,SNP,prot,log10p
> YWHAE,13:62129097_C_T,1433E,7.35
> YWHAE,4:72617557_T_TA,1433E,7.73
>
> Both read.csv() and readr::read_csv() treat the prot(ein) name as numeric
> 1433 (possibly confused by scientific notation), which I only noticed when I
> tried to combine data,
>
> all_data <- data.frame()
> for (protein in proteins[1:7])
> {
>    cat(protein,":\n")
>    f <- paste0(protein,".csv")
>    if(file.exists(f))
>    {
>      p <- read.csv(f)
>      print(p)
>      if(nrow(p)>0) all_data <- bind_rows(all_data,p)
>    }
> }
>
> proteins[1:7]
> [1] "1433B" "1433E" "1433F" "1433G" "1433S" "1433T" "1433Z"
>
> dplyr::bind_rows() failed to work due to incompatible types; nevertheless,
> rbind() went ahead without warnings.
>
> Best wishes,
>
>
> Jing Hua
>
> ______________________________________________
> [email protected] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: [email protected] Priv: [email protected]