Hum... This boils down to

> as.numeric("1.23e")
[1] 1.23
> as.numeric("1.23e-")
[1] 1.23
> as.numeric("1.23e+")
[1] 1.23

which in turn comes from this code in src/main/util.c (function R_strtod)

    if (*p == 'e' || *p == 'E') {
        int expsign = 1;
        switch(*++p) {
        case '-': expsign = -1;
        case '+': p++;
        default: ;
        }
        for (n = 0; *p >= '0' && *p <= '9'; p++) n = (n < MAX_EXPONENT_PREFIX) ? n * 10 + (*p - '0') : n;
        expn += expsign * n;
    }

which leaves the exponent at zero even if the for loop terminates immediately, i.e. when no digits follow the 'e'.

This might qualify as a bug, as it differs from the C function strtod which accepts

"A sequence of digits, optionally containing a decimal-point character (.), optionally followed by an exponent part (an e or E character followed by an optional sign and a sequence of digits)."

[Of course, there would be nothing to stop e.g. "1433E1" from being converted to numeric.]

-pd

> On 16 Apr 2024, at 12:46 , jing hua zhao <jinghuaz...@hotmail.com> wrote:
>
> Dear R-developers,
>
> I came to a somewhat unexpected behaviour of read.csv() which is trivial but
> worthwhile to note -- my data involves a protein named "1433E" but to save
> space I drop the quote so it becomes,
>
> Gene,SNP,prot,log10p
> YWHAE,13:62129097_C_T,1433E,7.35
> YWHAE,4:72617557_T_TA,1433E,7.73
>
> Both read.csv() and readr::read_csv() consider prot(ein) name as (possibly
> confused by scientific notation) numeric 1433 which only alerts me when I
> tried to combine data,
>
> all_data <- data.frame()
> for (protein in proteins[1:7])
> {
>    cat(protein,":\n")
>    f <- paste0(protein,".csv")
>    if(file.exists(f))
>    {
>      p <- read.csv(f)
>      print(p)
>      if(nrow(p)>0) all_data <- bind_rows(all_data,p)
>    }
> }
>
> proteins[1:7]
> [1] "1433B" "1433E" "1433F" "1433G" "1433S" "1433T" "1433Z"
>
> dplyr::bind_rows() failed to work due to incompatible types nevertheless
> rbind() went ahead without warnings.
>
> Best wishes,
>
>
> Jing Hua
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd....@cbs.dk  Priv: pda...@gmail.com
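
To make the difference between the two behaviours concrete, here is a minimal sketch in C. It is not a patch to R: parse_exponent_lenient mirrors the logic of the R_strtod excerpt quoted above, while parse_exponent_strict is a hypothetical variant that follows the strtod rule and treats an 'e' without digits as not being an exponent at all. Both function names are made up for illustration, and EXP_CAP merely stands in for MAX_EXPONENT_PREFIX, whose value is assumed here.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed cap on the accumulated exponent; stands in for R's
 * MAX_EXPONENT_PREFIX (the actual value is an assumption). */
#define EXP_CAP 9999

/* Mirrors the quoted logic: once 'e'/'E' is seen it is always consumed,
 * together with an optional sign, even if no digits follow.
 * Returns the position after the consumed text; *expn gets the exponent. */
static const char *parse_exponent_lenient(const char *p, int *expn)
{
    int expsign = 1, n = 0;

    assert(p != NULL && expn != NULL);
    *expn = 0;
    if (*p == 'e' || *p == 'E') {
        switch (*++p) {
        case '-': expsign = -1; /* fall through */
        case '+': p++;
        default: ;
        }
        for (n = 0; *p >= '0' && *p <= '9'; p++)
            n = (n < EXP_CAP) ? n * 10 + (*p - '0') : n;
        *expn = expsign * n;
    }
    return p;
}

/* Hypothetical strict variant: an exponent is 'e'/'E', an optional sign,
 * and at least one digit. If no digit is found, nothing is consumed and
 * p is returned unchanged, so the 'e' is left for the caller to see. */
static const char *parse_exponent_strict(const char *p, int *expn)
{
    const char *q = p;
    int expsign = 1, n = 0;

    assert(p != NULL && expn != NULL);
    *expn = 0;
    if (*q != 'e' && *q != 'E')
        return p;
    q++;
    if (*q == '-') { expsign = -1; q++; }
    else if (*q == '+') q++;
    if (*q < '0' || *q > '9')
        return p;               /* no digits: not an exponent part */
    for (; *q >= '0' && *q <= '9'; q++)
        n = (n < EXP_CAP) ? n * 10 + (*q - '0') : n;
    *expn = expsign * n;
    return q;
}
```

With the strict variant, parsing "1433E" would stop after "1433" with the 'E' left over, so a caller that requires the whole field to be consumed could treat it as non-numeric, whereas "1433E1" would still be accepted as a number.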