Re: [Rd] On read.csv and write.csv

Taras Zakharko Thu, 01 Jul 2021 00:55:32 -0700

Stephen, 

I am sure one can find a lot of small issues and inconsistencies in R and 
its standard library. It has to support a lot of legacy cruft, and the design 
process (especially in the early days) focused on getting things done rather 
than delivering a standard library of immaculate quality. It is way too 
late to make dramatic changes unless you want to risk breaking existing 
software. That ship sailed decades ago.


Personally, I taught myself a while ago to always use explicit 
configuration when using built-in functions, and in the last couple of years I 
have completely replaced them with other packages (such as readr) that 
come with (arguably) saner defaults and better diagnostics. 
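
A minimal sketch of such explicit configuration, using base R only. The sample 
values and the temporary file are illustrative and not taken from the thread; 
the point is only that stating row.names on both sides makes the round trip 
predictable:

```r
# Illustrative data; fixed values instead of rnorm() so the result is reproducible.
D1 <- data.frame(A = letters[1:5], N = 1:5, Y = c(0.5, -1.2, 0.3, 2.1, -0.7))
tmp <- tempfile(fileext = ".csv")

# Defaults: write.csv writes row names under an empty header entry,
# and read.csv then reads them back as an ordinary column named "X".
write.csv(D1, tmp)
D1w <- read.csv(tmp)

# Option 1: do not write row names at all.
write.csv(D1, tmp, row.names = FALSE)
D1a <- read.csv(tmp)

# Option 2: keep the row names, and tell read.csv which column holds them.
write.csv(D1, tmp)
D1b <- read.csv(tmp, row.names = 1)

# Tidy up
unlink(tmp)
```

With either option the object read back has only the columns A, N and Y. 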

Best, 

Taras


> On 30 Jun 2021, at 23:15, Stephen Ellison <s.elli...@lgcgroup.com> wrote:
> 
> Apologies if this is a well-worn question; I haven’t found it so far but 
> there's a lot of r-dev and I may have missed it in the archives. In the 
> meantime:
> 
> I've managed to avoid writing csv files with R for a couple of decades but 
> we're swopping data with a collaborator and I've tripped over an 
> inconsistency between read.csv and write.csv that seems less than helpful.
> The default row-name behaviour for read.csv is to assume that, when the 
> number of items in the first row is one less than the number in the second, 
> the first column contains row names. write.csv, however, includes an 
> empty string ("") as the first header entry over the row names when writing. 
> On rereading, the original row names are then treated as data with an 
> unknown name, which is replaced by "X".
> 
> That means that, unlike read.table and write.table, something written with 
> write.csv is not read back correctly by read.csv.
> 
> Is that intentional?
> And whether it is intentional or not, is it wise?
> 
> Example:
> 
> ( D1 <- data.frame(A=letters[1:5], N=1:5, Y=rnorm(5) ) )
> write.csv(D1, "temp.csv")
> 
> ( D1w <- read.csv("temp.csv") )
> 
> # Note the unnecessary new X column ...
> #Tidy up
> unlink("temp.csv")
> 
> This differs from the parent .table defaults; write.table doesn’t add the 
> extra "" column label, so the object read back with read.table does not 
> contain an unwanted extra column.
> 
> Wouldn’t it be more sensible if write.csv() and read.csv() were consistent in 
> the same sense as read.table and write.table?
> Or at least if there were a switch (as.read.csv=TRUE ?) to tell write.csv to 
> omit the initial "", or vice versa?
> 
> Currently using R version 4.1.0 on Windows, but this reproduces at least as 
> far back as 3.6 
> 
> Steve E
> 
> 

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
