Re: [Rd] S3 dispatch does not work for generics defined inside an environment
Hi Greg,

That was my original plan as well, but managing and deploying dozens of little packages that are all under active development is a nightmare even with devtools. Just too much overhead, not to mention that coming up with names that would not have namespace conflicts was getting silly.

In the end, I wrote a package that implements lightweight Python-like modules for R, and that has really improved my workflow. I hope to publish this package later this year after I have cleaned it up a bit.

Thanks,
Taras

> On 1 Jul 2021, at 05:55, Greg Minshall wrote:
>
> Taras,
>
>> P.S. If you are wondering what I am trying to achieve here -- we have a
>> very large codebase and I am trying to use environments as a type of
>> “poor man’s namespaces” to organize code in a modular fashion. But of
>> course it’s all pointless if I can’t get the generics to work
>> reliably.
>
> i'm not knowledgeable about S3. but, a different way to try to
> modularize large code bases is to split them into separate packages.
> just in case you hadn't already thought about, and rejected, that idea.
>
> cheers, Greg

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] On read.csv and write.csv
Stephen,

I am sure one can find a lot of small issues and inconsistencies with R and its standard library. It has to support a lot of legacy cruft, and the design process, especially in the early days, focused on getting things done rather than delivering a standard library of immaculate quality. And it is way too late to make dramatic changes unless you want to risk breaking existing software. That ship sailed decades ago.

Personally, I taught myself a while ago to always use explicit configuration when using built-in functions, and in the last couple of years I have completely replaced them with other packages (such as readr) that come with (arguably) saner defaults and better diagnostics.

Best,
Taras

> On 30 Jun 2021, at 23:15, Stephen Ellison wrote:
>
> Apologies if this is a well-worn question; I haven’t found it so far but
> there's a lot of r-dev and I may have missed it in the archives. In the mean
> time:
>
> I've managed to avoid writing csv files with R for a couple of decades but
> we're swopping data with a collaborator and I've tripped over an
> inconsistency between read.csv and write.csv that seems less than helpful.
> The default line number behaviour for read.csv is to assume that, when the
> number of items in the first row is one less than the number in the second,
> that the first column contains row names. write.csv, however, includes an
> empty string ("") as the first header entry over row names when writing. On
> rereading, the original row names are then treated as data with unknown name,
> replaced by "X".
>
> That means that, unlike read.table and write.table, something written with
> write.csv is not read back correctly by read.csv .
>
> Is that intentional?
> And whether it is intentional or not, is it wise?
>
> Example:
>
> ( D1 <- data.frame(A=letters[1:5], N=1:5, Y=rnorm(5) ) )
> write.csv(D1, "temp.csv")
>
> ( D1w <- read.csv("temp.csv") )
>
> # Note the unnecessary new X column ...
>
> #Tidy up
> unlink("temp.csv")
>
> This differs from the parent .table defaults; write.table doesn’t add the
> extra "" column label, so the object read back with read.table does not
> contain an unwanted extra column.
>
> Wouldn’t it be more sensible if write.csv() and read.csv() were consistent in
> the same sense as read.table and write.table?
> Or at least if there were a switch (as.read.csv=TRUE ?) to tell write.csv to
> omit the initial "", or vice versa?
>
> Currently using R version 4.1.0 on Windows, but this reproduces at least as
> far back as 3.6
>
> Steve E
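To make the "explicit configuration" habit mentioned in this reply concrete, here is a minimal base R sketch. The data frame `d1`, its values, and the temporary file are made up for illustration, and the particular argument choices are only one reasonable option, not something prescribed in the thread.

```r
# Hypothetical example data; not taken from the thread.
d1 <- data.frame(A = letters[1:5], N = 1:5,
                 Y = c(0.5, -1.25, 2, 3.75, -0.125),
                 stringsAsFactors = FALSE)

tmp <- tempfile(fileext = ".csv")

# State the row-name handling explicitly: no row names column is written,
# so the header has exactly one field per data column.
write.csv(d1, tmp, row.names = FALSE)

# Read back with the matching settings spelled out rather than implied.
d1r <- read.csv(tmp, header = TRUE, stringsAsFactors = FALSE)

names(d1r)                    # column names of the re-read data
isTRUE(all.equal(d1, d1r))    # compare the round-tripped data with the original

unlink(tmp)
```

This only helps when the row names carry no information; if they do, they have to be written and then read back with a matching `row.names` setting.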
Re: [Rd] S3 dispatch does not work for generics defined inside an environment
Taras,

> That was my original plan as well, but managing and deploying dozens
> of little packages that are all under active development is a
> nightmare even with devtools. Just too much overhead, not to mention
> that coming up with names that would not have namespace conflicts was
> getting silly.

i can imagine.

> In the end, I wrote a package that implements lightweight python-like
> modules for R and that has really improved my workflow. I hope to
> publish this package later this year after I have cleaned it up a bit.

cool -- good luck with it.

Greg
Re: [Rd] S3 dispatch does not work for generics defined inside an environment
> In the end, I wrote a package that implements lightweight python-like
> modules for R and that has really improved my workflow. I hope to publish
> this package later this year after I have cleaned it up a bit.

Hi, are you aware of the previous work in this direction

https://github.com/klmr/box

and

https://github.com/wahani/modules

Johannes
Re: [Rd] S3 dispatch does not work for generics defined inside an environment
Thanks Johannes,

I was aware of the modules package (it was not suitable for my needs unfortunately), but I did not know about box… somehow I managed to completely miss it in my search (embarrassing, really).

My own package offers similar functionality to box, but is designed to closely follow the behavior of regular R packages and relies heavily on Roxygen2 to manage imports and exports. The idea is that you can convert a module to an R package (or vice versa) with no additional effort. Basically, where box offers an opinionated implementation of modules, my design aims to be as “boring” as possible. You just write the code as you usually would for a package and then use a single new function to load it in a modular fashion.

At any rate, I am excited that there is work in this area and I am looking forward to further exploring the design space. Maybe one day these efforts will culminate in an official R module system.

Thanks,
Taras

> On 1 Jul 2021, at 11:42, Johannes Ranke wrote:
>
>> In the end, I wrote a package that implements lightweight python-like
>> modules for R and that has really improved my workflow. I hope to publish
>> this package later this year after I have cleaned it up a bit.
>
> Hi, are you aware of the previous work in this direction
>
> https://github.com/klmr/box
>
> and
>
> https://github.com/wahani/modules
>
> Johannes
Re: [Rd] [External] SET_COMPLEX_ELT and SET_RAW_ELT missing from Rinternals.h
Thanks!

So what would be the prescribed way of assigning elements to a CPLXSXP if I needed to?

One way I see is to do what most of the code inside the interpreter does and grab the vector's data pointer:

    COMPLEX(sexp)[index] = value;
    COMPLEX0(sexp)[index] = value;

This will materialize an ALTREP CPLXSXP though, so maybe the best way would be to mirror what SET_COMPLEX_ELT does in Rinlinedfuns.h?

    if (ALTREP(sexp)) ALTCOMPLEX_SET_ELT(sexp, index, value);
    else COMPLEX0(sexp)[index] = value;

This seems better, but it's not used in the interpreter anywhere as far as I can tell, presumably because the setter interface is not complete, as you point out. But should I be avoiding this second approach for some reason?

k

On Tue, Jun 29, 2021 at 4:06 AM wrote:

> The setter interface for atomic types is not yet implemented. It may
> be some day.
>
> Best,
>
> luke
>
> On Fri, 25 Jun 2021, Konrad Siek wrote:
>
> > Hello,
> >
> > I am working on a package that works with various types of R vectors,
> > implemented in C. My code has a lot of SET_*_ELT operations in it for
> > various types of vectors, including for CPLXSXPs and RAWSXPs.
> >
> > I noticed SET_COMPLEX_ELT and SET_RAW_ELT are defined in Rinlinedfuns.h but
> > not declared in Rinternals.h, so they cannot be used in packages. I was
> > going to re-implement them or extern them in my package, however,
> > interestingly, ALTCOMPLEX_SET_ELT and ALTRAW_SET_ELT are both declared in
> > Rinternals.h, making me think SET_COMPLEX_ELT and SET_RAW_ELT could be
> > purposefully obscured. Otherwise it may just be an oversight and I should
> > bring it to someone's attention anyway.
> >
> > I have three questions that I hope R-devel could help me with.
> >
> > 1. Is this an oversight, or are SET_COMPLEX_ELT and SET_RAW_ELT not exposed
> > on purpose?
> > 2. If they are not exposed on purpose, I was wondering why.
> > 3. More importantly, what would be good ways to set elements of these
> > vectors while playing nice with ALTREP and avoiding whatever pitfalls
> > caused these functions to be obscured in the first place?
> >
> > Best regards,
> > Konrad,
>
> --
> Luke Tierney
> Ralph E. Wareham Professor of Mathematical Sciences
> University of Iowa                  Phone: 319-335-3386
> Department of Statistics and        Fax:   319-335-3017
>    Actuarial Science
> 241 Schaeffer Hall                  email: luke-tier...@uiowa.edu
> Iowa City, IA 52242                 WWW:   http://www.stat.uiowa.edu
Re: [Rd] On read.csv and write.csv
> the "unhelpful" column are the row names. They are considered an
> important part of a data frame and therefore the default (row.names =
> TRUE) is to not lose them (as there is no way back once you do). If you don't
> want to preserve the row names you can simply set row.names=FALSE.

Please run the reproducible example provided.

When you do, you will see that write.csv writes an unnecessary empty header field ("") over the row names column. This makes the number of header fields equal to the number of columns _including_ row names. That causes the original row names to be read as data by read.csv, following the rule that the number of header fields determines whether row names are present. read.csv accordingly assumes that the former row names are unnamed data, calls the unnamed row names column "X" (or X.1 etc if X exists) and then adds new, default, row names _instead of the original row names written by write.csv_.

That's not helpful.

By contrast read.table correctly reads the first entry in each row as a row name when the number of header fields is one less than the number of data columns. write.table includes row names as row names _without a header field_, so a file written with write.table is correctly formatted for read.table to interpret the first data field as a row name.

I think it would be more sensible if write.csv did the same as write.table when row.names=TRUE - as it is, by default.

***
This email and any attachments are confidential. Any use, copying or disclosure other than by the intended recipient is unauthorised. If you have received this message in error, please notify the sender immediately via +44(0)20 8943 7000 or notify postmas...@lgcgroup.com and delete this message and any copies from your computer and network. LGC Limited. Registered in England 2991879. Registered office: Queens Road, Teddington, Middlesex, TW11 0LY, UK
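The header-field rule described in this message can be checked directly by looking at the files the two writers produce. The following is a small sketch under stated assumptions: the data frame `d`, its row names, and the helper `n_fields` are invented for the illustration, and the field count uses a naive split that is only adequate because the example values contain no separators.

```r
# Hypothetical data with meaningful row names.
d <- data.frame(A = c("a", "b", "c"), N = 1:3,
                row.names = c("r1", "r2", "r3"),
                stringsAsFactors = FALSE)

f_tab <- tempfile(fileext = ".txt")
f_csv <- tempfile(fileext = ".csv")
write.table(d, f_tab)   # default settings: no header field over the row names
write.csv(d, f_csv)     # default settings: header starts with an empty "" field

tab_lines <- readLines(f_tab)
csv_lines <- readLines(f_csv)

# Naive field counter (illustrative helper, fine for this simple data only).
n_fields <- function(line, sep) length(strsplit(line, sep, fixed = TRUE)[[1]])

tab_counts <- c(header = n_fields(tab_lines[1], " "),
                row    = n_fields(tab_lines[2], " "))
csv_counts <- c(header = n_fields(csv_lines[1], ","),
                row    = n_fields(csv_lines[2], ","))

# Read both files back with default arguments.
back_tab <- read.table(f_tab)
back_csv <- read.csv(f_csv)

rownames(back_tab)   # original row names are recovered
names(back_csv)      # former row names now appear as a data column

unlink(c(f_tab, f_csv))
```

In the write.table file the header is one field shorter than the data rows, which is what read.table uses to detect row names; in the write.csv file the counts are equal, so read.csv treats the first column as data.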
Re: [Rd] On read.csv and write.csv
On Thu, Jul 1, 2021 at 1:46 PM Stephen Ellison wrote:

> Please run the reproducible example provided.
> When you do, you will see that write.csv writes an unnecessary empty
> header field ("") over the row names column. This makes the number of
> header fields equal to the number of columns _including_ row names. That
> causes the original row names to be read as data by read.csv, following the
> rule that the number of header fields determines whether row names are
> present. read.csv accordingly assumes that the former row names are
> unnamed data, calls the unnamed row names column "X" (or X.1 etc if X
> exists) and then adds new, default, row names _instead of the original row
> names written by write.csv_.
> That's not helpful.

This depends on if you are reading the csv via R or something else, I would imagine. It not being "valid" CSV at all would likely cause some programs to choke entirely, I expect. I admit that's conjecture though, I don't have data on that one way or another.

~G
Re: [Rd] On read.csv and write.csv
Dear Gabriel,

On 2021-07-01 6:29 p.m., Gabriel Becker wrote:
> On Thu, Jul 1, 2021 at 1:46 PM Stephen Ellison wrote:
>
>> Please run the reproducible example provided.
>> When you do, you will see that write.csv writes an unnecessary empty
>> header field ("") over the row names column. This makes the number of
>> header fields equal to the number of columns _including_ row names. That
>> causes the original row names to be read as data by read.csv, following the
>> rule that the number of header fields determines whether row names are
>> present. read.csv accordingly assumes that the former row names are
>> unnamed data, calls the unnamed row names column "X" (or X.1 etc if X
>> exists) and then adds new, default, row names _instead of the original row
>> names written by write.csv_.
>> That's not helpful.
>
> This depends on if you are reading the csv via R or something else, I would
> imagine. It not being "valid" CSV at all would likely cause some programs to
> choke entirely, I expect. I admit that's conjecture though, I don't have data
> on that one way or another.

On Excel, for example, opening a .csv file without the empty initial field in the first line will cause the column names to be misaligned. As others have pointed out, .csv files are meant as a sort of least-common-denominator of data exchange, and so following the standard is probably a good idea.

Best,
John

John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/

> ~G
Re: [Rd] On read.csv and write.csv
Just for completeness, all this is well documented:

    CSV files:

    By default there is no column name for a column of row names. If
    ‘col.names = NA’ and ‘row.names = TRUE’ a blank column name is
    added, which is the convention used for CSV files to be read by
    spreadsheets. Note that such CSV files can be read in R by

        read.csv(file = "<filename>", row.names = 1)

Cheers,
Simon

> On 2/07/2021, at 10:29 AM, Gabriel Becker wrote:
>
> On Thu, Jul 1, 2021 at 1:46 PM Stephen Ellison wrote:
>
>> Please run the reproducible example provided.
>> When you do, you will see that write.csv writes an unnecessary empty header
>> field ("") over the row names column. This makes the number of header fields
>> equal to the number of columns _including_ row names. That causes the
>> original row names to be read as data by read.csv, following the rule that
>> the number of header fields determines whether row names are present.
>> read.csv accordingly assumes that the former row names are unnamed data,
>> calls the unnamed row names column "X" (or X.1 etc if X exists) and then adds
>> new, default, row names _instead of the original row names written by
>> write.csv_.
>> That's not helpful.
>
> This depends on if you are reading the csv via R or something else, I would
> imagine. It not being "valid" CSV at all would likely cause some programs to
> choke entirely, I expect. I admit that's conjecture though, I don't have data
> on that one way or another.
>
> ~G
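The documented approach quoted in this message, reading the file back with `row.names = 1`, can be sketched as follows. The data frame `d` and the temporary file are hypothetical and exist only to show the round trip.

```r
# Hypothetical data with meaningful row names.
d <- data.frame(A = c("a", "b", "c"), N = 1:3,
                row.names = c("r1", "r2", "r3"),
                stringsAsFactors = FALSE)

f <- tempfile(fileext = ".csv")

# Default write.csv(): row names are written under a blank column name.
write.csv(d, f)

# Tell read.csv() that the first column holds the row names.
back <- read.csv(f, row.names = 1, stringsAsFactors = FALSE)

rownames(back)   # row names of the re-read data
names(back)      # column names of the re-read data

unlink(f)
```

With `row.names = 1` the blank-named first column is consumed as row names instead of becoming a data column, so the file stays spreadsheet-friendly and still round-trips in R.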