date:20210701

Re: [Rd] S3 dispatch does not work for generics defined inside an environment

2021-07-01 Thread Taras Zakharko

Hi Greg, 

That was my original plan as well, but managing and deploying dozens of little 
packages that are all under active development is a nightmare even with 
devtools. Just too much overhead, not to mention that coming up with names that 
would not have namespace conflicts was getting silly. 

In the end, I wrote a package that implements lightweight python-like modules 
for R and that has really improved my workflow. I hope to publish this package 
later this year after I have cleaned it up a bit. 

Thanks, 

Taras

> On 1 Jul 2021, at 05:55, Greg Minshall  wrote:
> 
> Taras,
> 
>> P.S. If you are wondering what I am trying to achieve here — we have a
>> very large codebase and I am trying to use environments as a type of
>> “poor man’s namespaces” to organize code in a modular fashion. But of
>> course it’s all pointless if I can’t get the generics to work
>> reliably.
> 
> i'm not knowledgeable about S3.  but, a different way to try to
> modularize large code bases is to split them into separate packages.
> just in case you hadn't already thought about, and rejected, that idea.
> 
> cheers, Greg
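
For readers without the start of the thread, here is a minimal sketch of the situation named in the subject line. It is not the original poster's code: the names `mod`, `describe`, and `widget` are hypothetical. It assumes that S3 dispatch looks for methods in the calling environment and then in the registry of the environment where the generic was defined, so a method that only lives inside the environment is not found until it is registered with base R's `registerS3method()`.

```r
# Hypothetical "module": an environment holding a generic and one method.
mod <- local({
  describe <- function(x) UseMethod("describe")
  describe.widget <- function(x) "a widget"
  environment()
})

w <- structure(list(), class = "widget")

# Called from outside the environment, dispatch does not see the method.
before <- tryCatch(mod$describe(w), error = function(e) "no applicable method")

# Register the method in the S3 registry of the generic's environment.
registerS3method("describe", "widget", mod$describe.widget, envir = mod)
after <- mod$describe(w)
```

Whether explicit registration is acceptable for a large code base is a separate question; the sketch only shows one way to make dispatch find the method.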

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] On read.csv and write.csv

2021-07-01 Thread Taras Zakharko

Stephen, 

I am sure one can find a lot of small issues and inconsistencies in R and 
its standard library. It has to support a lot of legacy cruft, and the design 
process, especially in the early days, focused on getting things done rather 
than delivering a standard library of immaculate quality. And it is way too 
late to make dramatic changes unless you want to risk breaking existing 
software. That ship sailed decades ago. 

Personally, I taught myself a while ago to always use explicit 
configuration when using built-in functions, and in the last couple of years I 
have completely replaced them with other packages (such as readr) that 
come with (arguably) more sane defaults and better diagnostics. 
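
As an illustration of what "explicit configuration" could look like with the base functions, here is a minimal sketch. The data frame `d` and the file name are made up for the example, and the chosen options are only one reasonable set, not a recommendation from this thread.

```r
# Small example data frame (hypothetical data).
d <- data.frame(A = c("a", "b", "c"), N = 1:3)

f <- tempfile(fileext = ".csv")

# Spell out the options instead of relying on the defaults:
# no row names on writing, header and string handling stated on reading.
write.csv(d, f, row.names = FALSE)
back <- read.csv(f, header = TRUE, stringsAsFactors = FALSE)

unlink(f)
```

With the row names switched off on the writing side, the file has one header field per data column, so the defaults on the reading side no longer matter for this case.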

Best, 

Taras


> On 30 Jun 2021, at 23:15, Stephen Ellison  wrote:
> 
> Apologies if this is a well-worn question; I haven’t found it so far but 
> there's a lot of r-dev and I may have missed it in the archives. In the mean 
> time:
> 
> I've managed to avoid writing csv files with R for a couple of decades but 
> we're swopping data with a collaborator and I've tripped over an 
> inconsistency between read.csv and write.csv that seems less than helpful.
> The default line number behaviour for read.csv is to assume, when the 
> number of items in the first row is one less than the number in the second, 
> that the first column contains row names. write.csv, however, includes an 
> empty string ("") as the first header entry over row names when writing. On 
> rereading, the original row names are then treated as data with unknown name, 
> replaced by "X".
> 
> That means that, unlike read.table and write.table,  something written with 
> write.csv is not read back correctly by read.csv .
> 
> Is that intentional?
> And whether it is intentional or not, is it wise?
> 
> Example:
> 
> ( D1 <- data.frame(A=letters[1:5], N=1:5, Y=rnorm(5) ) )
> write.csv(D1, "temp.csv")
> 
> ( D1w <- read.csv("temp.csv") )
> 
> # Note the unnecessary new X column ...
> #Tidy up
> unlink("temp.csv")
> 
> This differs from the parent .table defaults; write.table doesn’t add the 
> extra "" column label, so the object read back with read.table does not 
> contain an unwanted extra column.
> 
> Wouldn’t it be more sensible if write.csv() and read.csv() were consistent in 
> the same sense as read.table and write.table?
> Or at least if there were a switch (as.read.csv=TRUE ?) to tell write.csv to 
> omit the initial "", or vice versa?
> 
> Currently using R version 4.1.0 on Windows, but this reproduces at least as 
> far back as 3.6 
> 
> Steve E

Re: [Rd] S3 dispatch does not work for generics defined inside an environment

2021-07-01 Thread Greg Minshall

Taras,
> That was my original plan as well, but managing and deploying dozens
> of little packages that are all under active development is a
> nightmare even with devtools. Just too much overhead, not to mention
> that coming up with names that would not have namespace conflicts was
> getting silly.

i can imagine.

> In the end, I wrote a package that implements lightweight python-like
> modules for R and that has really improved my workflow. I hope to
> publish this package later this year after I have cleaned it up a bit.

cool -- good luck with it.

Greg


Re: [Rd] S3 dispatch does not work for generics defined inside an environment

2021-07-01 Thread Johannes Ranke

> In the end, I wrote a package that implements lightweight python-like
> modules for R and that has really improved my workflow. I hope to publish
> this package later this year after I have cleaned it up a bit.

Hi, are you aware of the previous work in this direction?

https://github.com/klmr/box

and

https://github.com/wahani/modules 

Johannes



Re: [Rd] S3 dispatch does not work for generics defined inside an environment

2021-07-01 Thread Taras Zakharko

Thanks Johannes, 

I was aware of the modules package (it was not suitable for my needs 
unfortunately), but I did not know about box… somehow I managed to completely 
miss it in my search (embarrassing, really).  

My own package offers similar functionality to box, but is designed to closely 
follow the behavior of regular R packages and heavily relies on Roxygen2 to 
manage imports and exports. The idea is that you can convert a module to an R 
package (or vice versa) with no additional effort.  Basically, where box offers 
an opinionated implementation of modules, my design aims to be as “boring” as 
possible. You just write the code as you usually would do for a package and 
then you use a single new function to load it in a modular fashion. 

At any rate, I am excited that there is work in this area and I am looking 
forward to further exploring the design space. Maybe one day these efforts will 
culminate in an official R module system.  

Thanks, 

Taras

> On 1 Jul 2021, at 11:42, Johannes Ranke  wrote:
> 
>> In the end, I wrote a package that implements lightweight python-like
>> modules for R and that has really improved my workflow. I hope to publish
>> this package later this year after I have cleaned it up a bit.
> 
> Hi, are you aware of the previous work in this direction
> 
> https://github.com/klmr/box
> 
> and
> 
> https://github.com/wahani/modules 
> 
> Johannes

Re: [Rd] [External] SET_COMPLEX_ELT and SET_RAW_ELT missing from Rinternals.h

2021-07-01 Thread Konrad Siek

Thanks!

So what would be the prescribed way of assigning elements to a CPLXSXP if I
needed to?

One way I see is to do what most of the code inside the interpreter does
and grab the vector's data pointer:

COMPLEX(sexp)[index] = value;
COMPLEX0(sexp)[index] = value;

This will materialize an ALTREP CPLXSXP though, so maybe the best way would
be to mirror what SET_COMPLEX_ELT does in Rinlinedfuns.h?

if (ALTREP(sexp)) ALTCOMPLEX_SET_ELT(sexp, index, value);
else COMPLEX0(sexp)[index] = value;

This seems better, but it's not used in the interpreter anywhere as far as
I can tell, presumably because of the setter interface not being complete,
as you point out. But should I be avoiding this second approach for some
reason?

k

On Tue, Jun 29, 2021 at 4:06 AM  wrote:

> The setter interface for atomic types is not yet implemented. It may
> be some day.
>
> Best,
>
> luke
>
> On Fri, 25 Jun 2021, Konrad Siek wrote:
>
> > Hello,
> >
> > I am working on a package that works with various types of R vectors,
> > implemented in C. My code has a lot of SET_*_ELT operations in it for
> > various types of vectors, including for CPLXSXPs and RAWSXPs.
> >
> > I noticed SET_COMPLEX_ELT and SET_RAW_ELT are defined in Rinlinedfuns.h
> > but not declared in Rinternals.h, so they cannot be used in packages. I
> > was going to re-implement them or extern them in my package, however,
> > interestingly, ALTCOMPLEX_SET_ELT and ALTRAW_SET_ELT are both declared
> > in Rinternals.h, making me think SET_COMPLEX_ELT and SET_RAW_ELT could
> > be purposefully obscured. Otherwise it may just be an oversight and I
> > should bring it to someone's attention anyway.
> >
> > I have three questions that I hope R-devel could help me with.
> >
> > 1. Is this an oversight, or are SET_COMPLEX_ELT and SET_RAW_ELT not
> >    exposed on purpose?
> > 2. If they are not exposed on purpose, I was wondering why.
> > 3. More importantly, what would be good ways to set elements of these
> >    vectors while playing nice with ALTREP and avoiding whatever pitfalls
> >    caused these functions to be obscured in the first place?
> >
> > Best regards,
> > Konrad,
>
> --
> Luke Tierney
> Ralph E. Wareham Professor of Mathematical Sciences
> University of Iowa                  Phone: 319-335-3386
> Department of Statistics and        Fax:   319-335-3017
>    Actuarial Science
> 241 Schaeffer Hall                  email: luke-tier...@uiowa.edu
> Iowa City, IA 52242                 WWW:   http://www.stat.uiowa.edu
>


Re: [Rd] On read.csv and write.csv

2021-07-01 Thread Stephen Ellison

> the "unhelpful" column are the row names. They are considered an
> important part of a data frame and therefore the default (row.names =
> TRUE) is to not lose them (as there is no way back once you do). If you don't
> want to preserve the row names you can simply set row.names=FALSE.

Please run the reproducible example provided. 
When you do, you will see that write.csv writes an unnecessary empty header 
field ("") over the row names column. This makes the number of header fields 
equal to the number of columns _including_ row names. That causes the original 
row names to be read as data by read.csv, following the rule that the number of 
header fields determines whether row names are present. read.csv  accordingly 
assumes that the former row names are unnamed data, calls the unnamed row names 
column "X" (or X.1 etc if X exists) and then adds new, default, row names 
_instead of the original row names written by write.csv_. 
That's not helpful.

By contrast read.table correctly reads the first entry in each row as a row 
name when the number of header fields is one less than the number of data 
columns. write.table includes row names as row names _without a header field_, 
so a file written with write.table is correctly formatted for read.table to 
interpret the first data field as a row name.
I think it would be more sensible if write.csv did the same as write.table when 
row.names=TRUE - as it is, by default.
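
The difference described above can be seen directly in the header line each function writes. The following is a small sketch using only base R; the data frame `d` and the temporary files are assumptions for the example, not part of the original report.

```r
d <- data.frame(A = c("a", "b"), N = 1:2)

f_tab <- tempfile(fileext = ".txt")
f_csv <- tempfile(fileext = ".csv")
write.table(d, f_tab)
write.csv(d, f_csv)

# Header lines: write.table names only the data columns,
# write.csv adds an empty field above the row names.
hdr_tab <- readLines(f_tab, n = 1)
hdr_csv <- readLines(f_csv, n = 1)

# Column names after reading back with the defaults.
names_tab <- names(read.table(f_tab))
names_csv <- names(read.csv(f_csv))

unlink(c(f_tab, f_csv))
```

Printing `hdr_tab` and `hdr_csv` side by side shows the extra `""` field, and `names_csv` contains the additional `X` column that `names_tab` does not.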






Re: [Rd] On read.csv and write.csv

2021-07-01 Thread Gabriel Becker

On Thu, Jul 1, 2021 at 1:46 PM Stephen Ellison 
wrote:

>
> Please run the reproducible example provided.
> When you do, you will see that write.csv writes an unnecessary empty
> header field ("") over the row names column. This makes the number of
> header fields equal to the number of columns _including_ row names. That
> causes the original row names to be read as data by read.csv, following the
> rule that the number of header fields determines whether row names are
> present. read.csv  accordingly assumes that the former row names are
> unnamed data, calls the unnamed row names column "X" (or X.1 etc if X
> exists) and then adds new, default, row names _instead of the original row
> names written by write.csv_.
> That's not helpful.
>

This depends on if you are reading the csv via R or something else, I would
imagine. It not being "valid" CSV at all would likely cause some programs
to choke entirely, I expect. I admit that's conjecture though, I don't have
data on that one way or another.

~G


Re: [Rd] On read.csv and write.csv

2021-07-01 Thread John Fox


Dear Gabriel,

On 2021-07-01 6:29 p.m., Gabriel Becker wrote:
> On Thu, Jul 1, 2021 at 1:46 PM Stephen Ellison
> wrote:
>
>> Please run the reproducible example provided.
>> When you do, you will see that write.csv writes an unnecessary empty
>> header field ("") over the row names column. This makes the number of
>> header fields equal to the number of columns _including_ row names. That
>> causes the original row names to be read as data by read.csv, following the
>> rule that the number of header fields determines whether row names are
>> present. read.csv  accordingly assumes that the former row names are
>> unnamed data, calls the unnamed row names column "X" (or X.1 etc if X
>> exists) and then adds new, default, row names _instead of the original row
>> names written by write.csv_.
>> That's not helpful.
>
> This depends on if you are reading the csv via R or something else, I would
> imagine. It not being "valid" CSV at all would likely cause some programs
> to choke entirely, I expect. I admit that's conjecture though, I don't have
> data on that one way or another.


On Excel, for example, opening a .csv file without the empty initial 
field in the first line will cause the column names to be misaligned.


As others have pointed out, .csv files are meant as a sort of 
least-common-denominator of data exchange, and so following the standard 
is probably a good idea.


Best,
 John

John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/




Re: [Rd] On read.csv and write.csv

2021-07-01 Thread Simon Urbanek



Just for completeness, all this is well documented:

CSV files:

 By default there is no column name for a column of row names.  If
 ‘col.names = NA’ and ‘row.names = TRUE’ a blank column name is
 added, which is the convention used for CSV files to be read by
 spreadsheets.  Note that such CSV files can be read in R by

   read.csv(file = "<filename>", row.names = 1)
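
A short sketch of the documented round trip, using base R only. The data frame `d`, its row names, and the temporary files are assumptions made for the example.

```r
d <- data.frame(A = c("a", "b"), N = 1:2, row.names = c("r1", "r2"))

f1 <- tempfile(fileext = ".csv")
f2 <- tempfile(fileext = ".csv")

# write.csv with row names behaves like write.table with col.names = NA.
write.csv(d, f1)
write.table(d, f2, sep = ",", qmethod = "double", col.names = NA)
same_file <- identical(readLines(f1), readLines(f2))

# Tell read.csv that the first column holds the row names.
back <- read.csv(f1, row.names = 1)

unlink(c(f1, f2))
```

Here `back` has the row names `r1` and `r2` and only the columns `A` and `N`, so no extra `X` column appears.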

Cheers,
Simon



> On 2/07/2021, at 10:29 AM, Gabriel Becker  wrote:
> 
> 
> 
> On Thu, Jul 1, 2021 at 1:46 PM Stephen Ellison  wrote:
> 
> Please run the reproducible example provided. 
> When you do, you will see that write.csv writes an unnecessary empty header 
> field ("") over the row names column. This makes the number of header fields 
> equal to the number of columns _including_ row names. That causes the 
> original row names to be read as data by read.csv, following the rule that 
> the number of header fields determines whether row names are present. 
> read.csv  accordingly assumes that the former row names are unnamed data, 
> calls the unnamed row names column "X" (or X.1 etc if X exists) and then adds 
> new, default, row names _instead of the original row names written by 
> write.csv_. 
> That's not helpful.
> 
> This depends on if you are reading the csv via R or something else, I would 
> imagine. It not being "valid" CSV at all would likely cause some programs to 
> choke entirely, I expect. I admit that's conjecture though, I don't have data 
> on that one way or another.
> 
> ~G
