from:"Abby Spurdle"

Re: [Rd] nrow(rbind(character(), character())) returns 2 (as documented but very unintuitive, IMHO)

2019-05-16 Thread Abby Spurdle

Herve Pages wrote:

> In my experience, and more generally speaking, the desire to treat
> 0-length vectors as a special case that deviates from the
> non-zero-length case has never been productive.

Good idea.

Gabriel Becker wrote:

> > nrow(rbind(aa = c("a", "b", "c"), AA = character()))
> [1] 1

> By rights of the invariance that you and Hadley are advocating,  as far as
> I understand it, the last should give 2 rows, one of which is all NAs,
> rather than giving only one row as it currently does (and, I assume?,
> always has).

I think, ideally, this example should generate an error or a warning.
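
For reference, the two cases discussed in this thread can be reproduced with the short sketch below. The object names (empty, mixed) are only illustrative.

```r
# Two zero-length vectors: rbind() keeps both rows, giving a 2 x 0 matrix.
empty <- rbind(character(), character())
nrow(empty)
ncol(empty)

# One non-empty and one zero-length vector: the zero-length argument is
# ignored, so only the row for `aa` remains.
mixed <- rbind(aa = c("a", "b", "c"), AA = character())
nrow(mixed)
```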

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] print.() not called when autoprinting

2019-05-17 Thread Abby Spurdle

I don't know the answer to your question.
However, here's a side issue that may be relevant.

Last year, I tried creating my own ecdf object, and redefined the print
method for ecdf.

It worked ok in the console, interactively.
However, when I tried calling the method (with autoprinting) inside an
Sweave document, the stats package method was used instead of my method.
I never determined why this was happening.
However, R check generated a warning later, so I renamed the classes and
methods.
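
A minimal sketch of the renaming approach follows. The class name my_ecdf and its constructor are hypothetical, chosen only to show a print method that cannot clash with the stats package method for ecdf.

```r
# Hypothetical constructor: a thin wrapper with its own class name,
# so it does not compete with print.ecdf() from the stats package.
my_ecdf <- function(x) {
  structure(list(x = sort(x), n = length(x)), class = "my_ecdf")
}

# Print method for the new class only.
print.my_ecdf <- function(x, ...) {
  cat("my_ecdf object with", x$n, "observations\n")
  invisible(x)
}

obj <- my_ecdf(c(3, 1, 2))
print(obj)
```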


Abs


On Fri, May 17, 2019 at 6:57 AM William Dunlap via R-devel <
r-devel@r-project.org> wrote:
>
> In R-3.6.0 autoprinting was changed so that print methods for the storage
> modes are not called when there is no explicit class attribute.   E.g.,
>
> % R-3.6.0 --vanilla --quiet
> > print.function <- function(x, ...) { cat("Function with argument list ");
> cat(sep="\n", head(deparse(args(x)), -1)); invisible(x) }
> > f <- function(x, ...) { sum( x * seq_along(x) ) }
> > f
> function(x, ...) { sum( x * seq_along(x) ) }
> > print(f)
> Function with argument list function (x, ...)
>
> Previous to R-3.6.0 autoprinting did call such methods
> % R-3.5.3 --vanilla --quiet
> > print.function <- function(x, ...) { cat("Function with argument list ");
> cat(sep="\n", head(deparse(args(x)), -1)); invisible(x) }
> > f <- function(x, ...) { sum( x * seq_along(x) ) }
> > f
> Function with argument list function (x, ...)
> > print(f)
> Function with argument list function (x, ...)
>
> Was this intentional?
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com

Re: [Rd] Give update.formula() an option not to simplify or reorder the result -- request for comments

2019-05-19 Thread Abby Spurdle

Hi Pavel
(Back On List)

And my two cents...

> At this time, the update.formula() method always performs a number of
> transformations on the results, eliminating redundant variables and
> reordering interactions to be after the main effects.
> The proposal is to add an option simplify= (defaulting to TRUE,
> for backwards compatibility) that if FALSE will skip the simplification
> step.
> Any thoughts? One particular question that Martin raised is whether the
> UI should be just a single logical argument, or something else.

Firstly, note that the constructor for formula objects behaves differently
to the update method, so I think any changes should be consistent between
the two functions.
> #constructor - doesn't simplify
> y ~ x + x
y ~ x + x
> #update method - does simplify
> update (y ~ x, ~. + x)
y ~ x

Interestingly, this doesn't simplify.
> update (y ~ I (x), ~. + x)
y ~ I(x) + x
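
The two transformations described in the proposal (dropping redundant terms, and moving interactions after main effects) can be seen together in a small sketch. The variable names a and b are only illustrative.

```r
# The formula constructor keeps what was typed.
f_typed <- y ~ a:b + a + a

# update() rebuilds the formula from its terms: the duplicate `a` is
# dropped and the interaction a:b is placed after the main effect.
f_updated <- update(f_typed, ~ .)

deparse(f_typed)
deparse(f_updated)
```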

I think that simplification could mean different things.
So, there could be something like:
> update (y ~ x, ~. + x, strip=FALSE)
y ~ I (2 * x)

I don't know how easy that would be to implement.
(Symbolic computation on par with computer algebra systems is a discussion
in itself...).
And you could have one argument (say, method="simplify") rather than two or
more logical arguments.

It would also be possible to allow partial forms of simplification, by
specifying which terms should be collapsed, however, I doubt any possible
usefulness of this, would justify the complexity.
However, feel free to disagree.

You made an interesting comment.

> This is not
> always the desired behavior, because formulas are increasingly used
> for purposes other than specifying linear models.

Can I ask what these purposes are?


kind regards
Abs


Re: [Rd] Give update.formula() an option not to simplify or reorder the result -- request for comments

2019-05-24 Thread Abby Spurdle

> Martin Maechler has asked me to send this to R-devel for discussion
> after I submitted it as an enhancement request (
> https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17563).

I think R needs to provide more support for CAS-style symbolic computation.
That is, support by either the R language itself or the standard packages,
or both.
(And certainly not by interfacing with another interpreted language).

Obviously, I don't speak for R Core.
However, this is how I would like to see R move in the future.
...improved symbolic and symbolic-numeric computation...

I think any changes to formula objects or their methods, should be
congruent with these symbolic improvements.


Re: [Rd] survival changes

2019-06-01 Thread Abby Spurdle

> In the next version of the survival package I intend to make a
> non-upwardly compatible change to the survfit object.  With over 600
> dependent packages this is not something to take lightly, and I am
> currently undecided about the best way to go about it.  I'm looking
> for advice.
>
> The change: 20+ years ago I had decided not to include the initial
> x=0,y=1 data point in the survfit object itself.

New Package -> Bad idea.
Copying Python -> The worst idea...
Version element -> Not sure I understand how that works, but probably a bad
idea.

If all you want to do, is add an initial data point, that shouldn't be an
issue.
However, I'm assuming that you want to make other more significant changes
to your object.
So, at face value, a new object class would be the best option, so number
(2) from your list of options.

Note there is another possibility.
With a little bit of tricky-ness, you can check if your constructor is
called by a function inside a package.
In which case, you can check the publication date of that package (if
published after your package), and then respond accordingly.
Then you can ask the maintainers of the other packages to update their
packages, but at their own time.
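
A rough sketch of that idea is below. Everything in it is an assumption for illustration: the helper names (package_of, published_on), the constructor new_fit, its class name, and the cutoff date are all hypothetical, and this is not how the survival package works.

```r
# Hypothetical helper: name of the package whose namespace encloses `env`,
# or NA if `env` does not belong to a package namespace.
package_of <- function(env) {
  top <- topenv(env)
  if (isNamespace(top)) unname(getNamespaceName(top)) else NA_character_
}

# Hypothetical helper: publication date of an installed package
# (NA when its DESCRIPTION file has no Date/Publication field).
published_on <- function(pkg) {
  stamp <- utils::packageDescription(pkg, fields = "Date/Publication")
  as.Date(substr(stamp, 1, 10))
}

# Hypothetical constructor: callers living in packages published before
# `cutoff` get the old layout; everyone else gets the new layout.
new_fit <- function(time, surv, cutoff = as.Date("2019-06-01"),
                    caller = parent.frame()) {
  pkg <- package_of(caller)
  legacy <- !is.na(pkg) && isTRUE(published_on(pkg) < cutoff)
  if (!legacy) {
    time <- c(0, time)   # new layout: include the initial x=0, y=1 point
    surv <- c(1, surv)
  }
  structure(list(time = time, surv = surv), class = "new_fit")
}

fit <- new_fit(c(1, 2), c(0.9, 0.8), caller = globalenv())
```

Called from the global environment, no package is detected, so the new layout is returned. Whether inspecting the caller like this is robust enough for 600 dependent packages is a separate question.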


Abs


Re: [Rd] Offer zip builds

2019-06-03 Thread Abby Spurdle

> If you go here:
> https://cran.cnr.berkeley.edu/bin/windows/base
> you see EXE installers for Windows. This contrasts with other programming
> languages that offer both an executable installer and ZIP files that can be
> extracted and run

Are you suggesting that R should do the same?
If so, I second that, excellent idea.
(However, gzip preferred).

I've had significant problems with the Windows installer.
I've never had significant problems with zip files.
Also, I'm assuming that the zip approach would be easier for systems
administrators.
However, I'm not a systems administrator...


Abs


Re: [Rd] Offer zip builds

2019-06-05 Thread Abby Spurdle

> If they choose to continue with only EXE,
> I will just keep using other programming languages.

I did agree with your original suggestion.
However, I don't think that a lack of zip formats, is a disincentive from
using R.

If you have an issue with the Windows installer, the obvious option is to
install the source version, and compile from it.

This is, after all, how open source is designed to work.

Also, I agree with what Duncan said.


Abs


Re: [Rd] Offer zip builds

2019-06-07 Thread Abby Spurdle

> Just to add to that point - it is expected that the registry is
> appropriately updated so the correct version of R can be located. Just
> unpacking a ZIP won't work in general since tools using R have no reliable
> way to find it.

Shouldn't it be sufficient to set the "Path" system or environment
variables?


[Rd] Halfway through writing an "IDE" with support for R; Proof of concept, and request for suggestions.

2019-06-10 Thread Abby Spurdle

I've written what I refer to as an "Integrated Console Environment".
Similar to an IDE, but more console oriented, so suitable for running
scripts and dynamic programming languages.
Also, it's designed to be congruent with the file system.

Obviously, I want to support R.
However, the long term plan is to make the core system relatively language
neutral, and to support R via a plugin.

Here's my (early, partially complete) prototype:
https://sites.google.com/site/spurdlea/java/symbyont

And I have some screenshots, which give the general idea:
https://sites.google.com/site/spurdlea/java/symbyont/screenshots

The biggest problem is that I don't have a fully functional console (or
terminal).
I don't know how this works under Windows, but I have found some
information on how it works under Linux.
If anyone would like to contribute or make suggestions in this area, please
email me.

Currently, I'm simply forking child processes, which works most of the
time, including the Windows command prompt.
There are some complications running R this way.
However, running "R --vanilla --ess" produces a reasonable result.

Also, I'm interested to hear what people would like to see in an R user
interface.

Any suggestions are welcome.
However, here are some specific questions that I have:
(1) What would people teaching R, like to see?
(2) If running multiple versions of R at the same time, are there any
GUI-level features that would be desirable?
(3) What should an outline viewer for R, look like and do?
(4) Should there be a data editor, and if so, should it be able to edit R
objects directly?

Noting that point (4) is contrary to the principle of being console
oriented.

Other notes:
(1) It's written in Java, and Swing.
This was the easiest way to create a cross platform user interface.
(2) Currently, it only supports Windows, very sorry.
I'm planning to have it working on Fedora, in the near future.
Then after that, we'll see...
(3) It's dual licensed under GPL 2 and GPL 3.
(4) I wrote most of this in 2006 and 2007.
I pulled it out of my personal archives at the end of April.
(5) It's badly written, and has some bugs and other problems.
Please don't email me and tell me it's badly written or has bugs, because I
know.
(6) I'm planning to completely rewrite it.
I'm likely to do one or two updates before I start rewriting it.
And hopefully, I'll have most of the problems solved, very soon.

It would be good to get suggestions *before* I start rewriting it.


Abs


Re: [Rd] Halfway through writing an "IDE" with support for R; Proof of concept, and request for suggestions.

2019-06-13 Thread Abby Spurdle

I thought that I'd get more feedback.
But it's ok, I understand.

I wanted to note that I've moved symbyont to GitLab, which is where I
should have put it, in the first place.

Also, I'm not planning to start another thread.
However, if anyone has suggestions six months from now (or six years from
now...), you're still welcome to email me, and I will try to listen...


Re: [Rd] Halfway through writing an "IDE" with support for R; Proof of concept, and request for suggestions.

2019-06-14 Thread Abby Spurdle

On Fri, Jun 14, 2019 at 7:24 PM Iñaki Ucar  wrote:
>
> There are many similar projects that are mature

I'm not sure what projects you're referring to.

If we create some constraints:

(1) Internal systems consoles (*plural*).
Rules out most things.
Noting that many tools are designed to bypass the console.

(2) Modern user interface.
Rules out Vim and Emacs.

(3) File system based rather than (IDE-dependent) project based.
Rules out Eclipse and many other IDEs.

(4) Multi-language focus.
Rules out RStudio and many other IDEs.

(5) Completely open source and completely free.
Also rules out RStudio, which limits many features to its enterprise
edition.

(6) Cross platform desktop application, not web based.
(However, there is a need for web based tools).

None of the tools that I've looked at satisfy these constraints.
But if you know of some, I'd like to know... And I would consider
contributing...


Re: [Rd] Halfway through writing an "IDE" with support for R; Proof of concept, and request for suggestions.

2019-06-14 Thread Abby Spurdle

> What about Atom, VS Code and the like? Or what about taking a project
> that meets most of the constraints and pushing to cover all of them,
> or even forking it and modifying the part you don't like?

I'm not prepared to endorse GitHub affiliated software.


Re: [Rd] head with non integer n returns confusing output

2019-06-22 Thread Abby Spurdle

> `head()` returns a problematic output when a character is fed to its `n`
> parameter.
> this can lead to an unexpected and inconsistent result.
> I would suggest either using `as.integer` consistently on the input, or
> having a consistent error for all character input.

I use the head() and tail() functions, a lot.
I agree that the argument checking and argument handling is not as good as
it could be.

In March I posted the following thread:
https://stat.ethz.ch/pipermail/r-devel/2019-March/077512.html
https://stat.ethz.ch/pipermail/r-devel/2019-March/077527.html

Perhaps, head (1:10, "foo"), should return a clear error message...
Then it's up to the user to convert strings to integers, if that's what he
or she wants to do.
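
As a sketch of what a clear error could look like from the user's side, here is a small wrapper. The name strict_head and the wording of the message are hypothetical; this is not a patch to head() itself.

```r
# Hypothetical wrapper: refuse anything but a single non-missing number
# for `n`, then defer to the ordinary head().
strict_head <- function(x, n = 6L, ...) {
  if (!is.numeric(n) || length(n) != 1L || is.na(n)) {
    stop("'n' must be a single non-missing number", call. = FALSE)
  }
  utils::head(x, n, ...)
}

strict_head(1:10, 3)

# A character `n` now fails loudly; conversion is left to the caller.
result <- tryCatch(strict_head(1:10, "foo"), error = function(e) e)
conditionMessage(result)
```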


Re: [Rd] Making R CMD nicer

2019-06-30 Thread Abby Spurdle

> First time posting in the R mailing lists so hopefully this works well.
> I noticed when I type `R CMD` I get this unhelpful message:
> /usr/lib/R/bin/Rcmd: 60: shift: can't shift that many

I wasn't able to reproduce this.
Maybe it's a Linux thing.
But then, I suspect you've omitted some of your input.

> I also think it would be nice if `R CMD help` showed the usable commands.

What do you mean...
All of the following give you the "usable commands":

> R CMD
> R CMD -h
> R CMD --help


Re: [Rd] Making R CMD nicer

2019-06-30 Thread Abby Spurdle

In that case, I was wrong.
And I must apologize...

In saying that, good to see Windows outperforming Linux on the command
line...


On Mon, Jul 1, 2019 at 11:30 AM Gábor Csárdi  wrote:
>
> For the record, this is Linux R-devel:
>
> root@4bef68c16864:~# R CMD
> /opt/R-devel/lib/R/bin/Rcmd: 60: shift: can't shift that many
> root@4bef68c16864:~# R CMD -h
> /opt/R-devel/lib/R/bin/Rcmd: 62: exec: -h: not found
> root@4bef68c16864:~# R CMD --help
> /opt/R-devel/lib/R/bin/Rcmd: 62: exec: --help: not found
>
> This is R-release on macOS:
>
> ❯ R CMD
> /Library/Frameworks/R.framework/Resources/bin/Rcmd: line 62:
> /Library/Frameworks/R.framework/Resources/bin/: is a directory
> /Library/Frameworks/R.framework/Resources/bin/Rcmd: line 62: exec:
> /Library/Frameworks/R.framework/Resources/bin/: cannot execute:
> Undefined error: 0
> ❯ R CMD -h
> /Library/Frameworks/R.framework/Resources/bin/Rcmd: line 62: exec: -h:
> invalid option
> exec: usage: exec [-cl] [-a name] file [redirection ...]
> ❯ R CMD --help
> /Library/Frameworks/R.framework/Resources/bin/Rcmd: line 62: exec: --:
> invalid option
> exec: usage: exec [-cl] [-a name] file [redirection ...]
>
> On Windows you indeed get a useful list of commands and more helpful tips.
>
> Gabor
>
>
> On Sun, Jun 30, 2019 at 11:36 PM Abby Spurdle  wrote:
> >
> > > First time posting in the R mailing lists so hopefully this works well.
> > > I noticed when I type `R CMD` I get this unhelpful message:
> > > /usr/lib/R/bin/Rcmd: 60: shift: can't shift that many
> >
> > I wasn't able to reproduce this.
> > Maybe it's a Linux thing.
> > But then, I suspect you've omitted some of your input.
> >
> > > I also think it would be nice if `R CMD help` showed the usable commands.
> >
> > What do you mean...
> > All of the following give you the "usable commands":
> >
> > > R CMD
> > > R CMD -h
> > > R CMD --help

Re: [Rd] Format printing inside a matrix

2019-07-07 Thread Abby Spurdle

> I am not sure if there is an existing solution to this, but I want my S4
> objects inside a list matrix showing correctly.
> R> matrix(lst, 2)
>      [,1] [,2] [,3] [,4] [,5]
> [1,] ?    ?    ?    ?    ?
> [2,] ?    ?    ?    ?    ?
> Is it possible that the print method for matrix can call some type of generic
> such as `as.character` or `format` when it encounters such cases?

I had some difficulty understanding this question.
So, I'm going to paraphrase it.

R, to the best of my knowledge, does not support object arrays, as such.
(Or if it does, I've certainly missed the memo on this one).

The closest option, is to create an (S3) list of (S3 or S4) objects.
This is sufficient in the one-dimension case.
However, to provide functionality of two- or three-dimensional object
arrays, one can create a matrix (or array) from the list.

It's desirable to print such matrices and arrays.
This is possible but the output contains an array of question marks, which
isn't helpful.

Would it be possible for the print method for both matrices and arrays
(conditional on having a list type), call the format method for each
object, possibly creating a character matrix?
Presumably there are other approaches, but the main thing is that the
output is useful and it's easy for R users to control the way objects (in
matrices and arrays) are printed.
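
From the user's side, the idea might be sketched as below. The function format_list_matrix and the S3 class "temperature" are hypothetical, and an S3 class stands in for the S4 objects of the original question; this is not a proposal for the actual print method.

```r
# Hypothetical helper: call format() on every element of a list matrix
# (or list array) and return a character array of the same shape.
format_list_matrix <- function(m, ...) {
  stopifnot(is.list(m), length(dim(m)) >= 2L)
  cells <- vapply(m, function(obj) paste(format(obj, ...), collapse = " "),
                  character(1))
  array(cells, dim = dim(m), dimnames = dimnames(m))
}

# Hypothetical class with its own format method, so the user controls
# how each object appears in the printed matrix.
format.temperature <- function(x, ...) paste0(unclass(x), " C")

lst <- lapply(c(18.5, 21, 19.25, 22),
              function(v) structure(v, class = "temperature"))
m <- matrix(lst, nrow = 2)

cm <- format_list_matrix(m)
print(cm, quote = FALSE)
```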

> Or is there any other place that I can override without introducing a new S3 class?

In theory, the simplest approach is to redefine the print method for
matrices.
However, that would be unacceptable in a CRAN package, probably...

So, unless R Core change the print method, you may have to create a matrix
subclass.


Abs


Re: [Rd] Format printing inside a matrix

2019-07-07 Thread Abby Spurdle

> The problem of wrapping the list into a S3/S4 object, i.e. subclassing array
> or matrix, is that one also has to define a bunch of methods for subsetting,
> joining, etc, in order to make it behave like a list array.

False, sorry.
Wrapping != Defining a New Class.
And you don't have to define any methods.
However, my understanding of your original post is that you want to modify
the printing.
So, there would only need to be one method, a print method.

And if you don't want to do that, you could just create a stand alone
custom print function:
my.print.function = function (my.matrix.object, quote=FALSE, max.chars=10L)
{   do.something ()
}

> It is not desirable if a
> simple matrix subsetting will remove the class attributes of the object.

I'm assuming by "the object" you are referring to the matrix.
And by "class attribute"-"s" you are referring to all the attributes.
This is a completely separate discussion from your original post.
And I don't see what it has to do with printing matrices with a list type.

Note that subsetting only removes attributes from the matrix, it does not
remove attributes (or slots) from each object in the matrix.
Also, note that you may need to use "obj [[i, j]]", with *double* brackets.
Because
> attributes (obj [i, j])
Will be NULL.
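
A small sketch of the difference between single and double brackets on a list matrix follows. The "units" attribute is only an illustrative stand-in for whatever attributes or slots the stored objects carry.

```r
# A 2 x 2 list matrix whose elements each carry an attribute.
obj <- matrix(list(structure(1, units = "kg"), structure(2, units = "g"),
                   structure(3, units = "mg"), structure(4, units = "t")),
              nrow = 2)

single <- obj[1, 2]     # a plain list of length one, no attributes
double <- obj[[1, 2]]   # the stored object itself, attributes intact

attributes(single)
attributes(double)
```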


Re: [Rd] Format printing inside a matrix

2019-07-07 Thread Abby Spurdle

> This works fine but no longer work after we do some simple operations.
>   myArray[1:2, 1:2, 2]
>   #      [,1] [,2]
>   # [1,] ?    ?
>   # [2,] ?    ?

OK, that's a good point.
I didn't think of that.

Michael Lawrence was probably correct in his comment:

>However, as soon as you start
> treating these objects as data (like putting them into a matrix),
> you're likely going to want vectorized operations over them, which
> means formalized vector and matrix classes

I'm thinking that we need an array object designed specifically for
other objects.
I think it would be good if it was part of the standard R distribution.
But that's not up to me.

Note that I'm planning to create an R package for matrix based and table
based objects, in the near future, possibly extending the Matrix package.
(So, likely to be in S4).

I will think about object arrays in R, some more.

Thank you for highlighting this issue.
Sorry, I can't offer an immediate solution.


Re: [Rd] head.matrix can return 1000s of columns -- limit to n or add new argument?

2019-07-12 Thread Abby Spurdle

> I assume there are lots of backwards-compatibility issues as well as valid
> use cases for this behavior, so I guess defaulting to M[1:6, 1:6] is out of
> the question.

Agree.

> Is there any scope for adding a new argument to head.matrix that would
> allow this flexibility?

I agree with what you're trying to achieve.
However, I'm not sure this is as simple as you're suggesting.

What if the user wants "head" in rows but "tail" in columns.
Or "head" in rows, and both "head" and "tail" in columns.
With head and tail alone, there's a combinatorial explosion.
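
To make the combinations concrete, here is a sketch of a stand-alone helper rather than a change to head.matrix. The function name corner and its arguments (row_end, col_end) are hypothetical.

```r
# Hypothetical helper: take n_rows rows and n_cols columns, each from
# either the "head" or the "tail" end of the matrix.
corner <- function(m, n_rows = 6L, n_cols = 6L,
                   row_end = c("head", "tail"),
                   col_end = c("head", "tail")) {
  row_end <- match.arg(row_end)
  col_end <- match.arg(col_end)
  pick <- function(k, n, end) {
    idx <- seq_len(k)
    if (end == "head") utils::head(idx, n) else utils::tail(idx, n)
  }
  m[pick(nrow(m), n_rows, row_end),
    pick(ncol(m), n_cols, col_end),
    drop = FALSE]
}

m <- matrix(1:200, nrow = 10, ncol = 20)
top_left  <- corner(m, 3, 4)
top_right <- corner(m, 3, 4, col_end = "tail")
```

Even this covers only four of the possible combinations; showing both ends of the columns at once would need yet another option.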

Also, when using tail on an unnamed matrix, it may be desirable to name
rows and columns.

And all of this assumes standard matrix objects.
Add in matrix subclasses and related objects, and things get more complex
still.

As I suggested in another thread, a few days ago, I'm planning to write
an R package for matrices and matrix-like objects (possibly extending the
Matrix package), with an initial emphasis on subsetting, printing and
formatting.
So, I'm interested to hear more suggestions on this topic.


Re: [Rd] Recommended Reading: Advanced R Second Edition

2019-07-21 Thread Abby Spurdle

> After having fully read "Advanced R First Edition"

Try the R Manuals.
https://cran.r-project.org/manuals.html

There are also some good books by John Chambers.

> which explains R Language Core concepts cristal clear,

I'm assuming that you mean "cr${y}stal clear".

> and
> shows the motivation behind libraries such as "rlang", "purrr", "bench",
> "profvis", "sloop", "lobstr", above others.

Novel != Advanced


[Rd] Rtools contains Python interpreter(s), and six copies?

2019-08-01 Thread Abby Spurdle

I've just discovered that Rtools (on Windows) contains Python
interpreter(s).
I'm assuming that Python is required to build R packages, on all operating
systems.

I think this is a mistake.

Also, by my count, Rtools contains six Python interpreters.
I've miscounted, I hope...


Re: [Rd] Rtools contains Python interpreter(s), and six copies?

2019-08-02 Thread Abby Spurdle

(Excerpts only).
On Sat, Aug 3, 2019 at 12:48 AM Jeroen Ooms  wrote:
> > I'm assuming that Python is required to build R packages, on all operating
> > systems.
> Please don't assume but read the documentation (preferably before posting).

I can't find one reference to Python in the documentation:

https://cran.r-project.org/bin/windows/Rtools/
https://cran.r-project.org/doc/manuals/R-admin.html#The-Windows-toolset

Please write documentation (preferably before changing R).
And I will read it.


Re: [Rd] Rtools contains Python interpreter(s), and six copies?

2019-08-02 Thread Abby Spurdle

> > I can't find one reference to Python in the documentation:
> Maybe because it's *not* needed? There's a note here though:

Thank you.
I'm deleting it.


Re: [Rd] Underscores in package names

2019-08-15 Thread Abby Spurdle

> While
> package names are not functions, using dots in package names
> encourages the use of dots in functions, a dangerous practice.

"dangerous"...?
I can't understand the necessity of RStudio and Tiny-Verse affiliated
persons to repeatedly use subjective and unscientific phrasing.

Elegant, Advanced, Dangerous...
At UseR, there was even "Advanced Use of your Favorite IDE".

This is not science.
This is marketing.

There's nothing dangerous about it other than your belief that it's
dangerous.
I note that many functions in the stats package use dots in function names.
Your statement implies that the stats package is badly designed, which it
is not.
Out of 14,800-ish packages on CRAN, very few of them are even close to the
standard set by the stats package, in my opinion.

And as noted by other people in this thread, changing naming policies could
interfere with a lot of software "out there", which is dangerous.

> Dots in
> names is also one of the common stones cast at R as a language, as
> dots are used for object oriented method dispatch in other common
> languages.

I don't think the goal is to copy other OOP systems.
Furthermore, some shells use dot as the current working directory and Java
uses dots in package namespaces.
And then there's regular expressions...


Re: [Rd] Conventions: Use of globals and main functions

2019-08-27 Thread Abby Spurdle

> this appears to disagree with the software-engineering principle of avoiding 
> a mutating global state

I disagree.
In embedded systems engineering, for example, it's customary to use
global variables to represent ports.

Also, I note that the use of global variables, is similar to using pen
and paper, to do mathematics and statistics.
(Which is good).
Whether that's consistent with software engineering principles or not,
I don't know.

However, I partly agree with you.
Given that there's interest from various parties in running R in
various ways, it may be good to document some of the options
available.

"Running R" (in "R Installation and Administration") links to
"Appendix B Invoking R" (in "An Introduction to R").
However, these sections do not cover the topics in this thread.


Re: [Rd] Conventions: Use of globals and main functions

2019-08-27 Thread Abby Spurdle

> "Running R" (in "R Installation and Administration") links to
> "Appendix B Invoking R" (in "An Introduction to R").
> However, these sections do not cover the topics in this thread.

Sorry, I made a mistake.
It is in the documentation (B.4 Scripting with R)
e.g.

(excerpts only)
R CMD BATCH "--args arg1 arg2" foo.R &
args <- commandArgs(TRUE)
Rscript foo.R arg1 arg2
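
Putting those excerpts together with the "main function" convention from this thread, a script such as foo.R might be sketched as follows. The function name main and its behaviour are only illustrative.

```r
# foo.R (illustrative): keep the work inside a function so that nothing
# but `main` itself is left in the global environment.
main <- function(args = commandArgs(trailingOnly = TRUE)) {
  cat("Received", length(args), "argument(s)\n")
  for (a in args) cat(" -", a, "\n")
  invisible(length(args))
}

# Run automatically under Rscript or R CMD BATCH, but not when the file
# is sourced into an interactive session.
if (!interactive()) main()
```

With this layout, "Rscript foo.R arg1 arg2" passes arg1 and arg2 to main() through commandArgs(), while an interactive user can call main(c("arg1", "arg2")) directly.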


Re: [Rd] inconsistent handling of factor, character, and logical predictors in lm()

2019-08-30 Thread Abby Spurdle

> I think that it would be better to handle factors, character predictors, and 
> logical predictors consistently.

"logical predictors" can be regarded as categorical or continuous (i.e. 0 or 1).
And the model matrix should be the same, either way.

I think the first question to be asked is, which is the best approach,
categorical or continuous?
The continuous approach seems simpler and more efficient to me, but
output from the categorical approach may be more intuitive, for some
people.

I note that the use of factors and characters, doesn't necessarily
produce consistent output, for $xlevels.
(Because factors can have their levels re-ordered).
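
A small sketch of that last point: the same data fitted once with a character predictor and once with a factor whose levels have been re-ordered. The data and object names are made up for illustration.

```r
dat_chr <- data.frame(y = c(1, 2, 3, 4, 5, 6),
                      g = rep(c("a", "b"), 3),
                      stringsAsFactors = FALSE)

# Same values, but stored as a factor with the levels re-ordered.
dat_fct <- dat_chr
dat_fct$g <- factor(dat_fct$g, levels = c("b", "a"))

fit_chr <- lm(y ~ g, data = dat_chr)
fit_fct <- lm(y ~ g, data = dat_fct)

fit_chr$xlevels   # levels in sorted order
fit_fct$xlevels   # levels in the order stored in the factor
```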


Re: [Rd] inconsistent handling of factor, character, and logical predictors in lm()

2019-08-31 Thread Abby Spurdle

> I think that this misses the point I was trying to make: lm() et al. treat 
> logical variables as factors, not as numerical predictors.

I'm unenthusiastic about mapping TRUE to -1 and FALSE to 1, in the model matrix.
(I nearly got that back the front).

However, I've decided to agree with your original suggestion,
regarding $xlevels.
I think it should include the logical levels, if that's the right term...

However, I note that the output still won't be completely consistent.
Because one case leads to a logical vector and the other cases lead to
character vectors.


Re: [Rd] head.matrix can return 1000s of columns -- limit to n or add new argument?

2019-10-31 Thread Abby Spurdle

On Fri, Nov 1, 2019 at 10:02 AM Pages, Herve  wrote:
> That would be awesome! More generally I wonder how feasible it would be
> to fix all these inheritance quirks where inherits(x, "something"),
> is(x, "something"), and is.something(x) disagree. They've been such a
> nuisance for so many years...

This matter was raised in March:
https://stat.ethz.ch/pipermail/r-devel/2019-March/077457.html

In principle, I agree.
However, I'm not sure it's possible without causing compatibility problems.
Not to mention all the disagreement about what's the correct approach.

And I should probably apologize for incorrectly suggesting that there
was a non-backward-compatible design flaw...
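
One long-standing example of the three tests disagreeing, as a sketch. It uses an integer vector rather than a matrix, because the matrix case depends on the R version.

```r
x <- 1L

by_predicate <- is.numeric(x)            # TRUE: integers count as numeric
by_inherits  <- inherits(x, "numeric")   # FALSE: class(x) is "integer"
by_is        <- methods::is(x, "numeric")  # TRUE: integer extends numeric

c(is.numeric = by_predicate, inherits = by_inherits, is = by_is)
```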


Re: [Rd] class() |--> c("matrix", "arrary") [was "head.matrix ..."]

2019-11-12 Thread Abby Spurdle



>x %inherits% "data.frame"

IMHO, I think that user-defined binary operators are being over-used
within the R community.

I don't think that they're "cute" or stylish.
I think their use should be limited to cases, where they significantly
increase the readability of the code.

However, readability, is a (partly) subjective topic...
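
For comparison, here is a sketch of how such an operator could be defined next to the plain function call. The definition of %inherits% below is my assumption; it was not given in the thread.

```r
# Assumed definition: a thin wrapper around inherits().
`%inherits%` <- function(x, what) inherits(x, what)

df <- data.frame(a = 1:3)

df %inherits% "data.frame"    # operator form
inherits(df, "data.frame")    # ordinary function call, same result
```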




Re: [Rd] class() |--> c("matrix", "arrary") [was "head.matrix ..."]

2019-11-12 Thread Abby Spurdle

> You can have your own rant about "user-defined binary operators being
> over-used within the R community" without suggesting that my rant was
> rude.

I wasn't suggesting that you were rude.
I was questioning a trend.

Re: [Rd] class() |--> c("matrix", "arrary") [was "head.matrix ..."]

2019-11-15 Thread Abby Spurdle

> > And indeed I think you are right on spot and this would mean
> > that indeed the implicit class
> > "matrix" should rather become c("matrix", "array").
>
> I've made up my mind (and not been contradicted by my fellow R
> corers) to try go there for  R 4.0.0   next April.

I'm not enthusiastic about matrices extending arrays.
If a matrix is an array, then shouldn't all vectors in R be arrays too?

> #mockup
> class (1)
[1] "numeric" "array"

Which is a bad idea.
It contradicts the central principle that R uses "Vectors" rather than "Arrays".
And I feel that matrices are, and should be, a special case of vectors
(with their inheritance from vectors taking precedence over anything else).

If the motivation is to solve the problem of 2D arrays automatically
being mapped to matrices:

> class (array (1, c (2, 2) ) )
[1] "matrix"

Then wouldn't it be better to treat 2D arrays as a special case, and
leave matrices as they are?

> #mockup
> class (array (1, c (2, 2) ) )
[1] "array2d" "matrix" "array"

Then 2D arrays would have access to both matrix and array methods...

Note, I don't want to enter into (another) discussion on the
differences between implicit class and classes defined via a class
attribute.
That's another discussion, which has little to do with my points above.
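
A rough sketch of how the mockup's dispatch order would behave. It uses an explicit class attribute purely as a stand-in for the proposed implicit class, and the generic describe() and the class name "array2d" are hypothetical names introduced for illustration.

```r
# Stand-in for the mockup: a 2 x 2 array carrying the proposed class vector.
x <- structure(array(1, c(2, 2)), class = c("array2d", "matrix", "array"))

# Hypothetical generic with a matrix method and an array method.
describe <- function(x, ...) UseMethod("describe")
describe.matrix <- function(x, ...) c("matrix method", NextMethod())
describe.array  <- function(x, ...) "array method"

# No array2d method exists, so dispatch falls through to the matrix method,
# which can then hand over to the array method.
res <- describe(x)
print(res)
```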

Re: [Rd] standard naming for components of R data structures

2020-01-06 Thread Abby Spurdle

Do you just need something on pen and paper?
(In which case, I don't see why it needs to be "standard").

Or do you need something that can be used with bison/yacc/cup/etc to
produce a parser?

On a side note, I would say that the R Language Definition is the
"standard" way.
But I do recognize that this has a different flavour to modern
language implementation *theory*.

https://cran.r-project.org/doc/manuals/r-release/R-lang.html
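
If parser-level names are what is wanted, one way to see them from within R is the parse data. This is only a sketch; the expression being parsed is an arbitrary example.

```r
# Parse a small expression, keeping the source references needed for parse data.
expr <- parse(text = "x <- c(a = 1, b = 2)", keep.source = TRUE)
pd <- utils::getParseData(expr)

# Token names as reported by R's own parser.
print(unique(pd$token))
```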


On Tue, Jan 7, 2020 at 5:17 AM Steve Dutky  wrote:
>
> I need to write some documentation:
>
> I'm looking for a standard, consistent way of referring  to the components
> and attributes of R data structures.   Googling and Stackoverflow yield a
> variety of github sites that do not seem to be particularly authoritative.
>
> I was hoping to find a BNF/ABNF grammar for R.
>
> I've looked at the output of bison -v ./R-3.6.2/src/main/gram.y but it does
> not appear helpful.
>
> I appreciate any suggestions for where to look or what to do.
>
> Thanks, Steve
>
> --
>
> Ever tried, Ever failed, No Matter:
>
> Try again, Fail again, Fail Better.
>
> Samuel Beckett *Worstward Ho*

Re: [Rd] as-cran issue ==> set _R_CHECK_LENGTH_1_* settings!

2020-01-14 Thread Abby Spurdle

> I do want to entice people to have a long look beyond closed
> source OS into the world of Free Software where not only R is
> FOSS (Free and Open Source Software) but (all / almost) all the
> tools you use are of that same spirit.

And while everyone is talking about operating systems...

Recently, I tried to install R on Fedora.
However, it only gave me the option of downloading and installing R
3.6.1, when the current release is/was R 3.6.2.
I decided to wait, and may try again later, over the next week.

Is it possible for things to be free *and* simple?

Re: [Rd] as-cran issue ==> set _R_CHECK_LENGTH_1_* settings!

2020-01-14 Thread Abby Spurdle

> Which version of Fedora are you on?

I've got Fedora 31.
I just checked, and R 3.6.2 is available now.

Progress...
...however, there's another problem.

From the dependencies:
R-java                      x86_64 3.6.2-1.fc31           updates  10 k
R-java-devel                x86_64 3.6.2-1.fc31           updates 9.9 k
java-1.8.0-openjdk          x86_64 1:1.8.0.232.b09-0.fc31 updates 281 k
java-1.8.0-openjdk-devel    x86_64 1:1.8.0.232.b09-0.fc31 updates 9.3 M
java-1.8.0-openjdk-headless x86_64 1:1.8.0.232.b09-0.fc31 updates  32 M

So, Linux's R (or at least Fedora's R) is dependent on Java.
-> Bad idea...

I'm using OpenJ9, so I can't install R like this without causing
significant problems.
(But please someone correct me if I'm wrong).

I will allocate some time to investigate Dirk's suggestions, however, I'm
thinking the best option is to continue using *Windows* as my primary OS,
and build Linux versions of R from source.

Re: [Rd] as-cran issue ==> set _R_CHECK_LENGTH_1_* settings!

2020-01-20 Thread Abby Spurdle

> I do want to entice people to have a long look beyond closed
> source OS into the world of Free Software where not only R is
> FOSS (Free and Open Source Software) but (all / almost) all the
> tools you use are of that same spirit.
>
> Best,
> Martin

I've reconsidered.
You're 100% correct.

I'm planning to try ReactOS.
(Hope it works...)

Thanks Martin, great advice...

Re: [Rd] R package builder silently continues after unclosed brace

2020-01-25 Thread Abby Spurdle

Try R check or the source function:

 (From R check)

> R CMD check testpkg
R CMD check testpkg
* using log directory 'c:/proj/shared/testpkg.Rcheck'
* using R version 3.6.0 (2019-04-26)
* using platform: x86_64-w64-mingw32 (64-bit)
* using session charset: ISO8859-1
* checking for file 'testpkg/DESCRIPTION' ... OK
* this is package 'testpkg' version '0.1.0'
* checking package namespace information ... OK
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking if there is a namespace ... OK
* checking for executable files ... OK
* checking for hidden files and directories ... OK
* checking for portable file names ... OK
* checking whether package 'testpkg' can be installed ... ERROR
Installation failed.
See 'c:/proj/shared/testpkg.Rcheck/00install.out' for details.
* DONE
Status: 1 ERROR

 (From 00install.out)

* installing *source* package 'testpkg' ...
** using staged installation
** R
Error in parse(outFile) :
  c:/proj/shared/testpkg/R/b.R:4:0: unexpected end of input
2:   print("unclosed function_b")
3: # no closing }
  ^
ERROR: unable to collate and parse R files for package 'testpkg'
* removing 'c:/proj/shared/testpkg.Rcheck/testpkg'

 (from the source function)

source ("c:/proj/shared/testpkg/R/b.R", echo=TRUE)
Error in source("c:/proj/shared/testpkg/R/b.R", echo = TRUE) :
  c:/proj/shared/testpkg/R/b.R:4:0: unexpected end of input
2:   print("unclosed function_b")
3: # no closing }
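
The same check can be scripted without installing or running anything, as a sketch. The file is written to a temporary location and its contents only imitate the b.R shown above.

```r
# Write a file with an unclosed brace to a temporary location.
path <- tempfile(fileext = ".R")
writeLines(c(
  "function_b <- function() {",
  "  print(\"unclosed function_b\")",
  "# no closing }"
), path)

# parse() reports the problem without evaluating any code.
res <- tryCatch(parse(path), error = function(e) e)
if (inherits(res, "error")) message("Parse failed: ", conditionMessage(res))
```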

Re: [Rd] matplot.Date & matplot.POSIXct

2020-01-27 Thread Abby Spurdle

Maybe I'm missing something really obvious here, but I was unable to
create a matrix out of POSIXct object(s).
Perhaps that deserves a separate discussion...?

Regarding your other comments/questions:
(1) You should *NOT* mask functions from the graphics package (or
base, stats, etc), except possibly for personal use.
(2) The xlab and ylab are fine.


B.

Re: [Rd] matplot.Date & matplot.POSIXct

2020-01-28 Thread Abby Spurdle

> > Maybe I'm missing something really obvious here, but I was unable to
> > create a matrix out of POSIXct object(s).
> > Perhaps that deserves a separate discussion...?
>Can you provide an example?

--
#date and time objects
x = Sys.Date () + 1:16
y = as.POSIXct (x)

#matrices
str (matrix (x, 4, 4) )
str (matrix (y, 4, 4) )
--

Creating a matrix from a Date or POSIXct object, results in a numeric
matrix, not a date/time matrix.
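
A slightly longer sketch of the same behaviour, with a fixed start date (my choice, so that the result is reproducible) and an explicit conversion back:

```r
x <- as.Date("2020-01-28") + 1:16
m <- matrix(x, 4, 4)

class(x)             # "Date"
is.numeric(m)        # TRUE: the class attribute was dropped
inherits(m, "Date")  # FALSE

# An element can be converted back explicitly (days since 1970-01-01).
first <- as.Date(m[1, 1], origin = "1970-01-01")
print(first)
```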

I think that date/time matrices could be useful.
It's possible that this has been discussed before.
But if not, it may be good to discuss it.

And returning to your original post...

I re-read the documentation for the matplot function.
And I feel that it's ambiguous.

The description says:
"Plot the columns of one matrix against the columns of another."
i.e. The matplot function is for *matrices*.

However, then it says:
"x,y vectors or matrices of data for plotting. The number of rows should match."

I'm guessing the current intention is that standard vectors (without a
dim attribute) would be coerce-ible to single-column matrices,
implying that using this function with date and time objects, is
contrary to the way it's currently designed to work.

But...

After reading your examples and re-reading the documentation, your
main suggestion that matplot should support Date and POSIXct objects,
is still *probably* a good one. I note that function is not generic,
and modifications to it would not necessarily be trivial.
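
In the meantime, a possible workaround rather than a change to matplot itself: plot against the numeric values and draw the date axis separately. The data below are made up for illustration.

```r
# Illustrative data: 10 daily observations of 3 series.
dates <- as.Date("2020-01-01") + 0:9
y <- cbind(a = 1:10, b = (1:10)^1.5, c = 10:1)

# Plot against the numeric day counts and suppress the default x axis ...
matplot(as.numeric(dates), y, type = "l", xaxt = "n",
        xlab = "date", ylab = "value")
# ... then draw a date-aware axis from the original Date vector.
axis.Date(1, x = dates)
```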

Re: [Rd] R 3.6.3 scheduled for February 29

2020-02-06 Thread Abby Spurdle

Congratulations!

> celebrate (beeR=TRUE, loud.music=FALSE,
nbeeRs=2L,
proportion.of.tech.talk=0.4)

Why is it the 5th anniversary and not the 20th anniversary?


On Fri, Feb 7, 2020 at 4:58 AM Peter Dalgaard via R-devel
 wrote:
>
> Full schedule is available on developer.r-project.org.
>
> (The date is chosen to celebrate the 5th anniversary of R 1.0.0. Some 
> irregularity may occur on the release day, since this happens to be a 
> Saturday and the release manager is speaking at the CelebRation2020 event...)
>
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd@cbs.dk  Priv: pda...@gmail.com

Re: [Rd] R --interactive -e 'browser()'

2020-02-22 Thread Abby Spurdle

Here's what I would expect:

In interactive mode, input is taken from the user (i.e. command line).
In non-interactive mode, input is taken from a text file (or equivalent).

What you're trying to do is run R in *non*-interactive mode, and call
the browser function.
This requires input to come from the user (i.e. command line) and from
a text file (or equivalent), at essentially the same time.

While doing fantasmoswankyblastik things with I/O is a lot of fun,
I don't think it's R's job to do that.
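
To make the distinction concrete, a sketch of a script-side helper that asks for a line of input in either mode. The function name ask_user is hypothetical, and this does not make browser() work in a non-interactive session; it only shows where input comes from in each mode.

```r
# Hypothetical helper: read one line from the user in either mode.
ask_user <- function(prompt = "continue? ") {
  if (interactive()) {
    readline(prompt)        # console input
  } else {
    cat(prompt)
    con <- file("stdin")    # the process's standard input, not the script file
    on.exit(close(con))
    readLines(con, n = 1L)
  }
}

# Report which mode the current session is in.
cat("interactive session:", interactive(), "\n")
```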

On Fri, Feb 21, 2020 at 6:25 PM  wrote:
>
> I would like to have a mode where I can run some R code in an executable 
> script, like with Rscript, but interactively, so that e.g. 'browser()' works.
>
>  From the manual page it looks like this should work:
>
>  R --interactive -e 'source("script.R")'
>
> or we could shorten it to:
>
>  R --interactive -e 'browser()'
>
> However, it seems that --interactive causes -e to be ignored.
>
> And if I leave out --interactive, then R quits before the browser() function 
> exits.
>
>  From an engineering standpoint it doesn't seem like it should be very 
> difficult to tell the interactive REPL to pretend that a certain command was 
> entered before everything else. Also, it would be useful to me to be able to 
> debug R scripts using standard features like 'browser()'. Should I submit a 
> feature request on Bugzilla, or maybe someone can advise me how to proceed?
>
> Thanks,
>
> Frederick

Re: [Rd] R --interactive -e 'browser()'

2020-02-22 Thread Abby Spurdle

I should add:

In theory, R could emulate the behavior of a compiled programming language.
(In which case, R could read input from the user, or use debugging
tools, when told to).
However, I can't see R-core supporting this approach in the foreseeable future.

A related topic is running R inside Java and C++ programs.
Something R-core is *un*enthusiastic about...???

I note that a number of people have requested the ability to run R
scripts like executable files (easily), or to run scripts with partial
interactivity.

It would be possible to create a special-purpose pty-based (or
pty-like) application specifically to run R, which could allow the
user to emulate the behavior of a compiled language (and do a few
other things), but I'm wondering if it's ethical to publish it...???

Re: [Rd] RIOT 2020

2020-02-26 Thread Abby Spurdle

If people want to create a new interpreter (for R or any other
data-driven programming language), or do something closely related
(such as adapt an existing interpreter), I think a better strategy
would be to focus on real time computing.

I note that Oracle who appears to be sponsoring this event, also
acquired ChorusOS (via their acquisition of Sun). I think they should
release ChorusOS under an open source license, and consider investing
into that.

On Wed, Feb 26, 2020 at 5:41 AM Stepan  wrote:
>
> I hope you don’t mind us using this mailing list for a small
> advertisement, but we think it is most relevant for this group:
>
> We'd like to invite you to RIOT 2020 - the 5rd workshop on R
> Implementation, Optimization and Tooling [1]. It will take place
> co-located with, and during, useR! 2020 in St. Louis on July 8th. RIOT
> is an excellent venue for deep technical discussions about R
> implementations, tools, optimizations and R extension, and will be very
> interesting for anyone interested in what’s under the hood of R.
>
> Regards,
> Stepan Sindelar, Lukas Stadler (Oracle Labs), Jan Vitek (Northeastern),
> Alexander Bertram (BeDataDriven)
>
> [1] http://riotworkshop.github.io/

Re: [Rd] RIOT 2020

2020-02-27 Thread Abby Spurdle

On Thu, Feb 27, 2020 at 3:58 PM Vitek, Jan  wrote:
> I am a co-organizer of RIOT and spent 10 years building real-time Java 
> virtual machines.

Wow!
I'm impressed.

Sounds like you've set a precedent for future research in language
implementation.

> My conclusion: no one cares

It sounds like your research has been under-appreciated.
(Not sure if I'm interpreting your post correctly).

I suspect that (some) other people will start coming on board, relatively soon.

Re: [Rd] ":::" operator doesn't work with data object Ecdat:::Crime

2020-03-17 Thread Abby Spurdle

Crime?
(Macavity, Macavity, ..., and when you reach the scene of crime
Macavity's not there...)

I suspect your data objects are like Macavity, they're not there.

I found this in the R Internals 1.17.

Lazy-load databases are loaded into the exports for a package, but not
into the namespace environment itself. Thus they are visible when the
package is attached, and also via the :: operator. This was a
deliberate design decision...

The manual is not explicit about the converse, but implies (I think)
that the converse is true.
(i.e. Non-lazyloaded datasets are not supposed to be available via this mechanism).
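
A sketch of the different routes, using the datasets package because it is always installed (I have not tried this with Ecdat or plm). The second check is printed rather than assumed.

```r
# Exported route: lazy-loaded datasets are reachable with '::'.
via_exports <- datasets::mtcars
is.data.frame(via_exports)

# Namespace route: is the object in the namespace environment itself?
in_namespace <- exists("mtcars", envir = asNamespace("datasets"),
                       inherits = FALSE)
print(in_namespace)

# A dataset can also be loaded explicitly into an environment of your choice.
e <- new.env()
data("mtcars", package = "datasets", envir = e)
is.data.frame(e$mtcars)
```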

I didn't check your examples.

Are your examples lazyloaded?
And are they any different from other similar packages/datasets?


>A different but related issue is that "plm::Crime" says "Error:
> 'Crime' is not an exported object from 'namespace:plm'", even though
> "library(plm); data(Crime); Crime" works.  I would naively think a user
> should be able to compare "Crime" objects documented in different
> packages using the "::" and ":::" operators, even if a package
> maintainer chooses not to "export" data objects.
>What do you think?

Re: [Rd] Help useRs to use R's own Time/Date objects more efficiently

2020-04-05 Thread Abby Spurdle

I think POSIXct and POSIXlt are badly-chosen names.
The name "POSIX" implies UNIX.
(i.e. XYZix operating system is mostly POSIX compliant... Woo-Hoo!).
My assumption is that most people modelling industrial/econometric
data etc, or data imported from databases, don't want system
references everywhere.

Historically, I've used the principle that:
If programming language A uses functionality from programming language
B, then bindings should be as close as possible to whatever is in
programming language B. Any additional functionality in programming
language A should be distinct from the bindings.
R hasn't done this here: the POSIX bindings have had additional
R functionality and semantics added to them,
possibly introducing problems at an early stage.

The help file entitled DateTimeClasses, only covers a small subset of
information on date and time classes, with no obvious information
about how to construct date and time objects, except for what's in the
examples. The Date class has a similar problem, omitting information
about how to construct Date objects.

The "convenience extraction functions" aren't necessarily convenient
because they return text rather than integers, requiring many users to
still use the POSIXlt class.
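
To illustrate the point with a fixed date (chosen arbitrarily):

```r
d <- as.Date("2020-04-05")

# The convenience functions return (locale-dependent) text.
is.character(weekdays(d))
is.character(months(d))

# Integer components come from the POSIXlt representation.
lt <- as.POSIXlt(d)
lt$mon + 1L      # month number (mon is stored as 0-11)
lt$mday          # day of the month
lt$wday          # day of the week, 0 = Sunday
lt$year + 1900L  # calendar year (year is stored as years since 1900)
```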

I don't think your example is simple.
And I suspect it may discourage some people from using base packages,
having the opposite effect to what's intended.

It's probably too late to change the functions, but here's what I would suggest:

(1) Create a top-level help page with a title like "Date and Time
Classes" to give a brief but general overview. This would mean the
existing DateTimeClasses would need a new title.
(2) Create another function the same as as.POSIXlt, but with a more
user-friendly name, which would increase readability.
(3) If help files for describing classes are separate from the help
files for creating/coercing objects (e.g. Date vs as.Date), then I
think they should cross reference each other in the description field,
not just the details or seealso fields.
(4) Reference relevant extraction/formatting functions, in most
date/time help files, even if there's some (small) duplication in the
help files.
(5) Focus on keeping the examples simple rather than comprehensive.

Expanding on suggestion (4), if you read the help file for as.Date
(which seems like an obvious starting point, because that's where I
started reading...), there's no reference at all to getting the month,
or the day of the week, etc. To make it worse it doesn't mention
coercion to POSIXlt objects either (but does mention coercion from
POSIXlt to Date objects). This could give the wrong impression to many
readers...

In its defense, it does link to Date, which links to weekdays, which
links to as.POSIXlt.

Of course the note and seealso fields are near the bottom, and there's
an implicit (possibly false) assumption that the reader will read all
the help file*s*, and follow the links at the bottom, at least three
times over.
And a new-ish R user is likely to have to read more than four help files.
Unless they Google it, read stack exchange, or read some fancy
(apparently modern) textbook on data science...

Reinforcing the need for the help files to be clear about what the
functions (collectively) can do and specifically what
extraction/formatting functionality is available...

My guess is that the most common tasks with date and time objects are:
(1) Reading a character vector representing dates/times.
(2) Formatting a date/time (i.e. Object to character vector, or
character vector to another character vector).
(3) Extracting information such as month, weekday, etc, either as an
integer or as text.

So, in short, these should be easy (to do, and find out how to do)...
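
For reference, a sketch of the three tasks in base R. The input strings, the format strings and the UTC time zone are arbitrary choices for illustration.

```r
# (1) Read a character vector representing dates/times.
txt <- c("05/04/2020 14:30", "17/03/2020 09:05")
dt  <- as.POSIXct(txt, format = "%d/%m/%Y %H:%M", tz = "UTC")

# (2) Format: object to character vector.
out <- format(dt, "%Y-%m-%d %H:%M")
print(out)

# (3) Extract information as integers, here via format().
mon <- as.integer(format(dt, "%m"))   # month
hr  <- as.integer(format(dt, "%H"))   # hour
print(mon)
print(hr)
```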

On Sat, Apr 4, 2020 at 10:51 PM Martin Maechler
 wrote:
>
> This is mostly a RFC  [but *not* about the many extra packages, please..]:
>
> Noticing to my chagrin  how my students work in a project,
> googling for R code and cut'n'pasting stuff together, accumulating
> this and that package on the way  all just for simple daily time series
> (though with partly missing parts),
> using chron, zoo, lubridate, ...  all for things that are very
> easy in base R *IF* you read help pages and start thinking on
> your own (...), I've noted once more that the above "if" is a
> very strong one, and seems to happen rarely nowadays by typical R users...
> (yes, I stop whining for now).
>
> In this case, I propose to slightly improve the situation ...
> by adding a few more lines to one help page [[how could that
> help in the age where "google"+"cut'n'paste" has replaced thinking ? .. ]] :
>
> On R's own ?Dates  help page (and also on ?DateTimeClasses )
> we have pointers, notably
>
> See Also:
>
>  ...
>  ...
>
>  'weekdays' for convenience extraction functions.
>
> So people must find that and follow the pointer
> (instead of installing one of the dozen helper packages).
>
> Then on that page, one sees  weekdays(), mo

Re: [Rd] Help useRs to use R's own Time/Date objects more efficiently

2020-04-05 Thread Abby Spurdle

> (1) Create a top-level help page with a title like "Date and Time
> Classes" to give a brief but general overview. This would mean the
> existing DateTimeClasses would need a new title.

I wanted to modify my first suggestion.
Perhaps a better idea would be to reference an external document
giving an overview of the subject.
I couldn't find a discussion of POSIXct/POSIXlt objects in the R
manuals (unless I missed it somewhere), so perhaps "An Introduction to
R" could be updated to include this subject, and then the help files
could reference that?

Mark Leeds has already mentioned one possible (unofficial) source.
And I suspect that there are others.

Re: [Rd] Testing before release (was: edit() doubles backslashes when keep.source=TRUE)

2020-05-15 Thread Abby Spurdle

This perhaps diverges from the intent of the thread, but...

I wanted to say I'm extremely grateful to the people who go
through the bug reports.
It's an extremely important job (in the long run, particularly), but
perhaps not quite as "sexy"-sounding as other roles, and probably
under-valued.

So, thank you to the bug-fixers...

:)

On Sat, May 16, 2020 at 2:54 AM Duncan Murdoch  wrote:
>
> On 15/05/2020 9:41 a.m., Martin Maechler wrote:
> [ deletions ]
> > 
> >
> >  Why does nobody anymore  help R development by working with
> >  "R-devel", or at least then the alpha, beta and the "RC"
> >  (Release Candidate) versions that we release daily for about one
> >  month before the final release?
> >
> >  Notably a highly staffed enterprise such as Rstudio (viz the bug
> >  report 17800 above), but also others could really help by
> >  starting to use the "next version" of R on a routine basis ...
> >
> >  I understand the whining, bugs that get released are embarrassing.  But
> when I read the NEWS, I can see that both the NEW FEATURES and BUG FIXES
> sections of x.y.0 releases tend to be much longer than the BUG FIXES
> sections in patch releases.  That seems to indicate that things are
> working reasonably well.
>
> For a really rough measure, just counting bullet points:
>
> R 4.0.0:  65 new features, 55 bug fixes
>
> R 3.6.3:  1 new feature, 7 bug fixes
>
> R 3.6.2:  2 new features, 21 bug fixes
>
> R 3.6.1:  0 new features, 16 bug fixes
>
> R 3.6.0:  72 new features, 62 bug fixes
>
> You can get these numbers programmatically:
>
> R4 <- news()
> table(R4$Category)
>
> R3 <- news(package = "R-3")
> table(R3$Version, R3$Category)
>
> Duncan Murdoch

Re: [Rd] dbinom link

2020-05-18 Thread Abby Spurdle

This has come up before.

Here's the last time:
https://stat.ethz.ch/pipermail/r-devel/2019-March/077478.html

I guess my answer to the following question...

> Perhaps we should ask permission to
> nail the thing down somewhere on r-project.org?

...would be, to reproduce it somewhere.
And then update the link in the binom help file.

Given that the article was previously available freely (with no
apparent restrictions on reproducing it), and that the author has
significant published works which are open access, I'd be surprised if
there's any objection to reproducing it.

On Mon, May 18, 2020 at 8:01 PM Koenker, Roger W  wrote:
>
> FWIW the link from ?dbinom to the Loader paper on Binomials is broken but the 
> paper seems to be
> available here:   
> https://octave.1599824.n4.nabble.com/attachment/3829107/0/loader2000Fast.pdf
>
> Roger Koenker
> r.koen...@ucl.ac.uk
> Honorary Professor of Economics
> Department of Economics, UCL
> Emeritus Professor of Economics
> and Statistics, UIUC

Re: [Rd] [External] Feature Request: User Prompt + Message First Execution when "Managing Search Path Conflicts"

2020-05-20 Thread Abby Spurdle

> An IDE could provide a more sophisticated interface, like a dialog
> allowing separate choices for each conflict. But this is best left up
> to the IDE or the user.

An IDE (or other user interface) should not alter the behavior of R,
especially the installing/loading/attaching of packages.

There are some possible exceptions:
(1) The global option for width.
(2) Output that would normally appear in a separate window.
(3) Maybe others...

But only as non-defaults, with consent from the user.
Also, while exception (2) may have an intuitive appeal, it's risky business...

Re: [Rd] [External] Feature Request: User Prompt + Message First Execution when "Managing Search Path Conflicts"

2020-05-20 Thread Abby Spurdle

On Thu, May 21, 2020 at 10:37 AM  wrote:
> Providing a way to more easily resolve situations that otherwise would
> be errors is a reasonable thing for an IDE to do.

In principle, yes.
However, I note that the word "easily" could mean different things to
different people.
Certain IDE* (not naming any names, and not distinguishing between
singular/plural), introduce bugs and other problems, by altering R's
behavior.

Also, if they absolutely must change things, perhaps the first thing
they should change is the error messages:

I'm sorry, this program didn't work.
It could be R's fault, but probably is the IDE's fault.
Please do NOT post a message on R-help saying:
Why did R crash?
Without first running your code via a terminal.

> I would prefer is
> such things were optional and off by default, but other way not.

That's good to hear.

> If an IDE does this and you don't approve then you don't have to use
> it.

Yes.
And please let statistics and data science students know that.

Re: [Rd] closing R graphics windows?

2020-05-27 Thread Abby Spurdle

> The only annoying thing for me is that 'plot()' is not interruptible, so 
> neither Ctrl-C nor the window manager can stop a plot once it has started - 
> but I submitted a bug to fix this a long time ago. If I use the keyboard to 
> close the window while a plot is being drawn, then it has to finish drawing 
> before the window actually closes.

When R first shifted to Cairo, there was a noticeable performance
loss, which could be fixed by changing to the nondefault (but
traditional) "X11" graphics device.

But that was about ten years ago.
And at present, I'm not using Linux on a regular basis, and haven't
been following changes to the graphics devices.

Perhaps someone who is more familiar with graphics devices under
Linux, could comment on options to increase performance...?


P.S.
Cairo does produce high-quality graphics, better than any PDF viewer I've seen.

Re: [Rd] [External] Re: use of the tcltk package crashes R 4.0.1 for Windows

2020-06-07 Thread Abby Spurdle

On Mon, Jun 8, 2020 at 4:09 AM Fox, John  wrote:
> Does it make sense to withdraw the Windows R 4.0.1 binary until the issue is 
> resolved?

Yes, it does.
All the release reversions should be removed.

Re: [Rd] [External] Re: use of the tcltk package crashes R 4.0.1 for Windows

2020-06-07 Thread Abby Spurdle

sorry, release "versions"


Re: [Rd] Restrict package to load-only access - prevent attempts to attach it

2020-06-23 Thread Abby Spurdle

You could go one step down, print a note or a warning.

Also, you could combine different approaches:
Check for an (additional) environment variable.
If set, print a note, if not set, generate a warning (or an error).

That would prevent someone accidentally attaching your package, and
would discourage them from doing it.
But would also allow people to attach your package, if they really want to.
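
A sketch of the combined approach. The environment variable name MYPKG_ALLOW_ATTACH is hypothetical, and the R_CMD check is carried over from the .onAttach quoted below.

```r
.onAttach <- function(libname, pkgname) {
  # Stay silent during 'R CMD ...' runs, as in the quoted version.
  if (nzchar(Sys.getenv("R_CMD"))) return(invisible())

  msg <- paste0("Package ", sQuote(pkgname),
                " is meant to be used via imports or '::', not attached")

  # Hypothetical opt-in variable: a note if set, a warning otherwise.
  if (nzchar(Sys.getenv("MYPKG_ALLOW_ATTACH"))) {
    packageStartupMessage(msg)
  } else {
    warning(msg, call. = FALSE)
  }
  invisible()
}
```

Replacing warning() with stop() gives the stricter variant.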


On Wed, Jun 24, 2020 at 8:21 AM Henrik Bengtsson
 wrote:
>
> Hi,
>
> I'm developing a package whose API is only meant to be used in other
> packages via imports or pkg::foo().  There should be no need to attach
> this package so that its API appears on the search() path. As a
> maintainer, I want to avoid having it appear in search() conflicts by
> mistake.
>
> This means that, for instance, other packages should declare this
> package under 'Imports' or 'Suggests' but never under 'Depends'.  I
> can document this and hope that's how it's going to be used.  But, I'd
> like to make it explicit that this API should be used via imports or
> ::.  One approach I've considered is:
>
> .onAttach <- function(libname, pkgname) {
>if (nzchar(Sys.getenv("R_CMD"))) return()
>stop("Package ", sQuote(pkgname), " must not be attached")
> }
>
> This would produce an error if the package is attached.  It's
> conditioned on the environment variable 'R_CMD' set by R itself
> whenever 'R CMD ...' runs.  This is done to avoid errors in 'R CMD
> INSTALL' and 'R CMD check' "load tests", which formally are *attach*
> tests.  The above approach passes all the tests and checks I'm aware
> of and on all platforms.
>
> Before I ping the CRAN team explicitly, does anyone know whether this
> is a valid approach?  Do you know if there are alternatives for
> asserting that a package is never attached.  Maybe this is more
> philosophical where the package "contract" is such that all packages
> should be attachable and, if not, then it's not a valid R package.
>
> This is a non-critical topic but if it can be done it would be useful.
>
> Thanks,
>
> Henrik
>


Re: [Rd] Speed-up/Cache loadNamespace()

2020-07-20 Thread Abby Spurdle

It's possible to run R (or a C parent process) as a background process
via a named pipe, and then write script files to the named pipe.
However, the details depend on what shell you use.

The last time I tried (which was a long time ago), I created a small C
program to run R, read from the named pipe from within C, then wrote
its contents to R's standard input.

It might be possible to do it without the C program.
I haven't checked.


On Mon, Jul 20, 2020 at 3:50 AM Mario Annau  wrote:
>
> Dear all,
>
> in our current setting we have our packages stored on a (rather slow)
> network drive and need to invoke short R scripts (using RScript) in a
> timely manner. Most of the script's runtime is spent with package loading
> using library() (or loadNamespace to be precise).
>
> Is there a way to cache the package namespaces as listed in
> loadedNamespaces() and load them into memory before the script is executed?
>
> My first simplistic attempt was to serialize the environment output
> from loadNamespace() to a file and load it before the script is started.
> However, loading the object automatically also loads all the referenced
> namespaces (from the slow network share) which is undesirable for this use
> case.
>
> Cheers,
> Mario
>


Re: [Rd] Speed-up/Cache loadNamespace()

2020-07-20 Thread Abby Spurdle

Thank you Serguei and Gabor.
Great suggestions.

> If your R scripts contain "stop()" or "q('yes')" or any other error, it
> will end the Rscript process. Kind of watch-dog can be set for automatic
> relaunching if needed.

It should be possible to change the error handling behavior.
From within R:

options (error = function () NULL)

Or something better...

Also, it may be desirable to wipe the global environment (or parts of
it), after each script:

remove (list = ls (envir=.GlobalEnv, all.names=TRUE) )
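
An alternative sketch, using a different technique from the two one-liners above: each script is sourced inside tryCatch() and in a throwaway environment, so an error is reported rather than fatal, and the script's variables never reach the global environment. The function name run_script() is hypothetical. Note that a call to q() inside a script would still end the process.

```r
# Hypothetical runner: source one script in a throwaway environment,
# report errors instead of stopping, and discard the script's variables.
run_script <- function(path) {
    env <- new.env(parent = globalenv())
    tryCatch({
        source(path, local = env)   # evaluate the script inside 'env'
        TRUE                        # the script ran to completion
    }, error = function(e) {
        message("Script failed: ", conditionMessage(e))
        FALSE                       # the script raised an error
    })
}
```

The return value could be used by a watch-dog loop to log which scripts failed.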


Re: [Rd] trivial typo in ?Matrix::sparse.model.matrix.Rd

2020-07-21 Thread Abby Spurdle

> "No documentation for ‘sparse.model.matrix’ in
> specified packages and libraries", but it's there after
> "library(Ecfun)".  I find that interesting, because "Matrix" does not
> appear in the Ecfun DESCRIPTION file.

Not interesting.
Note the imports and depends fields.
(Of your own packages).

> AND I don't see 'repr = ("C",
> "R", "T")' in the "sparse.model.matrix" help file I do see.

Martin's comment used future tense.


Re: [Rd] trivial typo in ?Matrix::sparse.model.matrix.Rd

2020-07-21 Thread Abby Spurdle

>  By the way, Ecfun includes some 32 "suggests" and "imports".
> "Matrix" is not one of them, but it must be called by something else
> that's loaded by Ecfun, to get the result I got.

Spencer, I find some of your comments/questions on R packages extremely basic.
(Sorry if that sounds condescending, but I'm wondering if your
comments/questions would be better suited to R-package-devel?).

You need to look at the imports and depends fields.
(As stated in my previous post).

The *first* package on the Ecfun imports list, is fda, which is *your*
package (technically, contributor), and it has a dependency on the
Matrix package.

I'd recommend you read the documentation on writing R packages, and on
how package namespaces are handled.


Re: [Rd] trivial typo in ?Matrix::sparse.model.matrix.Rd

2020-07-22 Thread Abby Spurdle

> The *first* package on the Ecfun imports list, is fda, which is *your*
> package (technically, contributor), and it has a dependency on the
> Matrix package.

My post this morning might have come across the wrong way.
It's good that you're interested in software for numerical linear algebra.
(I only just worked out the importance of this, about a year ago).
And I may also have a closer look at the Matrix package, in the near future.


Re: [Rd] Seeding non-R RNG with numbers from R's RNG stream

2020-07-30 Thread Abby Spurdle

> 3. In C++: Draw millions of times from a Categorical(p) distribution, where
> "p" is recalculated after each draw

I don't see the need here.
It should be possible to generate all the random numbers, *in R*, and
in *one line* of R code.
Easy...

Then standard inversion sampling, can be used to transform the random
numbers, as necessary.
This may (?) benefit from a C/C++ implementation, but that can be kept
separate from the random number generation.
i.e. The C++ function takes a vector of random numbers from a uniform
distribution, then computes "draws" (from the desired distribution),
iteratively.
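
A rough sketch of the idea, written entirely in R for brevity (the iterative part is what might be moved to C++). The function name draw_categorical() and the rule used to update "p" are assumptions for illustration only; the real update rule is whatever the original application requires.

```r
# Inversion sampling for a Categorical(p) distribution:
# pick the first category whose cumulative probability exceeds u.
draw_categorical <- function(u, p) {
    k <- findInterval(u, cumsum(p)) + 1L
    min(k, length(p))   # guard against rounding error in cumsum(p)
}

set.seed(1)
n <- 10
u <- runif(n)           # all the uniform random numbers, in one line

p <- c(0.2, 0.3, 0.5)
draws <- integer(n)
for (i in seq_len(n)) {
    draws[i] <- draw_categorical(u[i], p)
    # Hypothetical update rule, for illustration only:
    # shift probability toward the category just drawn, then renormalize.
    p[draws[i]] <- p[draws[i]] + 0.1
    p <- p / sum(p)
}
```

Because all the randomness comes from the vector "u", the result is reproducible from R's seed alone, regardless of where the loop is implemented.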


Re: [Rd] M[cbind()] <- assignment with Matrix object loses attributes

2020-08-22 Thread Abby Spurdle

Hi Ben,

I had some problems reproducing this.
As far as I can tell *all* indexed assignments drop attributes.
(Maybe we have different versions).

I'm not an expert on S4, but I'm unenthusiastic about mixing slot (S4)
semantics with attribute (S3) semantics.
And str() excludes attributes, but attributes() includes slots.
Highlighting the problems here...

I think R should generate an error or a warning, if a user tries to
assign attributes to S4 objects.

In saying that, mixing OO design with numerical linear algebra is a gold mine...

On Tue, Aug 11, 2020 at 1:23 PM Ben Bolker  wrote:
>
>Does this constitute a bug, or is there something I'm missing?
> assigning sub-elements of a sparse Matrix via M[X]<-..., where X is a
> 2-column matrix, appears to drop user-assigned attributes. I dug around
> in the R code for Matrix trying to find the relevant machinery but my
> brain started to hurt too badly ...
>
> Will submit this as a bug if it seems warranted.
>
> library(Matrix)
> m1 <- matrix(1:9,3,3)
> m1 <- Matrix(m1)
> attr(m1,"junk") <- 12
> stopifnot(isTRUE(attr(m1,"junk")==12))  ## OK
> m1[cbind(1:2,2:3)] <- 1
> stopifnot(isTRUE(attr(m1,"junk")==12)) ## not OK
> attr(m1,"junk") ## NULL
>
>
> ## note I have to use the ugly stopifnot(isTRUE(...)) because a missing
> attribute returns NULL, an assignment to NULL returns NULL, and
> stopifnot(NULL) doesn't stop ...
>
>
> cheers
>
>   Ben Bolker
>


Re: [Rd] M[cbind()] <- assignment with Matrix object loses attributes

2020-08-22 Thread Abby Spurdle

> Hmm, really?  In `R Under development (unstable) (2020-08-14
> r79020)`, doing the indexed assignment with a regular matrix (as opposed
> to a Matrix) appears to preserve attributes.

I was referring to *Matrix* objects.
Sorry, if that wasn't clear.


Re: [Rd] more Matrix weirdness

2020-09-10 Thread Abby Spurdle

> > "These operators are also implicit S4 generics, but as
> > primitives, S4 methods will be dispatched only on S4
> > objects ‘x’."

> Yes, exactly,  very well found, Georgi!

I'm sorry Martin, but I don't understand your point here.

I'm assuming that you want the (S3) matrix, x, to be converted to an
(S4) Matrix.

However, this is not a question of method dispatch, as such.
But rather a question of type conversion (integer to numeric to complex, etc).

Specifically, can/should automatic type conversion, convert an S3 data
type to an S4 data type, even where user-defined data types are
involved?


Re: [Rd] Including full text of open source licenses in a package

2020-09-12 Thread Abby Spurdle

> > Including a copy of the license with the work is vital

Hmmm...
Agree.

Just for context:
CRAN has a history of being exceptionally useful and efficient.
In general, I don't support suggestions to change their submission policies.


Re: [Rd] more Matrix weirdness

2020-09-17 Thread Abby Spurdle

> There may be cases when changing the class of the left-hand side make sense 
> (such as one subclass of "Matrix" to another) but certainly not for the base 
> R vector classes.

I'm not sure what you mean by "not for the base R vector classes".
Historically, the simpler class (or mode) gets coerced to the more
complex class (or mode).

x <- y <- 1:10
y [1] <- 1

(class (x) == class (y) ) #FALSE

Also, I note the behavior of multiplication of a matrix with a Matrix.

library (Matrix)

m <- matrix (1:16, 4, 4)
M <- Matrix (1:16, 4, 4)

as.vector (class (m * M) )   #dgeMatrix
as.vector (class (M * m) )   #dgeMatrix
as.vector (class (m %*% M) ) #dgeMatrix
as.vector (class (M %*% m) ) #dgeMatrix

So, here also, the output is a Matrix, regardless of the type of
multiplication, or the order of the operands.

But the following surprised me:

k <- m
mode (k) <- "complex"
k %*% M


[Rd] Is it possible to simply the use of NULL slots (or at least improve the help files)?

2020-09-23 Thread Abby Spurdle

As far as I can tell, there's no trivial way to set arbitrary S4 slots to NULL.

Most of the online examples I can find, use setClassUnion and are
about 10 years old.
Which, in my opinion, is defective.
There's nothing "robust" about making something that should be
trivially simple, really complicated.

Maybe there is a simpler way, and I just haven't worked it out, yet.
But either way, could the documentation for the methods package be improved?
I can find any obvious info on NULL slots:

Introduction
Classes
Classes_Details
setClass
slot

Again, maybe I missed it.
Even setClassUnion, which is what's used in the online examples,
doesn't contain a NULL slot example.

One more thing:
The help file for setClassUnion, uses the term "superclass", incorrectly.

Its examples include the following:
setClassUnion("maybeNumber", c("numeric", "logical"))

If maybeNumber was the superclass of numeric, then every instance of
numeric would also be an instance of maybeNumber...


Re: [Rd] Is it possible to simply the use of NULL slots (or at least improve the help files)?

2020-09-23 Thread Abby Spurdle

Sorry, the title should be "simplify", and the third paragraph should
say "I can't".
(Don't know how I missed these).


Re: [Rd] Is it possible to simply the use of NULL slots (or at least improve the help files)?

2020-09-24 Thread Abby Spurdle

Hi Martin,
Thank you for your response.

I suspect that we're not going to agree on the main point.
Making it trivially simple (as in, say, Java) to set slots to NULL.
So, I'll move on to the other points here.

***Note that cited text uses excerpts only.***

>   setClassUnion("character_OR_NULL", c("character", "NULL"))
>   A = setClass("A", slots = c(x = "character_OR_NULL"))

I think the above construct needs to be documented much more clearly.
i.e. In the introductory and details pages for S4 classes.
This is something that many people will want to do.
And BasicClasses or NULL-class, are not the most obvious place to
start looking, either.
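
For reference, a short sketch of how the quoted construct behaves in use. The class names are the ones from the quoted text; the object names "a" and "rejected" are just for illustration.

```r
library(methods)

# Class union from the quoted text: a slot of this type accepts
# either a character vector or NULL.
setClassUnion("character_OR_NULL", c("character", "NULL"))
A <- setClass("A", slots = c(x = "character_OR_NULL"))

a <- A(x = "abc")
a@x <- NULL           # allowed, because NULL is a member of the union
is.null(a@x)

# Anything outside the union is still rejected at assignment.
rejected <- tryCatch({ a@x <- 1; FALSE }, error = function(e) TRUE)
```

So the slot remains type-checked; the union only widens the set of classes that pass the check.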

Also, I'd recommend the S4 authors, go one step further.
Include character_OR_NULL, numeric_OR_NULL, etc, or something similar,
in S4's predefined basic classes.
Otherwise, contributed packages will (eventually) end up with hundreds
of copies of these.

> setClassUnion("maybeNumber", c("numeric", "logical"))
> every instance of numeric _is_ a maybeNumber, e.g.,
> > is(1, "maybeNumber")
> [1] TRUE

> which I think is consistent with the use of 'superclass'

Not quite.

x <- structure (sqrt (37), class = c ("sqrt.prime", "numeric") )
is (x, "numeric") #TRUE
is (x, "maybeNumber") #FALSE

So now, an object x, is a numeric but not a maybeNumber.
Perhaps a class union should be described as a partial imitation of a
superclass, for the purpose of making slots more flexible.


B.


[Rd] Could "MedicalImaging" Be Changed to "ImageProcessing"?

2020-10-27 Thread Abby Spurdle

Dear List,

Regarding the task views:
I was wondering if it would be possible to replace the MedicalImaging
task view with an ImageProcessing task view?
Alternatively, a separate task view could be created.
However, I suspect that there would be considerable overlap.
In my opinion, a single task view for image processing, would be preferable.

Also, I note that the Spatial task view contains a section on raster data.
This seems like the wrong place.
i.e. A raster file, typically represents a photograph or other image-like data.
Hence, it seems more natural for it to be in an image processing task
view, than in a spatial task view.
But then there is a counter argument to that, if it's not covered in
any other task view, then it needs to go somewhere. And the spatial
task view is preferable, to nowhere.

When I think of spatial data, I think of things like coordinates for
weather satellites, or possibly a wireframe mesh. I suppose if a
raster file represented a matrix of height or depth values, then the
notion of image and spatial data coincide. But then there's a counter
argument to that, that the data should just be a plain text file with
a matrix, not a raster file.

Also, I note that the section on raster data needs to be updated.

A quick search resulted in a number of hits recommending two packages,
sp and raster.
However, a more thorough search resulted in more options, such as the
png and jpeg packages.
(Which have been on CRAN for about a decade).

I haven't tried the png and jpeg packages yet, but they seem like
attractive options.
So, I should be giving them a test drive later today, or maybe tomorrow.

One final comment.
As a somewhat broad generalization, there seems to be considerably
more overlap between machine learning and image processing
(specifically), than there is between machine learning and spatial
data (more generally). Hence image processing should be given higher
precedence than what it currently is, and information on image
processing should be presented in a way that's suitable to parties
interested in machine learning.


Re: [Rd] all.equal applied to function closures

2020-12-01 Thread Abby Spurdle

> Bill, I'm sure you've noticed that we did write  all.equal.environment()
> to work recursively... Actually, I had worked quite a bit at
> that, too long ago to remember details, but the relevant svn log
> entry is
> 
> r66640 | maechler | 2014-09-18 22:10:20 +0200 (Thu, 18 Sep 2014) | 1 line
>
> more sophisticated all.equal.environment(): no longer "simple" infinite 
> recursions
> 

I haven't checked the above reference.
But I would like to note the following behaviour:

#e group
e = new.env ()
e1 = new.env ()
e$e = e1
e1$e = e

#f group
f = new.env ()
f1 = new.env ()
f2 = new.env ()
f$e = f1
f1$e = f2
f2$e = f

all.equal (e, f)

I tried a number of examples with circular references.
All worked correctly, except for "identical" environments nested an
unequal number of times.

I suspect there may be other special cases.


Re: [Rd] New pipe operator

2020-12-05 Thread Abby Spurdle

> This is a good addition

I can't understand why so many people are calling this a "pipe".
Pipes connect processes, via their I/O streams.
Arguably, a more general interpretation would include sockets and files.

https://en.wikipedia.org/wiki/Pipeline_(Unix)
https://en.wikipedia.org/wiki/Named_pipe
https://en.wikipedia.org/wiki/Anonymous_pipe

As far as I can tell, the magrittr-like operators are functions (not
pipes), with nonstandard syntax.
This is not consistent with R's original design philosophy, building
on C, Lisp and S, along with lots of *important* math and stats.

It's possible that some parties are interested in creating a kind of
"data pipeline".
I'm interested in this myself, and I think we could discuss this more.
But I'm not convinced the magrittr-like operators help to achieve this goal.
Which, in my opinion, would require one to model programs as directed
graphs, along with some degree of asynchronous input.

Presumably, these operators will be added to R anyway, and (almost) no
one will listen to me.

So, I would like to make one suggestion:
Is it possible for these operators to *not* be named:
The R Pipe
The S Pipe
Or anything with a similar meaning.

Maybe tidy pipe, or something else that links it to its proponents?


Re: [Rd] anonymous functions

2020-12-07 Thread Abby Spurdle

I mostly agree with your comments on anonymous functions.

However, I think the main problem is cryptic-ness, rather than succinct-ness.
The backslash is a relatively universal symbol within programming
languages with C-like (ALGOL-like?) syntax.
Where it denotes escape sequences within strings.

Using the leading character for escape sequences, to define functions,
is like using integers to define floating point numbers:

my.integer <- as.integer (2) * pi

Arguably, the motive is more to be ultra-succinct than cryptic.
But either way, we get syntax which is difficult to read, from a
mathematical and statistical perspective.


On Tue, Dec 8, 2020 at 6:04 AM Therneau, Terry M., Ph.D. via R-devel
 wrote:
>
> “The shorthand form \(x) x + 1 is parsed as function(x) x + 1. It may be 
> helpful in making
> code containing simple function expressions more readable.”
>
> Color me unimpressed.
> Over the decades I've seen several "who can write the shortest code" threads: 
> in Fortran,
> in C, in Splus, ...   The same old idea that "short" is a synonym for either 
> elegant,
> readable, or efficient is now being recycled in the tidyverse.   The truth is 
> that "short"
> is actually an antonym for all of these things, at least for anyone else 
> reading the code;
> or for the original coder 30-60 minutes after the "clever" lines were 
> written.  Minimal
> use of the spacebar and/or the return key isn't usually held up as a goal, 
> but creeps into
> many practitioner's code as well.
>
> People are excited by replacing "function(" with "\("?  Really?   Are people 
> typing code
> with their thumbs?
> I am ambivalent about pipes: I think it is a great concept, but too many of 
> my colleagues
> think that using pipes = no need for any comments.
>
> As time goes on, I find my goal is to make my code less compact and more 
> readable.  Every
> bug fix or new feature in the survival package now adds more lines of 
> comments or other
> documentation than lines of code.  If I have to puzzle out what a line does, 
> what about
> the poor sod who inherits the maintenance?
>
>
> --
> Terry M Therneau, PhD
> Department of Health Science Research
> Mayo Clinic
> thern...@mayo.edu
>
> "TERR-ree THUR-noh"
>


Re: [Rd] anonymous functions

2020-12-07 Thread Abby Spurdle

Sorry, I should replace "cryptic-ness" from my last post, with
"unnecessary cryptic-ness".
Sometimes short symbolic expressions are necessary.


P.S.
Often, I wish I could write: f (x) = x^2.
But that's replacement function syntax.




Re: [Rd] quantile() names

2020-12-14 Thread Abby Spurdle

The "value" is *not* 975.
It's 975.025.

The results that you're observing, are merely the byproduct of formatting.

Maybe, you should try:

quantile (x, .975, type=4)

Which perhaps, using default options, produces the result you're expecting?
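
The arithmetic can be checked directly. This is a small sketch, with the variable names v7 and v4 chosen for illustration; unname() is used so that only the values are compared, independent of how the names are formatted.

```r
x <- 1:1000

# Default (type 7): position (n - 1) * p + 1 = 999 * 0.975 + 1 = 975.025
v7 <- unname(quantile(x, .975))

# Type 4: position n * p = 1000 * 0.975 = 975
v4 <- unname(quantile(x, .975, type = 4))

# Print with more digits than the options(digits=2) used in the question.
print(c(type7 = v7, type4 = v4), digits = 7)
```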


On Tue, Dec 15, 2020 at 8:55 AM Merkle, Edgar C.  wrote:
>
> All,
>
> Consider the code below
>
> options(digits=2)
> x <- 1:1000
> quantile(x, .975)
>
> The value returned is 975 (the 97.5th percentile), but the name has been 
> shortened to "98%" due to the digits option. Is this intended? I would have 
> expected the name to also be "97.5%" here. Alternatively, the returned value 
> might be 980 in order to match the name of "98%".
>
> Best,
> Ed
>
>


Re: [Rd] quantile() names

2020-12-16 Thread Abby Spurdle

CITED TEXT CONTAINS EXCERPTS ONLY

> and now we read more replies on this topic without anyone looking at
> the pure R source code which is pretty simple and easy.
> Instead, people do experiments and take time to muse about their findings..
> Honestly, I'm disappointed: I've always thought that if you
> *write* on R-devel, you should be able to figure out a few
> things yourself before that..

That's a bit unfair.
Some of us have written packages, containing functions for computing
quantile names:

 probhat::ntile.names (,100)


> 1) provide an optional argument   'digits = 7'
>back compatible w/ default getOption("digits")

I'm not sure I've got this right.
Are you suggesting that by default, names should have 7 digits?


> so I'm guessing it may make more people unhappy than other
> people happy if we change this now, after close to 23 years  .. ??

I would probably be in the less enthusiastic group.
I take the view that quantile naming is mainly a convenience, for
summary-style output.

And on that basis, I would say the current behaviour is about right.
Anyone looking for high precision, should probably compute their own
quantile names.


Also, expanding on an earlier point.
The value was 975.025, so a label of "97.5%" could still cause problems.
Increasing the precision doesn't necessarily fix this sort of problem.
But rather, increases the complexity of the output, beyond what
"97.5%" of users would ever want...


B.


Re: [Rd] quantile() names

2020-12-16 Thread Abby Spurdle

Sorry, I need to change my last post.

I looked at this a bit more, and realized that increasing the (max)
number of (name) digits is only relevant in some cases.
For people computing quartiles and deciles, this shouldn't make any difference.
Therefore, should still be convenient for the purposes of summary-style output.




Re: [Rd] Allowing S3 methods of rounding functions to take `...`

2021-01-28 Thread Abby Spurdle

That's a great suggestion Davis.

While, we're on the topic...
Could we have a "dots" argument in base::t, the transpose function?


On Fri, Jan 29, 2021 at 4:48 AM Davis Vaughan  wrote:
>
> I should also say that I would be willing to attempt a patch for this, if
> others agree that this would be useful.
>
> - Davis
>
> On Thu, Jan 28, 2021 at 9:14 AM Davis Vaughan  wrote:
>
> > Hi all,
> >
> > I would like to propose adding `...` to the signatures of the following
> > rounding functions:
> >
> > - floor(x)
> > - ceiling(x)
> > - round(x, digits = 0)
> > - And possibly signif(x, digits = 6)
> >
> > The purpose would be to allow S3 methods to add additional arguments as
> > required.
> >
> > A few arguments in favor of this change:
> >
> > `trunc(x, ...)` already takes dots, which sets a precedent for the others
> > to do so as well. It is documented in the same help file as the other
> > rounding functions.
> >
> > Internally at the C level, a check is done to ensure that there is exactly
> > 1 arg for floor() and ceiling(), and either 1 or 2 args for round(). The
> > actual names of those arguments are not checked, however, and I believe
> > this is what allows `round.Date(x, ...)` and `round.POSIXt(x, unit)` to
> > exist, solely because they have 2 arguments. It seems like this is a bit of
> > a hack, since you couldn't create something similar for floor, like
> > `floor.POSIXt(x, unit)` (not saying this should exist, it is just for
> > argument's sake), because the 1 argument check would error on this. I think
> > adding `...` to the signature of the generics would better support what is
> > being done here.
> >
> > Additionally, I have a custom date-like S3 class of my own that I would
> > like to write floor(), ceiling(), and round() methods for, and they would
> > require passing additional arguments.
> >
> > If R core would like to make this change, they could probably tweak
> > `do_trunc()` to be a bit more general, and use it for floor() and
> > ceiling(), since it already allows `...`.
> >
> > A few references:
> >
> > Check for 1 arg in do_math1(), used by floor() and ceiling()
> >
> > https://github.com/wch/r-source/blob/fe82da3baf849fcd3cc7dbc31c6abc72b57aa083/src/main/arithmetic.c#L1270
> >
> > Check for 2 args in do_Math2(), used by round()
> >
> > https://github.com/wch/r-source/blob/fe82da3baf849fcd3cc7dbc31c6abc72b57aa083/src/main/arithmetic.c#L1655
> >
> > do_trunc() definition that allows `...`
> >
> > https://github.com/wch/r-source/blob/fe82da3baf849fcd3cc7dbc31c6abc72b57aa083/src/main/arithmetic.c#L1329-L1340
> >
> > - Davis
> >
>


Re: [Rd] Allowing S3 methods of rounding functions to take `...`

2021-01-28 Thread Abby Spurdle

I've been writing functions for block matrices and more generally,
arrays of matrices.

Presumably, the default transpose operation would transpose everything.
But there are situations where one might want to transpose the
top-level matrix (of submatrices) but not the submatrices, themselves.
Or vice versa.

On a side note, the help file for base::aperm is entitled "Array Transposition".
So, this topic is not quite as simple as it may sound.

Interestingly, the aperm generic function *does* have dots.
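
As a rough sketch of why the dots matter here: because aperm() is declared with `...`, a method can accept extra arguments of its own, whereas the same call through t() is rejected before dispatch. The `block_matrix` class and the `outer`/`inner` arguments below are hypothetical names made up for illustration; nothing like them exists in base R.

```r
# Hypothetical S3 class: a list-matrix whose cells are ordinary matrices.
block_matrix <- function(blocks, nrow, ncol) {
  stopifnot(is.list(blocks), length(blocks) == nrow * ncol)
  dim(blocks) <- c(nrow, ncol)
  class(blocks) <- "block_matrix"
  blocks
}

# aperm() has dots in its signature, so this method can add arguments.
# `outer` transposes the top-level layout, `inner` transposes each submatrix.
aperm.block_matrix <- function(a, perm = NULL, ..., outer = TRUE, inner = TRUE) {
  blocks <- unclass(a)
  if (outer) blocks <- aperm(blocks, perm, ...)
  if (inner) blocks[] <- lapply(blocks, t)
  class(blocks) <- "block_matrix"
  blocks
}

b <- block_matrix(list(matrix(1:6, 2, 3), matrix(1:2, 1, 2),
                       matrix(1:4, 2, 2), matrix(1:3, 3, 1)), 2, 2)

top_only <- aperm(b, inner = FALSE)   # move the blocks, leave them untouched
both     <- aperm(b)                  # transpose everything

# t() is declared as function(x), so an extra argument fails immediately.
t_result <- tryCatch(t(b, inner = FALSE), error = function(e) "error")
```

With this layout, `top_only[[1, 2]]` is the original 1 x 2 block from position [2, 1], while `both[[1, 2]]` is its 2 x 1 transpose.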


On Fri, Jan 29, 2021 at 3:37 PM Gabriel Becker  wrote:
>
> Out of my naive curiosity, what arguments are you hoping a method for t()
> will take?
>
> I mean honestly an argument could be made that all S3 generics should take
> `...`. I don't think it's an overwhelmingly compelling one, but I do see some
> merit to it given what an S3 generic is at its core.
>
> ~G
>
> On Thu, Jan 28, 2021 at 5:27 PM Abby Spurdle  wrote:
>>
>> That's a great suggestion Davis.
>>
>> While we're on the topic...
>> Could we have a "dots" argument in base::t, the transpose function?
>>
>>
>> On Fri, Jan 29, 2021 at 4:48 AM Davis Vaughan  wrote:
>> >
>> > I should also say that I would be willing to attempt a patch for this, if
>> > others agree that this would be useful.
>> >
>> > - Davis
>> >
>> > On Thu, Jan 28, 2021 at 9:14 AM Davis Vaughan  wrote:
>> >
>> > > Hi all,
>> > >
>> > > I would like to propose adding `...` to the signatures of the following
>> > > rounding functions:
>> > >
>> > > - floor(x)
>> > > - ceiling(x)
>> > > - round(x, digits = 0)
>> > > - And possibly signif(x, digits = 6)
>> > >
>> > > The purpose would be to allow S3 methods to add additional arguments as
>> > > required.
>> > >
>> > > A few arguments in favor of this change:
>> > >
>> > > `trunc(x, ...)` already takes dots, which sets a precedent for the others
>> > > to do so as well. It is documented in the same help file as the other
>> > > rounding functions.
>> > >
>> > > Internally at the C level, a check is done to ensure that there is exactly
>> > > 1 arg for floor() and ceiling(), and either 1 or 2 args for round(). The
>> > > actual names of those arguments are not checked, however, and I believe
>> > > this is what allows `round.Date(x, ...)` and `round.POSIXt(x, unit)` to
>> > > exist, solely because they have 2 arguments. It seems like this is a bit of
>> > > a hack, since you couldn't create something similar for floor, like
>> > > `floor.POSIXt(x, unit)` (not saying this should exist, it is just for
>> > > argument's sake), because the 1 argument check would error on this. I think
>> > > adding `...` to the signature of the generics would better support what is
>> > > being done here.
>> > >
>> > > Additionally, I have a custom date-like S3 class of my own that I would
>> > > like to write floor(), ceiling(), and round() methods for, and they would
>> > > require passing additional arguments.
>> > >
>> > > If R core would like to make this change, they could probably tweak
>> > > `do_trunc()` to be a bit more general, and use it for floor() and
>> > > ceiling(), since it already allows `...`.
>> > >
>> > > A few references:
>> > >
>> > > Check for 1 arg in do_math1(), used by floor() and ceiling()
>> > >
>> > > https://github.com/wch/r-source/blob/fe82da3baf849fcd3cc7dbc31c6abc72b57aa083/src/main/arithmetic.c#L1270
>> > >
>> > > Check for 2 args in do_Math2(), used by round()
>> > >
>> > > https://github.com/wch/r-source/blob/fe82da3baf849fcd3cc7dbc31c6abc72b57aa083/src/main/arithmetic.c#L1655
>> > >
>> > > do_trunc() definition that allows `...`
>> > >
>> > > https://github.com/wch/r-source/blob/fe82da3baf849fcd3cc7dbc31c6abc72b57aa083/src/main/arithmetic.c#L1329-L1340
>> > >
>> > > - Davis
>> > >

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Potential improvements of ave?

2021-03-15 Thread Abby Spurdle

Hi Thomas,

These are some great suggestions.
But I can't help but feel there's a much bigger problem here.

Intuitively, the ave function could (or should) sort the data.
Then the indexing step becomes almost trivial, in terms of both time
and space complexity.
And the ave function is not the only example of where a problem
becomes much simpler, if the data is sorted.

Historically, I've never found base R functions user-friendly for
aggregation purposes, or for sorting.
(At least, not by comparison to SQL).

But that's not the main problem.
It would seem preferable to sort the data, only once.
(Rather than sorting it repeatedly, or not at all).

Perhaps, objects such as vectors and data.frame(s) could have a
boolean attribute, to indicate if they're sorted.
Or functions such as ave could have a sorted argument.
In either case, if true, the function assumes the data is sorted and
applies a more efficient algorithm.
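
To illustrate the idea, here is a minimal sketch that assumes the rows are already ordered so that equal keys sit next to each other. `diff_within_sorted_groups` is a hypothetical helper, not a base R function, and it hard-codes the c(diff(i), 0) computation from the example quoted below rather than taking an arbitrary FUN.

```r
# Hypothetical helper: grouped c(diff(x), 0) for data whose keys are already
# sorted (equal keys adjacent). No factor, split or per-group call is needed.
diff_within_sorted_groups <- function(x, key) {
  n <- length(x)
  stopifnot(length(key) == n)
  if (n == 0L) return(x)
  d <- c(diff(x), 0)
  # TRUE where the next row belongs to a different group (or there is none).
  last_in_group <- c(key[-1L] != key[-n], TRUE)
  d[last_in_group] <- 0
  d
}

x   <- c(5, 7, 4, 10, 13, 2)
key <- c("a", "a", "a", "b", "b", "c")

fast <- diff_within_sorted_groups(x, key)
ref  <- ave(x, key, FUN = function(i) c(diff(i), 0))
```

On this input both `fast` and `ref` come out as 2 -3 0 3 0 0. If the caller needs the result in the original row order, the data would first be reordered with order() and the result mapped back afterwards, which is the one-off sort mentioned above.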

B.

On Sat, Mar 13, 2021 at 1:07 PM SOEIRO Thomas  wrote:
>
> Dear all,
>
> I have two questions/suggestions about ave, but I am not sure if it's 
> relevant for bug reports.
>
>
>
> 1) I have performance issues with ave in a case where I didn't expect it. The 
> following code runs as expected:
>
> set.seed(1)
>
> df1 <- data.frame(id1 = sample(1:1e2, 5e2, TRUE),
>                   id2 = sample(1:3, 5e2, TRUE),
>                   id3 = sample(1:5, 5e2, TRUE),
>                   val = sample(1:300, 5e2, TRUE))
>
> df1$diff <- ave(df1$val,
>                 df1$id1,
>                 df1$id2,
>                 df1$id3,
>                 FUN = function(i) c(diff(i), 0))
>
> head(df1[order(df1$id1,
>                df1$id2,
>                df1$id3), ])
>
> But when expanding the data.frame (* 1e4), ave fails (Error: cannot allocate 
> vector of size 1110.0 Gb):
>
> df2 <- data.frame(id1 = sample(1:(1e2 * 1e4), 5e2 * 1e4, TRUE),
>                   id2 = sample(1:3, 5e2 * 1e4, TRUE),
>                   id3 = sample(1:(5 * 1e4), 5e2 * 1e4, TRUE),
>                   val = sample(1:300, 5e2 * 1e4, TRUE))
>
> df2$diff <- ave(df2$val,
>                 df2$id1,
>                 df2$id2,
>                 df2$id3,
>                 FUN = function(i) c(diff(i), 0))
>
> This use case does not seem extreme to me (e.g. aggregate et al work 
> perfectly on this data.frame).
> So my question is: Is this expected/intended/reasonable? i.e. Does ave need 
> to be optimized?
>
>
>
> 2) Gabor Grothendieck pointed out in 2011 that drop = TRUE is needed to avoid 
> warnings in case of unused levels 
> (https://stat.ethz.ch/pipermail/r-devel/2011-February/059947.html).
> Is it relevant/possible to expose the drop argument explicitly?
>
>
>
> Thanks,
>
> Thomas
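
One possible workaround for the allocation failure quoted above, sketched under the assumption that it comes from enumerating every combination of the grouping levels (roughly 1e6 * 3 * 5e4 = 1.5e11 of them, which is consistent with the size in the error message) rather than only the combinations that occur in the data. `ave_observed` is a hypothetical helper, not part of base R; it swaps the level cross-product for a pasted key, so memory grows with the number of rows instead.

```r
# Hypothetical helper with the same calling convention as ave(), but the
# grouping key is built only from combinations that actually occur.
# Assumption: no key value contains the separator "\r".
ave_observed <- function(x, ..., FUN = mean) {
  keys <- list(...)
  if (length(keys) == 0L) {
    x[] <- FUN(x)
    return(x)
  }
  g <- do.call(paste, c(keys, sep = "\r"))
  split(x, g) <- lapply(split(x, g), FUN)
  x
}

# Same small data set as in the message above.
set.seed(1)
df1 <- data.frame(id1 = sample(1:1e2, 5e2, TRUE),
                  id2 = sample(1:3, 5e2, TRUE),
                  id3 = sample(1:5, 5e2, TRUE),
                  val = sample(1:300, 5e2, TRUE))

step <- function(i) c(diff(i), 0)
res  <- ave_observed(df1$val, df1$id1, df1$id2, df1$id3, FUN = step)
```

The result keeps the input order, as ave() does. Whether this is fast enough on the 5e6-row case has not been measured here.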

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Faster sorting algorithm...

2021-03-15 Thread Abby Spurdle

In principle, I agree that faster ranking/sorting algorithms are
important, and should be a priority.
But I can't help but feel that the paper focuses on textbook-oriented problems.

Given that in real world problems, there's almost always some form of
prior knowledge:
Wouldn't it be better, from a management perspective, to focus on
sorting algorithms, that incorporate that prior knowledge?
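
A small sketch of what "prior knowledge" could mean in practice, assuming the values are known in advance to be integers between 1 and some modest bound. `counting_sort` and `max_value` are hypothetical names; this is the textbook counting sort, not anything from the fsort slides.

```r
# Hypothetical helper: sort integers known to lie in 1..max_value by counting
# occurrences instead of comparing elements.
counting_sort <- function(x, max_value) {
  stopifnot(all(x >= 1L), all(x <= max_value))
  counts <- tabulate(x, nbins = max_value)
  rep.int(seq_len(max_value), counts)
}

x <- c(3L, 1L, 2L, 3L, 1L)
sorted <- counting_sort(x, max_value = 3L)
```

Here `sorted` equals `sort(x)`. The work is proportional to length(x) + max_value, which only pays off when the bound is small relative to the data.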

I'm not sure whether that's an R-devel discussion, or for another forum...


On Tue, Mar 16, 2021 at 5:25 AM Morgan Morgan  wrote:
>
> Hi,
> I am not sure if this is the right mailing list, so apologies in advance if
> it is not.
>
> I found the following link/presentation:
> https://www.r-project.org/dsc/2016/slides/ParallelSort.pdf
>
> The implementation of fsort is interesting but incomplete (not sure why?)
> and can be improved or made faster (at least 25%  I believe). I might be
> wrong but there are maybe a couple of bugs as well.
>
> My questions are:
>
> 1/ Is the R Core team interested in a faster sorting algo? (Multithread or
> even single threaded)
>
> 2/ I see an issue with the license, which is MPL-2.0, and hence not
> compatible with base R, Python and Julia. Is there an interest to change
> the license of fsort so all 3 languages (and all the people using these
> languages) can benefit from it? (Like suggested on the first page)
>
> Please let me know if there is an interest to address the above points, I
> would be happy to look into it (free of charge of course!).
>
> Thank you
> Best regards
> Morgan

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Potential improvements of ave?

2021-03-16 Thread Abby Spurdle

There are some relatively obvious examples:
unique, which.min/which.max/etc, range/min/max, quantile, aggregate/split

Also, many timeseries, graphics and spline functions are dependent on the order.

In the case of data.frame(s), a boolean flag would probably need to be
extended to allow for multiple column sorting, and
ascending/descending options.
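
For a few of the functions listed above, a sketch of the shortcuts that become available once a vector is known to be sorted. The three helpers are hypothetical and assume ascending order with no NAs; is.unsorted() is the existing base R check.

```r
# Hypothetical helpers; each assumes x is sorted ascending and has no NAs.
range_sorted <- function(x) {
  stopifnot(length(x) > 0L)
  x[c(1L, length(x))]                 # first and last element
}

unique_sorted <- function(x) {
  n <- length(x)
  if (n < 2L) return(x)
  x[c(TRUE, x[-1L] != x[-n])]         # keep the first element of each run
}

median_sorted <- function(x) {
  n <- length(x)
  stopifnot(n > 0L)
  mean(x[c((n + 1L) %/% 2L, (n + 2L) %/% 2L)])   # middle one or two elements
}

x <- sort(c(4, 1, 3, 1, 9, 4))
stopifnot(!is.unsorted(x))

r <- range_sorted(x)
u <- unique_sorted(x)
m <- median_sorted(x)
```

Each helper reads only a handful of positions or makes a single pass, where the general versions have to scan, hash or sort.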

On Tue, Mar 16, 2021 at 11:08 AM Gabriel Becker  wrote:
>
> Abby,
>
> Vectors do have an internal mechanism for knowing that they are sorted via 
> ALTREP (it was one of 2 core motivating features for 'smart vectors' the 
> other being knowledge about presence of NAs).
>
> Currently I don't think we expose it at the R level, though it is part of the 
> official C API. I don't know of any plans for this to change, but I suppose 
> it could. Plus for functions in R itself, we could even use it without 
> exposing it more widely. A number of functions, including sort itself, 
> already do this in fact, but more could. I'd be interested in hearing which 
> functions you think would particularly benefit from this.
>
> ~G
>
> On Mon, Mar 15, 2021 at 12:01 PM SOEIRO Thomas  wrote:
>>
>> Hi Abby,
>>
>> Thank you for your positive feedback.
>>
>> I agree with your general comment about sorting.
>>
>> For ave specifically, ordering may not help because the output must maintain
>> the order of the input (as ave returns only x and not the entire
>> data.frame).
>>
>> Thanks,
>>
>> Thomas
>> 
>> From: Abby Spurdle
>> Sent: Monday, March 15, 2021 10:22
>> To: SOEIRO Thomas
>> Cc: r-devel@r-project.org
>> Subject: Re: [Rd] Potential improvements of ave?
>>
>> EXTERNAL EMAIL - HANDLE LINKS AND FILES WITH CAUTION
>>
>> Hi Thomas,
>>
>> These are some great suggestions.
>> But I can't help but feel there's a much bigger problem here.
>>
>> Intuitively, the ave function could (or should) sort the data.
>> Then the indexing step becomes almost trivial, in terms of both time
>> and space complexity.
>> And the ave function is not the only example of where a problem
>> becomes much simpler, if the data is sorted.
>>
>> Historically, I've never found base R functions user-friendly for
>> aggregation purposes, or for sorting.
>> (At least, not by comparison to SQL).
>>
>> But that's not the main problem.
>> It would seem preferable to sort the data, only once.
>> (Rather than sorting it repeatedly, or not at all).
>>
>> Perhaps, objects such as vectors and data.frame(s) could have a
>> boolean attribute, to indicate if they're sorted.
>> Or functions such as ave could have a sorted argument.
>> In either case, if true, the function assumes the data is sorted and
>> applies a more efficient algorithm.
>>
>>
>> B.
>>
>>
>> On Sat, Mar 13, 2021 at 1:07 PM SOEIRO Thomas  wrote:
>> >
>> > Dear all,
>> >
>> > I have two questions/suggestions about ave, but I am not sure if it's 
>> > relevant for bug reports.
>> >
>> >
>> >
>> > 1) I have performance issues with ave in a case where I didn't expect it. 
>> > The following code runs as expected:
>> >
>> > set.seed(1)
>> >
>> > df1 <- data.frame(id1 = sample(1:1e2, 5e2, TRUE),
>> >                   id2 = sample(1:3, 5e2, TRUE),
>> >                   id3 = sample(1:5, 5e2, TRUE),
>> >                   val = sample(1:300, 5e2, TRUE))
>> >
>> > df1$diff <- ave(df1$val,
>> >                 df1$id1,
>> >                 df1$id2,
>> >                 df1$id3,
>> >                 FUN = function(i) c(diff(i), 0))
>> >
>> > head(df1[order(df1$id1,
>> >                df1$id2,
>> >                df1$id3), ])
>> >
>> > But when expanding the data.frame (* 1e4), ave fails (Error: cannot 
>> > allocate vector of size 1110.0 Gb):
>> >
>> > df2 <- data.frame(id1 = sample(1:(1e2 * 1e4), 5e2 * 1e4, TRUE),
>> >                   id2 = sample(1:3, 5e2 * 1e4, TRUE),
>> >                   id3 = sample(1:(5 * 1e4), 5e2 * 1e4, TRUE),
>> >                   val = sample(1:300, 5e2 * 1e4, TRUE))
>> >
>> > df2$diff <- ave(df2$val,
>> >                 df2$id1,
>> >                 df2$id2,
>> >                 df2$id3,
>> >                 FUN = function(i) c(diff(i), 0))
>> >
>> > This

[Rd] python-based examples within core-package help files

2021-04-05 Thread Abby Spurdle

I just noticed the following:
(Within the help file for methods::is).

supers <- extends("PythonInterface")
superRelations <- extends("PythonInterface", fullInfo = TRUE)

I was wondering:
Could we please *not* have python-based examples within core help files.

Furthermore, this example has no obvious relevance to mathematical or
statistical subject matter.

Maybe it could be rewritten to use the Matrix package.
Which would be 1000x better.
:)

I'm happy to add that to my todo list, if no one else volunteers...
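
As a starting point, a sketch of what a Matrix-based version of those two lines might look like. This is only an assumption about how the example could be rewritten, not proposed wording for the help file; "dgCMatrix" is picked as a familiar class, and the block is guarded because Matrix may not be installed.

```r
library(methods)

# Guarded so the sketch does nothing on a system without Matrix.
if (requireNamespace("Matrix", quietly = TRUE)) {
  library(Matrix)

  # Same two calls as in the current help file, with a Matrix class.
  supers <- extends("dgCMatrix")
  superRelations <- extends("dgCMatrix", fullInfo = TRUE)

  # A small sparse matrix to go with is().
  m <- sparseMatrix(i = 1, j = 2, x = 3, dims = c(2, 2))
  m_is_sparse <- is(m, "sparseMatrix")
}
```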

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] Using R on kFreeBSD/Debian Hybrid

2021-07-24 Thread Abby Spurdle

Dear All,

I'm considering installing a kFreeBSD/Debian hybrid operating system.
i.e. A Debian OS but with a FreeBSD kernel.
Which subsequently requires running (and compiling) R.
And at some point, looking at the relationship between the R source and Fortran.

I thought I'd be prudent, and ask if anyone has done this before?
Also, if I should expect nontrivial problems (especially compiling R),
and if so, how nontrivial?
Hopefully, someone more experienced than I, can offer useful advice on
this subject...

Note (1):
I wasn't sure which mailing list this should go on.
So, just decided to roll with R-devel.

Note (2):
If I understand things correctly, the most recent version is based on
Debian 8, so there are possible problems there.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
