[Rd] How to use `[` without evaluating the arguments.

2020-09-25 Thread Eeles, Christopher
Hello R-devel,

I am currently attempting to implement an API similar to data.table wherein 
single bracket subsetting can accept an unquoted expression to be evaluated in 
the context of my object.

A simple example from the data.table package looks like this:


DT <- data.table(col1 = c('a', 'b', 'c'), col2 = c('x', 'y', 'z'))
DT[col1 == 'a']

Where the expression i in DT[i, j] is captured with substitute then evaluated 
inside the DT object.
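
For illustration, the capture-then-evaluate pattern described above can be sketched in base R without data.table. This is only a minimal sketch: `filter_rows` is a hypothetical helper introduced here, not data.table's actual implementation, and it assumes the expression should see the table's columns first and the caller's variables second.

```r
# Hypothetical helper (not part of data.table): filter the rows of a
# data.frame using an unquoted expression.
filter_rows <- function(df, i) {
  # Capture the expression typed by the caller without evaluating it.
  expr <- substitute(i)
  # Evaluate it with the columns of `df` in scope; names that are not
  # columns are looked up in the caller's environment.
  keep <- eval(expr, envir = df, enclos = parent.frame())
  df[keep, , drop = FALSE]
}

df <- data.frame(col1 = c('a', 'b', 'c'), col2 = c('x', 'y', 'z'))
res <- filter_rows(df, col1 == 'a')
```

Here `res` holds only the row where `col1` is 'a', even though `col1` is not a variable in the calling environment.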

Reviewing the source code from data.table, it seems that they implemented this 
feature simply by defining a new S3 method on `[` called `[.data.table`. I 
tried to replicate this API as follows.

I have defined an S4 class which contains an S3 class as follows:


#' Define an S3 Class
#'
#' Allows use of S3 methods with new S4 class. This is required to overcome
#' limitations of the `[` S4 method.
#'
setOldClass('long.table')

#' LongTable class definition
#'
#' Define a private constructor method to be used to build a `LongTable` object.
#'
#' @param drugs [`data.table`]
#' @param cells [`data.table`]
#' @param assays [`list`]
#' @param metadata [`list`]
#'
#'
#' @return [`LongTable`] object containing the assay data from a
#'
#' @import data.table
#' @keywords internal
.LongTable <- setClass("LongTable",
    slots=list(rowData='data.table',
               colData='data.table',
               assays='list',
               metadata='list',
               .intern='environment'),
    contains='long.table')

#' LongTable constructor method
#'
#' @param rowData [`data.table`, `data.frame`, `matrix`] A table-like object
#'   coercible to a `data.table` containing a unique `rowID` column which
#'   is used to key assays, as well as additional row metadata to subset on.
#' @param rowIDs [`character`, `integer`] A vector specifying
#'   the names or integer indexes of the row data identifier columns. These
#'   columns will be pasted together to make up the row.names of the
#'   `LongTable` object.
#' @param colData [`data.table`, `data.frame`, `matrix`] A table-like object
#'   coercible to a `data.table` containing a unique `colID` column which
#'   is used to key assays, as well as additional column metadata to subset on.
#' @param colIDs [`character`, `integer`] A vector specifying
#'   the names or integer indexes of the col data identifier columns. These
#'   columns will be pasted together to make up the col.names of the
#'   `LongTable` object.
#' @param assays A [`list`] containing one or more objects coercible to a
#'   `data.table`, and keyed by rowID and colID corresponding to the rowID and
#'   colID columns in colData and rowData.
#' @param metadata A [`list`] of metadata associated with the `LongTable`
#'   object being constructed
#' @param keep.rownames [`logical` or `character`] Logical: whether rownames
#'   should be added as a column if coercing to a `data.table`, default is
#'   FALSE. If TRUE, rownames are added to the column 'rn'. Character: specify
#'   a custom column name to store the rownames in.
#'
#' @return [`LongTable`] object
#'
#' @import data.table
#' @export
LongTable <- function(rowData, rowIDs, colData, colIDs, assays,
                      metadata=list(), keep.rownames=FALSE) {

    ## TODO:: Handle missing parameters

    if (!is(colData, 'data.table')) {
        colData <- data.table(colData, keep.rownames=keep.rownames)
    }

    if (!is(rowData, 'data.table')) {
        rowData <- data.table(rowData, keep.rownames=keep.rownames)
    }

    if (!all(vapply(assays, FUN=is.data.table, FUN.VALUE=logical(1)))) {
        tryCatch({
            assays <- lapply(assays, FUN=data.table,
                keep.rownames=keep.rownames)
        }, warning = function(w) {
            warning(w)
        }, error = function(e) {
            # A condition handler receives only the condition object; `assays`
            # is found in the enclosing function environment.
            message(e)
            types <- lapply(assays, typeof)
            stop(paste0('List items are types: ',
                paste0(types, collapse=', '),
                '\nPlease ensure all items in the assays list are ',
                'coerced to data.tables!'))
        })
    }

    # Initialize the .internals object to store private metadata for a LongTable
    internals <- new.env()

    ## TODO:: Implement error handling
    # Use `<=` so that an index pointing at the last column is accepted
    internals$rowIDs <-
        if (is.numeric(rowIDs) && max(rowIDs) <= ncol(rowData))
            rowIDs
        else
            which(colnames(rowData) %in% rowIDs)
    lockBinding('rowIDs', internals)

    internals$colIDs <-
        if (is.numeric(colIDs) && max(colIDs) <= ncol(colData))
            colIDs
        else
            which(colnames(colData) %in% colIDs)
    lockBinding('colIDs', internals)

    # Assemble the pseudo row and column names for the LongTable
    .pasteColons <- function(...) paste(..., collapse=':')
    rowData[, `:=`(.rownames=mapply(.pasteColons, transpose(.SD))),
        .SDcols=internals$rowIDs]
    colData[, `:=`(.colnames=mapply(

[Rd] Extra "Note" in CRAN submission

2020-09-25 Thread Therneau, Terry M., Ph.D. via R-devel
When I run R CMD check on the survival package I invariably get a note:
...
* checking for file ‘survival/DESCRIPTION’ ... OK
* this is package ‘survival’ version ‘3.2-6’
* checking CRAN incoming feasibility ... NOTE
Maintainer: ‘Terry M Therneau ’
...

This is sufficient for the auto-check process to return the following failure 
message:

Dear maintainer,

  
package survival_3.2-6.tar.gz does not pass the incoming checks automatically, 
please see the following pre-tests:
Windows:
Status: 1 NOTE
Debian:
Status: 1 NOTE


--

In the interest of smoothing things out for the CRAN maintainers I would like 
to make this message go away, but I don't see how.  Below is the DESCRIPTION 
file. Thanks in advance for any hints.

Terry T.

--

Title: Survival Analysis
Maintainer: Terry M Therneau 
Priority: recommended
Package: survival
Version: 3.2-6
Date: 2020-09-24
Depends: R (>= 3.4.0)
Imports: graphics, Matrix, methods, splines, stats, utils
LazyData: Yes
LazyLoad: Yes
ByteCompile: Yes
Authors@R: c(person(c("Terry", "M"), "Therneau",
                    email="therneau.te...@mayo.edu",
                    role=c("aut", "cre")),
             person("Thomas", "Lumley", role=c("ctb", "trl"),
                    comment="original S->R port and R maintainer until 2009"),
             person("Atkinson", "Elizabeth", role="ctb"),
             person("Crowson", "Cynthia", role="ctb"))
Description: Contains the core survival analysis routines, including
 definition of Surv objects,
 Kaplan-Meier and Aalen-Johansen (multi-state) curves, Cox models,
 and parametric accelerated failure time models.
License: LGPL (>=2)
URL: https://github.com/therneau/survival



__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Extra "Note" in CRAN submission

2020-09-25 Thread Henrik Singmann
Dear Terry,

You misunderstand the note. The problem is that your tar.gz is too
large. The critical bit is:
Size of tarball: 7528635 bytes

The CRAN repository policies state:
"Packages should be of the minimum necessary size. Reasonable
compression should be used for data (not just .rda files) and PDF
documentation: CRAN will if necessary pass the latter through qpdf.
As a general rule, neither data nor documentation should exceed 5MB
(which covers several books). A CRAN package is not an appropriate way
to distribute course notes, and authors will be asked to trim their
documentation to a maximum of 5MB.
Where a large amount of data is required (even after compression),
consideration should be given to a separate data-only package which
can be updated only rarely (since older versions of packages are
archived in perpetuity)."

But I agree with you that this is the one note that is very easy to
overlook as it is kind of hidden after the name bit.
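
To make the arithmetic concrete, here is a minimal sketch of a local size check one could run before submitting. The helper name `tarball_too_large` is hypothetical, and the 5 MB threshold is an assumption taken from the policy text quoted above; the exact limit used by the incoming check is not stated in this thread.

```r
# Hypothetical helper: TRUE if the file at `path` is larger than `limit_bytes`.
# The default of 5 MB is an assumption based on the quoted policy text.
tarball_too_large <- function(path, limit_bytes = 5 * 1024^2) {
  size <- file.size(path)            # NA if the file does not exist
  if (is.na(size)) stop("file not found: ", path)
  size > limit_bytes
}

# Demonstration on a small temporary file standing in for a tarball.
tmp <- tempfile(fileext = ".tar.gz")
writeBin(raw(1024), tmp)
small_ok <- !tarball_too_large(tmp)

# The size reported in the note, compared against the assumed limit.
reported_over <- 7528635 > 5 * 1024^2
```

With these assumptions, the 1 KB stand-in passes and the reported 7528635 bytes is over the limit.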

Best,
Henrik

On Fri, 25 Sep 2020 at 12:59, Therneau, Terry M., Ph.D.
via R-devel wrote:
>
> When I run R CMD check on the survival package I invariably get a note:
> ...
> * checking for file ‘survival/DESCRIPTION’ ... OK
> * this is package ‘survival’ version ‘3.2-6’
> * checking CRAN incoming feasibility ... NOTE
> Maintainer: ‘Terry M Therneau ’
> ...



-- 
Dr. Henrik Singmann
Lecturer, Experimental Psychology
University College London (UCL), UK
http://singmann.org



Re: [Rd] How to use `[` without evaluating the arguments.

2020-09-25 Thread Hugh Parsonage
This works as expected:

"[.foo" <- function(x, i, j) {
  sx <- substitute(x)
  si <- substitute(i)
  sj <- substitute(j)
  100 * length(sx) + 10 * length(si) + length(sj)
}

x <- 1:10
class(x) <- "foo"
x[y == z, a(x)]
#> [1] 132

Note that in your implementation you ask the function to evaluate the
expression. You may have been intending to recompose the calls from
the substituted values of x, i, j and evaluate this new call.
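
One possible reading of that suggestion, as a minimal sketch: capture `i` and `j` with `substitute()` and evaluate them with the object's columns in scope. The class name `foo2` and its list-of-columns layout are hypothetical and chosen only for this example; this is not the data.table implementation.

```r
# Hypothetical class "foo2": a plain list of equal-length columns.
"[.foo2" <- function(x, i, j) {
  cols <- unclass(x)          # plain list, so no re-dispatch on `[`
  caller <- parent.frame()    # where non-column names are looked up
  if (!missing(i)) {
    # Evaluate the captured row expression against the columns.
    rows <- eval(substitute(i), cols, caller)
    cols <- lapply(cols, function(col) col[rows])
  }
  if (!missing(j)) {
    # Evaluate the captured column expression against the filtered columns.
    return(eval(substitute(j), cols, caller))
  }
  structure(cols, class = "foo2")
}

obj <- structure(list(col1 = c("a", "b", "c"), col2 = 1:3), class = "foo2")
threshold <- 1
picked <- obj[col2 > threshold]
total <- obj[col2 > threshold, sum(col2)]
```

Here `col2` is resolved inside the object while `threshold` comes from the calling environment, which is the behaviour the original question asks for.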

On Fri, 25 Sep 2020 at 20:02, Eeles, Christopher
 wrote:
>
> Hello R-devel,
>
> I am currently attempting to implement an API similar to data.table wherein 
> single bracket subsetting can accept an unquoted expression to be evaluated 
> in the context of my object.
>
> A simple example from the data.table package looks like this:
>
>
> DT <- data.table(col1 = c('a', 'b', 'c'), col2 = c('x', 'y', 'z'))
> DT[col1 == 'a']
>
> Where the expression i in DT[i, j] is captured with substitute then evaluated 
> inside the DT object.