Re: [Rd] Conventions: Use of globals and main functions

2019-08-26 Thread Gábor Csárdi
That is unfortunately wrong, though. Whether the script runs as "main"
and whether R is in interactive mode are independent properties. I
guess most of the time it works, because _usually_ you run the whole
script (main()) in non-interactive mode, and source() it in
interactive mode, but this is not necessarily always the case, e.g.
you might want to source() in non-interactive mode to run some tests,
or use the functions of the script in another script, in which cases
you don't want to run main().
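
A minimal sketch of the difference, assuming a throwaway script written to a temporary file (the names `called_as_main` and `env` are made up for this illustration). When the file is source()d, the `sys.calls()` test is false whatever `interactive()` reports:

```r
# Hypothetical script contents, written to a temporary file for the demo.
script <- tempfile(fileext = ".R")
writeLines(c(
  "main <- function() message('running main()')",
  "called_as_main <- is.null(sys.calls())  # TRUE only at top level",
  "if (called_as_main) {",
  "  main()",
  "}"
), script)

# source() evaluates the file from inside a function call, so the call
# stack is not empty and main() is skipped.
env <- new.env()
source(script, local = env)
env$called_as_main

# interactive() is a separate property of the session; it does not change
# because of how the file was run.
interactive()
```

Running the same file with `Rscript` evaluates it at top level, where `sys.calls()` returns NULL, so main() would run there.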

G.

On Sun, Aug 25, 2019 at 11:47 PM Cyclic Group Z_1 wrote:
>
> This seems like a nice idiom; I've seen others use
> if (!interactive()) {
>   main()
> }
> to a similar effect.
>
> Best,
> CG
>
> On Sunday, August 25, 2019, 01:16:06 AM CDT, Gábor Csárdi wrote:
>
>
> This is what I usually put in scripts:
>
> if (is.null(sys.calls())) {
>   main()
> }
>

[...]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] wrap_logical warning message when loading objects created in R 3.6 in an R 3.5 session

2019-08-26 Thread petr smirnov
Hi,

I get a warning message when I load, in R 3.5.*, a large R object that
was created in an R 3.6 session:

```
Warning message:
In load("GDSCv2.RData") :
 cannot unserialize ALTVEC object of class ‘wrap_logical’ from package
‘base’; returning length zero vector
```
It may be relevant that the large R object (a data structure defined in
my Bioconductor package PharmacoGx) was created in part from data frames
that were cast from data.tables. I noticed that the ALTVEC class had
previously caused some errors in the data.table package.

I have two questions:
1. Should I be concerned about this warning? I cannot seem to find
what effect it has on the data loaded.
2. Could you point me towards narrowing down the cause of this issue?
Ideally, everyone would upgrade R promptly, but even our own
institute's HPC cluster is still on 3.5, and the warning does not
inspire confidence for some of the less technical members of our group
who are using the datasets.


-- 
Petr Smirnov
PhD Candidate,
Benjamin Haibe-Kains Lab
Princess Margaret Cancer Centre
University of Toronto

psmirnov2...@gmail.com



Re: [Rd] wrap_logical warning message when loading objects created in R 3.6 in an R 3.5 session

2019-08-26 Thread Duncan Murdoch

On 23/08/2019 4:22 p.m., petr smirnov wrote:

> Hi,
>
> I get a warning message when I load, in R 3.5.*, a large R object that
> was created in an R 3.6 session:
>
> ```
> Warning message:
> In load("GDSCv2.RData") :
>   cannot unserialize ALTVEC object of class ‘wrap_logical’ from package
> ‘base’; returning length zero vector
> ```
> It may be relevant that the large R object (a data structure defined in
> my Bioconductor package PharmacoGx) was created in part from data frames
> that were cast from data.tables. I noticed that the ALTVEC class had
> previously caused some errors in the data.table package.
>
> I have two questions:
> 1. Should I be concerned about this warning? I cannot seem to find
> what effect it has on the data loaded.


That object is not being loaded.  If the object is important to you, 
then you should be concerned.



> 2. Could you point me towards narrowing down the cause of this issue?


From time to time the format of saved objects changes.  There was such 
a change in version 3.5.0 of R. Older versions cannot read files in a 
format that was invented after they were released.


I would have expected 3.5.0 to be able to read all the current formats, 
but it appears you have an object new to 3.6.0 that needs something that 
doesn't exist in 3.5.0.  The solution is probably to use 3.6.x to read 
that file, and then save it with "version=2" as an argument.  That 
should cause it to use the older format.
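
A minimal sketch of that re-save step, to be run in R 3.6.x. The object and file names below are placeholders, not the poster's actual data; with the real file one would first `load("GDSCv2.RData")` and then save the loaded objects the same way.

```r
# 'gdsc_demo' and the file path are placeholders for illustration only.
gdsc_demo <- data.frame(drug = c("A", "B"), sensitive = c(TRUE, FALSE))

old_format_file <- file.path(tempdir(), "gdsc_demo_v2.RData")

# version = 2 asks save() for the older workspace format instead of the
# version 3 format that R 3.6.0 writes by default.
save(gdsc_demo, file = old_format_file, version = 2)

# Check the round trip in a clean environment.
check_env <- new.env()
load(old_format_file, envir = check_env)
identical(check_env$gdsc_demo, gdsc_demo)
```

`saveRDS()` accepts the same `version` argument for single objects.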


Duncan Murdoch



> Ideally, everyone would upgrade R promptly, but even our own
> institute's HPC cluster is still on 3.5, and the warning does not
> inspire confidence for some of the less technical members of our group
> who are using the datasets.




Re: [Rd] Conventions: Use of globals and main functions

2019-08-26 Thread Cyclic Group Z_1 via R-devel
Right, I did not mean to imply that these tests are equivalent, only that
both similarly prevent main() from running in some contexts.

Best,
CG



Re: [Rd] Conventions: Use of globals and main functions

2019-08-26 Thread William Dunlap via R-devel
Duncan Murdoch wrote:
>  Scripts are for throwaways, not for anything worth keeping.

I totally agree and have a tangentially relevant question about the <<-
operator.  Currently 'name <<- value' means to look up the environment
stack until you find 'name'  and (a) if you find 'name' in some frame bind
it to a new value in that frame and (b) if you do not find it make a new
entry for it in .GlobalEnv.
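
To make the two behaviours concrete, here is a small sketch; the names `make_counter`, `leak` and `leaked_value` are made up for illustration.

```r
# (a) 'count' exists in the enclosing frame, so <<- rebinds it there.
make_counter <- function() {
  count <- 0
  function() {
    count <<- count + 1
    count
  }
}
counter <- make_counter()
counter()
counter()

# (b) 'leaked_value' is not found in any enclosing frame, so <<- silently
# creates it in .GlobalEnv.
leak <- function() {
  leaked_value <<- 42
  invisible(NULL)
}
leak()
exists("leaked_value", envir = globalenv(), inherits = FALSE)
```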

Should R deprecate the second part of that and give an error if 'name' is
not already present in the environment stack?  This would catch misspelling
errors in functions that collect results from recursive calls.  E.g.,

collectStrings <- function(list) {
    strings <- character() # to be populated by .collect
    .collect <- function(x) {
        if (is.list(x)) {
            lapply(x, .collect)
        } else if (is.character(x)) {
            strings <<- c(strings, x)
        }
        # oops, would like to be told about this error
        misspelledStrings <<- c(strings, names(x))
        NULL
    }
    .collect(list)
    strings
}

This gives the incorrect:
> collectStrings(list(i="One", ii=list(a=1, b="Two")))
[1] "One" "Two"
> misspelledStrings
[1] "One" "Two" "i"   "ii"

instead of what we would get if 'misspelledStrings' were 'strings'.
> collectStrings(list(i="One", ii=list(a=1, b="Two")))
[1] "One" "Two" "a"   "b"   "i"   "ii"

If someone really wanted to assign into .GlobalEnv the assign() function is
available.

In S '<<-' only had meaning (b) and R added meaning (a).  Perhaps it is
time to drop meaning (b).  We could start by triggering a warning about it
if some environment variable were set, as is being done for non-scalar &&
and ||.
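
As a rough sketch of what "meaning (a) only" could look like in user code today, the helper below walks the enclosing environments and stops with an error instead of falling back to .GlobalEnv. The name `strict_superassign` and its interface are hypothetical; this is not how R implements `<<-` and not a proposed implementation.

```r
# Hypothetical helper: rebind an existing variable in an enclosing
# environment of the caller, or signal an error if there is none.
strict_superassign <- function(name, value, from = parent.frame()) {
  env <- parent.env(from)  # like <<-, start one level above the caller
  while (!identical(env, emptyenv())) {
    if (exists(name, envir = env, inherits = FALSE)) {
      assign(name, value, envir = env)
      return(invisible(value))
    }
    env <- parent.env(env)
  }
  stop("no existing binding for '", name, "' in the enclosing environments")
}

collect_strings <- function(x) {
  strings <- character()  # to be populated by visit()
  visit <- function(x) {
    if (is.list(x)) {
      lapply(x, visit)
    } else if (is.character(x)) {
      strict_superassign("strings", c(strings, x))
    }
    NULL
  }
  visit(x)
  strings
}

collect_strings(list(i = "One", ii = list(a = 1, b = "Two")))

# A misspelled target is reported instead of creating a global variable.
misspelled <- function() {
  inner <- function() strict_superassign("misspelledStrings", "oops")
  inner()
}
result <- tryCatch(misspelled(), error = function(e) conditionMessage(e))
result
```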

Bill Dunlap
TIBCO Software
wdunlap tibco.com


On Sun, Aug 25, 2019 at 5:09 PM Duncan Murdoch wrote:

> On 25/08/2019 7:09 p.m., Cyclic Group Z_1 wrote:
> >
> >
> > This is a fair point; structuring functions into packages is probably
> > ultimately the gold standard for code organization in R. However,
> > lexical scoping in R is really not much different than in other
> > languages, such as Python, in which use of main functions and defining
> > other named functions outside of main are encouraged. For example, in
> > Scheme, from which R derives its scoping rules, the community generally
> > organizes code with almost exclusively functions and few non-function
> > global variables at top level. The common use of globals in R seems to
> > be mostly a consequence of historical interactive use and, relatedly,
> > an inherited practice from S.
> >
> > It is true, though, that since anonymous functions (such as in lapply)
> > play a large part in idiomatic R code, as you put it, "[l]exical
> > scoping means that all of the problems of global variables are
> > available to writers who use main()." Nevertheless, using a main
> > function with other functions defined outside it seems like a good
> > quick alternative that offers similar advantages to making a package
> > when functions are tightly coupled to the script and the project may
> > not be large or generalizable enough to warrant making a package.
> >
>
> I think the idea that making a package is too hard is just wrong.
> Packages in R have lots of requirements, but nowadays there are tools
> that make them easy.  Eleven years ago at UseR in Dortmund I wrote a
> package during a 45 minute presentation, and things are much easier now.
>
> If you make a complex project without putting most of the code into a
> package, you don't have something that you will be able to modify in a
> year or two, because you won't have proper documentation.
>
> Scripts are for throwaways, not for anything worth keeping.
>
> Duncan Murdoch
>


Re: [Rd] Conventions: Use of globals and main functions

2019-08-26 Thread Duncan Murdoch

On 26/08/2019 1:58 p.m., William Dunlap wrote:

> Duncan Murdoch wrote:
> > Scripts are for throwaways, not for anything worth keeping.
>
> I totally agree and have a tangentially relevant question about the <<-
> operator.  Currently 'name <<- value' means to look up the environment
> stack until you find 'name'  and (a) if you find 'name' in some frame
> bind it to a new value in that frame and (b) if you do not find it make
> a new entry for it in .GlobalEnv.
>
> Should R deprecate the second part of that and give an error if 'name'
> is not already present in the environment stack?  This would catch
> misspelling errors in functions that collect results from recursive
> calls.  E.g.,


I like that suggestion.  Package tests have been complaining about 
packages writing to .GlobalEnv for a while now, so there probably aren't 
many instances of b) in CRAN packages; that change might be relatively 
painless.


Duncan Murdoch


