Re: [Rd] Plans to improve reference classes?

2015-06-23 Thread Michael Lawrence
Couple of requests:

1) Is there any example or writeup on the difficulties of extending
reference classes across packages? Just so I can fully understand the
issues.

2) In what sorts of situations does the performance of reference
classes cause problems? Sure, it's an order of magnitude slower than
constructing a simple environment, but those timings are in
microseconds, so one would need a thousand objects before it started
to be noticeable. Some motivating use cases would help.

Thanks,
Michael




On Mon, Jun 22, 2015 at 7:06 AM, Hadley Wickham  wrote:
> Apart from speed, the most important advantage of R6 over ref classes
> is that it's easy to subclass a class defined in package A in package
> B. This is currently difficult with ref classes because of the way it
> does scoping. (And I think it's difficult to fix without fundamentally
> changing how ref classes work)
>
> Hadley
>
> On Mon, Jun 22, 2015 at 8:49 AM, Michael Lawrence
>  wrote:
>> (Moved to R-devel)
>>
>> Niek,
>>
>> Would you please provide the details on this test case, including your
>> benchmarks, and what you are trying to achieve at the high-level?
>>
>> Thanks,
>> Michael
>>
>>
>>
>>
>> On Wed, Jun 17, 2015 at 4:55 AM, Niek Bouman  wrote:
>>> Dear R-core team,
>>>
>>> I was wondering whether you have any plans to improve the current 
>>> implementation of reference classes.
>>>
>>> Background:
>>> For a new project we will have many mutable objects, and we therefore want 
>>> to use a construction like reference classes in this project. However, we 
>>> observed that the speed performance of our implementation (using reference 
>>> classes) for a simple test case is rather poor compared to a non-OOP 
>>> implementation. Further, turning the reference classes into R6 classes
>>> (using the R6 package) gave the best performance. As speed is an issue in
>>> our project, this would for us be an important reason to use R6 classes
>>> instead of reference classes. The drawback, of course, is that the R6 
>>> package is developed by a single developer and that further development is 
>>> therefore less certain than if we would use reference classes, which are in 
>>> the core. Ideally we would like a system like R6 in the core of R. Are you 
>>> planning to support R6, or improve reference classes to be on par with (or 
>>> better than) R6 in terms of speed, in the core?
>>>
>>> Best regards,
>>> Niek
>>>
>>> Keygene N.V. | P.O. Box 216 | 6700 AE Wageningen | The Netherlands
>>> T (+31) 317 46 68 66 | F (+31) 317 42 49 39 | CoC. 09066631 | 
>>> http://www.keygene.com
>>>
>>> ___
>>> R-core list: https://stat.ethz.ch/mailman/listinfo/r-core
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>
>
>
> --
> http://had.co.nz/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] GCC update in Rtools?

2015-06-23 Thread Avraham Adler
Hello.

There was a lot of discussion in March about the difficulties in
having Rtools use a more recent version of GCC than 4.6.3. May we know
if there has been any progress since then, or has development/testing
been put on hold for the time being?

Thank you,

Avi



Re: [Rd] Plans to improve reference classes?

2015-06-23 Thread Hadley Wickham
> 1) Is there any example or writeup on the difficulties of extending
> reference classes across packages? Just so I can fully understand the
> issues.

Here's a simple example:

library(scales)
library(methods)

MyRange <- setRefClass("MyRange", contains = "DiscreteRange")
a_range <- MyRange()
a_range$train(1:10)
# Error in a_range$train(1:10) : could not find function "train_discrete"

where train_discrete() is a non-exported function of the scales
package called by the train() method of DiscreteRange.
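
To make the scoping issue concrete without depending on scales, here is a
minimal base-R sketch. It is only an analogy for a method losing access to
its defining package, not the ref class implementation itself; pkg_ns,
helper and method are hypothetical stand-ins for the scales namespace,
train_discrete() and train().

```r
# Stand-in for a package namespace (hypothetical; not the real scales namespace).
pkg_ns <- new.env()
local({
  helper <- function(x) x * 2        # plays the role of a non-exported function
  method <- function(x) helper(x)    # plays the role of a method that calls it
}, envir = pkg_ns)

# With its original enclosure, the method finds the helper.
original <- pkg_ns$method
ok <- original(1:3)

# Re-home the method into an environment that does not inherit from the
# defining "namespace"; the helper is no longer on its lookup path.
relocated <- pkg_ns$method
environment(relocated) <- new.env(parent = globalenv())
failure <- tryCatch(relocated(1:3), error = function(e) conditionMessage(e))

print(ok)
print(failure)
```

The second call fails for the same kind of reason as a_range$train(1:10)
above: the function body is unchanged, but the environment it now runs in
cannot see the helper.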

There are also some notes about portable vs. non-portable R6 classes
at http://cran.r-project.org/web/packages/R6/vignettes/Portable.html

> 2) In what sorts of situations does the performance of reference
> classes cause problems? Sure, it's an order of magnitude slower than
> constructing a simple environment, but those timings are in
> microseconds, so one would need a thousand objects before it started
> to be noticeable. Some motivating use cases would help.

It's a bit of a pathological case, but the switch from RefClasses to
R6 made a noticeable performance improvement in shiny. It's hard to
quantify the impact on an app, but the impact on the underlying
reactive implementation was quite profound: http://rpubs.com/wch/27260
vs  http://rpubs.com/wch/27264

R6 also includes a vignette with detailed benchmarking:
http://cran.r-project.org/web/packages/R6/vignettes/Performance.html
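
For a rough feel of the construction cost being discussed, a sketch along
these lines can be run locally. It is not the benchmark from the vignette;
the class name CounterSketch, the helper make_env() and the object count are
assumptions chosen for illustration, and the timings depend on the machine
and R version.

```r
library(methods)

# Hypothetical ref class with a single typed field.
CounterSketch <- setRefClass("CounterSketch", fields = list(n = "numeric"))

# The "simple environment" alternative: one binding in a fresh environment.
make_env <- function() {
  e <- new.env()
  e$n <- 0
  e
}

n_objects <- 1000L

t_env <- system.time(for (i in seq_len(n_objects)) make_env())
t_ref <- system.time(for (i in seq_len(n_objects)) CounterSketch$new(n = 0))

# Elapsed seconds for constructing n_objects of each kind.
timings <- c(environment = t_env[["elapsed"]], refclass = t_ref[["elapsed"]])
print(timings)
```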

I've added Winston to the thread since he's the expert.

Hadley

-- 
http://had.co.nz/



Re: [Rd] Plans to improve reference classes?

2015-06-23 Thread John Chambers
I understand Hadley's point; it's a consequence of the modification of the
environment of the ref. class methods.

Good point, but it seems we can make that modification optional (when it
works, it has advantages for code quality and ease of writing).

Let's discuss possibilities, off-list until things are a bit clearer.

John




Re: [Rd] Plans to improve reference classes?

2015-06-23 Thread Winston Chang
I can provide a little background on why particular choices were made for
R6. Generally speaking, speed is a primary consideration in making
decisions about the design of R6. The basic structure of R6 classes is
actually not so different from reference classes: an R6 object is an
environment. But many aspects of R6 objects are simpler.

R6 does support clean cross-package inheritance. The key design feature
that allows this is that methods have one environment that they are bound
in (this is where they can be found), and another environment that they are
enclosed in (roughly, this is where they run). The enclosing environment
points back to the binding environment with a binding named `self`. Methods
must access other members with `self$`, as in `self$foo`. I've found that
this requirement results in clearer code, because it's always clear when
you're accessing something that's part of the object.

When a class inherits from another class, the enclosing environment also
contains a binding named `super`, which points to an environment containing
methods from the superclass. These methods also have their own enclosing
environment, with a `self` that points back to the object's binding
environment.
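
The layout described in the two paragraphs above can be sketched in plain
base R. This is not the R6 source, only an illustration of the idea;
make_object() and all member names are hypothetical, and parent_env stands
in for the environment where the class was defined (for example a package
namespace).

```r
# Minimal sketch of the binding/enclosing environment layout; NOT the R6 code.
make_object <- function(parent_env = globalenv()) {
  binding_env <- new.env()                     # the object: where members are found
  enclos_env  <- new.env(parent = parent_env)  # where the object's methods run
  enclos_env$self <- binding_env

  binding_env$count <- 0

  # "Superclass" methods live in their own environment and have their own
  # enclosing environment, whose `self` points back at the same object.
  super_env    <- new.env()
  super_enclos <- new.env(parent = parent_env)
  super_enclos$self <- binding_env
  super_describe <- function() paste("count is", self$count)
  environment(super_describe) <- super_enclos
  super_env$describe <- super_describe
  enclos_env$super <- super_env

  # Methods of the object itself reach members only through `self$`.
  add <- function(n = 1) {
    self$count <- self$count + n
    invisible(self)
  }
  describe <- function() paste("child:", super$describe())
  environment(add) <- enclos_env
  environment(describe) <- enclos_env
  binding_env$add <- add
  binding_env$describe <- describe

  binding_env
}

obj <- make_object()
obj$add(2)$add(3)
print(obj$count)
print(obj$describe())
```

Because each method's enclosing environment inherits from parent_env, a
method keeps access to whatever was visible where its class was defined,
even when it is called on an object of a subclass defined elsewhere.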

I know this might be hard to picture from the description; I have some
diagrams drawn up which might help. See pages 1 and 4 from this document:
  https://github.com/wch/R6/blob/master/doc_extra/R6.pdf
(The other pages show other features, like private members, and
non-portable R6 objects, which don't support clean cross-package
inheritance, and have a different structure.)


Regarding performance, R6 is fast relative to ref classes because it
doesn't do type checking for fields, and doesn't make use of S4. (There may
be other reasons as well, but I don't know the internals of ref classes
well enough to say much about it.) Accessing a member of an R6 object is
literally just accessing a binding in an environment, and that's a very
fast operation in R.
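
As a small illustration of the type-checking difference, the sketch below
contrasts a ref class field declared as numeric with a plain environment
binding, which is the mechanism an R6 member access reduces to. The class
name TypedSketch is a made-up name for this example.

```r
library(methods)

# Hypothetical class; the field `n` is declared as numeric.
TypedSketch <- setRefClass("TypedSketch", fields = list(n = "numeric"))
typed <- TypedSketch$new(n = 1)

# Ref classes validate the declared field class on assignment.
typed_result <- tryCatch({
  typed$n <- "not a number"
  "accepted"
}, error = function(e) "rejected")

# A plain environment stores whatever it is given, with no check.
plain <- new.env()
plain$n <- 1
plain$n <- "not a number"

print(typed_result)
print(class(plain$n))
```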

-Winston






Re: [Rd] GCC update in Rtools?

2015-06-23 Thread Duncan Murdoch

On 23/06/2015 10:36 AM, Avraham Adler wrote:
> Hello.
>
> There was a lot of discussion in March about the difficulties in
> having Rtools use a more recent version of GCC than 4.6.3. May we know
> if there has been any progress since then, or has development/testing
> been put on hold for the time being?


A volunteer has come forward to try to solve the outstanding issues 
(which were listed at 
https://rawgit.com/kevinushey/RToolsToolchainUpdate/master/mingwnotes.html). 
I haven't heard if any progress has been made.


Duncan Murdoch



Re: [Rd] Listing all spawned jobs/processed after parallel::mcparallel()?

2015-06-23 Thread Henrik Bengtsson
On Sun, Jun 21, 2015 at 9:59 AM, Prof Brian Ripley
 wrote:
> On 20/06/2015 22:21, Henrik Bengtsson wrote:
>>
>> QUESTION:
>> Is it possible to query number of active jobs running after launching
>> them with parallel::mcparallel()?
>>
>> For example, if I launch 3 jobs using:
>>
>>> library(parallel)
>>> f <- lapply(1:3, FUN=mcparallel)
>>
>>
>> then I can inspect them as:
>>
>>> str(f)
>>
>> List of 3
>>  $ :List of 2
>>   ..$ pid: int 142225
>>   ..$ fd : int [1:2] 8 13
>>   ..- attr(*, "class")= chr [1:3] "parallelJob" "childProcess" "process"
>>  $ :List of 2
>>   ..$ pid: int 142226
>>   ..$ fd : int [1:2] 10 15
>>   ..- attr(*, "class")= chr [1:3] "parallelJob" "childProcess" "process"
>>  $ :List of 2
>>   ..$ pid: int 142227
>>   ..$ fd : int [1:2] 12 17
>>   ..- attr(*, "class")= chr [1:3] "parallelJob" "childProcess" "process"
>>
>> However, if I launch them without "recording" them, or equivalently if I
>> do:
>>
>>> f <- lapply(1:3, FUN=mcparallel)
>>> rm(list="f")
>>
>>
>> is there a function/mechanism in R/the parallel package allowing me to
>> find the currently active/running processes?  ... or at least query
>> how many they are?  I'd like to use this to prevent spawning of more
>> than a maximum number of parallel processes.  (Yes, I'm aware of
>> mclapply() and friends, but I'm looking at using more low-level
>> mcparallel()/mccollect()). I'm trying to decide whether I should
>> implement my own mechanism for keeping track of "jobs" or not.
>
>
> Note that 'currently active/running' is a slippery concept and is not what
> the results above show.  But see ?children, which seems to be what you are
> looking for.  It is not exported and there is no more detailed explanation
> save the source code.  Also note that it tells you about children and not
> grandchildren.
>
> You can find out about child processes (and their children) at OS level, for
> example via the 'ps' command, but doing so portably is not easy.

Thank you very much.  This was exactly what I was looking for.  I
appreciate the problem of identifying grandchildren, but with
children() I now at least have a chance to get a lower bound on the
number of "active children" (?children).

After some initial testing on Linux and OSX, I'm glad to see that
parallel:::children() seems to reflect which processes are actually
active, e.g. if I SIGTERM one of them externally, it is immediately
dropped from parallel:::children().  I also noticed that the process
remains active until it has been collected with parallel::mccollect().
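
For the "keep track of jobs myself" alternative mentioned in the original
question, a minimal sketch could look like the following. It uses only the
exported mcparallel() and mccollect(), so it does not rely on the
non-exported parallel:::children(); run_throttled() and its arguments are
hypothetical names, the throttle simply blocks on the oldest job when the
limit is reached, and forking is assumed to be available (i.e. not Windows).

```r
library(parallel)

# Hypothetical helper: evaluate fun(x) for each input in a forked child,
# keeping at most max_jobs uncollected children at any time.
run_throttled <- function(inputs, fun, max_jobs = 2L) {
  active  <- list()                          # jobs launched but not yet collected
  results <- vector("list", length(inputs))

  for (i in seq_along(inputs)) {
    if (length(active) >= max_jobs) {
      # Limit reached: block until the oldest job has delivered its result.
      oldest <- active[[1]]
      results[oldest$index] <- mccollect(oldest$job)
      active <- active[-1]
    }
    x <- inputs[[i]]
    active[[length(active) + 1L]] <- list(index = i, job = mcparallel(fun(x)))
  }

  # Collect whatever is still outstanding.
  for (entry in active) {
    results[entry$index] <- mccollect(entry$job)
  }
  results
}

squares <- run_throttled(1:5, function(x) x^2, max_jobs = 2L)
print(unlist(squares))
```

Waiting on the oldest job is the simplest policy and not the most
efficient one; a version that polls with mccollect(wait = FALSE) could free
slots sooner, at the cost of more bookkeeping.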

/Henrik

>
> --
> Brian D. Ripley,  rip...@stats.ox.ac.uk
> Emeritus Professor of Applied Statistics, University of Oxford
> 1 South Parks Road, Oxford OX1 3TG, UK
