Re: [Rd] comparing doubles

2018-09-03 Thread Juan Telleria Ruiz de Aguirre
Maybe a new operator could be defined for a fast and easy double
comparison: `~~`

`~~` <- function (e1, e2)  all.equal(e1, e2)

And document it properly.
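
(Note that `~~` as defined above cannot actually be used infix: the
parser reads `0.3 ~~ 0.2` as the formula `0.3 ~ (~0.2)` and never calls
a function named `~~`, so a user-level version needs the `%op%` form, as
in the replies below.)

`%~~%` <- function (e1, e2)  isTRUE(all.equal(e1, e2))

0.3 %~~% (0.1 * 3)
#[1] TRUE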



Re: [Rd] comparing doubles

2018-09-03 Thread Rui Barradas

Hello,

Watch out for operator precedence.



all.equal(0.3, 0.1*3)
#[1] TRUE


`%~~%` <- function (e1, e2)  all.equal(e1, e2)

0.3 %~~% 0.1*3
#Error in 0.3 %~~% 0.1 * 3 : non-numeric argument to binary operator


0.3 %~~% (0.1*3)
#[1] TRUE


Now with isTRUE() the problem changes a bit.


isTRUE(all.equal(0.3, 0.1*3))
#[1] TRUE


`%~~%` <- function (e1, e2)  isTRUE(all.equal(e1, e2))

0.3 %~~% 0.1*3
#[1] 0    (parsed as (0.3 %~~% 0.1) * 3, i.e. FALSE * 3)

0.3 %~~% (0.1*3)
#[1] TRUE


Hope this helps,

Rui Barradas

On 03/09/2018 08:20, Juan Telleria Ruiz de Aguirre wrote:

Maybe a new operator could be defined for a fast and easy double
comparison: `~~`

`~~` <- function (e1, e2)  all.equal(e1, e2)

And document it properly.



Re: [Rd] True length - length(unclass(x)) - without having to call unclass()?

2018-09-03 Thread Tomas Kalibera
Please don't do this to get the underlying vector length (or to achieve 
anything else). Setting or deleting attributes of an R object without 
checking the reference count violates R semantics, which in turn can 
have unpredictable effects on R programs (essentially undebuggable 
segfaults, now or more likely later when new optimizations or features 
are added to the language). Setting attributes on objects with a 
reference count (currently the NAMED value) greater than 0 (in some 
special cases 1 is ok) is cheating - please see Writing R Extensions - 
and getting speedups via cheating leads to fragile, unmaintainable and 
buggy code. Doing so in packages is particularly unhelpful to the whole 
community - packages should only use the public API as documented.


Similarly, obtaining the physical address of an object to work out 
whether R has copied it should certainly not be done in packages; R code 
should never work with, or even obtain, the physical address of an 
object. This is also why one cannot obtain such an address using base R 
(apart from in textual form in certain diagnostic messages, where it can 
indeed be useful for low-level debugging).
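
(At the R level, tracemem() is the documented way to observe whether an
object gets duplicated - a minimal sketch, assuming an R build with
memory profiling enabled:)

x <- structure(as.list(1:3), class = "Foo")
tracemem(x)                # from now on any duplication of x is reported
len <- length(unclass(x))  # a "tracemem[...]" line here means a copy was made
untracemem(x)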


Tomas

On 09/02/2018 01:19 AM, Dénes Tóth wrote:
The solution below introduces a dependency on data.table, but 
otherwise it does what you need:


---

# special method for Foo objects
length.Foo <- function(x) {
  length(unlist(x, recursive = TRUE, use.names = FALSE))
}

# an instance of a Foo object
x <- structure(list(a = 1, b = list(b1 = 1, b2 = 2)), class = "Foo")

# its length
stopifnot(length(x) == 3L)

# get its length as if it were a standard list
.length <- function(x) {
  cls <- class(x)
  # setattr() does not make a copy, but modifies by reference
  data.table::setattr(x, "class", NULL)
  # get the length
  len <- base::length(x)
  # re-set original classes
  data.table::setattr(x, "class", cls)
  # return the unclassed length
  len
}

# to check that we do not make unwanted changes
orig_class <- class(x)

# check that the address in RAM does not change
a1 <- data.table::address(x)

# 'unclassed' length
stopifnot(.length(x) == 2L)

# check that address is the same
stopifnot(a1 == data.table::address(x))

# check against original class
stopifnot(identical(orig_class, class(x)))

---


On 08/24/2018 07:55 PM, Henrik Bengtsson wrote:

Is there a low-level function that returns the length of an object 'x'
- the length that for instance .subset(x) and .subset2(x) see? An
obvious candidate would be to use:

.length <- function(x) length(unclass(x))

However, I'm concerned that calling unclass(x) may trigger an
expensive copy internally in some cases.  Is that concern unfounded?

Thxs,

Henrik



[Rd] Bug report: problems with saving plots on a Windows PC with 4k monitor - wrong picture size

2018-09-03 Thread Yu Lee
Steps to reproduce the problem:

win.metafile("myplot.wmf",height=3,width=5)
plot(1:9)
dev.off()


Details:
When I try to save plots as WMF or EMF pictures with a small picture 
size specified, e.g. 3x5 inches, the WMF/EMF picture comes out the wrong 
size. The plot itself sits in the upper left corner of the picture.
I use a 4k monitor; there are no problems with 1920x1080 monitors.

One must turn off "display scaling for higher DPI settings" for RGui to make it 
work correctly:

"Right-click the R shortcut on your desktop, then select Properties from the 
menu.
Once the Properties window is up, go to the Compatibility tab.
You will see a 'Disable display scaling on high DPI' option.
"

That works for the 32-bit RGui version only, since 64-bit RGui is 
considered a native application.



Re: [Rd] Get Logical processor count correctly whether NUMA is enabled or disabled

2018-09-03 Thread Tomas Kalibera
A summary for reference: the new detectCores() for Windows in R-devel 
seems to be working for both logical and physical cores on systems with 
>64 logical processors (thanks to Arun for testing!). If the feature 
is important for anyone, particularly on an older version of Windows 
and/or a system with >64 logical processors, it would be nice if you 
could test and report any possible problems.

As I mentioned earlier, in older versions of R one can as a workaround 
use "wmic" to detect the number of processors on systems with >64 
logical processors (with appropriate error handling added as needed):

# detectCores()
out <- system("wmic cpu get numberoflogicalprocessors", intern=TRUE)
sum(as.numeric(gsub("([0-9]+).*", "\\1", grep("[0-9]+[ \t]*", out, value=TRUE))))

# detectCores(logical=FALSE)
out <- system("wmic cpu get numberofcores", intern=TRUE)
sum(as.numeric(gsub("([0-9]+).*", "\\1", grep("[0-9]+[ \t]*", out, value=TRUE))))
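
(A minimal sketch of such error handling, with a hypothetical helper
name, assuming "wmic" is available on the PATH:)

count_via_wmic <- function(query) {
  # return NA rather than an error if wmic is missing or fails
  out <- tryCatch(system(query, intern = TRUE),
                  error = function(e) NULL, warning = function(w) NULL)
  if (is.null(out)) return(NA_integer_)
  as.integer(sum(as.numeric(gsub("([0-9]+).*", "\\1",
                                 grep("[0-9]+[ \t]*", out, value = TRUE)))))
}
count_via_wmic("wmic cpu get numberoflogicalprocessors")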

The remaining problem with using >64 processors on Windows turned out 
to be due to a bug in sockets communication, debugged and fixed in 
R-devel by Luke Tierney.

Tomas

On 08/29/2018 12:42 PM, Srinivasan, Arunkumar wrote:
> Dear Tomas, thank you very much. I installed r-devel r75201 and tested.
>
> The machine with 88 cores has NUMA disabled. It therefore has 2 processor 
> groups with 64 and 24 processors each.
>
> require(parallel)
> detectCores()
> # [1] 88
>
> This is great!
>
> Then I went on to test with a simple 'foreach()' loop. I started with 64 
> processors (max limit of 1 processor group). I ran with a simple function of 
> 0.5s sleep.
>
> require(snow)
> require(doSNOW)
> require(foreach)
>
> cl <- makeCluster(64L, "SOCK")
> registerDoSNOW(cl)
> system.time(foreach(i=1:64) %dopar% Sys.sleep(0.5))
> #  user  system elapsed
> #  0.06    0.00    0.64
> system.time(foreach(i=1:65) %dopar% Sys.sleep(0.5))
> #  user  system elapsed
> #  0.03    0.01    1.04
> stopCluster(cl)
>
> With a cluster of 64 processors and loop running with 64 iterations, it 
> completed in ~.5s (0.64), and with 65 iterations, it took ~1s as expected.
>   
> cl <- makeCluster(65L, "SOCK")
> registerDoSNOW(cl)
> system.time(foreach(i=1:64) %dopar% Sys.sleep(0.5))
> #  user  system elapsed
> #  0.03    0.02    0.61
> system.time(foreach(i=1:65) %dopar% Sys.sleep(0.5))
> # Timing stopped at: 0.08 0 293
> stopCluster(cl)
>
> However, when I increased the cluster to 65 processors, a loop with 64 
> iterations seems to complete as expected, but using all 65 processors to loop 
> over 65 iterations didn't complete. I stopped it after ~5mins. The 
> same happens with a cluster started with any number between 65 and 88. It 
> seems we are still not able to use >64 processors all at the 
> same time even though detectCores() returns the right count now.
>
> I'd appreciate your thoughts on this.
>
> Best,
> Arun.
>
> -----Original Message-----
> From: Tomas Kalibera 
> Sent: 27 August 2018 19:43
> To: Srinivasan, Arunkumar ; 
> r-devel@r-project.org
> Subject: Re: [Rd] Get Logical processor count correctly whether NUMA is 
> enabled or disabled
>
> Dear Arun,
>
> thank you for checking the workaround scripts.
>
> I've modified detectCores() to use GetLogicalProcessorInformationEx. It is in 
> revision 75198 of R-devel, could you please test it on your machines? For a 
> binary, you can wait until the R-devel snapshot build gets to at least this 
> svn revision.
>
> Thanks for the link to the processor groups documentation. I don't have a 
> machine to test this on, but I would hope that snow clusters (e.g.
> PSOCK) should work fine on systems with >64 logical processors as they spawn 
> new processes (not just threads). Note that FORK clusters are not supported 
> on Windows.
>
> Thanks
> Tomas
>
> On 08/21/2018 02:53 PM, Srinivasan, Arunkumar wrote:
>> Dear Tomas, thank you for looking into this. Here's the output:
>>
>> # number of logical processors - what detectCores() should return
>> out <- system("wmic cpu get numberoflogicalprocessors", intern=TRUE)
>> [1] "NumberOfLogicalProcessors  \r" "22 \r" "22 \r"
>> [4] "20 \r" "22 \r" "\r"
>> sum(as.numeric(gsub("([0-9]+).*", "\\1", grep("[0-9]+[ \t]*", out, value=TRUE))))
>> # [1] 86
>>
>> [I've asked the IT team to understand why one of the values is 20 instead of 
>> 22].
>>
>> # number of cores - what detectCores(FALSE) should return
>> out <- system("wmic cpu get numberofcores", intern=TRUE)
>> [1] "NumberOfCores  \r" "22 \r" "22 \r" "20 \r" "22 \r"
>> [6] "\r"
>> sum(as.numeric(gsub("([0-9]+).*", "\\1", grep("[0-9]+[ \t]*", out, value=TRUE))))
>> # [1] 86
>>
>> [Currently hyperthreading is disabled. So this output being identical to the 
>> previous output makes sense].
>>
>> system("wmic computersystem get numberofprocessors")
>> NumberOfProcessors
>> 4

Re: [Rd] True length - length(unclass(x)) - without having to call unclass()?

2018-09-03 Thread Radford Neal
Regarding the discussion of getting length(unclass(x)) without an
unclassed version of x being created...

There are already no copies done for length(unclass(x)) in pqR
(current version of 2017-06-09 at pqR-project.org, as well as the
soon-to-be-released new version).  This is part of a more general
facility for avoiding copies from unclass in other circumstances as
well - eg, unclass(a)+unclass(b).

It's implemented using pqR's internal "variant result" mechanism.
Primitives such as "length" and "+" can ask for their arguments to be
evaluated in such a way that an "unclassed" result is possibly
returned with its class attribute still there, but with a flag set
(not in the object) to indicate that it should be ignored.

The variant result mechanism is also central to many other pqR
improvements, including deferred evaluation to enable automatic use of
multiple cores, and optimizations that allow fast evaluation of things
like any(x<0), any(is.na(x)), or all(is.na(x)) without creation of
intermediate results and with early termination when the result is
determined.

It is much better to use such a general mechanism that speeds up
existing code than to implement more and more special-case functions
like anyNA or some special function to allow length(unclass(x)) to be
done quickly.

The variant result mechanism has extremely low overhead, and is not
hard to implement.

   Radford Neal



Re: [Rd] Get Logical processor count correctly whether NUMA is enabled or disabled

2018-09-03 Thread Srinivasan, Arunkumar
Tomas, Luke, thank you very much once again for patching both issues swiftly. 
This’ll be incredibly valuable to us once we move to 3.6.0.

Re: [Rd] comparing doubles

2018-09-03 Thread Martin Maechler
> Rui Barradas 
> on Mon, 3 Sep 2018 09:58:34 +0100 writes:

> Hello, Watch out for operator precedence.

indeed!  (but not only)

> all.equal(0.3, 0.1*3)
> #[1] TRUE
> 
> 
> `%~~%` <- function (e1, e2)  all.equal(e1, e2)
> 
> 0.3 %~~% 0.1*3
> #Error in 0.3 %~~% 0.1 * 3 : non-numeric argument to binary operator
> 
> 
> 0.3 %~~% (0.1*3)
> #[1] TRUE
> 
> 
> Now with isTRUE. The problem changes a bit.
> 
> 
> isTRUE(all.equal(0.3, 0.1*3))
> #[1] TRUE
> 
> 
> `%~~%` <- function (e1, e2)  isTRUE(all.equal(e1, e2))
> 
> 0.3 %~~% 0.1*3
> #[1] 0
> 
> 0.3 %~~% (0.1*3)
> #[1] TRUE
> 

> Hope this helps,
> Rui Barradas
> 
> On 03/09/2018 08:20, Juan Telleria Ruiz de Aguirre wrote:
> > Maybe a new operator could be defined for a fast and easy double
> > comparison: `~~`
> > 
> > `~~` <- function (e1, e2)  all.equal(e1, e2)
> > 
> > And document it properly.
> > 

I would still quite strongly recommend against such a
definition:

If you ask for help(all.equal), you see that it is a generic with an
all.equal.numeric() method which has several extra arguments
(new ones even in R-devel), the most important one being the
numerical 'tolerance' with a default of
sqrt(.Machine$double.eps) { == 2^-26 == 1.490116e-08 on all current platforms}.

Of course there is some arbitrariness in that choice
{{ but only *some*: the default is related to finding the minimum of a
   smooth function, which is locally quadratic at a "decent" minimum,
   hence the sqrt(.)
}}
but I sometimes find it important to tighten that tolerance, i.e., to
make the equality test stricter.
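
For instance, tightening the tolerance makes the example from this
thread compare unequal, and the 'scale' argument switches from relative
to absolute differences:

all.equal(0.3, 0.1*3)                    # TRUE : rel. difference ~1.85e-16 < 1.5e-8
all.equal(0.3, 0.1*3, tolerance = 1e-16) # "Mean relative difference: 1.850372e-16"
all.equal(1e-10, 2e-10)                  # "Mean relative difference: 1"
all.equal(1e-10, 2e-10, scale = 1)       # TRUE : abs. difference 1e-10 < tolerance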

Hiding everything behind a new operator which does not allow one to
take into account that there are quite a few versions of
near-equality --- only partly mirrored by the existence of
extra arguments of all.equal() --- only encourages simplified
thinking about the underlying subtle issues, which already too
many people don't care to know about.

(( e.g. all those people only caring for speed, but not for
   accuracy and reliability ... ))

Martin Maechler
ETH Zurich and R Core



Re: [Rd] True length - length(unclass(x)) - without having to call unclass()?

2018-09-03 Thread Dénes Tóth

Hi Tomas,

On 09/03/2018 11:49 AM, Tomas Kalibera wrote:
> Please don't do this to get the underlying vector length (or to achieve
> anything else). Setting or deleting attributes of an R object without
> checking the reference count violates R semantics, which in turn can
> have unpredictable effects on R programs (essentially undebuggable
> segfaults, now or more likely later when new optimizations or features
> are added to the language). Setting attributes on objects with a
> reference count (currently the NAMED value) greater than 0 (in some
> special cases 1 is ok) is cheating - please see Writing R Extensions -
> and getting speedups via cheating leads to fragile, unmaintainable and
> buggy code.


Please note that data.table::setattr is an exported function of a widely 
used package (available from CRAN), which also has a description in 
?data.table::setattr of why it might be useful.


Of course one has to use set* functions from data.table with extreme 
care, but if one does it in the right way, they can help a lot. For 
example there is no real danger of using them in internal functions 
where one can control what gets passed to the function or created 
within the function (so when one knows that the refcount==0 condition is 
true).
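
(A minimal illustration of the hazard when that condition does not
hold: setattr() modifies attributes in place, so a binding that shares
the same vector changes too:)

y <- c(a = 1)
z <- y                                # z and y share the same vector
data.table::setattr(z, "names", "b")  # modifies in place, no copy
names(y)
#[1] "b"                              # y silently changed as well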


(Notwithstanding the above, but also supporting your argument, it 
took me hours to debug a particular problem in one of my internal 
packages, see https://github.com/Rdatatable/data.table/issues/1281)


In the present case, an important and unanswered question is (cited from 
Henrik):

>>> However, I'm concerned that calling unclass(x) may trigger an
>>> expensive copy internally in some cases.  Is that concern unfounded?

If no copy is made, length(unclass(x)) beats length(setattr(..)) in all 
scenarios.



> Doing so in packages is particularly unhelpful to the whole community -
> packages should only use the public API as documented.
>
> Similarly, obtaining the physical address of an object to work out
> whether R has copied it should certainly not be done in packages; R
> code should never work with, or even obtain, the physical address of an
> object. This is also why one cannot obtain such an address using base R
> (apart from in textual form in certain diagnostic messages, where it
> can indeed be useful for low-level debugging).


Getting the physical address of the object was done exclusively for 
demonstration purposes. I totally agree that it should not be used for 
the purpose you described and I have never ever done so.


Regards,
Denes





Re: [Rd] True length - length(unclass(x)) - without having to call unclass()?

2018-09-03 Thread Tomas Kalibera

On 09/03/2018 03:59 PM, Dénes Tóth wrote:

> Hi Tomas,
>
> On 09/03/2018 11:49 AM, Tomas Kalibera wrote:
>> Please don't do this to get the underlying vector length (or to
>> achieve anything else). [...]



Hi Denes,

> Please note that data.table::setattr is an exported function of a
> widely used package (available from CRAN), which also has a
> description in ?data.table::setattr of why it might be useful.

Indeed, and not your fault, but the function is cheating, and the fact
that it is in a widely used package, even exported from it, does not
make it any safer. The related optimization in base R (shallow copying)
mentioned in the documentation of data.table::setattr is, on the other
hand, sound: it does not break the semantics.

> Of course one has to use set* functions from data.table with extreme
> care, but if one does it in the right way, they can help a lot. For
> example there is no real danger of using them in internal functions
> where one can control what gets passed to the function or created
> within the function (so when one knows that the refcount==0 condition
> is true).

Extreme care is not enough, as the internals can and do change (and
within the limits given by the documentation, they are likely to change
soon with respect to NAMED/reference counting), not to mention that they
are very complicated. The approach of "modify in place because we know
the reference count is 0" is particularly error prone and unnecessary.
It is unnecessary because there is a documented C API for legitimate use
in packages to find out whether an object may be referenced/shared (it
indirectly checks the reference count). If not, the object can be
modified in place without cheating, and some packages do this. It is
error prone because the reference count can change due to many things
package developers cannot be expected to know (and again, these things
change): in set* functions, for example, it will never be 0 (!) - these
functions with their current API can never be implemented in current R
without breaking the semantics.


In principle one can do similar things legitimately by wrapping objects
in an environment and passing that environment around (environments can
legitimately be modified in place), checking that the contained objects
have a reference count of 1 (not shared), and if so, modifying them in
place. But indeed, as soon as such objects become shared, there is no
way out; one has to copy (in current R).
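
(A minimal R-level sketch of that idea - the environment itself is
passed by reference, so the container is never copied:)

box <- new.env(parent = emptyenv())
box$x <- c(1, 2, 3)
set_first <- function(b, value) b$x[1] <- value  # updates the caller's box
set_first(box, 0)
box$x
#[1] 0 2 3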


Best
Tomas


[Rd] ubuntu software updater clash with cloud.r-project

2018-09-03 Thread David Shera
This seems the most appropriate place to report this.

I just updated my Ubuntu to 18.04 and installed R by adding the line to 
/etc/apt/sources.list:   deb ... cloud.r-project ... bionic-cran35/
R installed just fine.
However, my Ubuntu software updater would no longer finish correctly: it 
failed to access ... and told me to check the internet connection (the 
internet connection was just fine).

I commented out the line in sources.list and the software updater runs 
just fine now.

So if I want to use apt to get packages, I'll have to uncomment the line, get 
the packages, and then comment it out again?  Seems like a bug, but not part of 
any R package.


Thanks.
-David


