[Rd] Parallel compression support for saving to rds/rdata files?

2016-12-15 Thread Kenny Bell
Hi,

I have tried to follow the instructions in the ``save`` documentation and
it doesn't seem to work (see below):

mydata <- do.call(rbind, rep(iris, 1))
con <- pipe("pigz -p8 > fname.gz", "wb");
save(mydata, file = con); close(con) # This runs

R.utils::gunzip("fname.gz", "fname.RData", overwrite = TRUE)
load("fname.RData") # Error: error reading from connection

First question: Should the above work?

Second question: Is it possible to make this dummy friendly by allowing
"pigz" as an option for ``compress`` in saveRDS and save? And in such a way
that the decompressing is hidden from the user like normal?

Thanks!
Kenny


-- 
Kendon Bell
Email: km...@berkeley.edu
Phone: (510) 612-3375

Ph.D. Candidate
Department of Agricultural & Resource Economics
University of California, Berkeley

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Parallel compression support for saving to rds/rdata files?

2016-12-15 Thread Simon Urbanek

> On Dec 15, 2016, at 12:08 AM, Kenny Bell wrote:
> 
> Hi,
> 
> I have tried to follow the instructions in the ``save`` documentation and
> it doesn't seem to work (see below):
> 
> mydata <- do.call(rbind, rep(iris, 1))
> con <- pipe("pigz -p8 > fname.gz", "wb");
> save(mydata, file = con); close(con) # This runs
> 
> R.utils::gunzip("fname.gz", "fname.RData", overwrite = TRUE)
> load("fname.RData") # Error: error reading from connection
> 
> First question: Should the above work?
> 


Not really. gzip is a bad example because it doesn't really support parallel
compression (a gzip stream cannot be chopped into blocks by design), but
you can do it with bzip2:

mydata <- do.call(rbind, rep(iris, 1))
con <- pipe("pbzip2 -p8 > fname.bz2", "wb")
save(mydata, file = con)
close(con) 

load("fname.bz2")

You can also read in parallel:

load(pipe("pbzip2 -dc fname.bz2"))
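
The same pipe approach can be carried over to saveRDS()/readRDS(), which is
roughly what the second question asks for. The following is only a minimal
sketch: save_rds_pbzip2() and read_rds_pbzip2() are hypothetical helper names,
not part of base R, and the sketch assumes that pbzip2 is on the PATH (it
falls back to base R's single-threaded bzfile() when it is not).

```r
# Hypothetical helpers, not part of base R.
# Assumption: 'pbzip2' is installed; otherwise fall back to bzfile().
save_rds_pbzip2 <- function(object, path, threads = 2L) {
  con <- if (nzchar(Sys.which("pbzip2"))) {
    # Stream the serialized object through pbzip2 using 'threads' processors
    pipe(sprintf("pbzip2 -p%d > %s", as.integer(threads), shQuote(path)), "wb")
  } else {
    bzfile(path, "wb")
  }
  on.exit(close(con))
  saveRDS(object, file = con)
  invisible(path)
}

read_rds_pbzip2 <- function(path) {
  con <- if (nzchar(Sys.which("pbzip2"))) {
    # Decompress to stdout and read the stream directly
    pipe(sprintf("pbzip2 -dc %s", shQuote(path)), "rb")
  } else {
    bzfile(path, "rb")
  }
  on.exit(close(con))
  readRDS(con)
}

# Usage: round-trip a data frame through a temporary file
tmp <- tempfile(fileext = ".rds.bz2")
save_rds_pbzip2(iris, tmp)
restored <- read_rds_pbzip2(tmp)
```

Because the output is an ordinary bzip2 file, a plain readRDS(tmp) should
also be able to read it without the helper.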

Cheers,
Simon



> Second question: Is it possible to make this dummy friendly by allowing
> "pigz" as an option for ``compress`` in saveRDS and save? And in such a way
> that the decompressing is hidden from the user like normal?
> 
> Thanks!
> Kenny



[Rd] print.POSIXct doesn't seem to use tz argument, as per its example

2016-12-15 Thread Jennifer Lyon
On the documentation page for DateTimeClasses, in the Examples section,
there are the following two lines:

format(.leap.seconds) # the leap seconds in your time zone
print(.leap.seconds, tz = "PST8PDT")  # and in Seattle's

The second line (using print) seems to ignore the tz argument and prints
the dates in my time zone, whereas:

format(.leap.seconds, tz = "PST8PDT")

does print the dates in PST. Judging by the code in
https://github.com/wch/r-source/blob/trunk/src/library/base/R/datetime.R
around line 234, the '...' argument is passed to print, not to
format:

print.POSIXct <-
print.POSIXlt <- function(x, ...)
{
    max.print <- getOption("max.print", 9999L)
    if(max.print < length(x)) {
        print(format(x[seq_len(max.print)], usetz = TRUE), ...)
        cat(' [ reached getOption("max.print") -- omitted',
            length(x) - max.print, 'entries ]\n')
    } else print(if(length(x)) format(x, usetz = TRUE)
                 else paste(class(x)[1L], "of length 0"), ...)
    invisible(x)
}

The documentation for print() on this page seems to be silent on tz as an
argument, but I do believe the example using print() does not work as
advertised.
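
A possible workaround, given that format() does honor tz, is to format
explicitly and print the resulting character vector. This is only a sketch:
print_in_tz() is a hypothetical helper name, and it assumes the system time
zone database knows "PST8PDT".

```r
# Hypothetical helper: format in the requested zone first, then print.
print_in_tz <- function(x, tz = "") {
  out <- format(x, tz = tz, usetz = TRUE)
  print(out)
  invisible(x)
}

x <- as.POSIXct("2016-12-31 23:59:59", tz = "UTC")

in_utc     <- format(x, tz = "UTC", usetz = TRUE)
in_pacific <- format(x, tz = "PST8PDT", usetz = TRUE)

# Prints the time in the requested zone regardless of how print() handles tz
print_in_tz(x, tz = "PST8PDT")
```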

Thanks.

Jen

sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.5 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base



Re: [Rd] New leap second end of 2016 / beginning 2017 (depending on TZ)

2016-12-15 Thread Martin Maechler
> Martin Maechler 
> on Wed, 14 Dec 2016 17:04:22 +0100 writes:

> As R is sophisticated enough to track leap seconds,
> ?.leap.seconds

> we'd need to update our codes real soon now again:

> https://en.wikipedia.org/wiki/Leap_second

> (and those of you who want second precision in R in 2017 need to start
> working with 'R patched' or 'R devel' ...)
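
One way to check whether a given R installation already knows about the new
leap second is to look it up in .leap.seconds. This is a small sketch and
assumes, as the ?.leap.seconds help page describes, that each entry is the
start of the day following the leap second; new_leap and has_new_leap are
illustrative variable names.

```r
# .leap.seconds is a POSIXct vector shipped with base R
new_leap <- as.POSIXct("2017-01-01 00:00:00", tz = "UTC")

# TRUE if this R build already includes the leap second at the end of 2016
has_new_leap <- any(.leap.seconds == new_leap)

# Show the most recent entries in UTC
print(format(tail(.leap.seconds, 3), tz = "UTC", usetz = TRUE))
print(has_new_leap)
```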

I've been told offline that the above could be read as FUD,
which I hope nobody took from it.

Furthermore, there seems to be wide disagreement about the
usefulness of leap seconds, and how computers (and OSs) should
deal with them.
One recent approach (e.g. by Google) is to "smear the leap
second" into the system (by somehow "throttling" time servers ;-)..

(and no, I would like even less for this to become a long thread, so
 please refrain if you can ...)

Martin
