[Rd] Integrate errors on certain functions
In the help for ?integrate:

> When integrating over infinite intervals do so explicitly, rather than
> just using a large number as the endpoint. This increases the chance of
> a correct answer – any function whose integral over an infinite interval
> is finite must be near zero for most of that interval.

I understand that, and there are examples such as:

    ## a slowly-convergent integral
    integrand <- function(x) {1/((x+1)*sqrt(x))}
    integrate(integrand, lower = 0, upper = Inf)

    ## don't do this if you really want the integral from 0 to Inf
    integrate(integrand, lower = 0, upper = 1000000, stop.on.error = FALSE)
    #> failed with message ‘the integral is probably divergent’

which gives an error message if stop.on.error = FALSE. But what happens with something like the function below?

    integrate(function(x) exp(-x), lower = 0, upper = Inf)
    #> 1 with absolute error < 5.7e-05
    integrate(function(x) exp(-x), lower = 0, upper = 13000)
    #> 2.819306e-05 with absolute error < 5.6e-05
    integrate(function(x) exp(-x), lower = 0, upper = 13000, stop.on.error = FALSE)
    #> 2.819306e-05 with absolute error < 5.6e-05

I'm not sure whether this is a bug or a misuse of the function, but I would expect the last integrate call to give an error if stop.on.error = FALSE.

	[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
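For the `exp(-x)` case in the post above, it may help to look at the `message` component of each result rather than only the printed value. The sketch below is illustrative only: `disagrees` is a hypothetical helper, its tolerance is an arbitrary assumption, and the upper limit 13000 is simply copied from the post. Per ?integrate, `stop.on.error = FALSE` only changes how failures that the routine itself detects are reported, so a result the routine considers fine comes back without any error either way.

```r
f <- function(x) exp(-x)

# Explicit infinite endpoint, as ?integrate recommends.
res_inf <- integrate(f, lower = 0, upper = Inf)

# Large finite endpoint, value copied from the post.
res_big <- integrate(f, lower = 0, upper = 13000, stop.on.error = FALSE)

# Moderate finite endpoint that still covers where the integrand is non-negligible.
res_mod <- integrate(f, lower = 0, upper = 20)

# The message is "OK" whenever the routine did not detect a problem itself.
res_inf$message
res_big$message

# Hypothetical helper: flag two estimates that differ by more than `tol`.
disagrees <- function(a, b, tol = 1e-3) abs(a$value - b$value) > tol

disagrees(res_inf, res_mod)
disagrees(res_inf, res_big)
```

Comparing a finite-endpoint estimate against the explicit `Inf` estimate is one cheap sanity check when the routine reports no failure.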
[Rd] Read.dcf with no newline ending: gzfile drops last line
I don't know if this is a bug per se, but it is undesired behavior in read.dcf. read.dcf takes a file argument and passes it to gzfile if it's a character:

    if (is.character(file)) {
        file <- gzfile(file)
        on.exit(close(file))
    }

This gzfile connection is passed to readLines (line #39):

    lines <- readLines(file)

If no newline is at the end of the file, readLines doesn't give a warning (I think appropriate behavior). If a DESCRIPTION file doesn't happen to have a newline at the end of it (odd, but it may happen), then the last tag is dropped:

    > x = "Package: test
    + Type: Package"
    >
    > ##
    > # No Newline in file
    > ##
    > fname = tempfile()
    > writeLines(x, fname, sep = "")
    >
    > ### readlines with character - warning but all fields
    > readLines(fname)
    [1] "Package: test" "Type: Package"
    Warning message:
    In readLines(fname) :
      incomplete final line found on '/var/folders/1s/wrtqcpxn685_zk570bnx9_rrgr/T//Rtmpz95dsT/file180a65a6b745'
    > ### readlines with file connection - warning but all fields
    > file_con <- file(fname)
    > readLines(file_con)
    [1] "Package: test" "Type: Package"
    Warning message:
    In readLines(file_con) :
      incomplete final line found on '/var/folders/1s/wrtqcpxn685_zk570bnx9_rrgr/T//Rtmpz95dsT/file180a65a6b745'
    >
    > ### readlines with gzfile connection
    > ## no warning and drops last field
    > gz_con = gzfile(fname)
    > readLines(gz_con) # ONLY 1 line!
    [1] "Package: test"
    >
    > ##
    > # Newline in file - fine
    > ##
    > ### readlines with gzfile connection
    > ## no warning and all fields are read
    > writeLines(x, fname, sep = "\n")
    > gz_con = gzfile(fname)
    > readLines(gz_con)
    [1] "Package: test" "Type: Package"

Currently I use file(fname) before read.dcf to be sure a warning occurs, but all fields are read. I didn't see anything in the read.dcf help about this.

readLines states clearly: "If the final line is incomplete (no final EOL marker) the behaviour depends on whether the connection is blocking or not", but it's not 100% clear that read.dcf uses gzfile even if the file is not compressed.

Thanks
John
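A variation on the workaround described in the post above, sketched under assumptions: instead of handing read.dcf a path (which it wraps in gzfile), the lines are read through a plain file() connection and passed on through a textConnection. This is not what read.dcf does internally, and whether the gzfile behavior reported above still occurs may depend on the R version.

```r
x <- "Package: test\nType: Package"
fname <- tempfile()
writeLines(x, fname, sep = "")   # deliberately no trailing newline

# Read through a plain file() connection; warn = FALSE silences the
# "incomplete final line" warning shown in the post.
con <- file(fname)
lines <- readLines(con, warn = FALSE)
close(con)

# Parse the already-read lines, so no gzfile connection is involved.
tc <- textConnection(lines)
fields <- read.dcf(tc)
close(tc)

fields
unlink(fname)
```

Dropping `warn = FALSE` keeps the warning, which matches the stated goal of being told about the missing newline while still getting every field.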
Re: [Rd] Floating Point with POSIXct
That’s a good point. I think that’s fair and why rounding may not be the appropriate default. Oddly enough, I think 1:59:60 may be more appropriate though wrong. Given the way the seconds are separated in POSIXlt, however, I don’t think that would ever happen, but the big downside would be if that would round to 1:59:59.00

On Thu, Mar 3, 2022 at 2:38 PM Duncan Murdoch wrote:

> On 03/03/2022 11:52 a.m., Martin Maechler wrote:
> >>>>>> John Muschelli
> >>>>>>     on Thu, 3 Mar 2022 11:04:05 -0500 writes:
> >
> >     > I see in ?POSIXct and I'm trying to understand the note:
> >     >> Classes "POSIXct" and "POSIXlt" are able to express fractions of
> >     >> a second. (Conversion of fractions between the two forms may not
> >     >> be exact, but will have better than microsecond accuracy.)
> >
> >     > Mainly, I'm trying to understand printing of POSIXct with fractional
> >     > seconds. I see print.POSIXct calls format.POSIXct and eventually
> >     > calls format.POSIXlt, which then takes into account `digits.secs` for
> >     > printing. The format uses %OS3, which strptime indicates (* added):
> >
> >     >> Specific to R is %OSn, which for output gives the seconds
> >     >> *truncated* to 0 <= n <= 6 decimal places (and if %OS is not
> >     >> followed by a digit, it uses the setting of
> >     >> getOption("digits.secs"), or if that is unset, n = 0).
> >
> >     > So I'm seeing it truncates the seconds to 3 digits, so I think that is
> >     > why the below is printing 0.024.
> >
> >     > I think this is especially relevant even if you set
> >     > `options(digits.secs = 6)`, then the code in
> >     > format.POSIXlt would still return np=3 as the following condition
> >     > would break at i = 3
> >
> >     > for (i in seq_len(np) - 1L)
> >     >     if (all(abs(secs - round(secs, i)) < 1e-06)) {
> >     >         np <- i
> >     >         break
> >     >     }
> >
> >     > as sub_seconds - round(sub_seconds,3) < 1e-06. This seems to be
> >     > expected behavior given the docs, but would any consider this a bug?
> >
> >     > Example:
> >
> >     > options(digits.secs = 4)
> >     > x = structure(947016000.025, class = c("POSIXct", "POSIXt"), tzone = "UTC")
> >
> > I think you've fallen into the R FAQ 7.31 trap :
> >
> >> ct <- 947016000.025
> >> ct %% 1
> > [1] 0.02499998
> >>
> >
> > Of course, the issue may still be somewhat interesting, ...
> >
> > Yes, POSIXct is of limited precision and I think the help page
> > you mentioned did document that that's one reason for using
> > POSIXlt instead, as there, sub second accuracy can be much better.
> >
> > But FAQ 7.31 and the fact that all numbers are base 2 and in
> > base 2, no decimal .025 can be represented in full accuracy.
> >
> > Also, as you've noticed the R POSIX[cl]t code just truncates,
> > i.e. rounds towards 0 unconditionally, and I tend to agree that it
> > should rather round than truncate.
>
> If you print the hour and minute at 01:59:59, you get 1 and 59, not 2
> and 0. That may be the motivation for doing the same for fractional
> seconds. Should 1:59:59.9 really print as 2:00:00?
>
> Duncan Murdoch
>
> > But we should carefully separate the issues here, from the
> > underlying inherent FAQ 7.31 truth that most decimal numbers in
> > a computer are not quite what they look like ...
> >
> > Martin Maechler
> > ETH Zurich and R Core Team (also author of the CRAN package 'round')
> >
> >     > summary(x, digits = 20)
> >     > #>                      Min.                   1st Qu.                    Median
> >     > #> "2000-01-04 20:00:00.024" "2000-01-04 20:00:00.024" "2000-01-04 20:00:00.024"
> >     > #>                      Mean                   3rd Qu.                      Max.
> >     > #> "2000-01-04 20:00:00.024" "2000-01-04 20:00:00.024" "2000-01-04 20:00:00.024"
> >     > x
> >     > #> [1] "2000-01-04 20:00:00.024 UTC"
> >     > format.POSIXct(x, format = "%Y-%m-%d %H:%M:%OS3")
> >     > #> [1] "2000-01-04 20:00:00.024"
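The two issues separated in the thread above (the FAQ 7.31 representation error and the truncation done by %OSn) can be looked at one at a time. A minimal sketch: the half-millisecond shift is an assumption of this example, a display-only trick that nobody in the thread proposed, and the names `plain` and `shifted` are made up here.

```r
ct <- 947016000.025

# FAQ 7.31: the stored double sits slightly below the decimal 947016000.025.
frac <- ct %% 1
sprintf("%.10f", frac)
frac < 0.025

x <- structure(ct, class = c("POSIXct", "POSIXt"), tzone = "UTC")

# %OS3 truncates, which is why the thread reports "...00.024" for this value.
plain <- format(x, format = "%Y-%m-%d %H:%M:%OS3")

# Assumption: adding half a millisecond before formatting emulates rounding
# to the nearest millisecond, for display purposes only.
shifted <- format(x + 0.0005, format = "%Y-%m-%d %H:%M:%OS3")

c(plain = plain, shifted = shifted)
```

Such a shift can carry into the next second (or minute), which is the 1:59:59.9 concern raised in the thread, so it is only reasonable where that is acceptable.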
[Rd] Printing digits.secs on data.frame?
Is there a way to have printed data.frames with POSIXct columns display milliseconds if digits.secs is set as a default? You can use the digits argument in print, such as print(df, digits = 3), to get the intended output, but I assumed this would happen when the option digits.secs is set. Tibbles by default do this printing, which is shown below, but I was unsure if digits.secs should affect printing data.frames, as we see below it affects printing POSIXct outside of a data.frame.

``` r
df = structure(list(
  time = structure(c(1509375600, 1509375600.03333, 1509375600.06667,
                     1509375600.1, 1509375600.13333, 1509375600.16667),
                   class = c("POSIXct", "POSIXt"), tzone = "GMT"),
  X = c(0.188, 0.18, 0.184, 0.184, 0.184, 0.184),
  Y = c(0.145, 0.125, 0.121, 0.121, 0.117, 0.125),
  Z = c(-0.984, -0.988, -0.984, -0.992, -0.988, -0.988)),
  row.names = c(NA, 6L), class = "data.frame")
options(digits.secs = NULL)
getOption("digits.secs")
#> NULL
df
#>                  time     X     Y      Z
#> 1 2017-10-30 15:00:00 0.188 0.145 -0.984
#> 2 2017-10-30 15:00:00 0.180 0.125 -0.988
#> 3 2017-10-30 15:00:00 0.184 0.121 -0.984
#> 4 2017-10-30 15:00:00 0.184 0.121 -0.992
#> 5 2017-10-30 15:00:00 0.184 0.117 -0.988
#> 6 2017-10-30 15:00:00 0.184 0.125 -0.988
print(df)
#>                  time     X     Y      Z
#> 1 2017-10-30 15:00:00 0.188 0.145 -0.984
#> 2 2017-10-30 15:00:00 0.180 0.125 -0.988
#> 3 2017-10-30 15:00:00 0.184 0.121 -0.984
#> 4 2017-10-30 15:00:00 0.184 0.121 -0.992
#> 5 2017-10-30 15:00:00 0.184 0.117 -0.988
#> 6 2017-10-30 15:00:00 0.184 0.125 -0.988
df$time
#> [1] "2017-10-30 15:00:00 GMT" "2017-10-30 15:00:00 GMT"
#> [3] "2017-10-30 15:00:00 GMT" "2017-10-30 15:00:00 GMT"
#> [5] "2017-10-30 15:00:00 GMT" "2017-10-30 15:00:00 GMT"
print(df, digits = 3)
#>                      time     X     Y      Z
#> 1 2017-10-30 15:00:00.000 0.188 0.145 -0.984
#> 2 2017-10-30 15:00:00.033 0.180 0.125 -0.988
#> 3 2017-10-30 15:00:00.066 0.184 0.121 -0.984
#> 4 2017-10-30 15:00:00.099 0.184 0.121 -0.992
#> 5 2017-10-30 15:00:00.133 0.184 0.117 -0.988
#> 6 2017-10-30 15:00:00.166 0.184 0.125 -0.988
tibble::as_tibble(df)
#> # A tibble: 6 × 4
#>   time                    X     Y      Z
#>   <dttm>              <dbl> <dbl>  <dbl>
#> 1 2017-10-30 15:00:00 0.188 0.145 -0.984
#> 2 2017-10-30 15:00:00 0.18  0.125 -0.988
#> 3 2017-10-30 15:00:00 0.184 0.121 -0.984
#> 4 2017-10-30 15:00:00 0.184 0.121 -0.992
#> 5 2017-10-30 15:00:00 0.184 0.117 -0.988
#> 6 2017-10-30 15:00:00 0.184 0.125 -0.988
```

We see by default tibbles do this printing

``` r
options(digits.secs = 3)
getOption("digits.secs")
#> [1] 3
df
#>                  time     X     Y      Z
#> 1 2017-10-30 15:00:00 0.188 0.145 -0.984
#> 2 2017-10-30 15:00:00 0.180 0.125 -0.988
#> 3 2017-10-30 15:00:00 0.184 0.121 -0.984
#> 4 2017-10-30 15:00:00 0.184 0.121 -0.992
#> 5 2017-10-30 15:00:00 0.184 0.117 -0.988
#> 6 2017-10-30 15:00:00 0.184 0.125 -0.988
print(df)
#>                  time     X     Y      Z
#> 1 2017-10-30 15:00:00 0.188 0.145 -0.984
#> 2 2017-10-30 15:00:00 0.180 0.125 -0.988
#> 3 2017-10-30 15:00:00 0.184 0.121 -0.984
#> 4 2017-10-30 15:00:00 0.184 0.121 -0.992
#> 5 2017-10-30 15:00:00 0.184 0.117 -0.988
#> 6 2017-10-30 15:00:00 0.184 0.125 -0.988
```

We see that this affects printing POSIXct outside of a data.frame

``` r
df$time
#> [1] "2017-10-30 15:00:00.000 GMT" "2017-10-30 15:00:00.033 GMT"
#> [3] "2017-10-30 15:00:00.066 GMT" "2017-10-30 15:00:00.099 GMT"
#> [5] "2017-10-30 15:00:00.133 GMT" "2017-10-30 15:00:00.166 GMT"
print(df, digits = 3)
#>                      time     X     Y      Z
#> 1 2017-10-30 15:00:00.000 0.188 0.145 -0.984
#> 2 2017-10-30 15:00:00.033 0.180 0.125 -0.988
#> 3 2017-10-30 15:00:00.066 0.184 0.121 -0.984
#> 4 2017-10-30 15:00:00.099 0.184 0.121 -0.992
#> 5 2017-10-30 15:00:00.133 0.184 0.117 -0.988
#> 6 2017-10-30 15:00:00.166 0.184 0.125 -0.988
tibble::as_tibble(df)
#> # A tibble: 6 × 4
#>   time                        X     Y      Z
#>   <dttm>                  <dbl> <dbl>  <dbl>
#> 1 2017-10-30 15:00:00.000 0.188 0.145 -0.984
#> 2 2017-10-30 15:00:00.033 0.18  0.125 -0.988
#> 3 2017-10-30 15:00:00.066 0.184 0.121 -0.984
#> 4 2017-10-30 15:00:00.099 0.184 0.121 -0.992
#> 5 2017-10-30 15:00:00.133 0.184 0.117 -0.988
#> 6 2017-10-30 15:00:00.166 0.184 0.125 -0.988
```

Created on 2024-07-18 with [reprex v2.1.0](https://reprex.tidyverse.org)

Session info

``` r
sessioninfo::session_info()
#> ─ Session info ───
#>  setting  value
#>  version  R version 4.4.0 (2024-04-24)
#>  os       macOS Sonoma 14.4.1
#>  system   x86_64, darwin20
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       America/New_York
#>  date     2024-07-18
#>  pandoc   3.2 @ /usr/local/bin/ (via rmarkdown)
#>
```
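Besides print(df, digits = 3), another option for the question above is to format the POSIXct column explicitly before printing. A minimal sketch with made-up data: `df_print` is a hypothetical display copy, and the fractional value .5 is chosen only because it is exactly representable as a double.

``` r
df <- data.frame(
  time = as.POSIXct(c(1509375600, 1509375600.5),
                    origin = "1970-01-01", tz = "GMT"),
  X = c(0.188, 0.18)
)

# Display copy: replace the POSIXct column with character strings that
# carry three fractional digits, independent of options(digits.secs).
df_print <- df
df_print$time <- format(df$time, format = "%Y-%m-%d %H:%M:%OS3")

print(df_print)
```

The original `df` keeps its POSIXct column; only the copy used for display is converted to character.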