[Rd] Integrate errors on certain functions

2018-03-23 Thread John Muschelli
In the help for ?integrate:

> When integrating over infinite intervals do so explicitly, rather than
> just using a large number as the endpoint. This increases the chance of a
> correct answer – any function whose integral over an infinite interval is
> finite must be near zero for most of that interval.

I understand that and there are examples such as:

## a slowly-convergent integral
integrand <- function(x) {1/((x+1)*sqrt(x))}
integrate(integrand, lower = 0, upper = Inf)

## don't do this if you really want the integral from 0 to Inf
integrate(integrand, lower = 0, upper = 100, stop.on.error = FALSE)
#> failed with message ‘the integral is probably divergent’

which reports a failure message when stop.on.error = FALSE. But what should
happen with a function like the one below?

integrate(function(x) exp(-x), lower = 0, upper = Inf)
#> 1 with absolute error < 5.7e-05
integrate(function(x) exp(-x), lower = 0, upper = 13000)
#> 2.819306e-05 with absolute error < 5.6e-05
integrate(function(x) exp(-x), lower = 0, upper = 13000, stop.on.error = FALSE)
#> 2.819306e-05 with absolute error < 5.6e-05

I'm not sure whether this is a bug or a misuse of the function, but I would
expect the last integrate call to report a failure when stop.on.error = FALSE.
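
One possible reading, offered as a sketch rather than a diagnosis: integrate() samples the integrand at a finite set of points, and on [0, 13000] nearly all of them fall where exp(-x) is numerically zero, so the routine sees a flat function and reports convergence. The example below compares the explicit infinite limit with a manual split of the finite range. The cut point `cut = 50` is an assumption chosen only for illustration, not a general rule.

```r
f <- function(x) exp(-x)

# Recommended form: state the infinite upper limit explicitly.
res_inf <- integrate(f, lower = 0, upper = Inf)

# Illustrative workaround for a large finite limit: split the range so the
# first piece covers the region where the integrand is not negligible.
# `cut` is an assumption made for this sketch.
cut <- 50
res_head <- integrate(f, lower = 0, upper = cut)
res_tail <- integrate(f, lower = cut, upper = 13000)
res_split <- res_head$value + res_tail$value

# The single call over the whole finite range, as in the post above.
res_big <- integrate(f, lower = 0, upper = 13000, stop.on.error = FALSE)

print(c(infinite = res_inf$value, split = res_split, single = res_big$value))

# Every result carries a status message that can be inspected directly.
print(res_big$message)
```

Checking `$message` and comparing against a split or infinite-limit result is a cheap sanity test when a large finite endpoint cannot be avoided.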


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Read.dcf with no newline ending: gzfile drops last line

2016-11-14 Thread John Muschelli
I don't know if this is a bug per se, but it is an undesired behavior in
read.dcf.  read.dcf takes a file argument and passes it to gzfile if
it's a character:

    if (is.character(file)) {
        file <- gzfile(file)
        on.exit(close(file))
    }

This gzfile connection is passed to readLines (line #39):

    lines <- readLines(file)

If no newline is at the end of the file, readLines on this connection
doesn't give a warning (I think appropriate behavior).  If a DESCRIPTION
file happens to lack a newline at the end (odd, but it may happen),
then the last tag is dropped:

> x = "Package: test
+ Type: Package"
>
> ##
> # No Newline in file
> ##
> fname = tempfile()
> writeLines(x, fname, sep = "")
>
> ### readlines with character - warning but all fields
> readLines(fname)
[1] "Package: test" "Type: Package"
Warning message:
In readLines(fname) :
  incomplete final line found on
'/var/folders/1s/wrtqcpxn685_zk570bnx9_rrgr/T//Rtmpz95dsT/file180a65a6b745'
> ### readlines with file connection - warning but all fields
> file_con <- file(fname)
> readLines(file_con)
[1] "Package: test" "Type: Package"
Warning message:
In readLines(file_con) :
  incomplete final line found on
'/var/folders/1s/wrtqcpxn685_zk570bnx9_rrgr/T//Rtmpz95dsT/file180a65a6b745'
>
> ### readlines with gzfile connection
> ## no warning and drops last field
> gz_con = gzfile(fname)
> readLines(gz_con) # ONLY 1 line!
[1] "Package: test"
>
> ##
> # Newline at end of file - fine
> ##
> ### readlines with gzfile connection
> ## no warning, and all fields are read
> writeLines(x, fname, sep = "\n")
> gz_con = gzfile(fname)
> readLines(gz_con)
[1] "Package: test" "Type: Package"

Currently I call file(fname) and pass that connection to read.dcf, so that
a warning occurs but all fields are read.  I didn't see anything in the
read.dcf help about this.  readLines states clearly:
"If the final line is incomplete (no final EOL marker) the behaviour
depends on whether the connection is blocking or not", but it's not
100% clear that read.dcf uses gzfile even if the file is not compressed.
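
The workaround described above can be sketched as a small wrapper. Both function names below (`ends_with_newline`, `read_dcf_checked`) are hypothetical helpers invented for this illustration, and the sketch assumes an uncompressed file, since the last byte of a gzipped file says nothing about its text content.

```r
# Hypothetical helper: TRUE if the file's last byte is a newline ("\n").
ends_with_newline <- function(path) {
  size <- file.size(path)
  if (is.na(size) || size == 0) return(FALSE)
  bytes <- readBin(path, what = "raw", n = size)
  identical(bytes[length(bytes)], as.raw(0x0a))
}

# Hypothetical wrapper: warn about a missing final newline, then read through
# a plain file() connection instead of letting read.dcf() create a gzfile().
read_dcf_checked <- function(path, ...) {
  if (!ends_with_newline(path)) {
    warning("incomplete final line found on '", path, "'")
  }
  con <- file(path)
  on.exit(close(con))
  read.dcf(con, ...)
}

# Same two-field example as above, written without a final newline.
fname <- tempfile()
writeLines("Package: test\nType: Package", fname, sep = "")
print(ends_with_newline(fname))

dcf <- suppressWarnings(read_dcf_checked(fname))
print(colnames(dcf))
```

Passing a connection rather than a path keeps the choice of connection type in the caller's hands, which is the point of the workaround.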


Thanks
John



Re: [Rd] Floating Point with POSIXct

2022-03-03 Thread John Muschelli
That’s a good point. I think that’s fair, and it is why rounding may not be
the appropriate default.  Oddly enough, I think 1:59:60 may be more
appropriate, though wrong.  Given the way the seconds are stored separately
in POSIXlt, however, I don’t think that would ever happen; the big downside
would be if that would round to 1:59:59.00

On Thu, Mar 3, 2022 at 2:38 PM Duncan Murdoch wrote:

> On 03/03/2022 11:52 a.m., Martin Maechler wrote:
> >>>>>> John Muschelli
> >>>>>>  on Thu, 3 Mar 2022 11:04:05 -0500 writes:
> >
> >  > I see in ?POSIXct and I'm trying to understand the note:
> >  >> Classes "POSIXct" and "POSIXlt" are able to express fractions of
> >  >> a second. (Conversion of fractions between the two forms may not be exact,
> >  >> but will have better than microsecond accuracy.)
> >
> >  > Mainly, I'm trying to understand printing of POSIXct with fractional
> >  > seconds.  I see print.POSIXct calls format.POSIXct and eventually
> >  > calls format.POSIXlt, which then takes into account `digits.secs` for
> >  > printing. The format uses %OS3, which strptime indicates (* added):
> >
> >  >> Specific to R is %OSn, which for output gives the seconds
> >  >> *truncated* to 0 <= n <= 6 decimal places (and if %OS is not followed by a
> >  >> digit, it uses the setting of getOption("digits.secs"), or if that is
> >  >> unset, n = 0).
> >
> >  > So I'm seeing it truncates the seconds to 3 digits, so I think that is
> >  > why the below is printing 0.024.
> >
> >  > I think this is especially relevant even if you set
> >  > `options(digits.secs = 6)`, then the code in
> >  > format.POSIXlt would still return np=3 as the following condition
> >  > would break at i = 3
> >
> >  > for (i in seq_len(np) - 1L)
> >  >   if (all(abs(secs - round(secs, i)) < 1e-06)) {
> >  >     np <- i
> >  >     break
> >  >   }
> >
> >  > as sub_seconds - round(sub_seconds,3) < 1e-06.   This seems to be
> >  > expected behavior given the docs, but would any consider this a bug?
> >
> >
> >  > Example:
> >
> >  > options(digits.secs = 4)
> >  > x = structure(947016000.025, class = c("POSIXct", "POSIXt"), tzone = "UTC")
> >
> > I think you've fallen into the R FAQ 7.31 trap :
> >
> >> ct <- 947016000.025
> >> ct %% 1
> > [1] 0.02499998
> >>
> >
> > Of course, the issue may still be somewhat interesting, ...
> >
> > Yes, POSIXct is of limited precision and I think the help page
> > you mentioned did document that that's one reason for using
> > POSIXlt instead, as there, sub second accuracy can be much better.
> >
> > But FAQ 7.31 and the fact that all numbers are base 2 and in
> > base 2,  no decimal   .025   can be represented in full accuracy.
> >
> > Also, as you've noticed the R POSIX[cl]t  code just truncates,
> > i.e. rounds towards 0 unconditionally, and I tend to agree that it
> > should rather round than truncate.
>
> If you print the hour and minute at 01:59:59, you get 1 and 59, not 2
> and 0.  That may be the motivation for doing the same for fractional
> seconds.  Should 1:59:59.9 really print as 2:00:00?
>
> Duncan Murdoch
> >
> > But we should carefully separate the issues here, from the
> > underlying inherent FAQ 7.31 truth that most decimal numbers in
> > a computer are not quite what they look like ...
> >
> > Martin Maechler
> > ETH Zurich and  R Core Team (also author of the CRAN package 'round')
> >
> >
> >  > summary(x, digits = 20)
> >  > #>                      Min.                   1st Qu.                    Median
> >  > #> "2000-01-04 20:00:00.024" "2000-01-04 20:00:00.024" "2000-01-04 20:00:00.024"
> >  > #>                      Mean                   3rd Qu.                      Max.
> >  > #> "2000-01-04 20:00:00.024" "2000-01-04 20:00:00.024" "2000-01-04 20:00:00.024"
> >  > x
> >  > #> [1] "2000-01-04 20:00:00.024 UTC"
> >  > format.POSIXct(x, format = "%Y-%m-%d %H:%M:%OS3")
> >  > #> [1] "2000-01-04 20:00:00.024"
> > 
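
The trade-off discussed in this thread can be made concrete with a small sketch. It reuses the timestamp from the original post and contrasts the truncating `%OS3` format with a hypothetical helper, `format_rounded` (not part of base R), that emulates rounding by shifting the value half a unit in the last printed place. The second timestamp, `y`, is an assumed value chosen to reproduce the 1:59:59.9 carry case raised above.

```r
x <- structure(947016000.025, class = c("POSIXct", "POSIXt"), tzone = "UTC")

# The stored double is slightly below .025 (R FAQ 7.31), so is its fraction.
print(as.numeric(x) %% 1, digits = 10)

# %OS3 truncates, which gives the .024 reported in the thread.
fmt <- "%Y-%m-%d %H:%M:%OS3"
truncated <- format(x, format = fmt)

# Hypothetical helper: emulate rounding by adding half a unit in the last
# printed place before the truncating format is applied.
format_rounded <- function(x, digits = 3L) {
  shifted <- x + 0.5 * 10^(-digits)
  format(shifted, format = paste0("%Y-%m-%d %H:%M:%OS", digits))
}
rounded <- format_rounded(x)

# Assumed example of the carry case: a time just below 02:00:00.
y <- structure(946951199.9996, class = c("POSIXct", "POSIXt"), tzone = "UTC")

print(c(truncated = truncated, rounded = rounded))
print(c(truncated = format(y, format = fmt), rounded = format_rounded(y)))
```

With rounding, a time that is still inside 01:59:59 prints as the next hour, which is the behavior questioned in the reply above. Truncation never shows a clock reading that has not yet been reached.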

[Rd] Printing digits.secs on data.frame?

2024-07-18 Thread John Muschelli
Is there a way to have data.frames with POSIXct columns print milliseconds
by default when digits.secs is set?

You can use the digits argument in print, such as print(df, digits = 3), to
get the intended output, but I assumed this would happen with the option
digits.secs set.  Tibbles do this printing by default, as shown below, but
I was unsure whether digits.secs should affect printing data.frames; as we
see below, it does affect printing POSIXct outside of a data.frame.

``` r
df = structure(list(time = structure(c(1509375600, 1509375600.03333,
                                       1509375600.06667, 1509375600.1,
                                       1509375600.13333, 1509375600.16667),
                                     class = c("POSIXct", "POSIXt"), tzone = "GMT"),
                    X = c(0.188, 0.18, 0.184, 0.184, 0.184, 0.184),
                    Y = c(0.145, 0.125, 0.121, 0.121, 0.117, 0.125),
                    Z = c(-0.984, -0.988, -0.984, -0.992, -0.988, -0.988)),
               row.names = c(NA, 6L), class = "data.frame")
options(digits.secs = NULL)
getOption("digits.secs")
#> NULL
df
#>                  time     X     Y      Z
#> 1 2017-10-30 15:00:00 0.188 0.145 -0.984
#> 2 2017-10-30 15:00:00 0.180 0.125 -0.988
#> 3 2017-10-30 15:00:00 0.184 0.121 -0.984
#> 4 2017-10-30 15:00:00 0.184 0.121 -0.992
#> 5 2017-10-30 15:00:00 0.184 0.117 -0.988
#> 6 2017-10-30 15:00:00 0.184 0.125 -0.988
print(df)
#>                  time     X     Y      Z
#> 1 2017-10-30 15:00:00 0.188 0.145 -0.984
#> 2 2017-10-30 15:00:00 0.180 0.125 -0.988
#> 3 2017-10-30 15:00:00 0.184 0.121 -0.984
#> 4 2017-10-30 15:00:00 0.184 0.121 -0.992
#> 5 2017-10-30 15:00:00 0.184 0.117 -0.988
#> 6 2017-10-30 15:00:00 0.184 0.125 -0.988
df$time
#> [1] "2017-10-30 15:00:00 GMT" "2017-10-30 15:00:00 GMT"
#> [3] "2017-10-30 15:00:00 GMT" "2017-10-30 15:00:00 GMT"
#> [5] "2017-10-30 15:00:00 GMT" "2017-10-30 15:00:00 GMT"

print(df, digits = 3)
#>                      time     X     Y      Z
#> 1 2017-10-30 15:00:00.000 0.188 0.145 -0.984
#> 2 2017-10-30 15:00:00.033 0.180 0.125 -0.988
#> 3 2017-10-30 15:00:00.066 0.184 0.121 -0.984
#> 4 2017-10-30 15:00:00.099 0.184 0.121 -0.992
#> 5 2017-10-30 15:00:00.133 0.184 0.117 -0.988
#> 6 2017-10-30 15:00:00.166 0.184 0.125 -0.988
tibble::as_tibble(df)
#> # A tibble: 6 × 4
#>   time                    X     Y      Z
#>   <dttm>              <dbl> <dbl>  <dbl>
#> 1 2017-10-30 15:00:00 0.188 0.145 -0.984
#> 2 2017-10-30 15:00:00 0.18  0.125 -0.988
#> 3 2017-10-30 15:00:00 0.184 0.121 -0.984
#> 4 2017-10-30 15:00:00 0.184 0.121 -0.992
#> 5 2017-10-30 15:00:00 0.184 0.117 -0.988
#> 6 2017-10-30 15:00:00 0.184 0.125 -0.988
```

With digits.secs = 3 set, the data.frame still prints without milliseconds
(tibbles, shown further below, do print them by default):

``` r
options(digits.secs = 3)
getOption("digits.secs")
#> [1] 3
df
#>                  time     X     Y      Z
#> 1 2017-10-30 15:00:00 0.188 0.145 -0.984
#> 2 2017-10-30 15:00:00 0.180 0.125 -0.988
#> 3 2017-10-30 15:00:00 0.184 0.121 -0.984
#> 4 2017-10-30 15:00:00 0.184 0.121 -0.992
#> 5 2017-10-30 15:00:00 0.184 0.117 -0.988
#> 6 2017-10-30 15:00:00 0.184 0.125 -0.988
print(df)
#>                  time     X     Y      Z
#> 1 2017-10-30 15:00:00 0.188 0.145 -0.984
#> 2 2017-10-30 15:00:00 0.180 0.125 -0.988
#> 3 2017-10-30 15:00:00 0.184 0.121 -0.984
#> 4 2017-10-30 15:00:00 0.184 0.121 -0.992
#> 5 2017-10-30 15:00:00 0.184 0.117 -0.988
#> 6 2017-10-30 15:00:00 0.184 0.125 -0.988
```

We see that digits.secs does affect printing POSIXct outside of a data.frame,
and that tibbles pick it up as well:

``` r
df$time
#> [1] "2017-10-30 15:00:00.000 GMT" "2017-10-30 15:00:00.033 GMT"
#> [3] "2017-10-30 15:00:00.066 GMT" "2017-10-30 15:00:00.099 GMT"
#> [5] "2017-10-30 15:00:00.133 GMT" "2017-10-30 15:00:00.166 GMT"
print(df, digits = 3)
#>                      time     X     Y      Z
#> 1 2017-10-30 15:00:00.000 0.188 0.145 -0.984
#> 2 2017-10-30 15:00:00.033 0.180 0.125 -0.988
#> 3 2017-10-30 15:00:00.066 0.184 0.121 -0.984
#> 4 2017-10-30 15:00:00.099 0.184 0.121 -0.992
#> 5 2017-10-30 15:00:00.133 0.184 0.117 -0.988
#> 6 2017-10-30 15:00:00.166 0.184 0.125 -0.988
tibble::as_tibble(df)
#> # A tibble: 6 × 4
#>   time                        X     Y      Z
#>   <dttm>                  <dbl> <dbl>  <dbl>
#> 1 2017-10-30 15:00:00.000 0.188 0.145 -0.984
#> 2 2017-10-30 15:00:00.033 0.18  0.125 -0.988
#> 3 2017-10-30 15:00:00.066 0.184 0.121 -0.984
#> 4 2017-10-30 15:00:00.099 0.184 0.121 -0.992
#> 5 2017-10-30 15:00:00.133 0.184 0.117 -0.988
#> 6 2017-10-30 15:00:00.166 0.184 0.125 -0.988
```

Created on 2024-07-18 with [reprex v2.1.0](https://reprex.tidyverse.org)
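
One way to get millisecond output from a plain data.frame without changing the digits used for the numeric columns is to format the POSIXct columns before printing. The sketch below is only an illustration of that idea: `format_time_cols` is a hypothetical helper, not a base R function, and it assumes that truncated `%OSn` output (as in the results above) is acceptable.

```r
# Hypothetical helper: convert POSIXct columns to character using %OSn so the
# fractional seconds survive print.data.frame().
format_time_cols <- function(df, digits = getOption("digits.secs")) {
  if (is.null(digits)) digits <- 0L
  fmt <- paste0("%Y-%m-%d %H:%M:%OS", digits)
  is_time <- vapply(df, inherits, logical(1), what = "POSIXct")
  if (any(is_time)) {
    df[is_time] <- lapply(df[is_time], format, format = fmt)
  }
  df
}

# Two of the timestamps from the example above.
small <- data.frame(
  time = structure(c(1509375600, 1509375600.06667),
                   class = c("POSIXct", "POSIXt"), tzone = "GMT"),
  X = c(0.188, 0.184)
)

shown <- format_time_cols(small, digits = 3)
print(shown)
```

The returned data.frame holds character timestamps, so it is meant for display only; the original object should be kept for any further computation.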



Session info


``` r
sessioninfo::session_info()
#> ─ Session info ───
#>  setting  value
#>  version  R version 4.4.0 (2024-04-24)
#>  os       macOS Sonoma 14.4.1
#>  system   x86_64, darwin20
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       America/New_York
#>  date     2024-07-18
#>  pandoc   3.2 @ /usr/local/bin/ (via rmarkdown)
#>
```


