Re: [Rd] Community Feedback: Git Repository for R-Devel

2018-01-04 Thread Mark van der Loo
This question has been discussed before on this list:
http://r.789695.n4.nabble.com/Why-R-project-source-code-is-not-on-Github-td4695779.html

See especially Jeroen's answer.

Best,
Mark

On Thu, 4 Jan 2018 at 01:11, Juan Telleria wrote:

> UNBIASED FACTS:
> • Bugzilla & R-devel mailing lists: remain unchanged, understood as
> ticketing platforms for bug reports and pull requests on the R-devel Git
> repository.
>
> •  Git Repository Options:
> A) Github (Cloud with Automated backups from GitHub to CRAN Server):
> https://github.com
> B) GitLab (self-hosted on CRAN): https://about.gitlab.com
> C) Phabricator (self-hosted on CRAN): https://www.phacility.com
> D) Microsoft CodePlex: https://www.codeplex.com
> E) Others: Unknown
>
> GOOGLE TRENDS:
> https://trends.google.com/trends/explore?date=all&q=Git,Svn,Github,Gitlab
>
> EXAMPLE
> Git Repository on Core Python: https://github.com/python
>
> PERSONAL OPINION / MOTIVATION:
> I think that moving efforts in this direction is important because it would
> allow true open-source innovation and open collaboration in R between:
> * the R community,
> * and R-Core,
> for:
> * R bug fixes,
> * and the core feature wishlist,
> as anyone would be able to:
> * check the unassigned bugs in Bugzilla (apart from R-Core),
> * and propose bug fixes themselves as pull requests (mentioning the
> Bugzilla bug ID or the mailing list thread).
>
> This would allow _individuals_ from universities or companies
> interested in the development of R:
> * apart from donating economic resources to the R Foundation,
> * to help maintain core R code themselves.
> This aligns with the true spirit of R, which is built by contributing
> individuals, for individuals.
>
> It would also allow focusing on the precise lines of code changed
> with each commit, and reverting changes easily, without verbose
> e-mails: tidy, clean, maintainable, and fast.
>
> Lastly, I noticed the R-devel archives do not have an e-mail ID (a unique
> unsigned integer), so it would be a good idea to add one for pull requests
> if Git were adopted.
>
> Juan
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Searching R Packages

2018-01-29 Thread Mark van der Loo
Dear Spencer,

Nice initiative!

I discover a lot of packages not by explicit search, but by running into
them. I find cranberries really helpful there, especially the twitter feed
(@CRANberries) and also r-bloggers, especially through Joseph Rickert's
monthly roundup of new packages. And then of course there is the R journal
and JSS, but those speak for themselves.

So maybe a 'keeping up to date' section would be nice in the article?

Best,
Mark

On Mon, 29 Jan 2018 at 00:25, Ravi Varadhan wrote:

> Hi Spencer,
> Thank you for this wonderful service to the R community.
>
> A suggestion:  it would be great to discuss how to search github and
> Bioconductor repositories.
>
> Thanks,
> Ravi
>
> 
> From: R-devel  on behalf of Spencer Graves
> 
> Sent: Saturday, January 27, 2018 11:17 AM
> To: R-Devel
> Subject: [Rd] Searching R Packages
>
> Hello, All:
>
>
>   Might you have time to review the article I recently posted to
> Wikiversity on "Searching R Packages"
> (https://en.wikiversity.org/wiki/Searching_R_Packages)?
>
>
>   Please edit this yourself or propose changes in the associated
> "Discuss" page or in an email to this list or to me.
>
>
>   My goal in this is to invite readers to turn that article into a
> proposal for improving the search capabilities in R that would
> ultimately be funded by, e.g., The R Foundation.
>
>
>   What do you think?
>
>
>   Please forward this to anyone you think might be interested.
>
>
>   Thanks for your contributions to improving the lot of humanity
> through better statistical software.
>
>
>Best Wishes,
>Spencer Graves, PhD
>Founder
>EffectiveDefense.org
>7300 W. 107th St. # 506
>Overland Park, KS 66212
>



Re: [Rd] Best practices in developing package: From a single file

2018-01-31 Thread Mark van der Loo
I fully agree with Joris and Hadley on roxygen2.


Additionally:

I wrote and published my first package before roxygen (or roxygen2) was
available. I found editing .Rd files extremely tedious (especially when code
is updated). For example, the fact that no space is allowed between } and {
in \item{}{} has hurt my brain quite a few times, especially since R CMD
check did not give any useful error message about it. To me it is a signal
that the Rd parser is rather primitive. Roxygen2, on the other hand, now
usually gives pretty good error messages when I make a syntax error.
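For readers who have not used it, here is a minimal sketch of roxygen2-style
documentation (the function and its tags are invented for illustration);
running roxygen2::roxygenise() on a package containing this would generate
the corresponding .Rd file, including the \item{}{} entries, from the
comments:

```r
#' Trimmed range of a numeric vector
#'
#' @param x A numeric vector.
#' @param trim Fraction of observations to drop at each end.
#' @return A length-two numeric vector: the trimmed minimum and maximum.
#' @examples
#' trimmed_range(c(1, 2, 3, 100), trim = 0.25)
#' @export
trimmed_range <- function(x, trim = 0) {
  # with trim = 0 this is just range(x), computed via quantile()
  quantile(x, probs = c(trim, 1 - trim), names = FALSE)
}
```

The point is that the documentation lives next to the code it describes, so
it is harder to forget updating it when the code changes.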

Also, the 'parent' of roxygen is Doxygen, which was already widely used
(also by me) in the C/C++ community before roxygen was published. I cannot
remember anyone ever complaining about C/C++ documentation deteriorating
because of Doxygen.


-Mark


On Wed, 31 Jan 2018 at 14:02, Joris Meys wrote:

> On Wed, Jan 31, 2018 at 1:41 PM, Duncan Murdoch 
> wrote:
>
> > On 31/01/2018 6:33 AM, Joris Meys wrote:
> >
> > 3. given your criticism, I'd like your opinion on where I can improve the
> >> documentation of https://github.com/CenterForStatistics-UGent/pim. I'm
> >> currently busy updating the help files for a next release on CRAN, so
> your
> >> input is more than welcome.
> >>
> >
> > After this invitation I sent some private comments to Joris.  I would say
> > his package does a pretty good job of documentation; it isn't the kind of
> > Roxygen-using package that I was complaining about.  So I will say I have
> > received an example of a Roxygen-using package that
> > has good help pages.
> >
>
> Thank you for the nice compliment and the valuable tips.
>
> --
> Joris Meys
> Statistical consultant
>
> Department of Data Analysis and Mathematical Modelling
> Ghent University
> Coupure Links 653, B-9000 Gent (Belgium)
> <
> https://maps.google.com/?q=Coupure+links+653,%C2%A0B-9000+Gent,%C2%A0Belgium&entry=gmail&source=g
> >
>
> ---
> Biowiskundedagen 2017-2018
> http://www.biowiskundedagen.ugent.be/
>
> ---
> Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
>



[Rd] Apparent bug in behavior of formulas with '-' operator for lm

2018-03-16 Thread Mark van der Loo
Dear R-developers,

In the 'lm' documentation, the '-' operator is only specified to be used
with -1 (to remove the intercept from the model).

However, the documentation also refers to the 'formula' help file, which
indicates that it is possible to subtract any term. Indeed, the following
works with no problems (the period '.' stands for 'all terms except the
lhs'):

d <- data.frame(x=rnorm(6), y=rnorm(6), z=letters[1:2])
m <- lm(x ~ . -z, data=d)
p <- predict(m,newdata=d)

Now, if I change 'z' so that it has only unique values and introduce an
NA in the response variable, the following happens:

d <- data.frame(x=rnorm(6),y=rnorm(6),z=letters[1:6])
d$x[1] <- NA
m <- lm(x ~ . -z, data=d)
p <- predict(m, newdata=d)
Error in model.frame.default(Terms, newdata, na.action = na.action, xlev =
object$xlevels) : factor z has new levels a
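One workaround appears to be writing out the remaining right-hand-side terms
(here only 'y') instead of using '. -z', so that 'z' never enters the terms
object at all; this is a sketch, not part of the original report:

```r
# same setup as above
d <- data.frame(x = rnorm(6), y = rnorm(6), z = letters[1:6])
d$x[1] <- NA

# equivalent fit, with the removed term simply omitted from the formula
m2 <- lm(x ~ y, data = d)
p2 <- predict(m2, newdata = d)  # runs without the "new levels" error
```

Of course this defeats the convenience of the '.' shorthand when there are
many predictors.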

It seems a bug to me, although one could argue that the 'lm' documentation
does not lead one to expect the '-' operator to work generally.

If it is a bug I'm happy to report it to bugzilla.

Thanks for all your efforts,
Mark

ps: I have not been able to test this on R 3.4.4 yet, but the NEWS does not
mention fixes related to 'lm' or 'predict'.


> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.4 LTS

Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.6.0
LAPACK: /usr/lib/lapack/liblapack.so.3.6.0

locale:
 [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
 LC_TIME=nl_NL.UTF-8LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=nl_NL.UTF-8LC_MESSAGES=en_US.UTF-8
LC_PAPER=nl_NL.UTF-8   LC_NAME=C
 [9] LC_ADDRESS=C   LC_TELEPHONE=C
 LC_MEASUREMENT=nl_NL.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_3.4.3 tools_3.4.3yaml_2.1.16



Re: [Rd] Apparent bug in behavior of formulas with '-' operator for lm

2018-03-16 Thread Mark van der Loo
Joris, the point is that 'z' is NOT used as a predictor in the model.
Therefore it should not affect predictions. Also, I find it suspicious that
the error only occurs when the response variable contains missings and 'z'
has only unique values (I have tested several other cases to confirm this).

-Mark

On Fri, 16 Mar 2018 at 13:03, Joris Meys wrote:

> It's not a bug per se. It's the effect of removing all observations linked
> to a certain level in your data frame. So the output of lm() doesn't
> contain a coefficient for level a of z, but your new data contains that
> level a. With a small addition, this works again:
>
> d <- data.frame(x=rnorm(12),y=rnorm(12),z=rep(letters[1:6],2))
>
> d$x[1] <- NA
> m <- lm(x ~ . -z, data=d)
> p <- predict(m, newdata=d)
>
> This is linked to an earlier discussion on Stack Overflow:
> https://stackoverflow.com/questions/48461980/prediction-in-r-glmm
> which led to an update to lme4: https://github.com/lme4/lme4/issues/452
>
> The point being that factors in your newdata should have the same levels
> as factors in the original data that was used to fit the model. If you add
> levels to these factors, it's impossible to use that model to predict for
> these new data.
>
> Cheers
> Joris
>
> On Fri, Mar 16, 2018 at 10:21 AM, Mark van der Loo <
> mark.vander...@gmail.com> wrote:
>
>> Dear R-developers,
>>
>> In the 'lm' documentation, the '-' operator is only specified to be used
>> with -1 (to remove the intercept from the model).
>>
>> However, the documentation also refers to the 'formula' help file, which
>> indicates that it is possible to subtract any term. Indeed, the following
>> works with no problems (the period '.' stands for 'all terms except the
>> lhs'):
>>
>> d <- data.frame(x=rnorm(6), y=rnorm(6), z=letters[1:2])
>> m <- lm(x ~ . -z, data=d)
>> p <- predict(m,newdata=d)
>>
>> Now, if I change 'z' so that it has only unique values, and I introduce an
>> NA in the predicted variable, the following happens:
>>
>> d <- data.frame(x=rnorm(6),y=rnorm(6),z=letters[1:6])
>> d$x[1] <- NA
>> m <- lm(x ~ . -z, data=d)
>> p <- predict(m, newdata=d)
>> Error in model.frame.default(Terms, newdata, na.action = na.action, xlev =
>> object$xlevels) : factor z has new levels a
>>
>> It seems a bug to me, although one could argue that 'lm's documentation
>> does not allow one to expect that the '-' operator should work generally.
>>
>> If it is a bug I'm happy to report it to bugzilla.
>>
>> Thanks for all your efforts,
>> Mark
>>
>> ps: I was not able to test this on R3.4.4 yet, but the NEWS does not
>> mention fixes related to 'lm' or 'predict'.
>>
>>
>> > sessionInfo()
>> R version 3.4.3 (2017-11-30)
>> Platform: x86_64-pc-linux-gnu (64-bit)
>> Running under: Ubuntu 16.04.4 LTS
>>
>> Matrix products: default
>> BLAS: /usr/lib/libblas/libblas.so.3.6.0
>> LAPACK: /usr/lib/lapack/liblapack.so.3.6.0
>>
>> locale:
>>  [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
>>  LC_TIME=nl_NL.UTF-8LC_COLLATE=en_US.UTF-8
>>  [5] LC_MONETARY=nl_NL.UTF-8LC_MESSAGES=en_US.UTF-8
>> LC_PAPER=nl_NL.UTF-8   LC_NAME=C
>>  [9] LC_ADDRESS=C   LC_TELEPHONE=C
>>  LC_MEASUREMENT=nl_NL.UTF-8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] stats graphics  grDevices utils datasets  methods   base
>>
>> loaded via a namespace (and not attached):
>> [1] compiler_3.4.3 tools_3.4.3yaml_2.1.16
>>
>
>
>
> --
> Joris Meys
> Statistical consultant
>
> Department of Data Analysis and Mathematical Modelling
> Ghent University
> Coupure Links 653, B-9000 Gent (Belgium)
>
> <https://maps.google.com/?q=Coupure+links+653,%C2%A0B-9000+Gent,%C2%A0Belgium&entry=gmail&source=g>
>
> ---
> Biowiskundedagen 2017-2018
> http://www.biowiskundedagen.ugent.be/
>
> ---
> Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
>



Re: [Rd] Apparent bug in behavior of formulas with '-' operator for lm

2018-03-16 Thread Mark van der Loo
Thanks, Joris,

This clarifies at least where exactly it comes from. I still find the
high-level behaviour of 'predict' very counter-intuitive, as the estimated
model contains no coefficients for 'z', but I think we agree on that.

I am not sure how much trouble it would be to improve this behavior, but
perhaps one of the core authors can have a look at it.

Best,
Mark

On Fri, 16 Mar 2018 at 13:22, Joris Meys wrote:

> Technically it is used as a predictor in the model. The information is
> contained in terms :
>
> > terms(x ~ . - z, data = d)
> x ~ (y + z) - z
> attr(,"variables")
> list(x, y, z)
> attr(,"factors")
>   y
> x 0
> y 1
> z 0
> attr(,"term.labels")
> [1] "y"
> attr(,"order")
> [1] 1
> attr(,"intercept")
> [1] 1
> attr(,"response")
> [1] 1
> attr(,".Environment")
>
> And the model.frame contains it :
>
> > head(model.frame(x ~ . - z, data = d))
>             x          y z
> 2 -0.06022984 -0.4483109 b
> 3  1.25293390  0.2687065 c
> 4 -1.11811090  0.8016076 d
> 5 -0.75521720 -0.7484931 e
> 6  0.93037156  0.4128456 f
> 7  1.32052028 -1.6609043 a
>
> It is at the construction of the model.matrix that z disappears, but the
> contrasts information for z is still attached :
>
> > attr(model.matrix(x ~ . - z, data = d),"contrasts")
> $z
> [1] "contr.treatment"
>
> As you can see from the error you printed, it is model.frame() complaining
> about it. In this case it wouldn't be necessary, but it is documented
> behaviour of model.frame. Which is why I didn't say "this is not a bug",
> but "this is not a bug per se". Meaning that this is not optimal behaviour
> and might not be what you expect, but it follows the documentation of the
> underlying functions.
>
> Solving it would require a bypass of model.frame() to construct the
> correct model.matrix for the new predictions, and that's far from trivial
> as model.matrix() itself depends on model.frame().
>
> Cheers
> Joris
>
>
>
>
>
> On Fri, Mar 16, 2018 at 1:09 PM, Mark van der Loo <
> mark.vander...@gmail.com> wrote:
>
>> Joris, the point is that 'z' is NOT used as a predictor in the model.
>> Therefore it should not affect predictions. Also, I find it suspicious that
>> the error only occurs when the response variable conitains missings and 'z'
>> is unique (I have tested several other cases to confirm this).
>>
>> -Mark
>>
>> Op vr 16 mrt. 2018 om 13:03 schreef Joris Meys :
>>
>>> It's not a bug per se. It's the effect of removing all observations
>>> linked to a certain level in your data frame. So the output of lm() doesn't
>>> contain a coefficient for level a of z, but your new data contains that
>>> level a. With a small addition, this works again:
>>>
>>> d <- data.frame(x=rnorm(12),y=rnorm(12),z=rep(letters[1:6],2))
>>>
>>> d$x[1] <- NA
>>> m <- lm(x ~ . -z, data=d)
>>> p <- predict(m, newdata=d)
>>>
>>> This is linked to another discussion earlier on stackoverflow :
>>> https://stackoverflow.com/questions/48461980/prediction-in-r-glmm
>>> which lead to an update to lme4 :
>>> https://github.com/lme4/lme4/issues/452
>>>
>>> The point being that factors in your newdata should have the same levels
>>> as factors in the original data that was used to fit the model. If you add
>>> levels to these factors, it's impossible to use that model to predict for
>>> these new data.
>>>
>>> Cheers
>>> Joris
>>>
>>> On Fri, Mar 16, 2018 at 10:21 AM, Mark van der Loo <
>>> mark.vander...@gmail.com> wrote:
>>>
>>>> Dear R-developers,
>>>>
>>>> In the 'lm' documentation, the '-' operator is only specified to be used
>>>> with -1 (to remove the intercept from the model).
>>>>
>>>> However, the documentation also refers to the 'formula' help file, which
>>>> indicates that it is possible to subtract any term. Indeed, the
>>>> following
>>>> works with no problems (the period '.' stands for 'all terms except the
>>>> lhs'):
>>>>
>>>> d <- data.frame(x=rnorm(6), y=rnorm(6), z=letters[1:2])
>>>> m <- lm(x ~ . -z, data=d)
>>>> p <- predict(m,newdata=d)
>>>>
>>>> Now, if I change 'z' so that it has only unique va

Re: [Rd] data.table not available as win binary for R 3.5

2018-04-24 Thread Mark van der Loo
FWIW, I see that stringdist also doesn't pass R CMD check on r-release and
r-devel on Windows, while on Linux, and on r-oldrel on Windows, there are no
problems [1].


A quick scan of the release notes for Windows-specific changes doesn't give
me a clue yet. I do see the following possibly significant warning in the
check output on Windows:

Warning: stack imbalance in '{', 39 then 40

I don't have a Windows PC handy on which I can quickly reproduce this, so if
anyone has solved similar problems it would be nice if they could post the
solution here.

Best,
Mark

[1] https://cran.r-project.org/web/checks/check_results_stringdist.html

On Tue, 24 Apr 2018 at 14:11, Jeroen Ooms wrote:

> On Tue, Apr 24, 2018 at 7:26 AM, Joris Meys  wrote:
> >
> > Dear all,
> >
> > to my astonishment data.table cannot be installed on R 3.5 Windows. When
> > checking the package page, the Windows binary is available for download.
>
>
> The package check page for data.table shows that it is currently failing
> R CMD check. As a precaution, CRAN does not publish binaries for
> packages that do not pass check, so I think this is why it seems
> unavailable.
>



Re: [Rd] length of `...`

2018-05-03 Thread Mark van der Loo
This question is better aimed at the r-help mailing list, as it is not about
developing R itself.


Having said that, I can only guess why you want to do this, but why not do
something like the following:

f <- function(...) {
  L <- list(...)
  len <- length(L)
  # you can still pass the ... as follows:
  do.call(someotherfunction, L)
}
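A quick usage sketch of the idea above, using sum as a stand-in for
someotherfunction (note that list(...) evaluates the arguments, which may
matter for the original question about unevaluated dots):

```r
f <- function(...) {
  L <- list(...)    # capture the arguments (this evaluates them)
  len <- length(L)  # number of arguments passed via ...
  do.call(sum, L)   # forward them to another function
}
f(1, 2, 3)  # 6
```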


-Mark

On Thu, 3 May 2018 at 16:29, Dénes Tóth wrote:

> Hi,
>
>
> In some cases the number of arguments passed as ... must be determined
> inside a function, without evaluating the arguments themselves. I use
> the following construct:
>
> dotlength <- function(...) length(substitute(expression(...))) - 1L
>
> # Usage (returns 3):
> dotlength(1, 4, something = undefined)
>
> How can I define a method for length() which could be called directly on
> `...`? Or is it an intention to extend the base length() function to
> accept ellipses?
>
>
> Regards,
> Denes
>



Re: [Rd] compairing doubles

2018-08-31 Thread Mark van der Loo
how about

is_evenly_spaced <- function(x,...) all.equal(diff(sort(x)),...)

(use ellipsis to set tolerance if necessary)


On Fri, 31 Aug 2018 at 15:46, Emil Bode wrote:

> Agreed that it's a rounding error, and all.equal would be the way to go.
> I wouldn't call it a bug, it's simply part of working with floating point
> numbers, any language has the same issue.
>
> And while we're at it, I think the function can be a lot shorter:
> .is_continous_evenly_spaced <- function(n){
>   length(n)>1 && isTRUE(all.equal(n[order(n)], seq(from=min(n), to=max(n),
> length.out = length(n))))
> }
>
> Cheers, Emil
>
> On Fri, 31 Aug 2018 at 15:10, Felix Ernst
> () wrote:
> >
> > Dear all,
> >
> > I am a bit unsure whether this qualifies as a bug, but it is definitely
> a strange behaviour. That is why I wanted to discuss it.
> >
> > With the following function, I want to test for evenly spaced
> numbers, starting from anywhere.
> >
> > .is_continous_evenly_spaced <- function(n){
> >   if(length(n) < 2) return(FALSE)
> >   n <- n[order(n)]
> >   n <- n - min(n)
> >   step <- n[2] - n[1]
> >   test <- seq(from = min(n), to = max(n), by = step)
> >   if(length(n) == length(test) &&
> >  all(n == test)){
> > return(TRUE)
> >   }
> >   return(FALSE)
> > }
> >
> > > .is_continous_evenly_spaced(c(1,2,3,4))
> > [1] TRUE
> > > .is_continous_evenly_spaced(c(1,3,4,5))
> > [1] FALSE
> > > .is_continous_evenly_spaced(c(1,1.1,1.2,1.3))
> > [1] FALSE
> >
> > I expect the result for 1 and 2, but not for 3. Upon investigation
> it turns out that n == test is TRUE for every pair, but not for the pair
> of 0.2.
> >
> > The types reported are always double, however n[2] == 0.1 reports
> FALSE as well.
> >
> > The whole problem is solved by switching from all(n == test) to
> all(as.character(n) == as.character(test)). However that is weird, isn’t it?
> >
> > Does this work as intended? Thanks for any help, advise and
> suggestions in advance.
>
> I guess this has something to do with how the sequence is built and
> the inherent error of floating point arithmetic. In fact, if you
> return test minus n, you'll get:
>
> [1] 0.00e+00 0.00e+00 2.220446e-16 0.00e+00
>
> and the error gets bigger when you continue the sequence; e.g., this
> is for c(1, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7):
>
> [1] 0.00e+00 0.00e+00 2.220446e-16 2.220446e-16 4.440892e-16
> [6] 4.440892e-16 4.440892e-16 0.00e+00
>
> So, independently of this is considered a bug or not, instead of
>
> length(n) == length(test) && all(n == test)
>
> I would use the following condition:
>
> isTRUE(all.equal(n, test))
>
> Iñaki
>
> >
> > Best regards,
> > Felix
> >
> >
>
>
>
> --
> Iñaki Ucar
>
>



Re: [Rd] compairing doubles

2018-08-31 Thread Mark van der Loo
Sorry for the second e-mail, but this is worth watching:
https://www.youtube.com/watch?v=3Bu7QUxzIbA&t=1s
It's Martin Maechler's talk at useR! 2018. This kind of material should be
mandatory for any aspiring programmer, data scientist, or statistician.

-Mark

On Fri, 31 Aug 2018 at 16:00, Mark van der Loo <mark.vander...@gmail.com>
wrote:

> how about
>
> is_evenly_spaced <- function(x,...) all.equal(diff(sort(x)),...)
>
> (use ellipsis to set tolerance if necessary)
>
>
> Op vr 31 aug. 2018 om 15:46 schreef Emil Bode :
>
>> Agreed that's it's rounding error, and all.equal would be the way to go.
>> I wouldn't call it a bug, it's simply part of working with floating point
>> numbers, any language has the same issue.
>>
>> And while we're at it, I think the function can be a lot shorter:
>> .is_continous_evenly_spaced <- function(n){
>>   length(n)>1 && isTRUE(all.equal(n[order(n)], seq(from=min(n),
>> to=max(n), length.out = length(n
>> }
>>
>> Cheers, Emil
>>
>> El vie., 31 ago. 2018 a las 15:10, Felix Ernst
>> () escribió:
>> >
>> > Dear all,
>> >
>> > I a bit unsure, whether this qualifies as a bug, but it is
>> definitly a strange behaviour. That why I wanted to discuss it.
>> >
>> > With the following function, I want to test for evenly space
>> numbers, starting from anywhere.
>> >
>> > .is_continous_evenly_spaced <- function(n){
>> >   if(length(n) < 2) return(FALSE)
>> >   n <- n[order(n)]
>> >   n <- n - min(n)
>> >   step <- n[2] - n[1]
>> >   test <- seq(from = min(n), to = max(n), by = step)
>> >   if(length(n) == length(test) &&
>> >  all(n == test)){
>> > return(TRUE)
>> >   }
>> >   return(FALSE)
>> > }
>> >
>> > > .is_continous_evenly_spaced(c(1,2,3,4))
>> > [1] TRUE
>> > > .is_continous_evenly_spaced(c(1,3,4,5))
>> > [1] FALSE
>> > > .is_continous_evenly_spaced(c(1,1.1,1.2,1.3))
>> > [1] FALSE
>> >
>> > I expect the result for 1 and 2, but not for 3. Upon Investigation
>> it turns out, that n == test is TRUE for every pair, but not for the pair
>> of 0.2.
>> >
>> > The types reported are always double, however n[2] == 0.1 reports
>> FALSE as well.
>> >
>> > The whole problem is solved by switching from all(n == test) to
>> all(as.character(n) == as.character(test)). However that is weird, isn’t it?
>> >
>> > Does this work as intended? Thanks for any help, advise and
>> suggestions in advance.
>>
>> I guess this has something to do with how the sequence is built and
>> the inherent error of floating point arithmetic. In fact, if you
>> return test minus n, you'll get:
>>
>> [1] 0.00e+00 0.00e+00 2.220446e-16 0.00e+00
>>
>> and the error gets bigger when you continue the sequence; e.g., this
>> is for c(1, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7):
>>
>> [1] 0.00e+00 0.00e+00 2.220446e-16 2.220446e-16 4.440892e-16
>> [6] 4.440892e-16 4.440892e-16 0.00e+00
>>
>> So, independently of this is considered a bug or not, instead of
>>
>> length(n) == length(test) && all(n == test)
>>
>> I would use the following condition:
>>
>> isTRUE(all.equal(n, test))
>>
>> Iñaki
>>
>> >
>> > Best regards,
>> > Felix
>> >
>> >
>>
>>
>>
>> --
>> Iñaki Ucar
>>
>>
>



Re: [Rd] compairing doubles

2018-08-31 Thread Mark van der Loo
Ah, my bad, you're right of course.

sum(abs(diff(diff(sort(x))))) < eps

for some reasonable eps then, would do as a one-liner, or

all(abs(diff(diff(sort(x)))) < eps)

or

max(abs(diff(diff(sort(x))))) < eps
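As a concrete sketch, with a tolerance filled in (the choice of eps is an
assumption here; sqrt(.Machine$double.eps) is a common default, e.g. the one
used by all.equal):

```r
eps <- sqrt(.Machine$double.eps)

# evenly spaced iff all second differences of the sorted values are ~0
is_evenly_spaced <- function(x) all(abs(diff(diff(sort(x)))) < eps)

is_evenly_spaced(c(1, 1.1, 1.2, 1.3))  # TRUE, despite floating point noise
is_evenly_spaced(c(1, 3, 4, 5))        # FALSE
```

Note that for vectors of length < 3 the second difference is empty and
all() of an empty logical is TRUE, so a length check may still be wanted.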


-Mark

On Fri, 31 Aug 2018 at 16:14, Iñaki Ucar wrote:

> On Fri, 31 Aug 2018 at 16:00, Mark van der Loo
> () wrote:
> >
> > how about
> >
> > is_evenly_spaced <- function(x,...) all.equal(diff(sort(x)),...)
>
> This doesn't work, because
>
> 1. all.equal does *not* return FALSE. Use of isTRUE or identical(.,
> TRUE) is required if you want a boolean.
> 2. all.equal compares two objects, not elements in a vector.
>
> Iñaki
>
> >
> > (use ellipsis to set tolerance if necessary)
> >
> >
> > Op vr 31 aug. 2018 om 15:46 schreef Emil Bode :
> >>
> >> Agreed that's it's rounding error, and all.equal would be the way to go.
> >> I wouldn't call it a bug, it's simply part of working with floating
> point numbers, any language has the same issue.
> >>
> >> And while we're at it, I think the function can be a lot shorter:
> >> .is_continous_evenly_spaced <- function(n){
> >>   length(n)>1 && isTRUE(all.equal(n[order(n)], seq(from=min(n),
> to=max(n), length.out = length(n
> >> }
> >>
> >> Cheers, Emil
> >>
> >> El vie., 31 ago. 2018 a las 15:10, Felix Ernst
> >> () escribió:
> >> >
> >> > Dear all,
> >> >
> >> > I a bit unsure, whether this qualifies as a bug, but it is
> definitly a strange behaviour. That why I wanted to discuss it.
> >> >
> >> > With the following function, I want to test for evenly space
> numbers, starting from anywhere.
> >> >
> >> > .is_continous_evenly_spaced <- function(n){
> >> >   if(length(n) < 2) return(FALSE)
> >> >   n <- n[order(n)]
> >> >   n <- n - min(n)
> >> >   step <- n[2] - n[1]
> >> >   test <- seq(from = min(n), to = max(n), by = step)
> >> >   if(length(n) == length(test) &&
> >> >  all(n == test)){
> >> > return(TRUE)
> >> >   }
> >> >   return(FALSE)
> >> > }
> >> >
> >> > > .is_continous_evenly_spaced(c(1,2,3,4))
> >> > [1] TRUE
> >> > > .is_continous_evenly_spaced(c(1,3,4,5))
> >> > [1] FALSE
> >> > > .is_continous_evenly_spaced(c(1,1.1,1.2,1.3))
> >> > [1] FALSE
> >> >
> >> > I expect the result for 1 and 2, but not for 3. Upon
> Investigation it turns out, that n == test is TRUE for every pair, but not
> for the pair of 0.2.
> >> >
> >> > The types reported are always double, however n[2] == 0.1 reports
> FALSE as well.
> >> >
> >> > The whole problem is solved by switching from all(n == test) to
> all(as.character(n) == as.character(test)). However that is weird, isn’t it?
> >> >
> >> > Does this work as intended? Thanks for any help, advise and
> suggestions in advance.
> >>
> >> I guess this has something to do with how the sequence is built and
> >> the inherent error of floating point arithmetic. In fact, if you
> >> return test minus n, you'll get:
> >>
> >> [1] 0.00e+00 0.00e+00 2.220446e-16 0.00e+00
> >>
> >> and the error gets bigger when you continue the sequence; e.g., this
> >> is for c(1, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7):
> >>
> >> [1] 0.00e+00 0.00e+00 2.220446e-16 2.220446e-16 4.440892e-16
> >> [6] 4.440892e-16 4.440892e-16 0.00e+00
> >>
> >> So, independently of this is considered a bug or not, instead of
> >>
> >> length(n) == length(test) && all(n == test)
> >>
> >> I would use the following condition:
> >>
> >> isTRUE(all.equal(n, test))
> >>
> >> Iñaki
> >>
> >> >
> >> > Best regards,
> >> > Felix
> >> >
> >> >
> >>
> >>
> >>
> >> --
> >> Iñaki Ucar
> >>
>
>
>
> --
> Iñaki Ucar
>



Re: [Rd] NEWS.md support on CRAN

2015-06-04 Thread Mark van der Loo
FWIW (and a bit late in the discussion, I know), I for one do not care
about having NEWS in md format at all.

The solution that Yihui uses (linking to GH from NEWS.Rd) is really annoying
for people without direct Internet access. For example, I work at an
institute that handles a lot of private data, and most VMs with R on them
have no direct Internet access for that reason (the Internet is accessible,
but only through an application running on a separate VM).

Moreover, as a user I also do not care at all about links to GH #issues and
which @user did what for each issue. These are details that are useful for
people developing the package or for people who reported a bug. As a user I
just want to read a short description like "bugfix: function f crashed on
input y" or "function g is deprecated" without having to first navigate to
another website.

The most important thing about NEWS is that it is easy to find (in a
fixed place) and aimed at users, not developers. It should come with the
software, so it is also available when GH is offline or replaced by
something new (since hey, didn't we all have a SourceForge or Google Code
account in our younger days?).

In short, I think the added value of NEWS.md is fairly limited, while it
increases the risk of dispersing the NEWS all over the web.

Best,
Mark










Op wo 3 jun. 2015 om 08:32 schreef Kurt Hornik :

> > Duncan Murdoch writes:
>
> > On 02/06/2015 11:05 AM, Dirk Eddelbuettel wrote:
> >> Hi Kurt,
> >>
> >> On 1 June 2015 at 14:02, Kurt Hornik wrote:
> >> | > peter dalgaard writes:
> >> |
> >> | >> On 30 May 2015, at 01:20 , Imanuel Costigan 
> wrote:
> >> | >>
> >> | >> So I assume this commit means NEWS.md is now no longer on
> blacklist?
> >> | >>
> >> |
> >> | > in the development version. Not true of released versions.
> >> |
> >> | Now also in r-patched.
> >>
> >> Nice.
> >>
> >> Now, is there a way for package authors to preview how a .md would be
> >> rendered?  I wrote mine with GitHub in mind, and they render fine. I
> looked a
> >> recently-uploaded README.md of mine on CRAN, and it got some of the
> pandoc-y
> >> parts wrong --- and looks unprofessional.
> >>
> >> I would like to avoid that.  How can I?
>
> > In the short term, you should probably try to run pandoc with the same
> > version and options as CRAN.  Kurt, can you say what these are?  If you
> > (Dirk) know pandoc options that emulate Github, it would probably make
> > sense for CRAN to use those.
>
> Sure.  We currently have
>
> pandoc 1.12.4.2
> Compiled with texmath 0.6.6.1, highlighting-kate 0.5.8.5.
>
> which we use with --email-obfuscation=references.
>
> Best
> -k
>
> > In the longer term, the plan is to include our own parser and renderer.
> > At that point this would be easy.
>
> > Duncan Murdoch
> >>
> >> Dirk
> >>
> >>
> >> | -k
> >> |
> >> | > -pd
> >> |
> >> |
> >> | >>
> https://github.com/wch/r-source/commit/9ffe87264a1cd59a31a829f72d57af0f1bfa327a
> >> | >>
> >> | >> Sent from my iPad
> >> | >>
> >> | >> On 23 May 2015, at 6:05 pm, Kurt Hornik 
> wrote:
> >> | >>
> >> |  Duncan Murdoch writes:
> >> | >>>
> >> | > On 22/05/2015 8:49 PM, Imanuel Costigan wrote:
> >> | > Are there any plans for CRAN to support NEWS files in markdown?
> Bit of a hassle to go the the package’s Github (or other like) site to read
> NEWS.
> >> | >>>
> >> |  Not as far as I know.  There have been discussions about
> increasing the
> >> |  support of Markdown, but so far the conclusion has been that
> it's too
> >> |  hard to do -- the support is not stable enough on all the
> platforms
> >> |  where R runs.
> >> | >>>
> >> | >>> There are actually two issues here.
> >> | >>>
> >> | >>> For CRAN, we could in principle take inst/NEWS.md files, convert
> these
> >> | >>> to HTML using pandoc, and use the HTML for the package web page.
> (Would
> >> | >>> need the CRAN incoming checks to be taught about inst/NEWS.md.)
> >> | >>>
> >> | >>> However, we cannot use such files for utils::news() because we do
> not
> >> | >>> (yet?) know how to reliably parse such files and extract the news
> items
> >> | >>> (and hence cannot really compute on the news information).
> >> | >>>
> >> | >>> Btw, currently only one package on CRAN has inst/NEWS.md (another
> one
> >> | >>> has NEWS.md at top level).
> >> | >>>
> >> | >>> Best
> >> | >>> -k
> >> | >>>
> >> |  Markdown is allowed for vignettes (because the package author
> processes
> >> |  those), so I'd suggest putting your news into a vignette instead
> of a
> >> |  news file.  Put in a token news file that points to the vignette
> so
> >> |  users can find it.
> >> | >>>
> >> |  Duncan Murdoch
> >> | >>>
> >> |  __
> >> |  R-devel@r-project.org mailing list
> >> |  https://stat.ethz.ch/mailman/listinfo/r-devel
> >> | >>
> >> | >> [[alternative HTML version deleted]]
> >> | >>
> >> | >> 

Re: [Rd] NEWS.md support on CRAN

2015-06-04 Thread Mark van der Loo
@Gavin: My aim was to point out that the ability to mix developer-facing
documentation with user-facing documentation is not a good reason to want
to support md.

I agree with Duncan that links to within a package would be useful (not
sure if NEWS.Rd supports this).

I'm not so convinced that package authors who do not even add a plain-text
NEWS file will create a NEWS.md file.

Adding NEWS.md means we now have three ways to specify the NEWS:

- a plain-text NEWS file, following the GNU recommendations
- NEWS.Rd
- NEWS.md

Would it not be more elegant to have e.g. roxygen2 generate NEWS.Rd?
(Perhaps this is already possible; I'm not sure.) I don't maintain CRAN,
but I know what I would prefer.

Cheers,
Mark




[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] is R syntax closed?

2015-09-19 Thread Mark van der Loo
> comment, some marker for 'command doesn't end at this line' etc.

That is not necessary, since R supports multi-line commands without the
need for a continuation marker.
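A minimal sketch of what this looks like at the prompt (the parser simply keeps reading while an expression is syntactically incomplete):

```r
# An expression that is incomplete at the end of a line continues
# on the next one; no backslash or other marker is needed.
x <- 1 +
  2 +
  3
x
# [1] 6
```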


> R syntax done and any extensions are forbidden?

R is maintained and extended by the R core team[1], who decide on the
direction of the GNU R project. Suggestions (and sometimes patches) are
posted on this list and may or may not be implemented (but please check
that your suggestion/question is indeed new by searching the list archive).
R is maintained in an svn repository and doesn't support pull requests the
way git does. Since R is free in the GNU sense, you can always create your
own local version; see e.g. [2].

> i'm new to R

welcome, and have fun!

best,
Mark


[1] https://www.r-project.org/contributors.html
[2] https://github.com/radfordneal/pqR



Op za 19 sep. 2015 om 04:02 schreef Piotr Turski :

> hi,
>
> i'm new to R and i discovered that for years people are complaining
> about lacking of very basic and hopefully simple things, like multiline
> comment, some marker for 'command doesn't end at this line' etc.
>
> so my question is why some of those things are still not implemented? is
> it because of compatibility/policy reasons? is R syntax done and any
> extensions are forbidden? or it's simply because everyone is doing
> something more interesting?
> if it's the second case then do such pull requests/patches have a chance
> of being accepted?
>
> --
> regards,
> piotrek
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Inconsistency in treating NaN-results?

2015-11-26 Thread Mark van der Loo
This question is more out of curiosity than a complaint or suggestion, but
I'm just wondering.

The behavior of R on calculations that result in NaN seems a bit
inconsistent.

# this is expected:
> 0/0
[1] NaN

# but this gives a warning
> sin(Inf)
[1] NaN
Warning message:
In sin(Inf) : NaNs produced

# and this again does not
> exp(NaN)
[1] NaN


Conceptually, I like to think that R computes over the real line augmented
with Inf, -Inf, NaN, and NA (which is technically also a NaN). As far as I
know, this set is closed under R's arithmetic operations and mathematical
functions (following the IEEE standard for double precision). If that's the
case, the result sin(Inf) = NaN seems normal to me and a warning is
unnecessary.

So why the choice to have a warning on sin(Inf), but not on 0/0 or exp(NaN)?
Is it just historical, or am I missing some reasoning or standard?


Best,
Mark

> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.3 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
 [3] LC_TIME=nl_NL.UTF-8LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=nl_NL.UTF-8LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=nl_NL.UTF-8   LC_NAME=C
 [9] LC_ADDRESS=C   LC_TELEPHONE=C
[11] LC_MEASUREMENT=nl_NL.UTF-8 LC_IDENTIFICATION=C

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Inconsistency in treating NaN-results?

2015-12-01 Thread Mark van der Loo
Thank you very much, Greg and Bill for clearing things up.

As said, it was more out of curiosity than anything else.

I think my examples were not completely inconsistent (as Greg suggested),
since arithmetic operations that generate NaNs do so without a warning:

> Inf - Inf
[1] NaN
> 0/0
[1] NaN

while mathematical functions generating NaNs do warn (as noted by Greg):

> sin(Inf)
[1] NaN
Warning message:
In sin(Inf) : NaNs produced
> log(-1)
[1] NaN
> gamma(-1)
[1] NaN
Warning message:
In gamma(-1) : NaNs produced

And this was what surprised me. The rule thus seems to be: arithmetic
operations do it without warning, standard math functions do it with a
warning. I think that is totally fine. As long as we can reason about what
to expect, we're good.

To answer Bill's question

>> If R did make sin(4.6e14) NaN, would you want a warning?

I think I like the behavior as it is (not producing NaN), but that's
because I feel I have a reasonable working knowledge of numerical
precision issues. I'm not an expert, but I believe that at least my
internal alarm bells start ringing at the right time, and my old copy of
Stoer and Bulirsch[1] is never much more than an arm's length away. So it
all depends on which users you're aiming at when implementing such things.
A (switchable) warning about loss of precision, without returning NaN,
would probably be a reasonable compromise.
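(As an aside: the warning is in some sense already switchable at the call site; a minimal sketch, not a proposal:)

```r
# Silence the NaN warning for a single call:
x <- suppressWarnings(sin(Inf))
x
# [1] NaN

# Or adjust warning handling globally, e.g. options(warn = -1) to
# suppress warnings, or options(warn = 2) to turn them into errors.
```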

Best,
Mark

[1] https://books.google.nl/books?id=1oDXWLb9qEkC



Op di 1 dec. 2015 om 00:09 schreef William Dunlap :

> As a side note, Splus makes sin(x) NA, with a warning, for
> abs(x)>1.6*2^48 (about
> 4.51e+14) because more than half the digits are incorrect in sin(x)
> for such x.  E.g.,
> in R we get:
>
> > options(digits=16)
> > library(Rmpfr)
> > sin(4.6e14)
> [1] -0.792253849684354
> > sin(mpfr(4.6e14, precBits=500))
> 1 'mpfr' number of precision  500   bits
> [1] -0.7922542110462653250609291646717356496505801794010...
>
> If R did make sin(4.6e14) NaN, would you want a warning?
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
>
> On Mon, Nov 30, 2015 at 2:38 PM, Greg Snow <538...@gmail.com> wrote:
> > R, and the S language it is based on, has evolved as much as it has
> > been designed, so there are often inconsistencies due to similar
> > functionality evolving along different paths.  In some cases these
> > inconsistencies are resolved, but generally only once someone notices
> > and cares enough to do something about it.  In other cases the
> > inconsistencies are left for historical reasons and for backward
> > compatibility (for example, some functions use the na.rm option and
> > others use the na.action option for how to deal with missing values).
> >
> > That said, you report inconsistencies in some function calls, but
> > your calls are not completely consistent.  Consider:
> >
> >> sin(NaN)
> > [1] NaN
> >
> > See, no warning, just like your other cases.  Also consider the
> > difference between log(-1) and log(NaN).  It looks like the warning
> > comes mainly when going from one type of exception (Inf) to another
> > (NaN), but not when propagating an NaN.
> >
> > The 'sin' function (and others) do not know whether the argument was
> > typed in as Inf or is the result of another function returning Inf
> > (well, technically it could be made to figure out some common cases,
> > but I for one don't see that as worth the effort).  So you could have
> > typed something like 'sin(myfun(x))': sin assumes that myfun would
> > have warned if it had created an NaN value, so a second warning is not
> > needed; but myfun may legitimately return Inf, so sin feels it is
> > helpful to warn in that case.  And warnings can always be turned off
> > and/or ignored.
> >
> > The only real exception you show is that 0/0 does not start with
> > NaN but produces NaN.  But infix operator functions tend to behave a
> > bit differently.
> >
> > On Thu, Nov 26, 2015 at 2:07 AM, Mark van der Loo
> >  wrote:
> >> This question is more out of curiosity than a complaint or suggestion,
> but
> >> I'm just wondering.
> >>
> >> The behavior of R on calculations that result in NaN seems a bit
> >> inconsistent.
> >>
> >> # this is expected:
> >>> 0/0
> >> [1] NaN
> >>
> >> # but this gives a warning
> >>> sin(Inf)
> >> [1] NaN
> >> Warning message:
> >> In sin(Inf) : NaNs produced
> >>
> >> # and this again does not
> >>> exp(NaN)
> >> [1] NaN

Re: [Rd] How do I reliably and efficiently hash a function?

2015-12-11 Thread Mark van der Loo
In addition to what Charles wrote, you can also use 'local' if you don't
want a function that creates another function.

> f <- local({info <- 10; function(x) x + info})
> f(3)
[1] 13
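The stored value does not clutter the printed function, and it can still be read back explicitly from the function's environment (a small sketch restating the example for completeness):

```r
# Define the function with hidden state via local(), as above:
f <- local({info <- 10; function(x) x + info})

# 'info' is not shown when f is printed, but it is retrievable:
environment(f)$info
# [1] 10
```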

best,
Mark


Op vr 11 dec. 2015 om 03:27 schreef Charles C. Berry :

> On Thu, 10 Dec 2015, Konrad Rudolph wrote:
>
> > I’ve got the following scenario: I need to store information about an
> > R function, and retrieve it at a later point. In other programming
> > languages I’d implement this using a dictionary with the functions as
> > keys. In R, I’d usually use `attr(f, 'some-name')`. However, for my
> > purposes I do not want to use `attr` because the information that I
> > want to store is an implementation detail that should be hidden from
> > the user of the function (and, just as importantly, it shouldn’t
> > clutter the display when the function is printed on the console).
> >
> > `comment` would be almost perfect since it’s hidden from the output
> > when printing a function — unfortunately, the information I’m storing
> > is not a character string (it’s in fact an environment), so I cannot
> > use `comment`.
> >
> > How can this be achieved?
> >
>
> See
>
> https://cran.r-project.org/doc/manuals/r-release/R-intro.html#Scope
>
> For example, these commands:
>
> foo <- function() {info <- "abc";function(x) x+1}
> func <- foo()
> find("func")
> func(1)
> ls(envir=environment(func))
> get("info",environment(func))
> func
>
> Yield these printed results:
>
> : [1] ".GlobalEnv"
> : [1] 2
> : [1] "info"
> : [1] "abc"
> : function (x)
> : x + 1
> : 
>
> The environment of the function gets printed, but 'info' and other
> objects that might exist in that environment do not get printed unless
> you explicitly call for them.
>
> HTH,
>
> Chuck
>
> p.s. 'environment(func)$info' also works.
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] (no) circular dependency

2016-04-07 Thread Mark van der Loo
At the risk of stating the over-obvious: there is also the option of
creating just a single package containing all the functions. None of the
functions that create the interdependencies need to be exported that way.

Btw, this question is probably more at home on the r-package-devel list.


Best,

M
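(For reference, the importFrom/DESCRIPTION wiring asked about below boils down to roughly the following; the package names A and B are of course placeholders:)

```
# In package A's DESCRIPTION (A imports from B; B stays independent of A):
Imports: B

# In package A's NAMESPACE (see 'Writing R Extensions'):
importFrom(B, some_function)
```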




On Thu, Apr 7, 2016, 22:24 Dmitri Popavenko 
wrote:

> Hi Thierry,
>
> Thanks for that; the trouble is that the functions are package specific,
> so moving from one package to another could be a solution, but I would
> rather save that as a last resort.
>
> As mentioned, creating a package C with all the common functions could also
> be an option, but this strategy quickly inflates the number of packages on
> CRAN. If no other option is possible, that could be the way but I was still
> thinking about a more direct solution if possible.
>
> Best,
> Dmitri
>
> On Thu, Apr 7, 2016 at 3:47 PM, Thierry Onkelinx  >
> wrote:
>
> > Dear Dmitri,
> >
> > If it's only a small number of functions, then move the relevant
> > functions from A to B so that B works without A. Then import these
> > functions from B in A. Hence A depends on B, but B is independent of A.
> >
> > If it requires moving a lot of functions, then you had better create a
> > package C with all the common functions. Then A and B import those
> > functions from C.
> >
> > Best regards,
> >
> > ir. Thierry Onkelinx
> > Instituut voor natuur- en bosonderzoek / Research Institute for Nature
> and
> > Forest
> > team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
> > Kliniekstraat 25
> > 1070 Anderlecht
> > Belgium
> >
> > To call in the statistician after the experiment is done may be no more
> > than asking him to perform a post-mortem examination: he may be able to
> say
> > what the experiment died of. ~ Sir Ronald Aylmer Fisher
> > The plural of anecdote is not data. ~ Roger Brinner
> > The combination of some data and an aching desire for an answer does not
> > ensure that a reasonable answer can be extracted from a given body of
> data.
> > ~ John Tukey
> >
> > 2016-04-06 8:42 GMT+02:00 Dmitri Popavenko :
> >
> >> Hello all,
> >>
> >> I would like to build two packages (say A and B), for two different
> >> purposes.
> >> Each of them need one or two functions from the other, which leads to
> the
> >> problem of circular dependency.
> >>
> >> Is there a way for package A to import a function from package B, and
> >> package B to import a function from package A, without arriving to
> >> circular
> >> dependency?
> >> Other suggestions in the archive mention building a third package that
> >> both
> >> A and B should depend on, but this seems less attractive.
> >>
> >> I read about importFrom() into the NAMESPACE file, but I don't know how
> to
> >> relate this with the information in the DESCRIPTION file (other than
> >> adding
> >> each package to the Depends: field).
> >>
> >> Thank you,
> >> Dmitri
> >>
> >> [[alternative HTML version deleted]]
> >>
> >> __
> >> R-devel@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-devel
> >>
> >
> >
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] (no) circular dependency

2016-04-08 Thread Mark van der Loo
Well, I'm not saying that Dmitri _should_ do it. I merely mention it as an
option that I think is worth thinking about -- it is easy to overlook the
obvious :-). Since we have no further info on the package's structure, we
can't be sure.




Op vr 8 apr. 2016 om 13:59 schreef Adrian Dușa :

> Hi Mark,
>
> Uhm... sometimes this is not always possible.
> For example I have a package QCA which produces truth tables (all
> combinations of presence / absence of causal conditions), and it uses the
> venn package to draw a Venn diagram.
> It is debatable whether one should assimilate the "venn" package into the
> QCA package (other people might want Venn diagrams but not necessarily the
> other QCA functions).
>
> On the other hand, the package venn would like to use the QCA package to
> demonstrate its abilities to plot Venn diagrams based on truth tables
> produced by the QCA package. Both have very different purposes, yet both
> use functions from each other.
>
> So I'm with Bill Dunlap here that several smaller packages are preferable
> to one larger one, but on the other hand I can't separate those functions
> into a third package: the truth table production is very specific to the
> QCA package, while plotting Venn diagrams is very specific to the venn
> package. I don't see how to separate those functions from their main
> packages and create a third one that each would depend on.
>
> This is just one example; there could be others as well, which is why I
> am (still) looking for a solution that would:
> - preserve the current functionalities in packages A and B (to follow
> Dmitri's original post)
> - be able to use functions from each other
> - yet avoid circular dependency
>
> I hope this explains it,
> Adrian
>
>
> On Thu, Apr 7, 2016 at 11:36 PM, Mark van der Loo <
> mark.vander...@gmail.com> wrote:
>
>> At the risk of stating the over-obvious: there's also the option of
>> creating just a single package containing all functions. None of the
>> functions that create the interdependencies need to be exported that way.
>>
>> Btw, this question is probably more at home on the r-package-devel list.
>>
>>
>> Best,
>>
>> M
>>
>>
>>
>>
>> On Thu, Apr 7, 2016, 22:24 Dmitri Popavenko 
>> wrote:
>>
>>> Hi Thierry,
>>>
>>> Thanks for that; the trouble is that the functions are package
>>> specific, so moving from one package to another could be a solution,
>>> but I would rather save that as a last resort.
>>>
>>> As mentioned, creating a package C with all the common functions could
>>> also
>>> be an option, but this strategy quickly inflates the number of packages
>>> on
>>> CRAN. If no other option is possible, that could be the way but I was
>>> still
>>> thinking about a more direct solution if possible.
>>>
>>> Best,
>>> Dmitri
>>>
>>> On Thu, Apr 7, 2016 at 3:47 PM, Thierry Onkelinx <
>>> thierry.onkel...@inbo.be>
>>> wrote:
>>>
>>> > Dear Dmitri,
>>> >
>>> > If it's only a small number of functions, then move the relevant
>>> > functions from A to B so that B works without A. Then import these
>>> > functions from B in A. Hence A depends on B, but B is independent
>>> > of A.
>>> >
>>> > If it requires moving a lot of functions, then you had better
>>> > create a package C with all the common functions. Then A and B
>>> > import those functions from C.
>>> >
>>> > Best regards,
>>> >
>>> > ir. Thierry Onkelinx
>>> > Instituut voor natuur- en bosonderzoek / Research Institute for Nature
>>> and
>>> > Forest
>>> > team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
>>> > Kliniekstraat 25
>>> > 1070 Anderlecht
>>> > Belgium
>>> >
>>> > To call in the statistician after the experiment is done may be no more
>>> > than asking him to perform a post-mortem examination: he may be able
>>> to say
>>> > what the experiment died of. ~ Sir Ronald Aylmer Fisher
>>> > The plural of anecdote is not data. ~ Roger Brinner
>>> > The combination of some data and an aching desire for an answer does
>>> not
>>> > ensure that a reasonable answer can be extracted from a given body of
>>> data.
>>> > ~ John Tukey
>>> >
>>> > 2016-04-06 8:42 GMT+02:00 Dmitri Pop

Re: [Rd] Single-threaded aspect

2016-05-12 Thread Mark van der Loo
Charles,

1. Perhaps this question is better directed at the R-help or
R-package-devel mailing lists.

2. It basically means that R itself can only evaluate one R expression at
a time.

The parallel package circumvents this by starting multiple R sessions and
dividing the workload among them.
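A minimal sketch of that route (note that mclapply forks the R process, so on Windows it only runs serially with mc.cores = 1):

```r
library(parallel)

# Each element is handled by a separate forked R process, so no two
# threads ever share the same single-threaded R evaluator.
res <- mclapply(1:4, function(i) i^2, mc.cores = 2)
unlist(res)
# [1]  1  4  9 16
```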

Compiled code called by R (such as C++ code through Rcpp, or C code through
base R's interfaces) can execute multi-threaded code for internal purposes,
using e.g. OpenMP. A limitation is that compiled code cannot, in many
cases, call R's C API from multiple threads. For example, it is not
thread-safe to create R variables from multiple threads running in C (R's
variable administration is such that the order of creating and destroying
them from compiled code matters).

I am not very savvy about Rcpp or XPtr objects, but it appears that Dirk
provided answers about that in your SO question.

Best,
Mark










Op do 12 mei 2016 om 14:46 schreef Charles Determan :

> R Developers,
>
> Could someone help explain what it means that R is single threaded?  I am
> trying to understand what is actually going on inside R when users want to
> parallelize code.  For example, using mclapply or foreach (with some
> backend) somehow allows users to benefit from multiple CPUs.
>
> Similarly there is the RcppParallel package for RMatrix/RVector objects.
> But none of these address the general XPtr objects in Rcpp.  Some readers
> here may recognize my question on SO (
>
> http://stackoverflow.com/questions/37167479/rcpp-parallelize-functions-that-return-xptr
> )
> where I was curious about parallel calls to C++/Rcpp functions that return
> XPtr objects.  I am being a little more persistent here as this limitation
> provides a very hard stop on the development on one of my packages that
> heavily uses XPtr objects.  It's not meant to be a criticism or intended to
> be rude, I just want to fully understand.
>
> I am willing to accept that it may be impossible currently, but I want to
> at least understand why it is impossible, so I can explain to future users
> why parallel functionality is not available. Which just echoes my original
> question: what does it mean that R is single threaded?
>
> Kind Regards,
> Charles
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] [WISH / PATCH] possibility to split string literals across multiple lines

2017-06-14 Thread Mark van der Loo
Having a line-breaking character for string literals would have benefits,
as string literals could then be constructed at parse time rather than at
run time. I have run into this myself a few times as well. One way to at
least emulate something like that is the following.

`%+%` <- function(x,y) paste0(x,y)

"hello" %+%
  " pretty" %+%
  " world"


-Mark



Op wo 14 jun. 2017 om 13:53 schreef Andreas Kersting :

> On Wed, 14 Jun 2017 06:12:09 -0500, Duncan Murdoch <
> murdoch.dun...@gmail.com> wrote:
>
> > On 14/06/2017 5:58 AM, Andreas Kersting wrote:
> > > Hi,
> > >
> > > I would really like to have a way to split long string literals across
> > > multiple lines in R.
> >
> > I don't understand why you require the string to be a literal.  Why not
> > construct the long string in an expression like
> >
> >   paste0("aaa",
> >  "bbb")
> >
> > ?  Surely the execution time of the paste0 call is negligible.
> >
> > Duncan Murdoch
>
> Actually "execution time" is precisely one of the reasons why I would like
> to see this feature as - depending on the context (e.g. in a tight loop) -
> the execution time of paste0 (or probably also glue, thanks Gabor) is not
> necessarily insignificant.
>
> The other reason is style: I think it is cleaner if we can construct such
> a long string literal without the need for a function call.
>
> Andreas
>
> > >
> > > Currently, if a string literal spans multiple lines, there is no way to
> > > inhibit the introduction of newline characters:
> > >
> > >  > "aaa
> > > + bbb"
> > > [1] "aaa\nbbb"
> > >
> > >
> > > If a line ends with a backslash, it is just ignored:
> > >
> > >  > "aaa\
> > > + bbb"
> > > [1] "aaa\nbbb"
> > >
> > >
> > > We could use this fact to implement string splitting in a fairly
> > > backward-compatible way, since currently such trailing backslashes
> > > should hardly be used as they do not have any effect. The attached
> patch
> > > makes the parser ignore a newline character directly following a
> backslash:
> > >
> > >  > "aaa\
> > > + bbb"
> > > [1] "aaabbb"
> > >
> > >
> > > I personally would also prefer if leading blanks (spaces and tabs) in
> > > the second line are ignored to allow for proper indentation:
> > >
> > >  >   "aaa \
> > > +bbb"
> > > [1] "aaa bbb"
> > >
> > >  >   "aaa\
> > > +\ bbb"
> > > [1] "aaa bbb"
> > >
> > > This is also implemented by this patch.
> > >
> > >
> > > An alternative approach could be to have something like
> > >
> > > ("aaa "
> > > "bbb")
> > >
> > > or
> > >
> > > ("aaa ",
> > > "bbb")
> > >
> > > be interpreted as "aaa bbb".
> > >
> > > I don't know the ins and outs of the parser of R (hence: please very
> > > carefully review the attached patch), but I guess this would be more
> > > work to implement!?
> > >
> > >
> > > What do you think? Is there anybody else who is missing this feature in
> > > the first place?
> > >
> > > Regards,
> > > Andreas
> > >
> > >
> > >
> > > __
> > > R-devel@r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-devel
> > >
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] [WISH / PATCH] possibility to split string literals across multiple lines

2017-06-14 Thread Mark van der Loo
I know it doesn't cause construction at parse time, and that is also not
what I said. What I meant was that it makes the syntax at least look a
little as if you had a line-breaking character within string literals.
Op wo 14 jun. 2017 om 14:18 schreef Joris Meys :

> Mark, that's actually a fair statement, although your extra operator
> doesn't cause construction at parse time. You still call paste0(), but just
> add an extra layer on top of it.
>
> I also doubt that even in gigantic loops the benefit is going to be
> significant. Take the following example:
>
> atestfun <- function(x){
>   y <- paste0("a very long",
>  "string for testing")
>   grep(x, y)
> }
> atestfun2 <- function(x){
>   y <- "a very long
> string for testing"
>   grep(x,y)
> }
> require(compiler)   # for cmpfun()
> cfun <- cmpfun(atestfun)
> cfun2 <- cmpfun(atestfun2)
>
> require(rbenchmark)
> benchmark(atestfun("a"),
>   atestfun2("a"),
>   cfun("a"),
>   cfun2("a"),
>   replications = 10^5)
>
> Which gives after 100,000 replications:
>
>             test replications elapsed relative
> 1  atestfun("a")        1e+05    0.83    1.339
> 2 atestfun2("a")        1e+05    0.62    1.000
> 3      cfun("a")        1e+05    0.81    1.306
> 4     cfun2("a")        1e+05    0.62    1.000
>
> The patch can in principle make similar code marginally faster, but I'm
> not convinced it will make any real difference except in some very
> specific and exotic cases. Even more, calling a function like the
> examples inside a loop is the only way I can come up with where this
> might be a problem. If you just construct the string inside the loop,
> there are two possibilities:
>
> - the string does not need to change, and then you better construct it
> outside of the loop
> - the string does need to change, and then you need paste() or paste0()
> anyway
>
> I'm not against incorporating the patch, as it would eliminate a few
> keystrokes. It's a neat idea, but I don't expect any other noticeable
> advantage from it.
>
> my humble 2 cents
> Cheers
> Joris
>
> On Wed, Jun 14, 2017 at 2:00 PM, Mark van der Loo <
> mark.vander...@gmail.com> wrote:
>
>> Having some line-breaking character for string literals would have
>> benefits
>> as string literals can then be constructed parse-time rather than
>> run-time.
>> I have run into this myself a few times as well. One way to at least
>> emulate something like that is the following.
>>
>> `%+%` <- function(x,y) paste0(x,y)
>>
>> "hello" %+%
>>   " pretty" %+%
>>   " world"
>>
>>
>> -Mark
>>
>>
>>
>> Op wo 14 jun. 2017 om 13:53 schreef Andreas Kersting <
>> r-de...@akersting.de>:
>>
>> > On Wed, 14 Jun 2017 06:12:09 -0500, Duncan Murdoch <
>> > murdoch.dun...@gmail.com> wrote:
>> >
>> > > On 14/06/2017 5:58 AM, Andreas Kersting wrote:
>> > > > Hi,
>> > > >
>> > > > I would really like to have a way to split long string literals
>> across
>> > > > multiple lines in R.
>> > >
>> > > I don't understand why you require the string to be a literal.  Why
>> not
>> > > construct the long string in an expression like
>> > >
>> > >   paste0("aaa",
>> > >  "bbb")
>> > >
>> > > ?  Surely the execution time of the paste0 call is negligible.
>> > >
>> > > Duncan Murdoch
>> >
>> > Actually "execution time" is precisely one of the reasons why I would
>> like
>> > to see this feature as - depending on the context (e.g. in a tight
>> loop) -
>> > the execution time of paste0 (or probably also glue, thanks Gabor) is
>> not
>> > necessarily insignificant.
>> >
>> > The other reason is style: I think it is cleaner if we can construct
>> such
>> > a long string literal without the need for a function call.
>> >
>> > Andreas
>> >
>> > > >
>> > > > Currently, if a string literal spans multiple lines, there is no
>> way to
>> > > > inhibit the introduction of newline characters:
>> > > >
>> > > >  > "aaa
>> > > > + bbb"
>> > > > [1] "aaa\nbbb"
>> > > >
>> 

[Rd] bugs in documentation of stats::stl

2017-08-23 Thread Mark van der Loo
Dear list, R-core,


The documentation of stats::stl explicitly refers to the paper by
Cleveland[1] to explain the parameters. However, the description is
confusing, with two entries seemingly referring to the same parameter in
the paper.

s.window: [...] the loess window for seasonal extraction, which should be
odd and at least 7, according to Cleveland et al

--> The phrase 'odd and at least 7' refers to Cleveland's parameter n_(s),
section 3.5 of [1].

Confusing: Cleveland calls this 'seasonal smoothing', not extraction.


l.window:  the span (in lags) of the loess window of the low-pass filter
used for each subseries.[...]

--> The description 'low-pass filter used for each subseries' also seems to
correspond to Cleveland's parameter n_(s), in step two of the algorithm in
the reference. (section 2.2 of [1]).

Confusing: Cleveland does not apply a low-pass filter to each subseries[2].
The subseries are recombined into a single series, and only then is a
low-pass filter applied (step 3 of the algorithm in section 2.2 of [1]).


So what should it be? A literal reference to Cleveland's n_(s), n_(l), and
n_(t) would be really helpful here.
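
For concreteness, a minimal call exercising all three spans; the mapping to
Cleveland's n_(s), n_(l) and n_(t) in the comments is my assumption, which is
precisely what the documentation should confirm:

```r
# Assumed mapping (to be confirmed by the docs):
#   s.window ~ n_(s)  seasonal smoothing span
#   l.window ~ n_(l)  low-pass filter span
#   t.window ~ n_(t)  trend smoothing span
# Values are illustrative; co2 is a monthly series shipped with R.
fit <- stl(co2, s.window = 7, l.window = 13, t.window = 21)
head(fit$time.series)  # seasonal, trend and remainder components
```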



Thank you,
Best,
Mark


[1] https://www.wessa.net/download/stl.pdf
[2] well, technically he does in step 2 of the algorithm, but it is not
called a low-pass filter in the paper.

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Natural vs National in R signon banner?

2017-09-01 Thread Mark van der Loo
The way it's phrased now makes it seem that English is not a natural
language ("Natural language support *but* running in an English locale").
Why not just state "running in an English locale" and leave it at that?
Better to leave something out than to be unclear: being formally correct
does not always mean being clear to all users.
-M

Op vr 1 sep. 2017 om 11:00 schreef Peter Dalgaard :

> Just leave it, I think. Some nations have 4 national languages (as Martin
> will know), some languages are not national, and adopted children often do
> not speak their native (=born) language... I suspect someone already put a
> substantial amount of thought into the terminology.
>
> -pd
>
>
> > On 1 Sep 2017, at 09:45 , Martin Maechler 
> wrote:
> >
> >> Paul McQuesten 
> >>on Thu, 31 Aug 2017 18:48:12 -0500 writes:
> >
> >> Actually, I do agree with you about Microsoft.
> >> But they have so many users that their terminology should not be
> ignored.
> >
> >> Here are a few more views:
> >
> >>
> https://www.ibm.com/support/knowledgecenter/ssw_aix_71/com.ibm.aix.performance/natl_lang_supp_locale_speed.htm
> >> https://docs.oracle.com/cd/E23824_01/html/E26033/glmbx.html
> >>
> http://support.sas.com/documentation/cdl/en/nlsref/69741/HTML/default/viewer.htm#n1n9bwctsthuqbn1xgipyw5xwujl.htm
> >>
> https://docs.intersystems.com/latest/csp/docbook/DocBook.UI.Page.cls?KEY=GSA_config_nls
> >>
> https://sites.ualberta.ca/dept/chemeng/AIX-43/share/man/info/C/a_doc_lib/aixprggd/genprogc/nls.htm
> >>
> http://scc.ustc.edu.cn/zlsc/tc4600/intel/2017.0.098/compiler_f/common/core/GUID-1AEC889E-98A7-4A7D-91B3-865C476F603D.html
> >
> >> It does appear, however, that what I call 'National Language' is often
> >> referred to as 'Native Language'. And the 'National Language'
> terminology
> >> is said to not be used consistently:
> >> https://en.wikipedia.org/wiki/National_language
> >
> >> I do still feel, however, that claiming 'Natural Language' support in R
> >> sets expectations of new users overly high.
> >
> >> Thank you for spending so much time on such a minor nit.
> >
> > continuing the nits and gnats :
> >
> > I think I now understand what you mean.  From the little I
> > understand about English intricacies and with my not
> > fully developed gut feeling of good English (which I rarely
> > speak but sometimes appreciate when reading / listening),
> > I would indeed
> >
> > prefer  'Native Language'
> > to'Natural Language'
> >
> > Martin Maechler
> > ETH Zurich
> >
> >> Regards
> >
> >
> >
> >> On Thu, Aug 31, 2017 at 5:45 PM, Duncan Murdoch <
> murdoch.dun...@gmail.com>
> >> wrote:
> >
> >>> On 31/08/2017 6:37 PM, Paul McQuesten wrote:
> >>>
>  Thanks, Duncan. But if it is not inappropriate, I feel empowered to
> argue.
> 
>  According to this definition, https://en.wikipedia.org/wiki/
>  Natural_language:
>  In neuropsychology, linguistics and the philosophy of language, a
>  natural language or ordinary language is any language that has evolved
>  naturally in humans ...
> 
>  Thus this banner statement may appear over-claiming to a significant
>  fraction of R users.
> 
>  It seems that LOCALE is called 'National language' support in other
>  software systems.
>  Eg: https://www.microsoft.com/resources/msdn/goglobal/default.mspx
> 
> >>>
> >>> I wouldn't take Microsoft as an authority on this (or much of
> anything).
> >>> They really are amazingly incompetent, considering how much money they
> earn.
> >>>
> >>> Duncan Murdoch
> >>>
> >>>
>  And, yes, this is a low priority issue. All of you have better things
> to
>  do.
> 
>  R is an extremely powerful and comprehensive software system.
>  Thank you all for that.
>  And I would like to clean one gnat from the windshield.
> 
>  I just wax pedantic at times.
> 
>  On Thu, Aug 31, 2017 at 5:13 PM, Duncan Murdoch <
> murdoch.dun...@gmail.com
>  > wrote:
> 
>  On 31/08/2017 5:38 PM, Paul McQuesten wrote:
> 
>  The R signon banner includes this statement:
>  Natural language support but running in an English locale
> 
>  Should that not say 'National' instead of 'Natural'?
>  Meaning that LOCALE support is enabled, not that the interface
>  understands
>  human language?
> 
> 
>  No, "natural language" refers to human languages, but it doesn't
>  imply that R understands them.  NLS just means that messages may be
>  presented in (or translated to) other human languages in an
>  appropriate context.
> 
>  For example, you can start R on most platforms from the console using
> 
>  LANGUAGE=de R
> 
>  and instead of the start message you saw, you'll see
> 
>  R ist freie Software und kommt OHNE JEGLICHE GARANTIE.
>  Sie sind eingeladen, es unter bestimmten Bedingungen weiter zu
>  verbreiten.
>  Tippen Sie 'license()' 

Re: [Rd] R Configuration Variable: Maximum Memory Allocation per R Instance

2017-09-17 Thread Mark van der Loo
Dear Juan,

I'm not deeply familiar with the DBs you mention, but it seems to me that
'memory.limits' does what you want on one OS, and on *nix-alike systems you
can use shell commands to limit R's memory usage (see ?memory.limits). Also,
Jeroen Ooms wrote a nice article about this in the JSS:
https://www.jstatsoft.org/article/view/v055i07 . There is also a package for
it: RAppArmor.
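
A rough sketch of the *nix route (the `ulimit -v` figure and the use of a
subshell are my illustrative choices, not an official recipe):

```sh
#!/bin/sh
# Cap virtual memory in a subshell; the cap is inherited by child processes,
# so large allocations in R fail with an error instead of exhausting RAM.
(
  ulimit -v 4194304                 # 4 GiB, expressed in kilobytes
  echo "cap: $(ulimit -v) kB"
  # With R installed, an oversized allocation now errors promptly:
  if command -v R >/dev/null 2>&1; then
    R --vanilla -q -e 'tryCatch(numeric(2^31), error = function(e) cat("refused\n"))'
  fi
)
```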

-M




Op zo 17 sep. 2017 om 00:39 schreef Juan Telleria :

> Dear R Developers,
>
> In the same way that MySQL/MariaDB's Engine InnoDB or MyISAM/Aria have the
> innodb_buffer_pool_size or the key_buffer_size for setting the maximum
> amount of RAM which can be used by a Server Instance:
>
> ¿Would it be possible to create an R Configuration Variable which fixes the
> maximum amount of RAM memory to be used as Commit / Dynamic Memory
> Allocation?
>
> Thank you.
> Juan
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Change to r-devel warns on #pragma

2017-12-11 Thread Mark van der Loo
Hi Patrick,

It was recently added as a CRAN policy (thanks to Dirk's CRAN Policy Watch:
https://twitter.com/markvdloo/status/935810241190617088).

It seems to be part of a generally stricter policy on keeping to the C(++)
standard. Warnings are there for a reason and should usually not be ignored.
I'm not familiar with the warning you are suppressing, but it seems likely
that your code assumes a type size that is not guaranteed by the standard
and may therefore differ across systems/compilers. (An example is wchar_t,
which is typically 16 bits on Windows and 32 bits on Unix.)

Best,
Mark

On Mon, Dec 11, 2017, 4:33 PM Patrick Perry  wrote:

> A recent change to r-devel causes an R CMD check warning when a C file
> includes a "#pragma GCC diagnostic ignored" pragma:
>
> https://github.com/wch/r-source/commit/b76c8fd355a0f5b23d42aaf44a879cac0fc31fa4
> . This causes the CRAN checks for the "corpus" package to emit a
> warning:
>
> https://www.r-project.org/nosvn/R.check/r-devel-linux-x86_64-fedora-clang/corpus-00check.html
> .
>
> The offending code is in an upstream library bundled with the package:
> https://github.com/patperry/corpus/blob/master/src/table.c#L118
>
> #pragma GCC diagnostic push
> #pragma GCC diagnostic ignored "-Wtype-limits"
>  // gcc emits a warning if sizeof(size_t) > sizeof(unsigned)
>
>  if ((size_t)size > SIZE_MAX / sizeof(*items)) {
> #pragma GCC diagnostic pop
>
> This is code appears in the "corpus" library that gets bundled with the
> corpus r-package but can also be installed by itself. I am the
> maintainer for both projects but in theory the library is independent
> from the r package (the latter depends on the former). I put the pragma
> there in the first place because this is the cleanest way I know of to
> remove the gcc compiler warning "comparison is always false due to
> limited range of data type" which appears whenever sizeof(unsigned) <
> sizeof(size_t); the warning does not appear for clang.
>
> Does anyone have recommendations for what I should do to remove the R
> CMD check warning? Is it possible to do this while simultaneously
> removing the gcc warning? Note that the package does not use autoconf.
>
> Fortunately, I am the maintainer for the included library, so I can
> potentially remove the pragma. However, I can imagine that there are
> many other cases of R packages bundling C libraries where R package
> maintainers do not have control over the downstream source. Perhaps
> there is a compelling case for this new CRAN check that I'm not seeing,
> but it seems to me that this additional CRAN check will cause extra work
> for package developers without providing any additional safety for R
> users. Package developers that do not control bundled upstream libraries
> will have to resort to `sed s/^#pragma.*//` or manually patch unfamiliar
> code to remove the CRAN warning, potentially introducing bugs in the
> process.
>
>
> Patrick
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>



Re: [Rd] vignettes present in 2 folders or won't work

2020-11-02 Thread Mark van der Loo
On Sun, Nov 1, 2020 at 10:39 PM Duncan Murdoch 
wrote:

> On 01/11/2020 2:57 p.m., Dirk Eddelbuettel wrote:
> >
> > The closest to a canonical reference for a static vignette is the basic
> blog
> > post by Mark at
> >
> >
> https://www.markvanderloo.eu/yaRb/2019/01/11/add-a-static-pdf-vignette-to-an-r-package/
> >
> > which I follow in a number of packages.
> >
> > Back to the original point by Alexandre: No, I do _not_ think we can do
> > without a double copy of the _pre-made_ pdf ("input") and the
> _resulting_ pdf
> > ("output").
> >
> > That bugs me a little too but I take it as a given as static / pre-made
> > vignettes are non-standard (given lack of any mention in WRE, and the
> pretty
> > obvious violation of the "spirit of the law" of vignette which is after
> all
> > made to run code, not to avoid it). Yet uses for static vignettes are
> pretty
> > valid and here we are with another clear as mud situation.
> >
>
> In many cases such files aren't vignettes.
>
> By definition, packages should contain plain text source code for
> vignettes.  They can contain other PDF files in inst/doc, but if you
> don't include the plain text source, those aren't vignettes.
>
> An exception would be a package that contains the source code but
> doesn't want to require CRAN or other users to run it, because it's too
> time-consuming, or needs obscure resources.  The CRAN policy discusses
> this.
>
> Duncan Murdoch
>
>
It would be nice if the documents in inst/doc were linked from the CRAN
landing page of a package. Documents under inst/doc are a bit hard to find
unless package authors create (possibly many) links to them in Rd files or
vignettes.

Cheers,
Mark



[Rd] setting .libPaths() with parallel::clusterCall

2020-12-22 Thread Mark van der Loo
Dear all,

It is not possible to set library paths on worker nodes with
parallel::clusterCall (or snow::clusterCall) and I wonder if this is
intended behavior.

Example.

library(parallel)
libdir <- "./tmplib"
if (!dir.exists(libdir)) dir.create("./tmplib")

cl <- makeCluster(2)
clusterCall(cl, .libPaths, c(libdir, .libPaths()) )

The output is as expected with the extra libdir returned for each worker
node. However, running

clusterEvalQ(cl, .libPaths())

Shows that the library paths have not been set.

If this is indeed a bug, I'm happy to file it at bugzilla. Tested on R
4.0.3 and r-devel.

Best,
Mark
ps: a workaround is documented here:
https://www.markvanderloo.eu/yaRb/2020/12/17/how-to-set-library-path-on-a-parallel-r-cluster/
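
For completeness, a sketch of the workaround from the linked post: pass the
function *name* rather than the function object, so each worker looks up and
calls its own `.libPaths` (assumes workers run on the same machine as the
master, so the directory exists for them too):

```r
library(parallel)

cl <- makeCluster(2)
libdir <- tempfile("lib")
dir.create(libdir)

# Passing the function value ships a serialized copy of .libPaths whose
# modifications stay in the copy's enclosing environment on the worker.
clusterCall(cl, .libPaths, c(libdir, .libPaths()))
paths_fun  <- unlist(clusterEvalQ(cl, .libPaths()))   # libdir is absent

# Passing the name makes each worker call its own .libPaths.
clusterCall(cl, ".libPaths", c(libdir, .libPaths()))
paths_name <- unlist(clusterEvalQ(cl, .libPaths()))   # libdir is present

stopCluster(cl)
```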


> sessionInfo()
R Under development (unstable) (2020-12-21 r79668)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.1 LTS

Matrix products: default
BLAS:   /home/mark/projects/Rdev/R-devel/lib/libRblas.so
LAPACK: /home/mark/projects/Rdev/R-devel/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
 [3] LC_TIME=nl_NL.UTF-8LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=nl_NL.UTF-8LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=nl_NL.UTF-8   LC_NAME=C
 [9] LC_ADDRESS=C   LC_TELEPHONE=C
[11] LC_MEASUREMENT=nl_NL.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats graphics  grDevices utils datasets  methods
[8] base

loaded via a namespace (and not attached):
[1] compiler_4.1.0



Re: [Rd] [External] setting .libPaths() with parallel::clusterCall

2020-12-23 Thread Mark van der Loo
Dear Luke,

Thank you, this makes perfect sense.

I find it quite hard to express this issue in a way that is both compact
and understandable.
In any case, below you find a proposal for an update of the documentation.

Thank you again for all your work,
Mark



Index: src/library/parallel/man/clusterApply.Rd
===
--- src/library/parallel/man/clusterApply.Rd (revision 79673)
+++ src/library/parallel/man/clusterApply.Rd (working copy)
@@ -136,6 +136,15 @@
   more efficient than \code{parApply} but do less post-processing of the
   result.

+  Functions with a \code{fun} or \code{FUN} parameter send a serialized
+  copy of the argument from the main process to each worker node.
+  When the argument passed to \code{fun} or \code{FUN} is a function
+  this is equivalent to calling the same function on the worker node,
+  except when the function has an enclosing environment it modifies.
+  A notable example is \code{\link{.libPaths}}. To ensure that the
+  function local to each worker is called so it modifies its local
+  enclosing environment, pass the name of the function as a string.
+
   A chunk size of \code{0} with static scheduling uses the default (one
   chunk per node).  With dynamic scheduling, chunk size of \code{0} has the
   same effect as \code{1} (one invocation of \code{FUN}/\code{fun} per










On Tue, Dec 22, 2020 at 2:37 PM  wrote:

> On Tue, 22 Dec 2020, Mark van der Loo wrote:
>
> > Dear all,
> >
> > It is not possible to set library paths on worker nodes with
> > parallel::clusterCall (or snow::clusterCall) and I wonder if this is
> > intended behavior.
> >
> > Example.
> >
> > library(parallel)
> > libdir <- "./tmplib"
> > if (!dir.exists(libdir)) dir.create("./tmplib")
> >
> > cl <- makeCluster(2)
> > clusterCall(cl, .libPaths, c(libdir, .libPaths()) )
> >
> > The output is as expected with the extra libdir returned for each worker
> > node. However, running
> >
> > clusterEvalQ(cl, .libPaths())
> >
> > Shows that the library paths have not been set.
>
> Use this:
>
>  clusterCall(cl, ".libPaths", c(libdir, .libPaths()) )
>
> This will find the function .libPaths on the workers.
>
> Your clusterCall sends across a serialized copy of your process'
> .libPaths and calls that. Usually that is equivalent to calling the
> function found by the name you used on the workers, but not when the
> function has an enclosing environment that the function modifies by
> assignment.
>
> Alternate implementations of .libPaths that are more
> serialization-friendly are possible in principle but probably not
> practical given limitations of the base package.
>
> The distinction between providing a function value or a character
> string as the function argument to clusterCall and others could
> probably use a paragraph in the help file; happy to consider a patch
> if anyone wants to take a crack at it.
>
> Best,
>
> luke
>
> >
> > If this is indeed a bug, I'm happy to file it at bugzilla. Tested on R
> > 4.0.3 and r-devel.
> >
> > Best,
> > Mark
> > ps: a workaround is documented here:
> >
> https://www.markvanderloo.eu/yaRb/2020/12/17/how-to-set-library-path-on-a-parallel-r-cluster/
> >
> >
> >> sessionInfo()
> > R Under development (unstable) (2020-12-21 r79668)
> > Platform: x86_64-pc-linux-gnu (64-bit)
> > Running under: Ubuntu 20.04.1 LTS
> >
> > Matrix products: default
> > BLAS:   /home/mark/projects/Rdev/R-devel/lib/libRblas.so
> > LAPACK: /home/mark/projects/Rdev/R-devel/lib/libRlapack.so
> >
> > locale:
> > [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
> > [3] LC_TIME=nl_NL.UTF-8LC_COLLATE=en_US.UTF-8
> > [5] LC_MONETARY=nl_NL.UTF-8LC_MESSAGES=en_US.UTF-8
> > [7] LC_PAPER=nl_NL.UTF-8   LC_NAME=C
> > [9] LC_ADDRESS=C   LC_TELEPHONE=C
> > [11] LC_MEASUREMENT=nl_NL.UTF-8 LC_IDENTIFICATION=C
> >
> > attached base packages:
> > [1] parallel  stats graphics  grDevices utils datasets  methods
> > [8] base
> >
> > loaded via a namespace (and not attached):
> > [1] compiler_4.1.0
> >
> >   [[alternative HTML version deleted]]
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
>
> --
> Luke Tierney
> Ralph E. Wareham Professor of Mathematical Sciences
> University of Iowa  Phone: 319-335-3386
> Department of Statistics andFax:   319-335-3017
> Actuarial Science
> 241 Schaeffer Hall  email:   luke-tier...@uiowa.edu
> Iowa City, IA 52242 WWW:  http://www.stat.uiowa.edu
>



Re: [Rd] 1954 from NA

2021-05-23 Thread Mark van der Loo
I wrote about this once over here:
http://www.markvanderloo.eu/yaRb/2012/07/08/representation-of-numerical-nas-in-r-and-the-1954-enigma/

-M



Op zo 23 mei 2021 15:33 schreef brodie gaslam via R-devel <
r-devel@r-project.org>:

> I should add, I don't know that you can rely on this
> particular encoding of R's NA.  If I were trying to restore
> an NA from some external format, I would just generate an
> R NA via e.g NA_real_ in the R session I'm restoring the
> external data into, and not try to hand assemble one.
>
> Best,
>
> B.
>
>
> On Sunday, May 23, 2021, 9:23:54 AM EDT, brodie gaslam via R-devel <
> r-devel@r-project.org> wrote:
>
>
>
>
>
> This is because the NA in question is NA_real_, which
> is encoded in double precision IEEE-754, which uses
> 64 bits.  The "1954" is just part of the NA.  The NA
> must also conform to the NaN encoding for double precision
> numbers, which requires that the "beginning" portion of
> the number be "0x7ff0" (well, I think it should be "0x7ff8"
> but that's a different story), as you can see here:
>
> x.word[hw] = 0x7ff0;
> x.word[lw] = 1954;
>
> Both those components are part of the same double precision
> value.  They are just accessed this way to make it easy to
> set the high bits (63-32) and the low bits (31-0).
>
> So NA is not just 1954; it's the high word 0x7ff0 combined with the
> low word 1954 (note I'm mixing hex and decimal here).
>
> In IEEE 754 double precision encoding, numbers that start
> with 0x7ff are all NaNs.  The rest of the number except for
> the first bit which designates "quiet" vs "signaling" NaNs can
> be anything.  R has taken advantage of that to designate the
> R NA by setting the lower bits to be 1954.
>
> Note I'm being pretty loose about endianess, etc. here, but
> hopefully this conveys the problem.
>
> In terms of your proposal, I'm not entirely sure what you gain.
> You're still attempting to generate a 64 bit representation
> in the end.  If all you need is to encode the fact that there
> was an NA, and restore it later as a 64 bit NA, then you can do
> whatever you want so long as the end result conforms to the
> expected encoding.
>
> In terms of using 'short' here (which again, I don't see the
> need for as you're using it to generate the final 64 bit encoding),
> I see two possible problems.  You're adding the dependency that
> short will be 16 bits.  We already have the (implicit) assumption
> in R that double is 64 bits, and explicit that int is 32 bits.
> But I think you'd be going a bit on a limb assuming that short
> is 16 bits (not sure).  More important, if short is indeed 16 bits,
> I think in:
>
> x.word[hw] = 0x7ff0;
>
> You overflow short.
>
> Best,
>
> B.
>
>
>
> On Sunday, May 23, 2021, 8:56:18 AM EDT, Adrian Dușa <
> dusa.adr...@unibuc.ro> wrote:
>
>
>
>
>
> Dear R devs,
>
> I am probably missing something obvious, but still trying to understand why
> the 1954 from the definition of an NA has to fill 32 bits when it normally
> doesn't need more than 16.
>
> Wouldn't the code below achieve exactly the same thing?
>
> typedef union
> {
> double value;
> unsigned short word[4];
> } ieee_double;
>
>
> #ifdef WORDS_BIGENDIAN
> static CONST int hw = 0;
> static CONST int lw = 3;
> #else  /* !WORDS_BIGENDIAN */
> static CONST int hw = 3;
> static CONST int lw = 0;
> #endif /* WORDS_BIGENDIAN */
>
>
> static double R_ValueOfNA(void)
> {
> volatile ieee_double x;
> x.word[hw] = 0x7ff0;
> x.word[lw] = 1954;
> return x.value;
> }
>
> This question has to do with the tagged NA values from package haven, on
> which I want to improve. Every available bit counts, especially if
> multi-byte characters are going to be involved.
>
> Best wishes,
> --
> Adrian Dusa
> University of Bucharest
> Romanian Social Data Archive
> Soseaua Panduri nr. 90-92
> 050663 Bucharest sector 5
> Romania
> https://adriandusa.eu
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>



[Rd] nchar reporting wrong width when zero-space character is present?

2014-11-19 Thread Mark van der Loo
Dear list,

If I include the zero-width non-breaking space (\ufeff) in a string,
nchar seems to compute the wrong number of columns used by 'cat'.

> x <- "f\ufeffoo"
> x
[1] "foo"
> nchar(x,type="width")
[1] 2

I would expect "3" here. Going through the documentation of 'Encoding'
and 'encodeString', I don't think this is expected behavior. Am I
missing something? If it is a bug I will file a report.

Secondly, the documentation of 'nchars' states that with type='chars'
(the default) it returns "the number of human-readable characters". I
get:

> nchar(x,type='chars')
[1] 4

I would hardly call the zero-width space human-readable. Also, since for example

> nchar("foo\r")
[1] 4

it is probably more accurate to say that the number of symbols
(abstract characters) is counted, noting that some of the symbols in
an alphabet represented by an encoding may be invisible (or barely
visible).
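
To make the distinction between the counting modes concrete (assuming a
UTF-8 locale):

```r
x <- "f\ufeffoo"
nchar(x, type = "bytes")  # raw bytes: "\ufeff" alone takes 3 bytes in UTF-8
nchar(x, type = "chars")  # code points, visible or not: 4
nchar(x, type = "width")  # display columns that cat() should occupy
```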


Much thanks in advance,
Best, Mark


> sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
 [3] LC_TIME=nl_NL.UTF-8LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=nl_NL.UTF-8LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=nl_NL.UTF-8   LC_NAME=C
 [9] LC_ADDRESS=C   LC_TELEPHONE=C
[11] LC_MEASUREMENT=nl_NL.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_3.1.2



Re: [Rd] R string comparisons may vary with platform (plain text)

2014-11-24 Thread Mark van der Loo
The 'stringi' package claims robust cross-platform behaviour. It exposes much
of the functionality of the ICU library and will attempt to install ICU when
it is not present.
The function 'stri_sort' accepts a collator specification that can be built
with 'stri_opts_collator'.
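
A sketch (assumes stringi is installed); the Slovak example relies on "ch"
collating as a single letter after "h":

```r
library(stringi)
x <- c("chaos", "hlava", "hladny")
# Explicit collator settings, independent of the OS locale tables:
stri_sort(x, opts_collator = stri_opts_collator(locale = "en_US"))
stri_sort(x, opts_collator = stri_opts_collator(locale = "sk_SK"))
# Byte-wise, fully platform-independent ordering is also available in base R:
sort(x, method = "radix")
```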




On Sun, Nov 23, 2014 at 5:15 PM, Martin Morgan 
wrote:

>
> For many scientific applications one is really dealing with ASCII
> characters and LC_COLLATE="C", even if the user is running in non-C
> locales. What robust approaches (if any?) are available to write code that
> sorts in a locale-independent way? The Note in ?Sys.setlocale is not overly
> optimistic about setting the locale within a session.
>
> Martin Morgan
>
>
> On 11/23/2014 03:44 AM, Prof Brian Ripley wrote:
>
>> On 23/11/2014 09:39, peter dalgaard wrote:
>>
>>>
>>>  On 23 Nov 2014, at 01:05 , Henrik Bengtsson 
 wrote:

 On Sat, Nov 22, 2014 at 12:42 PM, Duncan Murdoch
  wrote:

> On 22/11/2014, 2:59 PM, Stuart Ambler wrote:
>
>> A colleague¹s R program behaved differently when I ran it, and we
>> thought
>> we traced it probably to different results from string comparisons as
>> below, with different R versions.  However the platforms also
>> differed.  A
>> friend ran it on a few machines and found that the comparison behavior
>> didn¹t correlate with R version, but rather with platform.
>>
>> I wonder if you¹ve seen this.  If it¹s not some setting I¹m unaware
>> of,
>> maybe someone should look into it.  Sorry I haven¹t taken the time to
>> read
>> the source code myself.
>>
>
> Looks like a collation order issue.  See ?Comparison.
>

 With the oddity that both platforms use what look like similar locales:

 LC_COLLATE=en_US.UTF-8
 LC_COLLATE=en_US.utf8

>>>
>>> It's the sort of thing thay I've tried to wrap my mind around multiple
>>> times
>>> and failed, but have a look at
>>>
>>> http://stackoverflow.com/questions/19967555/postgres-
>>> collation-differences-osx-v-ubuntu
>>>
>>>
>>> which seems to be essentially the same issue, just for Postgres. If you
>>> have
>>> the stamina, also look into the python question that it links to.
>>>
>>> As I understand it, there are two potential reasons: Either the two
>>> platforms
>>> are not using the same collation table for en_US, or at least one of
>>> them is
>>> not fully implementing the Unicode Collation Algorithm.
>>>
>>
>> And I have seen both with R.  At the very least, check if ICU is being
>> used
>> (capabilities("ICU") in current R, maybe not in some of the obsolete
>> versions
>> seen in this thread).
>>
>> As a further possibility, there are choices in the UCA (in R, see
>> ?icuSetCollate) and ICU can be compiled with different default choices.
>> It is
>> not clear to me what (if any) difference ICU versions make, but in R-devel
>> extSoftVersion() reports that.
>>
>>
>>  In general, collation is a minefield: Some languages have the same
>>> letters in
>>> different order (e.g. Estonian with Z between S and T); accented
>>> characters
>>> sort with the unaccented counterpart in some languages but as separate
>>> characters in others; some locales sort ABab, others AaBb, yet others
>>> aAbB;
>>> sometimes punctuation is ignored, sometimes not; sometimes multiple
>>> characters
>>> count as one, etc.
>>>
>>>  As ?Comparison has long said.
>>
>>
>>
>
> --
> Computational Biology / Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N.
> PO Box 19024 Seattle, WA 98109
>
> Location: Arnold Building M1 B861
> Phone: (206) 667-2793
>
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>



Re: [Rd] \U with more than 4 digits returns the wrong character

2014-12-04 Thread Mark van der Loo
Richie,

The R language definition [1] says (10.3.1):

\Unnnnnnnn \U{nnnnnnnn}
(where multibyte locales are supported and not on Windows, otherwise
an error). Unicode character with given hex code – sequences of up to
eight hex digits.
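
For example, on a non-Windows build in a UTF-8 locale:

```r
x <- "\U1d4d0"                 # MATHEMATICAL BOLD SCRIPT CAPITAL A
sprintf("U+%X", utf8ToInt(x))  # "U+1D4D0"
```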


Best,
Mark

[1] http://cran.r-project.org/doc/manuals/r-release/R-lang.html
http://www.markvanderloo.eu
---
If you cannot quantify it,
you don't know what you're talking about


On Thu, Dec 4, 2014 at 8:00 PM, Richard Cotton  wrote:
> If I type a character using \U syntax that has more than 4 digits, I
> get the wrong character.  For example,
>
> "\U1d4d0"
>
> should print a mathematical bold script capital A.  See
> http://www.fileformat.info/info/unicode/char/1d4d0/index.htm
>
> On my machine, it prints the Hangul character corresponding to
>
> "\Ud4d0"
> http://www.fileformat.info/info/unicode/char/d4d0/index.htm
>
> It seems that the hex-digit part is overflowing at 16^4.
>
> I tested this on R3.1.2 and devel (2014-12-03 r67101) x64 under
> Windows.  I played around with Sys.setlocale and options("encoding"),
> but couldn't get the expected value.
>
> Can others reproduce this?  It feels like a bug, but experience tells
> me I probably have something silly going on with my setup.
>
> --
> Regards,
> Richie
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel



Re: [Rd] \U with more than 4 digits returns the wrong character

2014-12-04 Thread Mark van der Loo
I agree. You could post a documentation bug and a request here:
https://bugs.r-project.org/bugzilla3/

Cheers, Mark

On Thu, Dec 4, 2014 at 8:37 PM, Richard Cotton  wrote:
> Great spot, thanks Mark.
>
> This really ought to appear somewhere in the ?Quotes help page.
>
> Having a warning under Windows might be nicer behaviour than silently
> returning the wrong value too.
>
> On 4 December 2014 at 22:24, Mark van der Loo  
> wrote:
>> Richie,
>>
>> The R language definition [1] says (10.3.1):
>>
>> \Unnnnnnnn \U{nnnnnnnn}
>> (where multibyte locales are supported and not on Windows, otherwise
>> an error). Unicode character with given hex code – sequences of up to
>> eight hex digits.
>>
>>
>> Best,
>> Mark
>>
>> [1] http://cran.r-project.org/doc/manuals/r-release/R-lang.html
>> http://www.markvanderloo.eu
>> ---
>> If you cannot quantify it,
>> you don't know what you're talking about
>>
>>
>> On Thu, Dec 4, 2014 at 8:00 PM, Richard Cotton  wrote:
>>> If I type a character using \U syntax that has more than 4 digits, I
>>> get the wrong character.  For example,
>>>
>>> "\U1d4d0"
>>>
>>> should print a mathematical bold script capital A.  See
>>> http://www.fileformat.info/info/unicode/char/1d4d0/index.htm
>>>
>>> On my machine, it prints the Hangul character corresponding to
>>>
>>> "\Ud4d0"
>>> http://www.fileformat.info/info/unicode/char/d4d0/index.htm
>>>
>>> It seems that the hex-digit part is overflowing at 16^4.
>>>
>>> I tested this on R3.1.2 and devel (2014-12-03 r67101) x64 under
>>> Windows.  I played around with Sys.setlocale and options("encoding"),
>>> but couldn't get the expected value.
>>>
>>> Can others reproduce this?  It feels like a bug, but experience tells
>>> me I probably have something silly going on with my setup.
>>>
>>> --
>>> Regards,
>>> Richie
>>>
>
>
>
> --
> Regards,
> Richie
>
> Learning R
> 4dpiecharts.com

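The truncation Richard describes can be illustrated with a short sketch, assuming a UTF-8 locale on a platform where \U escapes with up to eight hex digits are supported (i.e. not the affected Windows builds):

```r
# Sketch: \U1d4d0 should parse to code point U+1D4D0; the reported bug
# instead keeps only the low 16 bits, giving the Hangul character U+D4D0.
x <- "\U1d4d0"                       # MATHEMATICAL BOLD SCRIPT CAPITAL A
sprintf("U+%X", utf8ToInt(x))        # "U+1D4D0" when parsed correctly
sprintf("U+%X", 0x1d4d0 %% 0x10000)  # "U+D4D0" -- the overflowed value
```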


Re: [Rd] Possible values for R version status

2015-03-23 Thread Mark van der Loo
In the R Installation and Administration manual [*], I see at least this mention:

  The alpha, beta and RC versions of an upcoming x.y.0 release are
available [...]

so 'beta' seems to be an option unless it is only used informally there.

Mark

[*] 
http://cran.r-project.org/doc/manuals/r-release/R-admin.html#Using-Subversion-and-rsync


On Mon, Mar 23, 2015 at 2:17 PM, Richard Cotton  wrote:
> Is there a complete list somewhere of the possible values for R's
> status, as returned by version$status?
>
> I know about these values:
> Stable: ""
> Devel: "Under development (unstable)"
> Patched: "Patched"
> Release candidate: "RC"
> Alpha: "Alpha"
>
> Are there any others that I've missed?
>

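For reference, a quick sketch of where the value lives, with the set of values pieced together in this thread (including the "Beta" phase suggested by the admin manual):

```r
# Sketch: the status field of the running R build.  Values observed in
# this thread and in the R Installation and Administration manual:
#   ""                              -- released x.y.z
#   "Patched"                       -- r-patched
#   "Under development (unstable)"  -- r-devel
#   "Alpha", "Beta", "RC"           -- pre-x.y.0 release phases
R.version$status
```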


Re: [Rd] Development version of R: Improved nchar(), nzchar() but changed API

2015-04-27 Thread Mark van der Loo
Dear Martin,

Does the work on nchar mean that bugs #16090 and #16091 will be resolved
[1,2]?

Thanks,
Mark

[1] https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=16090
[2] https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=16091





On Sat, Apr 25, 2015 at 11:06 PM, James Cloos  wrote:

> > "GC" == Gábor Csárdi  writes:
>
> GC> You can get an RSS/Atom feed, however, if that's good:
> GC> https://github.com/wch/r-source/commits/master.atom
>
> That is available in gwene/gmane as:
>
>  gwene.com.github.wch.r-source.commits.trunk
>
> -JimC
> --
> James Cloos  OpenPGP: 0x997A9F17ED7DAEA6
>
>

