[R] Strange lazy evaluation of default arguments

2017-09-02 Thread Matthias Gondan
Dear R developers,

sessionInfo() below

Please have a look at the following two versions of the same function:

1. Intended behavior:

> Su1 = function(u=100, l=u, mu=0.53, sigma2=4.3^2)
+ {
+   print(c(u, l, mu)) # here, l is set to u’s value
+   u = u/sqrt(sigma2)
+   l = l/sqrt(sigma2)
+   mu = mu/sqrt(sigma2)
+   print(c(u, l, mu))
+ }
> 
> Su1()
[1] 100.00 100.00   0.53
[1] 23.2558140 23.2558140  0.1232558

In the first version, both u and l are correctly divided by 4.3.

2. Strange behavior:

> Su2 = function(u=100, l=u, mu=0.53, sigma2=4.3^2)
+ {
+   # print(c(u, l, mu))
+   u = u/sqrt(sigma2)
+   l = l/sqrt(sigma2) # here, l is set to u’s value
+   mu = mu/sqrt(sigma2)
+   print(c(u, l, mu))
+ }
> 
> Su2()
[1] 23.2558140  5.4083288  0.1232558
In the second version, the print function is commented out, so the value of u 
is copied to l (lowercase L) only later, after u has already been divided by 
4.3, and l therefore ends up divided twice by 4.3.

Is this behavior intended? It seems strange that the result depends on a 
debugging message.

Best wishes,

Matthias


> sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252
LC_MONETARY=German_Germany.1252
[4] LC_NUMERIC=C                    LC_TIME=German_Germany.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base 

loaded via a namespace (and not attached):
[1] compiler_3.4.1 tools_3.4.1   


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Strange lazy evaluation of default arguments

2017-09-02 Thread Matthias Gondan
Dear Bill,

All makes perfect sense (including the late evaluation). I actually discovered 
the problem by looking at old code which used your proposed solution. Still I 
find it strange (and, honestly, I don't like R's behavior in this respect), and 
I am wondering why u is not copied to l just before u is assigned a new value. 
Of course, this would require the R interpreter to track all these dependencies 
in both directions, including more complicated ones in which l might depend on 
more than just u.

In the future, I’ll avoid dependencies between parameters.

Su4 <- function(u=100, l=100, mu=0.53, sigma2=4.3^2) # instead of l=u

And maybe also „in-place“ changes of values…
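
For the record, here is a minimal sketch (my own toy example, not taken from the 
original code) of what the lazy default does: the default l=u is a promise that 
captures the expression u and is only evaluated when l is first used, so it sees 
whatever value u happens to have at that moment.

f <- function(u = 100, l = u)
{
  u <- u/2   # change u before l is ever touched
  l          # the promise l = u is evaluated only here ...
}
f()          # ... so this returns 50, not 100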

Best regards,

Matthias

From: William Dunlap
Sent: Saturday, 2 September 2017 19:41
To: Rui Barradas
Cc: Matthias Gondan; r-help@r-project.org
Subject: Re: [R] Strange lazy evaluation of default arguments

Another way to avoid the problem is to not redefine variables that are 
arguments.  E.g.,

> Su3 <- function(u=100, l=u, mu=0.53, sigma2=4.3^2, verbose)
  {
    if (verbose) {
      print(c(u, l, mu))
    }
    uNormalized <- u/sqrt(sigma2)
    lNormalized <- l/sqrt(sigma2)
    muNormalized <- mu/sqrt(sigma2)
    c(uNormalized, lNormalized, muNormalized)
  }
> Su3(verbose=TRUE)
[1] 100.00 100.00   0.53
[1] 23.2558140 23.2558140  0.1232558
> Su3(verbose=FALSE)
[1] 23.2558140 23.2558140  0.1232558

Not redefining variables at all makes debugging easier, although it may waste 
space.


Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Sat, Sep 2, 2017 at 10:33 AM,  wrote:
Hello,

One way of preventing that is to use ?force.
Just put

   force(l)

right after the commented out print and before you change 'u'.
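
For completeness, a sketch of the function with the suggested force() call 
inserted (Su2b is just an illustrative name; output as I would expect it):

Su2b = function(u=100, l=u, mu=0.53, sigma2=4.3^2)
{
  force(l)   # evaluate the default l = u now, while u is still 100
  u = u/sqrt(sigma2)
  l = l/sqrt(sigma2)
  mu = mu/sqrt(sigma2)
  print(c(u, l, mu))
}
Su2b()
# [1] 23.2558140 23.2558140  0.1232558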

Hope this helps,

Rui Barradas



Quoting Matthias Gondan:

Dear R developers,

sessionInfo() below

Please have a look at the following two versions of the same function:

1. Intended behavior:
Su1 = function(u=100, l=u, mu=0.53, sigma2=4.3^2)
+ {
+   print(c(u, l, mu)) # here, l is set to u’s value
+   u = u/sqrt(sigma2)
+   l = l/sqrt(sigma2)
+   mu = mu/sqrt(sigma2)
+   print(c(u, l, mu))
+ }

Su1()
[1] 100.00 100.00   0.53
[1] 23.2558140 23.2558140  0.1232558

In the first version, both u and l are correctly divided by 4.3.

2. Strange behavior:
Su2 = function(u=100, l=u, mu=0.53, sigma2=4.3^2)
+ {
+   # print(c(u, l, mu))
+   u = u/sqrt(sigma2)
+   l = l/sqrt(sigma2) # here, l is set to u’s value
+   mu = mu/sqrt(sigma2)
+   print(c(u, l, mu))
+ }

Su2()
[1] 23.2558140  5.4083288  0.1232558
In the second version, the print function is commented out, so the value of u
is copied to l (lowercase L) only later, after u has already been divided by
4.3, and l therefore ends up divided twice by 4.3.

Is this behavior intended? It seems strange that the result depends on a 
debugging message.

Best wishes,

Matthias

sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252    
LC_MONETARY=German_Germany.1252
[4] LC_NUMERIC=C                    LC_TIME=German_Germany.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_3.4.1 tools_3.4.1


        [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Strange lazy evaluation of default arguments

2017-09-05 Thread Matthias Gondan
Dear S Ellison,

Thanks for the flowers! Indeed, I was actually considering using it in my own 
teaching material, as well. With rounded numbers, of course.

Though I am still slightly disturbed by this behavior. I thought now would be 
the time to switch to Python, but that's even worse; see here:

def add_elem(List=[]):
List.append('elem')
return List

>>> add_elem([2])
[2, 'elem']
>>> add_elem()
['elem']
>>> add_elem()
['elem', 'elem'] <<<<<<<<<<< why on earth does this happen? [it’s documented 
behavior, but still…]

So, I'll stick with R. Still 25 years or so until retirement, but I'll survive, 
even without cross-referenced default arguments.

Best wishes,

Matthias


From: S Ellison
Sent: Tuesday, 5 September 2017 16:17
To: Matthias Gondan; r-help@r-project.org
Subject: RE: [R] Strange lazy evaluation of default arguments

Matthias,
If it's any comfort, I appreciated the example; 'expected' behaviour maybe, but 
a very nice example for staff/student training!

S Ellison


> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Matthias
> Gondan
> Sent: 02 September 2017 18:22
> To: r-help@r-project.org
> Subject: [R] Strange lazy evaluation of default arguments
> 
> Dear R developers,
> 
> sessionInfo() below
> 
> Please have a look at the following two versions of the same function:
> 
> 1. Intended behavior:
> 
> > Su1 = function(u=100, l=u, mu=0.53, sigma2=4.3^2)
> + {
> +   print(c(u, l, mu)) # here, l is set to u’s value
> +   u = u/sqrt(sigma2)
> +   l = l/sqrt(sigma2)
> +   mu = mu/sqrt(sigma2)
> +   print(c(u, l, mu))
> + }
> >
> > Su1()
> [1] 100.00 100.00   0.53
> [1] 23.2558140 23.2558140  0.1232558
> 
> In the first version, both u and l are correctly divided by 4.3.
> 
> 2. Strange behavior:
> 
> > Su2 = function(u=100, l=u, mu=0.53, sigma2=4.3^2)
> + {
> +   # print(c(u, l, mu))
> +   u = u/sqrt(sigma2)
> +   l = l/sqrt(sigma2) # here, l is set to u’s value
> +   mu = mu/sqrt(sigma2)
> +   print(c(u, l, mu))
> + }
> >
> > Su2()
> [1] 23.2558140  5.4083288  0.1232558
> In the second version, the print function is commented out, so the value of u
> is copied to l (lowercase L) only later, after u has already been divided by
> 4.3, and l therefore ends up divided twice by 4.3.
> 
> Is this behavior intended? It seems strange that the result depends on a
> debugging message.
> 
> Best wishes,
> 
> Matthias
> 
> 
> > sessionInfo()
> R version 3.4.1 (2017-06-30)
> Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows >= 8
> x64 (build 9200)
> 
> Matrix products: default
> 
> locale:
> [1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252
> LC_MONETARY=German_Germany.1252
> [4] LC_NUMERIC=C                    LC_TIME=German_Germany.1252
> 
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods   base
> 
> loaded via a namespace (and not attached):
> [1] compiler_3.4.1 tools_3.4.1
> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.


***
This email and any attachments are confidential. Any use...{{dropped:12}}

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Quantiles with ordered categories

2017-03-14 Thread matthias-gondan
Dear R users,

This works: 

quantile(1:10, probs=0.5)

This fails (obviously):

quantile(factor(1:10), probs=0.5)

But why do quantiles for ordered factors not work either?

quantile(ordered(1:10), probs=0.5)

Is it because interpolation (see the optional type argument) is not defined? Is 
there an elegant workaround?

Thank you.

Best wishes,

Matthias

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Quantiles with ordered categories

2017-03-14 Thread matthias-gondan
I found it:

quantile(ordered(1:10), probs=0.5, type=1) 

works, because type=1 seems to round up or down, whichever. The default type is 
7, which wants to interpolate and then produces the error. 
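
A minimal sketch of that behavior (output as I would expect it on a recent R; 
type=1 picks an observed level instead of interpolating):

x <- ordered(1:10)
quantile(x, probs=0.5, type=1)   # returns the level "5"
## quantile(x, probs=0.5)        # default type=7 tries to interpolate and stops with an error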

Two options come to my mind:

- The error message could be improved.
- The default type could be 1 if the data is from ordered categories.
- Or both.

It is probably a little thing to fix, but I lack the skills to do this myself.

Best wishes,

Matthias

From: Bert Gunter
Sent: Tuesday, 14 March 2017 21:34
To: matthias-gon...@gmx.de
Cc: r-help@r-project.org
Subject: Re: [R] Quantiles with ordered categories

Inline.
Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Mar 14, 2017 at 12:36 PM,   wrote:
> Dear R users,
>
> This works:
>
> quantile(1:10, probs=0.5)
>
> This fails (obviously):
>
> quantile(factor(1:10), probs=0.5)
>
> But why do quantiles for ordered factors not work either?
>
> quantile(ordered(1:10), probs=0.5)
>
> Is it because interpolation (see the optional type argument) is not defined?
Yes.


Is there an elegant workaround?
No. How can there be? By definition, all that is assumed by an ordered
factor is an ordering of the categories. How can you "interpolate" in
ordered(letters[1:3])? AFAIK there is no "a.5".

-- Bert



>
> Thank you.
>
> Best wishes,
>
> Matthias
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Test for stochastic dominance, non-inferiority test for distributions

2009-08-31 Thread Matthias Gondan
Dear R-Users,

Is anyone aware of a significance test which allows 
demonstrating that one distribution dominates another?

Let F(t) and G(t) be two distribution functions, the
alternative hypothesis would be something like:

F(t) >= G(t), for all t

null hypothesis: F(t) < G(t), for some t.


Best wishes,

Matthias


PS. This one would be ok, as well:

F(t) > G(t), for all t

null hypothesis: F(t) <= G(t), for some t.
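
If F and G are estimated from two samples, one possibly useful starting point 
(a sketch only; note that the one-sided two-sample Kolmogorov-Smirnov test does 
not have exactly the null/alternative structure written above, and the direction 
convention of 'alternative' is easy to get backwards, see ?ks.test):

set.seed(1)
x <- rnorm(100, mean=1)   # hypothetical sample from F
y <- rnorm(100)           # hypothetical sample from G
ks.test(x, y, alternative="less")               # one-sided comparison of the two ECDFs
plot(ecdf(x)); plot(ecdf(y), add=TRUE, col=2)   # visual check of the dominance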

-- 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Accessing named elements of a vector

2010-02-25 Thread Matthias Gondan

Dear R developers,

A great R feature is that elements of vectors, lists and dataframes can 
have names:


vx = c(a=1, b=2)
lx = list(a=1, b=2)

Accessing element "a" of vx: vx['a']
Accessing element "a" of lx: lx[['a']] or lx$a

Might be a matter of taste, but I like the $ very much. Unfortunately, 
vx$a is not
functional. Would it break existing compatibility if the $ would be 
allowed to

access elements of a vector, as well?
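
As a side note, single elements of an atomic vector can already be extracted by 
name with [[ ]], which at least drops the name from the result:

vx = c(a=1, b=2)
vx["a"]     # named length-1 vector
vx[["a"]]   # just the value 1, without the name
# vx$a      # error: "$ operator is invalid for atomic vectors"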

Best regards,

Matthias

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Nonparametric generalization of ANOVA

2010-03-05 Thread Matthias Gondan
> This is your first of three postings in the last hour and they are all in 
> a category that could well be described as requests for tutoring in basic 
> statistical topics. I am of the impression you have been requested not to 
> engage in such behavior on this list. For this question for instance 
> there is an entire CRAN Task View available and you have been in 
particular asked to use such resources before posting.

Please allow me to ask for details on this task view, because I am interested 
in the topic of nonparametric ANOVAs, as well. To my knowledge,
there are some R scripts from Brunner et al. available on his website

http://www.ams.med.uni-goettingen.de/de/sof/ld/index.html

But they seem not to be working with current R versions.

Best regards,

Matthias Gondan



-- 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] programming to calculate variance

2009-09-30 Thread Matthias Gondan

I think it should be

var(y[(i-3):(i-1)])

instead of 

var(x[i-3:i-1,])

otherwise the values of the vector are overwritten. (Note also the parentheses:
i-3:i-1 is parsed as i - (3:i) - 1, not as the index range (i-3):(i-1), and y is
a plain vector, so it is indexed without the comma.)
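
A sketch of the corrected loop, keeping the original vector y untouched and 
collecting the variances in a separate vector:

y = c(2, 1, 5, 8, 11, 3, 1, 7, 50, 21, 33, 7, 60)
v = rep(NA_real_, length(y))                       # NA in the first three positions
for (i in 4:length(y)) v[i] = var(y[(i-3):(i-1)])  # variance of the 3 previous values
v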

Best wishes,

Matthias



marlene marchena wrote:

Dear R-user

Suppose I have the following data

 y=c(2,1,5,8,11,3,1,7,50,21,33,7,60)

x=data.frame(y)

for(i in 4:nrow(x)) x[i,] =var(x[i-3:i-1,])

I'm trying to get a new variable with the variance of the 3 previous values
(just an example) and with NA in the three first positions. I know that my
for() is wrong
but I'm not able to find my error.

Any idea?

Thanks,

Marlene.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

  



--
Dr. rer. nat. Matthias Gondan
Institut für Psychologie
Universität Regensburg
D-93050 Regensburg
Tel. 0941-943-3856
Fax 0941-943-3233
Email: matthias.gon...@psychologie.uni-regensburg.de
http://www.psychologie.uni-r.de/Greenlee/team/gondan/gondan.html

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] sscanf

2009-05-08 Thread Matthias Gondan
Dear list,

Apparently, there is no function like sscanf in R.

I have a string, "Condition: 311", and I would like
to read out the number and store it to a numeric
variable. Is there an easy way to do this?
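
In the meantime, a sketch of what I would do with the regex functions (assuming 
the string always looks like the example):

s = "Condition: 311"
as.numeric(sub("Condition:[[:space:]]*", "", s))   # strip the label, coerce the rest: 311
as.numeric(gsub("[^0-9]", "", s))                  # or: keep only the digits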

Best wishes,

Matthias
--

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Embedded R: Test if initialized

2021-06-16 Thread Matthias Gondan
Dear R friends,

I am currently trying to write a piece of C code that uses „embedded R“, and 
for specific reasons*, I cannot keep track if R already has been initialized. 
So the code snippet looks like this:

LibExtern char *R_TempDir;

if(R_TempDir == NULL)
…throw exception R not initialized…

I have seen that the source code of Rf_initialize_R itself checks if it is 
invoked twice (num_initialized), but this latter flag does not seem to be 
accessible, or is it? 

int Rf_initialize_R(int ac, char **av)
{
int i, ioff = 1, j;
Rboolean useX11 = TRUE, useTk = FALSE;
char *p, msg[1024], cmdlines[1], **avv;
structRstart rstart;
Rstart Rp = &rstart;
Rboolean force_interactive = FALSE;

if (num_initialized++) {
fprintf(stderr, "%s", "R is already initialized\n");
exit(1);
}


Is the test of the TempDir a good substitute, or should I choose another 
solution? Having said this, it may be a good idea to expose a function 
Rf_R_initialized that performs such a test.

Thank you for your consideration.

Best regards,

Matthias

*The use case is an R library that connects to swi-prolog and allows the 
„embedded“ swi-prolog to establish the reverse connection to R. In that case, 
i.e., R -> Prolog -> R, I do not want to initialize R a second time.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] density with weights missing values

2021-07-12 Thread Matthias Gondan
Dear R users,

This works as expected:

• plot(density(c(1,2, 3, 4, 5, NA), na.rm=TRUE))

This raises an error

• plot(density(c(1,2, 3, 4, 5, NA), na.rm=TRUE, weights=c(1, 1, 1, 1, 1, 1)))
• plot(density(c(1,2, 3, 4, 5, NA), na.rm=TRUE, weights=c(1, 1, 1, 1, 1, NA)))

This seems to work (it triggers a warning that the weights don’t add up to 1, 
which makes sense*):

• plot(density(c(1,2, 3, 4, 5, NA), na.rm=TRUE, weights=c(1, 1, 1, 1, 1)))

Questions

• But shouldn’t the na.rm filter also filter the corresponding weights?
• Extra question: In case the na.rm filter is changed to filter the weights, 
the check for sum(weights) == 1 might trigger false positive warnings since the 
weights might not add up to 1 anymore
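
In the meantime, a sketch of the manual workaround: drop the NAs from x and from 
the weights, and renormalize the weights so that they add up to 1 again:

x = c(1, 2, 3, 4, 5, NA)
w = c(1, 1, 1, 1, 1, 1)
ok = !is.na(x)
plot(density(x[ok], weights=w[ok]/sum(w[ok])))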

Best wishes,

Matthias


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] density with weights missing values

2021-07-12 Thread Matthias Gondan
Weighted mean behaves differently:
• weight is excluded for missing x
• no warning for sum(weights) != 1

> weighted.mean(c(1, 2, 3, 4), weights=c(1, 1, 1, 1))
[1] 2.5
> weighted.mean(c(1, 2, 3, NA), weights=c(1, 1, 1, 1))
[1] NA
> weighted.mean(c(1, 2, 3, NA), weights=c(1, 1, 1, 1), na.rm=TRUE)
[1] 2




From: Richard O'Keefe
Sent: Monday, 12 July 2021 13:18
To: Matthias Gondan
Subject: Re: [R] density with weights missing values

Does your copy of R say that the weights must add up to 1?
?density doesn't say that in mine.   But it does check.

On Mon, 12 Jul 2021 at 22:42, Matthias Gondan  wrote:
>
> Dear R users,
>
> This works as expected:
>
> • plot(density(c(1,2, 3, 4, 5, NA), na.rm=TRUE))
>
> This raises an error
>
> • plot(density(c(1,2, 3, 4, 5, NA), na.rm=TRUE, weights=c(1, 1, 1, 1, 1, 1)))
> • plot(density(c(1,2, 3, 4, 5, NA), na.rm=TRUE, weights=c(1, 1, 1, 1, 1, NA)))
>
> This seems to work (it triggers a warning that the weights don’t add up to 1, 
> which makes sense*):
>
> • plot(density(c(1,2, 3, 4, 5, NA), na.rm=TRUE, weights=c(1, 1, 1, 1, 1)))
>
> Questions
>
> • But shouldn’t the na.rm filter also filter the corresponding weights?
> • Extra question: In case the na.rm filter is changed to filter the weights, 
> the check for sum(weights) == 1 might trigger false positive warnings since 
> the weights might not add up to 1 anymore
>
> Best wishes,
>
> Matthias
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] density with weights missing values

2021-07-12 Thread matthias-gondan
The thing is that for na.rm=TRUE, I would expect the weights corresponding to 
the missing x to be removed, as well. Like in weighted.mean. So this one 
shouldn't raise an error:

density(c(1, 2, 3, 4, 5, NA), na.rm=TRUE, weights=c(1, 1, 1, 1, 1, 1))

Or am I missing something? 

-------- Original Message --------
From: Bert Gunter
Date: 12.07.21 16:25 (GMT+01:00)
To: Matthias Gondan
Cc: r-help@r-project.org
Subject: Re: [R] density with weights missing values

The behavior is as documented AFAICS.

na.rm: logical; if TRUE, missing values are removed from x. If FALSE any
missing values cause an error. The default is FALSE.

weights: numeric vector of non-negative observation weights.

NA is not a non-negative numeric.

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Mon, Jul 12, 2021 at 6:10 AM Matthias Gondan wrote:
>
> Weighted mean behaves differently:
> • weight is excluded for missing x
> [..]
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] density with weights missing values

2021-07-12 Thread matthias-gondan
You're right, of course. Extrapolating your argument a bit, the whole practice 
of na.rm is questionable, since there's always a reason for missingness (that 
is not in x and rarely elsewhere in the data).

Best wishes,

Matthias

-------- Original Message --------
From: Jeff Newmiller
Date: 12.07.21 18:44 (GMT+01:00)
To: r-help@r-project.org, matthias-gondan, Bert Gunter
Cc: r-help@r-project.org
Subject: Re: [R] density with weights missing values

Sure, you might think that.

But most likely the reason this code has not been corrected is that when you
give weights for missing data the most correct result is for your entire
density to be invalid.

Fix your inputs so they make sense to you and there is no problem. But absent
your intellectual input to restructure your problem the weights no longer make
sense once density() removes the NAs from the data.

On July 12, 2021 9:13:12 AM PDT, matthias-gondan wrote:
>The thing is that for na.rm=TRUE, I would expect the weights
>corresponding to the missing x to be removed, as well. Like in
>weighted.mean.
>[..]

Re: [R] density with weights missing values

2021-07-13 Thread Matthias Gondan
Thanks Martin and the others. I will do so accordingly. 

I guess the 0.1% of the population who uses density with weights will write 
code like this

x = c(1, 2, 3, NA)
weights = c(1, 1, 1, 1)
density(x[!is.na(x)], weights=weights[!is.na(x)])

These people won’t be affected. For the 0.01% of people with code like this,

density(x, weights=weights[!is.na(x)], na.rm=TRUE)

the corrected version would almost surely raise an error. Note that the error 
message can, in principle, check if length(x[!is.na(x)]) == length(the provided 
weights) and tell the programmer that this was the old behavior.

Best wishes,

Matthias

PS. Sorry for the HTML email. I’ve given up trying to fix such behavior.


From: Martin Maechler
Sent: Tuesday, 13 July 2021 09:09
To: Matthias Gondan
Cc: r-help@r-project.org
Subject: Re: [R] density with weights missing values

>>>>> Matthias Gondan 
>>>>> on Mon, 12 Jul 2021 15:09:38 +0200 writes:

> Weighted mean behaves differently:
> • weight is excluded for missing x
> • no warning for sum(weights) != 1

>> weighted.mean(c(1, 2, 3, 4), weights=c(1, 1, 1, 1))
> [1] 2.5
>> weighted.mean(c(1, 2, 3, NA), weights=c(1, 1, 1, 1))
> [1] NA
>> weighted.mean(c(1, 2, 3, NA), weights=c(1, 1, 1, 1), na.rm=TRUE)
> [1] 2


I'm sure the 'weights' argument in weighted.mean() has been used
much more often than the one in density().
Hence, it's quite "probable statistically" :-)  that the
weighted.mean() behavior in the NA case has been more rational
and thought through 

So I agree with you, Matthias, that ideally density() should
behave differently here,  probably entirely analogously to weighted.mean().

Still, Bert and others are right that there is no bug formally,
but something that possibly should be changed; even though it
breaks back compatibility for those cases,  such case may be
very rare (I'm not sure I've ever used weights in density() but
I know I've used it very much all those 25 years ..).

https://www.r-project.org/bugs.html

contains good information about determining if something may be
a bug in R *and* tell you how to apply for an account on R's
bugzilla for reporting it formally.
I'm hereby encouraging you, Matthias, to do that and then in
your report mention both density() and weighted.mean(), i.e., a
cleaned up version of the union of your first 2 e-mails..

Thank you for thinking about this and concisely reporting it.
Martin


> From: Richard O'Keefe
> Sent: Monday, 12 July 2021 13:18
> To: Matthias Gondan
> Subject: Re: [R] density with weights missing values

> Does your copy of R say that the weights must add up to 1?
> ?density doesn't say that in mine.   But it does check.

another small part that could be improved, indeed,
thank you, Richard.

--
Martin Maechler
ETH Zurich  and  R Core team
 
> On Mon, 12 Jul 2021 at 22:42, Matthias Gondan  
wrote:
>> 
>> Dear R users,
>> 
>> This works as expected:
>> 
>> • plot(density(c(1,2, 3, 4, 5, NA), na.rm=TRUE))
>> 
>> This raises an error
>> 
>> • plot(density(c(1,2, 3, 4, 5, NA), na.rm=TRUE, weights=c(1, 1, 1, 1, 1, 
1)))
>> • plot(density(c(1,2, 3, 4, 5, NA), na.rm=TRUE, weights=c(1, 1, 1, 1, 1, 
NA)))
[..]


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] What are the pros and cons of the log.p parameter in (p|q)norm and similar?

2021-08-04 Thread matthias-gondan
Response to 1: You need the log version e.g. in maximum likelihood; otherwise 
the product of the densities and probabilities can become very small.

-------- Original Message --------
From: r-help-requ...@r-project.org
Date: 04.08.21 12:01 (GMT+01:00)
To: r-help@r-project.org
Subject: R-help Digest, Vol 222, Issue 4

[..]

Message: 1
Date: Tue, 3 Aug 2021 17:20:12 +0100
From: Michael Dewey
To: "r-help@r-project.org"
Subject: [R] What are the pros and cons of the log.p parameter in (p|q)norm
and similar?

Short version

Apart from the ability to work with values of p too small to be of much
practical use, what are the advantages and disadvantages of setting this to
TRUE?

Longer version

I am contemplating upgrading various functions in one of my packages to use
this and as far as I can see it would only have the advantage of allowing
people to use very small p-values, but before I go ahead have I missed
anything? I am most concerned with negatives, but if there is any other
advantage I would mention that in the vignette. I am not concerned about
speed or the extra effort in coding and expanding the documentation.

--
Michael
http://www.dewey.myzen.co.uk/home.html

[..]
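
To illustrate the maximum likelihood point above (a toy sketch, numbers roughly 
as I would expect them): the product of many small densities or probabilities 
underflows, while the sum of their logs stays finite.

x <- rnorm(1e4)
prod(dnorm(x))            # underflows to 0
sum(dnorm(x, log=TRUE))   # finite log-likelihood, safe to optimize
pnorm(-40)                # 0 to double precision
pnorm(-40, log.p=TRUE)    # about -804, still usable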

[R] ylim with only one value specified

2012-10-09 Thread Matthias Gondan
Dear R developers,

I would like to have R choose the limits of the y-axis semi-automatically,
e.g., zero should be included, but the maximum should be chosen depending 
on the data.

Examples:

plot(1:10, 1:10) # selects min and max automatically
plot(1:10, 1:10, ylim=c(1, 10)) # manual definition
plot(1:10, 1:10, ylim=c(0, Inf)) # this would be a nice feature, i.e. lower y 
limit = 0 defined manually, upper limit = 10 selected automatically

The relevant code from plot.default would have to be modified:

old:

ylim <- if (is.null(ylim)) 
range(xy$y[is.finite(xy$y)])
else ylim

new:

ylim <- if (is.null(ylim)) 
range(xy$y[is.finite(xy$y)])
else if(length(ylim) == 2 & ylim[2] == Inf)
c(ylim[1], max(xy$y[is.finite(xy$y)]))
else ylim

and some more if-else statements if cases like ylim=c(0, -Inf), 
ylim=c(Inf, 0), ylim=c(-Inf, 0) and ylim=c(-Inf, Inf)[as a replacement for NULL/
autoselection] and ylim=c(Inf, -Inf)[autoselection, reversed y axis] should be 
handled correctly.

I would find such a feature useful. Do you think it would interfere with other 
functions? 
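
A sketch of a user-level workaround that does not touch plot.default (ylim_auto 
is just a hypothetical helper name):

ylim_auto <- function(y, ylim)
{
  r <- range(y[is.finite(y)])
  ylim[ylim ==  Inf] <- r[2]   # replace +Inf by the data maximum
  ylim[ylim == -Inf] <- r[1]   # replace -Inf by the data minimum
  ylim
}
plot(1:10, 1:10, ylim=ylim_auto(1:10, c(0, Inf)))   # lower limit 0, upper limit from the data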

Thank you for your consideration.

Best wishes,

Matthias Gondan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] problem with "any" function

2012-11-17 Thread Matthias Gondan

The warning

1: In (ind.c == TRUE) & (ind.sgn == TRUE) :
   longer object length is not a multiple of shorter object length

means that ind.c and ind.sgn have different lengths, for whatever reason.
Although R continues the routine, the warning should, in general, not be 
ignored.


Try:
1:3 + 1:2
at the command prompt to see what type of unintended things happen.

Actually, 1:4 + 1:2 runs silently, because the longer length is a multiple of 
the shorter one, so the values are recycled without a warning.

Best wishes,

Matthias

On 17.11.2012 19:56, Rui Barradas wrote:

Hello,

First of all, that's not an error message, it's a warning.
As for your condition, it's too complicated. You don't need to test a
logical value for equality to TRUE, it's redundant. And any(x, y) is
equivalent to any(x, y, x & y). Try

any(ind.c, ind.r, ind.sgn)

and you'll be fine.

Hope this helps,

Rui Barradas
On 17-11-2012 18:36, Haris Rhrlp wrote:

Dear R users,

I have the "any" function of R

any(ind.c, ind.r, ind.sgn)


all are logical factors

it works fine when any of three is true
but when they are combined it doesnt work.

i tried
this
any(ind.c, ind.r, ind.sgn,((ind.c==TRUE) & (ind.r==TRUE)),
((ind.c==TRUE) & (ind.sgn==TRUE)),((ind.r==TRUE) & (ind.sgn==TRUE)),
   ((ind.c==TRUE) & (ind.r==TRUE) & (ind.sgn==TRUE)))
}

but i have this error message
Warning messages:
1: In (ind.c == TRUE) & (ind.sgn == TRUE) :
longer object length is not a multiple of shorter object length
2: In (ind.c == TRUE) & (ind.r == TRUE) & (ind.sgn == TRUE) :
longer object length is not a multiple of shorter object length

I want to check the three logical factors combined

any help will be welcome
[[alternative HTML version deleted]]



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] [OT] R on Atlas library

2010-08-06 Thread Matthias Gondan
Dear List,

I am aware this is slightly off-topic, but I am sure there are people who 
already had the problem and who perhaps solved it.

I am running long-lasting model fits using constrOptim command. At work
there is a linux computer (Quad Core, debian) on which I already have
compiled R and Atlas, in the hope that things will go faster on that
machine.

Atlas offers the possibility to be compiled for multiple cores, I used
that option, but without success. Performance meter ("top") for Linux
shows 25% CPU usage, and there is no loss of performance if I run
4 instances of R doing heavy matrix multiplications. Performance drops
when a 5th instance of R is doing the same job, so I assume my attempt
was not successful.

I am sure I did something wrong. Is there anybody who successfully
runs R with Atlas and all processors? A short description of the
necessary steps would be helpful.

Searching around the internet was not very encouraging. Some people
wrote that it is not so simple to have Atlas fully exploit a multicore
computer.

I hope this is not too unspecific.

Best wishes,

Matthias

-- 
Dr. rer. nat. Matthias Gondan
Institut für Psychologie
Universität Regensburg
D-93050 Regensburg
Tel. 0941-943-3856
Fax 0941-943-3233
Email: matthias.gon...@psychologie.uni-regensburg.de
http://www.psychologie.uni-r.de/Greenlee/team/gondan/gondan.html 
--

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] optimization subject to constraints

2010-08-09 Thread Matthias Gondan

 try command solnp in package Rsolnp

On 09.08.2010 18:56, Dwayne Blind wrote:

Hi !

Why not constrOptim ?

Dwayne

2010/8/9 Gildas Mazo


Dear R users,

I'm looking for tools to perform optimization subject to constraints,
both linear and non-linear. I don't mind which algorithm may be used, my
primary aim is to get something general and easy-to-use to study simples
examples.

Thanks for helping,

Gildas



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] optimization subject to constraints

2010-08-10 Thread Matthias Gondan

 try this (package Rsolnp)

library(Rsolnp)

g<- function(x)
{
  return(x[1]^2+x[2]^2)
} # constraint

f<- function(x)
{
  return(x[1]+x[2])
} # objective function

x0 = c(1, 1)

solnp(x0, fun=f, eqfun=g, eqB=c(1))
# note: solnp minimizes, so to maximize f as asked, pass fun=function(x) -f(x)



On 10.08.2010 14:59, Gildas Mazo wrote:

Thanks, but I still cannot get to solve my problem: consider this simple
example:



f<- function(x){
   return(x[1]+x[2])
} # objective function

g<- function(x){
   return(x[1]^2+x[2]^2)
} # constraint

#

I wanna Maximize f(x) subject to g(x) = 1. By hand the solution is
(1/sqrt(2), 1/sqrt(2), sqrt(2)). This is to maximizing a linear function
subject to a nonlinear equality constraint.  I didn't find any suitable
function in the packages I went through.

Thanks in advance,

Gildas





Spencer Graves wrote:

To find every help page containing the term "constrained
optimization", you can try the following:


library(sos)
co<- findFn('constrained optimization')


   "Printing" this "co" object opens a table in a web browser with
all matches sorted first by the package with the most matches and with
hot links to the documentation page.


writeFindFn2xls(co)


   This writes an excel file, with the browser table as the second
tab and the first being a summary of the packages.  This summary table
can be made more complete and useful using the "installPackages"
function, as noted in the "sos" vignette.


   A shameless plug from the lead author of the  "sos" package.
   Spencer Graves


On 8/9/2010 10:01 AM, Ravi Varadhan wrote:

constrOptim can only handle linear inequality constraints.  It cannot
handle
equality (linear or nonlinear) as well as nonlinear inequality
constraints.

Ravi.

-Original Message-
From: r-help-boun...@r-project.org
[mailto:r-help-boun...@r-project.org] On
Behalf Of Dwayne Blind
Sent: Monday, August 09, 2010 12:56 PM
To: Gildas Mazo
Cc: r-help@r-project.org
Subject: Re: [R] optimization subject to constraints

Hi !

Why not constrOptim ?

Dwayne

2010/8/9 Gildas Mazo


Dear R users,

I'm looking for tools to perform optimization subject to constraints,
both linear and non-linear. I don't mind which algorithm may be
used, my
primary aim is to get something general and easy-to-use to study
simples
examples.

Thanks for helping,

Gildas

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide


http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.


 [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [OT] R on Atlas library

2010-08-11 Thread Matthias Gondan

Hi Allan,

Thank you for your response. Finally I succeeded. These were the 
commands to compile Atlas and R:


Atlas:

../configure -t 4 -Fa alg -fPIC
make
make install

(-fPIC is needed so that some of the objects of lapack.a can be wrapped into a shared lib)


R:
../configure --with-blas="/usr/local/lib/libptf77blas.a 
/usr/local/lib/libatlas.a -lpthread"

make
make install

Using your benchmark example, all CPUs seem to be used. It is now
working at such a high speed that I could not be entirely sure, since "top"
refreshes only every 5 s.
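
A quick way to double-check without watching "top" (a sketch, based on your 
crossprod benchmark): with a threaded ATLAS, the user time should clearly 
exceed the elapsed time.

s <- 4000
a <- matrix(rnorm(s*s), s, s)
system.time(b <- crossprod(a))   # compare 'user' vs. 'elapsed'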

Cheers

Matthias

On 09.08.2010 11:51, Allan Engelhardt wrote:

I don't know about the specific function (a simple, stand-alone
reproducible example is always helpful, see the posting guide for
details), but ATLAS certainly uses all my cores on a test like this:

s <- 7500 # Adjust for your hardware
a <- matrix(rnorm(s*s), ncol = s, nrow = s)
b <- crossprod(a) # Uses all cores here.

My configuration of R with ATLAS on Linux (Fedora) is described at
http://www.cybaea.net/Blogs/Data/Faster-R-through-better-BLAS.html

Maybe your distribution has single-threaded ATLAS and you forgot to
rebuild it with "enable_native_atlas 1" or the equivalent for your
platform?

Allan


On 06/08/10 15:12, Matthias Gondan wrote:

Dear List,

I am aware this is slightly off-topic, but I am sure there are people
who already had the problem and who perhaps solved it.

I am running long-lasting model fits using constrOptim command. At work
there is a linux computer (Quad Core, debian) on which I already have
compiled R and Atlas, in the hope that things will go faster on that
machine.

Atlas offers the possibility to be compiled for multiple cores, I used
that option, but without success. Performance meter ("top") for Linux
shows 25% CPU usage, and there is no loss of performance if I run
4 instances of R doing heavy matrix multiplications. Performance drops
when a 5th instance of R is doing the same job, so I assume my attempt
was not successful.

I am sure I did something wrong. Is there anybody who successfully
runs R with Atlas and all processors? A short description of the
necessary steps would be helpful.

Searching around the internet was not very encouraging. Some people
wrote that it is not so simple to have Atlas fully exploit a multicore
computer.

I hope this is not too unspecific.

Best wishes,

Matthias





--
Dr. rer. nat. Matthias Gondan
Institut für Psychologie
Universität Regensburg
D-93050 Regensburg
Tel. 0941-943-3856
Fax 0941-943-3233
Email: matthias.gon...@psychologie.uni-regensburg.de
http://www.psychologie.uni-r.de/Greenlee/team/gondan/gondan.html

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] random sequences for rnorm and runif

2011-02-03 Thread Matthias Gondan
Dear R experts,

For a fixed seed, the first random number produced by rnorm and runif 
has the same rank within the distribution, which I find useful. The 
following ranks differ, however.

> set.seed(123)
> runif(4)
[1] *0.2875775* 0.7883051 *0.4089769* 0.8830174

> set.seed(123)
> pnorm(rnorm(4))
[1] 0.2875775 0.4089769 0.9404673 0.5281055

I noticed that rnorm seems to 'eat' two seeds of the random
number generator, whereas runif eats only one seed. Is this 
intended behavior or do you think it should be fixed? 

The strange thing is that the 1st/3rd/5th etc number of rnorm 
corresponds to the 1st/2nd/3rd in runif. If two seeds are necessary, 
I would have expected the following correspondence, 2-1, 4-2, 6-3,
etc.
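
A small check of the bookkeeping (if the reasoning above is right, the runif() 
drawn immediately after one rnorm() should equal the third number of the plain 
uniform stream, i.e. 0.4089769):

set.seed(123); rnorm(1); runif(1)   # the runif(1) value should be 0.4089769
set.seed(123); runif(3)             # 0.2875775 0.7883051 0.4089769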

Temporary fix:

> myrnorm = function(n, mean=0, sd=1)
+ {
+ qnorm(runif(n), mean, sd)
+ } # myrnorm
> 
> set.seed(123)
> pnorm(myrnorm(4))
[1] 0.2875775 0.7883051 0.4089769 0.8830174

Best wishes,

Matthias Gondan
-- 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] image problem in 2.13.1

2011-07-13 Thread Matthias Gondan
Dear R people,

On my freshly installed R-2.13.1 (winxp), the following code yields 
unsatisfactory results (irregular grid lines instead of a smooth plane):

image(matrix(1, nrow=100, ncol=100))

This is working fine,

image(matrix(1, nrow=100, ncol=100), useRaster=TRUE)

but then right-click and "save as metafile" causes a segmentation fault.

Everything worked fine on 2.13.0

Best wishes,

Matthias
--

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] What makes R different from other programming languages?

2012-08-20 Thread Matthias Gondan

My vote:

1. Symbolic function arguments:

fn = function(a, b)
{
a/b
}

fn(b=10, a=2)

2. Names for elements of a vector and matrices

v = c(a=1, b=2)
v['a'] = v['a'] * 2

same for matrices

3. about 10,000 user-contributed packages on CRAN

4. weird things like

a = numeric(10)
a[1:10] = 1:2

a
answer: five times 1:2

which guarantee happy debugging

5. and, of course, much built-in statistical stuff


On 20.08.2012 20:02, johannes rara wrote:

My intention is to give a presentation about R programming language
for software developers. I would like to ask, what are the things that
make R different from other programming languages? What are the
specific cases where Java/C#/Python developer might say "Wow, that was
neat!"? What are the things that are easy in R, but very difficult in
other programming languages (like Java)?

Thanks,
-J

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] transform RTs

2012-08-30 Thread Matthias Gondan

you might want to do something like

library(SuppDists)
t = runif(100, 100, 500)  # original RTs (uniform here, just as a stand-in)
t_IG = qinvGauss(ecdf(t)(t) - 0.5/length(t), 1, 16)  # empirical probabilities, shifted down by 0.5/n so the largest stays below 1, mapped to inverse Gaussian quantiles
plot(density(t_IG))

but what is the purpose of it? Usually reaction times are thought to
follow a certain kind of distribution (e.g. an inverse Gaussian 
distribution).


On 29.08.2012 17:54, Katja Böer wrote:

Hello,

I'm trying to transform reaction times which are not normally distributed
to an ex gaussian or an
inverse gaussian distribution, but I don't really know how to use the
exGAUS() function.

Can someone show me a script in which data has been transformed?

Thanks in advance

k

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] arrow edges and lty

2011-11-14 Thread Matthias Gondan
Dear R developers,

I want to draw an arrow in a figure with lty=2. The
lty argument also affects the edge of the arrow, which is 
unfortunate. Feature or bug?

Is there some way to draw an arrow with intact edge, still
with lty=2?

Example code:

plot(1:10)
arrows(4, 5, 6, 7, lty=2)
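
One possible workaround (a sketch): draw the dashed shaft and the arrow head 
separately, so that lty=2 never touches the head.

plot(1:10)
segments(4, 5, 6, 7, lty=2)                  # dashed shaft
arrows(5.9, 6.9, 6, 7, length=0.1, lty=1)    # short solid piece carrying the head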

Best wishes,

Matthias
--

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Fwd: Re: Poisson GLM using non-integer response/predictors?

2011-12-30 Thread Matthias Gondan
Hi,

Use an offset variable if you count occurrences of an event and want to 
account for the observation time:

glm(count ~ predictors + offset(log(observation_time)), family=poisson)
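
A toy sketch of the offset idea with simulated data (all variable names made up):

set.seed(1)
obs_time = runif(50, 1, 10)                         # unequal observation times
tourists = rpois(50, 5)
count    = rpois(50, lambda=obs_time*exp(0.1*tourists))
fit = glm(count ~ tourists + offset(log(obs_time)), family=poisson)
summary(fit)$coefficients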

If you want to compare durations, look at library(survival), ?coxph

If tnoise_sqrt is the square root of tourist noise, your example seems
incorrect, because it is a predictor, not the dependent variable

tnoise_sqrt ~ lengthfeeding_log

Best wishes,

Matthias


On 30.12.2011 16:29, Lucy Dablin wrote:
> Great lists, I always find them useful, thank you to
> everyone who contributes to them.
>
>
> My question is regarding non-integer values from some data I
> collected on parrots when using the poisson GLM. I observed the parrots on a
> daily basis to see if they were affected by tourist presence. My key 
> predictors
> are tourist noise (averaged over a day period so decimal value, square root to
> adjust for skew),  tourist number (the
> number of tourists at a site, square root), and the number of boats passing 
> the
> site in a day (log). These are compared with predictors: total number of birds
> (count data, square root), average time devoted to foraging at site (log), 
> species
> richness (sqrt), and the number of flushes per day. Apart from the last one
> they are all non-integer values. When I run a glm for example:
>
>
> parrots<- glm(tnoise_sqrt ~ lengthfeeding_log, family =
> poisson)
>
> summary(parrots)
>
>
> There are warnings which are "27: In dpois(y, mu, log =
> TRUE) : non-integer x = 1.889822" I was advised to use the offset() function
> however this does not seem to correct the problem and I find the code 
> confusing.
> What GLM approach should I be using for multiple non-integer predictors and
> non-integer responses? Does my GLM approach seem appropriate?
> Thank you for taking the time to consider this.
>
>
>
>   
>   [[alternative HTML version deleted]]
>
>
>
> __
> R-help@r-project.org  mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Fwd: Re: Poisson GLM using non-integer response/predictors?

2011-12-30 Thread Matthias Gondan

Hi Ben,

Thanks for clarifying this. I used a misleading word: "model the observation
time" sounds as if observation time were the dependent variable - which it is
not, of course. Instead, in the scenario described, the parrot counts are
modeled, with observation time entering only as an offset.

Best wishes,

Matthias

On 30.12.2011 20:50, Ben Bolker wrote:

Matthias Gondan (at gmx.de) writes:


Hi,

Use offset variables if you count occurrences of an event and you want to
model the observation time.

glm(count ~ predictors + offset(log(observation_time)), family=poisson)

If you want to compare durations, look at library(survival), ?coxph

If tnoise_sqrt is the square root of tourist noise, your example seems
incorrect, because it is a predictor, not the dependent variable

tnoise_sqrt ~ lengthfeeding_log

Best wishes,

Matthias

On 30.12.2011 16:29, Lucy Dablin wrote:

Great lists, I always find them useful, thank you to
everyone who contributes to them.

My question is regarding non-integer values from some data I
collected on parrots when using the poisson GLM. I observed the
parrots on a daily basis to see if they were affected by tourist
presence. My key predictors are tourist noise (averaged over a day
period so decimal value, square root to adjust for skew), tourist
number (the number of tourists at a site, square root), and the
number of boats passing the site in a day (log). These are
compared with predictors: total number of birds (count data,
square root), average time devoted to foraging at site (log),
species richness (sqrt), and the number of flushes per day. Apart
from the last one they are all non-integer values. When I run a
glm for example:

  Your description sounds like you might already have transformed
your predictors: generally speaking, you don't want to do that
before running a GLM (the variance function incorporated in the
GLM takes care of heteroscedasticity, and the link function takes
care of nonlinearity in the response).

   I suspect you want total number of birds, number of flushes per day,
and species richness to be modeled as Poisson (or negative binomial --
see ?glm.nb in the MASS package).  Species richness *might* be
binomial, or more complicated, if you are drawing from a limited
species pool (e.g. if there are only 5 possible species and you
sometimes see 4 or 5 of them in a day).  Is the total number
of birds really non-integer *before* you square-root transform it?
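
A small simulated sketch of the negative binomial alternative just mentioned
(the data and numbers are invented solely for illustration):

library(MASS)
set.seed(42)
tnoise <- rnorm(60)                                         # untransformed predictor
birds  <- rnbinom(60, mu = exp(1 + 0.3*tnoise), size = 2)   # overdispersed counts
fit_nb <- glm.nb(birds ~ tnoise)
summary(fit_nb)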

Time devoted to foraging at the site is most easily
modeled as log-normal (unless the response includes zeros:
i.e., log-transform as you have already done and use lm),
or possibly Gamma-distributed (although you may want to
use a log link instead of the default inverse link).

  As Matthias said, offsets are used for the specific case of
non-uniform sampling effort (e.g. if you sampled different areas,
or for different lengths of time, every day).

   You may be interested in r-sig-ecol...@r-project.org , which
is an R mailing list specifically devoted to ecological questions.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




[R] Reference for dataset colon (package survival)

2012-01-17 Thread Matthias Gondan
Dear R team, dear Prof. Therneau,

library(survival)
data(colon)
?colon

gives me only a very rudimentary source (only a name). Is there a 
possibility to get a reference to the clinical trial these data
are taken from?

Many thanks in advance. With best wishes,

Matthias Gondan
--

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] line width in legend of interaction.plot

2012-02-22 Thread Matthias Gondan

Dear R developers,

The following command produces an interaction plot with lwd=2.

interaction.plot(c(1, 2, 1, 2), c(1, 1, 2, 2), 1:4, lwd=2)

In the legend, however, lwd seems to be 1, which does not seem
to be intended behavior. Probably the lwd is not correctly forwarded
to legend:

from the interaction.plot source:

legend(xleg, yleg, legend = ylabs, col = col,
       pch = if (type %in% c("p", "b")) pch,
       lty = if (type %in% c("l", "b")) lty,
       bty = leg.bty, bg = leg.bg)        # <- here I would add lwd=
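
Until then, a possible user-level workaround (a sketch) is to suppress the
built-in legend and draw one manually with the desired line width:

x   <- c(1, 2, 1, 2)
grp <- factor(c(1, 1, 2, 2))
y   <- 1:4
interaction.plot(x, grp, y, lwd=2, lty=1:2, col=1, legend=FALSE)
legend("topleft", legend=levels(grp), lty=1:2, lwd=2, title="grp")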


Best wishes,

Matthias Gondan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Small Bug in constrOptim documentation

2008-06-18 Thread Matthias Gondan

Dear list,

I think there is a small bug in the constrOptim documentation (R-2.7.0):

## from optim
fr <- function(x) {   ## Rosenbrock Banana function
x1 <- x[1]
x2 <- x[2]
100 * (x2 - x1 * x1)^2 + (1 - x1)^2
}
grr <- function(x) { ## Gradient of 'fr'
x1 <- x[1]
x2 <- x[2]
c(-400 * x1 * (x2 - x1 * x1) - 2 * (1 - x1),
   200 *  (x2 - x1 * x1))
}

optim(c(-1.2,1), fr, grr)
#Box-constraint, optimum on the boundary
constrOptim(c(-1.2,0.9), fr, grr, ui=rbind(c(-1,0),c(0,-1)),ci=c(-1,-1))

# x<=0.9,  y-x>0.1 <<-- here
constrOptim(c(.5,0), fr, grr, ui=rbind(c(-1,0),c(1,-1)), ci=c(-0.9,0.1))

I think the ui/ci bounds translate to x <= 0.9 and x-y > 0.1, not y-x > 0.1
as the comment says.
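
A quick numeric check of this reading (constrOptim's feasible region is
ui %*% theta - ci >= 0):

ui <- rbind(c(-1, 0), c(1, -1))
ci <- c(-0.9, 0.1)
theta <- c(0.5, 0)          # the starting value from the example
ui %*% theta - ci           # row 1: 0.9 - x >= 0, row 2: (x - y) - 0.1 >= 0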

Could anybody confirm this, please?

Best wishes,

Matthias

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Problem in anova with coxph object

2008-01-08 Thread Matthias Gondan
Dear R users,

I noticed a problem in the anova command when applied on
a single coxph object if there are missing observations in
the data:

This example code was run on R-2.6.1:

 > library(survival)
 > data(colon)
 > colondeath = colon[colon$etype==2, ]
 > m = coxph(Surv(time, status) ~ rx + sex + age + perfor, data=colondeath)
 > m
Call:
coxph(formula = Surv(time, status) ~ rx + sex + age + perfor,
data = colondeath)

   coef exp(coef) se(coef)  z  p
rxLev -0.028895 0.972  0.11037 -0.262 0.7900
rxLev+5FU -0.374286 0.688  0.11885 -3.149 0.0016
sex   -0.000754 0.999  0.09431 -0.008 0.9900
age0.002442 1.002  0.00405  0.603 0.5500
perfor 0.155695 1.168  0.26286  0.592 0.5500

Likelihood ratio test=12.8  on 5 df, p=0.0251  n= 929

 > anova(m, test='Chisq')
Analysis of Deviance Table
 Cox model: response is Surv(time, status)
Terms added sequentially (first to last)

Df  Deviance Resid. Df Resid. Dev P(>|Chi|)
NULL   929 5860.4 
rx   2  12.1   927 5848.2 2.302e-03
sex  1 2.054e-05   926 5848.2   1.0
age  1   0.3   925 5847.9   0.6
perfor   1   0.3   924 5847.6   0.6

Now I include nodes which has some missing data:

 > m = coxph(Surv(time, status) ~ rx + sex + age + perfor + nodes, 
data=colondeath)
 > m
Call:
coxph(formula = Surv(time, status) ~ rx + sex + age + perfor +
nodes, data = colondeath)

  coef exp(coef) se(coef)  z   p
rxLev -0.08245 0.921  0.11168 -0.738 0.46000
rxLev+5FU -0.40310 0.668  0.12054 -3.344 0.00083
sex   -0.02854 0.972  0.09573 -0.298 0.77000
age0.00547 1.005  0.00405  1.350 0.18000
perfor 0.19040 1.210  0.26335  0.723 0.47000
nodes  0.09296 1.097  0.00889 10.460 0.0

Likelihood ratio test=88.3  on 6 df, p=1.11e-16  n=911 (18 observations 
deleted due to missingness)

 > anova(m, test='Chisq')
Analysis of Deviance Table
 Cox model: response is Surv(time, status)
Terms added sequentially (first to last)

Df  Deviance Resid. Df Resid. Dev P(>|Chi|)
NULL   911 5700.6 
rx   2   0.0   909 5848.2   1.0
sex  1 2.054e-05   908 5848.2   1.0
age  1   0.3   907 5847.9   0.6
perfor   1   0.3   906 5847.6   0.6
nodes1 235.3   905 5612.3 4.253e-53

The strange thing is that rx is not significant anymore.

In the documentation for anova.coxph, there is a warning that

> The comparison between two or more models by anova will only be
> valid if they are fitted to the same dataset. This may be a problem if
> there are missing values.
>
However, I inserted a single object to be analyzed sequentially. Is
this a bug in R, or is it covered by the warning?

Best wishes,

Matthias

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem in anova with coxph object

2008-01-08 Thread Matthias Gondan
Peter Dalgaard wrote:
> Matthias Gondan wrote:
>   
>> Dear R users,
>>
>> I noticed a problem in the anova command when applied on
>> a single coxph object if there are missing observations in
>> the data:
>> 
...
>> In the documentation for anova.coxph, there is a warning that
>>  
>> 
>>> The comparison between two or more models by anova will only be
>>> valid if they are fitted to the same dataset. This may be a problem if
>>> there are missing values.
>>>
>>> 
>>>   
>> However, I inserted a single object to be analyzed sequentially. Is
>> this a bug in R, or is it covered by the warning?
>>   
>> 
> Notice that you also lose the 18 observations in the comparison of  .~rx
> with the empty model.
>
> This is standard, losing observations on the way through an anova table
> leads to madness.
>
> What happens if you do something like
>
> coxph(Surv(time, status) ~ rx, 
> data=colondeath, subset=complete.cases(nodes))
>   

Then it is working fine, see here:

 > m = coxph(Surv(time, status) ~ rx + sex + age + perfor + nodes, 
data=colondeath, subset=complete.cases(nodes))
 > m
Call:
coxph(formula = Surv(time, status) ~ rx + sex + age + perfor +
nodes, data = colondeath, subset = complete.cases(nodes))


  coef exp(coef) se(coef)  z   p
rxLev -0.08245 0.921  0.11168 -0.738 0.46000
rxLev+5FU -0.40310 0.668  0.12054 -3.344 0.00083
sex   -0.02854 0.972  0.09573 -0.298 0.77000
age0.00547 1.005  0.00405  1.350 0.18000
perfor 0.19040 1.210  0.26335  0.723 0.47000
nodes  0.09296 1.097  0.00889 10.460 0.0

Likelihood ratio test=88.3  on 6 df, p=1.11e-16  n= 911
 > anova(m, test='Chisq')
Analysis of Deviance Table
 Cox model: response is Surv(time, status)
Terms added sequentially (first to last)

Df  Deviance Resid. Df Resid. Dev P(>|Chi|)
NULL   911 5700.6 
rx   2  12.6   909 5688.0 1.866e-03
sex  1 2.523e-03   908 5688.0   1.0
age  1   0.4   907 5687.6   0.5
perfor   1   0.4   906 5687.2   0.5
nodes1  74.9   905 5612.3 4.898e-18

Thanks!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Anova for stratified Cox regression

2008-01-15 Thread Matthias Gondan
Dear List,

I have tried a stratified Cox Regression, it is working fine, except for
the "Anova"-Tests:

Here the commands (should work out of the box):

library(survival)
d = colon[colon$etype==2, ]
m = coxph(Surv(time, status) ~ strata(sex) + rx, data=d)
summary(m)
# Printout ok
anova(m, test='Chisq')

This is the output of the anova command:

> Analysis of Deviance Table
>  Cox model: response is Surv(time, status)
> Terms added sequentially (first to last)
>
>             Df Deviance Resid. Df Resid. Dev P(>|Chi|)
> NULL                          929     5233.5
> strata(sex)  0                929
> rx           2                927     5221.2
> Warning message:
> In is.na(coef(fit)) :
>   is.na() applied to non-(list or vector) of type 'NULL'

It should be possible to do Chi-Square-Tests in a stratified analysis, 
right?
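
In the meantime, the overall likelihood ratio chi² for rx within strata (the
same statistic that coxph itself reports) can be pulled out of the fit by
hand; a sketch:

lrt = 2 * diff(m$loglik)                  # fitted vs. stratified null partial log-likelihood
pchisq(lrt, df=length(coef(m)), lower.tail=FALSE)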

Best wishes,

Matthias

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Two-way non-parametric ANOVA?

2008-01-23 Thread Matthias Gondan
Hi Diana,

Look at Edgar Brunner's Homepage

http://www.ams.med.uni-goettingen.de/en/sta/e.brunner.html

and here:

http://www.ams.med.uni-goettingen.de/en/sof/index.html

Unfortunately, the R script for a 2x2 design with two groups and two
timepoints (f1.ld.f1.r) seems to be broken. Uncommenting the branch to
the special case "case2x2" in the script should do well, and more than 2
groups or timepoints seem to be working, too.

The basic problem with non-parametric (i.e. ordinal) Anova with more
than one factor is that the covariance matrix is not homoscedastic
anymore after a rank transformation, and autoregressive or compound
symmetry covariance structures are destroyed.

Best wishes,

Matthias

Diana Perdiguero wrote:
> We need a two-way non-parametric ANOVA in order to analysis properly some
> ecological data, do you know any reference in R? or how to do it?
> Thank you very much
> All the best
>
> diana
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] [OT?] Question on bootstrapping

2008-02-02 Thread Matthias Gondan
Dear list,

This is off-topic, but perhaps there is an expert out there who can help 
me in
a bootstrapping test.

I have two samples A, B from, say, a normal distribution. A third sample R
is the vector of pairwise minima of  the same random variables:

A = rnorm(100)
B = rnorm(100)

A2 = rnorm(100)
B2 = rnorm(100)
R = pmin(A2, B2)

For some purposes, I want to do some bootstrapping. From A and B, this
should not be a problem:

sa = sample(A, replace=T)
sb = sample(B, replace=T)

But for the minima distribution, I have the "feeling" (sorry) that the 
bootstrapped
values are systematically too high:

a2 = sample(A, replace=T)
b2 = sample(B, replace=T)
sr = pmin(a2, b2)
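
One way to probe this feeling by simulation (just a sketch of the check, no
claim about the general answer):

set.seed(1)
A = rnorm(100); B = rnorm(100)
boot_means  = replicate(2000, mean(pmin(sample(A, replace=T), sample(B, replace=T))))
fresh_means = replicate(2000, mean(pmin(rnorm(100), rnorm(100))))
c(bootstrap = mean(boot_means), fresh = mean(fresh_means))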

Can anybody give me a hint whether my feeling is correct, and perhaps 
indicate
some literature in which such issues are handled?

Best wishes,

Matthias

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [OT?] Question on bootstrapping

2008-02-03 Thread Matthias Gondan
(little correction below)

> Dear list,
>
> This is off-topic, but perhaps there is an expert out there who can help 
> me in
> a bootstrapping test.
>
> I have two samples A, B from, say, a normal distribution. A third sample R
> is the vector of pairwise minima of  the same random variables:
>
> A = rnorm(100)
> B = rnorm(100)
>
> A2 = rnorm(100)
> B2 = rnorm(100)
> R = pmin(A2, B2)
>
> For some purposes, I want to do some bootstrapping. From A and B, this
> should not be a problem:
>
> sa = sample(A, replace=T)
> sb = sample(B, replace=T)
>
> But for the minima distribution, I have the "feeling" (sorry) that the 
> bootstrapped
> values are systematically too high:
>   

too high = compared to the random variable R, as defined above

> a2 = sample(A, replace=T)
> b2 = sample(B, replace=T)
> sr = pmin(a2, b2)
>
> Can anybody give me a hint whether my feeling is correct, and perhaps 
> indicate
> some literature in which such issues are handled?
>
> Best wishes,
>
> Matthias
>
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Package for sample size calculation

2008-02-13 Thread Matthias Gondan
Dear list,

Is anyone aware of a library for sample size calculation in R, similar
to NQuery? I have to give a course in this area, and I would like to
enable the students to play around with this.

Best wishes,

Matthias

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Cox model

2008-02-13 Thread Matthias Gondan
Hi Eleni,

The problem with this approach is easily explained: under the null
hypothesis, the p values of a significance test are random variables,
uniformly distributed on the interval [0, 1]. It is easily seen that the
lowest of these p values is not any 'better' than the highest of them.
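
A small illustration of this point (a sketch with pure-noise predictors,
invented solely for this purpose):

library(survival)
set.seed(1)
n = 80; time = rexp(n); status = rbinom(n, 1, 0.7)
pvals = replicate(200,
          summary(coxph(Surv(time, status) ~ rnorm(n)))$coefficients[, "Pr(>|z|)"])
hist(pvals)    # roughly uniform on [0, 1]; the smallest one is not 'special'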

Best wishes,

Matthias

Eleni Christodoulou wrote:
> Hmm...I see. I think I will give a try to the univariate analysis
> nonetheless...I intend to catch the p-values for each gene and select the
> most significant from these...I have seen it in several papers.
>
> Best Regards,
> Eleni
>
> On Feb 13, 2008 2:59 PM, Terry Therneau <[EMAIL PROTECTED]> wrote:
>
>   
>>  What you appear to want are all of the univariate models.  You can get
>> this
>> with a loop (and patience - it won't be fast).
>>
>> ngene <- ncol(genes)
>> coefmat <- matrix(0., nrow=ngene, ncol=2)
>> for (i in 1:ngene) {
>>tempfit <- coxph(Surv(time, relapse) ~ genes[,i])
>>coefmat[i,] <- c(tempfit$coef, sqrt(tempfit$var))
>>}
>>
>>
>>  However, the fact that R can do this for you does not mean it is a good
>> idea.
>> In fact, doing all of the univariate tests for a microarray has been shown
>> by
>> many people to be a very bad idea.  There are several approaches to deal
>> with
>> the key issues, which you should research before going forward.
>>
>>  Terry Therneau
>>
>>
>> 
>
>   [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Package for sample size calculation

2008-02-13 Thread Matthias Gondan
Thanks a lot!

I now have three packages: asypow, pwr, and
http://www.cs.uiowa.edu/~rlenth/Power/

This should solve most of the problems we will encounter in this 
introductory course.
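
As a quick first example for the course with pwr (the effect size d = 0.5 is
just Cohen's conventional 'medium' value, chosen for illustration):

library(pwr)
pwr.t.test(d=0.5, power=0.8, sig.level=0.05, type="two.sample")   # n per group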

Best wishes,

Matthias

Mitchell Maltenfort wrote:
> the library is called "pwr" and is based on the Cohen book.
>
>
> On Feb 13, 2008 4:52 AM, Matthias Gondan <[EMAIL PROTECTED]> wrote:
>   
>> Dear list,
>>
>> Is anyone aware of a library for sample size calculation in R, similar
>> to NQuery? I have to give a course in this area, and I would like to
>> enable the students to play around with this.
>>
>> Best wishes,
>>
>> Matthias
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>> 
>
>
>
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Kaplan Meier function

2008-02-14 Thread Matthias Gondan
Hi Eleni, hi list,

here is a small sample program; the library is "survival", which is included
in the standard R distribution.

> data(colon)
> s = survfit(Surv(time, status) ~ rx, data=colon)
> plot(s)
> plot(s, col=1:3)

By the way: Does anyone know a neat way to indicate the number of
patients at risk in the plot? Like here:

http://www.drugs.com/pro/images/8c3bb783-1c3c-4d75-a502-2f7ee3632f41/inspra-image02.jpg

Best wishes,

Matthias

Eleni Christodoulou wrote:
> Hi all,
>
> I am trying to draw a Kaplan-Meier curve and I found online that Kaplan -
> Meier estimates are computed with a function called km in the event package.
> Is there an update for that because when I choose to download packages in
> R, there is no package called event, even though I have selected all the
> repositories.
>
> Thanks in advance,
> Eleni
>
>   [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Kaplan Meier function

2008-02-14 Thread Matthias Gondan
Frank E Harrell Jr wrote:
> Matthias Gondan wrote:
>>> data(colon)
>>> s = survfit(Surv(time, status) ~ rx, data=colon)
>>> plot(s)
>>> plot(s, col=1:3)
>>
>> By the way: Does anyone know a neat way to indicate the number of
>> patients at risk in the plot?
> That's one of many options in the survplot function in the Design 
> package, which uses the survival package.

Ah, thanks! Nice.

data(colon)
s = survfit(Surv(time, status) ~ rx, data=colon)
survplot(s, n.risk=TRUE, conf='none', time.inc=500)

Is there an option for drawing small vertical lines if a patient has 
been censored?
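
For the plain plot.survfit call further up, mark.time appears to be the
relevant argument (a quick sketch):

plot(s, col=1:3, mark.time=TRUE)    # small tick marks at the censoring times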

Best wishes,

Matthias

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] A stats question -- about survival analysis and censoring

2008-03-10 Thread Matthias Gondan
Hi Geoff,

I think the answer to such a problem (overall survival vs. disease-free
survival) depends on the regulatory environment: in a phase III clinical
trial, for example, OS would be used, whereas in an equivalence study,
DFS would be used.

Best,

Matthias


Geoff Russell wrote:
> Dear UseRs,
>
> Suppose I have data regarding smoking habits of a prospective cohort and wish
> to determine the risk ratio of colorectal cancer in the smokers compared to
> the non-smokers.  What do I do at the end of the study with people who die
> of heart disease? Can I just censor them exactly the same as people who become
> uncontactable or who die in a plane crash?  If not, why not?
>
> I'm thinking that heart disease isn't independent of smoking even though
> a death from heart disease is probably uninformative about colorectal
> cancer risk. Hence
> I suspect simply censoring these deaths will introduce a bias, but I don't 
> know
> how to correct for it.
>
> Many thanks,
>
> Geoff Russell -- an interested student
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Statistical question: one-sample binomial test for clustered data

2008-11-25 Thread Matthias Gondan

Dear list,

I hope the topic is of sufficient interest, because it is not
R-related. I have N=100 yes/no-responses from a psychophysics
paradigm (say Y Yes and 100-Y No-Responses). I want to see
whether these yes-no-responses are in line with a model
predicting a certain proportion p of yes-responses. Standard
procedure would be a one-sample binomial test for the observed
proportion,

chi²(1 df) = (Y-Np)²/(Np) + [(100-Y)-N(1-p)]²/[N(1-p)]

Actually, this is the approximate chi²-test, but the sample
size seems to be reasonably high for an asymptotic test.

The problem is that the experiment took quite a while, and
the 100 responses are grouped into 20 blocks of 5 responses
each. The responses within the blocks are clustered, ICC is
about 0.13 or so.

Can anyone point me to some literature explaining a one-sample
binomial test (or chi² test) for correlated data? Most of the
literature I found starts with more advanced stuff, e.g.
2x2 cross-tabulated data.
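
One simple option (a sketch, not a recommendation) would be a design-effect
(Rao-Scott-type) correction of the chi² statistic, with hypothetical numbers
below (Y and p are made up):

N = 100; Y = 62; p = 0.5              # hypothetical counts and model prediction
m = 5; icc = 0.13                     # cluster size and intra-class correlation
deff = 1 + (m - 1) * icc              # design effect
chisq_raw = (Y - N*p)^2/(N*p) + ((N - Y) - N*(1 - p))^2/(N*(1 - p))
chisq_adj = chisq_raw / deff
pchisq(chisq_adj, df=1, lower.tail=FALSE)   # adjusted p value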

Best wishes,

Matthias

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Function arguments sometimes ignored

2007-11-08 Thread Matthias Gondan
Dear List,

I encountered a strange problem when trying to print a
path diagram:

 > path.diagram(s,
+ ingore.double=FALSE,
+ edge.labels='values')
digraph "s" {
   rankdir=LR;
   size="8,8";
   node [fontname="Helvetica" fontsize=14 shape=box];
   edge [fontname="Helvetica" fontsize=10];
   center=1;
   "TIME" -> "SALARY" [label="1096.03"];
   "PUBS" -> "SALARY" [label="133"];
}
 > path.diagram(s, ignore.double=FALSE, edge.labels='values')
digraph "s" {
   rankdir=LR;
   size="8,8";
   node [fontname="Helvetica" fontsize=14 shape=box];
   edge [fontname="Helvetica" fontsize=10];
   center=1;
   "TIME" -> "SALARY" [label="1096.03"];
   "PUBS" -> "SALARY" [label="133"];
   "TIME" -> "PUBS" [label="38.97" dir=both];
   "TIME" -> "TIME" [label="18.3" dir=both];
   "PUBS" -> "PUBS" [label="196.12" dir=both];
   "SALARY" -> "SALARY" [label="57393086.81" dir=both];
}


The ignore.double option seems to be ignored when I split the
command across multiple lines. A bug in R? (Windows binary, 2.6.0)
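
For what it is worth: the first call spells the argument "ingore.double". If
path.diagram has a ... argument (which the absence of an 'unused argument'
error suggests), a misspelled name is silently swallowed there; a generic
sketch of the effect, not specific to path.diagram:

f = function(x, ignore.double=TRUE, ...) ignore.double
f(1, ignore.double=FALSE)    # FALSE, as requested
f(1, ingore.double=FALSE)    # TRUE - the misspelled argument disappears into ...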

Best wishes,

Matthias Gondan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.