Re: [R] Coefficients paths - comparison of ridge, lasso and elastic net regression

2013-06-09 Thread Colstat
Hi beginner,
Do you have reproducible data?  I think your question is more related to
statistical learning theory than to R.  You may want to watch Prof. Hastie's
webinar: http://www.stanford.edu/~hastie/lectures.htm
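Without your data, one can still sketch the comparison on simulated data. The grouping effect only shows up when predictors are correlated, which may be why your lasso and elastic net paths look alike. A minimal sketch (the simulated variables here are made up for illustration):

```r
library(glmnet)
set.seed(1)
n <- 100
X <- matrix(rnorm(n * 20), n, 20)
X[, 2] <- X[, 1] + rnorm(n, sd = 0.1)   # make two predictors highly correlated
Y <- X[, 1] + X[, 2] + rnorm(n)

par(mfrow = c(1, 3))
for (a in c(0, 1, 0.5)) {               # ridge, lasso, elastic net
  fit <- glmnet(X, Y, alpha = a)
  plot(fit, xvar = "lambda", label = FALSE)
}
```

With alpha = 0.5 the two correlated predictors should enter together with similar coefficients, while the lasso tends to pick one of them and drop the other. If your 21 predictors are nearly uncorrelated, the elastic net path will look much like the lasso path.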


On Wed, Jun 5, 2013 at 10:22 AM, beginner  wrote:

> I would like to compare models selected with ridge, lasso and elastic net.
> The figure below shows the coefficient paths for all three methods: ridge
> (Fig. A, alpha=0), lasso (Fig. B, alpha=1) and elastic net (Fig. C,
> alpha=0.5). The optimal solution depends on the selected value of lambda,
> which is chosen based on cross-validation.
>
> [figure: coefficient paths for ridge (A), lasso (B) and elastic net (C);
> image not preserved in the archive]
>
> When looking at these plots, I would expect the elastic net (Fig. C) to
> exhibit a grouping effect. However, it is not apparent in the presented
> case. The coefficient paths for lasso and elastic net are very similar.
> What could be the reason for this? Is it just a coding mistake? I used
> the following code in R:
>
>
> library(glmnet)
> X <- as.matrix(mydata[, 2:22])   # predictors
> Y <- mydata[, 23]                # response
> par(mfrow = c(1, 3))
> ans1 <- cv.glmnet(X, Y, alpha = 0)    # ridge
> plot(ans1$glmnet.fit, "lambda", label = FALSE)
> text(6, 0.4, "A", cex = 1.8, font = 1)
> ans2 <- cv.glmnet(X, Y, alpha = 1)    # lasso
> plot(ans2$glmnet.fit, "lambda", label = FALSE)
> text(-0.8, 0.48, "B", cex = 1.8, font = 1)
> ans3 <- cv.glmnet(X, Y, alpha = 0.5)  # elastic net
> plot(ans3$glmnet.fit, "lambda", label = FALSE)
> text(0, 0.62, "C", cex = 1.8, font = 1)
>
>
> The code used to plot the elastic net coefficient paths is exactly the same
> as for ridge and lasso; the only difference is the value of alpha. The alpha
> parameter for the elastic net was selected based on the lowest cross-validated
> MSE (mean squared error) over the corresponding lambda values.
>
> Thank you for your help!
>
>
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Coefficients-paths-comparison-of-ridge-lasso-and-elastic-net-regression-tp4668722.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



[R] In optim() function, second parameter in par() missing

2011-09-05 Thread colstat
Hi,
This is my first time using optim(); can someone please tell me what I am
doing wrong?  The error looks like this:

Error in .Internal(pnorm(q, mean, sd, lower.tail, log.p)) : 
  'sd' is missing


A minimal example that reproduces the error:
dat = c(20, 19, 9, 8, 7, 4, 3, 2)
dat_mu=mean(dat)
dat_s=sd(dat)

max.func = function(dat, mu, sd) {
pnorm(dat, mu, sd)
}

optim(fn=max.func, dat=dat, par=c(mu=dat_mu, s=dat_s))

I get the 'sd' is missing error.  If I write par=c(s=dat_s, mu=dat_mu), then it
tells me 'mu' is missing.  Can someone please help?

Thanks!

Colin


--
View this message in context: 
http://r.789695.n4.nabble.com/In-optim-function-second-parameter-in-par-missing-tp3791391p3791391.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] In optim() function, second parameter in par() missing

2011-09-05 Thread colstat
Thanks, David.  Your suggestion worked very well.  The par argument of optim()
takes a single vector, so I combined the parameters into one vector.  Now I
will run it with my actual code and see what happens.
Colin
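For the archive, here is a sketch of what that fix looks like (my reconstruction, assuming the goal was maximum likelihood for a normal model; this is not David's exact code):

```r
dat <- c(20, 19, 9, 8, 7, 4, 3, 2)

# optim() hands fn ONE vector as its first argument, so mu and sd must be
# elements of that vector rather than separate named arguments:
neg_loglik <- function(par, x) {
  -sum(dnorm(x, mean = par[1], sd = par[2], log = TRUE))
}

fit <- optim(par = c(mu = mean(dat), sd = sd(dat)), fn = neg_loglik, x = dat)
fit$par   # ML estimates of mu and sd
```

Minimizing the negative log-likelihood is equivalent to maximizing the likelihood, which sidesteps optim()'s minimize-by-default behavior.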

--
View this message in context: 
http://r.789695.n4.nabble.com/In-optim-function-parameter-in-par-missing-tp3791391p3792356.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] In optim() function, second parameter in par() missing

2011-09-15 Thread colstat
Hi, David
I happened to see this http://www.unc.edu/~monogan/computing/r/MLE_in_R.pdf
And they wrote optim(c(0,1),normal.lik1,y=y,method="BFGS")

But it doesn't actually work.  As you mentioned, the parameters have to go
in as a single vector of length 2; is that the reason why?

R gives this error:
> optim(par=parameters, normal.lik, x, method="BFGS")
Error in fn(par, ...) : argument "parameters" is missing, with no default
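One likely source of confusion: in that call the bare `x` is matched positionally to optim()'s `gr` (gradient) argument, not passed on to the likelihood function. A version of the pattern from that PDF with everything named explicitly (normal.lik below is my own stand-in, not code from the PDF; parameterizing by log(sigma) keeps sigma positive during BFGS steps):

```r
# Negative log-likelihood of a normal model; theta = c(mu, log_sigma)
normal.lik <- function(theta, y) {
  -sum(dnorm(y, mean = theta[1], sd = exp(theta[2]), log = TRUE))
}

set.seed(42)
y <- rnorm(200, mean = 2, sd = 3)
fit <- optim(par = c(0, 0), fn = normal.lik, y = y, method = "BFGS")
c(mu = fit$par[1], sigma = exp(fit$par[2]))
```

Naming fn= and y= means nothing lands in the wrong argument slot.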


--
View this message in context: 
http://r.789695.n4.nabble.com/In-optim-function-parameter-in-par-missing-tp3791391p3816976.html
Sent from the R help mailing list archive at Nabble.com.



[R] Data analysis: normal approximation for binomial

2011-11-19 Thread Colstat
Dear R experts,

I am trying to analyze data from an article, the data looks like this

Patient Age Sex Aura preCSM preFreq preIntensity postFreq postIntensity
postOutcome
1 47 F A 4 6 9 2 8 SD
2 40 F A/N 5 8 9 0 0 E
3 49 M N 5 8 9 2 6 SD
4 40 F A 5 3 10 0 0 E
5 42 F N 5 4 9 0 0 E
6 35 F N 5 8 9 12 7 NR
7 38 F A 5 NA 10 2 9 SD
8 44 M A 4 4 10 0 0 E
9 47 M A 4 5 8 2 7 SD
10 53 F A 5 3 10 0 0 E
11 41 F N 5 6 7 0 0 E
12 49 F A 4 6 8 0 0 E
13 48 F A 5 4 8 0 0 E
14 63 M N 4 6 9 15 9 NR
15 58 M N 5 9 7 2 8 SD
16 53 F A 4 3 9 0 0 E
17 47 F N 5 4 8 1 4 SD
18 34 F A NA  5 9 0 0 E
19 53 F N 5 4 9 5 7 NR
20 45 F N 5 5 8 5 4 SD
21 30 F A 5 3 8 0 0 E
22 29 F A 4 5 9 0 0 E
23 49 F N 5 9 10 0 0 E
24 24 F A 5 5 9 0 0 E
25 63 F N 4 19 7 10 7 NR
26 62 F A 5 8 9 11 9 NR
27 44 F A 5 3 10 0 0 E
28 38 F N 4 8 10 1 3 SD
29 38 F N 5 3 10 0 0 E

How do I do a binomial-distribution z statistic with continuity
correction, i.e., the normal approximation?
Could anyone give me some suggestions on what I (or R) can do with these data?
I have tried a histogram; maybe a t-test? Or even lattice?  What else can
I (or R) do?
Help please, and thanks so much.



Re: [R] Data analysis: normal approximation for binomial

2011-11-20 Thread Colstat
Hey, Joshua
Thanks so much for your quick response.  The examples you produced are
very good; I'm pretty impressed by the graphs.  When I ran the last line, I
hit an error, so I ran what's inside summary(), and it gave me

Error: could not find function "lmer"

Something with the package "lme4"?
Colin
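Almost certainly, yes: lmer() lives in the lme4 package, and "could not find function" usually means the package is not installed or not attached in the session. A sketch of the usual fix:

```r
# one-time install, then attach in each session before calling lmer()
if (!requireNamespace("lme4", quietly = TRUE)) install.packages("lme4")
library(lme4)
```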


On Sun, Nov 20, 2011 at 1:00 AM, Joshua Wiley wrote:

> Hi Colin,
>
> I have never heard of a binomial distribution z statistic with (or
> without for that matter) a continuity correction, but I am not a
> statistician.  Others may have some ideas there.  As for other ways
> to analyze the data, I skimmed through the article, brought in the
> data, and played around with some different analyses and graphs.  I
> attached a file headache.txt with all the R script (including the data
> in an R-ish format).  It is really a script file (i.e., .R), but for the
> listserv's sake I saved it as a .txt.  There are quite a few different
> things I tried in there so hopefully it gives you some ideas.
> Regardless of the analysis type used and whether one considers
> proportion that "significantly improved" or the raw frequency or
> intensity scores, I would say that concluding the treatment was
> effective is a good conclusion.  The only real concern could be that
> people would naturally get better on their own (a control group would
> be needed to bolster the causal inference drawn from a pre/post
> measurement).  However, given at least what I know about migraines, it
> is often a fairly chronic condition so over a relatively short time
> period, it seems implausible to conclude that as many people would be
> improving as this study reported.
>
> Cheers,
>
> Josh
>
> On Sat, Nov 19, 2011 at 7:43 PM, Colstat  wrote:
> > hey, Joshua
> > I was reading this paper, in attachment, and reproducing the results.
> >
> > I was really confused when he said in the paper "The results were then
> > statistically analyzed using binomial distribution z statistics with
> > continuity correction."  The data is binomial?  To me, this is a paired
> > t-test.
> >
> > What command should I use to get those results (the first paragraph in
> > Results section)?  Basically, it's a pre and post treatment problem.
> >
> > What other graphical analysis do you think is appropriate? reshape
> package?
> > lattice package, namely conditional graph?
> >
> > I know this might be too much, but I do really appreciate it if you do
> take
> > a look at it.
> >
> > Thanks,
> > Colin
> >
> >
> > On Sat, Nov 19, 2011 at 10:15 PM, Joshua Wiley 
> > wrote:
> >>
> >> Hi,
> >>
> >> I am not clear what your goal is.  There is a variety of data there.
> >> You could look at t-test differences in preIntensity broken down by
> >> sex, you could use regression looking at postIntensity controlling for
> >> preIntensity and explained by age, you could
> >>
> >> Why are you analyzing data from an article?  What did the article do?
> >> What you mention---some sort of z statistic (what exactly it was of,
> >> and how it should be calculated, did not seem clear even to
> >> you), histogram, t-test, lattice---are all very different things that
> >> help answer different questions and show different things, and one of
> >> them is a piece of software.
> >>
> >> Without a clearer question and goal, my best advice is here are a
> >> number of different functions some of which may be useful to you:
> >>
> >> ls(pos = "package:stats")
> >>
> >> Cheers,
> >>
> >> Josh
> >>
> >> On Sat, Nov 19, 2011 at 3:01 PM, Colstat  wrote:
> >> > Dear R experts,
> >> >
> >> > I am trying to analyze data from an article, the data looks like this
> >> >
> >> > Patient Age Sex Aura preCSM preFreq preIntensity postFreq
> postIntensity
> >> > postOutcome
> >> > 1 47 F A 4 6 9 2 8 SD
> >> > 2 40 F A/N 5 8 9 0 0 E
> >> > 3 49 M N 5 8 9 2 6 SD
> >> > 4 40 F A 5 3 10 0 0 E
> >> > 5 42 F N 5 4 9 0 0 E
> >> > 6 35 F N 5 8 9 12 7 NR
> >> > 7 38 F A 5 NA 10 2 9 SD
> >> > 8 44 M A 4 4 10 0 0 E
> >> > 9 47 M A 4 5 8 2 7 SD
> >> > 10 53 F A 5 3 10 0 0 E
> >> > 11 41 F N 5 6 7 0 0 E
> >> > 12 49 F A 4 6 8 0 0 E
> >> > 13 48 F A 5 4 8 0 0 E
> >> > 14 63 M N 4 6 9 15 9 NR
> >> > 15 58 M N 5 9 7 2 8 SD
> >> [remainder of quoted message truncated in the archive]

Re: [R] Data analysis: normal approximation for binomial

2011-11-20 Thread Colstat
Hey, John

I like the explicit formula they put in there.  I looked around last night
and found this
http://www.stat.yale.edu/Courses/1997-98/101/binom.htm

which is basically the normal approximation to the binomial; I thought that
was what the author was trying to get at?

Colin
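For what it's worth, that calculation is short in R. A sketch (the counts below are illustrative, not taken from the paper): for x successes out of n under null proportion p0, the continuity-corrected upper-tail z is (x - 0.5 - n*p0) / sqrt(n*p0*(1 - p0)).

```r
x <- 20; n <- 29; p0 <- 0.5               # hypothetical counts
z <- (x - 0.5 - n * p0) / sqrt(n * p0 * (1 - p0))
p_approx <- pnorm(z, lower.tail = FALSE)  # normal approximation
p_exact  <- binom.test(x, n, p0, alternative = "greater")$p.value
c(z = z, approx = p_approx, exact = p_exact)
```

Comparing against binom.test() shows how good (or bad) the approximation is at this sample size.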

On Sun, Nov 20, 2011 at 8:49 AM, John Kane  wrote:

> Hi Colin,
>
> I'm no statistician and it's been a very long time, but IIRC a t-test is a
> modified version of a z-test that is used on small sample sizes.  (I can
> hear some of our statisticians screaming in the background as I type.)
>
> In any case, I think a Z distribution is discrete and a standard normal is
> not, so a user can use Yates' continuity correction to interpolate values for
> the normal between the discrete z-values.  Or something like this.
>
> I have only encountered it once in a Psych stats course taught by an
> animal geneticist who seemed to think it was important. To be honest, it
> looked pretty trivial for the type of data I'd be likely to see.
>
> I cannot remember ever seeing a continuity correction used in a published
> paper--for that matter I have trouble remembering a z-test.
>
> If you want more information on the subject I found a very tiny bit of
> info at
> http://books.google.ca/books?id=SiJ2UB3dv9UC&pg=PA139&lpg=PA139&dq=z-test+with+continuity+correction&source=bl&ots=0vMTCUZWXx&sig=bfCPx0vynGjA0tHLRAf6B42x0mM&hl=en&ei=nQHJTo7LPIrf0gHxs6Aq&sa=X&oi=book_result&ct=result&resnum=2&ved=0CC0Q6AEwAQ#v=onepage&q=z-test%20with%20continuity%20correction&f=false
>
> A print source that, IIRC, has a discussion of this is Hayes, W. (1981).
> Statistics, 3rd ed. Holt, Rinehart and Winston.
>
> Have fun
>
> --- On Sat, 11/19/11, Colstat  wrote:
>
> > From: Colstat 
> > Subject: [R] Data analysis: normal approximation for binomial
> > To: r-help@r-project.org
> > Received: Saturday, November 19, 2011, 6:01 PM
> > Dear R experts,
> >
> > I am trying to analyze data from an article, the data looks
> > like this
> >
> > Patient Age Sex Aura preCSM preFreq preIntensity postFreq
> > postIntensity
> > postOutcome
> > 1 47 F A 4 6 9 2 8 SD
> > 2 40 F A/N 5 8 9 0 0 E
> > 3 49 M N 5 8 9 2 6 SD
> > 4 40 F A 5 3 10 0 0 E
> > 5 42 F N 5 4 9 0 0 E
> > 6 35 F N 5 8 9 12 7 NR
> > 7 38 F A 5 NA 10 2 9 SD
> > 8 44 M A 4 4 10 0 0 E
> > 9 47 M A 4 5 8 2 7 SD
> > 10 53 F A 5 3 10 0 0 E
> > 11 41 F N 5 6 7 0 0 E
> > 12 49 F A 4 6 8 0 0 E
> > 13 48 F A 5 4 8 0 0 E
> > 14 63 M N 4 6 9 15 9 NR
> > 15 58 M N 5 9 7 2 8 SD
> > 16 53 F A 4 3 9 0 0 E
> > 17 47 F N 5 4 8 1 4 SD
> > 18 34 F A NA  5 9 0 0 E
> > 19 53 F N 5 4 9 5 7 NR
> > 20 45 F N 5 5 8 5 4 SD
> > 21 30 F A 5 3 8 0 0 E
> > 22 29 F A 4 5 9 0 0 E
> > 23 49 F N 5 9 10 0 0 E
> > 24 24 F A 5 5 9 0 0 E
> > 25 63 F N 4 19 7 10 7 NR
> > 26 62 F A 5 8 9 11 9 NR
> > 27 44 F A 5 3 10 0 0 E
> > 28 38 F N 4 8 10 1 3 SD
> > 29 38 F N 5 3 10 0 0 E
> >
> > How do I do a binomial distribution z statistics with
> > continuity
> > correction? basically normal approximation.
> > Could anyone give me some suggestions what I (or R) can do
> > with these data?
> > I have tried tried histogram, maybe t-test? or even
> > lattice?  what else can
> > I(or can R) do?
> > help please, thanks so much.
> >
> > [[alternative HTML version deleted]]
> >
>



Re: [R] Should data for the linear mixed model analysis meet the three assumptions of ANOVA?

2011-12-18 Thread Colstat
I am working on similar stuff.  I use aov() and didn't use any extra package.
Here's what I've been looking at:
http://www.gardenersown.co.uk/Education/Lectures/R/anova.htm
http://www.personality-project.org/r/r.anova.html
HTH
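Note that the assumption checks are done on the fitted model's residuals, not on the raw data. A sketch with aov(), using a built-in dataset as a stand-in for your multi-location data:

```r
fit <- aov(count ~ spray, data = InsectSprays)     # built-in example data
res <- residuals(fit)
shapiro.test(res)                                  # normality of residuals
bartlett.test(count ~ spray, data = InsectSprays)  # homogeneity of variance
plot(fit, which = 1)                               # residuals vs fitted values
```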
On Sun, Dec 18, 2011 at 11:20 AM, David Winsemius wrote:

>
> On Dec 17, 2011, at 9:42 PM, Junli wrote:
>
>  Hi All,
>>
>> I am doing linear mixed model analysis for my multi-location
>> experiment using R package "lme4". I just wonder whether I should
>> check my data first to see whether they meet the three assumptions of
>> ANOVA, that is, independence,
>>
>
> Independence is not part of the data, but rather is a property that arises
> from the design and conduct of the study.
>
>
>  normality and homogeneity. I saw a lot
>> of examples and the manual of lme4, but no one did data check first.
>>
>
> Perhaps because they know more about regression than you do?
>
>
>  In my experiment, the assumption of homogeneity usually cannot be met.
>> I do not know whether it will affect the result a lot or not.
>>
>
> I'm guessing that you are talking about homogeneity of variances, or
> homoscedasticity. I'm wondering how you propose to test that assumption
> _before_ you construct a model? You should refer back to your text book to
> see how this assumption was actually presented. If it tells you that one
> can check for that assumption before the model is created, then toss that
> book in the garbage.
>
> --
>
> David Winsemius, MD
> West Hartford, CT
>
>
>



[R] Plotting leapfrog in R

2012-04-12 Thread Colstat
Dear List

Is there a package for plotting leapfrog trajectories (Hamiltonian Monte
Carlo estimation) in R?  I tried the "LEAPFrOG" package, but it doesn't
actually give a plot like this one:
http://xianblog.files.wordpress.com/2010/09/hamilton.jpg

How does one plot this in R?  That is, the semi-circle with dots on that
semi-circle.
I don't think curve() or plot() alone would produce such a plot.  Thanks in advance!
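I don't know of a package that draws exactly that figure, but it is easy to build by hand: for a standard normal target, H(q, p) = q^2/2 + p^2/2, the exact trajectory is a circle in phase space, and the dots are the discrete leapfrog steps. A sketch (all function and variable names here are my own):

```r
# One leapfrog integration: half momentum step, full position step,
# half momentum step, repeated L times; returns the (q, p) path.
leapfrog <- function(q, p, eps, L) {
  path <- matrix(NA_real_, L + 1, 2)
  path[1, ] <- c(q, p)
  for (i in seq_len(L)) {
    p <- p - eps / 2 * q          # half step in momentum (grad U(q) = q)
    q <- q + eps * p              # full step in position
    p <- p - eps / 2 * q          # half step in momentum
    path[i + 1, ] <- c(q, p)
  }
  path
}

path <- leapfrog(q = 1, p = 0, eps = 0.3, L = 20)
plot(path, type = "b", asp = 1, xlab = "q", ylab = "p")
theta <- seq(0, 2 * pi, length.out = 200)   # exact energy contour (circle)
lines(cos(theta), sin(theta), lty = 2)
```

With a small step size eps the dots hug the dashed circle, which is the look of the plot you linked.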



[R] How to see a R function's code

2012-02-11 Thread Colstat
I was wondering how I can actually see what's inside a function, say, the
density of the t distribution, dt()?

For some functions, I can type the function name at the R prompt and the code
is displayed.  But for dt(), I get:
> dt
function (x, df, ncp, log = FALSE)
{
if (missing(ncp))
.Internal(dt(x, df, log))
else .Internal(dnt(x, df, ncp, log))
}


I am curious because I am doing rejection sampling and want to find a
"bigger" (envelope) distribution.
