Lorenzo: This is a complicated and subtle question that I believe is mostly about statistical methodology, not R. I would suggest that you post your query to stats.stackexchange.com rather than here in order to determine *what* you should do. Then, if necessary, you can come back here to ask about *how* to do it in R (with code from your failed attempts, etc.).
Better yet, you might wish to have this discussion with a local expert, if you can find one. Cheers, Bert Bert Gunter "The trouble with having an open mind is that people keep coming along and sticking things into it." -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) On Thu, May 12, 2016 at 7:49 AM, Lorenzo Isella <lorenzo.ise...@gmail.com> wrote: > Dear All, > Please have a look at the code at the end of the email. > It is just an example of regression based on glmnet with some > artificial data. > My question is how I can evaluate the uncertainty of the prediction > yhat. > > It looks like there are some reasons for not providing a standard > error estimate, see e.g. > > > > http://stackoverflow.com/questions/12937331/how-to-get-statistical-summary-information-from-glmnet-model > and > https://www.reddit.com/r/statistics/comments/1vg8k0/standard_errors_in_glmnet/ > > However, from what I read in this thesis > > https://air.unimi.it/retrieve/handle/2434/153099/133417/phd_unimi_R07738.pdf > > (see sections 3.2 and 3.3) > > and in the quoted papers > > http://www.stat.cmu.edu/~fienberg/Statistics36-756/Efron1979.pdf > and > http://www.ams.org/journals/proc/2010-138-12/S0002-9939-2010-10474-4/S0002-9939-2010-10474-4.pdf > > there are some bootstrap methods that are quite general and applicable > well beyond the case of glmnet. > Is there anything already implemented to help me out? Is anybody aware > of this? > Cheers > > Lorenzo > > ######################################################################### > ######################################################################### > ######################################################################### > ######################################################################### > ######################################################################### > > > library(glmnet) > > > # Generate data > set.seed(19875) # Set seed for reproducibility > n <- 1000 # Number of observations > p <- 5000 # Number of predictors included in model > real_p <- 15 # Number of true predictors > x <- matrix(rnorm(n*p), nrow=n, ncol=p) > y <- apply(x[,1:real_p], 1, sum) + rnorm(n) > > # Split data into train (2/3) and test (1/3) sets > train_rows <- sample(1:n, .66*n) > x.train <- x[train_rows, ] > x.test <- x[-train_rows, ] > > y.train <- y[train_rows] > y.test <- y[-train_rows] > > > > fit.elnet <- glmnet(x.train, y.train, family="gaussian", alpha=.5) > > yhat <- predict(fit.elnet, s=fit.elnet$lambda, newx=x.test) > > ______________________________________________ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. ______________________________________________ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.