Hi Gough,
A possible solution is to use survreg() in the survival package
without specifying any covariates, i.e.
library(survival)
survreg(Surv(..)~1, dist="weibull")
where Surv(..) takes the observed "times" together with any
censoring/truncation variables, and dist lets you specify alternative
distributions.
See ?Surv and ?survreg
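As a minimal sketch of the intercept-only call, assuming simulated and fully observed data (the vector x, the seed and the true parameter values below are made up for illustration). survreg() reports the Weibull fit on a log-linear scale, so the usual shape is 1/fit$scale and the usual scale is exp(intercept). Note that this sketch does not by itself model the truncation at x=1.

```r
library(survival)

set.seed(1)
x <- rweibull(2000, shape = 1.5, scale = 10)  # hypothetical data, not Lauren's
status <- rep(1, length(x))                   # 1 = event observed (no censoring)

# Intercept-only Weibull fit
fit <- survreg(Surv(x, status) ~ 1, dist = "weibull")

# Convert survreg's parameterisation to the one used by dweibull()/rweibull()
shape_hat <- 1 / fit$scale
scale_hat <- unname(exp(coef(fit)))
c(shape = shape_hat, scale = scale_hat)
```

Changing dist to "lognormal" or "exponential" gives the other two candidate fits with the same call.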
Hope this helps,
Gough Lauren wrote:
Dear All,
I have two questions regarding distribution fitting.
I have several datasets, all left-truncated at x=1, to which I am
attempting to fit distributions (lognormal, Weibull and exponential). I
had been using fitdistr in the MASS package as follows:
library(MASS)
fitdistr(x,"weibull")
However, this does not take into consideration the truncation at x=1. I
read another posting in this forum that suggested using the argument
"lower" to truncate the distribution fitting. However, this does not
seem to be working. For example, when I attempt to fit a weibull
distribution truncated at x=1 using "lower", it seems to set the
best-fit shape parameter at 1:
fitdistr(x,"weibull",lower=1)
      shape         scale
   1.00000000    9.87964337
  (0.02358731)  (0.40649570)
I have tried this on other datasets also truncated at x=1 and get the
same result (i.e. shape=1).
Does anyone know how to successfully fit the exponential, weibull and
lognormal distributions to truncated data?
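One likely explanation for shape=1: in fitdistr, extra arguments such as "lower" are handed on to optim as box constraints on the parameters, not on the data, so lower=1 only forces shape >= 1 and scale >= 1 and the estimate ends up on that boundary. A minimal sketch of one alternative, assuming simulated data and a hand-written likelihood (negll, raw and the true values are hypothetical names and numbers): maximise the left-truncated likelihood f(x)/(1 - F(1)) directly with optim.

```r
set.seed(42)
raw <- rweibull(5000, shape = 0.8, scale = 10)  # hypothetical untruncated sample
x <- raw[raw > 1]                               # keep only values above x = 1

# Negative log-likelihood of a Weibull left-truncated at `trunc`.
# Parameters are optimised on the log scale so they stay positive.
negll <- function(par, x, trunc = 1) {
  shape <- exp(par[1])
  scale <- exp(par[2])
  -sum(dweibull(x, shape, scale, log = TRUE) -
       pweibull(trunc, shape, scale, lower.tail = FALSE, log.p = TRUE))
}

fit <- optim(c(0, log(mean(x))), negll, x = x)
est <- exp(fit$par)          # back-transform to shape and scale
names(est) <- c("shape", "scale")
est
```

The same pattern should carry over to the other two distributions by swapping in dlnorm/plnorm or dexp/pexp and their parameters.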
Secondly, as my datasets are large (>1000 data points), Kolmogorov-Smirnov
goodness-of-fit tests routinely show statistically significant
departures for all distributions.
Therefore, I would like to plot the observed data with the theoretical
best-fit distributions (Weibull, exponential and lognormal) to visually
assess which fits the observed data best. So far I have been doing this
as follows:
fitdistr(x,"weibull")
shape scale
a b
D1<-density(x) ##density distribution of observed data
D2<-density(rweibull(1500,shape=a,scale=b))
## density of a random variable following the theoretical best-fit
## Weibull distribution with shape parameter = a, scale parameter = b
plot(range(D1$x),range(D1$y,D2$y),type="n",xlab="x",ylab="Density")
lines(D1,col="red")
lines(D2,col="blue")
This successfully plots the two density curves on the same graph, but it
plots data below the x=1 threshold - even for the observed data! I have
tried limiting the range of the x-axis using xlim=c(1,150) but the graph
still plots the origin of the graph as (0,0). I can only get different
origins if I limit x more extremely e.g. xlim=c(50,150). Does anyone
know how I can successfully change the origin of the graph to (1,0)?
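A minimal sketch of one way to approach the plot, assuming simulated data and placeholder values for a and b (in practice these would be the fitted shape and scale). By default R pads each axis by about 4% beyond xlim/ylim, which is why the corner does not sit at (1,0); xaxs="i" and yaxs="i" switch that padding off. density() smooths mass below the truncation point unless told where to start, so from=1 is used here, and the theoretical curve is drawn from the truncated density itself rather than from a random sample. The helper dtrunc is a hypothetical name.

```r
set.seed(42)
raw <- rweibull(5000, shape = 0.8, scale = 10)  # hypothetical data
x <- raw[raw > 1]                               # left-truncated at x = 1
a <- 0.8; b <- 10                               # placeholders for fitted shape/scale

D1 <- density(x, from = 1)   # evaluate the kernel estimate only for x >= 1

# Weibull density renormalised to the region x > 1
dtrunc <- function(q) dweibull(q, a, b) / pweibull(1, a, b, lower.tail = FALSE)

plot(D1, col = "red", main = "", xlab = "x", ylab = "Density",
     xlim = c(1, 150), ylim = c(0, max(D1$y, dtrunc(D1$x))),
     xaxs = "i", yaxs = "i")  # "i" = no padding, so the origin is (1, 0)
curve(dtrunc, from = 1, to = 150, add = TRUE, col = "blue")
```

The kernel estimate is still biased close to x=1 because part of each kernel falls below the boundary, so the two curves should be compared with some caution near the left edge.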
Sorry for the long e-mail! Any help would be greatly appreciated.
Regards,
Lauren
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
--
====================================
Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 6626240
fax: 091 485726/485612
http://dssm.unipa.it/vmuggeo