Thank you for your hints. The goodness-of-fit test now gives me good
results but, to my surprise, the three-component model turns out to be
worse than the two-component one (I had focused on the three-component
mixture only because the two-component one showed a low p-value).
In addition, I have noticed that for some data the function fails to
find good starting values, as you mentioned in your previous answer.
The problem is that the driver FLXMRnegbin() only lets me specify the
theta parameter (and only a single value, even for mixtures of two or
more components).
I have also read the documentation of the flexmix() function, but it
does not seem to allow setting starting values for the parameters of
the model. Am I right, or is there a way to do it?
Achim Zeileis <achim.zeil...@uibk.ac.at> wrote:
On Tue, 8 Nov 2016, danilo.car...@uniparthenope.it wrote:
I tried the function flexmix() with the driver FLXMRnegbin() with
two components first, in order to compare its results with those
provided by my function mixnbinom(). In particular, I ran the
following code:
fm0 <- flexmix(y ~ 1, data = data.frame(y), k = 2, model = FLXMRnegbin())
where "y" is my vector of counts. The call above gave me the
following parameters:
                    Comp.1   Comp.2
coef.(Intercept) 1.2746536 1.788578
theta            0.1418201 5.028766
with priors 0.342874 and 0.657126, respectively. I assume that the
coefficients "Intercept" represent the two means of the model (mu1
and mu2),
No, a log link is employed, i.e., exp(1.2746536) and exp(1.788578)
are the means.
while the "theta" coefficients are the size parameters (size1 and size2).
Yes.
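To make the log-link point concrete, here is a small base-R sketch that converts the reported intercepts into component means. The variable names are made up for illustration, and the variance line assumes the usual NB2 parameterization (variance = mu + mu^2 / theta), which is what rnbinom(mu = , size = ) uses.

```r
## Intercepts as reported by parameters(); they are on the log scale
coef_intercept <- c(Comp.1 = 1.2746536, Comp.2 = 1.788578)
theta <- c(Comp.1 = 0.1418201, Comp.2 = 5.028766)

## Back-transform to the component means (roughly 3.58 and 5.98)
mu <- exp(coef_intercept)

## Implied component variances under the NB2 parameterization (assumption)
v <- mu + mu^2 / theta

print(round(rbind(mu = mu, theta = theta, variance = v), 3))
```

So the first component has the smaller mean but, because of its small theta, by far the larger variance.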
Unfortunately, unlike my function mixnbinom(), the model computed
with flexmix() did not provide a good fit to my data (p-value ~0).
Is there something wrong in the process above?
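For reference, one way such a fit check could be set up in base R is sketched below. This is an assumption about the kind of test meant here, not the code behind mixnbinom(): dmixnb() and gof_mixnb() are hypothetical helpers, the cutoff max_count = 15 is arbitrary, and a Pearson chi-square with parameters estimated from ungrouped data is only approximate.

```r
## Hypothetical helper: pmf of a k-component negative binomial mixture
dmixnb <- function(x, prior, mu, theta) {
  dens <- mapply(function(p, m, s) p * dnbinom(x, mu = m, size = s),
                 prior, mu, theta)
  rowSums(matrix(dens, nrow = length(x)))
}

## Hypothetical helper: Pearson chi-square on the cells
## 0, 1, ..., max_count - 1 and ">= max_count"
gof_mixnb <- function(y, prior, mu, theta, npar, max_count = 15) {
  obs <- tabulate(pmin(y, max_count) + 1, nbins = max_count + 1)
  p <- dmixnb(0:(max_count - 1), prior, mu, theta)
  p <- c(p, 1 - sum(p))                 # tail cell
  expected <- length(y) * p
  stat <- sum((obs - expected)^2 / expected)
  df <- length(obs) - 1 - npar          # npar = number of estimated parameters
  c(statistic = stat, df = df,
    p.value = pchisq(stat, df = df, lower.tail = FALSE))
}

## Parameters quoted above, with the means back-transformed from the log scale
prior <- c(0.342874, 0.657126)
mu    <- exp(c(1.2746536, 1.788578))
theta <- c(0.1418201, 5.028766)

## Artificial data drawn from that mixture, only to exercise the helpers
set.seed(1)
comp <- sample(1:2, 600, replace = TRUE, prob = prior)
y_sim <- rnbinom(600, mu = mu[comp], size = theta[comp])

## 2 means + 2 thetas + 1 free prior = 5 parameters
res <- gof_mixnb(y_sim, prior, mu, theta, npar = 5)
print(res)
```

With real data, y_sim would be replaced by the observed counts and the three parameter vectors by the estimates from the fitted model.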
Hard to say without a reproducible example. Using parameter values
similar to the ones you cite above, the following seems to do a
reasonable job:
## packages
library("countreg")
library("flexmix")
## artificial data from two NB distributions:
## 1/3 is NB(mu = 3.5, theta = 0.2) and
## 2/3 is NB(mu = 6.0, theta = 5.0)
set.seed(1)
y <- c(rnbinom(200, mu = 3.5, size = 0.2), rnbinom(400, mu = 6, size = 5))
## fit 2-component mixture model
set.seed(1)
fm <- flexmix(y ~ 1, k = 2, model = FLXMRnegbin())
## inspect estimated parameters -> look acceptable
parameters(fm)
exp(parameters(fm)[1,])
My experience was that finding good starting values may be a problem
for flexmix(). So maybe setting these in some better way would be
beneficial.
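One possible way to steer the starting point is sketched here, under assumptions. flexmix() has a "cluster" argument that takes initial component assignments (or a matrix of initial membership probabilities), and stepFlexmix() repeats the EM run from several random starts and keeps the best fit. The helper init_clusters() is hypothetical (it is not part of either package), and splitting by rank is only one crude heuristic; it has not been checked against the data discussed in this thread.

```r
## Hypothetical helper: put observations into k groups of roughly equal
## size by rank of y, to be used as initial cluster assignments
init_clusters <- function(y, k) {
  r <- rank(y, ties.method = "first")
  cut(r, breaks = k, labels = FALSE)
}

## artificial data as in the example above
set.seed(1)
y <- c(rnbinom(200, mu = 3.5, size = 0.2), rnbinom(400, mu = 6, size = 5))
cl <- init_clusters(y, k = 2)
print(table(cl))

## the fits need countreg (from R-Forge) and flexmix; skipped if unavailable
if (requireNamespace("countreg", quietly = TRUE) &&
    requireNamespace("flexmix", quietly = TRUE)) {
  library("countreg")
  library("flexmix")
  d <- data.frame(y = y)

  ## (a) EM started from the rank-based assignment instead of a random one
  fm_init <- flexmix(y ~ 1, data = d, k = 2, cluster = cl,
                     model = FLXMRnegbin())

  ## (b) several random starts, keeping the fit with the best log-likelihood
  set.seed(1)
  fm_best <- stepFlexmix(y ~ 1, data = d, k = 2, nrep = 5,
                         model = FLXMRnegbin(), verbose = FALSE)

  print(parameters(fm_init))
  print(parameters(fm_best))
}
```

Note that this sets starting cluster memberships, not starting values for the coefficients or theta directly; the first M-step then derives the parameters from those memberships.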
-------------------------------------------------------------
Danilo Carità
PhD Candidate
University of Naples "Parthenope"
Dipartimento di Studi Aziendali e Quantitativi
via G. Parisi, 13, 80132 Napoli - Italy