Dear Stat Tistician; Have you looked at the original distribution graphically? When I use densityplot on that data I am unable to discern even a hint of two separate peaks. Furthermore the density looks peaked versus what I would expect from a "true" normal distribution with those empiric parameters. So I think the algorithm is attempting to estimate the proportions and variances of two distributions with equal means and different standard deviations. That would seem to be a particularly difficult task for a non-deterministic algorithm as this one clearly is. Under these circumstances I thought you might get better results if you constrained the means to be identically zero.
On the other hand my experiments with mixtools::normalmixEM on the clearly separable peaks in the faithful$waiting vector used in that package's help page also gave widely divergent results on estimated mixing proportions as well (even with arbvar=FALSE), so I am now very suspicious about the stability of either the method or possibly the implementation of that method for data of this size: data(faithful) > attach(faithful) > length(waiting) [1] 272 > system.time(out<-normalmixEM(waiting, arbvar = FALSE, epsilon = 1e-03)) > system.time(out2<-normalmixEM(waiting, arbvar = FALSE, epsilon = 1e-03)) > out$lambda [1] 0.4548029 0.5451971 > out2$lambda [1] 0.998016838 0.001983162 > system.time(out3<-normalmixEM(waiting, arbvar = FALSE, epsilon = 1e-03)) number of iterations= 5 user system elapsed 0.019 0.001 0.021 > out3$lambda [1] 0.3609454 0.6390546 I think I could do a lot better just "drawing a line by eyeball" between the two peaks: > densityplot(waiting) > sum(waiting>68) [1] 171 > sum(waiting<=68) [1] 101 This makes the observation of implausibly high variation in estimated mixing proportions more reproducible (at least if you are on a Mac): > set.seed(123) > out1 <- normalmixEM(waiting, arbvar = FALSE, epsilon = 1e-03) number of iterations= 8 > out1$lambda [1] 0.3608581 0.6391419 > out2 <- normalmixEM(waiting, arbvar = FALSE, epsilon = 1e-03) number of iterations= 3 > out2$lambda [1] 0.08257889 0.91742111 (I'm copying the package maintainer, Derek Young .) > sessionInfo() R version 3.0.0 beta (2013-03-22 r62364) Platform: x86_64-apple-darwin10.8.0 (64-bit) locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 attached base packages: [1] grDevices datasets splines graphics utils stats methods base other attached packages: [1] mixtools_0.4.6 segmented_0.2-9.4 boot_1.3-9 car_2.0-16 nnet_7.3-6 MASS_7.3-26 [7] data.table_1.8.8 reshape2_1.2.2 rms_3.6-3 Hmisc_3.10-1 survival_2.37-4 sos_1.3-5 [13] brew_1.0-6 lattice_0.20-14 loaded via a namespace (and not attached): [1] cluster_1.14.3 colorspace_1.2-1 dichromat_2.0-0 digest_0.6.3 ggplot2_0.9.3.1 [6] grid_3.0.0 gtable_0.1.2 labeling_0.1 munsell_0.4 plyr_1.8 [11] proto_0.3-10 RColorBrewer_1.0-5 scales_0.2.3 stringr_0.6.2 tools_3.0.0 On Mar 30, 2013, at 3:20 AM, Stat Tistician wrote: > Hi, > I am currently working on fitting a mixture density to financial data. > > I have the following data: > http://s000.tinyupload.com/?file_id=00083355432555420222 > > I want to fit a mixture density of two normal distributions. > > I have the formula: > f(l)=πϕ(l;μ1,σ21)+(1−π)ϕ(l;μ2,σ22) > > my R code is: > > normalmix<-normalmixEM(dat,k=2,fast=TRUE) > > pi<-normalmix$lambda[1] > mu1<-normalmix$mu[1] > mu2<-normalmix$mu[2] > sigma1<-normalmix$sigma[1] > sigma2<-normalmix$sigma[2] > > Now I have the problem, that the output is not consistent, i.e. every time > I run the code, I get different outputs! And they are very different, no > small differences, which could be due to the precision of the numerical > procedures. > > E.g. sometimes for pi I get > > [1] 0.2653939 > > or > > [1] 0.3318069 > > I already recognized, that sometimes the numbering is changed, so the pi of > 0.7 would be equal to a pi of 0.3. Okay, I got this, I don't know why the R > procedures does this, but this would not be a problem. But the problem is, > that the outputs are way to different, sometimes I even get an error > message (german): Fehler in while (dl > eps && iter < maxit) { : Fehlender > Wert, wo TRUE/FALSE nötig ist > > Also, the number of iterations is very different, from 29 up to 1000 ...... > > Anyone can help? Thanks a lot! > > [[alternative HTML version deleted]] > David Winsemius Alameda, CA, USA ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.