I am running GTM on the same data space points while varying the number of latent space points, the number of basis functions, and the parameter sigma. I found a combination of these parameters that works fine. On the other hand, page 7 of the paper "The Generative Topographic Mapping" by Svensén, Bishop, and Williams states: "There is no over-fitting if the number of sample points is increased since the number of degrees of freedom in the model is controlled by the mapping function y(x;W)".
However, since you translated the code from MATLAB to R, I am fairly sure you know what causes the following messages, which routines generate them, and under which circumstances. Once these details are clear to me, I can try to avoid whatever triggers them:

"gtm_trn: Warning -- M-Step matrix singular, using pinv.\n"
1: In chol.default(A, pivot = TRUE) : matrix not positive definite
2: In gtm_trn(T, FI, W, lambda, 1, b, 2, quiet = FALSE, minSing = 0.01) :
  Using 40 out of 40 eigenvalues

Thank you so much.
Maura

-----Original Message-----
From: Ondrej Such [mailto:ondrej.s...@gmail.com]
Sent: Thu 01/04/2010 17.07
To: mau...@alice.it
Subject: Re: Generative Topographic Map

Hello Maura,

Thank you for your email. Markus Svensén, one of the method's authors, works at Microsoft and can give you more insight into how it works. Email marcu...@microsoft.com, home page http://research.microsoft.com/en-us/um/people/markussv/. I also found his Ph.D. thesis very insightful.

I believe it is never useful to have as many points in latent space as there are data points. And with fewer than 2000 points, I can't imagine going beyond latent dimension 3 or 4 when applying GTM. I've heard that GTM is useful mostly when the data follow a relatively smooth manifold.

Best,
--Ondrej

2010/4/1 <mau...@alice.it>

> Thank you. I figured that out myself last night. I always forget that
> read.table does not actually read data into a matrix.
> The GTM MATLAB toolbox comes with a nice guide to using the package, which
> could well become an R vignette.
>
> Anyway, I got the singular matrix warnings myself and do not know whether I
> should be concerned about them or not.
> Moreover, I do not know how to avoid them.
> I will run some further experiments, keeping the data space samples and
> dimensionality fixed and changing some of the input parameters.
> I stress that our goal is NOT visualization. We do not know the intrinsic
> dimensionality of the data space samples.
> Therefore we can only proceed by trial and error. That is, we vary the
> dimensionality of the embedding space. In this experiment the dimensionality
> of the data space is 7, so we start out projecting our original data to a 1D
> embedding space, then we try out a 2D embedding space, ..., all the way up
> to a 6D embedding space. Since we do not know the intrinsic dimensionality
> of the original data, we need a method to evaluate the reliability of the
> projection. To assess that, we reconstruct the data from the embedding back
> into the data space and there calculate the RMSD between the original data
> and the reconstructed ones. Basically, to use RMSD, we need as many
> reconstructed points as original points. We meet that requirement by
> choosing as many points in the latent space as in the data space. Can such
> a choice be the cause of the matrix singularity?
> Furthermore, is the number of basis functions related to the number of
> latent space points somehow?
> Unfortunately, even the GTM MATLAB documentation does not explicitly provide
> any clear criteria for the choice of parameters and their dependence, if any.
>
> Thank you,
> Maura
>
>
> -----Original Message-----
> From: Ondrej Such [mailto:ondrej.s...@gmail.com <ondrej.s...@gmail.com>]
> Sent: Thu 01/04/2010 11.16
> To: mau...@alice.it
> Subject: Re: Generative Topographic Map
>
>
> Hello,
>
> the problem that's tripping up the package is that T is a data.frame and
> not a matrix.
>
> Simply replacing
>
> T <- read.table("DHA_TNH.txt")
>
> with
>
> T <- as.matrix(read.table("DHA_TNH.txt"))
>
> makes the code run (though the warnings about singular matrices remain; I'm
> not sure to what degree that is worrisome). I'd be curious as to how you'd
> suggest improving the documentation.
>
> Hope this helps,
>
> --Ondrej
>
> 2010/3/31 <mau...@alice.it>
>
> > I tried to use the R version of the package.
> > I noticed the original MATLAB package is much better documented.
> > I had a look at the R demo code "gtm_demo" and found that variable Y is
> > used before being created.
> > I wrote my own few lines as follows:
> >
> > inDir <- "C:/Documents and Settings/Monville/Alanine Dipeptide/DBP1/DHA"
> >
> > setwd(inDir)
> > T <- read.table("DHA_TNH.txt")
> > L <- 3
> > X <- matrix(nrow=nrow(T),ncol=3,byrow=TRUE)
> > MU <- matrix(nrow=round(nrow(T)/5), ncol=L)
> >
> > for(i in 1:ncol(X)) {
> >   for(j in 1:nrow(X)) {
> >     X[j,i] <- RANDU()
> >   }
> > }
> >
> > for(i in 1:ncol(MU)) {
> >   for(j in 1:nrow(MU)) {
> >     MU[j,i] <- RANDU()
> >   }
> > }
> > sigma <-1
> >
> > FI <- gtm_gbf(MU,sigma,X)
> > W <- gtm_ri(T,FI)
> > Y= FI%*%W
> > b = gtm_bi(Y)
> > lambda <- 0.001
> > for (m in 1:15) {
> >   trnResult = gtm_trn(T, FI, W, lambda, 1, b, 2,quiet = TRUE, minSing = 0.01)
> >   W = trnResult$W
> >   b = trnResult$beta
> >   Y = FI %*% W
> > }
> >
> > I ran the above script on my own data, representing 1969 samples of 7
> > dihedral angles of a folding molecule (attached).
> > Upon running the first iteration of the training function "gtm_trn" I get
> > the following error, which I cannot interpret.
> > Any help and/or suggestion is welcome:
> >
> > > trnResult = gtm_trn(T, FI, W, lambda, 1, b, 2,quiet = TRUE, minSing = 1.)
> > Error in gtmGlobalR %*% T :
> >   requires numeric/complex matrix/vector arguments
> > In addition: Warning messages:
> > 1: In chol.default(A, pivot = TRUE) : matrix not positive definite
> > 2: In gtm_trn(T, FI, W, lambda, 1, b, 2, quiet = TRUE, minSing = 1) :
> >   Using 7 out of 395 eigenvalues
> >
> > Thank you in advance,
> > Maura
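A note on the singular-matrix warnings quoted in this thread. The sketch below uses only base R and does not call the gtm package. It assumes, following the GTM paper, that the M-step solves a linear system whose matrix is built from the basis activations FI; with uniform responsibilities that matrix is proportional to FI'FI. Under that assumption the matrix has one row and column per basis function (plus a bias), and its rank cannot exceed the number of latent points, so it is exactly singular whenever there are fewer latent points than basis functions. The helpers gauss_basis and numerical_rank and the value ridge are hypothetical names for illustration; whether the package's lambda argument plays exactly the role of ridge is an assumption.

```r
# Base-R sketch; gauss_basis(), numerical_rank() and ridge are illustrative
# names, not part of the gtm package.
set.seed(42)

# Gaussian basis activations plus a bias column, one row per latent point.
gauss_basis <- function(X, MU, sigma) {
  d2 <- outer(rowSums(X^2), rowSums(MU^2), "+") - 2 * X %*% t(MU)
  cbind(exp(-pmax(d2, 0) / (2 * sigma^2)), 1)
}

# Number of singular values above a relative tolerance.
numerical_rank <- function(A, tol = 1e-8) {
  s <- svd(A, nu = 0, nv = 0)$d
  sum(s > tol * s[1])
}

K <- 10   # latent sample points (deliberately fewer than basis functions)
M <- 20   # Gaussian basis functions
X  <- matrix(runif(K * 2), K, 2)
MU <- matrix(runif(M * 2), M, 2)
FI <- gauss_basis(X, MU, sigma = 0.3)      # K x (M + 1)

# With uniform responsibilities the M-step matrix is proportional to FI'FI.
A <- crossprod(FI)                         # (M + 1) x (M + 1)
rank_A <- numerical_rank(A)                # cannot exceed K, so A is singular

# Right-hand side built from a made-up 3-dimensional data matrix.
B <- crossprod(FI, matrix(rnorm(K * 3), K, 3))

# A small ridge term makes the matrix positive definite, so chol() succeeds
# and the system can be solved by two triangular solves.
ridge <- 1e-3
U <- chol(A + ridge * diag(ncol(A)))
W_demo <- backsolve(U, forwardsolve(t(U), B))
```

Printing rank_A next to ncol(A) shows how far the matrix is from full rank. Trying larger values of sigma in the same sketch is a quick way to see whether broad, overlapping basis functions push the matrix toward numerical singularity even when K exceeds M.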
[[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
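On the reconstruction question raised in the thread: an RMSD needs one reconstructed point per data point, but that does not by itself require as many latent points as data points. One possible approach, sketched below in base R, is to reconstruct each data point as the responsibility-weighted mean of the mixture centres Y. This is an assumption for illustration, not something the thread or the gtm package documentation confirms; gtm_resp_sketch and rmsd are hypothetical helper names, and the data are random stand-ins.

```r
# Base-R sketch; gtm_resp_sketch() and rmsd() are illustrative names only.

# Posterior responsibilities of K mixture centres (rows of Y) for N data
# points (rows of Tmat), assuming isotropic Gaussians with precision beta
# and equal prior weights. Returns a K x N matrix whose columns sum to 1.
gtm_resp_sketch <- function(Tmat, Y, beta) {
  d2 <- outer(rowSums(Y^2), rowSums(Tmat^2), "+") - 2 * Y %*% t(Tmat)
  d2 <- pmax(d2, 0)
  # Subtract each column's minimum before exponentiating, for stability.
  P <- exp(-0.5 * beta * sweep(d2, 2, apply(d2, 2, min)))
  sweep(P, 2, colSums(P), "/")
}

# Root mean squared deviation, averaging squared distances over points.
rmsd <- function(A, B) sqrt(mean(rowSums((A - B)^2)))

set.seed(1)
N <- 50   # data points
K <- 9    # latent points / mixture centres, far fewer than N
D <- 7    # data space dimension, as in the thread
T_demo <- matrix(rnorm(N * D), N, D)   # stand-in for the real data
Y_demo <- matrix(rnorm(K * D), K, D)   # stand-in for FI %*% W

R <- gtm_resp_sketch(T_demo, Y_demo, beta = 2)
T_hat <- t(R) %*% Y_demo               # one reconstruction per data point
err <- rmsd(T_demo, T_hat)
```

Because T_hat always has N rows regardless of K, the number of latent points can be chosen on other grounds, such as the conditioning of the M-step matrix.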