Hi,
we have some sequence data (animal vocalizations) and are wondering if
someone could either confirm that our analysis strategy is sound or shoot
it down (and hopefully make a recommendation for something better). We're
using the *hmm.discnp R* package.
For each individual, we have sequence data from two years, each treated as
a single long sequence. For each individual, we want to test whether a new
model fitted to year two data is significantly better at predicting year
two sequences than is the year one model.

We're thinking of a Monte-Carlo method as follows.
1. Fit an HMM to year one data (treated as one long sequence). LetÂ’s call
that HMM1.
2. Fit an HMM to year two data (treated as one long sequence). Let's call
that HMM2.
3. Using HMM2, generate 99 (or whatever) shorter sequences of equal length.
4. For each generated sequence, find its log likelihood using HMM1.
5. For each generated sequence, find its log likelihood using HMM2. At this
point we have 99 generated "year two" sequences that each have two log
likelihoods, one from HMM1 (null model) and the second from HMM2
(alternative model).
6. For each generated sequence, find the log likelihood ratio using the LLs
from HMM1 and HMM2.
7. Calculate LLR test stat = (log likelihood of the observed year one
sequence under HMM1 - log likelihood of the observed year two sequence
under HMM2)*2
8. Count how many LLRs are greater than test stat to determine if HMM2 is
significantly better than HMM1 at predicting year two sequences.

So our questions for the community are:
A. Do you see any glaring errors in logic or methods?
B. Does fitting HMM2 to sequences generated using HMM2 problematically bias
our test to favor HMM2 over HMM1? We could use half of our year two data to
create HMM2 and the other half to create a third HMM (HMMgenerate) used
solely to generate the 99 sequences, but we'd like to use the entire data
set from both years if possible.

Thanks in advance for your replies!

        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to