Dear All,

Apologies for not posting a code snippet, but I really need a pointer to a methodology for looking at my data, and possibly to an R package that can ease my task.

I am given a set {A} of several noisy multivariate time series. Each A_i in {A}, in turn, consists of several numerical time series. I also have another set {B} of shorter time series. For every B_j in {B}, I need to determine the time series A_i that B_j most likely comes from (B_j is not just a subset of A_i). In other words, I need a measure of the distance between A_i and B_j.

I was thinking about the Mahalanobis distance described here:
http://en.wikipedia.org/wiki/Mahalanobis_distance

However, I have several questions:

1) With the Mahalanobis distance, do I lose the information about the time structure of the data? I am not just comparing distributions but time series, and the ordering of the data is important.

2) Even if the Mahalanobis distance were appropriate, it requires a mean and a covariance matrix. Should I take the mean of A_i or of B_j (or of a subset of A_i having the same length as B_j)? And should the covariance matrix be estimated from A_i or from B_j?

Any suggestion is welcome.

Lorenzo
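
P.S. In case it helps to make question 2 concrete, below is a rough sketch in R of the kind of computation I have in mind. It is not code from my actual analysis: the toy data, the names A, B1 and mean_mahal, and the choice of estimating both the mean and the covariance from A_i are assumptions made purely for illustration. It relies on mahalanobis() from the stats package, which returns squared distances.

```r
set.seed(1)

# Hypothetical toy data (no real temporal structure): two candidate
# series A1 and A2 (200 x 3 each) and one shorter series B1 (50 x 3)
# simulated to resemble A2.
p <- 3
A <- list(
  A1 = matrix(rnorm(200 * p, mean = 0), ncol = p),
  A2 = matrix(rnorm(200 * p, mean = 2), ncol = p)
)
B1 <- matrix(rnorm(50 * p, mean = 2), ncol = p)

# Mean squared Mahalanobis distance of the rows of b from a.
# Assumption: centre and covariance are both estimated from a,
# the longer series.
mean_mahal <- function(a, b) {
  mean(mahalanobis(b, center = colMeans(a), cov = cov(a)))
}

# One distance per candidate A_i; the smallest one is the best match.
d <- sapply(A, mean_mahal, b = B1)
best <- names(which.min(d))

# Permuting the rows of B1 gives the same distances (up to
# floating-point rounding), which is exactly my worry in question 1.
d_shuffled <- sapply(A, mean_mahal, b = B1[sample(nrow(B1)), ])
```

Here the observations of B1 are treated as an unordered sample, so the result does not depend on their order in time. That is the behaviour I would like to avoid, or at least understand.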