On Fri, 31 Jan 2014, Benjamin Ward (ENV) wrote:

Hi R helpers,

I have a set of data best shown in the graph below.

Each coloured line represents a statistic calculated across pairs of DNA sequences. For each coloured line, I would like to identify breakpoints, i.e. the chunks where the values are high. For example, in the light blue line there is a large high segment just after x=2e+05. Googling for ways to find such points, I read about change-point analysis, which is used with time series data, and I wondered whether it (or a variant of it) in R might be of use here. The data are a series of % values (double), all from a single measurement: for each line, a 'scanner' passed over two sequences and recorded the % value at each step. Can change-point analysis help me here and, if so, what package or method will allow me to do this while making as few assumptions about my data as possible?

The graph didn't make it through, but from what you describe it seems that the "tilingArray" package on Bioconductor would be helpful for you. See also Huber et al. (2006, Bioinformatics, 22(16), 1963-1970).

Other useful packages on CRAN include bcp, changepoint, cpm, segmented, and strucchange (among others).
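
To give a rough idea of what change-point methods for a shift in mean do, here is a minimal base-R sketch of binary segmentation on simulated data. It is only an illustration under stated assumptions: the data are made up (a flat series with one high segment), the function names seg_cost, best_split and bin_seg are hypothetical, and the threshold is an ad hoc BIC-like choice. It is not the algorithm of any particular package listed above, which offer better-founded penalties and models.

```r
# Cost of a segment: residual sum of squares around its own mean.
seg_cost <- function(x) sum((x - mean(x))^2)

# Best single split of x. Returns the index k of the last point of the left
# part and the reduction in cost ("gain") achieved by splitting there.
best_split <- function(x, min_len = 5) {
  n <- length(x)
  if (n < 2 * min_len) return(NULL)          # too short to split
  total <- seg_cost(x)
  ks <- min_len:(n - min_len)
  gain <- sapply(ks, function(k) {
    total - seg_cost(x[1:k]) - seg_cost(x[(k + 1):n])
  })
  i <- which.max(gain)
  list(k = ks[i], gain = gain[i])
}

# Binary segmentation: keep splitting while the gain exceeds the threshold.
# 'offset' tracks where the current piece starts in the full series.
bin_seg <- function(x, threshold, offset = 0, min_len = 5) {
  s <- best_split(x, min_len)
  if (is.null(s) || s$gain < threshold) return(integer(0))
  k <- s$k
  c(bin_seg(x[1:k], threshold, offset, min_len),
    offset + k,
    bin_seg(x[(k + 1):length(x)], threshold, offset + k, min_len))
}

# Simulated stand-in for one coloured line (assumption, not the real data):
# low values, then a high segment, then low values again.
set.seed(1)
y <- c(rnorm(200, 10, 2), rnorm(100, 40, 2), rnorm(200, 10, 2))

# Ad hoc threshold: crude noise variance estimate times 3 * log(n).
sigma2 <- var(diff(y)) / 2
bp <- bin_seg(y, threshold = 3 * sigma2 * log(length(y)))

# Breakpoints are the last index of each segment; segment means follow.
bp
tapply(y, cut(seq_along(y), c(0, bp, length(y))), mean)
```

The high chunk is then the segment whose mean is clearly larger than its neighbours. On real data the threshold (or the penalty used by a package) matters a great deal, so it is worth trying several values and checking the result against the plot.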

Thanks in advance,

Ben W.



______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


