[R] Gene expression clustering using several dependent samples

Moritz Kebschull Mon, 01 Jul 2013 00:23:39 -0700

Dear list.

I am looking at a dataset of Affymetrix arrays from disease-affected
tissue samples that I am trying to cluster.


The problem is that we have 2+ biopsies per study subject, and I am not
sure how to best account for their dependency. In contrast to cancer
samples, these biopsies differ to a certain extent in their disease
severity, i.e. they are not perfect replicates, but share certain
similarities since they are from the same person.

I first tried to just cluster all available biopsies using
ConsensusClusterPlus. However, this produced clusters of biopsies according
to their disease severity - often with different samples from the same
patient assigned to different clusters - and that's not what I want. I am
trying to identify different classes between subjects, not biopsies.
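
One way to quantify the problem described above is to cross-tabulate the
cluster labels against the patient IDs. The following is a minimal sketch on
made-up data; the vectors `patient` and `cluster` are hypothetical stand-ins
for the real sample annotation and for the per-biopsy class labels returned by
a clustering run.

```r
# Hypothetical example data: six biopsies from three patients.
patient <- c("P1", "P1", "P2", "P2", "P3", "P3")
# Hypothetical cluster labels, one per biopsy.
cluster <- c(1, 2, 1, 1, 2, 2)

# Number of distinct clusters that each patient's biopsies fall into.
n_clusters <- tapply(cluster, patient, function(x) length(unique(x)))

# Patients whose biopsies were split across more than one cluster.
split_patients <- names(n_clusters)[as.vector(n_clusters) > 1]

# Patient-by-cluster contingency table for a quick visual check.
print(table(patient, cluster))
print(split_patients)
```

A patient appearing in more than one column of the table is one whose biopsies
were assigned to different clusters.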

For the differential expression analyses, we dealt with this issue by adding the patient
as a random effect to the model. Could I do something similar using
model-based clustering, perhaps also adding a variable for disease severity?

As an alternative, I have explored aggregating all available samples per
subject into one expression profile, and clustering the patients using these
aggregates. I am, however, not convinced that this is right, since this
approach creates 'artificial' data.
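
For concreteness, the aggregation approach can be sketched as follows. This is
an illustration under assumptions, not the exact procedure used: the post does
not say how the samples were aggregated, so a per-patient mean is assumed here,
and the matrix `expr`, the vector `patient`, the use of `hclust` and the choice
of `k = 2` are all hypothetical.

```r
set.seed(1)
# Hypothetical expression matrix: 5 probesets (rows) x 6 biopsies (columns).
expr <- matrix(rnorm(30), nrow = 5,
               dimnames = list(paste0("probe", 1:5), paste0("biopsy", 1:6)))
# Hypothetical patient ID for each biopsy (column of expr).
patient <- c("P1", "P1", "P2", "P2", "P2", "P3")

# rowsum() sums rows by group, so transpose to put biopsies in rows.
sums <- rowsum(t(expr), group = patient)          # patients x probesets
# Number of biopsies per patient, in the same order as the rows of sums.
n <- as.vector(table(patient)[rownames(sums)])
# Divide each patient's row by its biopsy count to get the mean profile.
patient_means <- t(sums / n)                      # probesets x patients

# Cluster patients (not biopsies) on the aggregated profiles.
hc <- hclust(dist(t(patient_means)), method = "average")
groups <- cutree(hc, k = 2)
print(groups)
```

Each patient contributes exactly one column to `patient_means`, so no patient
can be split across clusters, at the cost of discarding the within-patient
variation in disease severity.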

Does anyone have an idea?

Many thanks,

Moritz


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
