Re: [R] question about reproducibility/consistency of principal component and lda directions in R

David Romano Sun, 10 Feb 2013 06:56:36 -0800

On Sat, Feb 9, 2013 at 11:43 AM, Uwe Ligges <lig...@statistik.tu-dortmund.de> wrote:

>
> On 08.02.2013 20:14, David Romano wrote:
>
>> Hi everyone,
>>
>> I'm not exactly sure how to ask this question most clearly, but I hope
>> that giving the context in which it occurs for me will help: I'm trying
>> to compare the brain images of two patient populations; each image is
>> composed of voxels (the 3D analogue of pixels), and I have two images per
>> patient, one reflecting grey matter concentration at each voxel, and the
>> other reflecting white matter concentration at each voxel.
>>
>> I determined the groups by means of an analysis that involved information
>> from both types of images, and what I set out to do was to get a rough
>> idea of where in the brain the two groups showed the most striking
>> differences.
>>
>> My first attempt was to replace -- on a voxel-by-voxel basis -- the
>> bivariate grey/white data by a combined univariate measure, namely the
>> first principal component score. From these principal component scores I
>> calculated Cohen's d to obtain a rough estimate of the effect size at each
>> voxel, and the resulting brain images show very nice separation into
>> meaningful brain regions, some corresponding to negative effect sizes and
>> some to positive ones.
>>
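A minimal sketch of the per-voxel procedure described above, for a single voxel. The data are simulated, and the names pc1_score and cohens_d are hypothetical helpers, not functions from any package; Cohen's d is computed with a pooled standard deviation, which is one common definition and only an assumption about what was used here.

```r
# Hypothetical data for ONE voxel: two groups, two measures per patient.
set.seed(1)
n_per_group <- 20
group <- factor(rep(c("A", "B"), each = n_per_group))
grey  <- rnorm(2 * n_per_group, mean = ifelse(group == "A", 0.50, 0.55), sd = 0.05)
white <- rnorm(2 * n_per_group, mean = ifelse(group == "A", 0.40, 0.36), sd = 0.05)

# First principal component score of the bivariate (grey, white) data.
pc1_score <- function(grey, white) {
  fit <- prcomp(cbind(grey, white), center = TRUE, scale. = FALSE)
  fit$x[, 1]
}

# Cohen's d using a pooled standard deviation (assumed definition).
cohens_d <- function(x, g) {
  g <- droplevels(as.factor(g))
  stopifnot(nlevels(g) == 2)
  xs <- split(x, g)
  n1 <- length(xs[[1]])
  n2 <- length(xs[[2]])
  s_pooled <- sqrt(((n1 - 1) * var(xs[[1]]) + (n2 - 1) * var(xs[[2]])) /
                   (n1 + n2 - 2))
  (mean(xs[[1]]) - mean(xs[[2]])) / s_pooled
}

score     <- pc1_score(grey, white)
d         <- cohens_d(score, group)
# Using the opposite direction flips the sign of d but not its magnitude.
d_flipped <- cohens_d(-score, group)
```

This is why the sign of the effect size at a voxel is only meaningful relative to a convention for orienting the first principal component.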
>> What puzzles me about such a clean separation into brain regions is that
>> the meaning of positive and negative is determined by the choice of the
>> first principal component direction at each voxel, but this choice is
>> -- in principle (no pun intended -- sorry!) -- arbitrary. (That is,
>> whether an eigenvector or its negative is chosen as the direction is in
>> principle arbitrary.)
>>
>> So here are my questions: Does the algorithm used in R produce the same
>> principal component directions if applied to the same data repeatedly?
>>
>
> Yes, but the result may change if you run it on another machine (it
> depends on the compiler, and hence also on 32-bit vs. 64-bit and the OS).
>
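A small sketch of how one might check this, assuming simulated data. The helper same_up_to_sign is hypothetical; it is meant for comparing results saved on different machines, where exact equality is too strict and a column may come back negated.

```r
set.seed(42)
X <- matrix(rnorm(200), ncol = 2)   # hypothetical data: 100 observations, 2 variables

# Same data, same session: the directions should agree exactly.
r1 <- prcomp(X)$rotation
r2 <- prcomp(X)$rotation
same_run <- identical(r1, r2)

# Across machines, compare each direction up to sign and a numerical tolerance.
same_up_to_sign <- function(a, b, tol = 1e-8) {
  all(sapply(seq_len(ncol(a)), function(j) {
    isTRUE(all.equal(a[, j],  b[, j], tolerance = tol)) ||
    isTRUE(all.equal(a[, j], -b[, j], tolerance = tol))
  }))
}
```

For example, same_up_to_sign(r1, -r2) treats a fully negated result as equivalent, which is the ambiguity the question is about.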
>
>
>> And if so, should the directions chosen by the algorithm change
>> continuously with the data?  For example, if one data set were obtained by
>> adding a small amount of noise to another, should the resulting
>> directions be close to each other (as opposed to close to the negatives
>> of each other)?  (Assuming the data are far from being "singular" in some
>> vague sense I'm not sure how to make precise.)
>>
>
> Not necessarily: adding noise means the sign can change again.
>
> Of course, you can define a convention yourself, e.g. fix the sign of the
> very first value of each direction and flip the whole direction otherwise.
>
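A minimal sketch of such a convention, assuming the rule "the first loading of every component must be non-negative". The function orient_pca is a hypothetical helper, not part of R; the data are simulated. Note that this rule is unstable when the first loading is close to zero, in which case another reference (for example the largest-magnitude loading) may be a better choice.

```r
# Flip each component so that its first loading is non-negative,
# and flip the corresponding scores so the fit stays consistent.
orient_pca <- function(fit) {
  s <- ifelse(fit$rotation[1, ] < 0, -1, 1)
  fit$rotation <- sweep(fit$rotation, 2, s, "*")
  fit$x        <- sweep(fit$x, 2, s, "*")
  fit
}

set.seed(7)
X <- cbind(grey = rnorm(50), white = rnorm(50))
X[, "white"] <- 0.6 * X[, "grey"] + 0.4 * X[, "white"]   # make the columns correlated

# Same data with a small amount of added noise.
X_noisy <- X + matrix(rnorm(length(X), sd = 0.01), ncol = 2)

fit       <- orient_pca(prcomp(X))
fit_noisy <- orient_pca(prcomp(X_noisy))

# Cosine of the angle between the two oriented first directions.
cos_pc1 <- sum(fit$rotation[, 1] * fit_noisy$rotation[, 1])
```

With the orientation fixed this way, a value of cos_pc1 near 1 rather than near -1 indicates that the two first directions point the same way.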
>
>
>> My second attempt was to do the same, but with the first lda scores, so I
>> have the same questions about lda directions, too.
>>
>
>
> Same for lda.
>
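For lda, one possible convention (an assumption, not something lda does for you) is to orient the first discriminant so that the second group level has the larger mean score. The sketch below uses lda from the MASS package on simulated data; orient_ld1 is a hypothetical helper.

```r
library(MASS)   # provides lda(); a recommended package shipped with most R installs

set.seed(3)
group <- factor(rep(c("A", "B"), each = 25))
grey  <- rnorm(50, mean = ifelse(group == "A", 0.50, 0.56), sd = 0.05)
white <- rnorm(50, mean = ifelse(group == "A", 0.40, 0.35), sd = 0.05)
dat   <- data.frame(group, grey, white)

fit <- lda(group ~ grey + white, data = dat)

# Orient LD1 so that the second factor level has the larger mean score.
orient_ld1 <- function(fit, newdata, group) {
  scores <- predict(fit, newdata)$x[, 1]
  means  <- as.vector(tapply(scores, group, mean))
  s      <- if (means[2] < means[1]) -1 else 1
  list(sign = s, scaling = s * fit$scaling[, 1], scores = s * scores)
}

oriented    <- orient_ld1(fit, dat, dat$group)
group_means <- tapply(oriented$scores, dat$group, mean)
```

Applied at every voxel, this makes a positive score always mean "more like group B", whatever sign lda happened to return.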
> Best,
> Uwe Ligges
>

Thanks, Uwe; all good to know.

Best,
David





>> Any light you could shed on these questions would be very welcome!
>>
>> Thanks in advance,
>> David Romano
>>
>>         [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

