Re: [R] Partial correlations and p-values

Charles C. Berry Sat, 05 Dec 2009 09:56:56 -0800

On Sat, 5 Dec 2009, Juliet Hannah wrote:

Your R code looks correct.


There are a couple of hiccups.

First the degrees of freedom for the partial correlation would be wrongeven if there was no missing data.


Because this is a straightforward calculation, I would be surprised if there
were any differences with SPSS.

There are differences. SPSS seems to use the correlation matrix computedwith a pairwise present method and compute partial correlations from that.


Following

        http://wiki.r-project.org/rwiki/doku.php?id=tips:data-matrices:part_corr

R.pp <- cor(cbind(x,y,z1,z2),use='pair')
R.comp <- cor(cbind(x,y,z1,z2),use='complete')
Rinv <- solve(R.pp)
D <- diag(1 / sqrt(diag(Rinv)))
P <- -D %*% Rinv %*% D
P[1,2]

[1] 0.4596122

Rinv <- solve(R.comp)
D <- diag(1 / sqrt(diag(Rinv)))
P <- -D %*% Rinv %*% D
P[1,2]

[1] 0.657214

The pairwise present value seems to be what SPSS is reporting.

The complete cases values is nearly (but not the same as) what you got.

A real issue here is how to usefully compute and test partialcorrelations in the presence of missing data. If you want to persue that,I would suggest opening a new thread with a subject line like 'partialcorrelations with missing observations'


HTH,

Chuck


It may be worthwhile to check

if SPSS  gives partial correlations or semipartial correlations. For example,
if you take the correlation between

py <- resid(lm(y ~ z1 + z2,data=mydat2))

and

x

where mydat2 has missing values removed, you get 0.47.

On Tue, Dec 1, 2009 at 8:24 PM, dadrivr <dadr...@gmail.com> wrote:


I am trying to calculate a partial correlation and p-values.  Unfortunately,
the results in R are different than what SPSS gives.

Here is an example in R (calculating the partial correlation of x and y,
controlling for z1 and z2):

x <- c(1,20,14,30,9,4,8)
y <- c(5,6,7,9,NA,10,6)
z1 <- c(13,8,16,14,26,13,20)
z2 <- c(12,NA,2,5,8,16,13)
fmx <- lm(x ~ z1 + z2, na.action = na.exclude)
fmy <- lm(y ~ z1 + z2, na.action = na.exclude)
yres <- resid(fmy)
xres <- resid(fmx)
cor(xres, yres, use = "p")
ct <- cor.test(xres, yres)
ct$estimate
ct$p.value

R give me:
r = .65, p = .23

However, SPSS calculates:
r = .46, p = .70

I think something may be different with R's handling of missing data, as
when I replace the NA's with values, R and SPSS give the same r-values,
albeit different p-values still.  I am doing pairwise case exclusion in both
R and SPSS.  Any ideas why I'm getting different values?  Is something wrong
with my formula in R?  Any help would be greatly appreciated.  Thanks!


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Charles C. Berry                            (858) 534-2098
                                            Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu               UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Partial correlations and p-values

Reply via email to