On Sat, 5 Dec 2009, Juliet Hannah wrote:

Your R code looks correct.

There are a couple of hiccups.

First, the degrees of freedom for the partial correlation would be wrong even if there were no missing data: cor.test() on the two residual vectors uses n - 2, but controlling for z1 and z2 costs two more.
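A minimal sketch of that adjustment, assuming the usual t test for a partial correlation with k controlled variables (df = n - 2 - k). The helper name partial_cor_p is made up for illustration and is not part of base R; the numbers in the example call are the complete-cases estimate and case count from this thread's data.

```r
# Hypothetical helper (not from the thread, not in base R): two-sided p-value
# for a partial correlation r estimated from n complete cases while
# controlling for k other variables.
partial_cor_p <- function(r, n, k) {
  df <- n - 2 - k                      # cor.test() on residuals would use n - 2
  stopifnot(df > 0, abs(r) < 1)
  t_stat <- r * sqrt(df / (1 - r^2))   # usual t statistic for a correlation
  p <- 2 * pt(-abs(t_stat), df)        # two-sided p-value
  list(statistic = t_stat, df = df, p.value = p)
}

# Example call: 5 complete cases, 2 controlled variables (z1 and z2)
res <- partial_cor_p(r = 0.657214, n = 5, k = 2)
res$df
res$p.value
```

With so few cases the choice of df matters a great deal, which is one reason the p-values can disagree even when the r-values match.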


Because this is a straightforward calculation, I would be surprised if there
were any differences with SPSS.

There are differences. SPSS seems to use the correlation matrix computed with a pairwise present method and compute partial correlations from that.

Following

        http://wiki.r-project.org/rwiki/doku.php?id=tips:data-matrices:part_corr

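# Correlation matrices: pairwise-present vs. complete cases only
R.pp   <- cor(cbind(x, y, z1, z2), use = 'pairwise.complete.obs')
R.comp <- cor(cbind(x, y, z1, z2), use = 'complete.obs')

# Partial correlations are the negated, rescaled off-diagonal
# elements of the inverse correlation matrix
Rinv <- solve(R.pp)
D <- diag(1 / sqrt(diag(Rinv)))
P <- -D %*% Rinv %*% D
P[1,2]
[1] 0.4596122

# Same calculation starting from the complete-cases matrix
Rinv <- solve(R.comp)
D <- diag(1 / sqrt(diag(Rinv)))
P <- -D %*% Rinv %*% D
P[1,2]
[1] 0.657214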

The pairwise present value seems to be what SPSS is reporting.

The complete-cases value is close to (but not the same as) what you got.
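One way to see where that gap comes from, as a sketch using the data from the original post: the two lm() fits drop different rows, so the x residuals come from a fit that still includes a case the complete-cases calculation leaves out. The names n_x, n_y and n_complete are introduced here only for illustration.

```r
# Data from the original post
x  <- c(1, 20, 14, 30, 9, 4, 8)
y  <- c(5, 6, 7, 9, NA, 10, 6)
z1 <- c(13, 8, 16, 14, 26, 13, 20)
z2 <- c(12, NA, 2, 5, 8, 16, 13)

fmx <- lm(x ~ z1 + z2, na.action = na.exclude)
fmy <- lm(y ~ z1 + z2, na.action = na.exclude)

# na.exclude pads the residuals with NA, so count the non-missing ones
n_x <- sum(!is.na(resid(fmx)))                  # cases used to fit x ~ z1 + z2
n_y <- sum(!is.na(resid(fmy)))                  # cases used to fit y ~ z1 + z2
n_complete <- sum(complete.cases(x, y, z1, z2)) # cases with nothing missing

c(n_x = n_x, n_y = n_y, n_complete = n_complete)
```

If n_x and n_complete differ, the residual-based correlation is not a pure complete-cases estimate, even though cor() only pairs up the rows where both residuals are present.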

A real issue here is how to usefully compute and test partial correlations in the presence of missing data. If you want to pursue that, I would suggest opening a new thread with a subject line like 'partial correlations with missing observations'.

HTH,

Chuck


It may be worthwhile to check
if SPSS gives partial correlations or semipartial correlations. For example,
if you take the correlation between

py <- resid(lm(y ~ z1 + z2,data=mydat2))

and

x

where mydat2 has missing values removed, you get 0.47.
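To make the distinction concrete, here is a small sketch on the complete cases, assuming mydat2 is simply the data from the original post with incomplete rows dropped. The names px, semipartial and partial are introduced here for illustration.

```r
# Data from the original post
x  <- c(1, 20, 14, 30, 9, 4, 8)
y  <- c(5, 6, 7, 9, NA, 10, 6)
z1 <- c(13, 8, 16, 14, 26, 13, 20)
z2 <- c(12, NA, 2, 5, 8, 16, 13)

# Assumption: mydat2 is the data with any row containing an NA removed
mydat2 <- na.omit(data.frame(x, y, z1, z2))

py <- resid(lm(y ~ z1 + z2, data = mydat2))  # y with z1, z2 partialled out
px <- resid(lm(x ~ z1 + z2, data = mydat2))  # x with z1, z2 partialled out

semipartial <- cor(mydat2$x, py)  # z1, z2 removed from y only
partial     <- cor(px, py)        # z1, z2 removed from both x and y

c(semipartial = semipartial, partial = partial)
```

The semipartial correlation can never be larger in absolute value than the partial one, so a smaller reported value is at least consistent with either explanation and needs checking against the SPSS output.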

On Tue, Dec 1, 2009 at 8:24 PM, dadrivr <dadr...@gmail.com> wrote:

I am trying to calculate a partial correlation and p-values.  Unfortunately,
the results in R are different than what SPSS gives.

Here is an example in R (calculating the partial correlation of x and y,
controlling for z1 and z2):

x <- c(1,20,14,30,9,4,8)
y <- c(5,6,7,9,NA,10,6)
z1 <- c(13,8,16,14,26,13,20)
z2 <- c(12,NA,2,5,8,16,13)
fmx <- lm(x ~ z1 + z2, na.action = na.exclude)
fmy <- lm(y ~ z1 + z2, na.action = na.exclude)
yres <- resid(fmy)
xres <- resid(fmx)
cor(xres, yres, use = "p")
ct <- cor.test(xres, yres)
ct$estimate
ct$p.value

R gives me:
r = .65, p = .23

However, SPSS calculates:
r = .46, p = .70

I think something may be different with R's handling of missing data, as
when I replace the NA's with values, R and SPSS give the same r-values,
albeit different p-values still.  I am doing pairwise case exclusion in both
R and SPSS.  Any ideas why I'm getting different values?  Is something wrong
with my formula in R?  Any help would be greatly appreciated.  Thanks!


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Charles C. Berry                            (858) 534-2098
                                            Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu               UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
