Hi,
For your first question,
M1<-as.table(rbind(c(825,2407),c(828,2200)))
dimnames(M1)<- list(gender=c("Male","Female"),
MV=c("Study","NonStudy/missing"))
M1
#        MV
#gender   Study NonStudy/missing
#  Male     825             2407
#  Female   828             2200
Xsq<-chisq.test(M1)
Xsq
# Pearson's Chi-squared test with Yates' continuity correction
#data: M1
#X-squared = 2.5684, df = 1, p-value = 0.109
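You do not need a separate Excel file; the same kind of 2x2 table can be built directly from a data frame. Below is a minimal sketch, assuming a data frame with a Sex column and other columns that may contain NAs. The data frame `toy`, its columns `bp` and `bmi`, and the names `status` and `M_toy` are made up for illustration and are not your actual data.

```r
# Hypothetical toy data standing in for the real data frame
toy <- data.frame(
  Sex = c("Male", "Male", "Male", "Female", "Female", "Female", "Male", "Female"),
  bp  = c(120, NA, NA, 118, NA, 125, 130, NA),
  bmi = c(22, 25, NA, 21, 24, NA, 27, 23)
)

# Label each row as complete ("Study") or incomplete ("NonStudy/missing")
status <- ifelse(complete.cases(toy), "Study", "NonStudy/missing")
status <- factor(status, levels = c("Study", "NonStudy/missing"))

# Cross-tabulate gender against the missingness label
M_toy <- table(gender = toy$Sex, MV = status)
M_toy

# Row percentages: each gender's row sums to 100
round(prop.table(M_toy, margin = 1) * 100, 2)
```

With the real data frame in place of `toy`, the resulting table could be passed to chisq.test() in the same way as M1 above. The toy counts are far too small for the test itself to be meaningful.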
I will take a look at your second question later.
A.K.
________________________________
From: Usha Gurunathan <[email protected]>
To: arun <[email protected]>
Sent: Sunday, January 13, 2013 1:51 AM
Subject: Re: [R] random effects model
Hi A.K.,
Thanks a lot for explaining that.
1. For the chi-square test (to find out whether the difference between the
groups is significant), do I have to create a separate Excel file and make a
data frame from it? How do I go about it?
On Sun, Jan 13, 2013 at 1:22 PM, arun <[email protected]> wrote:
>Hi,
>
>table(BP_2b$Sex) #original dataset
># 1 2
>#3232 3028
> nrow(BP_2b)
>#[1] 6898
> nrow(BP_2bSexNoMV)
>#[1] 6260
> 6898-6260
>#[1] 638 #these rows were removed from the BP_2b to create BP_2bSexNoMV
>BP_2bSexMale<-BP_2bSexNoMV[BP_2bSexNoMV$Sex=="Male",]
> nrow(BP_2bSexMale)
>#[1] 3232
> nrow(BP_2bSexMale[!complete.cases(BP_2bSexMale),]) #Missing rows with Male
>#[1] 2407
> nrow(BP_2bSexMale[complete.cases(BP_2bSexMale),]) #Non missing rows with Male
>#[1] 825
>
>
>You did the chi-square test on the new dataset with 6260 rows, right?
>I removed those 638 rows because they don't belong to either male or
>female, and you want the % of missing values per male or female. So, I thought
>keeping them would bias the results. If you want to include those rows, you
>could, but I don't know where you would put them, as they
>cannot be classified as belonging specifically to males or females. I hope
>that makes sense.
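To illustrate where a gap like 6898 - 6260 = 638 comes from, here is a small sketch. It assumes only that some rows have NA in the Sex column; the data frame `toy` and the name `toy_noNA` are hypothetical and are not BP_2b. By default table() leaves NA out of its counts, so those rows show up only when you ask for them.

```r
# Hypothetical toy data: two rows have no recorded Sex
toy <- data.frame(Sex = c("Male", "Female", NA, "Male", NA, "Female"),
                  bp  = c(120, NA, 110, NA, NA, 125))

table(toy$Sex)                   # NA rows are left out of the counts
table(toy$Sex, useNA = "ifany")  # NA rows shown as their own category
sum(is.na(toy$Sex))              # number of rows with missing Sex

# Drop the rows with missing Sex, as was done to get the smaller dataset
toy_noNA <- toy[!is.na(toy$Sex), ]
nrow(toy) - nrow(toy_noNA)       # equals the number of NA rows in Sex
```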
>
>Sometimes maintainers are a bit slow to respond. You may have to send an email
>reminding him again.
>
>Regarding the vmv package, you could email Waqas Ahmed Malik
>([email protected]) about options for changing the title, the
>font, etc.
>You could also use this link
>(http://www.r-bloggers.com/visualizing-missing-data-2/ ) to plot missing values
>(?plot.missing()). I have never used that package, but you could try it. It looks
>like it gives more information.
>
>A.K.
>
>________________________________
>From: Usha Gurunathan <[email protected]>
>To: arun <[email protected]>
>Sent: Saturday, January 12, 2013 9:05 PM
>
>Subject: Re: [R] random effects model
>
>
>Hi A.K
>
>So the number of females missing / total female participants enrolled is 72.65%.
>The number of females missing / total participants enrolled (males + females)
>is 35.14%.
>
>The totals in the master data are Males: 3232, Females: 3028 (I got these
>before removing any missing values)
>
>with table(Copy.of.BP_2$Sex) ## BP
>
>
>If I were to write a table (and do a chi sq. later),
>
>Gender   Study           Non-study/missing   Total
>Male      825 (25.53%)   2407 (74.47%)       3232 (100%)
>Female    828 (27.35%)   2200 (72.65%)       3028 (100%)
>Total    1653            4607                6260
>
>
>The problem is that when I ran
>>colSums(is.na(Copy.of.BP_2)), the Sex column showed N=638.
>
>I cannot understand the discrepancy. Also, when you said to remove the
>NAs, aren't those missing values that need to be included in the total number
>missing? I am a bit confused. Can you help?
>
>## I tried sending an email to the gee pack maintainer at the address given on
>the R site, but the mail didn't go through.
>
>Many thanks
>
>On Sun, Jan 13, 2013 at 9:17 AM, arun <[email protected]> wrote:
>
>Hi,
>>Yes, you are right. 72.65522% of the rows for females have missing values.
>>Taken as a share of the whole dataset (the combined total of Males+Females
>>after removing the NAs from the variable "Sex"), those female rows are 35.14377%.
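The two sets of percentages differ only in the denominator. A short sketch of the arithmetic, using the counts already quoted in this thread (the variable names `missing_n` and `total_n` are just for illustration):

```r
# Rows with missing values, and all rows, per gender (counts from this thread)
missing_n <- c(Female = 2200, Male = 2407)
total_n   <- c(Female = 3028, Male = 3232)

# Denominator = rows of that gender: percentage missing within each gender
missing_n / total_n * 100
#   Female     Male
# 72.65522 74.47401

# Denominator = all 6260 rows: each gender's missing rows as a share of the total
missing_n / sum(total_n) * 100
#   Female     Male
# 35.14377 38.45048
```

The first pair answers "what share of females (or males) have missing values?", which is usually the figure to report when comparing the two groups.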
>>
>>A.K.
>>
>>
>>
>>________________________________
>>From: Usha Gurunathan <[email protected]>
>>To: arun <[email protected]>
>>Cc: R help <[email protected]>
>>Sent: Saturday, January 12, 2013 5:59 PM
>>
>>Subject: Re: [R] random effects model
>>
>>
>>
>>Hi AK
>>That works. I was trying to get similar results from any other package.
>>Being a beginner, I was not sure how to modify the syntax to get my output.
>>
>>lapply(split(BP_2bSexNoMV,BP_2bSexNoMV$Sex),function(x)
>>(nrow(x[!complete.cases(x[,-2]),])/nrow(x))*100)
>>#gives the percentage of rows with missing values out of all rows for
>>#Males and Females
>>#$Female
>>#[1] 72.65522
>>#
>>#$Male
>>#[1] 74.47401
>>
>>#If you want the percentage out of the total number of rows for Males and
>>#Females (without NAs in the Sex column)
>>lapply(split(BP_2bSexNoMV,BP_2bSexNoMV$Sex),function(x)
>>(nrow(x[!complete.cases(x[,-2]),])/nrow(BP_2bSexNoMV))*100)
>>#$Female
>>#[1] 35.14377
>>#
>>#$Male
>>#[1] 38.45048
>>
>>How do I interpret the two different results above? Were 72.66% of values
>>missing among female participants? Could you please clarify?
>>
>>Many thanks.
>>
>>
>>On Sun, Jan 13, 2013 at 3:28 AM, arun <[email protected]> wrote:
>>
>>>lapply(split(BP_2bSexNoMV,BP_2bSexNoMV$Sex),function(x)
>>>(nrow(x[!complete.cases(x[,-2]),])/nrow(x))*100)
>>>#gives the percentage of rows with missing values out of all rows for
>>>#Males and Females
>>>#$Female
>>>#[1] 72.65522
>>>#
>>>#$Male
>>>#[1] 74.47401
>>>
>>>#If you want the percentage out of the total number of rows for Males and
>>>#Females (without NAs in the Sex column)
>>>lapply(split(BP_2bSexNoMV,BP_2bSexNoMV$Sex),function(x)
>>>(nrow(x[!complete.cases(x[,-2]),])/nrow(BP_2bSexNoMV))*100)
>>>#$Female
>>>#[1] 35.14377
>>>#
>>>#$Male
>>>#[1] 38.45048
>>
>
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.