Re: [R] help with reshaping wide to long format

usha2013 Mon, 31 Dec 2012 07:28:38 -0800

 Hi Arun

Thanks for the answer.

1) I was not aware of the "tail" function.


2) Regarding your answer to the second question, I am trying to understand these lines:
BP.stacknormal$Categ<-"Normal"
BP.stackObese$Categ<-"Obese"

What does "Categ" (category) imply? Also, when we bind the two subsets (rbind), the
result of the regression has "time = 14 years" and "Normal weight" as the
reference categories. Is that correct?
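
The reference-level question can be checked directly in R. What follows is a minimal sketch on a toy data frame (the values are invented for illustration and are not taken from BP_2b; the column name Categ2 is hypothetical). By default factor() sorts its levels, so 14 and "Normal" come first and serve as the baseline under glm()'s default treatment contrasts; relevel() changes the baseline.

```r
# Toy data, invented for illustration only (not the BP_2b data).
toy <- data.frame(
  time  = c(14, 21, 14, 21, 14, 21),
  Categ = c("Normal", "Normal", "Obese", "Obese", "Normal", "Obese")
)
toy$time  <- factor(toy$time)
toy$Categ <- factor(toy$Categ)

# The first level of each factor is the reference category under the
# default treatment contrasts used by glm().
levels(toy$time)    # "14" "21"
levels(toy$Categ)   # "Normal" "Obese"

# To use a different baseline, relevel the factor before fitting.
# Categ2 is a hypothetical column name used only in this sketch.
toy$Categ2 <- relevel(toy$Categ, ref = "Obese")
levels(toy$Categ2)  # "Obese" "Normal"
```

So with the default ordering, a coefficient labelled CategObese compares Obese against Normal, and time21 compares age 21 against age 14.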

The question is Obese/Overweight at 14 vs. Normal weight at 14 in relation to
HiBP at 21, so I did the following:
BP.stacknormal<-subset(Copy.of.BP_2,Obese14==0 & Overweight14==0)
nrow(BP.stacknormal)
BP.stackObese<-subset(Copy.of.BP_2,Obese14==1 )
nrow(BP.stackObese)
BP.stackOverweight<-subset(Copy.of.BP_2, Overweight14==1)
nrow(BP.stackOverweight)

How do I do the rest of the regression, since there is no "time" variable in
the original (wide) data?


BP.stacknormal$Categ<-"Normal14"
BP.stackObese$Categ<-"Obese14"
BP.newObeseNormal<-rbind(BP.stacknormal,BP.stackObese)
BP.newObeseNormal$time<-factor(BP.newObeseNormal$time)
BP.newObeseNormal$Categ<- factor(BP.newObeseNormal$Categ)
BPlogit2<-glm(hibp21~time+Categ+time*Categ,data=BP.newObeseNormal,family="binomial")

summary(BPlogit2)

Or should I stack BP.newObeseNormal and then regress?
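
One possible way to frame "weight status at 14 vs. HiBP at 21" is to stay in the wide format, where each person is one row and no "time" term is needed. This is only a sketch on simulated data, not the study data. The column names Obese14, Overweight14 and hibp21 come from the thread; Categ14 is a hypothetical name, and the rule that "Obese" takes precedence when both flags are 1 is an assumption.

```r
set.seed(1)
n <- 200
# Simulated stand-in for the wide data; values are random, not real.
wide <- data.frame(
  Obese14      = rbinom(n, 1, 0.1),
  Overweight14 = rbinom(n, 1, 0.3),
  hibp21       = rbinom(n, 1, 0.2)
)

# Hypothetical three-level status at age 14. Assumption: Obese takes
# precedence when both flags are 1.
wide$Categ14 <- with(wide, ifelse(Obese14 == 1, "Obese",
                           ifelse(Overweight14 == 1, "Overweight", "Normal")))
# Listing "Normal" first makes it the reference category.
wide$Categ14 <- factor(wide$Categ14, levels = c("Normal", "Overweight", "Obese"))

# No time term is needed: each person contributes exactly one row.
fit <- glm(hibp21 ~ Categ14, data = wide, family = binomial)
names(coef(fit))
```

The two non-intercept coefficients are then the log odds ratios of HiBP at 21 for Overweight vs. Normal and Obese vs. Normal at 14.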

How acceptable are missing values?
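
On the mechanics of missing values: with R's default na.action (na.omit), glm() silently drops every row that has an NA in any variable used by the model. A small sketch on invented toy data shows how to check how many rows are actually used; whether that loss is acceptable is a separate, substantive question.

```r
# Toy data with missing values, invented for illustration.
d <- data.frame(
  hibp  = c(0, 1, NA, 0, 0, 0, NA, 1),
  Obese = c(0, 1, 0, NA, 1, 0, 1, 0)
)

# Rows with no NA in any column.
n_complete <- sum(complete.cases(d))
n_complete   # 5

fit <- glm(hibp ~ Obese, data = d, family = binomial)

# Number of observations actually used in the fit.
nobs(fit)    # 5
```

Comparing nrow(d) with nobs(fit) gives a quick view of how much data the model discards.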

I do need a random effects model. Can I fit it in nlme, with ID as the random
effect? I can't seem to apply a geese model to get the contrasts, as there
are only two levels in all of BP, time, Obese/Overweight etc., which leaves
me with only glm and a random effects model. Is my conclusion correct?
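
For reference, lme() in nlme fits Gaussian responses, so a binary outcome such as HiBP is more commonly handled with a generalized linear mixed model, for example glmer() in lme4. The following is a sketch only, on simulated long-format data with a random intercept per CODEA. It assumes lme4 is installed (the call is skipped otherwise), and the simulated effect sizes are arbitrary.

```r
set.seed(42)
n_id <- 150
# Simulated long-format data: two rows (ages 14 and 21) per person.
long <- data.frame(
  CODEA = rep(seq_len(n_id), times = 2),
  time  = factor(rep(c(14, 21), each = n_id)),
  Obese = rbinom(2 * n_id, 1, 0.2)
)
# Person-level random intercept on the logit scale (arbitrary values).
u   <- rnorm(n_id, sd = 1)
eta <- -1.5 + 0.8 * long$Obese + u[long$CODEA]
long$hibp <- rbinom(nrow(long), 1, plogis(eta))

if (requireNamespace("lme4", quietly = TRUE)) {
  # (1 | CODEA) gives each person their own intercept.
  fit <- lme4::glmer(hibp ~ time * Obese + (1 | CODEA),
                     data = long, family = binomial)
  print(lme4::fixef(fit))
}
```

With only two observations per person the random-intercept variance can be poorly estimated, so convergence warnings on real data would not be surprising.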

Just to clarify a couple of things, for different questions:

a) Is overweight/obesity associated with high blood pressure for adolescents and
young adults?
Would a logistic regression of HiBP on Overweight/Obese in the stacked data be
sufficient?


b) What variables, measured at baseline or age 14, are associated with high
blood pressure at ages 14 and 21?

I was thinking of computing correlations between the factors (as numeric). Could
this answer the question?

(P.S. I shall try posting these questions in the general email too.)

Happy N.Y

Thank you so much, Usha








On Mon, Dec 31, 2012 at 10:15 AM, arun kirshna [via R] <
ml-node+s789695n4654269...@n4.nabble.com> wrote:

> HI Usha,
> I tried the code on the full dataset.  This is what I get:
> BP_2b<-read.csv("BP_2b.csv",sep="\t")
> #head(BP_2b,2)
> #  CODEA Sex MaternalAge Education Birthplace AggScore IntScore Obese14
> #1     1  NA           3         4          1       NA       NA      NA
> #2     3   2           3         3          1        0        0       0
>  # Overweight14 Overweight21 Obese21 hibp14 hibp21
> #1           NA           NA      NA     NA     NA
> #2            0            1       0      0      0
> BP.stack1<-
>  reshape(BP_2b,sep="",timevar=
> "time",direction="long",varying=list(names(BP_2b[8:9]),names(BP_2b[10:11]),names(BP_2b[12:13])),v.names=c("Obese","Overweight","HiBP"),idvar="CODEA")
>
>
> # Obese21 was not adjacent to Obese14, so I rearranged the columns to make them adjacent.
> BP_2b<-BP_2b[,c(1:8,11,9:10,12:13)]
> str(BP_2b)
> #'data.frame':    6898 obs. of  13 variables:
> # $ CODEA       : int  1 3 4 7 8 9 10 11 12 13 ...
> # $ Sex         : int  NA 2 2 2 2 1 NA 1 1 2 ...
> # $ MaternalAge : int  3 3 3 4 4 3 3 3 3 3 ...
> # $ Education   : int  4 3 6 6 4 6 3 4 4 4 ...
> # $ Birthplace  : int  1 1 1 1 1 2 1 1 2 2 ...
> # $ AggScore    : int  NA 0 NA 0 0 0 NA 0 NA NA ...
> # $ IntScore    : int  NA 0 NA 0 0 0 NA 0 NA NA ...
> # $ Obese14     : int  NA 0 NA 0 0 0 NA NA NA NA ...
> # $ Obese21     : int  NA 0 NA 1 0 0 NA 0 NA NA ...
> # $ Overweight14: int  NA 0 NA 0 0 0 NA NA NA NA ...
> # $ Overweight21: int  NA 1 NA 1 0 0 NA 0 NA NA ...
> # $ hibp14      : int  NA 0 NA 0 0 0 NA NA NA NA ...
> # $ hibp21      : int  NA 0 NA 0 0 0 NA 0 NA NA ...
>
> BP.stack2<-
>  reshape(BP_2b,sep="",timevar=
> "time",direction="long",varying=list(names(BP_2b[8:9]),names(BP_2b[10:11]),names(BP_2b[12:13])),v.names=c("Obese","Overweight","HiBP"),idvar="CODEA")
>
>  identical(BP.stack1,BP.stack2)
> #[1] FALSE
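
The FALSE above follows from how reshape() pairs columns: each element of the varying list is taken by position, so if Obese21 is not next to Obese14 the stacked "Obese" column mixes two different variables. A toy sketch with invented values illustrates this, along with pairing by name, which does not depend on column order.

```r
# Toy wide data, invented for illustration; the column order mimics the
# original file, where Obese21 is not next to Obese14.
w <- data.frame(
  id = 1:2,
  Obese14 = c(0, 1), Overweight14 = c(1, 1),
  Overweight21 = c(1, 1), Obese21 = c(0, 1)
)

# Pairing by position (columns 2:3 and 4:5) mixes variables:
# "Obese" is built from Obese14 and Overweight14.
bad <- reshape(w, direction = "long", idvar = "id", timevar = "time",
               varying = list(names(w)[2:3], names(w)[4:5]),
               v.names = c("Obese", "Overweight"))

# Pairing by name keeps each variable with its own time points.
good <- reshape(w, direction = "long", idvar = "id", timevar = "time",
                varying = list(c("Obese14", "Obese21"),
                               c("Overweight14", "Overweight21")),
                v.names = c("Obese", "Overweight"), times = c(14, 21))

identical(bad$Obese, good$Obese)  # FALSE
```

Passing times = c(14, 21) also removes the need to recode the time column afterwards.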
>
> library(car)
> BP.stack2$time <- recode(BP.stack2$time,("1=14;2=21"))
> head(BP.stack2,2)
> #    CODEA Sex MaternalAge Education Birthplace AggScore IntScore time Obese
> #1.1     1  NA           3         4          1       NA       NA   14    NA
> #3.1     3   2           3         3          1        0        0   14     0
> #    Overweight HiBP
> #1.1         NA   NA
> #3.1          0    0
>  tail(BP.stack2,2)
> #       CODEA Sex MaternalAge Education Birthplace AggScore IntScore time Obese
> #8555.2  8555   1           3         6          2        0        0   21    NA
> #8556.2  8556   1           3         4          1        0        0   21     0
> #       Overweight HiBP
> #8555.2         NA   NA
> #8556.2          0   NA
> nrow(BP.stack2)
> #[1] 13796
>  length(BP.stack2$time[BP.stack2$time==14])
> #[1] 6898
>  length(BP.stack2$time[BP.stack2$time==21])
> #[1] 6898
>
> #or
> names(BP_2b)[grep("\\d+",names(BP_2b))]<-gsub("(\\D+)(\\d+)","\\1_\\2",names(BP_2b)[grep("\\d+",names(BP_2b))])
>
> BP.stack3<-reshape(BP_2b,dir="long",varying=8:13,sep="_")
>  head(BP.stack3,2)
> #     CODEA Sex MaternalAge Education Birthplace AggScore IntScore time Obese
> #1.14     1  NA           3         4          1       NA       NA   14    NA
> #2.14     3   2           3         3          1        0        0   14     0
> #     Overweight hibp id
> #1.14         NA   NA  1
> #2.14          0    0  2
>  tail(BP.stack3,2)
> #        CODEA Sex MaternalAge Education Birthplace AggScore IntScore time Obese
> #6897.21  8555   1           3         6          2        0        0   21    NA
> #6898.21  8556   1           3         4          1        0        0   21     0
> #        Overweight hibp   id
> #6897.21         NA   NA 6897
> #6898.21          0   NA 6898
>  nrow(BP.stack3)
> #[1] 13796
>  length(BP.stack3$time[BP.stack3$time==21])
> #[1] 6898
>  length(BP.stack3$time[BP.stack3$time==14])
> #[1] 6898
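
To see what the gsub() renaming step does, here is a small sketch on a handful of column names taken from the thread. It also shows that the broad pattern touches every name containing digits (for example C45 becomes C_45), which is harmless when varying is restricted to the time-varying columns. The narrower alternative assumes that only a trailing 14 or 21 marks a time point.

```r
# Column names as they appear in the thread.
nm <- c("CODEA", "Obese14", "Overweight21", "hibp14", "C45", "ra80")

# Broad pattern: insert "_" between the non-digit and digit parts.
has_digit <- grep("\\d+", nm)
renamed <- nm
renamed[has_digit] <- gsub("(\\D+)(\\d+)", "\\1_\\2", nm[has_digit])
renamed
# "CODEA" "Obese_14" "Overweight_21" "hibp_14" "C_45" "ra_80"

# Narrower pattern (assumption: only a trailing 14 or 21 denotes time).
narrow <- sub("(14|21)$", "_\\1", nm)
narrow
# "CODEA" "Obese_14" "Overweight_21" "hibp_14" "C45" "ra80"
```

Either way, reshape(..., sep="_") then splits each varying name at the underscore into a variable name and a time value.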
>
> #So, I guess this solves the problem.  Regarding your second question,
> "Do children who are obese or overweight at age 14 experience different
> change in blood pressure by 21 years, compared to those with normal
> weight at age 14?"
>
> In the Obese or Overweight columns, I am not sure what "0" means.  For
> example, in the Overweight column, does "0" mean normal weight, or do
> both Overweight and Obese have to be "0" for a particular ID (CODEA)
> in each of the age groups?  I guess it should be the latter.  If that is
> the case, subset the data into Normal, Obese, and Overweight and then do the
> comparisons.
> For example:
>  BP.stacknormal<-subset(BP.stack3,Obese==0 & Overweight==0)
>  nrow(BP.stacknormal)
> #[1] 4371
>  head(BP.stacknormal,3)
> #     CODEA Sex MaternalAge Education Birthplace AggScore IntScore time Obese
> #2.14     3   2           3         3          1        0        0   14     0
> #4.14     7   2           4         6          1        0        0   14     0
> #5.14     8   2           4         4          1        0        0   14     0
> #     Overweight hibp id
> #2.14          0    0  2
> #4.14          0    0  4
> #5.14          0    0  5
>
> Similarly, you need to subset for Obese==1 , Overweight==1
> BP.stackObese <-subset(BP.stack3,Obese==1)
>  nrow(BP.stackObese)
> #[1] 530
>
> Regarding the analysis, I think logistic regression should fit in this
> case because the response variable (hibp) is binary.
> #Just a simple logistic regression
> BP.stacknormal$Categ<-"Normal"
>  BP.stackObese$Categ<-"Obese"
>  BP.newObeseNormal<-rbind(BP.stacknormal,BP.stackObese)
> BP.newObeseNormal$time<-factor(BP.newObeseNormal$time)
> BP.newObeseNormal$Categ<- factor(BP.newObeseNormal$Categ)
>
>
> BPlogit<-glm(hibp~time+Categ,data=BP.newObeseNormal,family="binomial")
> #or
> BPlogit<-glm(hibp~time+Categ,data=BP.newObeseNormal,family=binomial(logit))
>
>  summary(BPlogit)
> #
> #Call:
> #glm(formula = hibp ~ time + Categ, family = "binomial", data = BP.newObeseNormal)
> #
> #Deviance Residuals:
>  #   Min       1Q   Median       3Q      Max
> #-0.8630  -0.5452  -0.5452  -0.4965   2.0759
>
> #Coefficients:
>  #           Estimate Std. Error z value Pr(>|z|)
> #(Intercept) -1.83099    0.05768 -31.745   <2e-16 ***
> #time21      -0.20038    0.09290  -2.157    0.031 *
> #CategObese   1.03506    0.12046   8.593   <2e-16 ***
> #---
> #Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> For your question, I guess you also need to add an interaction term to the
> model.  I am not sure whether you need random effects in the model.  If
> that is the case, you need ?lmer() from library(lme4) (?glmer() is the
> variant for a binary response such as hibp). You could also
> post the question on the R mixed models mailing list.
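
As a sketch of the interaction model mentioned above, on simulated data rather than the study data: in an R formula, time * Categ already expands to time + Categ + time:Categ, so writing time + Categ + time*Categ fits the same model. The interaction can be assessed with a likelihood-ratio test against the additive model.

```r
set.seed(7)
n <- 400
# Simulated long-format rows; values are random, not the study data.
d <- data.frame(
  time  = factor(sample(c(14, 21), n, replace = TRUE)),
  Categ = factor(sample(c("Normal", "Obese"), n, replace = TRUE)),
  hibp  = rbinom(n, 1, 0.25)
)

# time * Categ expands to time + Categ + time:Categ.
fit_star <- glm(hibp ~ time * Categ, data = d, family = binomial)
fit_long <- glm(hibp ~ time + Categ + time:Categ, data = d, family = binomial)

names(coef(fit_star))
# "(Intercept)" "time21" "CategObese" "time21:CategObese"
all.equal(coef(fit_star), coef(fit_long))  # TRUE

# Likelihood-ratio test of the interaction against the additive model.
fit_add <- glm(hibp ~ time + Categ, data = d, family = binomial)
anova(fit_add, fit_star, test = "Chisq")
```

The time21:CategObese coefficient is the term that addresses whether the change from 14 to 21 differs between the two weight groups.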
> A.K.
>
>
>
>
> ----- Original Message -----
> From: usha2013 <[hidden email]>
> To: [hidden email]
> Cc:
> Sent: Friday, December 28, 2012 2:02 AM
> Subject: Re: [R] help with reshaping wide to long format
>
> Hi, Sorry, but how did you bring it out?
>
> Thanks
>
> On Fri, Dec 28, 2012 at 8:48 AM, arun kirshna [via R] <[hidden email]> wrote:
>
> > Hi,
> > bp.sub<- structure(list(CODEA = c(1L, 3L, 4L, 7L, 8L, 9L, 10L, 11L, 12L,
> > 13L, 14L, 16L, 17L), C45 = c(NA, 2L, 2L, 2L, 2L, 1L, NA, 1L,
> > 1L, 2L, 1L, 2L, 1L), ragek = c(3L, 3L, 3L, 4L, 4L, 3L, 3L, 3L,
> > 3L, 3L, 3L, 3L, 3L), ra80 = c(4L, 3L, 6L, 6L, 4L, 6L, 3L, 4L,
> > 4L, 4L, 4L, 4L, 4L), ra98 = c(1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L,
> > 2L, 2L, 2L, 3L, 1L), CBCLAggressionAt1410 = c(NA, 0L, NA, 0L,
> > 0L, 0L, NA, 0L, NA, NA, 0L, 0L, NA), CBCLInternalisingAt1410 = c(NA,
> > 0L, NA, 0L, 0L, 0L, NA, 0L, NA, NA, 0L, 0L, NA), Obese14 = c(NA,
> > 0L, NA, 0L, 0L, 0L, NA, NA, NA, NA, 0L, 0L, NA), Obese21 = c(NA,
> > 0L, NA, 1L, 0L, 0L, NA, 0L, NA, NA, 0L, 0L, NA), Overweight14 = c(NA,
> > 0L, NA, 0L, 0L, 0L, NA, NA, NA, NA, 0L, 0L, NA), Overweight21 = c(NA,
> > 1L, NA, 1L, 0L, 0L, NA, 0L, NA, NA, 1L, 0L, NA), hibp14 = c(NA,
> > 0L, NA, 0L, 0L, 0L, NA, NA, NA, NA, 1L, 1L, NA), hibp21 = c(NA,
> > 0L, NA, 0L, 0L, 0L, NA, 0L, NA, NA, 1L, NA, NA)), .Names = c("CODEA",
> > "C45", "ragek", "ra80", "ra98", "CBCLAggressionAt1410",
> > "CBCLInternalisingAt1410",
> > "Obese14", "Obese21", "Overweight14", "Overweight21", "hibp14",
> > "hibp21"), row.names = c(NA, 13L), class = "data.frame")
> >
> > library(car)  # recode() below comes from the car package
> > BP.stack1<- reshape(bp.sub,sep="",timevar="time",direction="long",varying=list(names(bp.sub[8:9]),names(bp.sub[10:11]),names(bp.sub[12:13])),v.names=c("Obese","Overweight","HiBP"),idvar="CODEA")
> >
> > BP.stack1$time <- recode(BP.stack1$time,("1=14;2=21"))
> >  BP.stack1
> > #     CODEA C45 ragek ra80 ra98 CBCLAggressionAt1410 CBCLInternalisingAt1410
> > #1.1      1  NA     3    4    1                   NA                      NA
> > #3.1      3   2     3    3    1                    0                       0
> > #4.1      4   2     3    6    1                   NA                      NA
> > #7.1      7   2     4    6    1                    0                       0
> > #8.1      8   2     4    4    1                    0                       0
> > #9.1      9   1     3    6    2                    0                       0
> > #10.1    10  NA     3    3    1                   NA                      NA
> > #11.1    11   1     3    4    1                    0                       0
> > #12.1    12   1     3    4    2                   NA                      NA
> > #13.1    13   2     3    4    2                   NA                      NA
> > #14.1    14   1     3    4    2                    0                       0
> > #16.1    16   2     3    4    3                    0                       0
> > #17.1    17   1     3    4    1                   NA                      NA
> > #1.2      1  NA     3    4    1                   NA                      NA
> > #3.2      3   2     3    3    1                    0                       0
> > #4.2      4   2     3    6    1                   NA                      NA
> > #7.2      7   2     4    6    1                    0                       0
> > #8.2      8   2     4    4    1                    0                       0
> > #9.2      9   1     3    6    2                    0                       0
> > #10.2    10  NA     3    3    1                   NA                      NA
> > #11.2    11   1     3    4    1                    0                       0
> > #12.2    12   1     3    4    2                   NA                      NA
> > #13.2    13   2     3    4    2                   NA                      NA
> > #14.2    14   1     3    4    2                    0                       0
> > #16.2    16   2     3    4    3                    0                       0
> > #17.2    17   1     3    4    1                   NA                      NA
> >   #   time Obese Overweight HiBP
> > #1.1    14    NA         NA   NA
> > #3.1    14     0          0    0
> > #4.1    14    NA         NA   NA
> > #7.1    14     0          0    0
> > #8.1    14     0          0    0
> > #9.1    14     0          0    0
> > #10.1   14    NA         NA   NA
> > #11.1   14    NA         NA   NA
> > #12.1   14    NA         NA   NA
> > #13.1   14    NA         NA   NA
> > #14.1   14     0          0    1
> > #16.1   14     0          0    1
> > #17.1   14    NA         NA   NA
> > #1.2    21    NA         NA   NA
> > #3.2    21     0          1    0
> > #4.2    21    NA         NA   NA
> > #7.2    21     1          1    0
> > #8.2    21     0          0    0
> > #9.2    21     0          0    0
> > #10.2   21    NA         NA   NA
> > #11.2   21     0          0    0
> > #12.2   21    NA         NA   NA
> > #13.2   21    NA         NA   NA
> > #14.2   21     0          1    1
> > #16.2   21     0          0   NA
> > #17.2   21    NA         NA   NA
> >
> > names(bp.sub)[grep("\\d+",names(bp.sub))]<-gsub("(\\D+)(\\d+)","\\1_\\2",names(bp.sub)[grep("\\d+",names(bp.sub))])
> >
> >  BP.stack2<-reshape(bp.sub,dir="long",varying=8:13,sep="_")
> >  BP.stack2
> >       CODEA C_45 ragek ra_80 ra_98 CBCLAggressionAt_1410
> > 1.14      1   NA     3     4     1                    NA
> > 2.14      3    2     3     3     1                     0
> > 3.14      4    2     3     6     1                    NA
> > 4.14      7    2     4     6     1                     0
> > 5.14      8    2     4     4     1                     0
> > 6.14      9    1     3     6     2                     0
> > 7.14     10   NA     3     3     1                    NA
> > 8.14     11    1     3     4     1                     0
> > 9.14     12    1     3     4     2                    NA
> > 10.14    13    2     3     4     2                    NA
> > 11.14    14    1     3     4     2                     0
> > 12.14    16    2     3     4     3                     0
> > 13.14    17    1     3     4     1                    NA
> > 1.21      1   NA     3     4     1                    NA
> > 2.21      3    2     3     3     1                     0
> > 3.21      4    2     3     6     1                    NA
> > 4.21      7    2     4     6     1                     0
> > 5.21      8    2     4     4     1                     0
> > 6.21      9    1     3     6     2                     0
> > 7.21     10   NA     3     3     1                    NA
> > 8.21     11    1     3     4     1                     0
> > 9.21     12    1     3     4     2                    NA
> > 10.21    13    2     3     4     2                    NA
> > 11.21    14    1     3     4     2                     0
> > 12.21    16    2     3     4     3                     0
> > 13.21    17    1     3     4     1                    NA
> >       CBCLInternalisingAt_1410 time Obese Overweight hibp id
> > 1.14                        NA   14    NA         NA   NA  1
> > 2.14                         0   14     0          0    0  2
> > 3.14                        NA   14    NA         NA   NA  3
> > 4.14                         0   14     0          0    0  4
> > 5.14                         0   14     0          0    0  5
> > 6.14                         0   14     0          0    0  6
> > 7.14                        NA   14    NA         NA   NA  7
> > 8.14                         0   14    NA         NA   NA  8
> > 9.14                        NA   14    NA         NA   NA  9
> > 10.14                       NA   14    NA         NA   NA 10
> > 11.14                        0   14     0          0    1 11
> > 12.14                        0   14     0          0    1 12
> > 13.14                       NA   14    NA         NA   NA 13
> > 1.21                        NA   21    NA         NA   NA  1
> > 2.21                         0   21     0          1    0  2
> > 3.21                        NA   21    NA         NA   NA  3
> > 4.21                         0   21     1          1    0  4
> > 5.21                         0   21     0          0    0  5
> > 6.21                         0   21     0          0    0  6
> > 7.21                        NA   21    NA         NA   NA  7
> > 8.21                         0   21     0          0    0  8
> > 9.21                        NA   21    NA         NA   NA  9
> > 10.21                       NA   21    NA         NA   NA 10
> > 11.21                        0   21     0          1    1 11
> > 12.21                        0   21     0          0   NA 12
> > 13.21                       NA   21    NA         NA   NA 13
> > A.K.
> >
> >
> >
> >
> >
>
>
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/help-with-reshaping-wide-to-long-format-tp4653922p4654122.html
> Sent from the R help mailing list archive at Nabble.com.
>     [[alternative HTML version deleted]]
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>
>
>
>




--
View this message in context: 
http://r.789695.n4.nabble.com/help-with-reshaping-wide-to-long-format-tp4653922p4654305.html
Sent from the R help mailing list archive at Nabble.com.
        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
