[R] repeating rows in R
I'm a fairly new user and have been trying to figure out how to repeat rows a certain number of times based on a variable. Currently, the number of rows does not reflect the number of observations. To get the number of observations (n=22 in this case), I have to multiply each row by the variable NoRecords. So there are actually 7 cases of cancer and 3 cases of HIV in my data, not 2 cases of HIV and 1 case of cancer. Is there an easy way to expand my data so that I actually end up with 22 rows instead of 10?

Specifically, my data currently look like this:

> my data
   AdmitYear Race Age.yrs. Insurance HIV Cancer NoRecords
1       1985    A       20         0   0      0         1
2       1985    A       21         0   0      0         1
3       1985    A       22         1   1      0         1
4       1985    A       23         0   0      0         2
5       1985    A       24         0   1      0         2
6       1985    A       24         1   0      0         1
7       1985    A       25         1   0      0         3
8       1985    A       26         0   0      0         2
9       1985    A       26         1   0      1         7
10      1985    A       27         0   0      0         2

thanks!
Andy

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] repeating rows in R
Thanks!

On Tue, Jul 27, 2010 at 4:29 PM, wrote:

> Is this the kind of thing you are looking for?
>
> > dat <- data.frame(x = 1:3, freq = 2:4)
> > dat
>   x freq
> 1 1    2
> 2 2    3
> 3 3    4
> > newDat <- dat[rep(rownames(dat), dat$freq), ]
> > newDat
>     x freq
> 1   1    2
> 1.1 1    2
> 2   2    3
> 2.1 2    3
> 2.2 2    3
> 3   3    4
> 3.1 3    4
> 3.2 3    4
> 3.3 3    4
>
> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Andrew Anglemyer
> Sent: Wednesday, 28 July 2010 9:22 AM
> To: r-help@r-project.org
> Subject: [R] repeating rows in R

--
Andrew Anglemyer, PhD MPH
Methods and Statistics Editor, Cochrane HIV/AIDS Group
Institute for Global Health, University of California, SF
Research Postdoctoral Fellow, Department of Pediatrics, Infectious Diseases
Stanford University, Stanford, CA 94305
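As a minimal sketch, the rep()-based indexing from the reply above can be applied to the data in the original question. The data frame name my.data is hypothetical, the values are typed in by hand from the posted table, and seq_len(nrow()) is used here in place of rownames() purely as an illustrative variation.

```r
# Data re-entered by hand from the posted table (my.data is a hypothetical name).
my.data <- data.frame(
  AdmitYear = 1985,
  Race      = "A",
  Age.yrs.  = c(20, 21, 22, 23, 24, 24, 25, 26, 26, 27),
  Insurance = c(0, 0, 1, 0, 0, 1, 1, 0, 1, 0),
  HIV       = c(0, 0, 1, 0, 1, 0, 0, 0, 0, 0),
  Cancer    = c(0, 0, 0, 0, 0, 0, 0, 0, 1, 0),
  NoRecords = c(1, 1, 1, 2, 2, 1, 3, 2, 7, 2)
)

# Repeat each row index NoRecords times, then index the data frame with it.
idx <- rep(seq_len(nrow(my.data)), times = my.data$NoRecords)
expanded <- my.data[idx, ]

# Optional: replace the 1, 1.1, 2, ... row names with a plain sequence.
rownames(expanded) <- NULL

nrow(expanded)        # total number of observations
sum(expanded$HIV)     # HIV cases after expansion
sum(expanded$Cancer)  # cancer cases after expansion
```

With these values the expanded data frame should have 22 rows, with 3 HIV cases and 7 cancer cases, which matches the counts given in the question.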
[R] problem with rbind when data frame contains an date-time variable "POSIXt" "POSIXlt"
I'm trying to rbind two data frames, both with the same column names. One of the columns is a date-time variable, and this variable is causing the rbind to fail with the error

"Error in names(value[[jj]])[ri] <- nm : 'names' attribute [7568] must be the same length as the vector [9]"

Is there a way to stack or rbind these two data frames even with this extended date-time variable? The class of event.date.time in each data frame is "POSIXt" "POSIXlt".

x
ID event.date.time
 1 2009-07-23 00:20:00
 2 2009-08-18 16:25:00
 3 2009-08-13 08:30:00

y
ID event.date.time
 4 2009-08-25 10:25:00
 5 2009-08-10 06:20:00
 6 2009-10-09 08:20:00

I would like to get

z
ID event.date.time
 1 2009-07-23 00:20:00
 2 2009-08-18 16:25:00
 3 2009-08-13 08:30:00
 4 2009-08-25 10:25:00
 5 2009-08-10 06:20:00
 6 2009-10-09 08:20:00

I've looked at stripping the dates and times, but it would be really helpful for my purposes to keep the extended date-time variable (I eventually have to get 24 hours from baseline.date.time).

thanks for any and all help!
Andy
Re: [R] problem with rbind when data frame contains an date-time variable "POSIXt" "POSIXlt"
Thanks again for your help. I was sniffing around this solution forever, but that did it cleanly and without any problems.

On Thu, Feb 17, 2011 at 10:21 PM, wrote:

> The solution is probably to make the date-time columns POSIXct:
>
> > x <- read.table(textConnection("
> + ID event.date.time
> + 1 '2009-07-23 00:20:00'
> + 2 '2009-08-18 16:25:00'
> + 3 '2009-08-13 08:30:00'
> + "), header = TRUE)
> > y <- read.table(textConnection("
> + ID event.date.time
> + 4 '2009-08-25 10:25:00'
> + 5 '2009-08-10 06:20:00'
> + 6 '2009-10-09 08:20:00'
> + "), header = TRUE)
> > closeAllConnections()
> >
> > x <- within(x,
> +   event.date.time <- as.POSIXct(as.character(event.date.time),
> +                                 format = "%Y-%m-%d %H:%M:%S"))
> > y <- within(y,
> +   event.date.time <- as.POSIXct(as.character(event.date.time),
> +                                 format = "%Y-%m-%d %H:%M:%S"))
> > z <- rbind(x, y)
> > z
>   ID     event.date.time
> 1  1 2009-07-23 00:20:00
> 2  2 2009-08-18 16:25:00
> 3  3 2009-08-13 08:30:00
> 4  4 2009-08-25 10:25:00
> 5  5 2009-08-10 06:20:00
> 6  6 2009-10-09 08:20:00
> >
>
> No problems.
>
> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Andrew Anglemyer
> Sent: Friday, 18 February 2011 3:54 PM
> To: r-help@r-project.org
> Subject: [R] problem with rbind when data frame contains an date-time variable "POSIXt" "POSIXlt"
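A rough sketch of the same idea when the columns already hold POSIXlt values (for example from strptime()) rather than text read from a file. The tz = "UTC" setting and the plus.24h column name are assumptions made for illustration; the original data's time zone is not stated in the thread.

```r
fmt <- "%Y-%m-%d %H:%M:%S"

# strptime() returns POSIXlt; assigning it with $<- leaves the column as POSIXlt.
x <- data.frame(ID = 1:3)
x$event.date.time <- strptime(
  c("2009-07-23 00:20:00", "2009-08-18 16:25:00", "2009-08-13 08:30:00"),
  format = fmt, tz = "UTC")

y <- data.frame(ID = 4:6)
y$event.date.time <- strptime(
  c("2009-08-25 10:25:00", "2009-08-10 06:20:00", "2009-10-09 08:20:00"),
  format = fmt, tz = "UTC")

# Convert both columns to POSIXct (a plain numeric vector underneath)
# before stacking the data frames.
x$event.date.time <- as.POSIXct(x$event.date.time)
y$event.date.time <- as.POSIXct(y$event.date.time)
z <- rbind(x, y)

# The date-time information is kept, so offsets such as "24 hours later"
# can still be computed (plus.24h is a hypothetical column name).
z$plus.24h <- z$event.date.time + as.difftime(24, units = "hours")
z
```

POSIXct stores each date-time as a single number, which is why it tends to behave better than the list-based POSIXlt inside data frames.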
[R] combining two columns into one column despite NAs
I am trying to combine two columns in a data frame into one column. Some values in either column are missing, but not in the same row for the two different columns. Additionally, when both columns in a row contain data, the data are identical. I want a new column with the identical data or the data from the column with observed data. For example:

I have
> data
  id  x  y
1  a  1 NA
2  b  2  2
3  c  3  3
4  d NA  4

And I want
> new.data
  id  x  y z
1  a  1 NA 1
2  b  2  2 2
3  c  3  3 3
4  d NA  4 4

I've looked through the help and there are column-combining solutions, but they don't seem to work well for this problem.

Thanks for any help!
Andy
Re: [R] combining two columns into one column despite NAs
Thanks! Unfortunately, in my effort to simplify the question, I didn't adequately describe the problem. This solution is perfect in the numeric case I presented, but what about character columns? Let me try again:

I have
> character.data
  id    x    y
1  1  "a"   NA
2  2  "b"  "b"
3  3  "c"  "c"
4  4   NA  "d"

And I want
> new.character.data
  id    x    y    z
1  1  "a"   NA  "a"
2  2  "b"  "b"  "b"
3  3  "c"  "c"  "c"
4  4   NA  "d"  "d"

Thanks again!

On Thu, Feb 24, 2011 at 4:27 PM, Ista Zahn wrote:

> I think the easiest way is probably
>
> data$z <- rowMeans(data[, c("x", "y")], na.rm=TRUE)
>
> Best,
> Ista
>
> --
> Ista Zahn
> Graduate student
> University of Rochester
> Department of Clinical and Social Psychology
> http://yourpsyche.org
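For reference, a small sketch of the quoted rowMeans() suggestion on the numeric example from the first message. The data frame is re-entered by hand, and the reasoning in the comments is an interpretation rather than something stated in the thread.

```r
# Numeric example re-entered from the original question.
data <- data.frame(id = c("a", "b", "c", "d"),
                   x  = c(1, 2, 3, NA),
                   y  = c(NA, 2, 3, 4))

# With na.rm = TRUE each row mean is taken over the non-missing values only.
# Because x and y are identical whenever both are present, the mean equals
# the observed value in every row.
data$z <- rowMeans(data[, c("x", "y")], na.rm = TRUE)
data
```

rowMeans() needs numeric input, which is why this approach does not carry over to the character columns described above.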
Re: [R] combining two columns into one column despite NAs
Thanks for all the solutions!

On Thu, Feb 24, 2011 at 5:37 PM, Sarah Goslee wrote:

> What about:
>
> ifelse(is.na(x), y, x)
>
> as long as x and y are always the same where one is not NA.
>
> Sarah
>
> --
> Sarah Goslee
> http://www.functionaldiversity.org
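A minimal sketch of the ifelse() suggestion applied to the character example. It assumes x and y are stored as character vectors, hence the explicit stringsAsFactors = FALSE; if they were factors, ifelse() would likely return integer codes instead of the labels, so converting with as.character() first would be the safer route.

```r
# Character example re-entered from the follow-up message.
character.data <- data.frame(id = 1:4,
                             x  = c("a", "b", "c", NA),
                             y  = c(NA, "b", "c", "d"),
                             stringsAsFactors = FALSE)

# Take y where x is missing, otherwise x. This relies on x and y agreeing
# whenever both are present, as stated in the question.
character.data$z <- with(character.data, ifelse(is.na(x), y, x))
character.data
```

The same expression also works for the numeric example, since ifelse() only needs the two columns to be of a compatible type.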