Re: [R] Simple Stacking of Two Columns

Ebert,Timothy Aaron Tue, 04 Apr 2023 06:41:46 -0700

Originally this post was only meant to compare execution times for different 
approaches to solving this problem.
Now I have a question:
   I changed the code for calculating a1 from c(c1, c2) to data.frame(c(c1,c2)). 
This changed the reported execution times of all the other methods. What am I 
missing?

Original
For efficiency, the answer Avi provided is still the best option with these 
data. The append() method is next. Both of these approaches avoid having to 
make a data frame in the wide format. The slowest method is pivot_longer(). 
Note that the order of elements is different in the pivot_longer() approach. If 
the order matters then some of these answers will need sorting to get the 
correct output. Also note that a1 and a2 are vectors, while the others are data 
frames. However, all of these appear correct from our understanding of the 
problem.
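
To make the ordering point concrete, here is a small base-R sketch (not part of 
the benchmark; the names col_order and row_order are made up for illustration, 
and R 4.0 or later is assumed so that data.frame() keeps strings as character). 
It contrasts the column-by-column order that c(), append(), stack() and 
unlist() give with the row-by-row order that pivot_longer() returns.

```r
NamesWide <- data.frame(Name1 = c("Tom", "Dick"), Name2 = c("Larry", "Curly"))

# Column-by-column: all of Name1, then all of Name2.
col_order <- unlist(NamesWide, use.names = FALSE)

# Row-by-row: transpose first so that each row's values become adjacent
# when the matrix is flattened.
row_order <- as.vector(t(as.matrix(NamesWide)))

print(col_order)  # "Tom" "Dick" "Larry" "Curly"
print(row_order)  # "Tom" "Larry" "Dick" "Curly"

# Same values either way; only the order differs.
identical(sort(col_order), sort(row_order))
```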

library(tidyverse)
library(microbenchmark)
c1 <- c("Tom","Dick")
c2 <- c("Larry","Curly")
res <- microbenchmark(a1 <- c(c1, c2),
                      a2 <- append(c1, c2),
                      a3 <- {c3 <- data.frame(Name1=c1, Name2=c2)
                        stack(c3)},
                      a4 <- {c3 <- data.frame(Name1=c1, Name2=c2)
                        data.frame(Names=with(c3, c(Name1, Name2)))},
                      a5 <- {c3 <- data.frame(Name1=c1, Name2=c2)
                        data.frame(Names=unlist(c3), row.names=NULL)},
                      a6 <- {c3 <- data.frame(Name1=c1, Name2=c2)
                        pivot_longer(c3, cols=everything(),names_to="Names")},
                      a7 <- {c3 <- data.frame(Name1=c1, Name2=c2)
                        data.frame(Names=c(c3$Name1,c3$Name2))},
                      times=100L)
print(res)

Mean execution times for seven different methods where a1 <- c(c1,c2)
Method   Mean (ns)   CLD
a1            1998   a
a2            5749   a
a3         1055501   b
a4          592548   b
a5          682491   b
a6         6962660   c
a7          608337   b

Mean execution times for seven different methods where a1 <- 
data.frame(c(c1,c2))
Method   Mean (us)   CLD
a1         272.467   b
a2           5.768   a
a3         907.171   d
a4         561.863   c
a5         581.989   c
a6        6371.465   e
a7         552.208   c
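
A possible explanation for the apparent change, offered as a reading of the 
numbers rather than a confirmed diagnosis: print() on a microbenchmark object 
chooses its display unit from the timings themselves, so a slower a1 may have 
switched the report from nanoseconds to microseconds (5749 and 5.768 for a2 
would then be the same time). A minimal sketch that pins the unit so two runs 
can be compared directly, assuming the microbenchmark package is installed; 
the labels vec and df are made up for illustration:

```r
library(microbenchmark)

c1 <- c("Tom", "Dick")
c2 <- c("Larry", "Curly")

res <- microbenchmark(
  vec = c(c1, c2),
  df  = data.frame(c(c1, c2)),
  times = 100L
)

# Fix the unit explicitly instead of letting print() choose it.
print(res, unit = "us")

# summary() accepts the same argument and returns a data frame of statistics.
s <- summary(res, unit = "us")
s[, c("expr", "mean", "median")]
```

With the unit fixed, the rows for methods that were not changed should be 
roughly comparable from one run to the next, apart from ordinary timing noise.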

-----Original Message-----
From: R-help <r-help-boun...@r-project.org> On Behalf Of Richard O'Keefe
Sent: Tuesday, April 4, 2023 8:21 AM
To: Sparks, John <jspa...@uic.edu>
Cc: r-help@r-project.org
Subject: Re: [R] Simple Stacking of Two Columns

Just to repeat:
you have

   NamesWide<-data.frame(Name1=c("Tom","Dick"),Name2=c("Larry","Curly"))

and you want

   NamesLong<-data.frame(Names=c("Tom","Dick","Larry","Curly"))

There must be something I am missing, because

   NamesLong <- data.frame(Names = c(NamesWide$Name1, NamesWide$Name2))

appears to do the job in the simplest possible manner.  There are all sorts of 
alternatives, such as
   data.frame(Name = as.vector(as.matrix(NamesWide[, 1:2])))

As for stack(), the typo (Names2 for Name2) triggered the error shown, but 
stack() also expects a single data frame or list rather than two vectors.

> stack(NamesWide)
  values   ind
1    Tom Name1
2   Dick Name1
3  Larry Name2
4  Curly Name2

If there were multiple columns, you might do

> stack(NamesWide[,c("Name1","Name2")])$values
[1] "Tom"   "Dick"  "Larry" "Curly"
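
For more than two columns the same idea carries over without naming each 
column. A base-R sketch, assuming R 4.0 or later (so that data.frame() keeps 
strings as character); the third column Name3, its values, and the object 
names NamesWide3, NamesLong3 and stacked are invented for illustration:

```r
NamesWide3 <- data.frame(
  Name1 = c("Tom", "Dick"),
  Name2 = c("Larry", "Curly"),
  Name3 = c("Moe", "Harry")   # hypothetical extra column
)

# unlist() concatenates the columns in order; use.names = FALSE drops the
# element names that unlist() would otherwise generate.
NamesLong3 <- data.frame(Names = unlist(NamesWide3, use.names = FALSE))

# stack() returns the same values plus the source column name in `ind`.
stacked <- stack(NamesWide3)

print(NamesLong3)
identical(NamesLong3$Names, as.character(stacked$values))
```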

On Tue, 4 Apr 2023 at 03:09, Sparks, John <jspa...@uic.edu> wrote:

> Hi R-Helpers,
>
> Sorry to bother you, but I have a simple task that I can't figure out 
> how to do.
>
> For example, I have some names in two columns
>
> NamesWide<-data.frame(Name1=c("Tom","Dick"),Name2=c("Larry","Curly"))
>
> and I simply want to get a single column
> NamesLong<-data.frame(Names=c("Tom","Dick","Larry","Curly"))
> > NamesLong
>   Names
> 1   Tom
> 2  Dick
> 3 Larry
> 4 Curly
>
>
> Stack produces an error
> NamesLong<-stack(NamesWide$Name1,NamesWide$Names2)
> Error in if (drop) { : argument is of length zero
>
> So does bind_rows
> > NamesLong<-dplyr::bind_rows(NamesWide$Name1,NamesWide$Name2)
> Error in `dplyr::bind_rows()`:
> ! Argument 1 must be a data frame or a named atomic vector.
> Run `rlang::last_error()` to see where the error occurred.
>
> I tried making separate dataframes to get around the error in 
> bind_rows but it puts the data in two different columns
> Name1<-data.frame(c("Tom","Dick"))
> Name2<-data.frame(c("Larry","Curly"))
> NamesLong<-dplyr::bind_rows(Name1,Name2)
> > NamesLong
>   c..Tom....Dick.. c..Larry....Curly..
> 1              Tom                <NA>
> 2             Dick                <NA>
> 3             <NA>               Larry
> 4             <NA>               Curly
>
> gather makes no change to the data
> NamesLong<-gather(NamesWide,Name1,Name2)
> > NamesLong
>   Name1 Name2
> 1   Tom Larry
> 2  Dick Curly
>
>
> Please help me solve what should be a very simple problem.
>
> Thanks,
> John Sparks

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
