z <- with(Data2, tapply(Vendor,Account, I)) n <- vapply(z,length,1) data.frame (Vendor = unlist(z), Account = rep(names(z),n), NumVen = rep(n,n) )
## which gives: Vendor Account NumVen A1 V1 A1 1 A21 V2 A2 3 A22 V3 A2 3 A23 V1 A2 3 A3 V4 A3 1 A4 V2 A4 1 Of course this also works for Data1 Bill may be able to come up with a slicker version, however. Bert Gunter "The trouble with having an open mind is that people keep coming along and sticking things into it." -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) On Tue, Nov 17, 2020 at 3:34 PM Tom Woolman <twool...@ontargettek.com> wrote: > Yes, good catch. Thanks > > > Quoting Bert Gunter <bgunter.4...@gmail.com>: > > > Why 0's in the data frame? Shouldn't that be 1 (vendor with that > account)? > > > > Bert > > Bert Gunter > > > > "The trouble with having an open mind is that people keep coming along > and > > sticking things into it." > > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) > > > > > > On Tue, Nov 17, 2020 at 3:29 PM Tom Woolman <twool...@ontargettek.com> > > wrote: > > > >> Hi Bill. Sorry to be so obtuse with the example data, I was trying > >> (too hard) not to share any actual values so I just created randomized > >> values for my example; of course I should have specified that the > >> random values would not provide the expected problem pattern. I should > >> have just used simple dummy codes as Bill Dunlap did. > >> > >> So per Bill's example data for Data1, the expected (hoped for) output > >> should be: > >> > >> Vendor Account Num_Vendors_Sharing_Bank_Acct > >> 1 V1 A1 0 > >> 2 V2 A2 3 > >> 3 V3 A2 3 > >> 4 V4 A2 3 > >> > >> > >> Where the new calculated variable is Num_Vendors_Sharing_Bank_Acct. > >> The value is 3 for V2, V3 and V4 because they each share bank account > >> A2. > >> > >> > >> Likewise, in the Data2 frame, the same logic applies: > >> > >> Vendor Account Num_Vendors_Sharing_Bank_Acct > >> 1 V1 A1 0 > >> 2 V2 A2 3 > >> 3 V3 A2 3 > >> 4 V1 A2 3 > >> 5 V4 A3 0 > >> 6 V2 A4 0 > >> > >> > >> > >> > >> > >> > >> Thanks! > >> > >> > >> Quoting Bill Dunlap <williamwdun...@gmail.com>: > >> > >> > What should the result be for > >> > Data1 <- data.frame(Vendor=c("V1","V2","V3","V4"), > >> > Account=c("A1","A2","A2","A2")) > >> > ? > >> > > >> > Must each vendor have only one account? If not, what should the > result > >> be > >> > for > >> > Data2 <- data.frame(Vendor=c("V1","V2","V3","V1","V4","V2"), > >> > Account=c("A1","A2","A2","A2","A3","A4")) > >> > ? > >> > > >> > -Bill > >> > > >> > On Tue, Nov 17, 2020 at 1:20 PM Tom Woolman <twool...@ontargettek.com > > > >> > wrote: > >> > > >> >> Hi everyone. I have a dataframe that is a collection of Vendor IDs > >> >> plus a bank account number for each vendor. I'm trying to find a way > >> >> to count the number of duplicate bank accounts that occur in more > than > >> >> one unique Vendor_ID, and then assign the count value for each row in > >> >> the dataframe in a new variable. > >> >> > >> >> I can do a count of bank accounts that occur within the same vendor > >> >> using dplyr and group_by and count, but I can't figure out a way to > >> >> count duplicates among multiple Vendor_IDs. > >> >> > >> >> > >> >> Dataframe example code: > >> >> > >> >> > >> >> #Create a sample data frame: > >> >> > >> >> set.seed(1) > >> >> > >> >> Data <- data.frame(Vendor_ID = sample(1:10000), Bank_Account_ID = > >> >> sample(1:10000)) > >> >> > >> >> > >> >> > >> >> > >> >> Thanks in advance for any help. > >> >> > >> >> ______________________________________________ > >> >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > >> >> https://stat.ethz.ch/mailman/listinfo/r-help > >> >> PLEASE do read the posting guide > >> >> http://www.R-project.org/posting-guide.html > >> >> and provide commented, minimal, self-contained, reproducible code. > >> >> > >> > >> ______________________________________________ > >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > >> https://stat.ethz.ch/mailman/listinfo/r-help > >> PLEASE do read the posting guide > >> http://www.R-project.org/posting-guide.html > >> and provide commented, minimal, self-contained, reproducible code. > >> > > > > [[alternative HTML version deleted]] ______________________________________________ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.