[R] counting duplicate items that occur in multiple groups
Tom Woolman
twoolman using ontargettek.com
Wed Nov 18 00:29:39 CET 2020
Hi Bill. Sorry to be so obtuse with the example data. I was trying
(too hard) not to share any actual values, so I just created randomized
values for my example. Of course, I should have specified that the
random values would not produce the duplicate pattern I am asking
about. I should have just used simple dummy codes as Bill Dunlap did.
So per Bill's example data for Data1, the expected (hoped for) output
should be:
  Vendor Account Num_Vendors_Sharing_Bank_Acct
1     V1      A1                             0
2     V2      A2                             3
3     V3      A2                             3
4     V4      A2                             3
Where the new calculated variable is Num_Vendors_Sharing_Bank_Acct.
The value is 3 for V2, V3 and V4 because they each share bank account
A2, and 0 for V1 because no other vendor uses account A1.
Likewise, in the Data2 frame, the same logic applies:
  Vendor Account Num_Vendors_Sharing_Bank_Acct
1     V1      A1                             0
2     V2      A2                             3
3     V3      A2                             3
4     V1      A2                             3
5     V4      A3                             0
6     V2      A4                             0
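
For what it's worth, here is a rough base-R sketch of the logic behind the two tables above, not a definitive solution. It assumes the count is the number of distinct vendors per account, reported as 0 when only one vendor uses the account. The helper name count_sharing is made up for illustration; Data1 and Data2 are Bill's example frames.

```r
# Bill's example data frames
Data1 <- data.frame(Vendor  = c("V1", "V2", "V3", "V4"),
                    Account = c("A1", "A2", "A2", "A2"))
Data2 <- data.frame(Vendor  = c("V1", "V2", "V3", "V1", "V4", "V2"),
                    Account = c("A1", "A2", "A2", "A2", "A3", "A4"))

# Hypothetical helper: for each row, count the distinct vendors that use
# that row's account, and report 0 when the account is not shared.
count_sharing <- function(df) {
  # ave() applies FUN to the row indices within each Account group and
  # returns one value per original row
  n_vendors <- ave(seq_along(df$Vendor), df$Account,
                   FUN = function(i) length(unique(df$Vendor[i])))
  df$Num_Vendors_Sharing_Bank_Acct <- ifelse(n_vendors > 1L, n_vendors, 0L)
  df
}

count_sharing(Data1)
count_sharing(Data2)
```

Working on row indices rather than on the Vendor column itself keeps this independent of whether the columns are stored as character or factor.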
Thanks!
Quoting Bill Dunlap <williamwdunlap using gmail.com>:
> What should the result be for
> Data1 <- data.frame(Vendor=c("V1","V2","V3","V4"),
> Account=c("A1","A2","A2","A2"))
> ?
>
> Must each vendor have only one account? If not, what should the result be
> for
> Data2 <- data.frame(Vendor=c("V1","V2","V3","V1","V4","V2"),
> Account=c("A1","A2","A2","A2","A3","A4"))
> ?
>
> -Bill
>
> On Tue, Nov 17, 2020 at 1:20 PM Tom Woolman <twoolman using ontargettek.com>
> wrote:
>
>> Hi everyone. I have a dataframe that is a collection of Vendor IDs
>> plus a bank account number for each vendor. I'm trying to find a way
>> to count the number of duplicate bank accounts that occur in more than
>> one unique Vendor_ID, and then assign the count value for each row in
>> the dataframe in a new variable.
>>
>> I can do a count of bank accounts that occur within the same vendor
>> using dplyr and group_by and count, but I can't figure out a way to
>> count duplicates among multiple Vendor_IDs.
>>
>> Dataframe example code:
>>
>> # Create a sample data frame:
>> set.seed(1)
>> Data <- data.frame(Vendor_ID = sample(1:10000),
>>                    Bank_Account_ID = sample(1:10000))
>>
>> Thanks in advance for any help.