[R] counting duplicate items that occur in multiple groups

Tom Woolman <twoolman using ontargettek.com>
Wed Nov 18 11:25:54 CET 2020


Thanks, everyone!



Quoting Jim Lemon <drjimlemon using gmail.com>:

> Oops, I sent this to Tom earlier today and forgot to copy to the list:
>
> # simulated data: 10 vendors with 5 rows each, accounts drawn at random
> VendorID <- rep(paste0("V", 1:10), each = 5)
> AcctID <- paste0("A", sample(1:5, 50, TRUE))
> Data <- data.frame(VendorID, AcctID)
> table(Data)
> # number of distinct vendors using each account
> dupAcctID <- colSums(table(Data) > 0)
> Data$dupAcct <- NA
> # fill in the new column, one account at a time
> for (i in seq_along(dupAcctID))
>   Data$dupAcct[Data$AcctID == names(dupAcctID)[i]] <- dupAcctID[i]
>
> Jim
>
> On Wed, Nov 18, 2020 at 8:20 AM Tom Woolman <twoolman using ontargettek.com>
> wrote:
>
>> Hi everyone.  I have a dataframe that is a collection of Vendor IDs
>> plus a bank account number for each vendor. I'm trying to find a way
>> to count the number of duplicate bank accounts that occur in more than
>> one unique Vendor_ID, and then assign the count value for each row in
>> the dataframe in a new variable.
>>
>> I can do a count of bank accounts that occur within the same vendor
>> using dplyr and group_by and count, but I can't figure out a way to
>> count duplicates among multiple Vendor_IDs.
>>
>>
>> Dataframe example code:
>>
>>
>> #Create a sample data frame:
>>
>> set.seed(1)
>>
>> Data <- data.frame(Vendor_ID = sample(1:10000), Bank_Account_ID =
>> sample(1:10000))
>>
>>
>>
>>
>> Thanks in advance for any help.
>>
>> ______________________________________________
>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
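
For reference, the per-row count asked for in the original question can also be computed without a loop. The following is only a sketch under stated assumptions: the column names Vendor_ID and Bank_Account_ID come from the question, but the six-row data frame and the new columns n_vendors and shared_account are hypothetical. A small hand-made data frame is used because the sample data in the question draws each column with sample(1:10000), which is a permutation and therefore contains no repeated accounts.

```r
# Hypothetical toy data; column names follow the original question.
Data <- data.frame(
  Vendor_ID       = c(1, 1, 2, 3, 3, 4),
  Bank_Account_ID = c(100, 100, 100, 200, 300, 300)
)

# For each bank account, count the distinct vendors that use it.
vendors_per_account <- tapply(
  Data$Vendor_ID, Data$Bank_Account_ID,
  function(v) length(unique(v))
)

# Look up each row's account in that table to get a per-row count
# (n_vendors is a name chosen for this sketch).
Data$n_vendors <- as.vector(
  vendors_per_account[as.character(Data$Bank_Account_ID)]
)

# Flag rows whose account is shared by more than one vendor.
Data$shared_account <- Data$n_vendors > 1

print(Data)
```

Counting unique vendors per account, rather than rows per account, means an account repeated within a single vendor is not treated as shared across vendors.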


