[R] find unique and summerize
Rui Barradas
ruipbarradas at sapo.pt
Sat Feb 3 20:05:43 CET 2018
Hello,
As for the first question, instead of writing a xlsx file, maybe it is
easier to write a csv file and then open it with Excel.
tbl2 <- addmargins(tbl1)
write.csv(tbl2, "tt1.csv")
As for the second question, the following does it.
inx <- apply(tbl1, 1, function(x) all(x != 0))
tbl1b <- addmargins(tbl1[inx, ])
tbl1b
Hope this helps,
Rui Barradas
On 2/3/2018 4:42 PM, Val wrote:
> Thank you so much Rui.
>
> 1. How do I export this table to excel file?
> I used this
> tbl1 <- table(Country, IDNum)
> tbl2=addmargins(tbl1)
> write.xlsx(tbl2,"tt1.xlsx"),sheetName="summary", row.names=FALSE)
> The above did not give me that table.
>
>
> 2. I want select those unique Ids that do have records in all countries.
> From the above data set, this ID "FIN1540166" should be excluded
> from the summary table and the table looks like as follow
>
> IDNum Country 1 33 358 44 Sum CAN1540164 47 141 248 90 526 USA1540165
> 290 757 321 171 1539 Sum 337 898 569 261 2065
>
> Thank you again
>
>
> On Fri, Feb 2, 2018 at 11:26 PM, Rui Barradas <ruipbarradas at sapo.pt
> <mailto:ruipbarradas at sapo.pt>> wrote:
>
> Hello,
>
> Thanks for the reproducible example.
> See if the following does what you want.
>
> IDNum <- sub("^(\\d+).*", "\\1", mydata$ID)
> Country <- sub("^\\d+(.*)", "\\1", mydata$ID)
>
> tbl1 <- table(Country, IDNum)
> addmargins(tbl1)
>
> tbl2 <- xtabs(Y ~ Country + IDNum, mydata)
> addmargins(tbl2)
>
>
> Hope this helps,
>
> Rui Barradas
>
>
> On 2/3/2018 3:00 AM, Val wrote:
>
> Hi all,
>
> I have a data set need to be summarized by unique ID (count and
> sum of a
> variable)
> A unique individual ID (country name Abbreviation followed by
> an integer
> numbers) may have observation in several countries. Then the
> ID was
> changed by adding the country code as a prefix and new ID was
> constructed
> or recorded like (country code, + the original unique ID Example
> original ID "CAN1540164" , if this ID has an observation in
> CANADA then
> the ID was changed to "1CAN1540164". From this new ID I
> want get out
> the country code get the original unique ID and summarize
> the data by
> unique ID and country code
>
> The data set look like
> mydata <- read.table(textConnection("GR ID iflag Y
> A 1CAN1540164 1 20
> A 1CAN1540164 1 12
> A 1CAN1540164 1 15
> A 44CAN1540164 1 30
> A 44CAN1540164 1 24
> A 44CAN1540164 1 25
> A 44CAN1540164 1 11
> A 33CAN1540164 1 12
> A 33CAN1540164 1 23
> A 33CAN1540164 1 65
> A 33CAN1540164 1 41
> A 358CAN1540164 1 28
> A 358CAN1540164 1 32
> A 358CAN1540164 1 41
> A 358CAN1540164 1 54
> A 358CAN1540164 1 29
> A 358CAN1540164 1 64
> B 1USA1540165 1 125
> B 1USA1540165 1 165
> B 44USA1540165 1 171
> B 33USA1540165 1 254
> B 33USA1540165 1 241
> B 33USA1540165 1 262
> B 358USA1540165 1 321
> C 358FIN1540166 1 225 "),header = TRUE ,stringsAsFactors = FALSE)
>
> From the above data there are three unique IDs and four
> country codes (1,
> 44, 33 and 358)
>
> I want the following two tables
>
> Table 1. count the unique ID by country code
> 1 44 33 358 TOT
> CAN1540164 3 4 4 6 17
> USA1540165 2 1 3 1 7
> FIN1540166 - - - 1 1
> TOT 5 5 7 8 25
>
>
> Table 2 Sum of Y variable by unique ID and country. code
>
> 1 44 33 358 TOT
> CAN1540164 47 90 141 248 526
> USA1540165 290 171 757 321 1539
> FIN1540166 - - - 225 225
> TOT 337 261 898 794 2290
>
>
> How do I do it in R?
>
> The first step is to get the unique country codes unique ID
> by splitting
> the new ID
>
> Thank you in advance
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org <mailto:R-help at r-project.org> mailing list
> -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> <https://stat.ethz.ch/mailman/listinfo/r-help>
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> <http://www.R-project.org/posting-guide.html>
> and provide commented, minimal, self-contained, reproducible code.
>
>
More information about the R-help
mailing list