[R] aggregate dataframe by multiple factors
David Winsemius
dwinsemius at comcast.net
Fri Nov 18 09:30:11 CET 2016
> On Nov 17, 2016, at 11:27 PM, Karim Mezhoud <kmezhoud at gmail.com> wrote:
>
> Dear all,
>
> the dat has missing values NA,
>
> first.Name Name Department DCE DP date
> 5 Auction Videos YME 0.57 0.56 2013-09-30
> 18 Amish Wives TAS 0.59 0.56 2013-09-30
> 34 Ancient Nation QLH 0.54 0.58 2013-09-30
> 53 Auction Videos YME NA NA 2013-12-28
> 66 Amish Wives TAS NA NA 2013-12-28
> 82 Ancient Nation QLH 0.28 0.29 2013-12-28
> 102 Auction Videos YME 0.57 0.56 2014-03-30
> 115 Amish Wives TAS 0.59 0.56 2014-03-30
> 131 Ancient Nation QLH 0.54 0.58 2014-03-30
> 150 Auction Videos YME NA NA 2014-06-28
> 163 Amish Wives TAS NA NA 2014-06-28
> 179 Ancient Nation QLH 0.28 0.29 2014-06-28
>
>
> agg <- as.data.frame(aggregate(dat[ , c("DCE","DP")], by=
> list(dat$first.Name, dat$Name, dat$Department) , "sort"))
The closest I could get on a few attempts was:
(agg <- as.data.frame(aggregate(dat[ , c("DCE","DP")], by=
list(dat$first.Name, dat$Name, dat$Department) , function(d) { unlist(d)}))
)
Group.1 Group.2 Group.3 DCE.1 DCE.2 DCE.3 DCE.4 DP.1 DP.2 DP.3 DP.4
1 Ancient Nation QLH 0.54 0.28 0.54 0.28 0.58 0.29 0.58 0.29
2 Amish Wives TAS 0.59 NA 0.59 NA 0.56 NA 0.56 NA
3 Auction Videos YME 0.57 NA 0.57 NA 0.56 NA 0.56 NA
I think the sort operation might be somewhat ambiguous in this instance. I tried:
(agg <- as.data.frame(aggregate(dat[ , c("DCE","DP")], by=
list(dat$first.Name, dat$Name, dat$Department) , function(d) { unlist(lapply(d,sort))}))
)
With no success, not even a sorted result.
--
David.
>
>
> agg has list of value. I would separate value in different columns.
>
> Group.1 Group.2 Group.3 DCE DP
> 1 Ancient Nation QLH 0.28, 0.28, 0.54, 0.54 0.29, 0.29, 0.58, 0.58
> 2 Amish Wives TAS 0.59, 0.59 0.56, 0.56
> 3 Auction Videos YME 0.57, 0.57 0.56, 0.56
>
> The goal:
>
> Group.1 Group.2 Group.3 DCE.1 DCE.2 DCE.3 DCE.4 DP.1 DP.2 DP.3 DP.4
> 1 Ancient Nation QLH 0.28 0.28 0.54 0.54 0.29, 0.29,
> 0.58, 0.58
> 2 Amish Wives TAS NA NA 0.59, 0.59 NA
> NA 0.56, 0.56
> 3 Auction Videos YME NA NA 0.57, 0.57 NA
> NA 0.56, 0.56
>
>
>
> dat <- structure(list(first.Name = structure(c(3L, 1L, 2L, 3L, 1L, 2L,
> 3L, 1L, 2L, 3L, 1L, 2L), .Label = c("Amish", "Ancient", "Auction",
> "Ax", "Bachelorette", "Basketball", "BBQ", "Cake", "Celebrity",
> "Chef", "Clean", "Colonial", "Comedy", "Comic", "Crocodile",
> "Dog", "Empire", "Extreme", "Farm", "Half Pint", "Hollywood",
> "House", "Ice Road", "Jersey", "Justice", "Love", "Mega", "Model",
> "Modern", "Mountain", "Mystery", "Myth", "New York", "Paradise",
> "Pioneer", "Queer", "Restaurant", "Road", "Royal", "Spouse",
> "Star", "Storage", "Survival", "The Great American", "Tool",
> "Treasure", "Wedding", "Wife"), class = "factor"), Name = structure(c(43L,
> 47L, 29L, 43L, 47L, 29L, 43L, 47L, 29L, 43L, 47L, 29L), .Label =
> c("Aliens",
> "Behavior", "Casino", "Casting Call", "Challenge", "Contest",
> "Crashers", "Crew", "Dad", "Dancing", "Date", "Disasters", "Dynasty",
> "Family", "Garage", "Greenlight", "Gypsies", "Haul", "Hot Rod",
> "Inventor", "Jail", "Job", "Justice", "Marvels", "Master", "Mates",
> "Model", "Moms", "Nation", "Ninja", "Patrol", "People", "Pitmasters",
> "Queens", "Rescue", "Rivals", "Room", "Rooms", "Rules", "Star",
> "Stars", "Superhero", "Videos", "VIP", "Wars", "Wishes", "Wives",
> "Wrangler"), class = "factor"), Department = structure(c(8L,
> 6L, 2L, 8L, 6L, 2L, 8L, 6L, 2L, 8L, 6L, 2L), .Label = c("HXW",
> "QLH", "RAR", "RYC", "SYI", "TAS", "VUV", "YME"), class = "factor"),
> DCE = c(0.57, 0.59, 0.54, NA, NA, 0.28, 0.57, 0.59, 0.54,
> NA, NA, 0.28), DP = c(0.56, 0.56, 0.58, NA, NA, 0.29, 0.56,
> 0.56, 0.58, NA, NA, 0.29), date = structure(c(15978, 15978,
> 15978, 16067, 16067, 16067, 16159, 16159, 16159, 16249, 16249,
> 16249), class = "Date")), description = "", row.names = c(5L,
> 18L, 34L, 53L, 66L, 82L, 102L, 115L, 131L, 150L, 163L, 179L), class =
> "data.frame", .Names = c("first.Name",
> "Name", "Department", "DCE", "DP", "date"))
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
David Winsemius
Alameda, CA, USA
More information about the R-help
mailing list