[R] Subtraction with aggregate
jim holtman
jholtman at gmail.com
Fri Jul 29 00:49:22 CEST 2016
One thing to watch out for: are there always exactly two samples (one of
each type) for each subject? You had better sort by emotion to make sure
that when you take the difference, the data are always in the same order.
Here is an example in which subjects that fail these checks are dropped:
> library(dplyr)
> mydata <- read.table(text = "subject QM emotion yi
+ s0 123 neutral 321 # only one sample
+ s5 123 neutral 321 # three samples
+ s5 321 negative 345
+ s5 345 what 1234
+ s6 456 neutral 567 # two emotions the same
+ s6 567 neutral 123
+ s1 75.1017 neutral -75.928276
+ s2 -47.3512 neutral -178.295990
+ s3 -68.9016 neutral -134.753906
+ s1 17.2099 negative -104.168312
+ s2 -53.1114 negative -182.373474
+ s3 -33.0322 negative -137.420410", header = TRUE, as.is = TRUE)
>
> agg <- mydata %>%
+ arrange(desc(emotion)) %>% # sort
+ group_by(subject) %>%
+ filter(n() == 2 && emotion[1L] != emotion[2L]) %>% # test for 2 emotions that are different
+ summarise(QM = QM[1L] - QM[2L],
+ yi = yi[1L] - yi[2L]
+ )
>
>
> agg
# A tibble: 3 x 3
  subject       QM        yi
    <chr>    <dbl>     <dbl>
1      s1  57.8918 28.240036
2      s2   5.7602  4.077484
3      s3 -35.8694  2.666504
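
If you would rather not depend on sort order at all, one possible
alternative is to pair the rows by emotion label in base R. This is only
a sketch under a few assumptions: the names neu, neg, both and agg2 are
made up for illustration, the data are a shortened version of the example
above, and each subject is assumed to have at most one row per emotion
(a subject with repeated emotions would yield more than one row after
the merge and would need separate handling).

```r
# Sketch: compute neutral minus negative by label, not by row position.
# (neu, neg, both, agg2 are illustrative names, not from the thread.)
mydata <- read.table(text = "subject QM emotion yi
s0 123 neutral 321
s1 75.1017 neutral -75.928276
s2 -47.3512 neutral -178.295990
s3 -68.9016 neutral -134.753906
s1 17.2099 negative -104.168312
s2 -53.1114 negative -182.373474
s3 -33.0322 negative -137.420410", header = TRUE, as.is = TRUE)

# Split the data into one subset per emotion
neu <- mydata[mydata$emotion == "neutral",  c("subject", "QM", "yi")]
neg <- mydata[mydata$emotion == "negative", c("subject", "QM", "yi")]

# merge() does an inner join by default, so a subject that is missing
# either emotion (s0 here) drops out
both <- merge(neu, neg, by = "subject", suffixes = c(".neu", ".neg"))

# Difference is explicitly neutral minus negative
agg2 <- data.frame(subject = both$subject,
                   QM = both$QM.neu - both$QM.neg,
                   yi = both$yi.neu - both$yi.neg)
agg2
```

For s1, s2 and s3 this should give the same differences as the dplyr
version, and the direction of the subtraction is spelled out in the code
rather than implied by the sort.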
Jim Holtman
Data Munger Guru
What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
On Thu, Jul 28, 2016 at 5:21 PM, Gang Chen <gangchen6 at gmail.com> wrote:
> Hi Jim and Jeff,
>
> Thanks for the quick help!
>
> Sorry I didn't state the question clearly: I want the difference
> between 'neutral' and 'negative' for each subject. And another person
> offered a solution for it:
>
> aggregate(cbind(QM, yi) ~ subject, data = mydata, FUN = diff)
>
>
> On Thu, Jul 28, 2016 at 4:53 PM, jim holtman <jholtman at gmail.com> wrote:
> > Not sure what you mean by "nice way", but here is a dplyr solution:
> >
> >> library(dplyr)
> >> mydata <- read.table(text = "subject QM emotion yi
> > + s1 75.1017 neutral -75.928276
> > + s2 -47.3512 neutral -178.295990
> > + s3 -68.9016 neutral -134.753906
> > + s1 17.2099 negative -104.168312
> > + s2 -53.1114 negative -182.373474
> > + s3 -33.0322 negative -137.420410", header = TRUE)
> >> agg <- mydata %>%
> > + group_by(subject) %>%
> > + summarise(QM = mean(QM),
> > + yi = mean(yi)
> > + )
> >>
> >>
> >> agg
> > # A tibble: 3 x 3
> >   subject       QM         yi
> >    <fctr>    <dbl>      <dbl>
> > 1      s1  46.1558  -90.04829
> > 2      s2 -50.2313 -180.33473
> > 3      s3 -50.9669 -136.08716
> >
> >
> >
> > Jim Holtman
> > Data Munger Guru
> >
> > What is the problem that you are trying to solve?
> > Tell me what you want to do, not how you want to do it.
> >
> > On Thu, Jul 28, 2016 at 4:40 PM, Gang Chen <gangchen6 at gmail.com> wrote:
> >>
> >> With the following data in data.frame:
> >>
> >> subject QM emotion yi
> >> s1 75.1017 neutral -75.928276
> >> s2 -47.3512 neutral -178.295990
> >> s3 -68.9016 neutral -134.753906
> >> s1 17.2099 negative -104.168312
> >> s2 -53.1114 negative -182.373474
> >> s3 -33.0322 negative -137.420410
> >>
> >> I can obtain the average between the two emotions with
> >>
> >> mydata <- read.table('clipboard', header=TRUE)
> >> aggregate(mydata[,c('yi', 'QM')], by=list(subject=mydata$subject), mean)
> >>
> >> My question is, what is a nice way to get the difference between the
> >> two emotions?
> >>
> >> Thanks,
> >> Gang
> >>
> >> ______________________________________________
> >> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >
> >
>