[R] Subtraction with aggregate

Fri Jul 29 00:49:22 CEST 2016

One thing to watch out for: are there always exactly two samples (one of
each type) for each subject?  You should also sort by emotion so that the
difference is always taken with the data in the same order.  Here is an
example in which subjects that fail these checks are dropped:

> library(dplyr)
> mydata <- read.table(text = "subject   QM    emotion     yi
+  s0  123  neutral   321  # only one sample
+  s5  123 neutral 321   # three samples
+  s5  321 negative  345
+  s5  345 what  1234
+  s6 456 neutral 567   # two emotions the same
+  s6 567 neutral 123
+    s1   75.1017   neutral  -75.928276
+    s2  -47.3512   neutral -178.295990
+    s3  -68.9016   neutral -134.753906
+    s1   17.2099  negative -104.168312
+    s2  -53.1114  negative -182.373474
+    s3  -33.0322  negative -137.420410", header = TRUE, as.is = TRUE)
>
> agg <- mydata %>%
+         arrange(desc(emotion)) %>%  # sort: neutral before negative
+         group_by(subject) %>%
+         filter(n() == 2 && emotion[1L] != emotion[2L]) %>%  # test for 2 emotions that are different
+         summarise(QM = QM[1L] - QM[2L],
+                   yi = yi[1L] - yi[2L]
+                   )
>
>
> agg
# A tibble: 3 x 3
  subject       QM        yi
    <chr>    <dbl>     <dbl>
1      s1  57.8918 28.240036
2      s2   5.7602  4.077484
3      s3 -35.8694  2.666504
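
If you would rather not depend on row order at all, the rows can be paired
by emotion label instead.  What follows is only a rough base-R sketch, not
something from the earlier posts: it assumes the two labels are exactly
"neutral" and "negative" and that each subject has at most one row per
emotion (the stopifnot() line checks that).  The names neutral, negative,
both and agg2 are made up for illustration, and s0 is kept in the data to
show that a subject missing one emotion is dropped by the merge.

```r
# Hypothetical variant: pair rows by emotion label rather than by position.
mydata <- read.table(text = "subject QM emotion yi
s0  123      neutral   321
s1   75.1017 neutral   -75.928276
s2  -47.3512 neutral  -178.295990
s3  -68.9016 neutral  -134.753906
s1   17.2099 negative -104.168312
s2  -53.1114 negative -182.373474
s3  -33.0322 negative -137.420410", header = TRUE, as.is = TRUE)

# Assumption: no subject has the same emotion twice.
stopifnot(!anyDuplicated(mydata[c("subject", "emotion")]))

cols     <- c("subject", "QM", "yi")
neutral  <- mydata[mydata$emotion == "neutral",  cols]
negative <- mydata[mydata$emotion == "negative", cols]

# Inner join on subject: subjects lacking either emotion (s0) drop out.
both <- merge(neutral, negative, by = "subject",
              suffixes = c(".neutral", ".negative"))

# Difference is explicitly neutral minus negative.
agg2 <- data.frame(subject = both$subject,
                   QM = both$QM.neutral - both$QM.negative,
                   yi = both$yi.neutral - both$yi.negative)
agg2
```

Because the subtraction names the two columns, the sign of the result does
not change if the input rows are shuffled.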

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

On Thu, Jul 28, 2016 at 5:21 PM, Gang Chen <gangchen6 at gmail.com> wrote:

> Hi Jim and Jeff,
>
> Thanks for the quick help!
>
> Sorry I didn't state the question clearly: I want the difference
> between 'neutral' and 'negative' for each subject. And another person
> offered a solution for it:
>
> aggregate(cbind(QM, yi) ~ subject, data = mydata, FUN = diff)
>
>
> On Thu, Jul 28, 2016 at 4:53 PM, jim holtman <jholtman at gmail.com> wrote:
> > Not sure what you mean by "nice way", but here is a dplyr solution:
> >
> >> library(dplyr)
> >> mydata <- read.table(text = "subject   QM    emotion     yi
> > +    s1   75.1017   neutral  -75.928276
> > +    s2  -47.3512   neutral -178.295990
> > +    s3  -68.9016   neutral -134.753906
> > +    s1   17.2099  negative -104.168312
> > +    s2  -53.1114  negative -182.373474
> > +    s3  -33.0322  negative -137.420410", header = TRUE)
> >> agg <- mydata %>%
> > +         group_by(subject) %>%
> > +         summarise(QM = mean(QM),
> > +                   yi = mean(yi)
> > +                   )
> >>
> >>
> >> agg
> > # A tibble: 3 x 3
> >   subject       QM         yi
> >    <fctr>    <dbl>      <dbl>
> > 1      s1  46.1558  -90.04829
> > 2      s2 -50.2313 -180.33473
> > 3      s3 -50.9669 -136.08716
> >
> > On Thu, Jul 28, 2016 at 4:40 PM, Gang Chen <gangchen6 at gmail.com> wrote:
> >>
> >> With the following data in data.frame:
> >>
> >> subject   QM    emotion     yi
> >>   s1   75.1017   neutral  -75.928276
> >>   s2  -47.3512   neutral -178.295990
> >>   s3  -68.9016   neutral -134.753906
> >>   s1   17.2099  negative -104.168312
> >>   s2  -53.1114  negative -182.373474
> >>   s3  -33.0322  negative -137.420410
> >>
> >> I can obtain the average between the two emotions with
> >>
> >> mydata <- read.table('clipboard', header=TRUE)
> >> aggregate(mydata[,c('yi', 'QM')], by=list(subject=mydata$subject), mean)
> >>
> >> My question is, what is a nice way to get the difference between the
> >> two emotions?
> >>
> >> Thanks,
> >> Gang
> >>
> >
> >
>
