[R] [FORGED] Grouping Question

Peter Dalgaard pdalgd using gmail.com
Sun Mar 22 12:54:06 CET 2020


Or even split -> lapply -> unsplit, in cases where you want the results put back in the original order. 

(Doesn't matter here, but it would, had it been, say, ordered 1,2,3,1,2,2,3).
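
A minimal sketch of the difference, assuming the same measurements as in the quoted example but with the serials interleaved in the order 1,2,3,1,2,2,3 (this reordered data frame is made up for illustration, it is not the poster's data):

```r
# Hypothetical reordering of the thread's data: serials interleaved.
d <- data.frame(Serial = c(1, 2, 3, 1, 2, 2, 3),
                Measurement = c(17, 12, 19, 16, 8, 10, 13))

# Split by serial and add both test columns within each piece.
dlist <- split(d, d$Serial)
dlist <- lapply(dlist, within, {
    Serial_test <- if (all(Measurement <= 16)) "pass" else "fail"
    Meas_test <- ifelse(Measurement <= 16, "pass", "fail")
})

# rbind stacks the pieces group by group, so rows come out grouped by Serial.
by_group <- do.call(rbind, dlist)

# unsplit puts every row back in the position it had in d.
in_order <- unsplit(dlist, d$Serial)
```

Here by_group has its rows grouped by Serial (1,1,2,2,2,3,3), while in_order keeps the original row order of d.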

-pd

> On 22 Mar 2020, at 08:44, Deepayan Sarkar <deepayan.sarkar using gmail.com> wrote:
> 
> Another possible approach is to use split -> lapply -> rbind, which I
> often find to be conceptually simpler:
> 
> d <- data.frame(Serial = c(1, 1, 2, 2, 2, 3, 3),
>                Measurement = c(17, 16, 12, 8, 10, 19, 13))
> 
> dlist <- split(d, d$Serial)
> dlist <- lapply(dlist, within,
> {
>    Serial_test <- if (all(Measurement <= 16)) "pass" else "fail"
>    Meas_test <- ifelse(Measurement <= 16, "pass", "fail")
> })
> do.call(rbind, dlist)
> 
> -Deepayan
> 
> On Sun, Mar 22, 2020 at 12:29 PM Rolf Turner <r.turner using auckland.ac.nz> wrote:
>> 
>> 
>> On 22/03/20 4:01 pm, Thomas Subia via R-help wrote:
>> 
>>> Colleagues,
>>> 
>>> Here is my dataset.
>>> 
>>> Serial        Measurement     Meas_test       Serial_test
>>> 1     17              fail            fail
>>> 1     16              pass            fail
>>> 2     12              pass            pass
>>> 2     8               pass            pass
>>> 2     10              pass            pass
>>> 3     19              fail            fail
>>> 3     13              pass            fail
>>> 
>>> If a measurement is less than or equal to 16, then Meas_test is pass;
>>> otherwise Meas_test is fail.
>>> This is easy to code.
>>> 
>>> Serial_test is a pass, when all of the Meas_test are pass for a given
>>> serial. Else Serial_test is a fail.
>>> I'm at a loss to figure out how to do this in R.
>>> 
>>> Some guidance would be appreciated.
>> 
>> In future, please present your data using dput(); makes life much easier
>> for those trying to help you.  Your data are really the first two
>> columns of what you presented --- the last two columns are your desired
>> output.
>> 
>> Let "X" be these first two columns.  Define
>> 
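>> foo <- function (X) {
>>     # Per-row test: does each measurement meet the limit?
>>     a <- with(X, Measurement <= 16)
>>     a <- ifelse(a, "pass", "fail")
>>     # Per-serial test: do all measurements for that serial meet the limit?
>>     b <- with(X, tapply(Measurement, Serial, function(x) all(x <= 16)))
>>     # Map each row to the result for its serial.
>>     i <- match(X$Serial, names(b))
>>     b <- ifelse(b[i], "pass", "fail")
>>     data.frame(Meas_test = a, Serial_test = b)
>> }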
>> 
>> foo(X) gives:
>> 
>>   Meas_test Serial_test
>> 1      fail        fail
>> 2      pass        fail
>> 3      pass        pass
>> 4      pass        pass
>> 5      pass        pass
>> 6      fail        fail
>> 7      pass        fail
>> 
>> If you want input and output combined, as in the way that you presented
>> your data use cbind(X,foo(X)).
>> 
>> cheers,
>> 
>> Rolf Turner
>> 
>> --
>> Honorary Research Fellow
>> Department of Statistics
>> University of Auckland
>> Phone: +64-9-373-7599 ext. 88276
>> 
>> ______________________________________________
>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes using cbs.dk  Priv: PDalgd using gmail.com


