[Rd] Improved Data Aggregation and Summary Statistics in R
Sebastian Martin Krantz
sebastian.krantz using graduateinstitute.ch
Wed Feb 27 11:44:05 CET 2019
Dear Iñaki and Joris,
thank you for the positive feedback! I had attached a code file to the
post, but apparently it was removed.
I will attach it again to this e-mail; alternatively, both the vignette and
the code can be downloaded from the following link:
https://www.dropbox.com/sh/s0k1tiz7el55g1q/AACpri-nruXjcMwUnNcHoycKa?dl=0
Best,
Sebastian
On Wed, 27 Feb 2019 at 11:14, Joris Meys <jorismeys using gmail.com> wrote:
> Dear Sebastian,
>
> Initially I was a bit hesitant to think about yet another way to summarize
> data, but your illustrations convinced me this is actually a great addition
> to the toolset currently available in different R packages. Many of us have
> written custom functions to get the required tables for specific data sets,
> but this would reduce that effort to simply using the right collap() call.
>
> Like Inaki, I'm very interested in trying it out if you have the code
> available somewhere.
>
> Cheers
> Joris
>
> On Wed, Feb 27, 2019 at 9:01 AM Sebastian Martin Krantz <
> sebastian.krantz using graduateinstitute.ch> wrote:
>
>> Dear Developers,
>>
>> Having spent time developing and thinking about how data aggregation and
>> summary statistics can be enhanced in R, I would like to present my
>> ideas/efforts in the form of two commands:
>>
>> The first, which for now I have called 'collap', is an upgrade of aggregate
>> that accommodates and extends its functionality in various respects, most
>> importantly to work with multilevel and multi-type data, multiple function
>> calls and highly customized aggregation tasks, with much greater
>> flexibility in the passing of inputs and tidy output.
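
For readers without the attachment, the gap described above can be sketched in base R alone. This is not the 'collap' interface, which the post does not show. The data frame `df` and the helper `stat_mode` are hypothetical names made up for this illustration. The sketch shows that aggregate applies a single function per call, so mixed-type data needs two calls and a merge:

```r
# Hypothetical toy data: two numeric columns and one categorical column.
df <- data.frame(
  country = c("A", "A", "B", "B", "B"),
  gdp     = c(1, 3, 10, 20, 30),
  pop     = c(5, 7, 2, 4, 6),
  regime  = c("x", "x", "y", "z", "z"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of base R): most frequent value of a vector.
stat_mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

# Numeric columns: mean by country.
num <- aggregate(df[c("gdp", "pop")], by = df["country"], FUN = mean)

# Categorical column: mode by country (a second call with a different function).
chr <- aggregate(df["regime"], by = df["country"], FUN = stat_mode)

# Combine both results into one table with one row per country.
res <- merge(num, chr, by = "country")
print(res)
```

A single call that picks a suitable function per column type would replace the three steps above, which appears to be the kind of convenience the proposal is aiming at.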
>>
>> The second function, 'qsu', is an advanced and flexible summary command for
>> cross-sectional and multilevel (panel) data (i.e. it can provide overall,
>> between-entity and within-entity statistics, and allows for grouping, custom
>> functions and transformations). It also provides a quick method to compute
>> and output within-transformed data.
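
As a rough illustration of the overall/between/within decomposition mentioned above, here is a minimal base R sketch. It does not use 'qsu', whose syntax is not given in the post. The vectors `x` and `id` are hypothetical, and adding the overall mean back to the within-transformed values is an assumed convention (common in panel-data summaries), not something the post confirms:

```r
# Hypothetical panel: one variable observed for two entities.
x  <- c(1, 3, 10, 20, 30)
id <- c("A", "A", "B", "B", "B")

# Entity mean repeated for each observation (ave() uses mean by default).
group_mean <- ave(x, id)

# Between: one mean per entity.
between <- tapply(x, id, mean)

# Within: deviations from the entity mean, recentred on the overall mean
# (assumed convention, so that the within mean equals the overall mean).
within <- x - group_mean + mean(x)

# Summary table with one row per type of variation.
stats <- rbind(
  overall = c(mean = mean(x),       sd = sd(x)),
  between = c(mean = mean(between), sd = sd(between)),
  within  = c(mean = mean(within),  sd = sd(within))
)
print(stats)
```

The `within` vector is the within-transformed data; a panel summary command would compute such a table for every column and optionally by group.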
>>
>> Both commands are efficiently built from core R, but provide for optional
>> integration with data.table, which renders them extremely fast on large
>> datasets. An explanation of the syntax, a demonstration and benchmark
>> results are provided in the attached vignette.
>>
>> Since both commands accommodate existing functionality while adding
>> significant basic functionality, I thought that their addition to the stats
>> package would be worth considering. I would welcome your feedback.
>>
>> Best regards,
>>
>> Sebastian Krantz
>> ______________________________________________
>> R-devel using r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>
>
> --
> Joris Meys
> Statistical consultant
>
> Department of Data Analysis and Mathematical Modelling
> Ghent University
> Coupure Links 653, B-9000 Gent (Belgium)
>
> -----------
> Biowiskundedagen 2018-2019
> http://www.biowiskundedagen.ugent.be/
>
> -------------------------------
> Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
>