[R] subsetting like in SAS
Petr Pikal
petr.pikal at precheza.cz
Thu Jan 13 14:23:38 CET 2005
Hi Denis,
maybe unique() can pick out the unique entries from your data set
without the need for sorting.
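
A minimal sketch of that idea, assuming a toy stand-in for alldata (the
key column names come from the original post; the values and the "depth"
and "fish_len" columns are invented for illustration). It shows unique()
on the site columns, plus a duplicated()-based variant that keeps the
first row per key combination, which is roughly what "if first.set" does
in SAS:

```r
# Toy stand-in for alldata: key column names follow the post,
# all values and the depth/fish_len columns are hypothetical.
alldata <- data.frame(
  origin   = c("A", "A", "A", "B", "B"),
  ship_cat = c(1, 1, 1, 2, 2),
  ship_nb  = c(10, 10, 10, 20, 20),
  trip     = c(1, 1, 2, 1, 1),
  set      = c(1, 1, 1, 3, 3),
  depth    = c(50, 50, 75, 120, 120),          # site characteristic (assumed)
  fish_len = c(12.1, 13.4, 9.8, 20.5, 18.2),   # sample characteristic (assumed)
  stringsAsFactors = FALSE
)

keys      <- c("origin", "ship_cat", "ship_nb", "trip", "set")
site_vars <- c(keys, "depth")   # the variables detailing sites

# Option 1: unique() on the site columns only. This gives one row per
# site as long as the site columns are constant within a site.
sites1 <- unique(alldata[, site_vars])

# Option 2: keep the first row seen for each key combination.
# No sorting is needed; "first" means first in the current row order.
sites2 <- alldata[!duplicated(alldata[, keys]), site_vars]

print(sites1)
print(sites2)
```

Option 2 is the safer choice if a site characteristic might differ
between rows of the same site, since unique() would then return more
than one row for that site.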
Cheers
Petr
On 13 Jan 2005 at 11:52, Denis Chabot wrote:
> Hi,
>
> Being in the process of translating some of my SAS programs to R, I
> encountered one difficulty. I have a solution, but it is not elegant
> (and not pleasant to implement).
>
> I have a large dataset with many variables needed to identify the
> origin of a sample, many to describe sample characteristics, others to
> describe site characteristics.
>
> I want only a (shorter) list of sites and their characteristics.
>
> If "origin", "ship_cat", "ship_nb", "trip" and "set" are needed to
> identify a site, in SAS you'd sort on those variables, then read the
> data with:
>
> data sites;
> set alldata;
> by origin ship_cat ship_nb trip set;
> if first.set;
> keep list-of-variables-detailing-sites;
> run;
>
> In R I did this with the Lag function of Hmisc, and the original data
> set also needs to be sorted first:
>
> oL <- Lag(origin)
> scL <- Lag(ship_cat)
> snL <- Lag(ship_nb)
> tL <- Lag(trip)
> sL <- Lag(set)
> same <- origin==oL & ship_cat==scL & ship_nb==snL & trip==tL & set==sL
> sites <- subset(alldata, !same,
> select=c(list-of-variables-detailing-sites))
>
> Could I do better than this?
>
> Thanks in advance,
>
> Denis Chabot