[R] subsetting with condition
David Winsemius
dwinsemius at comcast.net
Thu Jun 2 01:56:22 CEST 2011
On Jun 1, 2011, at 7:00 PM, kristina p wrote:
> Dear R Team,
>
> I am a new R user and I am currently trying to subset my data under a
> special condition. I have gone through several pages of the subsetting
> section here on the forum, but I was not able to find an answer.
>
> My data is as follows:
>
> ID NAME MS Pol. Party
> 1 John x F
> 2 Mary s S
> 3 Katie x O
> 4 Sarah p L
> 5 Martin x O
> 6 Angelika x F
> 7 Smith x O
> ....
Assume this is in a data frame, 'pol', and that you have corrected the
error in the column names, so that the last column is Pol_Party. The ave
function is particularly useful when you need a vector that "lines up
alongside" the other columns:
pol[ave(seq_along(pol$ID), pol$Pol_Party, FUN=length) >= 3 , ]
ID NAME MS Pol_Party
3 3 Katie x O
5 5 Martin x O
7 7 Smith x O
(The use of seq_along ensures you will also get any duplicated IDs that
are in qualifying parties.)
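
A self-contained sketch of the ave() approach, in case it helps to see the intermediate vector. The data frame is rebuilt by hand from the rows shown in the question, and the names party_size and big are only illustrative:

```r
# Rebuild the example data; Pol_Party is the corrected column name assumed above.
pol <- data.frame(
  ID        = 1:7,
  NAME      = c("John", "Mary", "Katie", "Sarah", "Martin", "Angelika", "Smith"),
  MS        = c("x", "s", "x", "p", "x", "x", "x"),
  Pol_Party = c("F", "S", "O", "L", "O", "F", "O"),
  stringsAsFactors = FALSE
)

# ave() applies FUN within each party and returns one value per row,
# so every row carries the size of its own party.
party_size <- ave(seq_along(pol$ID), pol$Pol_Party, FUN = length)
party_size
# 2 1 3 1 3 2 3

# Keep only rows whose party has at least three members.
big <- pol[party_size >= 3, ]
big
```

Because party_size has one element per row, it can also be stored as a column of pol if the counts are needed later.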
Another way to generate the values would be to table()-ulate and pick
out the names of qualifying Parties:
> tabl.party <- table(pol$Pol_Party)
> pol[ pol$Pol_Party %in% names(tabl.party)[tabl.party >= 3], ]
ID NAME MS Pol_Party
3 3 Katie x O
5 5 Martin x O
7 7 Smith x O
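
The table() route can be sketched step by step as follows. The data is again typed in from the question, and keep_parties and big are names made up for this illustration:

```r
# Example data from the question, with the corrected column name Pol_Party.
pol <- data.frame(
  ID        = 1:7,
  NAME      = c("John", "Mary", "Katie", "Sarah", "Martin", "Angelika", "Smith"),
  MS        = c("x", "s", "x", "p", "x", "x", "x"),
  Pol_Party = c("F", "S", "O", "L", "O", "F", "O"),
  stringsAsFactors = FALSE
)

# Count the members of each party.
tabl.party <- table(pol$Pol_Party)

# Names of the parties with at least three members.
keep_parties <- names(tabl.party)[tabl.party >= 3]
keep_parties
# "O"

# Keep the rows whose party is among the qualifying names.
big <- pol[pol$Pol_Party %in% keep_parties, ]
big
```

Compared with ave(), this version makes the list of qualifying parties available as a separate object, which may be convenient for reporting.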
> I am interested in only those observations where there are at least three
> members of 1 political party. That is, I need to throw out all cases in the
> example above, except for members of party "O".
Both methods use logical indexing with the "[.data.frame" function: a
logical vector with one element per row selects the rows to keep.
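
To make the logical indexing visible, here is a small sketch that builds the row selector both ways and compares them. It uses a reduced copy of the example data (only ID and Pol_Party), and keep_ave, keep_tab and counts are hypothetical names:

```r
# Reduced copy of the example data; only the columns needed for the selector.
pol <- data.frame(
  ID        = 1:7,
  Pol_Party = c("F", "S", "O", "L", "O", "F", "O"),
  stringsAsFactors = FALSE
)

# Selector from ave(): TRUE where the row's party has at least three members.
keep_ave <- ave(seq_along(pol$ID), pol$Pol_Party, FUN = length) >= 3

# Selector from table(): TRUE where the row's party is a qualifying name.
counts   <- table(pol$Pol_Party)
keep_tab <- pol$Pol_Party %in% names(counts)[counts >= 3]

keep_ave
# FALSE FALSE TRUE FALSE TRUE FALSE TRUE

# Both selectors pick the same rows.
identical(keep_ave, keep_tab)

# The logical vector goes in the row position of "[".
pol[keep_ave, ]
```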
>
> Would really appreciate your help.
--
David Winsemius, MD
West Hartford, CT