[R] using complete.cases() with nested factors
hadley wickham
h.wickham at gmail.com
Fri Sep 5 00:51:11 CEST 2008
On Thu, Sep 4, 2008 at 4:19 PM, Ken Knoblauch <ken.knoblauch at inserm.fr> wrote:
> Andrew Barr <wabarr <at> gmail.com> writes:
>> This may be a newbie question. I have a dataframe that looks like the sample
>> at the bottom of the email. I have monthly precipitation data from several
>> sites over several years. For each site, I need to extract years that have
>> a complete series of 12 monthly precipitation values, while excluding that
>> year for sites with incomplete data. I can't figure out how to do this
>> gracefully (i.e. without a silly for loop). Any help will be appreciated,
>> thanks!
>> SiteID year month precip(mm)
>> 670090 1941 jan 2998
>> 670090 1941 feb 1299
>> 670090 1941 mar 1007
>> 670090 1941 apr 354
>> 670090 1941 may 88
>> 670090 1941 jun 156
>> 670090 1941 jul 8
>> 670090 1941 aug 4
>> 670090 1941 sep 8
>> 670090 1941 oct 58
>> 670090 1941 nov 397
>> 670090 1941 dec 248
>> 670090 1942 jan NA
>> 670090 1942 feb 380
>> 670090 1942 mar 797
>> 670090 1942 apr 142
>> 670090 1942 may 43
>> 670090 1942 jun 14
>> 670090 1942 jul 70
>> 670090 1942 aug 51
>> 670090 1942 sep 0
>> 670090 1942 oct 10
>> 670090 1942 nov 235
>> 670090 1942 dec 405
>>
> There are likely more elegant solutions but this seems to work.
> If the data frame is in a variable named dd
>
> lapply(unique(dd$year), function(x) {s <- subset(dd, year == x)
> if (nrow(s) == 12) s})
I think this is slightly more elegant, and follows the
split-apply-combine strategy:
years <- split(dd, dd$year)
full_years <- Filter(function(df) nrow(df) == 12, years)
do.call("rbind", full_years)
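Two things the snippet above does not cover: the data has several sites, and a year can have 12 rows but still contain an NA (as 1942 does in the sample). Here is a rough sketch of the same split-apply-combine idea that groups by site and year and uses complete.cases() to reject groups with missing values. It assumes the columns are named SiteID, year, month and precip (the sample header says precip(mm), so the name precip is my guess), and the toy dd below is just the sample retyped.

```r
# Toy data retyped from the sample: one site, two years;
# January 1942 is missing.
dd <- data.frame(
  SiteID = 670090,
  year   = rep(c(1941, 1942), each = 12),
  month  = rep(tolower(month.abb), times = 2),
  precip = c(2998, 1299, 1007, 354, 88, 156, 8, 4, 8, 58, 397, 248,
             NA, 380, 797, 142, 43, 14, 70, 51, 0, 10, 235, 405)
)

# Split on site and year together; drop = TRUE skips combinations
# that never occur in the data.
site_years <- split(dd, list(dd$SiteID, dd$year), drop = TRUE)

# A group is "full" if it has 12 rows and no row contains an NA.
is_full <- function(df) nrow(df) == 12 && all(complete.cases(df))
full_years <- Filter(is_full, site_years)

# Stack the surviving groups back into one data frame.
result <- do.call("rbind", full_years)
```

With the toy data, only the 1941 rows should survive. Note that if no group passes the filter, do.call("rbind", list()) returns NULL rather than an empty data frame, so that case may need its own check.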
Hadley
--
http://had.co.nz/