[R] Subsetting Timestamped data

MacQueen, Don macqueen1 at llnl.gov
Mon Oct 7 22:52:18 CEST 2013


Here is an approach using base R tools (not tested, so I hope I don't
embarrass myself!)

dayid <- format(data$TimeStamp, '%Y-%m-%d')
day.counts <- table(dayid)
good.days <- names(day.counts)[day.counts == 48]
subset(data, dayid %in% good.days)

This could be written in a one-liner, but it's much easier to understand
and to check if done step by step.
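
As a minimal sketch, the steps above can be tried on made-up data. The
three-day data frame, the UTC time zone, and the names complete and
complete2 are assumptions for illustration only, not part of the original
question. The last line shows one possible one-line form using ave().

```r
# Synthetic data (an assumption for illustration): three days at 30-minute
# intervals, with five readings removed from the second day.
stamps <- seq(as.POSIXct("2013-01-01 00:00", tz = "UTC"),
              by = "30 min", length.out = 3 * 48)
data <- data.frame(TimeStamp = stamps, value = seq_along(stamps))
data <- data[-(50:54), ]

# Step by step, as above. Passing tz explicitly keeps the day boundaries
# independent of the session's time zone.
dayid <- format(data$TimeStamp, "%Y-%m-%d", tz = "UTC")
day.counts <- table(dayid)
good.days <- names(day.counts)[day.counts == 48]
complete <- subset(data, dayid %in% good.days)

# One possible one-line equivalent: ave() gives each row its per-day count.
complete2 <- data[ave(seq_along(dayid), dayid, FUN = length) == 48, ]
```

Checking table(dayid) before and after filtering is an easy way to confirm
that only the incomplete day was dropped. Without the tz argument, format()
uses the time zone attached to the timestamps, or the session's time zone
when none is set, so days are split at local midnight.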

(And I'll indulge in a side comment ... as a matter of personal opinion, I
think it's beneficial to learn how to do basic data manipulation using
base R tools before delving into the use of more sophisticated functions
from various packages. This helps build R skills.)

-Don

-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





On 10/4/13 8:03 AM, "aj409 at bath.ac.uk" <aj409 at bath.ac.uk> wrote:

>
>Hi,
>
>I have a data frame, data, containing two columns: the first is the
>TimeStamp (formatted using data$TimeStamp <-
>as.POSIXct(as.character(data$TimeStamp), format = "%d/%m/%Y %H:%M") )
>and the second is the data value.
>
>The data frame has been read from a .csv file and should contain 48
>values for each day of the year (values sampled at 30 minute
>intervals). However, there are only 15,948 observations, i.e. only
>approx. 332 days' worth of data. I therefore would like to remove any
>days that do not contain the 48 values.
>
>My question, how would I go about doing this?
>
>Many thanks,
>
>-A.