[R] Finding values in a dataframe at a specified hour

Alexandra Catena amc5981 at gmail.com
Fri Apr 10 23:06:42 CEST 2015


Update:

This is what I have so far.*  The first column of windHW is the wind
speed.  The 5th column of the dataframe spring is the 5*sigma value
for each hour.  hourRow holds the row indices of windHW for a given hour.

for (i in 0:23){
  hourRow = which(windHW$hour==i,arr.ind=TRUE)
  for (h in hourRow){
    if (windHW[h,1]>=spring[spring$hour==i,5]){
      windHW[h,1]<-NA}
  }
}

This then gives the error:

Error in if (windHW[h, 1] >= spring[spring$hour == i, 5]) { :
  argument is of length zero
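
A possible cause (an assumption, not confirmed against the real data):
"argument is of length zero" means the condition inside if() is empty,
which happens when spring[spring$hour == i, 5] returns no rows.  Since the
data were read with readLines, the hour column may be stored as text such
as "00", which never equals the number 0.  The sketch below reproduces that
on toy data; spring_toy and its column names are hypothetical stand-ins.

```r
# Toy stand-in for 'spring'; the real column layout is not shown in this post.
spring_toy <- data.frame(hour = sprintf("%02d", 0:23),  # hours stored as text: "00", "01", ...
                         sigma5 = rep(79.6, 24),
                         stringsAsFactors = FALSE)

i <- 0
threshold <- spring_toy[spring_toy$hour == i, "sigma5"]
length(threshold)   # 0: the number 0 is compared as the text "0", which is not "00"

# The same lookup after converting the hour column to numbers
spring_toy$hour <- as.numeric(spring_toy$hour)
threshold <- spring_toy[spring_toy$hour == i, "sigma5"]
length(threshold)   # 1
```

Running str(spring) and str(windHW) on the real data would show whether
the hour columns exist and which type they have.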

*Note: The dataframe for each season has 24 rows, one for each hour
of the day (0:23).
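
For comparison, here is a minimal sketch of the same replacement done
without loops, by looking up each row's threshold with match().  The toy
dataframes and the column names hour, ws and sigma5 are assumptions made
for illustration, not the real data.

```r
# Toy data; the names and values are hypothetical stand-ins.
windHW_toy <- data.frame(ws   = c(3.2, 600, 12.5, 80.1, 7.7),
                         hour = c(0, 1, 1, 23, 23))
spring_toy <- data.frame(hour = 0:23, sigma5 = rep(79.6, 24))

# For every observation, find the 5*sigma limit that belongs to its hour
limit <- spring_toy$sigma5[match(windHW_toy$hour, spring_toy$hour)]

# Rows at or above their limit; which() skips comparisons that are NA
too_high <- which(windHW_toy$ws >= limit)
windHW_toy$ws[too_high] <- NA

windHW_toy$ws   # 3.2  NA 12.5  NA  7.7
```

This only works if both hour columns have the same type, so that match()
can pair them up.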

Thanks,
Alexandra


On Fri, Apr 10, 2015 at 1:07 PM, Alexandra Catena <amc5981 at gmail.com> wrote:
> Hello,
>
> I have a large dataframe (windHW) of wind speeds (ws) at each hour
> from many days over a set of years.  Some of these values are
> obviously wrong (600 m/s) and I want to get rid of all the values that
> are larger than 5*sigma for each hour.  The 5*sigma (variable name
> sigma5) values are located in different dataframes for each season,
> with each dataframe titled as a season.  For example, in the
> dataframe, spring, the 5*sigma value is 79.6 m/s for hour 1.
>
> So my question is as follows: how can I find all the wind speed
> values in the dataframe windHW at a specific hour that are higher
> than the 5*sigma value for that hour?
> For example, I would like to find any wind speed values at hour 1
> that are higher than 79.6 m/s and replace them with NA.
>
> I have something like this but I can't seem to figure out how to get
> it for specific hours:
>
> windHW$ws[windHW$ws>=spring$sigma5] <- NA
>
> I imported the data using readLines into the dataframe windHW.  I
> am using R version 3.1.1.
>
> Any help would be appreciated!
>
> Thanks,
> Alexandra


