[R] Finding values in a dataframe at a specified hour
Jim Lemon
drjimlemon at gmail.com
Sat Apr 11 05:05:20 CEST 2015
Hi Alexandra,
I answered too quickly. Your response made me look for a deeper error: the
value of i doesn't matter, as it is only compared with windHW$hour and isn't
used as an index. However, the first value of i=0 may cause the error in the
second loop, where h is used as an index.
for (i in 0:23){
  hourRow = which(windHW$hour==i,arr.ind=TRUE)
  for (h in hourRow){
    if (windHW[h+1,1]>=spring[spring$hour==i,5]){
      windHW[h+1,1]<-NA}
  }
}
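
If the loops keep giving trouble, a vectorised lookup may be worth a try, as
it avoids the index arithmetic altogether. The following is only a sketch: it
assumes windHW has columns named ws and hour and spring has columns named hour
and sigma5 (names taken from your first message), and the toy data are made up
purely for illustration.

```r
# Toy data, invented for illustration only (not the real wind records)
windHW <- data.frame(ws   = c(3.2, 600, 5.1, 80.0, 4.4),
                     hour = c(0, 0, 1, 1, 23))
spring <- data.frame(hour = 0:23, sigma5 = rep(79.6, 24))

# For every row of windHW, look up the 5*sigma threshold of its hour
threshold <- spring$sigma5[match(windHW$hour, spring$hour)]

# Hours with no match in spring give an NA threshold; which() drops
# those rows, so they are left untouched
too_high <- which(windHW$ws >= threshold)
windHW$ws[too_high] <- NA

# An hour that is absent from spring yields a zero-length result, and
# if() on a zero-length condition fails with "argument is of length zero"
empty <- spring$sigma5[spring$hour == 24]
length(empty)
```

If length(spring[spring$hour==i,5]) is 0 for some i in your own data, that
would explain the error message, so it may be worth checking how the hours are
stored in each season's dataframe.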
Jim
On Sat, Apr 11, 2015 at 9:24 AM, Alexandra Catena <amc5981 at gmail.com> wrote:
> Hi Jim,
>
> Thanks for the response, but unfortunately it results in the same
> error. I think it is something wrong with the if statement. I tried
> it out manually for the first row and hour that it's testing and
> indeed, the wind speed is not higher than the 5*sigma value. Since it
> is not higher than the 5*sigma value, I would think it would just pass
> to the next loop, yet it doesn't. I will keep trying!
>
> Thanks,
> Alexandra
>
> On Fri, Apr 10, 2015 at 3:43 PM, Jim Lemon <drjimlemon at gmail.com> wrote:
> > Hi Alexandra,
> > The error probably comes from the first iteration of i in 0:23. As indexing
> > in R begins at 1, there is no element 0. Try using:
> >
> > for(i in 1:24) {
> > ...
> >
> > and see what happens.
> >
> > Jim
> >
> >
> > On Sat, Apr 11, 2015 at 7:06 AM, Alexandra Catena <amc5981 at gmail.com> wrote:
> >>
> >> Update:
> >>
> >> I have this so far. * The first column of windHW is the wind speed.
> >> The 5th column of the dataframe, spring, is the 5*sigma value of every
> >> hour. hourRow gives out all the rows of wind speed at a given hour.
> >>
> >> for (i in 0:23){
> >>   hourRow = which(windHW$hour==i,arr.ind=TRUE)
> >>   for (h in hourRow){
> >>     if (windHW[h,1]>=spring[spring$hour==i,5]){
> >>       windHW[h,1]<-NA}
> >>   }
> >> }
> >>
> >> This then gives the error: Error in if (windHW[h, 1] >=
> >> spring[spring$hour == i, 5]) { : argument is of length zero
> >>
> >> *Note: The dataframe for each of the seasons has 24 rows,
> >> one for each hour of the day, 0:23.
> >>
> >> Thanks,
> >> Alexandra
> >>
> >>
> >> On Fri, Apr 10, 2015 at 1:07 PM, Alexandra Catena <amc5981 at gmail.com>
> >> wrote:
> >> > Hello,
> >> >
> >> > I have a large dataframe (windHW) of wind speeds (ws) at each hour
> >> > from many days over a set of years. Some of these values are
> >> > obviously wrong (600 m/s) and I want to get rid of all the values that
> >> > are larger than 5*sigma for each hour. The 5*sigma (variable name
> >> > sigma5) values are located in different dataframes for each season,
> >> > with each dataframe titled as a season. For example, in the
> >> > dataframe, spring, the 5*sigma value is 79.6 m/s for hour 1.
> >> >
> >> > So my question is as follows: how can I get the code to find all
> >> > the wind speed values in the dataframe, windHW, at a specific hour
> >> > that are higher than the 5*sigma value for that hour?
> >> > For example, I would like to find if any of the wind speed values at
> >> > hour 1 are higher than 79.6 m/s, and if so, then replace that value
> >> > with NA.
> >> >
> >> > I have something like this but I can't seem to figure out how to get
> >> > it for specific hours:
> >> >
> >> > windHW$ws[windHW$ws>=spring$sigma5] <- NA
> >> >
> >> > I imported the data into the dataframe windHW using readLines. I
> >> > am using R version 3.1.1.
> >> >
> >> > Any help would be appreciated!
> >> >
> >> > Thanks,
> >> > Alexandra
> >>
> >> ______________________________________________
> >> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >
> >
>