[R-sig-Geo] Subsetting dataframe by all factor levels

Justin H. jstnhllrd using gmail.com
Fri Sep 14 20:10:03 CEST 2018


One more thing. Let's still assume you want to interpolate it yearly. The
code below assigns each year's output to its own named object during the loop.

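for (i in levels(rainfall.year$year)) {
  # Creates one object per year, e.g. interpolation_output_2005.
  # interpolation_function() is a placeholder for your interpolation call.
  assign(paste("interpolation_output", i, sep = "_"),
         interpolation_function())
}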
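
As an alternative to assign(), the yearly outputs can be kept together in one
named list. The following is only a minimal sketch under stated assumptions:
the small rainfall data frame is made-up toy data with the same column names
as in the thread, and interpolation_function() is a hypothetical placeholder
that just returns a mean, standing in for a real interpolation routine.

```r
# Toy stand-in for the `rainfall` data frame from the thread (values are made up).
rainfall <- data.frame(
  name     = factor(rep(c("Station A", "Station B"), each = 4)),
  sampdate = as.Date(rep(c("2005-01-01", "2005-06-01",
                           "2006-01-01", "2006-06-01"), times = 2)),
  prcp     = c(0.5, 0.1, 0.2, 0.4, 1.0, 0.0, 0.3, 0.7)
)

# Derive a year factor from the sampling date.
rainfall$year <- factor(format(rainfall$sampdate, "%Y"))

# Mean precipitation per station and year, already in "long" form.
rainfall.year <- aggregate(prcp ~ name + year, data = rainfall, FUN = mean)

# Hypothetical placeholder: a real interpolation routine would go here.
interpolation_function <- function(d) mean(d$prcp)

# split() returns one data frame per year, named by the factor level;
# lapply() then applies the function to each and keeps the names.
by_year <- split(rainfall.year, rainfall.year$year)
interpolation_output <- lapply(by_year, interpolation_function)

# Access a single year's result by name.
interpolation_output[["2005"]]
```

The same split() call with rainfall$name as the grouping factor gives one data
frame per station, which is the subsetting asked about at the start of the
thread. A list keeps the results in one place and avoids filling the workspace
with separately named objects.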


Cheers,
Justin

On Fri, Sep 14, 2018 at 2:03 PM Justin H. <jstnhllrd using gmail.com> wrote:

> Hi Rich,
>
> For the sake of example, here's a solution for a simple aggregation.
>
> >aggregate(rainfall, list(rainfall$name), mean)
> #This aggregates all columns and determines their mean. You're left with
> #58 rows.
> >aggregate(rainfall[, c("elev", "prcp")], list(rainfall$name), mean)
> #In case you only want to aggregate over select columns (elev and prcp
> #are used here as an example).
>
>
> I am assuming you want one row for every combination of year and station,
> with its average precipitation. To aggregate it in that way you will
> need to create a new column that represents the year (or month/year if the
> data are appropriate for that resolution).
>
> >rainfall.year<-with(rainfall, tapply(prcp, list(name=name, year=year), mean))
> #This does the aggregation.
> >rainfall.year<-as.data.frame(as.table(rainfall.year), responseName="prcp")
> #However, tapply gives you a "wide" matrix. This makes it a "long" data
> #frame, as you probably want it, with columns name, year and prcp.
>
> A for-loop option.
>
> for (i in levels(rainfall.year$year)) {
> print(i)
> print(mean(rainfall.year[rainfall.year$year==i, "prcp"], na.rm=TRUE))
> }
>
> The loop will return the mean rainfall per year, where year and prcp are
> the year and precipitation columns of rainfall.year (you can also index
> the columns by number instead of by name). na.rm=TRUE skips station/year
> combinations that have no data.
> Try running that loop to see that it is properly looping through the
> factor you want and then stick in the interpolation function.
>
> I hope that helps!
>
> Cheers,
> Justin
>
> On Fri, Sep 14, 2018 at 1:13 PM Rich Shepard <rshepard using appl-ecosys.com>
> wrote:
>
>>    I need to learn geospatial analyses in R to complement my GIS knowledge.
>> I've just re-read the subsetting chapter in Hadley's 'Advanced R' without
>> seeing how to create separate data frames by extracting all rows for
>> each site name in the parent data frame in one step. I believe that what I
>> need to do is create a list of the factor names and feed them to a loop
>> subsetting each to a new dataframe. Perhaps there's a better way unknown to
>> me and I need advice, suggestions, and recommendations how to proceed.
>>
>>    The inclusive data frame has this structure:
>>
>> str(rainfall)
>> 'data.frame':   113569 obs. of  6 variables:
>>   $ name    : Factor w/ 58 levels "Blazed Alder",..: 20 20 20 20 20 20 20 ...
>>   $ easting : num  2370575 2370575 2370575 2370575 2370575 ...
>>   $ northing: num  199338 199338 199338 199338 199338 ...
>>   $ elev    : num  228 228 228 228 228 228 228 228 228 228 ...
>>   $ sampdate: Date, format: "2005-01-01" "2005-01-02" ...
>>   $ prcp    : num  0.59 0.08 0.1 0 0 0.02 0.05 0.1 0 0.02 ...
>>
>>    My goal is to use the monthly mean rainfall at each of the 58 reporting
>> stations to interpolate/extrapolate rainfall over the entire county for
>> selected years to show variability. The data points are not evenly
>> distributed but clustered in more populated areas and dispersed in rural
>> areas. My geochemical data typically are like this and I need to also learn
>> how this distribution affects how the data are analyzed.
>>
>> TIA,
>>
>> Rich
>>
>> _______________________________________________
>> R-sig-Geo mailing list
>> R-sig-Geo using r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>
>
