[R] Creating a new column inside a function.
Frederic Ntirenganya
ntfredo at gmail.com
Tue Nov 25 12:44:59 CET 2014
Dear All,
I need help creating a new column in my dataset and using it as an
argument of the following function. The column I want to create and
vary is "Evaporation". Its value varies, which is why I need it as an argument.
When I write the signature like this, it does not work:
water_blnce=function(data,capacity_max=100, rain_col=NULL, day_col=NULL,
year_col=NULL, month_col=NULL, Evaporation = 5)
## function for water balance
require(reshape2)
water_blnce=function(data,capacity_max=100, rain_col=NULL, day_col=NULL,
year_col=NULL, month_col=NULL){
#==========================================================================================================
# This function computes water balance of a dataset.
# Input: data, variable(s)= capacity_max
# It adds two new columns Water_balance and Evaporation to the dataset.
# The function uses the formula: Water balance of today = water balance
# of yesterday + Rainfall - Evaporation.
# If Water balance today < 0 then Water balance today = 0
# If Water balance today > capacity_max then Water balance today = capacity_max
# the NAs due to non recording values are considered as zero.
#===========================================================================================================
# finding the indices of all NAs in the data
indicNAs <- which(data[[rain_col]] %in% NA)
ind_nonleap = c() # NAs due to non leap years
ind_nonrecord = c() # NAs due to non recording values
for (i_NA in indicNAs ){
if(data[[day_col]][i_NA] == 60){
ind_nonleap <- append(ind_nonleap,i_NA)
}
else {
ind_nonrecord<-append(ind_nonrecord,i_NA)
}
#cat(ind_nonrecord)
#cat( ind_nonleap)
}
#ind_nonleap
#ind_nonrecord
# assign the NAs due to non recording values to be zero.
for(j in ind_nonrecord){
data[[rain_col]][j]=0
}
# remove the rows which have missing values
data<-na.omit(data)
# Adding a new column for water balance and evaporation to the data frame
data$Water_Balance <- NA
data$Evaporation<-5
# initialization
data$Water_Balance[1]=0
# loop for calculating water balance for a given dataset
ndays <- nrow(data)
for (iday in 2:ndays) {
data$Water_Balance[iday] <- data$Water_Balance[iday-1] +
data[[rain_col]][iday] - data$Evaporation[iday]
if (data$Water_Balance[iday]<0){
data$Water_Balance[iday]=0
}else if(data$Water_Balance[iday]>capacity_max){
data$Water_Balance[iday]=capacity_max
}
}
# Table of water balance for a specific year.
#subset the data for each year
out = list() # list of output
for (year in unique(data[[year_col]])){
dat<-subset(data, data[[year_col]]==year)
out[[year -(min(unique(data[[year_col]]))-1)]] <- dcast(dat,
dat[[day_col]]~dat[[month_col]], value.var="Water_Balance")
#add column names as month
colnames(out[[year
-(min(unique(data[[year_col]]))-1)]])[2:13]<-month.abb[1:12]
}
out
}
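
One way the function above could take evaporation as an argument is sketched here. This is only a minimal sketch under stated assumptions, not a tested replacement for the full function: the helper name `water_balance_sketch`, the toy data frame `toy` and its column `Rain` are made up for illustration, and the NA handling and the `dcast` tables are left out. The idea is to add `Evaporation` to the signature and assign it to the new column instead of the hard-coded 5, so that a single value is recycled to every row and a vector with one value per row gives a daily evaporation.

```r
# Hypothetical helper (the name is an assumption): only the water-balance
# part of the original function, with Evaporation passed in as an argument.
water_balance_sketch <- function(data, capacity_max = 100, rain_col,
                                 Evaporation = 5) {
  n <- nrow(data)
  stopifnot(n >= 1)
  # Accept either one value for all days or one value per row.
  if (!(length(Evaporation) %in% c(1, n))) {
    stop("Evaporation must have length 1 or nrow(data)")
  }
  data$Evaporation <- Evaporation    # a single value is recycled to all rows
  data$Water_Balance <- NA_real_
  data$Water_Balance[1] <- 0         # same initialization as the original
  for (iday in seq_len(n)[-1]) {     # empty loop when there is only one row
    wb <- data$Water_Balance[iday - 1] + data[[rain_col]][iday] -
      data$Evaporation[iday]
    # keep the balance between 0 and capacity_max
    data$Water_Balance[iday] <- min(max(wb, 0), capacity_max)
  }
  data
}

# Toy data, made up for illustration only
toy <- data.frame(Rain = c(0, 10, 0, 120, 3))
res_const <- water_balance_sketch(toy, rain_col = "Rain", Evaporation = 5)
res_vary  <- water_balance_sketch(toy, rain_col = "Rain",
                                  Evaporation = c(5, 4, 6, 5, 2))
```

In the full function the same assignment would replace `data$Evaporation<-5`. Because that line runs after `na.omit(data)`, a vector of evaporation values would presumably have to match the rows that remain after the missing values are removed.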
Any help is appreciated!!!
Regards,
Frederic.
Frederic Ntirenganya
Maseno University,
African Maths Initiative,
Kenya.
Mobile:(+254)718492836
Email: fredo at aims.ac.za
https://sites.google.com/a/aims.ac.za/fredo/