[R] Creating a new column inside a function.

Frederic Ntirenganya ntfredo at gmail.com
Tue Nov 25 12:44:59 CET 2014

Dear All,

I need help creating a new column in my dataset and using it as an
argument inside the following function. The column I want to create and
vary is "Evaporation". Its value varies, which is why I need it as an argument.

When I define the function like this, it does not work:
water_blnce=function(data,capacity_max=100, rain_col=NULL, day_col=NULL,
year_col=NULL, month_col=NULL, Evaporation = 5)

## function for water balance
library(reshape2) # dcast() below is assumed to come from reshape2

water_blnce=function(data,capacity_max=100, rain_col=NULL, day_col=NULL,
                     year_col=NULL, month_col=NULL){

  # This function computes the water balance of a dataset.
  # Input: data, variable(s) = capacity_max
  # It adds two new columns, Water_Balance and Evaporation, to the dataset.
  # The function uses the formula:
  #   Water balance of today = water balance of yesterday + Rainfall - Evaporation.
  # If Water balance today < 0 then Water balance today = 0
  # If Water balance today > 100 (capacity_max) then Water balance today = 100
  # The NAs due to non-recorded values are treated as zero.

  # finding the indices of all NAs in the data
  indicNAs <- which(is.na(data[[rain_col]]))
  ind_nonleap = c() # NAs due to non leap years
  ind_nonrecord = c() # NAs due to non recording values
  for (i_NA in indicNAs ){
    if(data[[day_col]][i_NA] == 60){
      ind_nonleap <- append(ind_nonleap,i_NA)
    } else {
      ind_nonrecord <- append(ind_nonrecord,i_NA)
    }
  }
  #cat( ind_nonleap)
  # assign the NAs due to non recording values to be zero.
  for(j in ind_nonrecord){
    data[[rain_col]][j] <- 0
  }
  # remove the rows which still have missing rainfall (day 60 of non leap years)
  data <- data[!is.na(data[[rain_col]]), ]
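  # Adding new columns for water balance and evaporation to the data frame
  data$Water_Balance <- NA
  data$Evaporation <- 5 # assumed constant; this is the value I want to pass as an argument
  # initialization (assumed: the balance starts at zero)
  data$Water_Balance[1] <- 0
  # loop for calculating water balance for a given dataset
  ndays <- nrow(data)
  for (iday in seq_len(ndays)[-1]) {
    data$Water_Balance[iday] <- data$Water_Balance[iday-1] +
      data[[rain_col]][iday] - data$Evaporation[iday]
    if (data$Water_Balance[iday]<0){
      data$Water_Balance[iday] <- 0
    }else if(data$Water_Balance[iday]>capacity_max){
      data$Water_Balance[iday] <- capacity_max
    }
  }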
  # Table of water balance for a specific year.
  # subset the data for each year

  out = list() # list of output
  for (year in unique(data[[year_col]])){
    dat<-subset(data, data[[year_col]]==year)
    out[[year -(min(unique(data[[year_col]]))-1)]] <- dcast(dat,
      dat[[day_col]]~dat[[month_col]], value.var="Water_Balance")
    #add column names as month
  }
  out # assumed return value: one day-by-month table per year
}
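
To make clear what I am after, here is a stripped-down sketch of the update
rule on its own, with evaporation passed as an argument and written into a new
column. The names add_water_balance, evaporation and start are made up for
illustration, the balance is assumed to start at `start` before the first row,
and the leap-year handling and yearly tables are left out, so this is only a
sketch and not the full function:

```r
# Minimal sketch with hypothetical names; not the full water_blnce() function.
add_water_balance <- function(data, rain_col, capacity_max = 100,
                              evaporation = 5, start = 0) {
  n <- nrow(data)
  # rep_len() lets `evaporation` be a single value or one value per row
  data$Evaporation <- rep_len(evaporation, n)
  rain <- data[[rain_col]]
  rain[is.na(rain)] <- 0 # treat unrecorded rainfall as zero
  wb <- numeric(n)
  prev <- start # assumed balance before the first row
  for (i in seq_len(n)) {
    # previous balance + rain - evaporation, clamped to [0, capacity_max]
    prev <- min(max(prev + rain[i] - data$Evaporation[i], 0), capacity_max)
    wb[i] <- prev
  }
  data$Water_Balance <- wb
  data
}

# toy data, invented for this example
toy <- data.frame(Rain = c(10, 0, NA, 120, 3))
res <- add_water_balance(toy, rain_col = "Rain", evaporation = 5)
print(res)
```

With a default value in the signature, evaporation can be left out of the call
or varied per call, e.g. add_water_balance(toy, "Rain", evaporation = 2).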

Any help is appreciated!!!


Frederic Ntirenganya
Maseno University,
African Maths Initiative,
Email: fredo at aims.ac.za
