[R] if else statement for rain data to define zero for dry and one to wet

roslinazairimah zakaria roslinaump at gmail.com
Sun Jun 7 16:57:52 CEST 2015


Dear all,

All works well. Thank you so much for your help.

## Function 1
wet_dry1 <- function(x,thresh=0.1)
 { for(column in 1:dim(x)[2]) x[,column] <- ifelse(x[,column]>=thresh,1,0)
 return(x)
 }

wet_dry1(dt)
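
Function 1 relies on ifelse() because it tests every element of a vector, whereas if() expects a single TRUE or FALSE. The sketch below is only an illustration of that difference; the vector x and its three values are made up for the example and are not part of the rain data.

```r
# Three illustrative rainfall values (mm); chosen for this sketch only
x <- c(0.0, 25.3, 0.05)

# ifelse() tests every element and returns a vector of the same length
state <- ifelse(x >= 0.1, 1, 0)
state

# if() wants a single TRUE/FALSE. With a longer condition, R 4.2.0 and later
# stop with an error; older versions warned and used only the first element.
res <- tryCatch(
  if (x >= 0.1) 1 else 0,
  error = function(e) "error: condition has length > 1"
)
res
```

This is likely why the loop in my original code never produced a full column of 0s and 1s; in addition, the result of the if() was never assigned to anything.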


## Function 2
wet_dry2 <- ( dt >= 0.1)*1
wet_dry2

wet_total <- colSums(wet_dry2)
pp <- wet_total/nrow(dt)
pp
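
Following Bill's remark that TRUE and FALSE are treated as 1 and 0 in arithmetic, the proportion of wet days can probably be computed without converting to 0/1 first. A minimal sketch, assuming a small data frame rain_demo (a name used only here) that holds the first two columns of the sample data from my original post:

```r
# First five rows of X1950 and X1951 from the sample data in the thread
rain_demo <- data.frame(
  X1950 = c(0.0, 0.0, 25.3, 12.7, 0.0),
  X1951 = c(0.0, 0.0, 6.7, 3.4, 0.0)
)

wet <- rain_demo >= 0.1      # logical matrix: TRUE = wet, FALSE = dry
pp_demo <- colMeans(wet)     # proportion of wet days in each column
pp_demo
```

colMeans() on the logical matrix should give the same numbers as colSums(wet * 1) / nrow(rain_demo), which is what the code above does in two steps.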


## Function 3
rain <- dt
wet_dry3 <- ifelse(rain >= 0.1, 1, 0)
wet_dry3
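
My original aim was to see the rain data and the wet/dry state side by side. One possible way to do that is sketched below. The data frame rain_demo, the variable thresh and the "_state" suffix are assumptions made for this example only; the values are the first five rows of X1950 and X1951 from the sample data.

```r
# First five rows of X1950 and X1951 from the sample data in the thread
rain_demo <- data.frame(
  X1950 = c(0.0, 0.0, 25.3, 12.7, 0.0),
  X1951 = c(0.0, 0.0, 6.7, 3.4, 0.0)
)
thresh <- 0.1

# 0/1 state for every cell, as an integer data frame
state <- as.data.frame((rain_demo >= thresh) * 1L)
names(state) <- paste0(names(rain_demo), "_state")  # hypothetical naming scheme

# Bind both tables, then reorder so each year is followed by its state column
both <- cbind(rain_demo, state)[, order(rep(seq_along(rain_demo), 2))]
both
```

The reordering step only interleaves the columns; cbind(rain_demo, state) on its own already keeps all the information if the column order does not matter.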

On Sun, Jun 7, 2015 at 5:48 AM, William Dunlap <wdunlap at tibco.com> wrote:

> Your f1() has an unneeded for loop in it.
>    f1a <- function(mat) ifelse(mat > 0.1, 1, 0)
> would do the same thing in a bit less time.
>
> However, I think that a simple
>    mat > 0.1
> would be preferable.  The resulting TRUEs and FALSEs
> are easier to interpret than the 1s and 0s that f1a()
> produces, and arithmetic functions treat TRUE
> as 1 and FALSE as 0 internally.  E.g., mean(mat>0.1)
> gives the proportion of wet(tish) days.
>
>
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Sat, Jun 6, 2015 at 1:55 PM, Dennis Murphy <djmuser at gmail.com> wrote:
>
>> I'm sorry, but I have to take issue with this particular use case of
>> ifelse(). When the goal is to generate a logical vector, ifelse() is
>> very inefficient. It's better to apply a logical condition directly to
>> the object in question and multiply the result by 1 to make it
>> numeric/integer rather than logical.
>>
>> To illustrate this, consider the following toy example. The function
>> f1 replicates the suggestion to apply ifelse() columnwise (with the
>> additional overhead of preallocating storage for the result), whereas
>> the function f2 applies the logical condition on the matrix itself
>> using vectorization, with the recognition that a matrix is an atomic
>> vector with a dim attribute.
>>
>> set.seed(5290)
>>
>> # 1000 x 1000 matrix
>> m <- matrix(sample(c(0, 0.05, 0.2), 1e6, replace = TRUE), ncol = 1000)
>>
>> f1 <- function(mat)
>>   {
>>      newmat <- matrix(NA, ncol = ncol(mat), nrow = nrow(mat))
>>      for(i in seq_len(ncol(mat)))
>>          newmat[, i] <- ifelse(mat[, i] > 0.1, 1, 0)
>>      newmat
>>   }
>>
>> f2 <- function(mat) 1 * (mat > 0.1)
>>
>>
>> On my system, I got
>>
>> > system.time(m1 <- f1(m))
>>    user  system elapsed
>>    0.14    0.00    0.14
>>
>> > system.time(m2 <- f2(m))
>>    user  system elapsed
>>    0.01    0.00    0.01
>>
>> > identical(m1, m2)
>> [1] TRUE
>>
>> The all too common practice of using  ifelse(condition, 1, 0) on an
>> atomic vector is easily replaced by 1 * (condition), where the result
>> of condition is a logical atomic object coerced to numeric.
>>
>> To reduce memory use, it would be better to define f2 as
>>
>> f2 <- function(mat) 1L * (mat > 0.1)
>>
>> but doing so in this example no longer creates identical objects since
>>
>> > typeof(m1)
>> [1] "double"
>>
>> Thus, f1 is not only inefficient in terms of execution time, it's also
>> inefficient in terms of storage.
>>
>> Given several recent warnings in this forum about the inefficiency of
>> ifelse() and the dozens of times I've seen the idiom implemented in f1
>> as a solution over the last several years (to which I have likely
>> contributed in my distant past as an R-helper), I felt compelled to
>> say something about this practice, which BTW extends not just to 0/1
>> return values but to 0/x return values, where x is a nonzero real number.
>>
>> Dennis
>>
>>
>> On Sat, Jun 6, 2015 at 12:50 AM, Jim Lemon <drjimlemon at gmail.com> wrote:
>> > Hi rosalinazairimah,
>> > I think the problem is that you are using "if" instead of "ifelse".
>> > Try this:
>> >
>> > wet_dry<-function(x,thresh=0.1) {
>> >  for(column in 1:dim(x)[2]) x[,column]<-ifelse(x[,column]>=thresh,1,0)
>> >  return(x)
>> > }
>> > wet_dry(dt)
>> >
>> > and see what you get.
>> >
>> > Also, why can I read your message perfectly while everybody else can't?
>> >
>> > Jim
>> >
>> >>> -----Original Message-----
>> >>> From: roslinaump at gmail.com
>> >>> Sent: Fri, 5 Jun 2015 16:49:08 +0800
>> >>> To: r-help at r-project.org
>> >>> Subject: [R] if else statement for rain data to define zero for dry
>> >>> and one to wet
>> >>>
>> >>> Dear r-users,
>> >>>
>> >>> I have a set of rain data:
>> >>>
>> >>>   X1950 X1951 X1952 X1953 X1954 X1955 X1956 X1957 X1958 X1959 X1960 X1961 X1962
>> >>> 1   0.0   0.0  14.3   0.0  13.5  13.2   4.0     0   3.3     0     0   0.0
>> >>> 2   0.0   0.0  21.9   0.0  10.9   6.6   2.1     0   0.0     0     0   0.0
>> >>> 3  25.3   6.7  18.6   0.8   2.3   0.0   8.0     0   0.0     0     0  11.0
>> >>> 4  12.7   3.4  37.2   0.9   8.4   0.0   5.8     0   0.0     0     0   5.5
>> >>> 5   0.0   0.0  58.3   3.6  21.1   4.2   3.0     0   0.0     0     0  15.9
>> >>>
>> >>>
>> >>> I would like to go through each column and recode each cell: a value
>> >>> greater than 0.1 mm becomes 1 and anything else becomes zero. I would
>> >>> then like to attach the rain data and the category side by side:
>> >>>
>> >>>
>> >>> 1950   state
>> >>>
>> >>> 1 0.0    0
>> >>>
>> >>> 2 0.0    0
>> >>>
>> >>> 3 25.3   1
>> >>>
>> >>> 4 12.7   1
>> >>>
>> >>> 5 0.0    0
>> >>>
>> >>>
>> >>> ...
>> >>>
>> >>>
>> >>> This is my code:
>> >>>
>> >>>
>> >>> wet_dry  <- function(dt)
>> >>>
>> >>> { cl   <- length(dt)
>> >>>
>> >>>   tresh  <- 0.1
>> >>>
>> >>>
>> >>>   for (i in 1:cl)
>> >>>
>> >>>   {  xi <- dt[,i]
>> >>>
>> >>>      if (xi < tresh ) 0 else 1
>> >>>
>> >>>   }
>> >>>
>> >>> dd <- cbind(dt,xi)
>> >>>
>> >>> dd
>> >>>
>> >>> }
>> >>>
>> >>>
>> >>> wet_dry(dt)
>> >>>
>> >>>
>> >>> Results:
>> >>>
>> >>>> wet_dry(dt)
>> >>>
>> >>>   X1950 X1951 X1952 X1953 X1954 X1955 X1956 X1957 X1958 X1959 X1960 X1961
>> >>> 1   0.0   0.0  14.3   0.0  13.5  13.2   4.0   0.0   3.3   0.0   0.0   0.0
>> >>> 2   0.0   0.0  21.9   0.0  10.9   6.6   2.1   0.0   0.0   0.0   0.0   0.0
>> >>> 3  25.3   6.7  18.6   0.8   2.3   0.0   8.0   0.0   0.0   0.0   0.0  11.0
>> >>> 4  12.7   3.4  37.2   0.9   8.4   0.0   5.8   0.0   0.0   0.0   0.0   5.5
>> >>>
>> >>>   X1962 X1963 X1964 X1965 X1966 X1967 X1968 X1969 X1970 X1971 X1972 X1973
>> >>> 1   4.2   0.0   2.2   0.0   4.4   5.1     0   7.2   0.0   0.0   0.0   5.1
>> >>> 2   8.4   0.0   4.0   0.0   4.9   0.7     0   0.0   0.0   0.0   0.0   5.4
>> >>> 3   4.2   0.0   2.0   0.0  14.2  17.1     0   0.0   0.0   0.0   0.0   2.1
>> >>> 4   0.0   0.0   5.4   0.0   6.4  14.9     0  10.1   2.9 143.4   0.0   6.1
>> >>>
>> >>>   X1974 X1975 X1976 X1977
>> >>> 1     0   0.0     0   0.3
>> >>> 2     0   3.3     0   0.3
>> >>> 3     0   1.7     0   4.4
>> >>> 4     0   0.0     0  33.5
>> >>>
>> >>>
>> >>> It does not work and gives me the original data.  Why is that?
>> >>>
>> >>>
>> >>> Thank you so much for your help.
>> >>>
>> >>>       [[alternative HTML version deleted]]
>> >>>
>> >>> ______________________________________________
>> >>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> >>> https://stat.ethz.ch/mailman/listinfo/r-help
>> >>> PLEASE do read the posting guide
>> >>> http://www.R-project.org/posting-guide.html
>> >>> and provide commented, minimal, self-contained, reproducible code.
>> >
>>
>>
>
>
