[R] substitute values

Mark Wardle mark at wardle.org
Tue Apr 3 12:24:52 CEST 2007


Sergio Della Franca wrote:
> I have a further problem in the following case:
>  
>  Years   Products   New Column
>  1       10         0
>  2       25         0
>  3       40         0
>  4       NA         0
>  5       35         0
>  <NA>    23         1
>  6       NA         0
>  7       67         0
>  8       NA         0
>  NA      NA         *NA*
>  NA      NA         *NA*
>  
> When I have NA in both columns, the procedure gives me NA.
> I'd like to obtain 0.
>  
> 
>  
> 2007/4/3, Mark Wardle <mark at wardle.org <mailto:mark at wardle.org>>:
> 
>     Sergio Della Franca wrote:
>     > Dear R-Helpers,
>     >
>     > I have the following data set(y):
>     >
>     > Years   Products
>     > 1          10
>     > 2          25
>     > 3          40
>     > 4          NA
>     > 5          35
>     > <NA>   23
>     > 6         NA
>     > 7         67
>     > 8         NA
>     >
>     > I want to create a new column into my dataset(y) under the following
>     > conditions:
>     > if years =NA and products >20 then new column=1 else new column=0;
>     > to obtain the following results:
>     >
>     > Years   Products New Column
>     > 1          10          0
>     > 2          25          0
>     > 3          40          0
>     > 4          NA         0
>     > 5          35          0
>     > <NA>   23          1
>     > 6         NA          0
>     > 7         67           0
>     > 8         NA          0
>     >
> 
>     How about using ifelse():
>     year = c(1,2,3,4,5,NA,6,7,8)
>     products = c(10,25,40,NA,35,23,NA,67,NA)
>     ifelse( is.na(year) & products>20,1,0)
>

Did you try to investigate why that happened? It's because the term
(products>20) evaluates to NA when products is NA, and TRUE & NA is
itself NA, so ifelse() returns NA wherever year is NA as well. Try typing
that by itself and experiment - it is the best way of learning!

For example:

> year = c(1,2,3,4,5,NA,6,7,8,NA)
> products = c(10,25,40,NA,35,23,NA,67,NA,NA)

Experiment and see what happens with

> year > 4

> products > 25

> is.na(year)

> is.na(year) & year>4
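
As a rough sketch of what those experiments show (the variable name
cond is only for illustration), R's logical AND is three-valued:
FALSE & NA is FALSE, but TRUE & NA stays NA. That is why only the rows
with NA in both columns end up as NA:

```r
year     <- c(1, 2, 3, 4, 5, NA, 6, 7, 8, NA)
products <- c(10, 25, 40, NA, 35, 23, NA, 67, NA, NA)

# Comparing NA with a number gives NA, not FALSE
print(products > 20)

# Three-valued logic: a FALSE on one side settles the result,
# a TRUE on one side leaves it unknown
print(FALSE & NA)   # FALSE
print(TRUE & NA)    # NA

# Only the last element, where both year and products are NA, is NA
cond <- is.na(year) & products > 20
print(cond)
```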


And finally, try the ifelse command, and read the help!

> ifelse(TRUE, 1, 0)
> ifelse(NA, 1, 0)
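
To see the same thing on a whole vector at once, here is a small
sketch (test and res are just illustrative names). Where the test is
NA, ifelse() picks neither the "yes" nor the "no" value and passes the
NA through:

```r
test <- c(TRUE, FALSE, NA)

# TRUE -> 1, FALSE -> 0, NA -> NA
res <- ifelse(test, 1, 0)
print(res)
```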


And so:

> ifelse( is.na(year) & !is.na(products) & products>20,1,0)
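
Put together on a data frame, this might look like the sketch below.
The data frame y mirrors the vectors above; the column name NewColumn
and the helper variable flag are my own choices, not anything fixed.
Because !is.na(Products) is FALSE whenever Products is NA, the
condition can never be NA, so the new column contains only 0 and 1:

```r
y <- data.frame(
  Years    = c(1, 2, 3, 4, 5, NA, 6, 7, 8, NA),
  Products = c(10, 25, 40, NA, 35, 23, NA, 67, NA, NA)
)

# Guard the comparison with !is.na() so an NA in Products yields FALSE
flag <- is.na(y$Years) & !is.na(y$Products) & y$Products > 20

# 1 where Years is missing and Products exceeds 20, otherwise 0
y$NewColumn <- ifelse(flag, 1, 0)
print(y)
```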


Did you look at ?ifelse and ?is.na? Really do try experimenting with
things - it is all quite logical (no pun intended). Suddenly, you'll
understand what is actually going on, and won't have to keep asking for
help with minor variations on the same theme.

Best wishes,

Mark
-- 
Specialist registrar, Neurology,
Cardiff, UK


