[R] Question with "apply"

David Winsemius dwinsemius at comcast.net
Thu May 9 23:25:29 CEST 2013


On May 9, 2013, at 8:50 AM, Pooya Lalehzari wrote:

> Hello,
> When I use "apply" on a data frame, it seems like I get an error when I have a column that is not numeric. Via trial and error I realized that if I remove that column, I can get it to run. Is there a better way to tell the function not to worry about the character columns, especially since I am not even trying to do any calculations on it?

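Not really. apply() coerces a data frame to a matrix with as.matrix() before it does anything else, so a single character column turns every value in every row into a character string.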
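
A small sketch of why the character column gets in the way, using made-up data (only the column names t5R and R_Cat come from your post; the values and the num_cols helper are my assumptions):

```r
# Made-up data; column names follow the original post.
Data_F <- data.frame(t5R   = c(-2.5, 0.3, 1.7),
                     R_Cat = c("a", "b", "c"),
                     stringsAsFactors = FALSE)

# apply() works on as.matrix(Data_F), and a matrix holds one type only,
# so the numeric column is converted to character as well.
m <- as.matrix(Data_F)
type_all <- typeof(m)

# Keeping only the numeric columns leaves the matrix numeric.
num_cols <- vapply(Data_F, is.numeric, logical(1))
type_num <- typeof(as.matrix(Data_F[, num_cols, drop = FALSE]))

print(c(all_columns = type_all, numeric_only = type_num))
```

Selecting columns with is.numeric is one way to avoid naming the offending column by hand, but it is still the same idea as dropping R_Cat.
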
> R_Cat is the character column that is causing error and I am trying to do calculations on t5R.
> 
>  apply(Data_F[,names(Data_F) != "R_Cat"],1,function(row) {
>    ifelse( abs(row["t5R"]) <Thresh1, 0,
>    ifelse( abs(row["t5R"]) <Thresh2, ifelse( row["t5R"] <0, -1, 1),
>    ifelse( abs(row["t5R"]) <Thresh3, ifelse( row["t5R"] <0, -2, 2),
>    ifelse( abs(row["t5R"]) <Thresh4, ifelse( row["t5R"] <0, -3, 3),
>    ifelse( abs(row["t5R"]) <0, -4, 4)))))

A better way to write that code, one that avoids all those ugly and inefficient nested 'ifelse' calls, is to replace them with vectorized operations:

Data_F_Cat$t5R_Cat <- findInterval( abs( Data_F_Cat$t5R ) , c(Thresh1, Thresh2, Thresh3, Thresh4) )
Data_F_Cat$t5R_Cat <- sign(Data_F_Cat$t5R) * Data_F_Cat$t5R_Cat

The last clause is clearly not coded correctly, since abs(anything) is never less than 0. Note that there could be a re-coding difference when abs(t5R) is >= Thresh4, which was not a condition you were testing for. I suspect you wanted them coded 4 or -4, and that is what my code should accomplish.
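
To check the behaviour at the boundaries, here is a minimal sketch on a plain vector. The threshold values and the test data are invented for illustration; substitute your own.

```r
# Hypothetical thresholds and data, for illustration only.
Thresh1 <- 1; Thresh2 <- 2; Thresh3 <- 3; Thresh4 <- 4
t5R <- c(-5, -3.5, -2.2, -1, -0.4, 0, 0.9, 1.5, 2, 3.9, 4, 7)

breaks <- c(Thresh1, Thresh2, Thresh3, Thresh4)

# findInterval() counts how many breaks are <= abs(t5R), giving 0 to 4;
# multiplying by sign() restores the direction of the original value.
t5R_Cat <- sign(t5R) * findInterval(abs(t5R), breaks)

print(data.frame(t5R, t5R_Cat))
```

A value exactly equal to a threshold lands in the higher category, which matches the strict "<" comparisons in your nested version, and anything at or beyond Thresh4 in absolute value comes out as 4 or -4.
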

-- 

David Winsemius
Alameda, CA, USA


