[R] How to select multiple cells in a matrix and perform an operation on corresponding cells in another matrix of the same size?

Bert Gunter gunter.berton at gene.com
Tue Mar 25 14:49:10 CET 2014


Couple of comments, but as I have not followed this closely, take them
with salt.

1. I would have thought that no flow or pollutants should be recorded
as zeros, not NA's. So don't you want to first change these to zeros?
If some of the NA's are actually missings, then I think you have
potentially crippling problems with the data, and are likely to
produce wrong results. Consult a local statistician in that case to
see if something can be done.

2. A small reproducible example as requested by the posting guide
would be very useful here. It is very difficult to wade through many
lines of code and grasp what's going on without it.

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
H. Gilbert Welch




On Tue, Mar 25, 2014 at 5:36 AM, Vermeulen, Lucie
<lucie.vermeulen at wur.nl> wrote:
> Hi Jim,
>
> Thanks for your help! I tried it on my own data, but there seems to be a problem with NA values. For regions where there is no flow or pollutant, my cells are NA. Using your script below changed to include NA's in flowmat, I get the error message "Error in if (selected_cells[row, col]) { :   missing value where TRUE/FALSE needed"
>
> I tried several things to solve this, indicated below in the script, but none of them work. Do you perhaps know a solution to this?
> Lucie
>
> # assume three flow rates
> flowmat<-matrix(sample(c(1,2,3,NA),100,TRUE),nrow=10)
> # directions are 1 to 8 starting at North and going clockwise
> dirmat<-matrix(sample(1:8,100,TRUE),nrow=10)
> # some values for pollutant concentration (?)
> pollmat<-matrix(runif(100),nrow=10)
> dimmat<-dim(flowmat)
> # these translate the "direction" value into row and column changes
> rowdirs<-c(0,1,1,1,0,-1,-1,-1)
> coldirs<-c(1,1,0,-1,-1,-1,0,1)
> # the matrix to be updated
> newpollmat<-matrix(0,nrow=dimmat[1],ncol=dimmat[2])
> for(flow in unique(flowmat)) {                                  #I tried [!is.na(FAMT)] or [complete.cases(FAMT)] or na.exclude(unique(FAMT)) or na.pass(unique(FAMT))
>         # this yields a matrix where all cells of a given flow are TRUE
>         selected_cells<-flowmat == flow
>         for(row in 1:dimmat[1]) {
>                 for(col in 1:dimmat[2]) {
>                         # only update the selected cells
>                         if(selected_cells[row,col]) {                   #I tried [!is.na(selected_cells[row,col])]  or (na.pass(selected_cells[row,col]))
>                                 newrow<-row+rowdirs[dirmat[row,col]]
>                                 newcol<-col+coldirs[dirmat[row,col]]
>                                 # don't go out of the matrix
>                                 if(newrow >= 1 && newrow <= 10 && newcol >= 1 && newcol <= 10)
>                                         # update the new pollutant matrix
>                                         newpollmat[newrow,newcol]<-pollmat[row,col]*flowmat[row,col]
>                         }
>                 }
>         }
> }
>
> -----Original Message-----
> From: Jim Lemon [mailto:jim at bitwrit.com.au]
> Sent: maandag 24 maart 2014 22:45
> To: Vermeulen, Lucie
> Cc: 'r-help at R-project.org'
> Subject: Re: [R] How to select multiple cells in a matrix and perform an operation on corresponding cells in another matrix of the same size?
>
> On 03/25/2014 01:02 AM, Vermeulen, Lucie wrote:
>> I am trying to write an R script to do pollution routing in world rivers, and need some help on selecting matrix cell coordinates and applying these to other matrices of the same dimension.
>>
>> My data: I have several matrices corresponding to hydrological parameters of world rivers on a half degree grid (360 rows, 720 columns). These matrices represent flow accumulation (how many cells flow into this cell), flow direction (which of the 8 surrounding cells does the load of certain cell flow to) and pollutant load.
>>
>> My idea: compute pollutant load in each grid cell from the start to the end of a river. I can base this on flow accumulation (low to high). However, each river basin can have multiple cells with the same flow accumulation value.
>>
>> The problem: I need to select all matrix cells of each value of flow accumulation (low to high), find their coordinates (row,column), and transfer the corresponding pollutant load to the correct adjacent cell using the flow direction matrix. I have tried various ways, but selecting the coordinates of the correct cells and applying these to another matrix I cannot get to work.
>> ...
>
> Hi Lucie,
> I had a look at this and I think what you want to do is to step through time intervals, transferring a combination of flow rates and pollution loads from cell to cell. At each time interval, the downstream cells will be updated with a value that reflects the amount of pollutants in those cells. It seems that you want to ignore evaporation and absorption.
>
> This might do what you want.
>
> # assume three flow rates
> flowmat<-matrix(sample(1:3,100,TRUE),nrow=10)
> # directions are 1 to 8 starting at North and going clockwise
> dirmat<-matrix(sample(1:8,100,TRUE),nrow=10)
> # some values for pollutant concentration (?)
> pollmat<-matrix(runif(100),nrow=10)
> dimmat<-dim(flowmat)
> # these translate the "direction" value into row and column changes
> rowdirs<-c(0,1,1,1,0,-1,-1,-1)
> coldirs<-c(1,1,0,-1,-1,-1,0,1)
> # the matrix to be updated
> newpollmat<-matrix(0,nrow=dimmat[1],ncol=dimmat[2])
> for(flow in unique(flowmat)) {
>   # this yields a matrix where all cells of a given flow are TRUE
>   selected_cells<-flowmat == flow
>   for(row in 1:dimmat[1]) {
>    for(col in 1:dimmat[2]) {
>     # only update the selected cells
>     if(selected_cells[row,col]) {
>      newrow<-row+rowdirs[dirmat[row,col]]
>      newcol<-col+coldirs[dirmat[row,col]]
>      # don't go out of the matrix
>      if(newx >= 1 && newx <= 10 && newy > = 1 && newy <= 10)
>       # update the new pollutant matrix
>       newpollmat[newrow,newcol]<-pollmat[row,col]*flowmat[row,col]
>     }
>    }
>   }
> }
>
> The above can be done in the inner two loops as it accounts for the flow rate in the calculation. You will probably want to update the "edge"
> cells with their original values or wrap values from the opposite edge if that is appropriate.
>
> Jim
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.




More information about the R-help mailing list