[R] building a spatial matrix

A M Lavezzi mario.lavezzi at unipa.it
Fri May 13 16:36:26 CEST 2016


*PLEASE IGNORE THE PREVIOUS EMAIL, IT WAS SENT BY MISTAKE*

Hello Sarah
thanks a lot for your advice.

I followed your suggestions until the creation of "result".

The allocation of the values of result$distance to the matrix result.m,
however, does not seem to work: it produces a matrix with identical columns
corresponding to the last values of result$distance. Maybe my description
of the dataset was not clear enough.
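
For what it is worth, the symptom described above is consistent with how R treats two separate index vectors: m[rows, cols] addresses the whole block of rows crossed with cols and recycles the right-hand side down each column, whereas a two-column index matrix addresses one (row, col) pair per value. A toy sketch of the difference (rows, cols, vals, block and pairs are names made up for illustration, not taken from the data):

```r
rows <- c(1, 2, 3)
cols <- c(2, 3, 1)
vals <- c(10, 20, 30)

# Two index vectors: the 3 x 3 block rows x cols is filled column by column,
# so vals is recycled and every column receives the same values.
block <- matrix(NA_real_, nrow = 3, ncol = 3)
block[rows, cols] <- vals

# One two-column index matrix: only the cells (1,2), (2,3) and (3,1) are filled.
pairs <- matrix(NA_real_, nrow = 3, ncol = 3)
pairs[cbind(rows, cols)] <- vals
```

In this toy case every column of block ends up as 10, 20, 30, while pairs has exactly three filled cells.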

I produced the final matrix spat_dist with a loop, which I report below (it
takes about 1 hour on my MacBook Pro):

set_i = -1   # create a variable to store the i values already examined

for(i in unique(result$id)){

  set_i=c(set_i,i) # store the value of the i

  # identify the locations connected to i. If the distance between i and j
  # was examined before, don't look for the distance between j and i
  set_neigh = result$id_neigh[result$id==i & !result$id_neigh %in% set_i]

  for(j in set_neigh){
    if(i!=j){
      spat_dist[i,j] = result$distance[result$id==i &  result$id_neigh==j]
      spat_dist[j,i] = spat_dist[i,j]
    }
    else{
      spat_dist[i,j]=0
    }
  }
}
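
A possible vectorized equivalent of the loop above, as a sketch only. It assumes that result has integer columns id and id_neigh running from 1 to n in increasing order, and at most one row per (id, id_neigh) pair. The toy data and n are invented here; only the last three statements do the work:

```r
# Toy stand-in for "result" (assumption: ids are the integers 1..n)
set.seed(1)
n <- 5
result <- expand.grid(id = seq_len(n), id_neigh = seq_len(n))
result$distance <- runif(nrow(result))

spat_dist <- matrix(NA_real_, nrow = n, ncol = n)

# One assignment per row of result: cell (id, id_neigh) gets that row's distance
spat_dist[cbind(result$id, result$id_neigh)] <- result$distance

# Like the loop, keep the i -> j distance for i < j and mirror it to j -> i
lower <- lower.tri(spat_dist)
spat_dist[lower] <- t(spat_dist)[lower]

# Distance of a unit to itself
diag(spat_dist) <- 0
```

The two-column matrix built by cbind() is what makes the assignment pairwise, so no lookup of the form result$id==i & result$id_neigh==j is needed for each cell.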

It is not the most elegant and efficient solution in the world, that's for
sure.

I would be grateful if you could suggest an alternative instruction to:

result.m[factor(result$fcell), factor(result$cellneigh)] <- result$distance

so I will learn a faster procedure (I tried many times to modify this
structure, but I did not manage to). I don't want to take up too much of
your time, so forget it if you are busy.
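
One candidate replacement for that line, sketched on the fake data from Sarah's message: translate each cell code into its row position in idcell with match(), then assign through a two-column index matrix. This assumes each fcell occurs only once in idcell, since match() returns only the first hit; row_idx and col_idx are names introduced here for illustration.

```r
# Fake data, as in Sarah's example (set.seed added so the run is repeatable)
set.seed(42)
idcell <- data.frame(id = seq_len(5), fcell = sample(1:100, 5))
censDist <- expand.grid(fcell = seq_len(100), cellneigh = seq_len(100))
censDist$distance <- runif(nrow(censDist))

result <- subset(censDist,
                 fcell %in% idcell$fcell & cellneigh %in% idcell$fcell)

result.m <- matrix(NA_real_, nrow = nrow(idcell), ncol = nrow(idcell))

# Position of each cell code within idcell gives the row/column of result.m
row_idx <- match(result$fcell, idcell$fcell)
col_idx <- match(result$cellneigh, idcell$fcell)

# Pairwise assignment: one (row, column) cell per distance value
result.m[cbind(row_idx, col_idx)] <- result$distance
```

Unlike factor(), match() ties the row order of result.m to the row order of idcell, so row k of the matrix corresponds to idcell$id[k].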

Thank you so much anyway,
Mario

PS: I attach the data. Note that the 1327 units in id_cell are firms,
indexed by id, located in location f_cell. Different firms can be located
in the same f_cell. With respect to your suggestion, I added two columns to
"result" with the id of the firms.
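
When several firms share a location, one way to add the firm ids is to merge id_cell onto censDist twice, once for each end of the pair. A minimal sketch with invented toy data (four firms, with firms 2 and 3 sharing cell 20; the distances are simply absolute differences of the codes). The column names follow the description above, while neigh and M are hypothetical helper names:

```r
# Toy data: firm ids and the cell each firm is located in
id_cell <- data.frame(id = 1:4, f_cell = c(10, 20, 20, 30))

# Toy cell-to-cell distances
cells <- unique(id_cell$f_cell)
censDist <- expand.grid(f_cell = cells, f_cell_neigh = cells)
censDist$distance <- abs(censDist$f_cell - censDist$f_cell_neigh)

# Attach the firm id of the origin cell (one row per firm located there)
result <- merge(id_cell, censDist, by = "f_cell")

# Attach the firm id of the neighbouring cell
neigh <- setNames(id_cell, c("id_neigh", "f_cell_neigh"))
result <- merge(result, neigh, by = "f_cell_neigh")

# Firm-by-firm matrix, filled pairwise through a two-column index matrix
M <- matrix(NA_real_, nrow = nrow(id_cell), ncol = nrow(id_cell))
M[cbind(result$id, result$id_neigh)] <- result$distance
```

Two firms in the same cell get whatever distance censDist reports from that cell to itself (0 in this toy data); if the real censDist has no such rows, those cells of M stay NA.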

On Fri, May 13, 2016 at 3:26 PM, A M Lavezzi <mario.lavezzi at unipa.it> wrote:

>
> Hello Sarah
> thanks a lot for your advice.
>
> I followed your suggestions until the creation of "result"
>
> The allocation of the values of result$distance to the matrix result.m,
> however, does not seem to work: it produces a matrix with identical columns
> corresponding to the last values of result$distance. Maybe my description
> of the dataset was not clear enough.
>
> I produced the final matrix with a loop, that I report below (it takes
> about 1 hour on my macbook pro),
>
> set_i = -1   # create a variable to store the i values already examined
>
> for(i in unique(result$id)){
>
>   set_i=c(set_i,i) # store the value of the i
>
>   set_neigh = result$id_neigh[result$id==i & !result$id_neigh %in% set_i]
> # identify the locations connected to i. Exclude                  those
>
>   for(j in set_neigh){
>     if(i!=j){
>       spat_dist[i,j] = result$distance[result$id==i &  result$id_neigh==j]
>       spat_dist[j,i] = spat_dist[i,j]
>     }
>     else{
>       spat_dist[i,j]=0
>     }
>   }
> }
>
> It is not the most elegant and efficient solution in the world, that's for
> sure.
>
>
>
> On Thu, May 12, 2016 at 2:51 PM, Sarah Goslee <sarah.goslee at gmail.com>
> wrote:
>
>> I don't see any reason why a loop is out of the question, and
>> answering would have been much easier if you'd included the requested
>> reproducible data, but what about this?
>>
>> This solution is robust to pairs from idcell being absent in censDist,
>> and to the distance from A to B being different from the distance
>> from B to A, but not to A-B appearing twice. If that's possible,
>> you'll need to figure out how to manage it.
>>
>> # create some fake data
>>
>> idcell <- data.frame(
>>   id = seq_len(5),
>>   fcell = sample(1:100, 5))
>>
>> censDist <- expand.grid(fcell=seq_len(100), cellneigh=seq_len(100))
>> censDist$distance <- runif(nrow(censDist))
>>
>> # assemble the non-symmetric distance matrix
>> result <- subset(censDist, fcell %in% idcell$fcell & cellneigh %in%
>> idcell$fcell)
>> result.m <- matrix(NA, nrow=nrow(idcell), ncol=nrow(idcell))
>> result.m[factor(result$fcell), factor(result$cellneigh)] <-
>> result$distance
>>
>> Sarah
>>
>> On Thu, May 12, 2016 at 5:26 AM, A M Lavezzi <mario.lavezzi at unipa.it>
>> wrote:
>> > Hello,
>> >
>> > I have a sample of 1327 locations, each one identified by an id and a
>> > numerical code.
>> >
>> > I need to build a spatial matrix, say, M, i.e. a 1327x1327 matrix
>> > collecting distances among the locations.
>> >
>> > M(i,i) should be 0, M(i,j) should contain the distance between
>> > locations i and j.
>> >
>> > I should use data organized in the following way:
>> >
>> > 1) id_cell contains the identifier (id) of each location (1...1327) and
>> > the numerical code of the location (f_cell) (see head of id_cell below)
>> >
>> >> head(id_cell)
>> >    id f_cell
>> > 1   1   2120
>> > 12  2    204
>> > 22  3   2546
>> > 24  4   1327
>> > 34  5   1729
>> > 43  6   2293
>> >
>> > 2) censDist contains, for each location identified by its numerical
>> > code, the distance to other locations (censDist has 1.5 million rows).
>> > The head(censDist) below, for example, reads like this:
>> >
>> > location 2924 has a distance to 2732 of 1309.7525
>> > location 2924 has a distance to 2875 of 696.2891,
>> > etc.
>> >
>> >> head(censDist)
>> >   f_cell f_cell_neigh  distance
>> > 1   2924         2732 1309.7525
>> > 2   2924         2875  696.2891
>> > 3   2924         2351 1346.0561
>> > 4   2924         2350 1296.9804
>> > 5   2924         2725 1278.1877
>> > 6   2924         2721 1346.9126
>> >
>> >
>> > Basically, for every location in id_cell I should pick up the distance
>> > to other locations in id_cell from censDist, and allocate it in M.
>> >
>> > I have not come up with a satisfactory vectorization of this problem and
>> > using a loop is out of the question.
>> >
>> > Thanks for your help
>> > Mario
>> >
>> >
>>
>
>
>
> --
> Andrea Mario Lavezzi
> DiGi,Sezione Diritto e Società
> Università di Palermo
> Piazza Bologni 8
> 90134 Palermo, Italy
> tel. ++39 091 23892208
> fax ++39 091 6111268
> skype: lavezzimario
> email: mario.lavezzi (at) unipa.it
> web: http://www.unipa.it/~mario.lavezzi
>



