[R] building a spatial matrix
A M Lavezzi
mario.lavezzi at unipa.it
Fri May 13 16:36:26 CEST 2016
*PLEASE IGNORE THE PREVIOUS EMAIL, IT WAS SENT BY MISTAKE*
Hello Sarah
thanks a lot for your advice.
I followed your suggestions until the creation of "result".
The allocation of the values of result$distance to the matrix result.m,
however, does not seem to work: it produces a matrix with identical columns
corresponding to the last values of result$distance. Maybe my description
of the dataset was not clear enough.
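A likely reason for the identical columns, sketched on toy data (the names m, m2, i, j and v are made up for this illustration): when two index vectors are supplied separately, R assigns into every combination of the listed rows and columns rather than into the pairs (i[k], j[k]). Repeated row indices are then overwritten, and each column ends up holding the last value written for that row. A two-column index matrix addresses one cell per pair instead.

```r
m <- matrix(NA_real_, nrow = 2, ncol = 2)
i <- c(1, 1, 2, 2)      # row index of each pair
j <- c(1, 2, 1, 2)      # column index of each pair
v <- c(10, 20, 30, 40)  # one value per (i, j) pair

# Separate index vectors: rows i crossed with columns j, values recycled
m[i, j] <- v
print(m)   # both columns are identical: 20 in row 1, 40 in row 2

# Two-column index matrix: exactly one cell per (i, j) pair
m2 <- matrix(NA_real_, nrow = 2, ncol = 2)
m2[cbind(i, j)] <- v
print(m2)
```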
I produced the final matrix spat_dist with a loop, which I report below (it
takes about 1 hour on my MacBook Pro):
set_i = -1 # stores the i values already examined
for(i in unique(result$id)){
  set_i = c(set_i, i) # record the current i
  # identify the locations connected to i; if the distance between i and j
  # was examined before, don't look for the distance between j and i
  set_neigh = result$id_neigh[result$id==i & !result$id_neigh %in% set_i]
  for(j in set_neigh){
    if(i!=j){
      spat_dist[i,j] = result$distance[result$id==i & result$id_neigh==j]
      spat_dist[j,i] = spat_dist[i,j]
    }
    else{
      spat_dist[i,j]=0
    }
  }
}
It is not the most elegant or efficient solution in the world, that's for
sure.
I would be grateful if you could suggest an alternative instruction to:
result.m[factor(result$fcell), factor(result$cellneigh)] <- result$distance
so that I can learn a faster procedure (I tried many times to modify this
structure, but I did not manage). I don't want to take up too much of your
time, so forget it if you are busy.
Thank you so much anyway,
Mario
ps I attach the data. Notice that the 1327 units in id_cell are firms,
indexed by id, located in location f_cell. Different firms can be located
in the same f_cell. With respect to your suggestion, I added two columns to
"result" with the id of the firms.
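For reference, a minimal sketch of one vectorized alternative to the loop, using a two-column index matrix. It assumes that "result" carries the columns id, id_neigh and distance used in the loop above, that the ids are integers running from 1 to the number of firms, and that each (id, id_neigh) pair appears at most once. The toy data and the value n = 4 are invented for the example and are not the attached dataset.

```r
# Toy stand-in for "result": one row per ordered pair of firms (made-up values)
n <- 4
set.seed(1)
result <- expand.grid(id = seq_len(n), id_neigh = seq_len(n))
result$distance <- runif(nrow(result))

spat_dist <- matrix(NA_real_, nrow = n, ncol = n)

# Each row of the index matrix names one (row, column) cell of spat_dist
spat_dist[cbind(result$id, result$id_neigh)] <- result$distance

# The distance of a firm to itself is 0
diag(spat_dist) <- 0

# Optional: force symmetry by copying the upper triangle into the lower one
# (assumption: the value stored for id < id_neigh is the one to keep)
low <- lower.tri(spat_dist)
spat_dist[low] <- t(spat_dist)[low]
```

On data of the size described in this thread the assignment is a single vectorized step, so it should avoid the repeated scans of result$id that make the loop slow. If the ids were not consecutive integers, something like match(result$id, unique(result$id)) could map them to row positions first.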
On Fri, May 13, 2016 at 3:26 PM, A M Lavezzi <mario.lavezzi at unipa.it> wrote:
>
> On Thu, May 12, 2016 at 2:51 PM, Sarah Goslee <sarah.goslee at gmail.com>
> wrote:
>
>> I don't see any reason why a loop is out of the question, and
>> answering would have been much easier if you'd included the requested
>> reproducible data, but what about this?
>>
>> This solution is robust to pairs from idcell being absent in censDist,
>> and to the distance from A to B being different from the distance
>> from B to A, but not to A-B appearing twice. If that's possible,
>> you'll need to figure out how to manage it.
>>
>> # create some fake data
>>
>> idcell <- data.frame(
>> id = seq_len(5),
>> fcell = sample(1:100, 5))
>>
>> censDist <- expand.grid(fcell=seq_len(100), cellneigh=seq_len(100))
>> censDist$distance <- runif(nrow(censDist))
>>
>> # assemble the non-symmetric distance matrix
>> result <- subset(censDist, fcell %in% idcell$fcell & cellneigh %in%
>> idcell$fcell)
>> result.m <- matrix(NA, nrow=nrow(idcell), ncol=nrow(idcell))
>> result.m[factor(result$fcell), factor(result$cellneigh)] <-
>> result$distance
>>
>> Sarah
>>
>> On Thu, May 12, 2016 at 5:26 AM, A M Lavezzi <mario.lavezzi at unipa.it>
>> wrote:
>> > Hello,
>> >
>> > I have a sample of 1327 locations, each one identified by an id and a
>> > numerical code.
>> >
>> > I need to build a spatial matrix, say, M, i.e. a 1327x1327 matrix
>> > collecting distances among the locations.
>> >
>> > M(i,i) should be 0, M(i,j) should contain the distance between
>> > locations i and j
>> >
>> > I should use data organized in the following way:
>> >
>> > 1) id_cell contains the identifier (id) of each location (1...1327) and
>> > the numerical code of the location (f_cell) (see head of id_cell below)
>> >
>> >> head(id_cell)
>> > id f_cell
>> > 1 1 2120
>> > 12 2 204
>> > 22 3 2546
>> > 24 4 1327
>> > 34 5 1729
>> > 43 6 2293
>> >
>> > 2) censDist contains, for each location identified by its numerical
>> > code, the distance to other locations (censDist has 1.5 million rows).
>> > The head(censDist) below, for example, reads like this:
>> >
>> > location 2924 has a distance to 2732 of 1309.7525
>> > location 2924 has a distance to 2875 of 696.2891,
>> > etc.
>> >
>> >> head(censDist)
>> > f_cell f_cell_neigh distance
>> > 1 2924 2732 1309.7525
>> > 2 2924 2875 696.2891
>> > 3 2924 2351 1346.0561
>> > 4 2924 2350 1296.9804
>> > 5 2924 2725 1278.1877
>> > 6 2924 2721 1346.9126
>> >
>> >
>> > Basically, for every location in id_cell I should pick up the distance
>> > to other locations in id_cell from censDist, and allocate it in M
>> >
>> > I have not come up with a satisfactory vectorization of this problem and
>> > using a loop is out of the question.
>> >
>> > Thanks for your help
>> > Mario
>> >
>> >
>>
--
Andrea Mario Lavezzi
DiGi,Sezione Diritto e Società
Università di Palermo
Piazza Bologni 8
90134 Palermo, Italy
tel. ++39 091 23892208
fax ++39 091 6111268
skype: lavezzimario
email: mario.lavezzi (at) unipa.it
web: http://www.unipa.it/~mario.lavezzi