[R] organizing data in a matrix avoiding loop
Duncan Murdoch
murdoch.duncan at gmail.com
Fri May 26 14:20:16 CEST 2017
On 26/05/2017 7:46 AM, A M Lavezzi wrote:
> Dear R-Users
>
> I have data on bilateral trade flows among countries in the following form:
>
>> head(dataTrade)
>
> iso_o iso_d year FLOW
> 1 ABW AFG 1985 NA
> 2 ABW AFG 1986 NA
> 3 ABW AFG 1987 NA
> 4 ABW AFG 1988 NA
> 5 ABW AFG 1989 NA
> 6 ABW AFG 1990 NA
>
> where:
> iso_o: code of country of origin
> iso_d: code of country of destination
> year: 1985:2015
> FLOW: amount of trade (values are "NA", 0s, or positive numbers)
>
> I have 215 countries. I would like to create a 215x215 matrix , say M, in
> which element M(i,j) is the total trade between countries i and j between
> 1985 and 2015 (i.e. the sum of annual amounts of trade).
>
> After collecting the country codes in a variable named "my_iso", I can
> obtain M in a straightforward way using a loop such as:
>
> for (i in my_iso){
>   for(j in my_iso)
>     if(i!=j){
>       M[seq(1:length(my_iso))[my_iso==i],seq(1:length(my_iso))[my_iso==j]] <-
>         sum(dataTrade[dataTrade$iso_o==i &
>                       dataTrade$iso_d==j,"FLOW"],na.rm=TRUE)
>     }
> }
>
> However, it takes ages.
>
> Is there a way to avoid these loops?
Assuming that each combination of the first 3 columns (origin,
destination, year) appears only once, you could do something like this:
# Put all the data into an array, indexed by origin, destination, year:
dataMatrix <- as.matrix(dataTrade) # Converts everything to character
dataArray <- array(0, c(215, 215, 31))
dimnames(dataArray) <- list(unique(dataMatrix[,1]),
unique(dataMatrix[,2]), unique(dataMatrix[,3]))
dataArray[dataMatrix[,1:3]] <- dataTrade$FLOW
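# Sum across years: margins 1 and 2 keep origin x destination,
# and na.rm = TRUE skips the NA flows
apply(dataArray, c(1, 2), sum, na.rm = TRUE)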
I haven't tried this (you didn't give a reproducible example...), so you
may need to tweak it a bit.
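Since no reproducible example was posted, here is a minimal sketch of the array approach on made-up data. The three country codes, the three years, and the FLOW values are all hypothetical. It builds the character index explicitly with cbind() and uses rowSums(..., dims = 2) to collapse the year dimension, which is one possible way to do the summing step rather than the only one.

```r
# Hypothetical toy data: 3 made-up country codes, 3 years,
# each (origin, destination, year) combination appearing once.
my_iso <- c("AAA", "BBB", "CCC")
years  <- 1985:1987
dataTrade <- expand.grid(iso_o = my_iso, iso_d = my_iso, year = years,
                         stringsAsFactors = FALSE)
dataTrade <- dataTrade[dataTrade$iso_o != dataTrade$iso_d, ]
dataTrade$FLOW <- seq_len(nrow(dataTrade))   # arbitrary values
dataTrade$FLOW[c(1, 5)] <- NA                # a few missing flows

# Character index matrix: one row per observation,
# columns are origin, destination, year (matched against dimnames).
idx <- cbind(dataTrade$iso_o, dataTrade$iso_d, as.character(dataTrade$year))

# Origin x destination x year array, zero where there is no observation.
dataArray <- array(0, c(length(my_iso), length(my_iso), length(years)),
                   dimnames = list(my_iso, my_iso, as.character(years)))
dataArray[idx] <- dataTrade$FLOW

# Sum over the third (year) dimension, ignoring NA flows.
M <- rowSums(dataArray, na.rm = TRUE, dims = 2)
M
```

Using the full list of country codes for both the row and column dimnames (instead of unique() on each column separately) keeps M square even if some country appears only as an origin or only as a destination.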
Duncan Murdoch