[R] organizing data in a matrix avoiding loop

David L Carlson dcarlson at tamu.edu
Fri May 26 16:51:02 CEST 2017


How about this?

Trade <- xtabs(FLOW ~ iso_o + iso_d + year, dta)

This gives you a 3-d table (origin x destination x year) with FLOW as the cell entry. Then

apply(Trade, 1:2, sum, na.rm=TRUE)

gives you a 2-d table with the total flow for each origin/destination pair.
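
A minimal sketch of the two steps above on toy data. The flow values, the three-country list `iso` and the name `Total` are made up for illustration; only the column names come from the original post. The country columns are stored as factors with a shared set of levels on the assumption that you want a square table even when a country never appears as an origin (or destination) with non-missing data.

```r
# Hypothetical toy data in the same layout as the original post
iso <- c("ABW", "AFG", "AGO")
dta <- data.frame(
  iso_o = factor(c("ABW", "ABW", "ABW", "AFG", "AFG", "AGO"), levels = iso),
  iso_d = factor(c("AFG", "AFG", "AGO", "ABW", "ABW", "ABW"), levels = iso),
  year  = c(1985, 1986, 1985, 1985, 1986, 1985),
  FLOW  = c(10, NA, 5, 2, 3, NA)
)

# 3-d table: origin x destination x year, FLOW summed within each cell
Trade <- xtabs(FLOW ~ iso_o + iso_d + year, dta)

# Sum over the year dimension: one total per origin/destination pair
Total <- apply(Trade, 1:2, sum, na.rm = TRUE)
Total
```

Pairs with no non-missing flow come out as 0 here rather than NA, because of na.rm=TRUE in the apply() call.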


David L. Carlson
Department of Anthropology
Texas A&M University

-----Original Message-----
From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of S Ellison
Sent: Friday, May 26, 2017 8:28 AM
To: A M Lavezzi <mario.lavezzi at unipa.it>; r-help <r-help at r-project.org>
Subject: Re: [R] organizing data in a matrix avoiding loop

> -----Original Message-----
> From: A M Lavezzi
> 
> I have data on bilateral trade flows among countries in the following form:
> 
>       iso_o iso_d year FLOW
> 1   ABW   AFG 1985   NA
> 2   ABW   AFG 1986   NA
> 3   ABW   AFG 1987   NA
> 4   ABW   AFG 1988   NA
> 5   ABW   AFG 1989   NA
> 6   ABW   AFG 1990   NA
> 
>...
>
> I have 215 countries. I would like to create a 215x215 matrix, say M, in which
> element M(i,j) is the total trade between countries i and j between
> 1985 and 2015 (i.e. the sum of annual amounts of trade).
> 
> After collecting the country codes in a variable named "my_iso", I can obtain
> M in a straightforward way using a loop 
> 
> Is there a way to avoid these loops?

Using core R:
#Use aggregate() to aggregate across years:

dataTrade.ag <- aggregate(dataTrade[, 'FLOW', drop=FALSE], by=dataTrade[, c('iso_o', 'iso_d')], FUN=sum, na.rm=TRUE)

#where na.rm=TRUE (passed to sum()) essentially treats NAs as 0. If you really want NA, leave it out or set it to FALSE.
#This gives you one row per origin/destination pair, containing the total trade in FLOW.
#If the years you want are a subset, subset the data frame first.

#Form an empty matrix with suitable dimnames:
N_iso <- length(my_iso)
dT.m <- matrix(NA_real_, nrow=N_iso, ncol=N_iso, dimnames=list(my_iso, my_iso))

#Then use matrix indexing by name to populate your matrix with the available flow data:
dT.m[as.matrix(dataTrade.ag[1:2])] <- dataTrade.ag$FLOW
	#This relies on the default conversion of data frame factor (or character) columns to a
	#character matrix, together with R's facility for matrix indexing by a 2-column matrix

#Then
dT.m[1:10, 1:10]

#should have what you seem to want
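
For illustration, here is a self-contained sketch of the aggregate-then-index approach on toy data. The flow values and the three-country `my_iso` are made up; the column names follow the data shown in the original post (FLOW in upper case), and `dataTrade` is assumed to be the name of the data frame.

```r
# Hypothetical toy data in the same layout as the original post
my_iso <- c("ABW", "AFG", "AGO")
dataTrade <- data.frame(
  iso_o = c("ABW", "ABW", "ABW", "AFG", "AFG", "AGO"),
  iso_d = c("AFG", "AFG", "AGO", "ABW", "ABW", "ABW"),
  year  = c(1985, 1986, 1985, 1985, 1986, 1985),
  FLOW  = c(10, NA, 5, 2, 3, NA)
)

# One row per origin/destination pair, FLOW summed across years
dataTrade.ag <- aggregate(dataTrade[, "FLOW", drop = FALSE],
                          by = dataTrade[, c("iso_o", "iso_d")],
                          FUN = sum, na.rm = TRUE)

# Empty country-by-country matrix, labelled by country code
dT.m <- matrix(NA_real_, nrow = length(my_iso), ncol = length(my_iso),
               dimnames = list(my_iso, my_iso))

# A 2-column character matrix used as an index selects cells by (row, column) name
dT.m[as.matrix(dataTrade.ag[, c("iso_o", "iso_d")])] <- dataTrade.ag$FLOW
dT.m
```

Pairs that never occur in the data stay NA, whereas pairs whose only records are NA get 0 because of na.rm=TRUE.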


S Ellison



