[R-sig-Geo] R equivalent of join between two data frames

Agustin Lobo alobolistas at gmail.com
Tue Nov 1 13:38:11 CET 2011


I have had problems using merge() for a similar purpose (see [1]),
so I use the function below to join a data frame to an existing
SpatialPolygonsDataFrame or SpatialPointsDataFrame (SPDF).
Extending this to two SPDFs should be easy.
I wish this type of command existed in the official sp package.

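"mijoin" <- function(SPDF1,tabla,by.x,by.y)
{
#Joins tabla to the data frame of SPDF1,
#SPDF1 being a Spatial Points (or Polygons) Data Frame.
#The row order of SPDF1@data is kept; rows without a match get NA.
tabladata = SPDF1@data
x = tabladata[[by.x]]
y = tabla[[by.y]]
#match() returns, for each row of SPDF1@data, the first matching row of tabla
tabla2 <- tabla[match(x, y),]
tabladata3 = cbind(tabladata,tabla2)
#restore the original row names, which link the attributes to the geometries
row.names(tabladata3)=row.names(SPDF1@data)
SPDF2 = SPDF1
SPDF2@data = tabladata3
SPDF2
}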
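
To see why the match() step keeps the rows aligned, here is a minimal
sketch on plain data frames (no sp needed). The toy tables and the
names LC and CLASS are made up for illustration; they are not the
files from [2] and [3].

```r
# Hypothetical stand-ins for SPDF1@data and tabla
tabladata <- data.frame(ID = 1:5, LC = c(3L, 1L, 2L, 1L, 5L))
tabla <- data.frame(LC = c(1L, 2L, 3L),
                    CLASS = c("forest", "peat", "crop"),
                    stringsAsFactors = FALSE)

# For each row of tabladata, the position of its key in tabla (NA if absent)
idx <- match(tabladata$LC, tabla$LC)

# Indexing by idx yields one row per row of tabladata, in the same order
joined <- cbind(tabladata, tabla[idx, "CLASS", drop = FALSE])

# Reset the row names, as mijoin() does
row.names(joined) <- row.names(tabladata)
joined
```

Note that match() only picks the first matching row of the table, so
this is a many-to-one lookup; duplicated keys in the table are silently
ignored.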

Agus

[1] The problem I found using merge()
(files can be downloaded from [2] and [3])

require(rgdal)
require(foreign)
fireLC =
readOGR(dsn="/media/Iomega_HDD/JACO/ExFireBorneo",layer="WFA_Borneo_1997_1998_LC",stringsAsFactors=F)
plot(fireLC)
str(fireLC,max.level=2)
fireLCdata = fireLC@data
fireLCdata[1:10,]
names(fireLCdata)[1]="LC"

legendLC =
read.csv("/media/Iomega_HDD/JACOB/ExFireBorneo/ExFireBorneoDATA/legend_SEA2007LC.txt",header=T,sep=",")
legendLC

Now the merge with merge():

x = merge(fireLCdata,legendLC,by.x="LC",by.y="LC",sort=F,all.x=T)

but note that despite sort=F the ordering has changed:
x[1:10,]
fireLCdata[1:10,]

fixing the ordering:
x = x[order(x$ID),]
x[1:10,]
fireLC@data = x

If the ordering had not been fixed, the spatial links in fireLC would
have been broken and fireLC@data = x would be wrong.
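
When the table has no ID column to sort on, one possible workaround is
to add an explicit row index before merging and sort on it afterwards.
The following is only a sketch on toy data; fire, legend and row_id are
hypothetical names, not objects from the files above.

```r
# Hypothetical stand-ins for fireLCdata and legendLC
fire <- data.frame(LC = c(5L, 3L, 1L, 2L, 1L), AREA = c(10, 20, 30, 40, 50))
legend <- data.frame(LC = c(1L, 2L, 3L),
                     CLASS = c("forest", "peat", "crop"),
                     stringsAsFactors = FALSE)

# Remember the original row order before merging
fire$row_id <- seq_len(nrow(fire))

# Left outer join; the row order of the result is not guaranteed,
# even with sort = FALSE
m <- merge(fire, legend, by = "LC", sort = FALSE, all.x = TRUE)

# Restore the original order and drop the helper column
fixed <- m[order(m$row_id), ]
row.names(fixed) <- row.names(fire)
fixed$row_id <- NULL
fixed
```

After this step fixed has one row per row of fire, in the same order,
which is the condition for assigning it back to the data slot.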

[2] https://sites.google.com/site/filestemp2/home/fireLC.rda
[3] https://sites.google.com/site/filestemp2/home/legendLC.rda

2011/11/1 Ralf Schäfer <senator at ecotoxicology.de>:
> Hi Erin,
>
> as pointed out, you can request a left outer join with all.x=T and a right outer join with all.y=T. The incomparables argument is probably meant to distinguish an inner from an outer join, though I have not found this specified in the merge documentation.
>
> Apart from that you can use SQL statements in R directly - here is a nice example:
> http://gettinggeneticsdone.blogspot.com/2010/05/use-sql-queries-to-manipulate-data.html
>
> Best regards
> Ralf
>
> ------------------------------------------------------------
>
> Prof. Dr. rer. nat. Ralf Bernhard Schäfer
> Juniorprofessor for Quantitative Landscape Ecology
> Environmental Scientist (M.Sc.)
> Institute for Environmental Sciences
> University Koblenz-Landau
> Fortstrasse 7
> 76829 Landau
> Germany
> Mail: schaefer-ralf at uni-landau.de
> Phone: ++49 (0) 6341 280-31536
> Web: http://tinyurl.com/6dnpxna