[R-sig-Geo] Merging data frame for SpPolDF

Agustin Lobo alobolistas at gmail.com
Thu Mar 19 20:21:25 CET 2009


Thanks. I might be wrong, but I think that your example is different.
The problem comes up when the second data frame does not
have values for all cases that are present in the first one. For example:

 > a1 <-data.frame(letras=c("A","B","C","D"),nums=c("1","2","3","4"))
 > a2 <-data.frame(letras=c("A","C","D"),nums=c("10","30","40"))
 > a1
  letras nums
1      A    1
2      B    2
3      C    3
4      D    4
 > a2
  letras nums
1      A   10
2      C   30
3      D   40
 > a2 <-data.frame(letters=c("A","C","D"),cods=c("10","30","40"))
 > merge(a1,a2,by.x="letras",by.y="letters",all.x=T,sort=F)
  letras nums cods
1      A    1   10
2      C    3   30
3      D    4   40
4      B    2 <NA>

which disrupts the ordering in a1 and thus creates a risk when
putting the merged data frame into the SpPolDF.
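
A minimal sketch of one way to guard against this reordering, using the same toy data: record the original row position in a helper column before merging, then sort on it afterwards. The column name .ord, the object name m and the use of stringsAsFactors = FALSE are assumptions made for illustration; they are not part of the session above.

```r
# Toy data from the example above (stringsAsFactors = FALSE is a choice
# made here for clarity, not part of the original session).
a1 <- data.frame(letras = c("A", "B", "C", "D"),
                 nums   = c("1", "2", "3", "4"),
                 stringsAsFactors = FALSE)
a2 <- data.frame(letters = c("A", "C", "D"),
                 cods    = c("10", "30", "40"),
                 stringsAsFactors = FALSE)

# Hypothetical helper column recording the original row position in a1.
a1$.ord <- seq_len(nrow(a1))

m <- merge(a1, a2, by.x = "letras", by.y = "letters",
           all.x = TRUE, sort = FALSE)

# Restore the original order, carry over a1's row names, drop the helper.
m <- m[order(m$.ord), ]
row.names(m) <- row.names(a1)
m$.ord <- NULL

# Fail loudly if the key column no longer lines up with a1.
stopifnot(identical(m$letras, a1$letras))
m
```

The stopifnot() call is there so that a forgotten reordering step stops the script instead of silently producing a misaligned table.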

And what you suggest would be:

 > a2[match(a1$letras, a2$letters), ]
   letters cods
1        A   10
NA    <NA> <NA>
2        C   30
3        D   40

which would not solve the problem.

Perhaps I did not correctly interpret your solution?
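
For comparison, the indexed rows above already follow the order of a1, with an NA row where a2 has no match. A minimal sketch of what the quoted suggestion appears to intend as the remaining step, namely binding the non-key columns onto a1, is given here with the same toy data. The names idx and a3 and the use of stringsAsFactors = FALSE are assumptions for illustration only.

```r
a1 <- data.frame(letras = c("A", "B", "C", "D"),
                 nums   = c("1", "2", "3", "4"),
                 stringsAsFactors = FALSE)
a2 <- data.frame(letters = c("A", "C", "D"),
                 cods    = c("10", "30", "40"),
                 stringsAsFactors = FALSE)

# Position in a2 of each key in a1; NA where a2 has no matching row.
idx <- match(a1$letras, a2$letters)

# Index a2 by those positions and append only the non-key column(s).
# drop = FALSE keeps a data frame even when a single column remains.
a3 <- cbind(a1, a2[idx, setdiff(names(a2), "letters"), drop = FALSE])

# Keep a1's row names explicitly, since they are what link to the polygons.
row.names(a3) <- row.names(a1)
a3
```

Whether this is acceptable depends on the unmatched rows being allowed to carry NA in the added columns, as they do with all.x=T in merge().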

Agus

Torleif Markussen Lunde wrote:
> Hi
>
> Maybe this can help? Please correct me if this is not what you wanted.
>
> require(maptools)
>
> nc <- readShapePoly(system.file("shapes/sids.shp", package="maptools")[1], 
> 		    proj4string=CRS("+proj=longlat +datum=NAD27"))
>
> #Create dummy data. Do some changes to make it look different (subset and
> #order)
> extra <- data.frame(ID=slot(nc, 'data')$CNTY_ID, 
> Ndata=runif(length(slot(nc, 'data')$CNTY_ID)))
> extra <- extra[4:67, 1:2]
> extra <- extra[order(extra$ID, decreasing=TRUE),]
> extra[1,1] <- 342
>
> #add the dummy data(.frame) (this part is what you want to do)
> extra <- extra[match(slot(nc, 'data')$CNTY_ID, extra$ID), 1:2]
>
> slot(nc, 'data')$Ndata <- extra$Ndata
> #or for the data frame
> slot(nc, 'data') <- cbind(slot(nc, 'data'), extra[-1])
>
> Best wishes
> Torleif
>
>
>
> On Thursday 19 March 2009 01:08:15 pm Agustin Lobo wrote:
>   
>> Hi!
>>
>> I often have to add more information to the data slot of
>> a SpPolDF imported from a shp file. I do it in this way, but I don't like
>> it much and would like feedback on a better way.
>>
>> #Import shp
>> MMAMBmuni <- readOGR("C:/Pruebas/DUNS/MMAMBmuni", layer="MMAMBmuni")
>> #Extract the DF
>> MMAMBmuniDFori <- MMAMBmuni@data
>>
>> #Make a new dataframe by merging with another DF
>> MMAMBmuniDFnew <-
>> merge(MMAMBmuniDFori,MMAMBempleados,by.x="MUNICIPI",by.y="CODMUN",all.x=T,sort=F)
>>
>> The problem here is that there are a couple of towns in the by.x field
>> for which we do not have any match in by.y.
>> As we have set all.x=T, we get a row for which the values from the
>> second data frame are NA. But, despite stating sort=F, those cases are
>> not in the same row as they are in the first data frame but are appended
>> at the end of the new data frame. This is bad news for us, as it breaks
>> the order required for including the new data frame as the data slot
>> of a new SpPolDF. Therefore, I have to reorder the new data frame, thanks
>> to another field, ID_GRAFIC:
>>
>> MMAMBmuniDFnew<- MMAMBmuniDFnew[order(MMAMBmuniDFnew$ID_GRAFIC),]
>>
>> and then copy the original row.names, required because the row.names are
>> the ones making the link to the polygons in the future SpPolDF:
>>
>> row.names(MMAMBmuniDFnew) <- row.names(MMAMBmuniDFori)
>>
>> #Now we put the new DF in lieu of the older one
>> #(MMAMBmuni2 is first created as a copy of the imported object):
>> MMAMBmuni2 <- MMAMBmuni
>> MMAMBmuni2@data <- MMAMBmuniDFnew
>>
>> #and finally save as shp
>> writeOGR(MMAMBmuni2,dsn="C:/Pruebas/DUNS/MMAMBmuni2",layer="MMAMBmuni2",
>> driver="ESRI Shapefile")
>>
>> Any suggestions on a better procedure? The problem is that sometimes I
>> forget to reorder and get a wrong shp. Until now, I have always noticed
>> the error, but I'm terrified by the idea of missing it some day
>> and using true garbage after that point...
>>
>> Thanks
>>
>> Agus
