[R-sig-Geo] merging tables by columns AND row names (coordinates)

Mikkel Grum mi2kelgrum at yahoo.com
Sat Sep 9 20:17:25 CEST 2006


Thanks Roger, I've been using this to bind rows
together, where I don't necessarily have the exact
same columns in each table. It seems to work fine for
that, so I hadn't noticed that it doesn't reliably
match floating-point coordinates. However, it looks like
merging will work if you can afford to round off the
coordinates and convert them to a character variable:

Table1$XCOORD <- as.character(round(Table1$XCOORD, 12))
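
A rough sketch of the same idea applied to both coordinate columns of
both tables, using one combined character key. The toy tables, the
helper name make_key and digits = 5 are assumptions made up for
illustration, not something tested on the real data:

```r
# Two small hypothetical tables; the second carries tiny floating-point
# noise in XCOORD and is missing one cell.
Table1 <- data.frame(XCOORD = c(27.47500, 27.48788, 27.50075),
                     YCOORD = c(42.52641, 42.52641, 42.52641),
                     OBSERVATION = c(177, 177, 179))
Table2 <- data.frame(XCOORD = c(27.47500, 27.50075) + 1e-9,
                     YCOORD = c(42.52641, 42.52641),
                     OBSERVATION = c(233, 233))

# Hypothetical helper: round both coordinates and paste them into one
# fixed-decimal character key. digits = 5 is an assumption for this toy data.
make_key <- function(tab, digits = 5) {
  paste(formatC(round(tab$XCOORD, digits), format = "f", digits = digits),
        formatC(round(tab$YCOORD, digits), format = "f", digits = digits),
        sep = "_")
}

Table1$KEY <- make_key(Table1)
Table2$KEY <- make_key(Table2)

# Merge on the character key instead of the raw floating-point columns.
# Only KEY and OBSERVATION are taken from Table2, so a cell that occurs
# only in Table2 would get NA coordinates in the result.
merged <- merge(Table1, Table2[, c("KEY", "OBSERVATION")],
                by = "KEY", all = TRUE, suffixes = c("1", "2"))
```

Cells that are absent from one table come out as NA in that table's
observation column.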

The number of digits you need to round to will depend
on your data set and you might have trouble spotting
errors in a large data set. If the points belong to a
regular grid and you can attach the row and column
indices, that would be more robust.
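
A minimal sketch of the row/column index route, assuming the grid
origin and spacing are known. The values of x0, dx, y0 and dy below
are guesses read off the sample rows in the original post (dy is pure
assumption, since only one y value is shown), and grid_index and
add_index are hypothetical helper names:

```r
# Map a coordinate to a 1-based integer grid index.
grid_index <- function(coord, origin, step) {
  as.integer(round((coord - origin) / step)) + 1L
}

x0 <- 27.47500; dx <- 0.012875   # assumed from the printed sample rows
y0 <- 42.52641; dy <- 0.012875   # assumed; the post shows only one y value

# Hypothetical toy tables; Table2 has slight noise and one missing cell.
Table1 <- data.frame(XCOORD = c(27.47500, 27.48788, 27.50075),
                     YCOORD = 42.52641,
                     OBSERVATION = c(177, 177, 179))
Table2 <- data.frame(XCOORD = c(27.475, 27.500750001),
                     YCOORD = 42.52641,
                     OBSERVATION = c(233, 233))

# Attach integer ROW and COL indices to a table.
add_index <- function(tab) {
  tab$COL <- grid_index(tab$XCOORD, x0, dx)
  tab$ROW <- grid_index(tab$YCOORD, y0, dy)
  tab
}

# Merge on the integer indices, which compare exactly.
merged <- merge(add_index(Table1),
                add_index(Table2)[, c("ROW", "COL", "OBSERVATION")],
                by = c("ROW", "COL"), all = TRUE, suffixes = c("1", "2"))
```

Because the keys are integers, the match no longer depends on how many
digits you round to, only on the assumed origin and spacing being right.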

Mikkel

--- Roger Bivand <Roger.Bivand at nhh.no> wrote:

> On Fri, 8 Sep 2006, Mikkel Grum wrote:
> 
> > merge(Table1, Table2, 
> >    by = intersect(c("XCOORD", "YCOORD"), 
> >    c("XCOORD", "YCOORD")), all = TRUE)
> > 
> > It might not handle the amount of data you have, but,
> > if your tables are normal dataframes, it would do the
> > job with a smaller dataset. It doesn't work with
> > Spatial*DataFrames (yet?).
> 
> I would be wary of this with coords as floating point, because they ought
> to be snapped together. I believe that the original data were from a
> regular grid with missing cells. If that is the case, and the coordinates
> can be mapped to integer row and column IDs, then certainly your route
> will work. You are right that there is as yet no cbind/rbind/merge
> facility for Spatial*DataFrames.
> 
> Roger
> 
> > 
> > Mikkel
> > 
> > --- Roger Bivand <Roger.Bivand at nhh.no> wrote:
> > 
> > > On Fri, 8 Sep 2006, Michael Sumner wrote:
> > > 
> > > > Hello, I can think of a couple of simple-minded approaches that would
> > > > take some time - either relying on direct string-matching for the unique
> > > > coordinates, or by some contrived overlay.
> > > >
> > > > However, there's probably far better approaches - a couple of questions:
> > > >
> > > > Can you predefine the set of all unique coordinates without reading all
> > > > the tables from file?
> > > >  - if so you might simplify the identification of each individual
> > > > coordinate, for matching the records
> > > >
> > > > Are the coordinates (intended to be) on a regular grid?  (This seems
> > > > unlikely, although it is nearly true given your X coordinates).
> > > 
> > > The key question is what the data are. To me they look like a global
> > > regular grid with some slippage in the print() - the underlying diff() of
> > > the unique x's and y's is almost certainly regular. I'm not sure why they
> > > are in text files either (model output?). But some bits of the grid may be
> > > missing, the question being whether this is regular. If as an earlier
> > > response indicated different data sets have different grid cells
> > > missing, then we need the overall grid to start with, then grab the row
> > > and column indices (and/or grid index), and attach these to the data rows.
> > >
> > > If the solution needs to be robust, and have a longer term utility, I
> > > would go for using MySQL, Terralib, and aRT. The data representation is
> > > that of the Terralib Cell object, so the question would be how to upload
> > > to the database from the text files.
> > >
> > > aRT is at:
> > >
> > > http://www.est.ufpr.br/aRT/
> > >
> > > By the way, 1M by 100 by 8 bytes is pushing 32-bit R - but handing off a
> > > lot of the data storage to a database relieves this greatly.
> > > 
> > > Roger
> > > 
> > > > 
> > > > Cheers, Mike.
> > > > 
> > > > 
> > > > isidora k wrote:
> > > > > Hi everyone!
> > > > > I have 100 tables of the form:
> > > > > XCOORD,YCOORD,OBSERVATION
> > > > > 27.47500,42.52641,177
> > > > > 27.48788,42.52641,177
> > > > > 27.50075,42.52641,179
> > > > > 27.51362,42.52641,178
> > > > > 27.52650,42.52641,180
> > > > > 27.53937,42.52641,178
> > > > > 27.55225,42.52641,181
> > > > > 27.56512,42.52641,177
> > > > > 27.57800,42.52641,181
> > > > > 27.59087,42.52641,181
> > > > > 27.60375,42.52641,180
> > > > > 27.61662,42.52641,181
> > > > > ..., ..., ...
> > > > > with approximately 1000000 observations for each. All
> > > > > these tables have the same xcoord and ycoord and I
> > > > > would like to get a table of the form
> > > > > XCOORD,YCOORD,OBSERVATION1,OBSERVATION2,... 
> > > > > 27.47500,42.52641,177,233,...
> > > > > 27.48788,42.52641,177,345,...
> > > > > 27.50075,42.52641,179,233,...
> > > > > 27.51362,42.52641,178,123,...
> > > > > 27.52650,42.52641,180,178,...
> > > > > 27.53937,42.52641,178,...,...
> > > > > 27.55225,42.52641,181,...
> > > > > 27.56512,42.52641,177,...
> > > > > 27.57800,42.52641,181,...
> > > > > 27.59087,42.52641,181,...
> > > > > 27.60375,42.52641,180,...
> > > > > 27.61662,42.52641,181,...
> > > > > In other words I would like to merge all the tables
> > > > > taking into account the common row names of their
> > > > > xcoords AND ycoords.
> > > > > Not all tables have the same number of observations
> > > > > which means that not all pairs of x and y coords
> > > > > match.
> > > > > Is there a way to do this in R?
> > > > > I would be grateful for any advice.
> > > > > Many thanks
> > > > > Isidora
> > > > >
> > > > >
> _______________________________________________
> > > > > R-sig-Geo mailing list
> > > > > R-sig-Geo at stat.math.ethz.ch
> > > > >
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> > > > >
> > > > >
> > > > >
> > > > 
> > > >
> _______________________________________________
> > > > R-sig-Geo mailing list
> > > > R-sig-Geo at stat.math.ethz.ch
> > > >
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> > > > 
> > > 
> > > -- 
> > > Roger Bivand
> > > Economic Geography Section, Department of Economics,
> > > Norwegian School of Economics and Business Administration,
> > > Helleveien 30, N-5045 Bergen,
> > > Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
> > > e-mail: Roger.Bivand at nhh.no
> > > 
> > > 
> > 
> > 