[R-sig-Geo] Mapping Geographical Coordinate Data

Roger Bivand Roger.Bivand using nhh.no
Mon Jul 6 11:47:36 CEST 2020


On Mon, 6 Jul 2020, Graham Leask wrote:

> Thank you. That’s helpful and confirms my thoughts that this does not
> follow the standard structure that can be read by the available existing
> functions within R.


For your subset of the input data, this appears to work:

library(sf)
# o is your data subset; [[ ]] returns the geometry column as a character
# vector whether o is a data.frame or a tibble
geoms <- o[[2]]
l_out <- lapply(geoms, function(geom) {
   # comma between lon and lat becomes a space
   o1 <- gsub("([0-9])(,)([0-9])", "\\1 \\3", geom)
   # all brackets to parentheses
   o2 <- gsub("\\]", ")", gsub("\\[", "(", o1))
   # drop the parentheses between coordinate pairs
   o3 <- gsub("([0-9])\\),\\(([0-9])", "\\1,\\2", o2)
   # three ((())) to one ()
   o4 <- gsub("\\(\\(\\(", "(", gsub("\\)\\)\\)", ")", o3))
   st_as_sfc(paste0("POLYGON", o4))
})
# combine the per-row sfc objects into one
out <- do.call("c", l_out)
plot(out, col=1:6)
st_is_valid(out)
# [1] FALSE FALSE  TRUE  TRUE  TRUE  TRUE
# repair the invalid geometries
out1 <- st_make_valid(out)
plot(out1, col=1:6)

but having only seen the first 6 rows, I cannot rule out further problems.
Refreshing one's knowledge of regular expressions is, I hope, an effective
mental exercise ...
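
A minimal sketch of what the substitutions do, run on a made-up two-part
string in the same bracket style (the coordinates and the names poly_wkt
and multi_wkt are invented for illustration, not taken from the data). It
also shows a MULTIPOLYGON variant, which keeps one more bracket level so
that each part stays a separate polygon. This is only a guess at why the
first two rows come out invalid: their later rings may lie outside the
first ring, which POLYGON would treat as holes.

```r
# Toy input: two parts, each a single closed ring (invented coordinates)
geom <- "[[[[0,0],[4,0],[4,4],[0,4],[0,0]]],[[[10,10],[11,10],[11,11],[10,10]]]]"

# The first three substitutions, as in the code above
o1 <- gsub("([0-9])(,)([0-9])", "\\1 \\3", geom)    # "0,0" becomes "0 0"
o2 <- gsub("\\]", ")", gsub("\\[", "(", o1))        # [] become ()
o3 <- gsub("([0-9])\\),\\(([0-9])", "\\1,\\2", o2)  # "0 0),(4 0" becomes "0 0,4 0"

# POLYGON reading: three parentheses collapse to one, so every ring after
# the first is treated as an interior ring (a hole)
o4 <- gsub("\\(\\(\\(", "(", gsub("\\)\\)\\)", ")", o3))
poly_wkt <- paste0("POLYGON", o4)

# MULTIPOLYGON reading (an assumption about the data, not something the
# file documents): three parentheses collapse to two, so each part
# remains a polygon of its own
m4 <- gsub("\\(\\(\\(", "((", gsub("\\)\\)\\)", "))", o3))
multi_wkt <- paste0("MULTIPOLYGON", m4)

cat(poly_wkt, multi_wkt, sep = "\n")
# POLYGON((0 0,4 0,4 4,0 4,0 0),(10 10,11 10,11 11,10 10))
# MULTIPOLYGON(((0 0,4 0,4 4,0 4,0 0)),((10 10,11 10,11 11,10 10)))

# Optional: compare validity of the two readings if sf is installed
if (requireNamespace("sf", quietly = TRUE)) {
  print(sf::st_is_valid(sf::st_as_sfc(c(poly_wkt, multi_wkt))))
}
```

If the MULTIPOLYGON reading is the right one for the real data, replacing
the last two lines of the function above with the m4/multi_wkt lines
would be the only change needed; whether it is right has to be checked
against the full file.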

Roger

>
> Kind regards
>
>
> Graham
>
> On Mon, 6 Jul 2020 at 10:23, Roger Bivand <Roger.Bivand using nhh.no> wrote:
>
>> On Sun, 5 Jul 2020, Graham Leask wrote:
>>
>>> Hi Roger
>>>
>>> Here is the file imported from the original .csv file. I suspect it may
>>> have originated as a .qvd file converted to .csv.
>>
>> The geometry string column contains [] separated geographical coordinates
>> of somewhat varying formatting (the string versions of first and last
>> ring coordinates are not always equal, I think), in an undocumented
>> format.
>>
>> There is certainly no existing function to read this. Use string handling
>> to convert to a format that can be read, most likely Well-Known Text. This
>> will involve quite advanced regular expression handling. The sample data
>> look like POLYGON objects (an exterior ring and interior rings), but they
>> might also be MULTIPOLYGON objects. From there, sf::st_as_sfc(). WKT does
>> not group coordinates as ((1,1),(2,2)), rather as (1 1, 2 2).
>>
>> It would still be very useful to know the provenance of the file in some
>> detail. It is not likely that anyone will help you write the regular
>> expression code to handle this data unless it can be generalised to a
>> common use case (you suggested QVD (QlikView)).
>>
>> Roger
>>
>>>
>>> dput(head(Geog1))
>>> structure(list(BrickCode = c("101;", "102;", "103;", "104;",
>>> "105;", "106;"), X.Key_Brick_Geometry =
>> c("[[[[15.066294,54.986481],[15.08849170010846,54.98916685060565],[15.109724384490239,55.009579959623146],[15.120340726681128,55.014414643337815],[15.120340726681128,55.02945588156123],[15.112619750542299,55.03805087483176],[15.123236092733189,55.04503430686406],[15.125166336767895,55.05040617765814],[15.138678045010845,55.05792679676985],[15.140608289045552,55.06759616419919],[15.158945607375271,55.081025841184385],[15.158945607375271,55.089620834454905],[15.151224631236442,55.10036457604306],[15.151224631236442,55.11110831763122],[15.147364143167028,55.11540581426648],[15.158945607375271,55.12829830417227],[15.154119997288504,55.13420736204576],[15.138678045010845,55.144951103633915],[15.117445360629068,55.146025477792726],[15.078840479934925,55.156769219380884],[15.046026331344903,55.17288483176312],[15.017072670824295,55.18040545087483],[14.990049254338395,55.19759543741588],[14.984258522234274,55.20726480484522],[14.968816569956616,55.215859798115744],[14.939862909436009,55.212636675639295],[14.926351201193059,55.220157294751004],[14.889676564533623,55.22767791386271],[14.872304368221258,55.24164477792732],[14.835629731561822,55.25131414535666],[14.820187779284165,55.25990913862718],[14.810536559110629,55.276024751009416],[14.781582898590022,55.29214036339165],[14.779652654555314,55.29966098250336],[14.766140946312365,55.29966098250336],[14.750698994034707,55.291065989232834],[14.749733872017353,55.2679669448183],[14.733326797722343,55.24916539703903],[14.715954601409978,55.23519853297442],[14.703408015184381,55.212099488559886],[14.703408015184381,55.203504495289366],[14.699547527114968,55.1992069986541],[14.699547527114968,55.17557076716016],[14.703408015184381,55.17127327052489],[14.703408015184381,55.12829830417227],[14.699547527114968,55.12400080753701],[14.699547527114968,55.11325706594885],[14.6947219170282,55.10734800807536],[14.685070696854664,55.10412488559892],[14.685070696854664,55.09660426648721],[14.70823362527115,55.081563028263794],[14.7217453335141,55.078339905787345],[14.746838505965293,55.06222429340511],[14.77000143438178,55.05362930013459],[14.79992021691974,55.05094336473755],[14.824048267353579,55.04288555854643],[14.880025344360087,55.03321619111709],[14.89739754067245,55.023546823687745],[14.959165349783081,55.00420808882907],[15.008386572668112,54.99507590847914],[15.043130965292843,54.9929271601615],[15.066294,54.986481]]],[[[15.182108999999999,55.323834],[15.19369,55.321686],[15.19369,55.319537],[15.185969,55.317387999999994],[15.182108999999999,55.319537],[15.182108999999999,55.323834]]],[[[15.182108999999999,55.323834],[15.174387999999999,55.323834],[15.174387999999999,55.325983],[15.182108999999999,55.325983],[15.182108999999999,55.323834]]]]",
>>>
>> "[[[[12.529952999999999,55.631105],[12.525127622017353,55.62519635262449],[12.507755425704989,55.61552698519515],[12.506790303687636,55.60102293405114],[12.52705786605206,55.57577514131897],[12.552151038503252,55.57255201884253],[12.565662746746202,55.55858515477792],[12.598476895336225,55.55643640646029],[12.631291043926247,55.57470076716016],[12.667965680585683,55.58222138627187],[12.679547144793926,55.58866763122476],[12.680512266811279,55.59457668909825],[12.673756412689805,55.602634495289365],[12.684372754880693,55.61606417227456],[12.684372754880693,55.62465916554508],[12.679547144793926,55.6327169717362],[12.66217494848156,55.63594009421265],[12.654453972342733,55.64238633916554],[12.64866324023861,55.65850195154778],[12.638046898047723,55.66763413189771],[12.649628362255964,55.676229125168234],[12.639012020065074,55.683212557200534],[12.637081776030367,55.69073317631224],[12.619709579718004,55.68858442799461],[12.610058359544468,55.678915060565274],[12.59461640726681,55.670320067294746],[12.577244210954445,55.669245693135935],[12.564698,55.661187999999996],[12.563732502711495,55.65527882907133],[12.552151038503252,55.64453508748317],[12.529952999999999,55.631105]]],[[[12.734558999999999,55.609618],[12.749035930043384,55.606931991924625],[12.777024468546637,55.59027919246299],[12.774129102494577,55.58866763122476],[12.75096617407809,55.595113876177656],[12.739384709869848,55.60156012113055],[12.734558999999999,55.609618]]],[[[12.792466,55.607468999999995],[12.762547638286334,55.60585761776581],[12.749035930043384,55.61767573351278],[12.739384709869848,55.618750107671595],[12.730698611713665,55.63755165545087],[12.743245197939261,55.6670969448183],[12.769303492407808,55.671931628532974],[12.779919834598697,55.66494819650067],[12.784745444685466,55.65689039030955],[12.784745444685466,55.62895666218034],[12.792466420824294,55.613915423956925],[12.792466,55.607468999999995]]]]",
>>>
>> "[[[[12.545395,55.684824],[12.564698,55.684824],[12.568558,55.689122],[12.552151038503252,55.70792316285329],[12.545395,55.708459999999995],[12.541535,55.706312],[12.541535,55.702014],[12.537673999999999,55.702014],[12.537673999999999,55.697717],[12.529952999999999,55.697717],[12.529952999999999,55.695567999999994],[12.537673999999999,55.695567999999994],[12.537673999999999,55.691269999999996],[12.541535,55.691269999999996],[12.541535,55.686972999999995],[12.545395,55.684824]]]]",
>>>
>> "[[[[12.510651,55.635403],[12.526093,55.635403],[12.529952999999999,55.631105],[12.552151038503252,55.64453508748317],[12.563732502711495,55.65527882907133],[12.564698,55.661187999999996],[12.556976648590021,55.66548538358008],[12.529952999999999,55.665485],[12.525127622017353,55.6531300807537],[12.511615913774403,55.65205570659488],[12.50293,55.641849],[12.506789999999999,55.6397],[12.506789999999999,55.635403],[12.510651,55.635403]]]]",
>>>
>> "[[[[12.50293,55.641849],[12.511615913774403,55.65205570659488],[12.525127622017353,55.6531300807537],[12.529952999999999,55.665485],[12.538639330260303,55.682138183041715],[12.545395,55.684824],[12.541535,55.686972999999995],[12.541535,55.691269999999996],[12.537673999999999,55.691269999999996],[12.537673999999999,55.695567999999994],[12.529952999999999,55.695567999999994],[12.503894937635573,55.68858442799461],[12.479766999999999,55.67408],[12.480732009219087,55.65742757738896],[12.490383229392624,55.65205570659488],[12.49231347342733,55.6466838358008],[12.50293,55.641849]]]]",
>>>
>> "[[[[12.510651,55.635403],[12.506789999999999,55.635403],[12.506789999999999,55.6397],[12.50293,55.641849],[12.49231347342733,55.6466838358008],[12.490383229392624,55.65205570659488],[12.480732009219087,55.65742757738896],[12.479766999999999,55.67408],[12.469150545010844,55.683212557200534],[12.464324934924077,55.69341911170928],[12.464324934924077,55.70201410497981],[12.453708592733188,55.70684878869448],[12.452743,55.714907],[12.43826664045553,55.71436940780619],[12.425720054229934,55.7084603499327],[12.391940783622559,55.70577441453566],[12.387115,55.699864999999996],[12.393871027657266,55.68965880215343],[12.372638343275487,55.68428693135935],[12.371673221258133,55.661187886944816],[12.36781273318872,55.65474164199192],[12.382289563449023,55.64346071332436],[12.406417999999999,55.613915],[12.432475908351408,55.61337823687752],[12.460464446854663,55.60102293405114],[12.494243717462037,55.60156012113055],[12.50292981561822,55.611766675639295],[12.499069327548806,55.62251041722745],[12.510651,55.635403]]]]"
>>> )), row.names = c(NA, 6L), class = "data.frame")
>>>
>>>> On 5 Jul 2020, at 15:00, Roger Bivand <Roger.Bivand using nhh.no> wrote:
>>>>
>>>> On Sat, 4 Jul 2020, Graham Leask wrote:
>>>>
>>>>> Dear List,
>>>>>
>>>>> I have a postcode file containing geographical coordinates, but it is
>>>>> not in the format of a standard shapefile. I list some information below:
>>>>
>>>> Is the smoking gun 'format.stata = "%9s"'? What generated the data: was
>>>> it, for example, read into R using foreign, haven, or some other function
>>>> or package for reading Stata objects? What function in Stata generated
>>>> that file (if it was made in Stata)? Could you please provide the full
>>>> context, including the lines of the Stata do-file used? Some seem to use
>>>> Stata for mapping:
>>>>
>>>>
>>>> https://blog.stata.com/2020/04/07/how-to-create-choropleth-maps-using-the-covid-19-data-from-johns-hopkins-university/
>>>>
>>>> so we need to know what made the object, and whether it could have made
>>>> something more readable.
>>>>
>>>> Roger
>>>>
>>>>>
>>>>>
>>>>> structure(list(Postcode = structure(c("101", "102", "103", "104",
>>>>> "105", "106"), label = "Brick code", format.stata = "%9s"),
>> Postcode_geometry =
>> structure(c("[[[[15.066294,54.986481],[15.08849170010846,54.98916685060565],[15.109724384490239,55.009579959623146],[15.120340726681128,55.014414643337815],[15.120340726681128,55.02945588156123],[15.112619750542299,55.03805087483176],[15.123236092733189,55.04503430686406],[15.125166336767895,55.05040617765814],[15.138678045010845,55.05792679676985],[15.140608289045552,55.06759616419919],[15.158945607375271,55.081025841184385],[15.158945607375271,55.089620834454905],[15.151224631236442,55.10036457604306],[15.151224631236442,55.11110831763122],[15.147364143167028,55.11540581426648],[15.158945607375271,55.12829830417227],[15.154119997288504,55.13420736204576],[15.138678045010845,55.144951103633915],[15.117445360629068,55.146025477792726],[15.078840479934925,55.156769219380884],[15.046026331344903,55.17288483176312],[15.017072670824295,55.18040545087483],[14.990049254338395,55.19759543741588],[14.984258522234274,55.20726480484522],[14.968816569956616,55.215859798115744],[14.939862909436009,55.212636675639295],[14.926351201193059,55.220157294751004],[14.889676564533623,55.22767791386271],[14.872304368221258,55.24164477792732],[14.835629731561822,55.25131414535666],[14.820187779284165,55.25990913862718],[14.810536559110629,55.276024751009416],[14.781582898590022,55.29214036339165],[14.779652654555314,55.29966098250336],[14.766140946312365,55.29966098250336],[14.750698994034707,55.291065989232834],[14.749733872017353,55.2679669448183],[14.733326797722343,55.24916539703903],[14.715954601409978,55.23519853297442],[14.703408015184381,55.212099488559886],[14.703408015184381,55.203504495289366],[14.699547527114968,55.1992069986541],[14.699547527114968,55.17557076716016],[14.703408015184381,55.17127327052489],[14.703408015184381,55.12829830417227],[14.699547527114968,55.12400080753701],[14.699547527114968,55.11325706594885],[14.6947219170282,55.10734800807536],[14.685070696854664,55.10412488559892],[14.685070696854664,55.09660426648721],[14.70823362527115,55.081563028263794],
[14.7217453335141,55.078339905787345],[14.746838505965293,55.06222429340511],[14.77000143438178,55.05362930013459],[14.79992021691974,55.05094336473755],[14.824048267353579,55.04288555854643],[14.880025344360087,55.03321619111709],[14.89739754067245,55.023546823687745],[14.959165349783081,55.00420808882907],[15.008386572668112,54.99507590847914],[15.043130965292843,54.9929271601615],[15.066294,54.986481]]],[[[15.182108999999999,55.323834],[15.19369,55.321686],[15.19369,55.319537],[15.185969,55.317387999999994],[15.182108999999999,55.319537],[15.182108999999999,55.323834]]],[[[15.182108999999999,55.323834],[15.174387999999999,55.323834],[15.174387999999999,55.325983],[15.182108999999999,55.325983],[15.182108999999999,55.323834]]]]",
>>>>>
>> "[[[[12.529952999999999,55.631105],[12.525127622017353,55.62519635262449],[12.507755425704989,55.61552698519515],[12.506790303687636,55.60102293405114],[12.52705786605206,55.57577514131897],[12.552151038503252,55.57255201884253],[12.565662746746202,55.55858515477792],[12.598476895336225,55.55643640646029],[12.631291043926247,55.57470076716016],[12.667965680585683,55.58222138627187],[12.679547144793926,55.58866763122476],[12.680512266811279,55.59457668909825],[12.673756412689805,55.602634495289365],[12.684372754880693,55.61606417227456],[12.684372754880693,55.62465916554508],[12.679547144793926,55.6327169717362],[12.66217494848156,55.63594009421265],[12.654453972342733,55.64238633916554],[12.64866324023861,55.65850195154778],[12.638046898047723,55.66763413189771],[12.649628362255964,55.676229125168234],[12.639012020065074,55.683212557200534],[12.637081776030367,55.69073317631224],[12.619709579718004,55.68858442799461],[12.610058359544468,55.678915060565274],[12.59461640726681,55.670320067294746],[12.577244210954445,55.669245693135935],[12.564698,55.661187999999996],[12.563732502711495,55.65527882907133],[12.552151038503252,55.64453508748317],[12.529952999999999,55.631105]]],[[[12.734558999999999,55.609618],[12.749035930043384,55.606931991924625],[12.777024468546637,55.59027919246299],[12.774129102494577,55.58866763122476],[12.75096617407809,55.595113876177656],[12.739384709869848,55.60156012113055],[12.734558999999999,55.609618]]],[[[12.792466,55.607468999999995],[12.762547638286334,55.60585761776581],[12.749035930043384,55.61767573351278],[12.739384709869848,55.618750107671595],[12.730698611713665,55.63755165545087],[12.743245197939261,55.6670969448183],[12.769303492407808,55.671931628532974],[12.779919834598697,55.66494819650067],[12.784745444685466,55.65689039030955],[12.784745444685466,55.62895666218034],[12.792466420824294,55.613915423956925],[12.792466,55.607468999999995]]]]",
>>>>>
>> "[[[[12.545395,55.684824],[12.564698,55.684824],[12.568558,55.689122],[12.552151038503252,55.70792316285329],[12.545395,55.708459999999995],[12.541535,55.706312],[12.541535,55.702014],[12.537673999999999,55.702014],[12.537673999999999,55.697717],[12.529952999999999,55.697717],[12.529952999999999,55.695567999999994],[12.537673999999999,55.695567999999994],[12.537673999999999,55.691269999999996],[12.541535,55.691269999999996],[12.541535,55.686972999999995],[12.545395,55.684824]]]]",
>>>>>
>> "[[[[12.510651,55.635403],[12.526093,55.635403],[12.529952999999999,55.631105],[12.552151038503252,55.64453508748317],[12.563732502711495,55.65527882907133],[12.564698,55.661187999999996],[12.556976648590021,55.66548538358008],[12.529952999999999,55.665485],[12.525127622017353,55.6531300807537],[12.511615913774403,55.65205570659488],[12.50293,55.641849],[12.506789999999999,55.6397],[12.506789999999999,55.635403],[12.510651,55.635403]]]]",
>>>>>
>> "[[[[12.50293,55.641849],[12.511615913774403,55.65205570659488],[12.525127622017353,55.6531300807537],[12.529952999999999,55.665485],[12.538639330260303,55.682138183041715],[12.545395,55.684824],[12.541535,55.686972999999995],[12.541535,55.691269999999996],[12.537673999999999,55.691269999999996],[12.537673999999999,55.695567999999994],[12.529952999999999,55.695567999999994],[12.503894937635573,55.68858442799461],[12.479766999999999,55.67408],[12.480732009219087,55.65742757738896],[12.490383229392624,55.65205570659488],[12.49231347342733,55.6466838358008],[12.50293,55.641849]]]]",
>>>>>
>> "[[[[12.510651,55.635403],[12.506789999999999,55.635403],[12.506789999999999,55.6397],[12.50293,55.641849],[12.49231347342733,55.6466838358008],[12.490383229392624,55.65205570659488],[12.480732009219087,55.65742757738896],[12.479766999999999,55.67408],[12.469150545010844,55.683212557200534],[12.464324934924077,55.69341911170928],[12.464324934924077,55.70201410497981],[12.453708592733188,55.70684878869448],[12.452743,55.714907],[12.43826664045553,55.71436940780619],[12.425720054229934,55.7084603499327],[12.391940783622559,55.70577441453566],[12.387115,55.699864999999996],[12.393871027657266,55.68965880215343],[12.372638343275487,55.68428693135935],[12.371673221258133,55.661187886944816],[12.36781273318872,55.65474164199192],[12.382289563449023,55.64346071332436],[12.406417999999999,55.613915],[12.432475908351408,55.61337823687752],[12.460464446854663,55.60102293405114],[12.494243717462037,55.60156012113055],[12.50292981561822,55.611766675639295],[12.499069327548806,55.62251041722745],[12.510651,55.635403]]]]"
>>>>> ), label = "%Key_Brick_Geometry", format.stata = "%9s")), row.names = c(NA,
>>>>> -6L), class = c("tbl_df", "tbl", "data.frame"))
>>>>>
>>>>> How can I map this file using R? I’ve tried using the sf package with
>>>>> st_multipolygon and st_multilinestring without success.
>>>>>
>>>>> Any help as to which package and appropriate commands to successfully
>>>>> map this data using R will be appreciated.
>>>>>
>>>>> Kind regards
>>>>>
>>>>>
>>>>> Graham
>>>>>
>>>>
>>>> --
>>>> Roger Bivand
>>>> Department of Economics, Norwegian School of Economics,
>>>> Helleveien 30, N-5045 Bergen, Norway.
>>>> voice: +47 55 95 93 55; e-mail: Roger.Bivand using nhh.no
>>>> https://orcid.org/0000-0003-2392-6140
>>>> https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en
>>>
>>
>

-- 
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; e-mail: Roger.Bivand using nhh.no
https://orcid.org/0000-0003-2392-6140
https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en