[R-sig-Geo] Using duckdb spatial module from R (with sf)?
Carl Boettiger
cboett|g @end|ng |rom gm@||@com
Wed Aug 16 02:56:35 CEST 2023
thanks! of course the polygon can be a triangle, but I think the
difference here is that (a) I wanted st_within() if I'm giving my
points first and then my filter, and I think st_contains / st_within
have slightly different opinions about being on the exact boundary
than st_filter().
Anyway, fixing my filter does indeed work! Thanks Josiah for the help
here. This is very appealing to me thanks to the ability of duckdb to
open very large remote csv / parquet files without downloading them,
which often contain lat / long columns in my work.
I'll stop bugging the rest of you :-)
here's the working filter for comparison in case anyone is following along.
## a trivial search polygon to filter:
p2 <- rbind(c(0,0), c(0,3), c(3,3), c(3,0), c(0,0))
pol <-st_polygon(list(p2))
txt <- st_as_text(pol)
## Can we use this to filter duckdb?
sql <- tbl(conn, "test") |>
mutate(geometry = ST_Point(longitude, latitude),
geom = ST_AsHEXWKB(geometry)) |>
filter(ST_Within(geometry, ST_GeomFromText({txt}))) |>
dbplyr::sql_render()
sql
ex <- st_read(conn, query=sql, geometry_column = "geom", EWKB=FALSE)
ex
(note this still differs with sf::st_filter() regarding the 3,3 point,
which I believe is well documented I just overlooked it).
---
Carl Boettiger
http://carlboettiger.info/
On Tue, Aug 15, 2023 at 5:39 PM Josiah Parry <josiah.parry using gmail.com> wrote:
>
> Quick note on the contains: a polygon has 5 points! The first has to be the same as the last. And they shuold be going in Counter clock wise order if memory serves ! :)
>
> On Tue, Aug 15, 2023 at 8:34 PM Carl Boettiger <cboettig using gmail.com> wrote:
>>
>> Hi list,
>>
>> One more go at this and I'll stop for other ideas. Below is a
>> slightly modified example using dbplyr::sql_render() to construct the
>> query, which works and to me feels a little cleaner, though maybe is
>> ill-advised. Then I try to construct a spatial filter (for points in
>> a polygon) this way with ST_Contains, but my effort comes back empty.
>> Not sure where I went wrong. Any ideas?
>>
>>
>> ## boilerplate setup
>> library(duckdb)
>> conn <- DBI::dbConnect(duckdb::duckdb())
>> status <- DBI::dbExecute(conn, "INSTALL 'spatial';")
>> status <- DBI::dbExecute(conn, "LOAD 'spatial';")
>> test <- data.frame(site = letters[1:10], latitude = 1:10, longitude = 1:10)
>> DBI::dbWriteTable(conn, "test", test)
>>
>> ## Here we go, works!
>> sql <- tbl(con, "test") |>
>> mutate(geom = ST_AsHEXWKB(ST_Point(longitude, latitude))) |>
>> dbplyr::sql_render()
>> ex <- st_read(conn, query=sql, geometry_column = "geom", EWKB=FALSE)
>> ex
>>
>> ## a trivial search polygon to filter:
>> p2 <- rbind(c(1,1), c(1,2), c(2,2), c(1,1))
>> pol <-st_polygon(list(p2))
>> txt <- st_as_text(pol)
>>
>> ## Can we use this to filter duckdb?
>> sql <- tbl(conn, "test") |>
>> mutate(geometry = ST_Point(longitude, latitude),
>> geom = ST_AsHEXWKB(geometry)) |>
>> filter(ST_Contains(geometry, ST_GeomFromText({txt}))) |>
>> dbplyr::sql_render()
>> sql
>> ex <- st_read(conn, query=sql, geometry_column = "geom", EWKB=FALSE)
>> ex
>> ## oh no! our result comes back empty? did I need CRS on this? Or do
>> I missunderstand "ST_Contains()"?
>>
>> ## Here's what the desired result would be from pure sf:
>> sf <- st_as_sf(test, coords = c(3,2))
>> filter <- st_as_sf(tibble(sites = "A", geometry = list(pol)))
>> sf |> st_filter(filter)
>>
>>
>>
>>
>> ---
>> Carl Boettiger
>> http://carlboettiger.info/
>>
>> On Tue, Aug 15, 2023 at 4:42 PM Carl Boettiger <cboettig using gmail.com> wrote:
>> >
>> > Ah ha. if we ask duckdb to coerce it's geometry format to hex, it
>> > appears sf can understand it just fine!
>> >
>> > replacing the above query with the following we are good to go.
>> >
>> >
>> >
>> > query <- paste(
>> > 'SELECT *, ST_AsHEXWKB(ST_Point(longitude, latitude)) AS geom',
>> > 'FROM "test"'
>> > )
>> > ex <- st_read(con, query=query, geometry_column = "geom", EWKB=FALSE)
>> > ex
>> >
>> >
>> > I'll experiment further. I'm terrible at SQL, and to my eyes it looks
>> > needlessly verbose. I'm keen to understand how I can better leverage
>> > sf's notation to compose the sql queries to duckdb.... but it seems
>> > to work! I'm also still trying to determine if duckdb is using EWKB
>> > or vanilla WKB....
>> >
>> > ---
>> > Carl Boettiger
>> > http://carlboettiger.info/
>> >
>> > On Tue, Aug 15, 2023 at 4:02 PM Carl Boettiger <cboettig using gmail.com> wrote:
>> > >
>> > > Hi folks,
>> > >
>> > >
>> > > I'm curious if anyone has explored the relatively new spatial
>> > > extension in duckdb (https://duckdb.org/docs/extensions/spatial.html)
>> > > or has any pointers/tutorials regarding its use from R?
>> > >
>> > > Consider the following minimal example that seeks to use the sf
>> > > library to speak to duckdb:
>> > >
>> > > library(duckdb)
>> > > library(sf)
>> > > conn <- DBI::dbConnect(duckdb::duckdb())
>> > > status <- DBI::dbExecute(conn, "INSTALL 'spatial';")
>> > > status <- DBI::dbExecute(conn, "LOAD 'spatial';")
>> > >
>> > > test <- data.frame(site = letters[1:10], latitude = 1:10, longitude = 1:10)
>> > > DBI::dbWriteTable(conn, "test", test)
>> > >
>> > > # Ok, let's try and make a geometry column
>> > > query <- paste(
>> > > 'SELECT *, ST_Point(longitude, latitude) AS geom',
>> > > 'FROM "test"'
>> > > )
>> > >
>> > > ex <- st_read(con, query=query, geometry_column = "geom")
>> > > ## error: reading wkb type 0 is not supported
>> > >
>> > >
>> > > ex <- st_read(con, query=query, geometry_column = "geom", EWKB = FALSE)
>> > > ## prints: wkbType: 1572864
>> > > ## Error in CPL_read_wkb(x, EWKB, spatialite) : unsupported wkbType
>> > > dim in switch
>> > >
>> > > We seem to get closer than I might have hoped (sf doesn't just insist
>> > > that conn isn't postgresgis), but is getting stuck on the encoding of
>> > > the wkb. Is this something we can work around? The queries seem to
>> > > run successfully, I just seem to be getting some unsupported ecoding
>> > > of the WKB geometry column....
>> > >
>> > > ---
>> > > Carl Boettiger
>> > > http://carlboettiger.info/
>>
>> _______________________________________________
>> R-sig-Geo mailing list
>> R-sig-Geo using r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
More information about the R-sig-Geo
mailing list