[R-sig-Geo] BIG DATABASE

Tom Philippi tephilippi at gmail.com
Fri May 25 06:35:39 CEST 2018


What Roger said (as always).

Note that if you use tidyverse and magrittr pipes, dplyr and the other
tidyverse tools work well with databases via DBI.  sqldf also works with
multiple SQL database backends if you're an old dog like me and don't use
tidyverse much.
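
For anyone wanting a starting point, here is a minimal sketch of the
dplyr-via-DBI route, next to plain SQL through DBI.  It assumes the DBI,
RSQLite, dplyr and dbplyr packages are installed; the in-memory SQLite
database, the table name "observations" and its columns are made up for
illustration and stand in for whatever backend and schema you really have.

```r
library(DBI)
library(dplyr)

# Assumption: in-memory SQLite stands in for a real backend (MySQL, PostgreSQL, ...)
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical toy table of bird observations
obs <- data.frame(
  species = c("AMRO", "AMRO", "NOCA", "BLJA"),
  n_birds = c(3L, 5L, 2L, 1L)
)
dbWriteTable(con, "observations", obs)

# tbl() gives a lazy reference; dbplyr translates the pipeline to SQL,
# and rows only come into R when collect() is called
totals <- tbl(con, "observations") %>%
  group_by(species) %>%
  summarise(total = sum(n_birds, na.rm = TRUE)) %>%
  arrange(species) %>%
  collect()

# The same summary written as SQL by hand, sent through DBI directly
same_totals <- dbGetQuery(
  con,
  "SELECT species, SUM(n_birds) AS total
     FROM observations
    GROUP BY species
    ORDER BY species"
)

dbDisconnect(con)
```

Either way the grouping and summing happen in the database, so only the
small summary table has to fit in R's memory.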

Also, since this is r-sig-*GEO*, note that PostgreSQL has PostGIS for
spatial data, which does far more than the automatic tiling of large
rasters in package raster.  I'm seeing wonderful performance working with a
340M-observation, >100GB dataset of bird observations in R via PostGIS,
even with "only" 32GB RAM and constrained to running Windows 7, not
Linux/Unix.
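
As a rough sketch of what working "in R via PostGIS" can look like, the
helpers below pull only the rows inside a bounding box into an sf object,
leaving the spatial filtering to the database.  This is an illustration
under assumptions, not a description of my actual setup: the table
"bird_obs", its columns, the SRID and the connection details are all
hypothetical, and it assumes the DBI, sf and RPostgres packages.

```r
library(DBI)
library(sf)

# Build a bounding-box query with safely interpolated values.
# Table and column names (bird_obs, species, obs_date, geom) are hypothetical.
bbox_query <- function(con, xmin, ymin, xmax, ymax, srid = 4326L) {
  sqlInterpolate(
    con,
    "SELECT species, obs_date, geom
       FROM bird_obs
      WHERE geom && ST_MakeEnvelope(?xmin, ?ymin, ?xmax, ?ymax, ?srid)",
    xmin = xmin, ymin = ymin, xmax = xmax, ymax = ymax, srid = srid
  )
}

# Read only the matching rows as an sf object; PostGIS does the filtering.
read_obs_bbox <- function(con, ...) {
  st_read(con, query = as.character(bbox_query(con, ...)), quiet = TRUE)
}

# Usage (connection details are placeholders):
# con <- dbConnect(RPostgres::Postgres(), dbname = "birds", host = "localhost")
# pts <- read_obs_bbox(con, -125, 32, -114, 42)
# dbDisconnect(con)
```

With a spatial index on the geometry column, the && bounding-box operator
lets the database skip most of the table, which is where much of the
benefit over reading everything into R comes from.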

One alternative, if your database is running on massive hardware (tons of
memory, many cores, etc.), is to run R within the database itself.  This is
possible in both PostgreSQL and now MS SQL Server: the first is free, the
second is an additional-cost add-on, and both usually come at the cost of
painful negotiations with database administrators for permission to run
your ad hoc R code on their SQL server.  If you have the hardware, you can
even run R with Hadoop, although I've never done that with spatial data.

Tom


On Thu, May 24, 2018 at 5:04 AM, Roger Bivand <Roger.Bivand using nhh.no> wrote:

> On Thu, 24 May 2018, Yaya Bamba wrote:
>
> Thanks to all of you. I will try with the package  RMySQL and see.
>>
>
> Maybe look more generally through the packages depending on and importing
> from DBI (https://cran.r-project.org/package=DBI) to see what is
> available - there are many more than RMySQL.
>
> and use the Official Statistics and HPC Task Views:
>
> https://cran.r-project.org/view=OfficialStatistics
>
> https://cran.r-project.org/view=HighPerformanceComputing
>
> to see how typical workflows (not necessarily DB-based) can be handled.
> The HPC TV has a section on large memory and out-of-memory approaches. If
> your data are spatial in raster format, the raster package provides some
> out-of-memory functionality. In sf, spatial vector data may be read from
> databases too.
>
> Roger
>
>
>
>> 2018-05-24 11:33 GMT+00:00 Andres Diaz Loaiza <madiazl using gmail.com>:
>>
>> Hello Yaya,
>>>
>>> Many years ago I worked with a database in MySQL connected to R through
>>> the package RMySQL. The data was stored in MySQL and I was connecting to
>>> it and using the data from R.
>>>
>>> You should have a look at:
>>>
>>> https://cran.r-project.org/web/packages/RMySQL/index.html
>>>
>>> Cheers,
>>>
>>> Andres
>>>
>>>
>>
>>
>>
>>
> --
> Roger Bivand
> Department of Economics, Norwegian School of Economics,
> Helleveien 30, N-5045 Bergen, Norway.
> voice: +47 55 95 93 55; e-mail: Roger.Bivand using nhh.no
> http://orcid.org/0000-0003-2392-6140
> https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo using r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>
>
