[R-sig-Geo] Odd behavior of dismo's extract function
Dan Warren
dan.l.warren at gmail.com
Mon Jul 25 04:29:32 CEST 2016
Updating to R 3.3.1 fixed it. Thanks! Still baffled as to why the sudden
dropoff between 250 and 251, but as long as it's working all is well.
Cheers!
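
For anyone who wants to reproduce the timing sweep from the original post quoted below (the numpoints,time table), here is a minimal sketch in base R. The names time_sweep, f and dummy are hypothetical and used only for illustration; the stand-in workload lets the sketch run without raster or dismo, and the commented-out call at the end is an assumption about how it would be wired to the thread's data, not a tested recipe.

```r
# Time a function over a range of input sizes and return a data frame
# shaped like the "numpoints,time" table in the original post.
# `time_sweep` and `f` are hypothetical names used only for this sketch.
time_sweep <- function(f, sizes, reps = 1) {
  elapsed <- vapply(sizes, function(n) {
    # Keep the fastest of `reps` runs to damp one-off noise.
    min(vapply(seq_len(reps), function(i) {
      system.time(f(n))[["elapsed"]]
    }, numeric(1)))
  }, numeric(1))
  data.frame(numpoints = sizes, time = elapsed)
}

# Stand-in workload so the sketch runs without raster or dismo installed.
dummy <- function(n) sum(sqrt(seq_len(n)))
res <- time_sweep(dummy, c(1, 250, 251, 1000))
print(res)

# With the thread's data it might be called roughly like this (not run here):
# library(raster); library(dismo)
# env <- stack(list.files("testdata/", pattern = "pc", full.names = TRUE))
# time_sweep(function(n) extract(env, randomPoints(env, n)),
#            c(1, 5, 10, 50, 100, 150, 200, 250, 251, 252, 254, 500, 1000))
```

Sampling sizes on both sides of 250 is what makes the discontinuity visible, and raising reps helps separate it from ordinary run-to-run noise.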
On Mon, Jul 25, 2016 at 12:24 PM, Dan Warren <dan.l.warren at gmail.com> wrote:
> How very odd. I'm using R 3.3.0, but as far as I can tell I'm using the
> same package versions as you. I've tried this on two machines (12 core Mac
> Pro and an older Macbook Pro) and I'm getting the same phenomenon on both.
> Could it be a weird OSX thing? I'll try updating R and then if it still
> persists I'll bootcamp over into Windows and see if it's happening for me
> there.
>
>
> My session info (sorry for not including that the first time):
>
> Session info ---------------------------------------------------------------
>  setting  value
>  version  R version 3.3.0 (2016-05-03)
>  system   x86_64, darwin13.4.0
>  ui       RStudio (0.99.491)
>  language (EN)
>  collate  en_AU.UTF-8
>  tz       Australia/Sydney
>  date     2016-07-25
>
> Packages -------------------------------------------------------------------
>  package    * version date       source
>  colorspace   1.2-6   2015-03-11 CRAN (R 3.3.0)
>  devtools     1.12.0  2016-06-24 CRAN (R 3.3.0)
>  digest       0.6.9   2016-01-08 CRAN (R 3.3.0)
>  dismo      * 1.1-1   2016-06-16 CRAN (R 3.3.0)
>  ENMTools   * 0.1     2016-07-25 local
>  ggplot2    * 2.1.0   2016-03-01 CRAN (R 3.3.0)
>  gridExtra  * 2.2.1   2016-02-29 CRAN (R 3.3.0)
>  gtable       0.2.0   2016-02-26 CRAN (R 3.3.0)
>  highr        0.6     2016-05-09 CRAN (R 3.3.0)
>  knitr      * 1.13    2016-05-09 CRAN (R 3.3.0)
>  lattice      0.20-33 2015-07-14 CRAN (R 3.3.0)
>  memoise      1.0.0   2016-01-29 CRAN (R 3.3.0)
>  munsell      0.4.3   2016-02-13 CRAN (R 3.3.0)
>  plyr       * 1.8.4   2016-06-08 CRAN (R 3.3.0)
>  raster     * 2.5-8   2016-06-02 CRAN (R 3.3.0)
>  Rcpp         0.12.5  2016-05-14 CRAN (R 3.3.0)
>  rgeos      * 0.3-19  2016-04-04 CRAN (R 3.3.0)
>  rJava        0.9-8   2016-01-07 CRAN (R 3.3.0)
>  scales       0.4.0   2016-02-26 CRAN (R 3.3.0)
>  sp         * 1.2-3   2016-04-14 CRAN (R 3.3.0)
>  viridis    * 0.3.4   2016-03-12 CRAN (R 3.3.0)
>  withr        1.0.2   2016-06-20 CRAN (R 3.3.0)
>
>
> On Mon, Jul 25, 2016 at 12:15 PM, Michael Sumner <mdsumner at gmail.com>
> wrote:
>
>>
>>
>> On Mon, 25 Jul 2016 at 11:35 Dan Warren <dan.l.warren at gmail.com> wrote:
>>
>>> Just realized I pasted in the results backwards. It should have been
>>>
>>> system.time(extract.test(env, 250))
>>>
>>> user system elapsed
>>> 124.562 0.516 125.061
>>>
>>> system.time(extract.test(env, 251))
>>>
>>> user system elapsed
>>> 2.807 0.084 2.891
>>>
>>>
>>>
>>>
>> I don't see the effect.
>>
>> Perhaps it was fixed in a recent version of raster?
>>
>> Please post reproducible details. I downloaded your data files to
>> "test/testdata/" to try this.
>>
>> Cheers, Mike.
>>
>>
>> library(raster)
>> library(dismo)
>> extract.test <- function(env, N){
>>   extract(env, dismo::randomPoints(env, N))
>> }
>>
>> env.files <- list.files(path = "test/testdata/", pattern = "pc",
>>                         full.names = TRUE)
>> env <- raster::stack(env.files)
>>
>> library(rbenchmark)
>> benchmark(n250 = extract.test(env, 250),
>>           n251 = extract.test(env, 251), replications = 4)
>> #   test replications elapsed relative user.self sys.self user.child sys.child
>> # 1 n250            4    6.31    1.008      5.13     1.14         NA        NA
>> # 2 n251            4    6.26    1.000      5.02     1.22         NA        NA
>> devtools::session_info()
>> # Session info --------------------------------------------------------------
>> #  setting  value
>> #  version  R version 3.3.1 Patched (2016-07-09 r70874)
>> #  system   x86_64, mingw32
>> #  ui       RStudio (0.99.1261)
>> #  language (EN)
>> #  collate  English_Australia.1252
>> #  tz       Australia/Hobart
>> #  date     2016-07-25
>> #
>> # Packages ------------------------------------------------------------------
>> #  package    * version date       source
>> #  devtools   * 1.12.0  2016-06-24 CRAN (R 3.3.1)
>> #  digest       0.6.9   2016-01-08 CRAN (R 3.3.1)
>> #  dismo      * 1.1-1   2016-06-16 CRAN (R 3.3.1)
>> #  evaluate     0.9     2016-04-29 CRAN (R 3.3.1)
>> #  htmltools    0.3.5   2016-03-21 CRAN (R 3.3.1)
>> #  knitr        1.13    2016-05-09 CRAN (R 3.3.1)
>> #  lattice      0.20-33 2015-07-14 CRAN (R 3.3.1)
>> #  magrittr     1.5     2014-11-22 CRAN (R 3.3.1)
>> #  memoise      1.0.0   2016-01-29 CRAN (R 3.3.1)
>> #  raster     * 2.5-8   2016-06-02 CRAN (R 3.3.1)
>> #  rbenchmark * 1.0.0   2012-08-30 CRAN (R 3.3.0)
>> #  Rcpp         0.12.5  2016-05-14 CRAN (R 3.3.1)
>> #  rgdal        1.1-10  2016-05-12 CRAN (R 3.3.1)
>> #  rmarkdown    1.0.2   2016-07-19 Github (rstudio/rmarkdown@b65e177)
>> #  sp         * 1.2-3   2016-04-14 CRAN (R 3.3.1)
>> #  stringi      1.1.1   2016-05-27 CRAN (R 3.3.0)
>> #  stringr      1.0.0   2015-04-30 CRAN (R 3.3.1)
>> #  withr        1.0.2   2016-06-20 CRAN (R 3.3.1)
>>
>>
>>
>>
>>
>>> Dan Warren, Ph.D.
>>> Department of Biology
>>> Macquarie University
>>> Email: dan.warren at mq.edu.au <dan.warren at anu.edu.au>
>>> Phone (US): 530-848-3809
>>> Phone (Australia): 0468 696 897
>>> Phone (Work): 02 9850 8587
>>> Skype: dan.l.warren
>>> Google Scholar
>>> <https://scholar.google.com/citations?user=NTzu9c8AAAAJ&hl=en> Orcid
>>> <http://orcid.org/0000-0002-8747-2451> ResearcherID
>>> <http://www.researcherid.com/rid/B-3821-2010> Scopus
>>> <http://www.scopus.com/authid/detail.url?authorId=7202133982>
>>>
>>>
>>> On Mon, Jul 25, 2016 at 10:34 AM, Dan Warren <dan.l.warren at gmail.com>
>>> wrote:
>>>
>>> > This is not an error per se so much as just something very weird that I
>>> > have noticed with a project I've been working on recently. I'm wondering
>>> > if anyone here has any insight as to what may be causing this behavior. I
>>> > haven't yet been able to duplicate it with simulated rasters (more info on
>>> > that below), but it appears very reliably with real environmental data
>>> > including the PC rasters for Cuba I have hosted here:
>>> >
>>> > https://github.com/danlwarren/ENMTools/tree/master/test/testdata
>>> >
>>> > What's happening is this: if I go to extract data from those rasters using
>>> > occurrence points, the amount of time it takes increases very rapidly up to
>>> > exactly 250 points, and falls dramatically after that. So dramatically
>>> > that it takes over two minutes to extract data for 250 points but just
>>> > under three seconds for 251. I've established that it's not a question of
>>> > the points themselves being wonky, because it happens with random points as
>>> > well.
>>> >
>>> >
>>> > extract.test <- function(env, N){
>>> >   extract(env, randomPoints(env, N))
>>> > }
>>> >
>>> > env.files <- list.files(path = "testdata/", pattern = "pc", full.names = TRUE)
>>> > env <- stack(env.files)
>>> >
>>> > system.time(extract.test(env, 250))
>>> >
>>> > user system elapsed
>>> > 2.807 0.084 2.891
>>> >
>>> > system.time(extract.test(env, 251))
>>> >
>>> > user system elapsed
>>> > 124.562 0.516 125.061
>>> >
>>> > numpoints,time
>>> > 1,1.54
>>> > 5,3.93
>>> > 10,6.764
>>> > 50,29.939
>>> > 100,61.431
>>> > 150,79.295
>>> > 200,110.283
>>> > 250,120.118
>>> > 251,2.748
>>> > 252,2.756
>>> > 254,2.767
>>> > 500,2.876
>>> > 1000,3.153
>>> >
>>> > The data being extracted looks perfectly reasonable in all cases. It's
>>> > not just these layers, either. Although (as I mentioned above) I have yet
>>> > to come up with simulated rasters that show this behavior, I see this
>>> > behavior for both of the sets of rasters for real environmental data that
>>> > I've tried. The results above are from a PCA on Worldclim data for Cuba,
>>> > but I just tried them on some Climond data I've got for Australia and I get
>>> > the same behavior. Those rasters are much larger, though, and as a result the
>>> > times are longer; 251 points took about 43 seconds, whereas I just had to
>>> > give up and stop the 250 point extraction after about 30 minutes.
>>> >
>>> > As for those simulated rasters, I've tried the following:
>>> >
>>> > Plain grids of sequential numbers
>>> > As above, but with a bunch of NAs added
>>> > Filling the Cuban rasters with sequential numbers
>>> > Filling the Cuban rasters with random numbers from a uniform (0,1)
>>> > distribution
>>> >
>>> > None of those show this issue. Anyone have any thoughts about what might
>>> > be going on here?
>>> >
>>> >
>>>
>>> [[alternative HTML version deleted]]
>>>
>>> _______________________________________________
>>> R-sig-Geo mailing list
>>> R-sig-Geo at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>
>> --
>> Dr. Michael Sumner
>> Software and Database Engineer
>> Australian Antarctic Division
>> 203 Channel Highway
>> Kingston Tasmania 7050 Australia
>>
>>
>