[R-sig-Geo] Missing local R-squared and residuals in gwr output
Roger Bivand
Roger.Bivand at nhh.no
Mon May 7 14:48:57 CEST 2012
On Mon, 7 May 2012, Maximilian Sproß wrote:
> Dear Roger!
>
> Thank you very much for your fast reply and work!
>
> I'm not really an expert in HPC, but I will try to report as well
> as I can.
>
> I updated spgwr and started a job on the cluster which normally takes 1.5 h.
> So far, it has been running for 5 hours, which indicates that the
> parallelization no longer works efficiently. The function
> makeCluster(64, type="MPI") worked fine. Our cluster runs Open MPI.
Correct. I'll try to add back an option to use snow instead of parallel.
When it reaches R-forge, its revision number will be > 1252.
Roger
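
For readers comparing the two back ends, here is a minimal sketch (not the spgwr code) of the call pattern that snow and parallel share. It assumes a local two-worker socket cluster instead of MPI so that it runs without Rmpi, and `fit_one` is a hypothetical stand-in for the work done at one fit point. The documentation of parallel::makeCluster says that cluster types other than "PSOCK" and "FORK" are passed on to snow, so an MPI cluster would presumably still need snow and Rmpi installed.

```r
library(parallel)

# Hypothetical stand-in for the work done at one fit point.
fit_one <- function(i) i^2

# Two local socket workers (an assumption made so the sketch runs anywhere);
# an MPI cluster would be requested with type = "MPI" and needs snow and Rmpi.
cl <- makeCluster(2, type = "PSOCK")

# Distribute the indices over the workers and collect the results in order.
res <- unlist(parLapply(cl, 1:8, fit_one))

# Always release the workers when done.
stopCluster(cl)

print(res)
```

Because the worker function and the `cl` object are the only things passed around, swapping the back end only changes the makeCluster call.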
>
> In that context, I found the following in the CRAN Task View
> "High-Performance and Parallel Computing with R":
> "Direct support in
> R is starting with release 2.14.0 which includes a new package parallel
> incorporating (slightly revised) copies of packages multicore and snow (*but
> excluding MPI, PVM and NWS clusters*)." Does the new parallel support
> still work in the Open MPI environment?
>
> regards,
>
> Max
>
> fyi:
>
> sessionInfo()
> R version 2.14.0 (2011-10-31)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
> [1] LC_CTYPE=en_US LC_NUMERIC=C LC_TIME=en_US
> [4] LC_COLLATE=en_US LC_MONETARY=en_US LC_MESSAGES=en_US
> [7] LC_PAPER=C LC_NAME=C LC_ADDRESS=C
> [10] LC_TELEPHONE=C LC_MEASUREMENT=en_US LC_IDENTIFICATION=C
>
> attached base packages:
> [1] parallel stats graphics grDevices utils datasets methods
> [8] base
>
> other attached packages:
> [1] spgwr_0.6-15 spdep_0.5-45 coda_0.14-6 deldir_0.0-16
> [5] maptools_0.8-10 foreign_0.8-46 nlme_3.1-102 MASS_7.3-16
> [9] Matrix_1.0-1 lattice_0.20-0 boot_1.3-3 gstat_1.0-10
> [13] spacetime_0.5-7 xts_0.8-2 zoo_1.7-6 sp_0.9-98
> [17] snow_0.3-8 Rmpi_0.5-9
>
> loaded via a namespace (and not attached):
> [1] grid_2.14.0
>
>
> On 05/05/2012 04:24 PM, Roger Bivand wrote:
>> On Fri, 4 May 2012, Maximilian Sproß wrote:
>>
>>> Dear r-sig-geo list!
>>>
>>> I am running gwr on a multi-node cluster (64 slots). In the gwr output
>>> (slot "SDF"), the gwr residuals and the local R-squared are missing. When
>>> I fit the same model on the local machine, these components are
>>> included. Unfortunately, the calculation then takes about 5 days
>>> instead of a few hours on the cluster.
>>>
>>> Perhaps the problem arises from the argument "fit.points", which has
>>> to be passed if the local coefficient estimates are to be computed on a
>>> multi-node cluster.
>>>
>>> Does anyone have an idea how to solve the problem of the missing
>>> local R-squared and residuals when gwr is run on a cluster?
>>
>> The assumption for use on a cluster was that the data points and the fit
>> points differ, so there is no observed dependent variable at the fit
>> points, and hence no local R2. I've added logic to the code that checks for
>> equality between the fit and data points. This resolves the problem for me,
>> but may break other things. I've committed it to R-forge, project
>> rspatial, module spgwr. The source tarball and binary packages should be
>> available later this evening European time from:
>>
>> https://r-forge.r-project.org/R/?group_id=1014
>>
>> Could you please try it out, and report back? I should also migrate spgwr
>> from snow to parallel before I release it.
>>
>> Best wishes,
>>
>> Roger
>>
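
A minimal sketch of what an equality check between fit points and data points could look like in base R. This is not the code committed to spgwr; the helper name `same_points`, its `tol` argument, and the toy coordinates are assumptions for illustration only.

```r
# Hypothetical helper: TRUE when the fit points coincide with the data points.
same_points <- function(data_coords, fit_coords, tol = 1e-8) {
  # Drop dimnames so that only the coordinate values are compared.
  data_coords <- unname(as.matrix(data_coords))
  fit_coords <- unname(as.matrix(fit_coords))
  # Matrices of different shape cannot hold the same points in the same order.
  if (!identical(dim(data_coords), dim(fit_coords))) return(FALSE)
  # Element-wise comparison with an absolute tolerance for floating-point noise.
  all(abs(data_coords - fit_coords) <= tol)
}

# Toy coordinates (made up for the example).
data_coords <- cbind(x = c(0, 1, 2), y = c(0, 0.5, 1))
shifted <- data_coords + 0.25

same_points(data_coords, data_coords)     # fit points are the data points
same_points(data_coords, shifted)         # a different set of fit points
same_points(data_coords, shifted[1:2, ])  # a different number of fit points
```

Only when such a check succeeds is there an observed response at every fit point, which is what residuals and a local R2 need.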
>>>
>>>
>>> Thank you very much in advance!
>>>
>>> Kind regards,
>>>
>>> Max
>>>
>>>
>>> selected R-code:
>>>
>>> ### gwr on local machine:
>>>
>>> gwr_50 <-
>>> gwr(hef@data$DIF~hef@data$ELEVATION+hef@data$SKY+hef@data$SLOPE+hef@data$SOLAR,
>>> data=hef, bandwidth=50, gweight=gwr.Gauss)
>>>
>>>
>>> # part of the str(gwr_50) output...
>>>
>>>
>>> List of 11
>>> $ SDF :Formal class 'SpatialPointsDataFrame' [package "sp"] with 5 slots
>>> .. ..@ data :'data.frame': 286288 obs. of 9 variables:
>>> .. .. ..$ sum.w : num [1:286288] 2009 2003 2091 2089 2086 ...
>>> .. .. ..$ (Intercept): num [1:286288] -28.7 -28.5 -29.9 -29.7 -29.5 ...
>>> .. .. ..$ elevation : num [1:286288] 0.0139 0.0138 0.014 0.014 0.014 ...
>>> .. .. ..$ sky : num [1:286288] -0.153 -0.155 -0.146 -0.148 -0.149 ...
>>> .. .. ..$ slope : num [1:286288] -2.58 -2.61 -2.42 -2.45 -2.48 ...
>>> .. .. ..$ solar : num [1:286288] -0.00139 -0.00136 -0.0015 -0.00147 -0.00144 ...
>>> .. .. ..$ gwr.e : num [1:286288] -0.461 -0.683 -0.5987 -0.2692 0.0406 ...
>>> .. .. ..$ pred : num [1:286288] 0.806 0.833 0.507 0.514 0.576 ...
>>> .. .. ..$ localR2 : num [1:286288] 0.621 0.618 0.638 0.635 0.632 ...
>>>
>>>
>>>
>>>
>>> ### gwr on cluster :
>>>
>>> cl <- makeCluster(32, type="MPI")
>>>
>>> coords <- coordinates(hef)
>>>
>>> gw <-
>>> gwr(hef@data$DIF~hef@data$ELEVATION+hef@data$SKY+hef@data$SLOPE+hef@data$SOLAR,
>>> data=hef, bandwidth=50, gweight=gwr.Gauss,fit.points=coords,
>>> hatmatrix=FALSE, cl=cl)
>>>
>>> # part of the str(gw) output...
>>>
>>> List of 11
>>> $ SDF :Formal class 'SpatialPointsDataFrame' [package "sp"] with 5 slots
>>> .. ..@ data :'data.frame': 286288 obs. of 6 variables:
>>> .. .. ..$ sum.w : num [1:286288] 1 1 1 1 1 ...
>>> .. .. ..$ (Intercept): num [1:286288] 12541 1970 2057 -1505 -1030 ...
>>> .. .. ..$ elevation : num [1:286288] -3.891 -0.602 -0.738 0.465 0.309 ...
>>> .. .. ..$ sky : num [1:286288] -0.954 -0.425 3.714 0.159 0.152 ...
>>> .. .. ..$ slope : num [1:286288] 62.19 NA -27.21 1.95 16.03 ...
>>> .. .. ..$ solar : num [1:286288] NA NA NA NA 0.042 ...
>>>
>>>
>>>
>>>
>>>
>>
>
>
>
--
Roger Bivand
Department of Economics, NHH Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no