[R-sig-Geo] Missing local R-squared and residuals in gwr output

Maximilian Sproß Maximilian.Spross at uibk.ac.at
Wed May 9 12:27:08 CEST 2012


Thank you Roger! The gwr on the MPI cluster works fine.

However, the output object now includes the three initially missing data
slots "gwr.e", "pred" and "localR2". Unfortunately, the latter contains
only NAs.
Sorry for any inconvenience, but do you think you can solve that?
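
For reference, a minimal sketch of how the NA pattern in the new columns could be summarised. It uses a small mock data frame in place of gwr_50$SDF@data; the mock values and the helper name count_na are illustrative assumptions, not part of spgwr.

```r
# Hypothetical helper (not part of spgwr): count NA values per column.
count_na <- function(df) {
  vapply(df, function(col) sum(is.na(col)), integer(1))
}

# Mock stand-in for gwr_50$SDF@data; the numbers are made up for illustration.
mock_sdf_data <- data.frame(
  gwr.e   = c(-0.46, -0.68, -0.60),
  pred    = c(0.81, 0.83, 0.51),
  localR2 = c(NA_real_, NA_real_, NA_real_)
)

na_counts <- count_na(mock_sdf_data)
print(na_counts)

# TRUE for columns that contain nothing but NA
all_na <- na_counts == nrow(mock_sdf_data)
print(all_na)
```

On the real object the call would presumably be count_na(gwr_50$SDF@data).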

Thanks in advance and all the best,

Max


On 05/07/2012 08:45 PM, Roger Bivand wrote:
> On Mon, 7 May 2012, "Sproß, Johann" wrote:
>
>>
>>
>>
>> -- 
>> Mag. J. Maximilian Sproß
>> Institute of Geography, University of Innsbruck
>> Innrain 52
>> A-6020 INNSBRUCK
>>
>> Tel. +43 (0)512 507 5413
>> web: http://www.uibk.ac.at/geographie/projects/lidar/
>>
>>
>>
>> -----Original Message-----
>> From: Roger Bivand [mailto:Roger.Bivand at nhh.no]
>> Sent: Mon 07.05.2012 14:48
>> To: Maximilian Sproß
>> Cc: r-sig-geo
>> Subject: Re: [R-sig-Geo] Missing local R-squared and residuals in gwr output
>>
>> On Mon, 7 May 2012, Maximilian Sproß wrote:
>>
>>> Dear Roger!
>>>
>>> Thank you very much for your fast reply and work!
>>>
>>> I'm not really an expert in HPC computing, but I will try to report as
>>> well as I can.
>>>
>>> I updated spgwr and started a job on the cluster which normally takes
>>> 1.5 h. So far, it has run for 5 hours, which indicates that the
>>> parallelization no longer works efficiently. The function
>>> makeCluster(64, type="MPI") worked fine. Our cluster runs Open MPI.
>>
>> Correct. I'll try to add back an option to use snow instead of parallel.
>>
>> I tried out the new version, but it still seems to be using parallel.
>>
>> code:
>>
>> gwr_50 <-
>> gwr(hef@data$DIF~hef@data$ELEVATION+hef@data$SKY+hef@data$SLOPE+hef@data$SOLAR+factor(asp_fac),
>> data=hef, bandwidth=50, gweight=gwr.Gauss, fit.points=coords,
>> hatmatrix=FALSE, cl=cl)
>
> Add use_snow=TRUE to the command to switch to snow.
>
> Roger
>
>> Loading required package: parallel
>>
>> Attaching package: 'parallel'
>>
>> The following object(s) are masked from 'package:snow':
>>
>>    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
>>    clusterExport, clusterMap, clusterSplit, makeCluster, parApply,
>>    parCapply, parLapply, parRapply, parSapply, splitIndices,
>>    stopCluster
>>
>> Max
>>
>>
>> When it reaches R-forge, its revision number will be > 1252.
>>
>> Roger
>>
>>>
>>> In that context, I found the following in the CRAN Task View
>>> "High-Performance and Parallel Computing with R":
>>> "Direct support in R is starting with release 2.14.0 which includes a
>>> new package parallel incorporating (slightly revised) copies of
>>> packages multicore and snow (*but excluding MPI, PVM and NWS
>>> clusters*)." Does the new parallel support still work in the Open MPI
>>> environment?
>>>
>>> regards,
>>>
>>> Max
>>>
>>> fyi:
>>>
>>> sessionInfo()
>>> R version 2.14.0 (2011-10-31)
>>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>>
>>> locale:
>>> [1] LC_CTYPE=en_US       LC_NUMERIC=C         LC_TIME=en_US
>>> [4] LC_COLLATE=en_US     LC_MONETARY=en_US    LC_MESSAGES=en_US
>>> [7] LC_PAPER=C           LC_NAME=C            LC_ADDRESS=C
>>> [10] LC_TELEPHONE=C       LC_MEASUREMENT=en_US LC_IDENTIFICATION=C
>>>
>>> attached base packages:
>>> [1] parallel  stats     graphics  grDevices utils     datasets  methods
>>> [8] base
>>>
>>> other attached packages:
>>> [1] spgwr_0.6-15    spdep_0.5-45    coda_0.14-6     deldir_0.0-16
>>> [5] maptools_0.8-10 foreign_0.8-46  nlme_3.1-102    MASS_7.3-16
>>> [9] Matrix_1.0-1    lattice_0.20-0  boot_1.3-3      gstat_1.0-10
>>> [13] spacetime_0.5-7 xts_0.8-2       zoo_1.7-6       sp_0.9-98
>>> [17] snow_0.3-8      Rmpi_0.5-9
>>>
>>> loaded via a namespace (and not attached):
>>> [1] grid_2.14.0
>>>
>>>
>>> On 05/05/2012 04:24 PM, Roger Bivand wrote:
>>>> On Fri, 4 May 2012, Maximilian Sproß wrote:
>>>>
>>>>> Dear r-sig-geo list!
>>>>>
>>>>> I run gwr on a multi-node cluster (on 64 slots). In the gwr output
>>>>> (slot "SDF"), the gwr residuals and the local R-squared are missing.
>>>>> When fitting the same model on the local machine, these components
>>>>> are included. Unfortunately, the calculation on the local machine
>>>>> takes about 5 days instead of a few hours on the cluster.
>>>>>
>>>>> Perhaps the problem arises from the argument "fit.points", which has
>>>>> to be passed if the local coefficient estimates are to be computed on
>>>>> a multi-node cluster.
>>>>>
>>>>> Does anyone have an idea how to solve the problem of the missing
>>>>> local R-squared and residuals when gwr is run on a cluster?
>>>>
>>>> The understanding for use on a cluster was that the data points and
>>>> the fit points are different, so there is no observed dependent
>>>> variable at the fit point, hence no local R2. I've added logic in the
>>>> code that checks for equality between the fit and data points, and
>>>> this for me resolves the problem, but may break other things. I've
>>>> committed to R-forge, project rspatial, module spgwr. The source
>>>> tarball and binary packages should be available later this evening
>>>> European time from:
>>>>
>>>> https://r-forge.r-project.org/R/?group_id=1014
>>>>
>>>> Could you please try it out, and report back? I should also migrate
>>>> spgwr from snow to parallel before I release it.
>>>>
>>>> Best wishes,
>>>>
>>>> Roger
>>>>
>>>>>
>>>>>
>>>>> Thank you very much in advance!
>>>>>
>>>>> Kind regards,
>>>>>
>>>>> Max
>>>>>
>>>>>
>>>>> selected R-code:
>>>>>
>>>>> ### gwr on local machine:
>>>>>
>>>>> gwr_50 <-
>>>>> gwr(hef@data$DIF~hef@data$ELEVATION+hef@data$SKY+hef@data$SLOPE+hef@data$SOLAR,
>>>>> data=hef, bandwidth=50, gweight=gwr.Gauss)
>>>>>
>>>>>
>>>>> # part of the  str(gwr_50) output...
>>>>>
>>>>>
>>>>> List of 11
>>>>>  $ SDF      :Formal class 'SpatialPointsDataFrame' [package "sp"] with 5 slots
>>>>>   .. ..@ data       :'data.frame':    286288 obs. of  9 variables:
>>>>>   .. .. ..$ sum.w      : num [1:286288] 2009 2003 2091 2089 2086 ...
>>>>>   .. .. ..$ (Intercept): num [1:286288] -28.7 -28.5 -29.9 -29.7 -29.5 ...
>>>>>   .. .. ..$ elevation  : num [1:286288] 0.0139 0.0138 0.014 0.014 0.014 ...
>>>>>   .. .. ..$ sky        : num [1:286288] -0.153 -0.155 -0.146 -0.148 -0.149 ...
>>>>>   .. .. ..$ slope      : num [1:286288] -2.58 -2.61 -2.42 -2.45 -2.48 ...
>>>>>   .. .. ..$ solar      : num [1:286288] -0.00139 -0.00136 -0.0015 -0.00147 -0.00144 ...
>>>>>   .. .. ..$ gwr.e      : num [1:286288] -0.461 -0.683 -0.5987 -0.2692 0.0406 ...
>>>>>   .. .. ..$ pred       : num [1:286288] 0.806 0.833 0.507 0.514 0.576 ...
>>>>>   .. .. ..$ localR2    : num [1:286288] 0.621 0.618 0.638 0.635 0.632 ...
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> ### gwr on cluster :
>>>>>
>>>>> cl <- makeCluster(32, type="MPI")
>>>>>
>>>>> coords <- coordinates(hef)
>>>>>
>>>>> gw <-
>>>>> gwr(hef@data$DIF~hef@data$ELEVATION+hef@data$SKY+hef@data$SLOPE+hef@data$SOLAR,
>>>>> data=hef, bandwidth=50, gweight=gwr.Gauss, fit.points=coords,
>>>>> hatmatrix=FALSE, cl=cl)
>>>>>
>>>>> # part of the str(gw) output...
>>>>>
>>>>> List of 11
>>>>>  $ SDF      :Formal class 'SpatialPointsDataFrame' [package "sp"] with 5 slots
>>>>>   .. ..@ data       :'data.frame':    286288 obs. of  6 variables:
>>>>>   .. .. ..$ sum.w      : num [1:286288] 1 1 1 1 1 ...
>>>>>   .. .. ..$ (Intercept): num [1:286288] 12541 1970 2057 -1505 -1030 ...
>>>>>   .. .. ..$ elevation  : num [1:286288] -3.891 -0.602 -0.738 0.465 0.309 ...
>>>>>   .. .. ..$ sky        : num [1:286288] -0.954 -0.425 3.714 0.159 0.152 ...
>>>>>   .. .. ..$ slope      : num [1:286288] 62.19 NA -27.21 1.95 16.03 ...
>>>>>   .. .. ..$ solar      : num [1:286288] NA NA NA NA 0.042 ...
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> R-sig-Geo mailing list
>>>>> R-sig-Geo at r-project.org
>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>>>
>>>>
>>>
>>>
>>>
>>
>> -- 
>> Roger Bivand
>> Department of Economics, NHH Norwegian School of Economics,
>> Helleveien 30, N-5045 Bergen, Norway.
>> voice: +47 55 95 93 55; fax +47 55 95 95 43
>> e-mail: Roger.Bivand at nhh.no
>>
>>
>
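
Roger's explanation in the thread (no observed dependent variable at the fit points means no local R2) can be illustrated with a small base-R sketch. This is not the spgwr implementation: the function names gauss_weights, same_points and local_r2, the kernel form, and the toy data are all assumptions made for illustration, and a global lm() fit stands in for the GWR predictions.

```r
# Gaussian kernel on squared distances (assumed form; check gwr.Gauss in
# spgwr for the definition the package actually uses).
gauss_weights <- function(dist2, bandwidth) {
  exp(-0.5 * dist2 / bandwidth^2)
}

# TRUE when the fit points coincide with the data points, row by row.
same_points <- function(fit_coords, data_coords) {
  identical(dim(fit_coords), dim(data_coords)) &&
    isTRUE(all.equal(fit_coords, data_coords, check.attributes = FALSE))
}

# Locally weighted R-squared at each fit point. Returns NA everywhere when
# the fit points differ from the data points, because no observed response
# is then available at the fit points.
local_r2 <- function(y, yhat, fit_coords, data_coords, bandwidth) {
  if (!same_points(fit_coords, data_coords)) {
    return(rep(NA_real_, nrow(fit_coords)))
  }
  vapply(seq_len(nrow(fit_coords)), function(i) {
    # squared distances from fit point i to every data point
    d2 <- rowSums(sweep(data_coords, 2, fit_coords[i, ])^2)
    w <- gauss_weights(d2, bandwidth)
    ybar <- sum(w * y) / sum(w)      # locally weighted mean of the response
    tss <- sum(w * (y - ybar)^2)     # weighted total sum of squares
    rss <- sum(w * (y - yhat)^2)     # weighted residual sum of squares
    1 - rss / tss
  }, numeric(1))
}

# Toy data (made up): 20 random points with a roughly linear response.
set.seed(1)
xy <- cbind(x = runif(20), y = runif(20))
y_obs <- 2 * xy[, 1] + rnorm(20, sd = 0.1)
y_fit <- fitted(lm(y_obs ~ xy[, 1]))   # stand-in for GWR predictions

r2_same <- local_r2(y_obs, y_fit, xy, xy, bandwidth = 0.5)        # numeric
r2_diff <- local_r2(y_obs, y_fit, xy[1:5, ], xy, bandwidth = 0.5) # all NA
```

In this sketch the equality check decides whether a local R2 can be computed at all, which mirrors the reasoning given in the thread rather than the package's actual code path.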


