[R-sig-Geo] AGAIN ON GGWR

Roger Bivand Roger.Bivand at nhh.no
Tue Jan 22 15:10:21 CET 2008


On Tue, 22 Jan 2008, Luca Moiana wrote:

> Hello Everyone,

(Please use a better email client, one that only uses plain text, does not 
use HTML, and does not break lines in the wrong places or add empty 
lines).

>
> Following yesterday's suggestions I wrote this code:
>
> ##Creation of Spatial Points Data Frame
> x <- as.matrix(subsample$E)
> y <- as.matrix (subsample$N)
> S <- SpatialPoints (cbind(x,y))
> S <- SpatialPoints (list (x,y))
> S <- SpatialPoints (data.frame (x,y))
> data <- (subsample)

Do not assign to an object called data; there is already a function of 
that name.
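
A minimal sketch of one way to build the object without creating anything 
called data, assuming the sp package is installed and that the table has 
coordinate columns E and N as in the quoted code. The toy data frame dati 
and its values are made up for illustration, and Ssub is a hypothetical 
name:

```r
# Toy stand-in for the real table (hypothetical values); E and N are the
# easting/northing columns used in the quoted code.
set.seed(1)                      # makes the random sample repeatable
dati <- data.frame(E = runif(200, 1500000, 1600000),
                   N = runif(200, 5000000, 5100000),
                   E14400 = rpois(200, 2))

# Draw the 5% sample first, then build the spatial object from it.
o <- sample(seq_len(nrow(dati)), size = ceiling(0.05 * nrow(dati)))
subsample <- dati[o, ]

# Coordinates as a two-column matrix; no object named "data" is created.
xy <- cbind(subsample$E, subsample$N)

# Only build the spatial object if sp is available on this machine.
if (requireNamespace("sp", quietly = TRUE)) {
  Ssub <- sp::SpatialPointsDataFrame(coords = xy, data = subsample)
  print(summary(Ssub))
}
```

Sampling before constructing the spatial object keeps the coordinates and 
the attribute rows in step, which the quoted code does not guarantee.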

> Sdati14400 <- SpatialPointsDataFrame(S, data)
> ##Random sample for bandwidth (5%)
> subsample <- dati14400 [sample(1:nrow(dati14400), 488, replace=F),]
>
> ##Bandwidth value
>
> Sdati14400test.sel <- ggwr.sel(E14400 ~ V211 + V213 + V240 + V313 + V321 
> + V322 + V331511 + LnMPI25l_max + B:A, family = poisson(link = log), 
> data = Sdati14400, coords=Sdati14400.coords, adapt = FALSE, gweight = 
> gwr.gauss, verbose = TRUE, longlat = FALSE)

I don't follow: what is Sdati14400, and what is Sdati14400.coords? Please 
try without so many variables, and simplify until you understand what is 
happening. Also, longlat = FALSE here, but TRUE in the ggwr() call below?
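
As a sketch of what a simplified, consistent pair of calls might look 
like, assuming the spgwr and sp packages are installed: the data are 
simulated, the names toy, Ssub, bw, fit and use_longlat are hypothetical, 
and only one covariate is used. The coordinates are taken from the spatial 
object itself, so no separate coords argument is passed.

```r
# Sketch only: simulated data, one covariate, and a single longlat
# setting shared by the bandwidth search and the fit.
use_longlat <- FALSE   # projected coordinates; set once, use in both calls

if (requireNamespace("spgwr", quietly = TRUE) &&
    requireNamespace("sp", quietly = TRUE)) {
  library(spgwr)       # also attaches sp

  set.seed(2)
  toy <- data.frame(E = runif(150), N = runif(150), V211 = rnorm(150))
  toy$E14400 <- rpois(150, exp(0.5 + 0.3 * toy$V211))
  Ssub <- SpatialPointsDataFrame(cbind(toy$E, toy$N), toy)

  # Fixed bandwidth chosen by cross-validation on the one-covariate model.
  bw <- ggwr.sel(E14400 ~ V211, data = Ssub,
                 family = poisson(link = log), adapt = FALSE,
                 gweight = gwr.Gauss, verbose = FALSE,
                 longlat = use_longlat)

  # Same data, kernel and longlat setting as in the search above.
  fit <- ggwr(E14400 ~ V211, data = Ssub, bandwidth = bw,
              gweight = gwr.Gauss, family = poisson(link = log),
              longlat = use_longlat)
  print(fit)
}
```

Once this runs and the output makes sense, further covariates can be added 
one at a time.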

>
> ##GGwr
>
> Sdati14400.ggwr <- ggwr(E14400 ~ V211 + V213 + V313 + V321 + V322 + 
> V331511 + LnMPI25l_max + B:A, data = Sdati14400, 
> coords=Sdati14400@coords, bandwidth=Sdati14400test.sel, gweight = 
> gwr.gauss, adapt = 1, family = poisson(link = log), longlat = TRUE)
>
> From the bandwidth calculation I got this message: Warning in glm.fit(x 
> = X, y = Y, weights = weights, start = start, etastart = etastart, : 
> fitted rates numerically 0 occurred

Warnings during the CV search for a bandwidth are not a problem: the 
search algorithm will occasionally try unsuitable values, which are 
trapped, and the search then restarts from the last valid value.

>
> I skipped that and ran ggwr, which gave these results:
>
> Call:
>
> ggwr(formula = E14400 ~ V211 + V213 + V313 + V321 + V322 + V331511 + 
> LnMPI25l_max + B:A, data = Sdati14400, coords = Sdati14400@coords, 
> bandwidth = Sdati14400test.sel, gweight = gwr.gauss, adapt = 1,

With adapt = 1, this places a Gaussian kernel over each point but includes 
all points in every local fit. In addition, you did want to fit over all 
your points, didn't you? You can do this if you like, but why?
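
A base R sketch of the point about the kernel, under the assumption that 
an adaptive quantile of 1 sets each local bandwidth to about the distance 
of the farthest data point. The function gauss_w() is an illustrative 
stand-in, not the package's internal code, and the points are simulated:

```r
# Illustrative Gaussian kernel: the weight decays with squared distance
# relative to the squared bandwidth h (hypothetical helper, not spgwr code).
gauss_w <- function(d, h) exp(-(d / h)^2)

set.seed(3)
xy <- cbind(runif(488), runif(488))      # simulated point locations
# Euclidean distances from the first point to every point.
d <- sqrt((xy[, 1] - xy[1, 1])^2 + (xy[, 2] - xy[1, 2])^2)

h_all   <- max(d)                        # bandwidth reaching the farthest point
h_local <- unname(quantile(d, 0.10))     # bandwidth reaching the nearest 10%

w_all   <- gauss_w(d, h_all)
w_local <- gauss_w(d, h_local)

# Smallest weight under each bandwidth.
c(all = min(w_all), local = min(w_local))
```

With the wide bandwidth the farthest point still carries a weight of 
exp(-1), so each local fit uses nearly the same weighted data and the 
coefficients can be expected to stay close to the global ones.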

> family = poisson(link = log), longlat = TRUE)
> Kernel function: gwr.gauss
> Adaptive quantile: 1 (about 488 of 488)
> Summary of GWR coefficient estimates:
>                   Min.   1st Qu.    Median   3rd Qu.      Max.    Global
> X.Intercept.   -8.0040   -6.8270   -6.5200   -6.3300   -5.9980   -6.6016
> V211           -3.5370   -2.9440   -2.6340   -2.3250   -1.9590   -2.6024
> V213         -212.0000 -203.8000 -199.3000 -193.1000 -177.1000 -198.6228
> V313            0.1216    0.2915    0.3675    0.4515    0.6626    0.3766
> V321           -5.3780   -4.7580   -4.3820   -4.0840   -3.4480   -4.3489
> V322          -24.1100  -22.7300  -22.0400  -21.4800  -20.8800  -21.9145
> V331511      -110.8000  -92.7700  -70.7300  -56.5300  -49.0700  -68.8769
> LnMPI25l_max    0.3357    0.3532    0.3673    0.3850    0.4546    0.3709
> B.A             5.3070    5.8140    6.2040    6.4940    6.9850    6.1363
>
> Is that correct, or do you have other suggestions?

I think the onus is on you to answer this; what counts as correct depends 
on what you need. I doubt whether this tells you very much. Also, plot 
pairs() of the local coefficients to see whether you have induced local 
collinearity - see Wheeler & Tiefelsdorf (2005), referenced in the package 
help pages.
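
A sketch of that check in base R. The table local_coefs is simulated here 
with hypothetical values; with a real fit it would be taken from the 
fitted object instead (in spgwr, something like as.data.frame(fit$SDF), 
keeping only the coefficient columns):

```r
# Simulated stand-in for a table of local coefficients, one row per fit
# point. V321 is deliberately constructed to move with V211.
set.seed(4)
n  <- 488
b1 <- rnorm(n, -2.6, 0.4)
local_coefs <- data.frame(
  Intercept = rnorm(n, -6.6, 0.4),
  V211      = b1,
  V321      = -4.3 - 0.9 * (b1 + 2.6) + rnorm(n, 0, 0.1)
)

# Scatterplot matrix of the local coefficients.
pairs(local_coefs)

# Pairwise correlations; values near +1 or -1 flag coefficient surfaces
# that vary together, a symptom of possible local collinearity.
cc <- cor(local_coefs)
round(cc, 2)
```

Tight linear bands in the pairs() panels are the visual counterpart of 
large absolute correlations in cc.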

>
> Another question: I used variables that come from a colleague's GLM 
> analysis. Any suggestions on how to choose the variables and use ggwr 
> directly?
>

A formula is a formula; choose as you wish, but preferably with 
substantive reasoning behind the choice of each variable and its 
functional form.

Roger

>
> THANKS A LOT
>
> Luca Moiana
> PhD Candidate, Environmental Science Department
> University of Milan-Bicocca
>
>
>> Date: Mon, 21 Jan 2008 15:11:29 +0100
>> From: Roger.Bivand at nhh.no
>> To: luca_moiana at hotmail.com
>> CC: r-sig-geo at stat.math.ethz.ch
>> Subject: RE: [R-sig-Geo] ggwr and memory problems
>>
>> On Mon, 21 Jan 2008, Luca Moiana wrote:
>>
>>>
>>>
>>>
>>>> Date: Mon, 21 Jan 2008 14:38:18 +0100
>>>> From: Roger.Bivand at nhh.no
>>>> To: luca_moiana at hotmail.com
>>>> CC: r-sig-geo at stat.math.ethz.ch
>>>> Subject: Re: [R-sig-Geo] ggwr and memory problems
>>>>
>>>> On Mon, 21 Jan 2008, Luca Moiana wrote:
>>>>
>>>>> Dear List,
>>>>>
>>>>> Here is my problem:
>>>>>
>>>>> I want to run ggwr on a Spatial Points Data Frame of 9000 records,
>>>>> using R on a Windows machine (dual processor, 4 GB RAM).
>>>>
>>>> Have you tuned Windows memory use as discussed in the R for Windows FAQ -
>>>> section 2.9? The binaries are 32-bit, and need to be told how much memory
>>>> to use when trying to carry out memory intensive work.
>>>
>>> We tried this, but it didn't change anything.
>>
>> OK. It may run on Linux, because the memory allocation there accepts many
>> small free patches but Windows wants a single free chunk the size of the
>> request.
>>
>>>
>>>
>>>>
>>>>>
>>>>> When I try to calculate bandwidth using:
>>>>>
>>>>> Sdati14400test.sel
>>>>> <- ggwr.sel(E14400 ~ V211 + V213 + V240 + V313 + V321 + V322 + V331511 +
>>>>> LnMPI25l.max + B:A, family = poisson(link = log), data = Sdati14400test,
>>>>> coords=Sdati14400test.coords, adapt = FALSE, gweight = gwr.gauss, verbose =
>>>>> TRUE, longlat = FALSE)
>>>>>
>>>>> I get a memory allocation error saying that the software is not able to
>>>>> allocate 749 Mb of memory.
>>>>>
>>>>> Any suggestion??
>>>>
>>>> It isn't strictly necessary to use all the observations to find the
>>>> bandwidth - take a couple of 5% samples and see if the results differ
>>>> much.
>>>
>>> I didn't know that and I will try it, but won't I then have memory
>>> problems when I run ggwr on the full data?
>>> Is there a command to obtain a random 5% sample?
>>>
>>
>> Try subsetting the data= argument object: df[o,] with the output of o <-
>> sample(). Remember to say set.seed(whatever) to be able to repeat if need
>> be.
>>
>>>
>>>>
>>>>>
>>>>> I can also switch and use the same machine with a 64-bit Ubuntu OS.
>>>>>
>>>>
>>>> You can try that, but consider dividing the fit.points up into chunks, and
>>>> running several R processes when actually fitting the ggwr model. The data
>>>> points stay the same, but fit subsets of the fit.points in separate
>>>> processes.
>>>
>>> I don't have fit.points because I'm working on the entire Lombardy Region
>>> (Northern Italy) and I'd like to compare the model from ggwr with glm
>>> models a colleague obtained from a regular glm.
>>
>> If no fit.points are given, the data points are copied across as fit
>> points internally. You are free to subset the data.points into many
>> fit.points, and concatenate the output objects afterwards. This should
>> remove the difficulty.
>>
>> Roger
>>
>>>
>>> MANY THANKS
>>>
>>>
>>>>
>>>> ggwr() has not (yet) been adapted for using a cluster, but gwr() has and a
>>>> snow socket cluster will run happily on Linux there, and since it is run
>>>> within the function, it concatenates the results before returning. If this
>>>> would be useful for ggwr(), consider taking a look at the code.
>>>>
>>>> Roger
>>>>
>>>>>
>>>>> THANKS A LOT
>>>>>
>>>>>
>>>>>
>>>>> Luca Moiana
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> R-sig-Geo mailing list
>>>>> R-sig-Geo at stat.math.ethz.ch
>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>>>
>>>>
>>>> --
>>>> Roger Bivand
>>>> Economic Geography Section, Department of Economics, Norwegian School of
>>>> Economics and Business Administration, Helleveien 30, N-5045 Bergen,
>>>> Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
>>>> e-mail: Roger.Bivand at nhh.no
>>>>
>>

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no



