[R-sig-Geo] Moran's I with Objects of Different Length

Chanda Chiseni cch|@en| @end|ng |rom gm@||@com
Wed Aug 12 14:30:46 CEST 2020


Hi Roger

Thank you for your response.

I am using DHS individual level data , they cluster individuals into about
25-30 people and assign them coordinates depending on where they were
surveyed. So about 30 individuals have similar coordinates. Yes I do have
more than one observation at one point as you have stated, this could be
the source of my problem indeed, My question is how could i go about
creating nearest neighbors and weights by clustering individuals that share
the same point? I sought for a solution online but could not find any.
Sorry I am a novice at spatial econometrics, this may be obvious to some ,
so instead of testing for spatial correlation in my residuals, I should in
essence test for spatial autocorrelation in my outcome variable, for
instance in my analysis I am using a dummy variable of whether an
individual is hiv positive or negative. And is there a specific way of
using a moran.test() on a dummy variable?


Kind Regards,

Michael Chanda Chiseni

Phd Candidate

Department of Economic History

Lund University

Visiting address: Alfa 1, Scheelevägen 15 B, 22363 Lund



*Africa is not poor, it is poorly managed (Ellen Johnson-Sirleaf ). *






On Wed, Aug 12, 2020 at 12:55 PM Roger Bivand <Roger.Bivand using nhh.no> wrote:

> On Tue, 11 Aug 2020, Chanda Chiseni wrote:
>
> > I am trying to perform moran's I test on residuals from by probit model
> > using k-nearest neighbour weights, however i run into an error which i
> > cant seem to find the solution anywhere online the error is
> >
> >> moran.test(residuals.glm(svyprobitest),knear2weight)
> > Error in moran.test(residuals.glm(svyprobitest), knear2weight) :
> >  objects of different length
> >
> >
> > The codes I use to come up to this are below
> >
> >> library(foreign)
> >> dhsanalysis2= read.dta("DHSSpatialreg11.dta")
> > ##removing NAs in PSU
> >> dhsanalysis3=subset(dhsanalysis2, !is.na(psu))
> > ##generating a survey design
> >> mydesighdhs= svydesign(ids =~psu, data =dhsanalysis3, weight = ~wgt,
> > strata = ~v023,nest=TRUE)
> > ##Defining coordinate colums
> >> coordinates(dhsanalysis3)= c("longnum","latnum")
> > #Defining Projection
> >> proj4string(dhsanalysis3) <- CRS("+init=epsg:4326")
> > ##binding longiude and latitude
> >> lon2<- dhsanalysis3$longnum
> >> lat2<- dhsanalysis3$latnum
> >> coords<- cbind(lon2,lat2)
> > ##Creating spatial weights based on the nearest neighbour
> >> knear2= knearneigh(coords,k=2,longlat=T)
> > Warning message:
> > In knearneigh(coords, k = 2, longlat = T) :
> >  knearneigh: identical points found ### I get a warning message on
> > identical points found, is this a problem and how would i deal with this
>
> It usually indicates muddled thinking. You have more than one observation
> at the same point, which could be an unintended duplicate (copy of the
> same observation), or it could indicate that any spatial process has
> multiple values at that point. In geostatistics, one would jitter the
> points to introduce distance. In your case, this probably isn't households
> living at one address point. All the observations at the same point will
> get the same sets of neighbours. If there are many co-located
> observations, k in k-nearest may be too small to include them all.
>
> >> knear2.nb= knn2nb(knear2)
> >> knear2weight= nb2listw(knear2.nb, style="W",zero.policy=T)
> >
> > ##declaring categorical variables as factor variables
> >> dhsanalysis3$hivpositive.f <- factor(dhsanalysis3$hivpositive)
> >> dhsanalysis3$protestant.f <- factor(dhsanalysis3$protestant2)
> >> dhsanalysis3$Married.f <- factor(dhsanalysis3$Married)
> >> dhsanalysis3$female.f <- factor(dhsanalysis3$female)
> >> dhsanalysis3$urban1.f <- factor(dhsanalysis3$urban1)
> >> dhsanalysis3$river10kmdum.f <- factor(dhsanalysis3$river10kmdum)
> >> dhsanalysis3$Explorer50kmdum.f <- factor(dhsanalysis3$Explorer50kmdum)
> >> dhsanalysis3$Rail50kmdum.f <- factor(dhsanalysis3$Rail50kmdum)
> >> dhsanalysis3$Province1.f <- factor(dhsanalysis3$Province1)
> >> dhsanalysis3$WealthIndex.f <- factor(dhsanalysis3$WealthIndex)
> >> dhsanalysis3$occupation2.f <- factor(dhsanalysis3$occupation2)
> >> dhsanalysis3$highested.f <- factor(dhsanalysis3$highested)
> >> protestant.f= dhsanalysis3$protestant.f
> >> hivpositive.f=dhsanalysi3$hivpositive.f
> >> Married.f=dhsanalysis3$Married.f
> >> female.f=dhsanalysis3$female.f
> >> urban1.f=dhsanalysis3$urban1.f
> >> river10kmdum.f=dhsanalysis3$river10kmdum.f
> >> Explorer50kmdum.f=dhsanalysis2$Explorer50kmdum.f
> >> Province1.f=dhsanalysis3$Province1.f
> >> WealthIndex.f=dhsanalysis3$WealthIndex.f
> >> occupation2.f=dhsanalysis3$occupation2.f
> >> highested.f=dhsanalysis3$highested.f
> >> Age=dhsanalysis3$Age
> >> Age2=dhsanalysis3$Age2
> >> HIVKnowledge=dhsanalysis3$HIVKnowledge
> >> churchkm=dhsanalysis3$churchkm
> >> lnhospitalkm=dhsanalysis3$lnhospitalkm
> >> lnElevationMean=dhsanalysis3$lnElevationMean
> >
> >> ###VARIABLES IN SURVEY DESIGN
> >> mydesighdhs$hivpositive.f <- factor(dhsanalysis3$hivpositive)
> >> mydesighdhs$protestant.f <- factor(dhsanalysis3$protestant2)
> >> mydesighdhs$Married.f <- factor(dhsanalysis3$Married)
> >> mydesighdhs$female.f <- factor(dhsanalysis3$female)
> >> mydesighdhs$urban1.f <- factor(dhsanalysis3$urban1)
> >> mydesighdhs$river10kmdum.f <- factor(dhsanalysis3$river10kmdum)
> >> mydesighdhs$Explorer50kmdum.f <- factor(dhsanalysis3$Explorer50kmdum)
> >> mydesighdhs$Rail50kmdum.f <- factor(dhsanalysis3$Rail50kmdum)
> >> mydesighdhs$Province1.f <- factor(dhsanalysis3$Province1)
> >> mydesighdhs$WealthIndex.f <- factor(dhsanalysis3$WealthIndex)
> >> mydesighdhs$occupation2.f <- factor(dhsanalysis3$occupation2)
> >> mydesighdhs$highested.f <- factor(dhsanalysis3$highested)
> >> mydesighdhs$Age=Age
> >> mydesighdhs$Age2=Age2
> >> mydesighdhs$HIVKnowledge=HIVKnowledge
> >> mydesighdhs$churchkm=churchkm
> >> mydesighdhs$lnhospitalkm=lnhospitalkm
> >> mydesighdhs$lnElevationMean=lnElevationMean
> >
> >> svyprobitest= svyglm(hivpositive.f~ churchkm +lnhospitalkm
> +protestant.f+
> > Age+
> >
> Age2+Married.f+female.f+urban1.f+river10kmdum.f+Explorer50kmdum.f+Rail50kmdum.f+lnElevationMean+Province1.f
> > + WealthIndex.f + HIVKnowledge+ occupation2.f+
> > highested.f,design=mydesighdhs,family = quasibinomial(link =
> > "probit"),data=dhsanalysis3)
> > Error in .subset2(x, i, exact = exact) : subscript out of bounds
>
> So you've an error anyway.
>
> >
> > ## Iran my probit model without implementing the survey design in the
> model
> > just to see whether Moran test is working
> >> svyprobitest2= glm(hivpositive.f~ churchkm +lnhospitalkm +protestant.f+
> > Age+
> >
> Age2+Married.f+female.f+urban1.f+river10kmdum.f+Explorer50kmdum.f+Rail50kmdum.f+lnElevationMean+Province1.f
> > + WealthIndex.f + HIVKnowledge+ occupation2.f+ highested.f,family =
> > quasibinomial(link = "probit"),data=dhsanalysis3)
> > #Moran test
> >> moran.test(residuals.glm(svyprobitest),knear2weight)
> > Error in moran.test(residuals.glm(svyprobitest), knear2weight) :
> >  objects of different length
> >
>
> Testing residuals using moran.test() is usually wrong, as the expectation
> and variance of the statistic are based on a null model (intercept only),
> not a model with covariates.
>
> What are length(residuals.glm(svyprobitest)) and length(knear2.nb)? Did
> glm drop observations with missing values? If so, you should subset both
> the data submitted to glm() and the neighbour object so that they match.
>
> Hope this helps,
>
> Roger
>
> >
> > Kind Regards,
> >
> > Michael Chanda Chiseni
> >
> > Phd Candidate
> >
> > Department of Economic History
> >
> > Lund University
> >
> > Visiting address: Alfa 1, Scheelevägen 15 B, 22363 Lund
> >
> >
> >
> > *Africa is not poor, it is poorly managed (Ellen Johnson-Sirleaf ). *
> >
> >       [[alternative HTML version deleted]]
> >
> > _______________________________________________
> > R-sig-Geo mailing list
> > R-sig-Geo using r-project.org
> > https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> >
>
> --
> Roger Bivand
> Department of Economics, Norwegian School of Economics,
> Helleveien 30, N-5045 Bergen, Norway.
> voice: +47 55 95 93 55; e-mail: Roger.Bivand using nhh.no
> https://orcid.org/0000-0003-2392-6140
> https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en

	[[alternative HTML version deleted]]



More information about the R-sig-Geo mailing list