[R-sig-Geo] Impute missing values along a spatial network

Roger Bivand Roger.Bivand using nhh.no
Wed Mar 24 18:00:59 CET 2021


On Wed, 24 Mar 2021, Tobias Ruttenauer wrote:

> Thank you very much for the hint, Roger! Completely right, a Gaussian 
> process actually does not make too much sense in this case. I'll have a 
> look into INLA and see if I can work with that.

If you look at this as a rate shrinkage problem, maybe you can set up a 
spatial interaction model based on data at the end nodes of the segments 
to generate "expected" volumes? These would then not (necessarily) be 
time-constant covariates at the nodes, but could reflect supply/demand 
factors. Otherwise you just have the volume counts. A basic intrinsic CAR 
might be helpful, but the data would need to guide the signal/noise ratio: 
without expectations, and with many missing counts, the predictions for 
segments with no counts will have broad posterior distributions. Avoiding 
a Gaussian likelihood may help avoid predicting negative volumes, but 
modelling log volumes might be OK.
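
As a rough illustration of the contiguity idea (a sketch, not a fitted
model): the base R code below builds a binary neighbour matrix for the toy
network from the original question, treating two segments as neighbours
when they share an end node. It then fills missing log counts with the
conditional mean implied by an intrinsic CAR structure matrix Q = D - W,
treating the observed values as exact. The end-node ids and the counts for
A and D are invented for the example. It returns point values only; the
posterior distributions for unobserved segments would need a proper model
fit, for instance in INLA.

```r
# Toy network from the original question: segments A-B-C-D-E form one chain,
# F-G a separate one. End-node ids (n1, n2, ...) are hypothetical.
seg <- data.frame(
  id    = c("A", "B", "C", "D", "E", "F", "G"),
  from  = c("n1", "n2", "n3", "n4", "n5", "n7", "n8"),
  to    = c("n2", "n3", "n4", "n5", "n6", "n8", "n9"),
  count = c(100, NA, NA, 400, NA, NA, NA),  # made-up counts for A and D
  stringsAsFactors = FALSE
)
n <- nrow(seg)

# Binary contiguity: two segments are neighbours if they share an end node
W <- matrix(0, n, n, dimnames = list(seg$id, seg$id))
for (i in seq_len(n)) for (j in seq_len(n)) {
  shared <- intersect(c(seg$from[i], seg$to[i]), c(seg$from[j], seg$to[j]))
  if (i != j && length(shared) > 0) W[i, j] <- 1
}

# Intrinsic CAR structure matrix Q = D - W (D = diagonal of neighbour counts)
Q <- -W
diag(Q) <- rowSums(W)

# Reachability by repeated squaring: reach[i, j] > 0 if i and j are in the
# same connected component of the network
reach <- diag(n) + W
for (k in seq_len(ceiling(log2(n)))) reach <- (reach %*% reach > 0) * 1

obs  <- !is.na(seg$count)
# Only fill segments that are network-connected to an observed segment
fill <- !obs & (rowSums(reach[, obs, drop = FALSE]) > 0)

# Conditional mean on the log scale: solve Q_mm y_m = -Q_mo y_o
y <- log(seg$count)
y[fill] <- as.vector(solve(Q[fill, fill, drop = FALSE],
                           -Q[fill, obs, drop = FALSE] %*% y[obs]))
seg$imputed <- exp(y)

print(seg[, c("id", "count", "imputed")])
```

In this sketch B and C are interpolated along the chain between A and D on
the log scale, E takes the value of its only neighbour D, and F and G stay
NA because no observed segment is connected to them, however close they
might be geographically.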

Roger

>
> For now, I don't have any covariates in this. This is mainly because I'm 
> specifically interested in the annual variation of traffic counts, but 
> available covariates are all time-constant.
>
> Thanks again and best wishes
> Tobias
>
>
> -----Original Message-----
> From: Roger Bivand <Roger.Bivand using nhh.no>
> Sent: 24 March 2021 14:18
> To: Tobias Ruttenauer <tobias.ruttenauer using nuffield.ox.ac.uk>
> Cc: r-sig-geo using r-project.org
> Subject: Re: [R-sig-Geo] Impute missing values along a spatial network
>
> On Wed, 24 Mar 2021, Tobias Ruttenauer wrote:
>
>> Dear list members,
>>
>> I am trying to construct a road network with traffic estimates for
>> each road segment. I have count data of the traffic for a subset of
>> the segments and I have the road network as spatial lines data. For
>> those segments without count data, I would like to perform something
>> like linear imputation or some sort of interpolation / kriging along
>> the road network instead of using pure geographical distance. For
>> instance, if I have 7 road segments A-B-C-D-E and F-G (F and G are
>> unconnected to the rest), and I have data for A and D, how can I
>> impute data for B, C (and E) by only using A and D, while ignoring F
>> and G even though they might be geographically close?
>
> Are there any relevant covariates associated with the road segments? I
> think that this is more of a Markov than a Gaussian random field, so a
> Poisson spatial regression with a neighbour matrix representing
> contiguous segments might be possible. Covariates, or an offset by an
> expected volume, might help. INLA with a Besag model (INLA fits missing
> responses), or mgcv::gam() with an "mrf" smooth, or hglm(), then predict?
>
> Any other suggestions?
>
> Roger
>
>>
>> This seems fairly intuitive to me but I couldn't find a package doing
>> that. stplanr would do something related but it seems it needs
>> origin-destination data (which I don't have). I'd be grateful if
>> someone could nudge me into the right direction. I guess I'm using the
>> wrong terminology.
>>
>> Thanks a lot and best wishes
>> Tobias
>>
>> Tobias Rüttenauer
>> Nuffield College
>> University of Oxford
>> Oxford, OX1 1NF
>
> --
> Roger Bivand
> Department of Economics, Norwegian School of Economics, Postboks 3490 Ytre Sandviken, 5045 Bergen, Norway.
> e-mail: Roger.Bivand using nhh.no
> https://orcid.org/0000-0003-2392-6140
> https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en
>


