[R-sig-Geo] specifying time period for splm

Maryia Bakhtsiyarava bakht013 at umn.edu
Fri Jan 8 02:09:36 CET 2016


Hi Tim,

Thank you so much for your answer! It made things much clearer.
Thanks for suggesting that I sort my data. I managed to do that and re-ran my
models.

I now have another issue with one of my models. I am running a maximum
likelihood estimation with a spatial lag, fixed effects and spatially
autocorrelated errors (akin to the one at the top of page 13 of the JSS
paper), and I get the following error: "number of items to replace is not a
multiple of replacement length". I did not alter my data in any way after
reading them in, so even though I looked at similar problems posted on
Stack Overflow, I still can't figure out why this happens. Do you know what
can cause this problem? Also, at which point during the estimation process
does the replacement happen?
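
For reference, a minimal sketch of two checks that can help narrow down a length mismatch like this one. It uses base R only and a toy data set; the column names Unit and Year are hypothetical, and it is an assumption (not something confirmed in this thread) that an unbalanced panel is behind the error. It does not show where inside spml the replacement occurs.

```r
# Toy panel with one row missing: unit "B" has no record for 2001.
# Unit and Year are hypothetical column names.
panel <- data.frame(
  Unit = c("A", "A", "B", "C", "C"),
  Year = c(2000, 2001, 2000, 2000, 2001),
  y    = c(1.2, 1.5, 2.0, 0.7, 0.9),
  stringsAsFactors = FALSE
)

# A balanced panel has exactly one row per unit-year combination.
counts      <- table(panel$Unit, panel$Year)
is_balanced <- all(counts == 1)

# Number of distinct spatial units; this should equal the number of
# regions in the spatial weights object.
n_units <- length(unique(panel$Unit))

# The message itself is raised by base R when the two sides of an
# assignment differ in length, as in this deliberately wrong example.
m   <- matrix(0, nrow = 3, ncol = 2)
msg <- tryCatch({ m[, 1] <- 1:2; "no error" },
                error = function(e) conditionMessage(e))
```

If is_balanced is FALSE, or n_units differs from the number of shapes used to build the weights, that is a reasonable first place to look.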

Best regards,
Maryia




On Wed, Jan 6, 2016 at 11:52 PM, Tim Meehan <tmeeha at gmail.com> wrote:

> Hi Maryia,
>
> I think you are on the right track.  From the JSS paper (page 3), your
> data can be:
>
> 1. A data.frame whose first two variables (columns) are the individual and
> time indexes. The data should be sorted by individual (Spatial Unit) and
> then time (Year), like you have it.  In this case the index argument in the
> call to spml should be left to the default value, which it is in your
> example.  Note that it isn't clear in your example if the first column
> (with Spatial Unit 1) is an actual column in the dataframe or the row
> names.  If it is row names, then you can make it a column with
> 'yourdata$Unit <- row.names(yourdata)', and then move it to the first
> column with 'yourdata <- yourdata[,c(4,1,2,3)]'.
>
> The other option is:
>
> 2. A data.frame and a character vector indicating the indexes variables.
> In this case, the indices wouldn't be the first two columns.  They would be
> somewhere else in the dataframe.  But you would specify the columns in the
> call to spml by adding something like 'index=c("Unit", "Year")'.  But, even
> in this case, I would presort the dataframe by Unit and Year to be safe.
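
The two options above can be sketched in base R as follows. This is an illustration on a toy data set, not code from the JSS paper; the column names Unit, Year, y and x1 are hypothetical stand-ins for the real variables.

```r
# Toy panel in scrambled order, with the index columns at the end.
panel <- data.frame(
  y    = c(2.1, 1.4, 3.3, 0.9, 2.8, 1.7),
  x1   = c(10, 12, 9, 14, 11, 13),
  Year = c(2001, 2000, 2000, 2001, 2001, 2000),
  Unit = c("B", "A", "C", "A", "C", "B"),
  stringsAsFactors = FALSE
)

# Sort by spatial unit first, then by year within each unit.
panel <- panel[order(panel$Unit, panel$Year), ]

# Option 1: move the two index columns to the front so that the default
# index = NULL can pick them up.
index_cols <- c("Unit", "Year")
panel <- panel[, c(index_cols, setdiff(names(panel), index_cols))]
rownames(panel) <- NULL

# Option 2 (not run here): leave the columns where they are and pass
# index = c("Unit", "Year") in the call to spml().
```

Note that character IDs sort alphabetically, so "10" comes before "2"; numeric unit IDs avoid that surprise.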
>
> One thing that I think is important:
>
> When you import the shapefile to make weights, there is a good chance that
> the shapes won't be sorted according to the spatial units column in your
> dataset.  That is, the first shape might not actually be Spatial Unit 1,
> like in your sorted data from above.  To check, you can create a
> SpatialPolygonsDataFrame called, say, 'spdf1' and then run something like
> 'plot(spdf1[1, ])' to see if R plots Spatial Unit 1 or something else.  If
> the shape order is different, you should reorder your
> SpatialPolygonsDataFrame after creating it from the shapefile but before
> making weights.  If this is the case, let me know and I can send along a
> code chunk to do this.
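
A minimal sketch of the reordering step, using match() in base R. A plain data.frame stands in for the attribute table of the polygons object so the example runs without any spatial packages; UNIT_ID and panel_units are hypothetical names, and this is not the code chunk offered above.

```r
# Unit IDs in the order they appear in the sorted panel data.
panel_units <- c("A", "B", "C")

# Stand-in for the polygons' attribute table, in shapefile order.
# UNIT_ID is a hypothetical name for the spatial unit identifier column.
shapes <- data.frame(
  UNIT_ID = c("C", "A", "B"),
  area    = c(30, 10, 20),
  stringsAsFactors = FALSE
)

# For each panel unit, find the row of the matching shape.
ord <- match(panel_units, shapes$UNIT_ID)
stopifnot(!anyNA(ord))  # every panel unit needs a matching shape

# Reorder the rows so that the shapes follow the panel order.
shapes <- shapes[ord, ]
```

With a SpatialPolygonsDataFrame, the same row indexing, spdf1[ord, ], should reorder the polygons together with their attributes; the weights would then be built from the reordered object.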
>
> Best,
> Tim
>
>
> On Wed, Jan 6, 2016 at 8:58 PM, Maryia Bakhtsiyarava <bakht013 at umn.edu>
> wrote:
>
>> Hello,
>>
>> I would be very thankful for any help with my issue. I am relatively new
>> to R and would greatly appreciate your help.
>>
>> I am trying to run spatial panel data models using the "splm" package. I
>> studied the manual and have been closely following the article by G. Millo
>> and G. Piras in the Journal of Statistical Software titled "splm: Spatial
>> Panel Data Models in R" to create my own models. What I don't seem to
>> understand is how R knows that the data are arranged as a time series.
>> I have repeated observations for spatial units for 5 years; however, I
>> have not found where in the script I need to specify the periods. I looked
>> at the data Millo used in the article, and even though the data are
>> arranged by years, there is no explicit mention in the code or article of
>> how to pass the time periods to R.
>>
>> Let's assume I have the following:
>> I construct a spatial weights matrix (sp_weightsm) from the shapefile of
>> my spatial units.
>> My formula is: fm1<-y~x1+x2+x3
>> mod1<-spml(formula = fm1, data = sids, index=NULL, listw=sp_weightsm,
>> lag=TRUE, spatial.error = "b", model = "within", effect = "individual",
>> method = "eigen", na.action = na.fail, quiet = TRUE, zero.policy = NULL,
>> tol.solve = 1e-10, control=list(), legacy=FALSE)
>>
>> my data has the following structure:
>>                   Year   X1     X2
>> Spatial Unit 1    2000   ....   ....
>> Spatial Unit 1    2001
>> Spatial Unit 2    2000
>> Spatial Unit 2    2001
>> Spatial Unit n    ....
>>
>> This model runs without errors; however, I am pretty sure the results are
>> not meaningful, because how would R know the time periods?
>> So, my question is: How do I pass on the time periods to R? Do I need to
>> subset my dataset into different parts based on years?
>>
>> Thank you in advance,
>> Maria
>>
>> _______________________________________________
>> R-sig-Geo mailing list
>> R-sig-Geo at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>
>
>


-- 
Maryia Bakhtsiyarava
Graduate student
Department of Geography, Environment and Society
University of Minnesota, Twin Cities

Research Assistant
TerraPop Project
Minnesota Population Center

414 Social Sciences, 267 19th Ave S, Minneapolis, MN 55455
