[R-sig-Geo] specifying time period for splm
Tim Meehan
tmeeha at gmail.com
Fri Jan 8 05:49:01 CET 2016
I've encountered this error before, but don't remember the problem or
solution. Sorry to not be helpful. Good luck.
On Thu, Jan 7, 2016 at 6:09 PM, Maryia Bakhtsiyarava <bakht013 at umn.edu>
wrote:
> Hi Tim,
>
> Thank you so much for your answer! It made things so much more clear.
> Thanks for suggesting to sort my data - I managed to do that and re-ran my
> models.
>
> I have another issue now with one of my models. I am running a maximum
> likelihood estimation with spatial lag, fixed effects and spatial
> autocorrelation (akin to the one at the top of page 13 of the JSS paper)
> and I get the following error: "number of items to replace is not a
> multiple of replacement length". I did not alter my data in any way after
> reading them in, so even though I looked at similar problems posted on
> Stack Overflow, I still can't figure out why this happens. Do you know
> what can cause this problem? Also, at which point during the estimation
> process does the replacement happen?
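
The thread does not establish what triggers this error, so the following is only a sketch of checks that are commonly worth ruling out, under the assumption that mismatched dimensions are involved: whether every unit has the same number of years, whether the number of units matches the weights object, and whether any values are missing. The data frame `dat` and its values are invented for illustration; `sp_weightsm` is the weights object named further down in the thread.

```r
# Toy panel with one missing unit-year (unit "B" lacks 2001); invented values.
dat <- data.frame(
  Unit = c("A", "A", "B", "C", "C"),
  Year = c(2000, 2001, 2000, 2000, 2001),
  y    = c(1.2, 1.4, 0.9, 2.1, 2.0)
)

# Observations per unit; a balanced panel has the same count everywhere.
obs_per_unit <- table(dat$Unit)
is_balanced  <- length(unique(obs_per_unit)) == 1

# Number of spatial units, to compare by hand with the weights object,
# e.g. length(sp_weightsm$neighbours) for a listw.
n_units <- length(unique(dat$Unit))

# Any missing values in the data?
has_na <- anyNA(dat)

print(obs_per_unit)
print(c(balanced = is_balanced, has_na = has_na))
```

If the panel turns out to be unbalanced, or the unit count differs from the length of the neighbours list, that would be a place to look first; it is not confirmed as the cause here.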
>
> Best regards,
> Maryia
>
> On Wed, Jan 6, 2016 at 11:52 PM, Tim Meehan <tmeeha at gmail.com> wrote:
>
>> Hi Maryia,
>>
>> I think you are on the right track. From the JSS paper (page 3), your
>> data can be:
>>
>> 1. A data.frame whose first two variables (columns) are the individual
>> and time indexes. The data should be sorted by individual (Spatial Unit)
>> and then time (Year), like you have it. In this case the index argument in
>> the call to spml should be left to the default value, which it is in your
>> example. Note that it isn't clear in your example if the first column
>> (with Spatial Unit 1) is an actual column in the dataframe or the row
>> names. If it is row names, then you can make it a column with
>> 'yourdata$Unit <- row.names(yourdata)', and then move it to the first
>> column with 'yourdata <- yourdata[,c(4,1,2,3)]'.
>>
>> The other option is:
>>
>> 2. A data.frame and a character vector indicating the index variables.
>> In this case, the indices wouldn't be the first two columns. They would be
>> somewhere else in the dataframe. But you would specify the columns in the
>> call to spml by adding something like 'index=c("Unit", "Year")'. But, even
>> in this case, I would presort the dataframe by Unit and Year to be safe.
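
A minimal base-R sketch of the two options above, assuming a toy data frame in which the index columns are named `Unit` and `Year` (as in the 'index=c("Unit", "Year")' example) and are neither sorted nor in the first two positions. All values are invented for illustration.

```r
# Toy panel: 3 units x 2 years, deliberately unsorted, index columns last.
yourdata <- data.frame(
  y    = c(5.1, 4.8, 6.0, 5.5, 4.9, 6.2),
  x1   = c(1.0, 0.9, 1.4, 1.2, 1.1, 1.5),
  Year = c(2001, 2000, 2000, 2001, 2001, 2000),
  Unit = c("A", "B", "A", "C", "B", "C")
)

# Sort by individual first, then by time.
yourdata <- yourdata[order(yourdata$Unit, yourdata$Year), ]

# Option 1: move the two index columns to the front so that the default
# index = NULL treats them as the individual and time indexes.
idx <- c("Unit", "Year")
option1 <- yourdata[, c(idx, setdiff(names(yourdata), idx))]

# Option 2: leave the column order alone and name the indexes in the call,
# e.g. spml(..., data = yourdata, index = c("Unit", "Year")).

print(option1)
```

Either way the rows end up ordered by unit and then by year, which is the arrangement described above.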
>>
>> One thing that I think is important:
>>
>> When you import the shapefile to make weights, there is a good chance
>> that the shapes won't be sorted according to the spatial units column in
>> your dataset. That is, the first shape might not actually be Spatial Unit
>> 1, like in your sorted data from above. To check, you can create a
>> SpatialPolygonsDataFrame called, say, 'spdf1' and then run something like
>> 'plot(spdf1[1, ])' to see if R plots Spatial Unit 1 or something else. If
>> the shape order is different, you should reorder your
>> SpatialPolygonsDataFrame after creating it from the shapefile but before
>> making weights. If this is the case, let me know and I can send along a
>> code chunk to do this.
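
One way the reordering could look, as a sketch rather than the code chunk offered above: a plain data frame `shapes` stands in for the attribute table of the SpatialPolygonsDataFrame so the example runs without a shapefile, and the names `shapes`, `panel_units` and the `Unit` column are assumptions made for illustration.

```r
# Stand-in for the attribute table of a SpatialPolygonsDataFrame read from
# a shapefile; the shapes arrive in an order that differs from the panel.
shapes <- data.frame(Unit = c("C", "A", "B"), area = c(30, 10, 20))

# Unit order used in the sorted panel data (hypothetical labels).
panel_units <- c("A", "B", "C")

# Row positions in `shapes` that put it into panel order.
ord <- match(panel_units, shapes$Unit)
stopifnot(!anyNA(ord))  # every panel unit must have a matching shape

shapes_sorted <- shapes[ord, ]
print(shapes_sorted)

# With a real SpatialPolygonsDataFrame the same row index applies:
#   spdf1 <- spdf1[ord, ]
# and the weights are built afterwards, for example with
#   spdep::nb2listw(spdep::poly2nb(spdf1))
```

Building the weights only after the reordering keeps row i of the weights matrix aligned with unit i of the sorted panel.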
>>
>> Best,
>> Tim
>>
>>
>> On Wed, Jan 6, 2016 at 8:58 PM, Maryia Bakhtsiyarava <bakht013 at umn.edu>
>> wrote:
>>
>>> Hello,
>>>
>>> I would be very thankful for any help with my issue. I am relatively new
>>> to R and would greatly appreciate your help.
>>>
>>> I am trying to run spatial panel data models using the "splm" package. I
>>> studied the manual and have been closely following the article by G.
>>> Millo and G. Piras in the Journal of Statistical Software titled "splm:
>>> Spatial Panel Data Models in R" to create my own models. What I don't
>>> seem to understand is how R knows that the data are arranged as a time
>>> series. I have repeated observations for spatial units for 5 years;
>>> however, I have not found where in the script I need to specify the
>>> periods. I looked at the data Millo used in the article, and even though
>>> the data are arranged by years, there is no explicit mention in the code
>>> or article of how to pass the time periods to R.
>>>
>>> Let's assume I have the following:
>>> I construct a spatial weights matrix (sp_weightsm) from the shapefile of
>>> my spatial units.
>>> My formula is: fm1 <- y ~ x1 + x2 + x3
>>> mod1 <- spml(formula = fm1, data = sids, index = NULL, listw = sp_weightsm,
>>>   lag = TRUE, spatial.error = "b", model = "within", effect = "individual",
>>>   method = "eigen", na.action = na.fail, quiet = TRUE, zero.policy = NULL,
>>>   tol.solve = 1e-10, control = list(), legacy = FALSE)
>>>
>>> my data has the following structure:
>>>                  Year   X1     X2
>>> Spatial Unit 1   2000   ....   .....
>>> Spatial Unit 1   2001
>>> Spatial Unit 2   2000
>>> Spatial Unit 2   2001
>>> Spatial Unit n   .....
>>>
>>> This model runs without errors; however, I am pretty sure the results are
>>> not meaningful, because how would R know the time periods?
>>> So, my question is: how do I pass the time periods on to R? Do I need to
>>> subset my dataset into different parts based on years?
>>>
>>> Thank you in advance,
>>> Maria
>>>
>>> [[alternative HTML version deleted]]
>>>
>>> _______________________________________________
>>> R-sig-Geo mailing list
>>> R-sig-Geo at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>
>>
>>
>
>
> --
> Maryia Bakhtsiyarava
> Graduate student
> Department of Geography, Environment and Society
> University of Minnesota, Twin Cities
>
> Research Assistant
> TerraPop Project
> Minnesota Population Center
>
> 414 Social Sciences, 267 19th Ave S, Minneapolis, MN 55455
>