[R] Imputing missing values using "LSmeans" (i.e., population marginal means) - advice in R?

Jenn Barrett jsbarret at sfu.ca
Tue Apr 3 07:23:04 CEST 2012


Hi folks,

I have a dataset that consists of counts over a ~30 year period at multiple (>200) sites. Only one count is conducted at each site in each year; however, not all sites are surveyed in all years. I need to impute the missing values because I need an estimate of the total population size (i.e., sum of counts across all sites) in each year as input to another model. 

> head(newdat,40)
   SITE YEAR COUNT
1     1 1975 12620
2     1 1976 13499
3     1 1977 45575
4     1 1978 21919
5     1 1979 33423
...
37    2 1975 40000
38    2 1978 40322
39    2 1979 70000
40    2 1980 16244
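
For illustration, a minimal sketch of how the unsurveyed site-years could be made explicit before any imputation. The data frame `toy` and its values are made up; only the column names SITE, YEAR and COUNT come from the listing above.

```r
# Toy data in the same shape as newdat (values are made up for illustration)
toy <- data.frame(
  SITE  = c(1, 1, 1, 2, 2),
  YEAR  = c(1975, 1976, 1977, 1975, 1977),
  COUNT = c(120, 135, 160, 400, 410)
)

# Every SITE x YEAR combination that could have been surveyed
full_grid <- expand.grid(SITE = unique(toy$SITE), YEAR = unique(toy$YEAR))

# Left join: combinations with no survey get COUNT = NA
toy_full <- merge(full_grid, toy, by = c("SITE", "YEAR"), all.x = TRUE)
toy_full <- toy_full[order(toy_full$SITE, toy_full$YEAR), ]

# The site-years that would need an imputed value
missing_cells <- toy_full[is.na(toy_full$COUNT), c("SITE", "YEAR")]
```

Here `missing_cells` lists the site-years with no count (SITE 2 in 1976 in this toy example).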


A statistician suggested that I use LSmeans to do this; however, I do not have SAS, nor do I know much about it. I have spent DAYS reading about these "LSmeans" and while (I think) I understand what they are, I have absolutely no idea how to a) calculate them in R or b) use them to impute my missing values in R. I've searched the mailing lists, the internet and the literature and have not found any documentation on how to do this - I'm lost.

I've looked at popMeans, but have no clue how to use it with predict() - if this is even the route to go. Any advice would be much appreciated. Note that YEAR will be treated as a factor rather than a continuous (linear) covariate, because the relationship between COUNT and YEAR is not linear - rather there are highs and lows about every 10 or so years.
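
One possible route, sketched here under stated assumptions rather than as a recommended method: fit an additive model with SITE and YEAR as factors to the observed counts, then use predict() to fill only the unsurveyed site-years. The toy values are made up, and plain lm() is an assumption; for count data a Poisson or negative-binomial GLM may well be more appropriate, and lm() can in principle predict negative counts.

```r
# Toy data (made-up values): 3 sites x 4 years, SITE varies fastest
toy <- expand.grid(SITE = 1:3, YEAR = 1975:1978)
toy$COUNT <- c(100, 210, 55, 130, 260, 70, 90, 190, 40, 120, 250, 60)
toy$COUNT[c(5, 9)] <- NA   # SITE 2 in 1976 and SITE 3 in 1977 not surveyed

# Treat both SITE and YEAR as factors, not as numeric trends
toy$SITE <- factor(toy$SITE)
toy$YEAR <- factor(toy$YEAR)

# Additive model; lm() drops rows with NA in COUNT by default
fit <- lm(COUNT ~ SITE + YEAR, data = toy)

# Fill only the unsurveyed cells with their model predictions
is_missing <- is.na(toy$COUNT)
toy$COUNT_FILLED <- toy$COUNT
toy$COUNT_FILLED[is_missing] <- predict(fit, newdata = toy[is_missing, ])

# Yearly totals across all sites (observed counts plus imputed ones)
yearly_total <- tapply(toy$COUNT_FILLED, toy$YEAR, sum)
```

This only works if every site and every year has at least one observed count; otherwise the corresponding factor level cannot be estimated. Note also that the yearly totals treat imputed values as if they were observed, so their uncertainty is understated.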

One thought I did have was to just set up a loop to calculate the least-squares estimates as:

Yij = (IYi + JYj - Y)/[(I-1)(J-1)]
where I = number of treatments and J = number of blocks (so here I = number of sites and J = number of years). I found this formula in some stats lecture handouts by UC Davis on unbalanced data and LSMeans... but does it yield the same thing as using the LSmeans estimates? Does it make any sense? Thoughts?
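
A small sketch of the formula above for a single missing cell, compared with the prediction from an additive SITE + YEAR model. It assumes (this is not stated in the handout as quoted) that Yi, Yj and Y are the totals of the observed counts for the affected site, the affected year, and the whole table, as in the classical missing-value formula for a randomized block design. The toy values are made up.

```r
# Toy table (made-up values): rows = sites (I = 3), columns = years (J = 4)
y <- matrix(c(100, 210, 55,
              130, 260, 70,
               90, 190, 40,
              120, 250, 60), nrow = 3)
y[2, 3] <- NA   # pretend site 2 was not surveyed in the third year

I <- nrow(y)
J <- ncol(y)
site_total  <- sum(y[2, ], na.rm = TRUE)  # assumed meaning of Yi
year_total  <- sum(y[, 3], na.rm = TRUE)  # assumed meaning of Yj
grand_total <- sum(y, na.rm = TRUE)       # assumed meaning of Y

# Formula from the handout, applied to the one missing cell
formula_est <- (I * site_total + J * year_total - grand_total) /
  ((I - 1) * (J - 1))

# Same cell predicted from an additive model fitted to the observed cells
long <- expand.grid(SITE = factor(1:3), YEAR = factor(1975:1978))
long$COUNT <- as.vector(y)   # column-major order matches expand.grid
fit <- lm(COUNT ~ SITE + YEAR, data = long)
lm_pred <- predict(fit, newdata = long[is.na(long$COUNT), ])

# Compare the two estimates
isTRUE(all.equal(as.numeric(lm_pred), formula_est))
```

With exactly one missing cell the two estimates agree. With many missing cells, as in the real data, the formula would have to be applied iteratively, whereas the model-based prediction handles all missing cells in one step.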

Many thanks in advance.

Jenn


