[R] Imputing missing values using "LSmeans" (i.e., population marginal means) - advice in R?
Liaw, Andy
andy_liaw at merck.com
Thu Apr 5 17:40:04 CEST 2012
Don't know how you searched, but perhaps this might help:
https://stat.ethz.ch/pipermail/r-help/2007-March/128064.html
> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Jenn Barrett
> Sent: Tuesday, April 03, 2012 1:23 AM
> To: r-help at r-project.org
> Subject: [R] Imputing missing values using "LSmeans" (i.e.,
> population marginal means) - advice in R?
>
> Hi folks,
>
> I have a dataset that consists of counts over a ~30 year
> period at multiple (>200) sites. Only one count is conducted
> at each site in each year; however, not all sites are
> surveyed in all years. I need to impute the missing values
> because I need an estimate of the total population size
> (i.e., sum of counts across all sites) in each year as input
> to another model.
>
> > head(newdat,40)
> SITE YEAR COUNT
> 1 1 1975 12620
> 2 1 1976 13499
> 3 1 1977 45575
> 4 1 1978 21919
> 5 1 1979 33423
> ...
> 37 2 1975 40000
> 38 2 1978 40322
> 39 2 1979 70000
> 40 2 1980 16244
>
>
> It was suggested to me by a statistician to use LSmeans to do
> this; however, I do not have SAS, nor do I know anything much
> about SAS. I have spent DAYS reading about these "LSmeans"
> and while (I think) I understand what they are, I have
> absolutely no idea how to a) calculate them in R or b) use
> them to impute my missing values in R. Again, I've
> searched the mail lists, internet and literature and have not
> found any documentation to advise on how to do this - I'm lost.
>
> I've looked at popMeans, but have no clue how to use this
> with predict() - if this is even the route to go. Any advice
> would be much appreciated. Note that YEAR will be treated as
> a factor and not a linear variable (i.e., the relationship
> between COUNT and YEAR is not linear - rather there are highs
> and lows about every 10 or so years).
>
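One way to read the request above is as an additive two-way model with SITE and YEAR both entered as factors, where the fitted value for an unsurveyed site-year serves as the imputed count. The following is only a sketch under that assumption, in base R. The toy data stand in for newdat and are made up, and FILLED, totals and lsm_year are names chosen here for illustration. Whether a plain lm() is adequate for counts of this size (rather than, say, a log scale or a Poisson-type GLM) is a separate question that this does not settle.

```r
# Toy data standing in for newdat (values are invented for illustration)
set.seed(1)
full <- expand.grid(SITE = factor(1:5), YEAR = factor(1975:1980))
full$COUNT <- round(1000 * as.integer(full$SITE) +
                    200 * as.integer(full$YEAR) +
                    rnorm(nrow(full), sd = 50))
newdat <- full[-c(3, 11, 27), ]   # drop a few site-years to mimic missed surveys

# Additive model with SITE and YEAR both treated as factors
fit <- lm(COUNT ~ SITE + YEAR, data = newdat)

# Complete SITE x YEAR grid; cells that were never surveyed get COUNT = NA
grid <- expand.grid(SITE = levels(newdat$SITE), YEAR = levels(newdat$YEAR))
grid <- merge(grid, newdat, all.x = TRUE)
grid$SITE <- factor(grid$SITE, levels = levels(newdat$SITE))
grid$YEAR <- factor(grid$YEAR, levels = levels(newdat$YEAR))

# Keep observed counts, fill only the missing cells with model predictions
missing <- is.na(grid$COUNT)
grid$FILLED <- grid$COUNT
grid$FILLED[missing] <- predict(fit, newdata = grid[missing, ])

# Yearly totals across all sites (observed plus imputed)
totals <- tapply(grid$FILLED, grid$YEAR, sum)

# Year "LS means" in this model: fitted values averaged over every site
lsm_year <- tapply(predict(fit, newdata = grid), grid$YEAR, mean)
```

In this additive setting, each year's LS mean is the average of the fitted values over the full set of sites, so the imputed cells and the LS means come from the same fit. The yearly totals here keep the observed counts where they exist and use predictions only for the gaps.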
> One thought I did have was to just set up a loop to calculate
> the least-squares estimates as:
>
> Yij = (IYi + JYj - Y)/[(I-1)(J-1)]
> where I = number of treatments and J = number of blocks (so
> I = sites and J = years). I found this formula in some stats
> lecture handouts by UC Davis on unbalanced data and
> LSMeans...but does it yield the same thing as using the
> LSmeans estimates? Does it make any sense? Thoughts?
>
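On the formula above: it looks like the classical single-missing-value estimate for a two-way layout. That reading assumes Yi and Yj are the totals of the observed values in the treatment and block containing the missing cell, and Y is the grand total of the observed values; the post does not define them, so this is an assumption. Under it, and with exactly one cell missing from an otherwise complete table, the formula should agree with the prediction from an additive lm() fit. The sketch below checks this on made-up data. The names nI, nJ, Ti, Bj and G are chosen here for illustration.

```r
# Made-up complete two-way table, then remove exactly one cell
set.seed(2)
d <- expand.grid(SITE = factor(1:4), YEAR = factor(1:5))
d$COUNT <- 100 * as.integer(d$SITE) + 10 * as.integer(d$YEAR) + rnorm(nrow(d))
miss <- 7                  # row index of the single missing cell
obs <- d[-miss, ]

nI <- nlevels(d$SITE)      # number of "treatments" (sites)
nJ <- nlevels(d$YEAR)      # number of "blocks" (years)

# Totals of the observed values only
Ti <- sum(obs$COUNT[obs$SITE == d$SITE[miss]])   # site total
Bj <- sum(obs$COUNT[obs$YEAR == d$YEAR[miss]])   # year total
G  <- sum(obs$COUNT)                             # grand total

# Closed-form estimate of the missing cell
closed_form <- (nI * Ti + nJ * Bj - G) / ((nI - 1) * (nJ - 1))

# Same cell predicted from the additive model fitted to the observed data
fit <- lm(COUNT ~ SITE + YEAR, data = obs)
pred <- unname(predict(fit, newdata = d[miss, ]))

all.equal(closed_form, pred)
```

With many missing site-years, as in the data described, the closed form no longer applies cell by cell without iteration, so predicting from the fitted model directly is probably the simpler route.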
> Many thanks in advance.
>
> Jenn
>