[R] Errors-In-Variables in R
John Fox
jfox at mcmaster.ca
Sun Mar 3 16:54:26 CET 2013
Dear Cedric,
If I understand correctly what you want to do, and if you're willing to
assume that the variables are normally distributed, then you should be able
to use any of the latent-variable structural-equation-modeling packages in
R, such as sem, OpenMX, or lavaan.
Here's an artificial example using the sem package:
------------ snip ----------
> set.seed(12345)
> zeta <- rnorm(1000)
> y <- 1 + 2*zeta + rnorm(1000, 0, 1)
> x <- zeta + rnorm(1000)
> plot(x, y)
> Data <- data.frame(x, y)
> summary(lm(y ~ x)) # biased
Call:
lm(formula = y ~ x)
Residuals:
Min 1Q Median 3Q Max
-6.6339 -1.1406 0.0299 1.1573 6.5652
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.04007 0.05514 18.86 <2e-16 ***
x 1.06089 0.04012 26.44 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1.743 on 998 degrees of freedom
Multiple R-squared: 0.4119, Adjusted R-squared: 0.4113
F-statistic: 699.1 on 1 and 998 DF, p-value: < 2.2e-16
> plot(x, y) # not shown
>
> library(sem)
>
> eqns <- specifyEquations()
1: y = alpha*Intercept + beta*zeta
2: x = 1*zeta
3: V(y) = sigma
4: V(x) = 1
5: V(zeta) = phi
6:
Read 5 items
> model <- sem(eqns, data=Data, raw=TRUE, fixed.x="Intercept")
> summary(model)
Model fit to raw moment matrix.
Model Chisquare = 0.2264654 Df = 1 Pr(>Chisq) = 0.6341572
AIC = 8.226465
BIC = -6.68129
Normalized Residuals
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000 0.1635 0.1711 0.2189 0.2564 0.4759
Parameter Estimates
Estimate Std Error z value Pr(>|z|)
alpha 1.0400668 0.05507397 18.884905 1.518098e-79 y <--- Intercept
beta 2.2553406 0.14197058 15.885971 7.926103e-57 y <--- zeta
sigma 0.6404697 0.25612060 2.500657 1.239632e-02 y <--> y
phi 0.8881856 0.08444223 10.518263 7.117323e-26 zeta <--> zeta
Iterations = 15
> library(car)
> linearHypothesis(model, c("alpha = 1", "beta = 2", "sigma = 1", "phi =
1")) # true parameter values
Linear hypothesis test
Hypothesis:
alpha = 1
beta = 2
sigma = 1
phi = 1
Model 1: restricted model
Model 2: model
Res.Df Df Chisq Pr(>Chisq)
1 5
2 1 4 3.8285 0.4297
------------ snip ----------
For other distributional assumptions, you'd have to write your own objective
function but you may still be able to use sem or one of the other SEM
packages.
I hope this helps,
John
-----------------------------------------------
John Fox
Senator McMaster Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-
> project.org] On Behalf Of Cedric Sodhi
> Sent: Saturday, March 02, 2013 4:56 PM
> To: Rui Barradas
> Cc: r-help at r-project.org
> Subject: Re: [R] Errors-In-Variables in R
>
> Perhaps it would have been clearer that this is no homework if I
> hadn't forgotten to say what [1] is. Sorry for that.
>
> [1] https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=15225
>
> (This is no homework but genuinely adresses the problem that R to my
> knowledge does not have models for error in variables)
>
>
> On Sat, Mar 02, 2013 at 09:34:21PM +0000, Rui Barradas wrote:
> > There's a no homework policy in R-help.
> >
> > Rui Barradas
> >
> > Em 02-03-2013 18:28, Cedric Sodhi escreveu:
> > > In reference to [1], how would you solve the following regression
> > > problem:
> > >
> > > Given observations (X_i,Y_i) with known respective error
> distributions
> > > (e_X_i,e_Y_i) (say, 0-mean Gaussian with known STD), find the
> parameters
> > > a and b which maximize the Likelihood of
> > >
> > > Y = a*X + b
> > >
> > > Taking the example further, how many of the very simplified
> assumptions
> > > from the above example can be lifted or eased and R still has a
> method
> > > for finding an errors-in-variables fit?
> > >
> > > ______________________________________________
> > > R-help at r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> > >
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
More information about the R-help
mailing list