[R] How to include known errors in a regression?

Robert Latest boblatest at gmail.com
Tue May 15 22:14:58 CEST 2012


Hello all,

I have a bunch of aggregated measurement data. The data describe two
different physical properties that correlate, and I want to estimate
the coefficients (slope and intercept) from the dataset.

This is of course easy, I've done it, and I got the expected result.

But here's the thing: Each data point in X and Y is actually a mean of
N individual (automated) measurements taken from the same object. I
have the mean, the standard deviation (SD) and N for each datapoint.
One datapoint corresponds to one of several (different) objects.

Is there any way I can enter this knowledge into the model? I need to
estimate the errors quite precisely, and I feel that I'm throwing away
valuable data by not using N and SD. I'm thinking about bloating my
data points into "fake" datasets by drawing rnorm() samples with the
given mean, SD, and N, but that sounds silly. Maybe I'll do it as an
experiment to see whether it has any significant impact.

To clarify: for each datapoint (X, Y) I additionally have (sdX, sdY)
and (nX, nY). So each (X, Y) would be turned into the nX*nY
combinations of all values of rnorm(nX, X, sdX) and rnorm(nY, Y, sdY).
Then I'd pitch all of this together into a single linear model. Does
that make sense?
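The expansion described above could be sketched like this (all data here are made up for illustration; the column names `sdX`, `nX`, etc. just mirror the notation in the post — this is only a sketch of the proposed experiment, not a recommendation):

```r
# Each aggregated point (X, Y) with its (sdX, sdY) and (nX, nY) is
# blown up into the nX * nY cross of simulated individual
# measurements, then a single lm() is fit to the pooled data.

set.seed(42)

# toy aggregated dataset: one row per object (values invented)
agg <- data.frame(
  X   = c(1.0, 2.0, 3.0),
  Y   = c(2.1, 3.9, 6.2),
  sdX = c(0.10, 0.15, 0.20),
  sdY = c(0.20, 0.20, 0.30),
  nX  = c(5, 5, 5),
  nY  = c(4, 4, 4)
)

expanded <- do.call(rbind, lapply(seq_len(nrow(agg)), function(i) {
  xs <- rnorm(agg$nX[i], agg$X[i], agg$sdX[i])
  ys <- rnorm(agg$nY[i], agg$Y[i], agg$sdY[i])
  expand.grid(x = xs, y = ys)   # all nX * nY combinations
}))

fit <- lm(y ~ x, data = expanded)
coef(fit)   # intercept and slope from the pooled simulated data
```

One caveat with this scheme: the expanded points are not independent observations (every simulated x is reused nY times), so the standard errors reported by lm() on the pooled data will be optimistic.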

My goal is to replace one (slow, expensive) measurement with another
(fast, cheap) one, and I need to establish the correlation (and
especially the expected error margin) between the two to see whether
that is feasible.

Thanks,
robert
