[R] bowed linear approximations

Fox, John jfox at mcmaster.ca
Tue Sep 26 16:21:49 CEST 2017


Dear Rich,

I think that it's generally a bad idea to give statistical (as opposed to simply technical) advice by email without knowing the context of the research. I think that you'd do well to seek help from a statistician, and not just do what I suggest below.

Interpolating the data only makes sense if there's no random component to the response (mag in your data). Otherwise, it makes more sense to get "predictions" from a statistical model that has an explicit error component for the response. In your case, a simple quadratic model in log(freq) seems to fit the data reasonably well. 

To see what I mean, try

plot(log(freq), mag)
mod <- lm(mag ~ poly(log(freq), 2))
summary(mod)
points(log(freq), fitted(mod), pch=16)
lines(spline(log(freq), fitted(mod)))
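
A fit like this can be checked with a couple of residual plots. The following is only a minimal sketch: the choice of plots is an assumption (not necessarily the diagnostics referred to in this message), and the data are copied from the quoted message below so that the block runs on its own.

```r
# Data copied from the quoted message
freq <- c(2, 3, 5, 10, 50, 100, 200, 300, 500, 750, 1000, 1300, 1800, 2450,
          2900, 3000, 4000, 5000, 6000, 7000, 8200, 9300, 10000, 11000,
          18000, 26500, 33000, 40000)
mag <- c(1.9893038, 1.5088071, 1.1851947, 0.9444483, 0.7680123, 0.7458169,
         0.7069638, 0.6393066, 0.6261539, 0.6263381, 0.7053774, 0.6900626,
         0.6953527, 0.7843036, 0.9056359, 0.8867276, 0.8937421, 0.9492288,
         0.9629118, 1.1972268, 1.0010515, 0.9945838, 1.0564356, 0.8733333,
         1.1666667, 1.5366667, 1.4666667, 1.3166667)

mod <- lm(mag ~ poly(log(freq), 2))

# Residuals against fitted values: curvature suggests a missing term,
# a funnel shape suggests non-constant error variance
plot(fitted(mod), residuals(mod), xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)

# Normal quantile plot of the residuals
qqnorm(residuals(mod))
qqline(residuals(mod))
```

The same two plots can be repeated for any later model to compare the residual patterns.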

Some basic regression diagnostics suggest that we can do better by taking the log of mag as well, producing a closer fit to the data and stabilizing the error variance:

plot(log(freq), log(mag))
mod2 <- lm(log(mag) ~ poly(log(freq), 2))
summary(mod2)
points(log(freq), fitted(mod2), pch=16)
lines(spline(log(freq), fitted(mod2)))

I have no idea whether this makes substantive sense in the context of your problem.
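
If the log-log model were adopted, "predictions" at frequencies that were not measured could be obtained with predict() rather than approx(). This is a hedged sketch, not a recommendation: the values in new_freq are hypothetical, and exponentiating a prediction made on the log scale estimates the median of mag, not its mean.

```r
# Data copied from the quoted message
freq <- c(2, 3, 5, 10, 50, 100, 200, 300, 500, 750, 1000, 1300, 1800, 2450,
          2900, 3000, 4000, 5000, 6000, 7000, 8200, 9300, 10000, 11000,
          18000, 26500, 33000, 40000)
mag <- c(1.9893038, 1.5088071, 1.1851947, 0.9444483, 0.7680123, 0.7458169,
         0.7069638, 0.6393066, 0.6261539, 0.6263381, 0.7053774, 0.6900626,
         0.6953527, 0.7843036, 0.9056359, 0.8867276, 0.8937421, 0.9492288,
         0.9629118, 1.1972268, 1.0010515, 0.9945838, 1.0564356, 0.8733333,
         1.1666667, 1.5366667, 1.4666667, 1.3166667)

mod2 <- lm(log(mag) ~ poly(log(freq), 2))

# Hypothetical frequencies at which predictions are wanted
new_freq <- c(20, 150, 2000, 15000)

# Predictions and 95% prediction intervals on the log(mag) scale
log_pred <- predict(mod2, newdata = data.frame(freq = new_freq),
                    interval = "prediction")

# Back-transform to the original mag scale (columns: fit, lwr, upr)
pred <- exp(log_pred)
cbind(freq = new_freq, pred)
```

The prediction intervals reflect the error component of the model, which linear interpolation between observed points ignores.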

Best,
 John

> -----Original Message-----
> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Evans,
> Richard K. (GRC-H000)
> Sent: Tuesday, September 26, 2017 10:01 AM
> To: Eric Berger <ericjberger at gmail.com>; Fox, John <jfox at mcmaster.ca>
> Cc: r-help at r-project.org
> Subject: Re: [R] bowed linear approximations
> 
> My apologies for the typos in the code.
> Here is a corrected version you can copy/paste in R to see the issue.
> 
> freq <- c(2, 3, 5, 10, 50, 100, 200, 300, 500, 750, 1000, 1300, 1800, 2450,
>           2900, 3000, 4000, 5000, 6000, 7000, 8200, 9300, 10000, 11000,
>           18000, 26500, 33000, 40000)
> mag <- c(1.9893038, 1.5088071, 1.1851947, 0.9444483, 0.7680123, 0.7458169,
>          0.7069638, 0.6393066, 0.6261539, 0.6263381, 0.7053774, 0.6900626,
>          0.6953527, 0.7843036, 0.9056359, 0.8867276, 0.8937421, 0.9492288,
>          0.9629118, 1.1972268, 1.0010515, 0.9945838, 1.0564356, 0.8733333,
>          1.1666667, 1.5366667, 1.4666667, 1.3166667)
> plot(freq, mag, type="b", log="x")
> for (i in 1:200) {
>   xx <- exp(runif(1, log(min(freq)), log(max(freq))))
>   yy <- approx(freq, mag, xout=xx, method="linear")
>   points(xx, yy$y, col=rgb(1,0,0))
> }
> 
> For completeness, I have been puzzling over why the approximated points
> don't lie on the straight lines joining the original data points (especially
> prominent in the bow between freq=10 and 50). Once I realized (and concurred
> with) why this bow exists, I have been struggling with how to make these
> approximations behave as expected. In my original post, I think I
> oversimplified by implying that my application involved just 2 data points.
> 
> Do you think your suggestions are still valid?
> -Rich


