[R] Creating a set that has line of best fit y=3+2x so that SST, SSR, SSE are whole numbers
David Arnold
dwarnold45 at suddenlink.net
Sun Nov 24 20:42:05 CET 2013
I wanted to find a set of integer points (x, y) whose line of best fit is
y = 3 + 2x. I figured I'd be losing 2 degrees of freedom, so I fixed four
points and left the fifth point (x, y) unknown. I chose
1, 2, 3, 4, and x
for my explanatory data and
3, 8, 8, 12, and y
for my response data. I then used the slope formula
b = (n sum(xy) - sum(x)sum(y))/(n sum(x^2) - (sum(x))^2)
with b = 2 and n = 5 to get the equation
2 = (5(xy+91)-(x+10)(y+31))/(5(x^2+30)-(x+10)^2).
Then, because (mean(x), mean(y)) lies on the line of best fit, with
mean(x)=(x+10)/5 and mean(y)=(y+31)/5, substituting the means into
y = 3 + 2x gave me the equation y=2x+4. Substituting that into my first
equation gave me x=-1 and y=2.
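The same elimination can be checked numerically. What follows is only a sketch: `slope_gap` is a hypothetical helper name (not from any package), and the search interval passed to `uniroot` is an assumption that happens to bracket the solution. It substitutes y = 2x + 4 for the fifth response value and looks for the x at which the least-squares slope equals 2.

```r
# Four fixed points; the fifth point (x, 2x + 4) is the unknown.
x0 <- c(1, 2, 3, 4)
y0 <- c(3, 8, 8, 12)

# Hypothetical helper: least-squares slope minus the target slope 2,
# as a function of the unknown fifth x value.
slope_gap <- function(x5) {
  xs <- c(x0, x5)
  ys <- c(y0, 2 * x5 + 4)   # mean point is forced onto y = 3 + 2x
  n  <- length(xs)
  b  <- (n * sum(xs * ys) - sum(xs) * sum(ys)) /
        (n * sum(xs^2) - sum(xs)^2)
  b - 2
}

# Assumed bracket [-10, 10]; slope_gap changes sign inside it.
x5 <- uniroot(slope_gap, c(-10, 10), tol = 1e-10)$root
y5 <- 2 * x5 + 4
round(c(x = x5, y = y5), 6)
```

A root finder is used here instead of solving the equation by hand, so the result is approximate and is rounded before printing.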
Sure enough:
x <- c(-1,1,2,3,4)
y <- c(2,3,8,8,12)
plot(x,y)
lm.res <- lm(y~x)
lm.res
abline(lm.res)
Gave me the correct coefficients.
Coefficients:
(Intercept)            x
          3            2
Also, it was true that SSY = SSR + SSE, where SSY=sum((y-mean(y))^2) is
the total sum of squares (SST), SSR=sum((yhat-mean(y))^2), and
SSE=sum((y-yhat)^2).
yhat <- predict(lm.res)
# label the three squared-deviation columns
tab <- cbind(x, y, yhat, SSY=(y-mean(y))^2, SSR=(yhat-mean(y))^2,
             SSE=(y-yhat)^2)
addmargins(tab,1)
     x  y yhat   SSY   SSR SSE
1   -1  2    1 21.16 31.36   1
2    1  3    5 12.96  2.56   4
3    2  8    7  1.96  0.16   1
4    3  8    9  1.96  5.76   1
5    4 12   11 29.16 19.36   1
Sum  9 33   33 67.20 59.20   8
That is, 67.20 = 59.20 + 8.
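As a cross-check, the same sums of squares can be read from the ANOVA table of the fitted model. This is a small sketch that rebuilds the data so it runs on its own; the variable names `fit`, `ss`, `ssr`, `sse`, and `ssy` are chosen here for illustration.

```r
x <- c(-1, 1, 2, 3, 4)
y <- c(2, 3, 8, 8, 12)
fit <- lm(y ~ x)

# anova() on an lm fit has a "Sum Sq" column:
# first row is the regression term x, second row is the residuals.
ss  <- anova(fit)[["Sum Sq"]]
ssr <- ss[1]
sse <- ss[2]
ssy <- sum((y - mean(y))^2)   # total sum of squares

# Compare with a tolerance, since these are floating-point values.
all.equal(ssy, ssr + sse)
```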
However, what I'd like to have is a set of numbers x and y that have a line
of best fit with equation y = 3+ 2x, but all of the numbers in the last
table are integers (or whole numbers). That would give me a good image I can
show in class to demonstrate this idea without having to do too many
calculations with decimals.
Wondering if there might be a method in R to keep picking choices for x and
y until this happens?
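One possible way to do such a search, offered as a sketch and not as an established method: pick integer x values and integer residuals e at random, and keep a draw only if sum(e) = 0 and sum(e*x) = 0. Those two conditions are the normal equations, so y = 3 + 2x + e then has y = 3 + 2x as its least-squares line. If mean(y) is also an integer, every squared deviation in the table is a whole number. The function name `find_integer_fit`, the ranges, the seed, and the number of tries are all assumptions made for illustration.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# Hypothetical helper: random search for integer data whose
# least-squares line is y = 3 + 2x with whole-number table entries.
find_integer_fit <- function(n = 5, x_range = -5:5, e_range = -3:3,
                             max_tries = 1e5) {
  for (i in seq_len(max_tries)) {
    x <- sort(sample(x_range, n))            # distinct integer x values
    e <- sample(e_range, n, replace = TRUE)  # candidate integer residuals
    y <- 3 + 2 * x + e
    ok <- sum(e) == 0 &&        # residuals sum to zero
      sum(e * x) == 0 &&        # residuals orthogonal to x
      any(e != 0) &&            # skip perfect fits (SSE = 0)
      sum(y) %% n == 0          # mean(y) is an integer
    if (ok) return(data.frame(x = x, y = y))
  }
  NULL  # nothing found within max_tries
}

d <- find_integer_fit()
d
coef(lm(y ~ x, data = d))
```

Drawing the residuals instead of the y values directly keeps the rejection rate manageable, because the line is fixed by construction and only the two orthogonality conditions and the integer mean have to be hit by chance.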
D.