[R-sig-teaching] Simulating Data with predefined reg-coefficients and R2

markus m.kossner at tu-bs.de
Wed Nov 19 09:19:23 CET 2008


Hi all at the R-teaching mailing list,
I am currently preparing my first R-based regression course. Along
the way I ran into the following problem:

I want to simulate multivariate data with specific predefined
attributes. For example, I want to produce a predictor matrix (X)
and a response vector (y) that yield a given vector of regression
coefficients (b) and a given R2 when I fit a multiple linear
regression to the dataset. This is best described by the well-known
equation y = X*b + e.
As a next step I also want to simulate polynomial relationships, but I
think that should not work very differently.
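
A minimal sketch of one way such a dataset could be built in base R, under
stated assumptions rather than as a definitive method. The sample size, the
coefficient values, the target R2 and all variable names below are
illustrative assumptions, not taken from this post. The idea is to make the
error term e exactly orthogonal to the intercept and to the columns of X, so
that least squares returns b exactly, and then to rescale e so that the
residual sum of squares matches the requested R2.

```r
set.seed(1)                      # arbitrary seed, only for reproducibility

n  <- 100                        # sample size (assumed)
b0 <- 1                          # target intercept (assumed value)
b  <- c(2, -1, 0.5)              # target slopes (assumed values)
r2 <- 0.6                        # target R-squared (assumed value)

# Predictor matrix: independent standard normal columns
X <- matrix(rnorm(n * length(b)), nrow = n)
colnames(X) <- paste0("x", seq_along(b))

# Systematic part of the response, b0 + X %*% b
y_hat <- drop(b0 + X %*% b)

# Raw noise, made exactly orthogonal to the intercept and to X
# by keeping only the residuals of a regression of the noise on X
e <- resid(lm(rnorm(n) ~ X))

# Rescale the noise so that SSE / SST equals 1 - r2
ss_model <- sum((y_hat - mean(y_hat))^2)
e <- e * sqrt(ss_model * (1 - r2) / r2 / sum(e^2))

y   <- y_hat + e
dat <- data.frame(y = y, X)
fit <- lm(y ~ ., data = dat)

coef(fit)                # matches c(b0, b) up to floating-point error
summary(fit)$r.squared   # matches r2 up to floating-point error
```

Because e is orthogonal to X in the sample, the estimates equal the targets
exactly and not merely on average. The same construction should carry over to
polynomial terms if the columns of X are powers of one predictor, for example
cbind(x, x^2), as long as the columns stay linearly independent.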

I have already searched the web and found some hints, but no clear
answer. There is a PDF by John H. Walker ('Teaching Regression with
simulation'), which however does not discuss this particular topic. I
also have a paper by K. Baumann, 'Chance Correlation in variable subset
regression: Influence of the objective function, selection mechanism
and Ensemble averaging', QCS, 2005. There, an 'autoregressive process'
is used to simulate such data.

Now my question is:
Is it really that difficult to simulate such data? Is there perhaps a 
package in R facilitating at least parts of this work?

Thanks in advance for the help,
Markus



