[R] Generate a series of new vars that correlate with existing var
Olivier ETERRADOSSI
olivier.eterradossi at ema.fr
Fri Apr 6 10:03:48 CEST 2007
Hello Greg (and List),
Thanks for your reply and reflections (and sorry for my "frenglish"....).
Of course you're right, and I agree "a posteriori" with all your views.
Probably my suggestion was first of all a mark of appreciation for your
solution ;-) .
Here is the reasoning that led me to my suggestion, although I now see
that I probably misunderstood what belongs in the "core" of R:
1) The question of how to build such pairs of correlated vectors is
nearly a FAQ, as you point out in your reply.
2) It appeared to me that it is often asked by newbies or by users with
relatively little statistical knowledge.
3) To get to your solution, a good understanding is needed of what
correlation is, as well as of matrix properties and operators. My guess
was that the people mentioned in 2) generally do not have this background.
4) From my own experience, I believed that the core of R was dedicated
either to the basics or to rather complicated algorithms that handle or
produce results appearing "simple" or "classical".
5) Again from my own experience, I could not imagine which non-core
package such a function should "obviously" be added to. I imagined
that, in the same way, a person looking for the function could have
some trouble locating it. I have not yet had a look at your
TeachingDemos package (I will), but I know of other categories of
researchers, often not statisticians, who need to generate such
data and would not think of looking there for a way to do it.
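
For readers without the background mentioned in point 3, the idea can be
sketched in a few lines of R. This is only an illustration under stated
assumptions, not necessarily the solution posted earlier in this thread:
the helper name make_correlated and the sample values are hypothetical.
The sketch takes the part of a random vector that is orthogonal to the
existing variable and mixes it with the standardized existing variable.

```r
# Hypothetical helper (not from the thread): build a vector whose sample
# correlation with a fixed vector x is exactly rho.
make_correlated <- function(x, rho, seed = NULL) {
  stopifnot(length(x) > 2, abs(rho) <= 1)
  if (!is.null(seed)) set.seed(seed)
  z  <- rnorm(length(x))            # raw random noise
  e  <- residuals(lm(z ~ x))        # part of z uncorrelated with x (mean 0)
  xs <- as.vector(scale(x))         # x centred and scaled to sd 1
  es <- as.vector(e / sd(e))        # residuals rescaled to sd 1
  rho * xs + sqrt(1 - rho^2) * es   # unit-variance mix with the target correlation
}

# Illustrative data, assumed for this example only
x <- c(2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9)
y <- make_correlated(x, rho = 0.6, seed = 1)
cor(x, y)   # equals 0.6 up to floating-point error
```

Because the residuals are exactly uncorrelated with x in the sample, the
result has the requested sample correlation rather than just an expected
one. The new vector has mean 0 and sd 1 and can be shifted and rescaled
afterwards without changing the correlation.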
In the end, all this mainly shows that I did not understand the R
philosophy as well as I thought!
Thanks, and regards. Olivier
Greg Snow wrote:
> Olivier,
>
> I have thought of adding something like this to a package, but here is my current thinking on the issue.
>
> This question (or a similar one) has been asked a few times, so there is some demand for a general answer. I see three approaches:
>
> 1. Have an example of the necessary steps archived in a publicly available place.
> 2. Write a function and include it in a non-core package.
> 3. Add it to the core of R or a core package.
>
> Number 1 is already in progress, as these e-mails will be part of the archive, though anyone is welcome to add it to the Wiki as well if they think that would be useful.
>
> Your suggestion is number 3, but I would argue that 2 is better than 3, for the simple reason that anything added to the core is implied to be top quality and to have pretty much any option that most people would think of. Putting it in a non-core package makes it available with fewer implied promises about quality.
>
> The question then becomes: what options do we make available? Do we have users specify the entire correlation structure, or just assume the new variables will be independent of each other? What should the function do if the set of correlations results in a matrix that is not positive definite? What if the user wants to have 2 fixed variables? And there are other questions.
>
> My current thinking is that the process is simple enough that it is easier to do this by hand than to remember all the options to the function. There are currently people who use bootstrap and permutation tests without loading in the packages that do these because it is quicker to write the code by hand than to remember the syntax of the functions. I think this type of data generation falls under the same situation. But if you, or someone else thinks that there is enough justification for a function to do this, and can specify what options it should have, I will be happy to add it to my TeachingDemos package (this seems an appropriate place, since one of the places that I want to generate data with a specific correlation structure is when creating an example for students).
>
>
> Hope this helps,
>
>
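
The question raised in the quoted message about a set of correlations
that is not positive definite can be checked before any data are
generated. The following is a minimal sketch, not a proposal for the
actual function: the helper name is_pos_def, the tolerance and the
matrix values are assumptions chosen for illustration.

```r
# Proposed correlations among x, y1 and y2 (illustrative values only).
# x and y1 are strongly positively related, as are y1 and y2, yet x and
# y2 are asked to be strongly negatively related, which is inconsistent.
R <- matrix(c( 1.0, 0.9, -0.9,
               0.9, 1.0,  0.9,
              -0.9, 0.9,  1.0), nrow = 3, byrow = TRUE)

# Hypothetical helper: a symmetric matrix is positive definite when all
# of its eigenvalues are strictly positive (here, above a small tolerance).
is_pos_def <- function(m, tol = 1e-8) {
  ev <- eigen(m, symmetric = TRUE, only.values = TRUE)$values
  all(ev > tol)
}

is_pos_def(R)        # FALSE: no data set can have these correlations
is_pos_def(diag(3))  # TRUE: mutually uncorrelated variables are always valid
```

Whether a function should stop with an error here or adjust the matrix
to the nearest valid one is exactly the kind of design choice left open
in the message above.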
--
Olivier ETERRADOSSI
Maître-Assistant
CMGD / Equipe "Propriétés Psycho-Sensorielles des Matériaux"
Ecole des Mines d'Alès
Hélioparc, 2 av. P. Angot, F-64053 PAU CEDEX 9
tel std: +33 (0)5.59.30.54.25
tel direct: +33 (0)5.59.30.90.35
fax: +33 (0)5.59.30.63.68
http://www.ema.fr