[R] Assigning factors probabilistically based on the value of another variable.

Charles C. Berry cberry at tajo.ucsd.edu
Sun Mar 28 00:24:03 CET 2010

```On Sat, 27 Mar 2010, Economics Guy wrote:

> I am revising a program that I wrote when I was very new at R
> (2007ish), and while I have been able to write very nice and fast code
> for almost all of it, there is one task that I cannot seem to do
> in fewer than 40 ugly and computationally expensive lines.
>
> I have a data frame that contains one variable:
>
> theFrame <- data.frame(theValues=runif(150,-10,10))
>
> I would like to write a function that would assign each of these
> values a factor, and I need it to meet several criteria:
>
> (1) There are 15 factors.
> (2) I need there to be exactly 10 elements assigned to each factor.
>
> Now here is the tricky part:
>
> (3) I would like to assign the factor probabilistically. The lower
> theValue is for a row, the lower factor I would like it to receive. So
> values close to -10 should have a really high probability of being
> assigned factor 1.
>
> If assigning factors is too tricky I would settle for placing theValues
> in a 10 x 15 matrix where the lower values would be more likely to end
> up in column 1 (again, values close to -10 should have a really high
> probability of being assigned to column 1).

It is really the same thing. One of many possibilities (each result is a
150 x 15 indicator matrix, one row per value, ten rows per column):

> theFrame <- data.frame(theValues=runif(150,-10,10))
> exact <- diag(15)[1+ (rank(theFrame$theValues)-1)%/%10,]
> not.so.exact <- diag(15)[1+ (rank(theFrame$theValues+runif(150,0,3))-1)%/%10,]

If what you actually wanted was one factor with fifteen levels, wrap the
subscript of the last assignment (the expression inside the square
brackets) in factor() instead of using it to index diag(15).
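
# A minimal sketch of the factor() variant described above. The names
# grp.exact, noisy, grp.noisy and asMatrix are illustrative assumptions,
# not part of the original post; the noise range (0 to 3) is the one
# used in not.so.exact and can be widened for more randomness.
set.seed(1)                                  # only to make the sketch reproducible
theFrame <- data.frame(theValues = runif(150, -10, 10))

# Deterministic grouping: ranks 1-10 -> level 1, ..., ranks 141-150 -> level 15
grp.exact <- 1 + (rank(theFrame$theValues) - 1) %/% 10
theFrame$exact <- factor(grp.exact, levels = 1:15)

# Probabilistic grouping: perturb the values before ranking, so low values
# usually, but not always, land in the low levels
noisy <- theFrame$theValues + runif(150, 0, 3)
grp.noisy <- 1 + (rank(noisy) - 1) %/% 10
theFrame$not.so.exact <- factor(grp.noisy, levels = 1:15)

# Each level should hold exactly 10 values (assuming no ties, which is
# the practical case for continuous draws)
stopifnot(all(table(theFrame$exact) == 10),
          all(table(theFrame$not.so.exact) == 10))

# The 10 x 15 matrix form: matrix() fills column by column, so column 1
# receives the ten values with the lowest perturbed ranks
asMatrix <- matrix(theFrame$theValues[order(noisy)], nrow = 10, ncol = 15)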

HTH,

Chuck

>
> Any ideas? I have thought at times I was painfully close only to
> realize I was completely wrong.
>
> Thanks,
>
> That Economics Guy