[R] categorizing data
Tom Woolman
twoo|m@n @end|ng |rom ont@rgettek@com
Sun May 29 21:42:38 CEST 2022
Some ideas:
You could fit a cluster model with k=3 (e.g. k-means) to each of the 3
variables, to find the low/medium/high centroid values for each of the
3 plant types. The centroid values could then be used to set the
upper/lower boundaries of the high/med/low ranges.
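
A rough sketch of the clustering idea, assuming one-dimensional k-means
from base R's kmeans(). The data are simulated, and the helper name
kmeans_levels is made up for illustration. Note that kmeans() needs more
rows than clusters, so the 3-row example from the original post is too
small for it.

```r
# Simulated stand-in data: 20 rows, each summing to roughly 90
set.seed(1)
raw <- matrix(runif(60), ncol = 3)
dat <- as.data.frame(round(90 * raw / rowSums(raw)))
names(dat) <- c("tree", "shrub", "grass")

# Hypothetical helper: cluster one variable into 3 groups and label the
# groups low/medium/high by the order of their centroids
kmeans_levels <- function(x) {
  fit <- kmeans(x, centers = 3, nstart = 10)
  ord <- order(fit$centers[, 1])   # cluster ids sorted by centroid value
  factor(match(fit$cluster, ord), levels = 1:3,
         labels = c("low", "medium", "high"))
}

# One set of labels per variable
levels_by_var <- as.data.frame(lapply(dat, kmeans_levels))
head(cbind(dat, levels_by_var))
```

Because the clustering is done per variable, nothing forces a row to
contain exactly one low, one medium and one high value.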
Or plot a histogram of each variable and use quantiles, densities, etc.
to find the natural breaks between the high/med/low ranges for each of
the variables.
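
The quantile variant might look like the following, assuming terciles
are acceptable as the "natural breaks" (that choice is an assumption,
as is the helper name tercile_levels). It uses the example data from
the original post.

```r
# Example data from the original post
dat <- data.frame(tree  = c(32, 23, 49),
                  shrub = c(11, 41, 23),
                  grass = c(47, 26, 18))

# Hypothetical helper: split one variable at its 1/3 and 2/3 quantiles.
# cut() will fail if the quantile breaks are not unique.
tercile_levels <- function(x) {
  breaks <- quantile(x, probs = c(0, 1/3, 2/3, 1))
  cut(x, breaks = breaks, labels = c("low", "medium", "high"),
      include.lowest = TRUE)
}

levels_by_var <- as.data.frame(lapply(dat, tercile_levels))
levels_by_var

# hist(dat$tree) would show the distribution the breaks are cutting
```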
On 2022-05-29 15:28, Janet Choate wrote:
> Hi R community,
> I have a data frame with three variables, where each row adds up to 90.
> I want to assign a category of low, medium, or high to the values in each
> row - where the lowest value per row will be set to 10, the medium value
> set to 30, and the high value set to 50 - so each row still adds up to 90.
>
> For example:
> Data: Orig
> tree  shrub  grass
> 32    11     47
> 23    41     26
> 49    23     18
>
> Data: New
> tree  shrub  grass
> 30    10     50
> 10    50     30
> 50    30     10
>
> I am not attaching any code here as I have not been able to write anything
> effective! appreciate help with this!
> thank you,
> JC
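
The row-wise recoding described in the quoted message (lowest value in
a row becomes 10, middle 30, highest 50) could also be done directly
with rank(), without estimating any thresholds. A minimal sketch, where
the object names targets and new are made up and ties are broken by
column order (an assumption, since the post does not say how ties
should be handled):

```r
# Example data from the original post
dat <- data.frame(tree  = c(32, 23, 49),
                  shrub = c(11, 41, 23),
                  grass = c(47, 26, 18))

# Values assigned to the lowest, middle and highest entry of each row
targets <- c(10, 30, 50)

# rank() gives 1/2/3 within each row; use that to index into targets.
# apply() returns one column per input row, hence the t().
new <- as.data.frame(t(apply(dat, 1, function(r) {
  targets[rank(r, ties.method = "first")]
})))
names(new) <- names(dat)
new
```

Each recoded row contains 10, 30 and 50 exactly once, so the row sums
stay at 90.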