[R] Grouping data via an index

Peter Dalgaard P.Dalgaard at biostat.ku.dk
Mon Jan 28 15:52:36 CET 2008


Tobin, Jared wrote:
> Hello r-help,
>
> I have a lengthy vector of data (with values anywhere from 1-200), and
> another index vector of 'groups' representing values 0-2, 3-5, 6-8, ...
> of length 67.  The index vector has the structure (1, 4, 7, ... , 196,
> 199), where each value is the midpoint of each respective group.
>
> I'm trying to convert the data vector such that values falling into each
> group are changed to the 'number' of that group.  That is, the position
> of the midpoint of that group in the index vector.  For example, 4 or 5
> would become a 2; 6 would become 3; 9 or 10 would become 4, and so on.
>
> Haven't had any success thus far -- does anyone know of a simple method
> offhand?
>
> I've started by converting the data vector to a vector of the midpoints
> of each group, via
>
>   round(data.vector/3)*3 + 1
>
> But haven't been able to accomplish much past that.  I'm guessing it can
> be accomplished via a simple loop or otherwise.
>
>   
You could convert midpoints to breakpoints and use cut().
Do you know that each group contains 3 consecutive values? Otherwise it
gets a bit sticky.
If so, use

as.integer(cut(data.vector, breaks=seq(-.5,200.5,3)))

(which also works for other sets of breakpoints) or, since the groups
start at 0 and are all 3 values wide, just use integer division

data.vector %/% 3 + 1
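
A minimal sketch of both approaches side by side, assuming the groups
really are 0-2, 3-5, ..., 198-200 as described in the question. The sample
values in data.vector and the names midpoints, breaks, group.cut and
group.div are made up for illustration; the breakpoints are taken 1.5 on
either side of each midpoint, which reproduces seq(-.5,200.5,3).

```r
# Hypothetical sample data; the real data.vector is not shown in the thread.
data.vector <- c(0, 2, 3, 4, 5, 6, 9, 10, 198, 200)

# Index vector of midpoints as described in the question: 1, 4, 7, ..., 199.
midpoints <- seq(1, 199, by = 3)

# Breakpoints halfway between neighbouring groups: -0.5, 2.5, 5.5, ..., 200.5.
breaks <- c(midpoints - 1.5, max(midpoints) + 1.5)

# Approach 1: cut() returns a factor; as.integer() gives the group number.
group.cut <- as.integer(cut(data.vector, breaks = breaks))

# Approach 2: integer division. This relies on the groups starting at 0
# and each holding exactly 3 consecutive values.
group.div <- data.vector %/% 3 + 1

print(group.cut)
print(group.div)

# The two approaches should agree on every element.
stopifnot(all(group.cut == group.div))
```

Both print 1 1 2 2 2 3 4 4 67 67 for the sample values. The cut() version
carries over to any increasing set of breakpoints; the integer-division
shortcut does not.
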
> Thanks,
>
> --
>
> jared tobin, student research assistant
> fisheries and oceans canada
> tobinjr at dfo-mpo.gc.ca
>


-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907


