[R] Creating a vector of categories
Sharpie
chuck at sharpsteen.net
Fri Mar 26 12:40:03 CET 2010
Christoffer Karlsson wrote:
>
> Hi,
>
> I have a column in a data frame looking something like:
>
> $sex $language $count
> male english 0
> male english 0
> female english 32
> male spanish 154
> female english 11
> female norweigan 7
>
> and so on.
> What I want to do is to sort these into categories, for instance one
> category where count>=0 & count<10 and so on.
>
> I want my data to turn out looking something like:
>
> male english 0-10 1324
> male english 11-20 756
> .....
> male spanish 0-10 354
> ...
> female english 0-10 1557
> ...
>
> and so on, where the right-hand column is the number of people in each
> category.
> Up until now I've been subsetting the data frame into each category, and
> then counting number of rows in each subset. However I now have a large
> amount of different factor combinations which makes this process tedious.
>
> Any help would be appreciated!
> Chris
>
You can quickly assign a category to each row in your data frame with the
cut() function:
testData <- structure(list(
  sex = structure(c(2L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 2L),
    .Label = c("female", "male"), class = "factor"),
  language = structure(c(1L, 1L, 1L, 3L, 1L, 2L, 3L, 3L, 1L),
    .Label = c("english", "norweigan", "spanish"), class = "factor"),
  count = c(0L, 0L, 32L, 154L, 11L, 7L, 3L, 5L, 2L)),
  .Names = c("sex", "language", "count"),
  class = "data.frame", row.names = c(NA, -9L))
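# Round the largest count up to the next multiple of 10 for the top break
binMax <- ceiling( max(testData$count) / 10 ) * 10
binBreaks <- seq( 0, binMax, by = 10 )
# include.lowest = TRUE puts a count of 0 in the first bin, labelled [0,10]
testData$bin <- cut( testData$count, binBreaks, include.lowest = TRUE )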
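The default labels from cut() look like [0,10] and (10,20]. If you would rather have labels in the 0-10, 11-20 style from the question, something along these lines might work. The values in counts and the names counts, breaks and binLabels are made up for illustration, and the label construction assumes integer counts:

```r
# Hypothetical example values, not taken from the original data
counts <- c(0L, 7L, 10L, 11L, 32L)
breaks <- seq(0, 40, by = 10)

# Build labels "0-10", "11-20", ... from the break points.
# The lower bound is break + 1 for every bin except the first,
# which only makes sense for integer counts (an assumption here).
lower <- c(0, breaks[-c(1, length(breaks))] + 1)
upper <- breaks[-1]
binLabels <- paste(lower, upper, sep = "-")

# Same cut() call as above, with custom labels in place of the defaults
bins <- cut(counts, breaks, labels = binLabels, include.lowest = TRUE)
print(data.frame(counts, bins))
```

Note that a count of exactly 10 lands in the 0-10 bin, because cut() closes each interval on the right by default.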
And then as Petr said:
# Naming the list elements gives readable column names in the result
with( testData, aggregate(count, list(sex = sex, language = language, bin = bin), length))
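For what it is worth, the same counts can probably also be had through the formula interface of aggregate(), or through table() if the empty combinations are wanted too. This is only a sketch: exampleData, binCounts and allCounts are names invented here, and the data is rebuilt from the rows above so the snippet runs on its own:

```r
# Same nine example rows as above, rebuilt so this snippet is self-contained
exampleData <- data.frame(
  sex = c("male", "male", "female", "male", "female",
          "female", "male", "female", "male"),
  language = c("english", "english", "english", "spanish", "english",
               "norweigan", "spanish", "spanish", "english"),
  count = c(0L, 0L, 32L, 154L, 11L, 7L, 3L, 5L, 2L)
)
exampleData$bin <- cut(exampleData$count, seq(0, 160, by = 10),
                       include.lowest = TRUE)

# Formula interface: one row per combination that actually occurs
binCounts <- aggregate(count ~ sex + language + bin,
                       data = exampleData, FUN = length)
# The "count" column now holds the number of rows, so rename it
names(binCounts)[names(binCounts) == "count"] <- "n"
print(binCounts)

# table() counts every combination, including those with zero rows
allCounts <- as.data.frame(table(exampleData[c("sex", "language", "bin")]))
print(subset(allCounts, Freq > 0))
```

The aggregate() result only lists combinations that occur in the data, while the table() result has a row for every sex/language/bin combination, which may or may not be what you want.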
Hope this helps!
-Charlie
-----
Charlie Sharpsteen
Undergraduate-- Environmental Resources Engineering
Humboldt State University