[R] Re-grouping data in R
Rui Barradas
ruipbarradas at sapo.pt
Wed Aug 8 00:29:45 CEST 2012
Hello,
Inline.
On 07-08-2012 19:56, Abraham Mathew wrote:
> I have a data frame with a column of values that I want to bucket (group)
> into specific levels.
>
>> str(dat)
> 'data.frame': 3678 obs. of 39 variables:
> $ id : int 23 76 129 156 166 180 200 214 296 344 ...
> $ final_purchase_amount : Factor w/ 32 levels "\\N","1082","1109",..: 1 1 1 1 1 1 1 1 1 1 ...
>
>
> So I ran the following to produce new levels, one for values from 100
> to 400, 401 to 1000, and 1001+.
>
>
> dat$final_purchase_amount<- NA
> dat$final_purchase_amount[dat$final_purchase_amount %in%
> levels(dat$final_purchase_amount)[c(8,9,11,12,13,15,16,17,18,19,20,21)]]
> <- "100 to 400"
> dat$final_purchase_amount[dat$final_purchase_amount %in%
> levels(dat$final_purchase_amount)[c(22,23,24,25,26,27,28,29,30,31,32)]]
> <- "401 to 1000"
> dat$final_purchase_amount[dat$final_purchase_amount %in%
> levels(dat$final_purchase_amount)[c(2,3,4,5,6,7,10,14)]] <- "1001 +"
> dat$final_purchase_amount <- factor(dat$final_purchase_amount)
> levels(dat$final_purchase_amount)
> table(dat$final_purchase_amount)
>
>
>
> However, this doesn't seem to produce any levels
Fortunately not! Your first instruction above sets the entire column to
NA, which discards both the original values and the factor levels. Every
later call to %in% then compares a vector of NAs against levels that no
longer exist, so nothing matches and nothing is assigned.
That first line makes everything else you do with
dat$final_purchase_amount useless. (I believe that line should be
deleted.)
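
A minimal sketch of what happens, and of one possible way to build the
buckets. The toy data frame and the new column name purchase_group are
made up for illustration, and using cut() on the numeric values is my
assumption about what you want, not something taken from your code. It
also assumes the factor labels are whole-number amounts, with "\\N"
standing for a missing value.

```r
# Toy stand-in for the real data (values are hypothetical)
dat <- data.frame(
  final_purchase_amount = factor(c("\\N", "150", "400", "650", "1082", "1109", "\\N"))
)

# What the first line of the original code does, shown on a copy
broken <- dat
broken$final_purchase_amount <- NA
levels(broken$final_purchase_amount)          # NULL: the column is now a plain NA vector
levels(factor(broken$final_purchase_amount))  # character(0), as in the original post

# One alternative: convert the labels to numbers and bucket them in a NEW column
amount <- suppressWarnings(
  as.numeric(as.character(dat$final_purchase_amount))  # "\\N" becomes NA
)
dat$purchase_group <- cut(
  amount,
  breaks = c(100, 400, 1000, Inf),
  labels = c("100 to 400", "401 to 1000", "1001 +"),
  include.lowest = TRUE  # so that exactly 100 falls in the first bucket
)

levels(dat$purchase_group)
table(dat$purchase_group, useNA = "ifany")
```

Writing to a new column keeps the original values available for
checking. Amounts below 100 and the "\\N" entries end up as NA in the
new column.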
Hope this helps,
Rui Barradas
> and returns the following.
>
>
>> levels(dat$final_purchase_amount)
> character(0)
>
>
> Can anyone point to what I'm doing wrong.
>
>
>
> Thanks!
>
>