[R] vectorization

james.holtman at convergys.com
Fri Jun 17 20:41:53 CEST 2005





try this:

> x.1 <- data.frame(income=runif(100)*10000,
+                   educ=sample(c('hs','col','none'),100,TRUE))
> x.1
        income educ
1   5930.30882  col
2   5528.83222   hs
3   5967.04041   hs
4   3926.30682   hs
5   2603.75924 none
...........
> x.2 <- tapply(x.1$income, x.1$educ, mean)
> x.2
     col       hs     none
5575.310 4994.921 5481.962
> x.1$median <- x.2[x.1$educ]
> x.1
        income educ   median
1   5930.30882  col 5575.310
2   5528.83222   hs 4994.921
3   5967.04041   hs 4994.921
4   3926.30682   hs 4994.921
5   2603.75924 none 5481.962
6   7398.83325  col 5575.310
7    265.06895   hs 4994.921
.........
>
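
Note that the session above calls mean inside tapply, although the new column is named median. For the per-group median asked about in the question, a minimal sketch might look like the following. The data here are made up (the column names mydata, income, education and md follow the question; md2 and the seed are only for illustration), and the lookup goes through as.character() so that it is done by level name rather than by the factor's integer codes.

```r
# Hypothetical data mirroring the question: one numeric column, one factor.
set.seed(1)
mydata <- data.frame(
  income    = runif(100) * 10000,
  education = factor(sample(c("hs", "col", "none"), 100, replace = TRUE))
)

# One median per education level; the result is named by level.
med <- tapply(mydata$income, mydata$education, median, na.rm = TRUE)

# Expand back to one value per row by looking up each row's level name.
# as.vector() drops the names and dim attribute left over from tapply().
mydata$md <- as.vector(med[as.character(mydata$education)])

# Alternative in one step: ave() applies FUN within each group and
# returns a vector already aligned with the rows of the data frame.
mydata$md2 <- ave(mydata$income, mydata$education,
                  FUN = function(v) median(v, na.rm = TRUE))
```

Both columns should agree; ave() is shorter, while the tapply() version keeps the small per-level table around if it is needed elsewhere.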

Jim
__________________________________________________________
James Holtman        "What is the problem you are trying to solve?"
Executive Technical Consultant  --  Convergys Labs
james.holtman at convergys.com
+1 (513) 723-2929



"Dimitri Joe" <dimitrijoe at yahoo.com.br>
Sent by:  r-help-bounces at stat.math.ethz.ch
06/17/2005 14:00

To:       "R-Help" <r-help at stat.math.ethz.ch>
cc:
Subject:  [R] vectorization





Hi there,

I have a data frame (mydata) with 1 numeric variable (income) and 1 factor
(education). I want a new column in this data frame with the median income
for each education level. An obviously inefficient way to do this is

for (k in 1:nrow(mydata)) {
    l <- mydata$education[k]
    mydata$md[k] <- median(mydata$income[mydata$education == l], na.rm = TRUE)
}

Since mydata has nearly 30,000 rows, this will not finish until the end
of this month. I therefore need some help vectorizing this, please.

Thanks,

Dimitri

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html



