[R] adding variable into dataframe by indice

Petr Pikal petr.pikal at precheza.cz
Thu Feb 9 11:23:17 CET 2006


Hi

I am not sure I understand correctly, but table() can be used:

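# count how often each age occurs (table() drops NA by default)
ttt <- table(asubs112$first_drink)
# convert the counts to proportions of all rows
prop <- ttt/nrow(asubs112)
# look up each row's proportion by its age
asubs112$prop <-
  prop[match(asubs112$first_drink, as.numeric(names(prop)))]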

> asubs112
   IND_ID rs1042364 first_drink      prop
2      11     (1,2)           7 0.3333333
5      41     (1,2)          11 0.1111111
6      51     (1,2)           7 0.3333333
7      61     (1,1)           7 0.3333333
8      71     (1,1)          12 0.2222222
10     91     (1,1)           6 0.1111111
11    101     (1,2)           5 0.1111111
12    111     (1,2)          13 0.1111111
14    131     (1,2)          12 0.2222222
> 
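
When first_drink contains NA, as in the original data, the same table()/match() idea could look roughly like the sketch below. The data frame `df` and its values are made up for illustration, and matching on the character names of the table (instead of as.numeric()) is just one possible variation, not what was posted above. table() drops NA by default and match() returns NA for a missing age, so those rows should simply end up with NA in prop.

```r
# Hypothetical toy data; IDs and ages are invented for this sketch.
df <- data.frame(
  IND_ID      = c(11, 41, 51, 61, 71, 91),
  first_drink = c(7, 11, NA, 7, 12, NA)
)

# Count each observed age; NA is not counted by default.
counts <- table(df$first_drink)

# Proportion of all rows, NA rows included in the denominator
# (the same denominator as nrow(asubs112) in the quoted loop).
prop <- as.numeric(counts) / nrow(df)

# Look up each row's proportion by age; a missing age matches nothing
# and therefore gets NA.
df$prop <- prop[match(as.character(df$first_drink), names(counts))]

print(df)
```

If the proportion should instead be relative to the non-missing ages only, the denominator could be sum(counts) rather than nrow(df).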

HTH
Petr


On 8 Feb 2006 at 9:40, Adrian Katschke wrote:

Date sent:      	Wed, 8 Feb 2006 09:40:43 -0800 (PST)
From:           	Adrian Katschke <adrian at atstatconsulting.com>
To:             	RHelp <r-help at stat.math.ethz.ch>
Subject:        	[R] adding variable into dataframe by indice

>   R-Helpers,
> 
>   I am trying to insert a value into a dataframe. This value is a
>   proportion calculated by counting the number of those individuals
>   with that value and then inserting the proportion at the end of the
>   dataframe to only those individuals with the given value. The
>   problem I am running into is that the proportions are not being
>   attached to only those individuals with the specified value for that
>   proportion. 
> 
>   Below is an example of the code that I am using. The data for the
>   dataframe is made up. It should give you an idea, but the original
>   has 'NA' in many rows. The output reported below comes from the
>   original data.
> 
>     #Read in Data
>   age.int <- data.frame(IND_ID = seq(1, 140, 10),
>                         rs1042364 = sample(c("(1,1)", "(1,2)", "(2,2)"), 14, replace = TRUE),
>                         first_drink = sample(5:17, 14, replace = TRUE))
> 
> 
> 
>     asubs112 <- subset(age.int, rs1042364 != "(2,2)")
> 
> 
>     ages112 <- sort(unique(na.omit(asubs112$first_drink)))
> 
>   for ( i in ages112) {
>     indce <- which(na.omit(asubs112$first_drink == i))
>     prop <- length(indce)/nrow(asubs112)
>     asubs112[indce,4] <- prop
>     asubs112[indce,]
>   }
> 
>   Below is the output that I get from the script above. Notice that
>   the first NA row gets a proportion but none of the other NA rows do.
>   I am not sure what I am doing wrong; any suggestions would be a big
>   help.
> 
>   TIA,
>   Adrian
> 
>    asubs112[1:50,]
>       IND_ID rs1042364 first_drink age_int          V5
> 4   10008007     (1,2)          NA      16 0.003891051
> 6   10013012     (1,2)          13      14 0.116731518
> 7   10015006     (1,2)          12      17 0.105058366
> 8   10015007     (1,1)          12      16 0.105058366
> 10  10021009     (1,2)          NA      15          NA
> 14  10039036     (1,2)          NA      15          NA
> 15  10039037     (1,2)          NA      13          NA
> 17  10045005     (1,2)          13      17 0.116731518
> 18  10045014     (1,2)          13      14 0.116731518
> 21  10055022     (1,2)          NA      15          NA
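
One thing in the quoted loop that can misplace values when NAs are present is which(na.omit(asubs112$first_drink == i)): na.omit() shortens the logical vector, so the positions returned by which() refer to the shortened vector and no longer to rows of the data frame. A small sketch with a made-up vector (`first_drink` here is hypothetical, not the original data) shows the shift. Whether this accounts for the exact output quoted above is not clear from the post.

```r
# Hypothetical ages with missing values, invented for this sketch.
first_drink <- c(NA, 13, 12, 12, NA, NA, NA, 13, 13, NA)

# na.omit() removes the NA entries first, so which() reports positions
# within the shortened vector, not the original row positions.
shifted <- which(na.omit(first_drink == 13))

# which() already ignores NA, so the comparison can be used directly
# and the positions stay aligned with the rows.
correct <- which(first_drink == 13)

print(shifted)  # 1 4 5
print(correct)  # 2 8 9
```

Inside the loop, using indce <- which(asubs112$first_drink == i) would keep the indices aligned with the rows.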




