[R] id & filter problems in data.frame

Christian Schulz c.schulz at metrinomics.de
Tue May 28 16:01:14 CEST 2002


Many thanks - this will be a big step forward in my learning to program in R.
Christian



Renaud Lancelot wrote:

>>(2.problem) example:
>>id   filterCriteria    ratingOfSatisfaction    ProductType
>>1    Man
>>1                      60                      A
>>1                      40                      B
>>3    Women
>>3                      20                      A
>>5    Man
>>5                      40                      A
>>5                      100                     B
>>5                      80                      C
>>
>>I know that's not a perfect database model.
>>
>
>Sure !
>
>>But the dataset is much longer, and now I have a problem:
>>how to filter ratingOfSatisfaction by gender.
>>
>>Is it possible to write a function in the really flexible R that
>>automatically fills in the filterCriteria (copies the value from the first
>>row of each id down to the rows below it) until a new id starts, and so on?
>>
>
>The following assumes the file is perfect (no missing values). However,
>it should give you an idea of what is possible. I have copied and pasted
>the example above into a file called "file.txt":
>
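>>ProcessFile <- function(file){
>>
>+   Line <- readLines(file)
>+   # drop blank lines (e.g. a trailing empty line) before parsing
>+   Line <- Line[grep("[^ ]", Line)]
>+   # split each line into its non-empty, space-separated fields
>+   Line <- tapply(X = seq(along = Line),
>+                  INDEX = seq(along = Line),
>+                  FUN = function(x, Line){
>+                    vec <- unlist(strsplit(x = Line[x], split = " "))
>+                    vec <- vec[vec != ""]
>+                    vec}, Line)
>+   i <- 2; j <- 0          # line 1 is the header
>+   List <- list()
>+   while(i <= length(Line)){
>+     # an id line: remember the id and its filter criterion
>+     xid <- Line[[i]][1]
>+     xCrit <- Line[[i]][2]
>+     i <- i + 1
>+     # the following lines with the same id carry the ratings
>+     while(i <= length(Line) && Line[[i]][1] == xid){
>+       j <- j + 1
>+       List[[j]] <- data.frame(id = xid, Crit = xCrit,
>+         Sat = as.numeric(Line[[i]][2]), Type = Line[[i]][3])
>+       i <- i + 1
>+       }
>+     }
>+     do.call("rbind", List)
>+   }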
>
>>test <- ProcessFile(file = "d:\\analyses\\travail\\file.txt")
>>test
>>
>   id  Crit Sat Type
>1   1   Man  60    A
>11  1   Man  40    B
>12  3 Women  20    A
>13  5   Man  40    A
>14  5   Man 100    B
>15  5   Man  80    C
>
>Then:
>
>>tapply(test$Sat, test$Crit, mean)
>>
>  Man Women 
>   64    20 
>
>>tapply(X = test$Sat, test$Crit, table)
>>
>$Man
>
> 40  60  80 100 
>  2   1   1   1 
>
>$Women
>
>20 
> 1 
>
>>tapply(as.factor(test$Sat), test$Crit, table)
>>
>$Man
>
> 20  40  60  80 100 
>  0   2   1   1   1 
>
>$Women
>
> 20  40  60  80 100 
>  1   0   0   0   0 
>
>etc.
>
>Hope this helps,
>
>Renaud
>
>

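For comparison, the same fill-down can be sketched without explicit loops. This is only a rough alternative to the quoted ProcessFile, not a replacement for it. It assumes a recent R (trimws and lengths are not available in older versions), that every id line has exactly two fields and every rating line exactly three, and that no field contains spaces. The names txt, fields, is_head, group, crit and filled are made up for this illustration, and the example data is embedded as a character vector instead of being read from "file.txt".

```r
# The example from the post, embedded so that the sketch is self-contained.
txt <- c(
  "id   filterCriteria    ratingOfSatisfaction    ProductType",
  "1    Man",
  "1                      60                      A",
  "1                      40                      B",
  "3    Women",
  "3                      20                      A",
  "5    Man",
  "5                      40                      A",
  "5                      100                     B",
  "5                      80                      C"
)

# Split every data line (header dropped) on runs of whitespace.
fields <- strsplit(trimws(txt[-1]), "[[:space:]]+")

# Assumption: two fields = id line with the criterion, three fields = rating line.
is_head <- lengths(fields) == 2

# Running count of id lines: each rating line belongs to the id line above it.
group <- cumsum(is_head)

# One criterion per id line, in file order.
crit <- vapply(fields[is_head], `[`, character(1), 2)

rating <- fields[!is_head]
filled <- data.frame(
  id   = vapply(rating, `[`, character(1), 1),
  Crit = crit[group[!is_head]],            # criterion copied down to each rating row
  Sat  = as.numeric(vapply(rating, `[`, character(1), 2)),
  Type = vapply(rating, `[`, character(1), 3),
  stringsAsFactors = FALSE
)

# Filtering by gender is then ordinary row indexing.
men <- filled[filled$Crit == "Man", ]

# Mean satisfaction per criterion, as in the quoted reply.
mean_by_crit <- tapply(filled$Sat, filled$Crit, mean)
```

Here cumsum does the "autocount": it stays constant within an id block and increases at each new id line, so indexing crit by it copies the criterion down to every rating row of that block.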