[R] a problem: factors, names, tables ..

Adaikalavan Ramasamy ramasamy at cancer.org.uk
Sun Jul 18 16:19:14 CEST 2004


Please give a reproducible example. Here is one way:

# generate example
> v1 <- rep( c(0, 2, 10, 11, 13, 14, 15), c(15, 6, 1, 3, 8, 15, 10) )
> t1 <- table(v1)
> t1
v1
 0  2 10 11 13 14 15 
15  6  1  3  8 15 10 

> v2 <- rep( c(0, 1, 2, 10, 11, 12, 13, 14, 15),
+            c(817, 119, 524, 96, 700, 66, 559, 358, 283) )
> t2 <- table(v2)
> t2
v2
  0   1   2  10  11  12  13  14  15 
817 119 524  96 700  66 559 358 283 

# find results
> merge(t1, t2, by=1, all.x=TRUE)
  v1 Freq.x Freq.y
1  0     15    817
2 10      1     96
3 11      3    700
4 13      8    559
5 14     15    358
6 15     10    283
7  2      6    524
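
Note that the merged key column is sorted as character, which is why "2"
comes last. As a rough sketch of how one might get the ratios from this
merged data frame and put the rows in numeric order (the column names
`ratio` and `type` are my own additions for illustration):

```r
v1 <- rep(c(0, 2, 10, 11, 13, 14, 15), c(15, 6, 1, 3, 8, 15, 10))
v2 <- rep(c(0, 1, 2, 10, 11, 12, 13, 14, 15),
          c(817, 119, 524, 96, 700, 66, 559, 358, 283))

# merge() converts both tables to data frames and joins on the first column
m <- merge(table(v1), table(v2), by = 1, all.x = TRUE)

# relative frequency: count in t1 divided by total count in t2
m$ratio <- m$Freq.x / m$Freq.y

# the key is a factor; go through as.character() to recover the labels,
# because as.numeric() alone would return the internal level codes
m$type <- as.numeric(as.character(m[[1]]))

# reorder the rows by the numeric event type
m <- m[order(m$type), ]
print(m)
```

The as.numeric(as.character(...)) step is the usual way to turn factor
labels back into numbers.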

Uwe's suggestion may need a slight modification, as the two tables have
different labels/levels and hence are non-conformable for division:
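
To illustrate the point with the example data above, here is a small
sketch. The names `res` and `missing_types` are only for illustration, and
the error is caught with tryCatch() so that the snippet runs to the end:

```r
v1 <- rep(c(0, 2, 10, 11, 13, 14, 15), c(15, 6, 1, 3, 8, 15, 10))
v2 <- rep(c(0, 1, 2, 10, 11, 12, 13, 14, 15),
          c(817, 119, 524, 96, 700, 66, 559, 358, 283))
t1 <- table(v1)
t2 <- table(v2)

# t1 has 7 cells and t2 has 9, so elementwise division raises an error;
# keep the error message instead of stopping
res <- tryCatch(t1 / t2, error = function(e) conditionMessage(e))
print(res)

# event types counted in t2 but absent from t1
missing_types <- setdiff(names(t2), names(t1))
print(missing_types)
```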

> t2.f <- table( v2.f <- factor(v2) )
> t1.f <- table( v1.f <- factor(v1, levels=levels(v2.f)) )

> cbind( t1.f, t2.f, ratio=t1.f / t2.f )
   t1.f t2.f       ratio
0    15  817 0.018359853
1     0  119 0.000000000
2     6  524 0.011450382
10    1   96 0.010416667
11    3  700 0.004285714
12    0   66 0.000000000
13    8  559 0.014311270
14   15  358 0.041899441
15   10  283 0.035335689
> 
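
If only the event types present in the smaller table are wanted, the
"reverse lookup" asked about in the original post can be done by indexing
the larger table with the names of the smaller one. A minimal sketch,
assuming every type in t1 also occurs in t2 (the names `num`, `den` and
`ratio` are mine):

```r
v1 <- rep(c(0, 2, 10, 11, 13, 14, 15), c(15, 6, 1, 3, 8, 15, 10))
v2 <- rep(c(0, 1, 2, 10, 11, 12, 13, 14, 15),
          c(817, 119, 524, 96, 700, 66, 559, 358, 283))
t1 <- table(v1)
t2 <- table(v2)

# look up the totals by label (character index), not by position
num <- t1
den <- t2[names(t1)]

# divide as plain vectors and keep the event types as names
ratio <- setNames(as.vector(num) / as.vector(den), names(num))
print(ratio)
```

This avoids the explicit loop entirely, since the character index does the
matching of type labels.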

Also have a look at this related posting:
http://tolstoy.newcastle.edu.au/R/help/04/06/0594.html

Regards, Adai.


On Sun, 2004-07-18 at 13:05, Uwe Ligges wrote:
> PvR wrote:
> > Hi all,
> > 
> > I am *completely* lost in trying to solve a relatively simple task.
> > 
> > I want to compute the relative number of occurrences of an event, the
> > data of which sits in a large table (read from file).
> > 
> > I have the occurrences of the events in a table 'tt':
> > 
> >  0  2 10 11 13 14 15
> > 15  6  1  3  8 15 10
> > 
> > .. meaning that event of type '0' occurs 15 times, type '2' occurs 6
> > times etc.
> > 
> > Now I want to divide the occurrence counts by the total number of events
> > of that type, which is given in the table tt2:
> > 
> >   0   1   2  10  11  12  13  14  15
> > 817 119 524  96 700  66 559 358 283
> > 
> > Saying that event type '0' occurs 817 times, type '1' occurs 119
> > times etc.
> > 
> > The obvious problem is that not all events in tt2 are present in tt, 
> > which  is the result of the experiment so that cannot be changed.
> > 
> > What needs to be done is loop over tt, take the occurrence count, and
> > divide it by the corresponding count in tt2.  This corresponding
> > tt2 count is *not* at the same index in tt2, so I need a reverse lookup
> > of the type number.  For example:
> > 
> > event type 10:
> > occurs 1 time (from table tt)
> > occurs 96 times in total (from table tt2)  <- this is found by looking 
> > up  type '10' in tt2 and reading out 96
> > 
> > result: 1/96
> > 
> > 
> > 
> > I have tried programming this as follows:
> 
> 
> It's *much* easier. Just make V32 a factor. After that, table() knows 
> all the levels and counts also the zeros:
> 
> V32 <- factor(V32)
> table(V32[V48 == 0]) / table(V32)
> 
> Uwe Ligges
> 
> 
> 
> 
> > 
> > tt <- table(V32[V48 == 0]) # this is taking the events I want counted
> > tt2 <- table(V32) # this is taking the total event count per type
> > df <- as.data.frame(tt) #convert to dataframe to allow access to type-numbers .. ?
> > df2 <-  as.data.frame(tt2) #same here
> > 
> > print(tt);
> > print(df);
> > 
> > print(tt2);
> > print(df2);
> > 
> > for( i in 1:length(tt) ) { #loop over smallest table tt
> >     print("i:"); #index
> >     print(i);
> >     print( "numerator "); #corresponds to the "1" in the example
> >     print(     df$Freq[i] );
> >     denomtag = ( df$Var1[ i ] );    # corresponds to the "10" in the example, being the type number of the event
> >     print("denomtag ");
> >     print( denomtag );
> >     print( "denominator: " );
> >     print( df2[2][ df[1] == as.numeric(denomtag) ] );  #this fails ....
> >     #result would then be something like :  numerator / denominator
> > }
> > 
> > The problem is that the factor names that are extracted in 'denomtag'
> > are not usable as an index in the dataframe in the last line.  I have
> > tried converting to numeric using 'as.numeric', but that fails since
> > this returns the index in the factor rather than the factor name I need
> > from the list.
> > 
> > Any suggestions .. ?   I am sure it's dead simple, as always.
> > 
> > 
> > Thanks,
> > 
> > 
> > Piet (Belgium)
> > 
> > PS: please reply to pvremortNOSPAM at vub.ac.be
> > 
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide! 
> > http://www.R-project.org/posting-guide.html
> 