[R] a problem: factors, names, tables ..
PvR
pvremort at vub.ac.be
Sun Jul 18 13:17:42 CEST 2004
Hi all,
I am *completely* lost in trying to solve a relatively simple task.
I want to compute the relative number of occurrences of an event, the data
for which sit in a large table (read from a file).
I have the occurrences of the events in a table 'tt':
  0   2  10  11  13  14  15
 15   6   1   3   8  15  10
.. meaning that event type '0' occurs 15 times, type '2' occurs 6 times,
etc.
Now I want to divide the occurrence counts by the total number of events of
that type, which is given in the table 'tt2':
  0   1   2  10  11  12  13  14  15
817 119 524  96 700  66 559 358 283
That is, event type '0' occurs 817 times in total, type '1' occurs 119
times, etc.
The obvious problem is that not all event types in tt2 are present in tt;
this is a result of the experiment, so it cannot be changed.
What needs to be done is to loop over tt, take the occurrence count, and
divide it by the corresponding count in tt2. This corresponding tt2
count is *not* at the same index in tt2, so I need a reverse lookup of the
type number. For example:
event type 10:
  occurs 1 time (from table tt)
  occurs 96 times in total (from table tt2) <- this is found by looking up
    type '10' in tt2 and reading out 96
  result: 1/96
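
A minimal sketch of the reverse lookup described above, assuming the counts can be indexed by their type names. The two named vectors are re-typed by hand from the tables shown and only stand in for the real table() results; 'totals' and 'ratio' are names made up for this illustration.

```r
# Counts re-typed from the two tables above (illustrative stand-ins for
# the real table() results).
tt  <- c("0" = 15, "2" = 6, "10" = 1, "11" = 3, "13" = 8, "14" = 15, "15" = 10)
tt2 <- c("0" = 817, "1" = 119, "2" = 524, "10" = 96, "11" = 700,
         "12" = 66, "13" = 559, "14" = 358, "15" = 283)

# Index tt2 by the type names found in tt, not by position, so that
# type "10" picks up 96 even though it sits at a different index.
totals <- tt2[names(tt)]

# Element-wise ratio, one entry per type present in tt.
ratio <- tt / totals
print(ratio["10"])   # 1/96
```

Indexing by name avoids the explicit loop, and types that appear only in tt2 (here '1' and '12') are simply never asked for.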
I have tried programming this as follows:
tt <- table(V32[V48 == 0])  # this is taking the events I want counted
tt2 <- table(V32)           # this is taking the total event count per type
df <- as.data.frame(tt)     # convert to data frame to allow access to type numbers .. ?
df2 <- as.data.frame(tt2)   # same here
print(tt);
print(df);
print(tt2);
print(df2);
for (i in 1:length(tt)) {  # loop over the smaller table tt
    print("i:")  # index
    print(i)
    print("numerator ")  # corresponds to the "1" in the example
    print(df$Freq[i])
    # corresponds to the "10" in the example, being the type number of the event
    denomtag <- df$Var1[i]
    print("denomtag ")
    print(denomtag)
    print("denominator: ")
    print(df2[2][df[1] == as.numeric(denomtag)])  # this fails ....
    # the result would then be something like: numerator / denominator
}
The problem is that the factor levels extracted in 'denomtag' are not
usable as an index into the data frame in the last line. I have tried
converting to numeric using 'as.numeric', but that fails since it
returns the index of the level within the factor rather than the level
name I need from the list.
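
To make the code-versus-label distinction concrete, here is a small sketch. The two data frames are hypothetical, cut-down stand-ins for df and df2 (a factor column Var1 and a count column Freq, as as.data.frame() gives for a table), and 'code', 'label', 'value', 'total' and 'ratio' are names invented for the illustration.

```r
# Hypothetical, shortened stand-ins for df and df2.
types  <- c("0", "2", "10")
types2 <- c("0", "1", "2", "10")
df  <- data.frame(Var1 = factor(types,  levels = types),
                  Freq = c(15, 6, 1))
df2 <- data.frame(Var1 = factor(types2, levels = types2),
                  Freq = c(817, 119, 524, 96))

denomtag <- df$Var1[3]                       # the factor value labelled "10"

code  <- as.numeric(denomtag)                # 3: position among the levels
label <- as.character(denomtag)              # "10": the level label itself
value <- as.numeric(as.character(denomtag))  # 10: the label as a number

# Look up the total by comparing labels rather than internal codes.
total <- df2$Freq[as.character(df2$Var1) == label]   # 96
ratio <- df$Freq[3] / total                          # 1/96
```

Going through as.character() first is what recovers the label; as.numeric() applied directly to a factor only ever sees the internal integer codes.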
Any suggestions .. ? I am sure it's dead simple, as always.
Thanks,
Piet (Belgium)
PS: please reply to pvremortNOSPAM at vub.ac.be