[R] (no subject)

Thu Sep 20 18:10:36 CEST 2012

Maybe it is longer, but it's also more general: it issues an error if 
the tables are not one-dimensional. That's where most of the function's 
extra lines are. Otherwise it's the same as your first solution. The 
second one has the problem you've mentioned.
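
To make the dimension check concrete, here is a minimal sketch. It 
assumes the addTables() function with the error message built from 
length(dn1); the two-way table tab2 is made up purely to trigger the 
error branch and is not part of the original question.

```r
# Sketch: sum two one-dimensional frequency tables by name;
# a table with more than one dimension raises an error.
addTables <- function(t1, t2){
    dn1 <- dimnames(t1)
    dn2 <- dimnames(t2)
    if(length(dn1) != 1)
        stop(paste("table with", length(dn1), "dimensions is not implemented."))
    dn1 <- unlist(dn1)
    dn2 <- unlist(dn2)
    dns <- sort(unique(c(dn1, dn2)))            # all types seen in either table
    tsum <- array(integer(length(dns)), dim = length(dns))
    dimnames(tsum) <- list(dns)
    tsum[dn1] <- t1                             # counts from the first table
    tsum[dn2] <- tsum[dn2] + t2                 # add counts from the second
    tsum
}

ta <- table(c("d", "d", "j", "f", "e", "g", "f", "f", "i", "g"))
tb <- table(c("a", "g", "d", "f", "g", "a", "f", "a", "b", "g"))
res <- addTables(ta, tb)
print(res)

# Hypothetical two-way table, used only to show the error message.
tab2 <- table(c("x", "y", "x"), c("u", "u", "v"))
msg <- tryCatch(addTables(tab2, tab2),
                error = function(e) conditionMessage(e))
print(msg)
```

The 1-dim case returns the merged counts; the 2-dim case stops with a 
message instead of silently returning something wrong.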

Rui Barradas
Em 20-09-2012 16:46, Stefan Th. Gries escreveu:
> Yes, but this is way longer than any of the three solutions I sent, is it not?
> STG
> --
> Stefan Th. Gries
> -----------------------------------------------
> University of California, Santa Barbara
> http://www.linguistics.ucsb.edu/faculty/stgries
> -----------------------------------------------
>
>
> On Thu, Sep 20, 2012 at 8:43 AM, Rui Barradas <ruipbarradas at sapo.pt> wrote:
>> Hello,
>>
>> The trick is to use the table's dimnames attributes. Try the following.
>>
>> addTables <- function(t1, t2){
>>      dn1 <- dimnames(t1)
>>      dn2 <- dimnames(t2)
>>      if(length(dn1) == 1){
>>          dn1 <- unlist(dn1)
>>          dn2 <- unlist(dn2)
>>          dns <- sort(unique(c(dn1, dn2)))
>>          tsum <- array(integer(length(dns)), dim = length(dns))
>>          dimnames(tsum) <- list(dns)
>>          tsum[dn1] <- t1
>>          tsum[dn2] <- tsum[dn2] + t2
>>      }else
>>          stop(paste("table with", length(dn1), "dimensions is not implemented."))
>>      tsum
>> }
>>
>>
>> a <- c("d", "d", "j", "f", "e", "g", "f", "f", "i", "g")
>> b <- c("a", "g", "d", "f", "g", "a", "f", "a", "b", "g")
>> ta <- table(a)
>> tb <- table(b)
>> rm(a, b)
>>
>> addTables(ta, tb)
>>
>> Hope this helps,
>>
>> Rui Barradas
>> Em 20-09-2012 15:57, Stefan Th. Gries escreveu:
>>> From my book on corpus linguistics with R:
>>>
>>> # (10)   Imagine you have two vectors a and b such that
>>> a<-c("d", "d", "j", "f", "e", "g", "f", "f", "i", "g")
>>> b<-c("a", "g", "d", "f", "g", "a", "f", "a", "b", "g")
>>>
>>> # Of these vectors, you can create frequency lists by writing
>>> freq.list.a<-table(a); freq.list.b<-table(b)
>>>
>>> # How do you merge these two frequency lists without merging the two
>>> # vectors first? More specifically, if I delete a and b from your
>>> # memory,
>>> rm(a); rm(b)
>>> # how do you generate the following table only from freq.list.a and
>>> # freq.list.b, i.e., without any reference to a and b themselves? Before
>>> # you complain about this question as being unrealistic, consider the
>>> # possibility that you generated the frequency lists of two corpora
>>> # (here, a and b) that are so large that you cannot combine them into
>>> # one (a.and.b<-c(a, b)) and generate a frequency list of that combined
>>> # vector (table(a.and.b)) ...
>>> joint.freqs
>>> a b d e f g i j
>>> 3 1 3 1 5 5 1 1
>>>
>>> # You generate an empty vector joint.freqs (i) that is as long as there
>>> # are different types in both a and b (but note that, as requested, this
>>> # information is not taken from a or b, but from their frequency lists) ...
>>> joint.freqs<-vector(length=length(sort(unique(c(names(freq.list.a),
>>>    names(freq.list.b))))))
>>> # ... and (ii) whose elements have these different types as names.
>>> names(joint.freqs)<-sort(unique(c(names(freq.list.a),
>>>    names(freq.list.b))))
>>> # The elements of the new vector joint.freqs that have the same names as
>>> # the frequencies in the first frequency list are assigned the respective
>>> # frequencies.
>>> joint.freqs[names(freq.list.a)]<-freq.list.a
>>>
>>> # The elements of the new vector joint.freqs that have the same names
>>> # as the frequencies in the second frequency list are assigned the sum
>>> # of the values they already have (either the ones from the first
>>> # frequency list or just zeroes) and the respective frequencies.
>>> joint.freqs[names(freq.list.b)]<-joint.freqs[names(freq.list.b)]+freq.list.b
>>> joint.freqs # look at the result
>>>
>>> # Another shorter and more elegant solution was proposed by Claire
>>> # Crawford (but uses a function which will only be introduced later in
>>> # the book)
>>> # First the two frequency lists are merged into a single vector ...
>>> freq.list.a.b<-c(freq.list.a, freq.list.b)
>>> # ... and then the sums of all numbers that share the same names
>>> # are computed
>>> joint.freqs<-as.table(tapply(freq.list.a.b, names(freq.list.a.b), sum))
>>> joint.freqs # look at the result
>>>
>>> # The shortest, but certainly not memory-efficient way to do this
>>> # involves just using the frequency lists to create one big vector with
>>> # all elements and tabulating that (kind of cheating but possible with
>>> # short vectors ...).
>>> table(c(rep(names(freq.list.a), freq.list.a), rep(names(freq.list.b),
>>>    freq.list.b)))
>>>
>>> HTH,
>>> STG
>>> --
>>> Stefan Th. Gries
>>> -----------------------------------------------
>>> University of California, Santa Barbara
>>> http://www.linguistics.ucsb.edu/faculty/stgries
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
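
For anyone who wants to check the quoted approaches against each other, 
here is a small sketch. It rebuilds the two frequency lists from the 
example data and compares the tapply() route with the rep() route. The 
names by.tapply and by.rep are only illustrative, and as.vector() around 
the counts is my own precaution, not part of the quoted code.

```r
# Rebuild the two frequency lists from the example data.
freq.list.a <- table(c("d", "d", "j", "f", "e", "g", "f", "f", "i", "g"))
freq.list.b <- table(c("a", "g", "d", "f", "g", "a", "f", "a", "b", "g"))

# Route 1: concatenate the counts and sum those that share a name.
both <- c(freq.list.a, freq.list.b)
by.tapply <- tapply(both, names(both), sum)

# Route 2: expand the counts back into one long vector and tabulate it
# (simple, but memory-hungry for large frequency lists).
by.rep <- table(c(rep(names(freq.list.a), as.vector(freq.list.a)),
                  rep(names(freq.list.b), as.vector(freq.list.b))))

# Both routes should give the same types and the same counts.
same.names  <- identical(names(by.tapply), names(by.rep))
same.counts <- all(as.vector(by.tapply) == as.vector(by.rep))
print(c(same.names, same.counts))
```

Both routes work from the frequency lists alone, so the check also holds 
after a and b have been removed.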