[R] Comparison of vectors in a matrix

Tony Plate tplate at acm.org
Wed Nov 11 19:57:03 CET 2009


This is a tricky data entry problem.  The right technique will depend on the fine details of the data, and it's not clear what those are.  E.g., when you say "In my first column, for example, I have "henry" ", it's unclear to me whether the double quotes are part of the data, which is why it's nice to provide reproducible examples.

But, if you do have quoted strings in your data fields as they exist in an R matrix, you can do something like the following:

> # each element of the matrix x contains one or more quoted strings, separated by commas
> x <- matrix(c('"a", "b"', '"c"', '"b"', '"d"'), ncol=2, dimnames=list(c("row1", "row2"), c("X","Y")))
> x
     X              Y      
row1 "\"a\", \"b\"" "\"b\""
row2 "\"c\""        "\"d\""
> # use R's parsing and evaluation to turn '"a", "b"' into c("a", "b"), and turn that
> # into a matrix containing character vectors of various lengths.
> matrix(lapply(parse(text=paste("c(", x, ")")), eval), ncol=ncol(x), dimnames=dimnames(x))
     X           Y  
row1 Character,2 "b"
row2 "c"         "d"
> 
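If you would rather not run parse() and eval() on text read from a file, the same list-matrix can be built by plain string splitting. This is only a sketch under assumptions: the helper name split_names is made up for illustration, and it assumes that no name contains a comma or a double quote of its own.

```r
# Same example matrix as above: each cell holds one or more quoted names,
# separated by commas.
x <- matrix(c('"a", "b"', '"c"', '"b"', '"d"'), ncol = 2,
            dimnames = list(c("row1", "row2"), c("X", "Y")))

# split_names is a hypothetical helper (not from the original post).
# Assumption: names never contain commas or double quotes.
split_names <- function(s) {
  parts <- strsplit(s, ",", fixed = TRUE)[[1]]   # split on commas
  parts <- gsub('"', "", parts, fixed = TRUE)    # drop the double quotes
  trimws(parts)                                  # drop surrounding spaces
}

# lapply() visits the cells in column-major order, so matrix() puts each
# result back in the cell it came from.
y <- matrix(lapply(x, split_names), ncol = ncol(x), dimnames = dimnames(x))

y[["row1", "X"]]   # a character vector of length 2: "a" "b"
```

Unlike the parse/eval version, this never executes the cell contents as R code, which may matter if the text file comes from somewhere you don't control.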

- Tony Plate
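
P.S. Once the cells hold character vectors, the set difference asked for earlier in the thread (quoted below) can be added as a third column. The sketch below uses Map() rather than apply(), because apply() simplifies its result to a vector or matrix whenever every row returns the same number of elements, while Map() always returns a list. The data here are invented for illustration and are not taken from ffoexample.txt.

```r
# Made-up example data: a 3 x 2 list-matrix whose cells are character vectors.
# matrix() fills column by column, so the first three elements form column A.
x <- matrix(list(c("marc", "robert", "marie"), "henry", "mary",
                 c("marc", "marie"), c("mary", "ruth"), c("mary", "joseph")),
            ncol = 2, dimnames = list(c("r1", "r2", "r3"), c("A", "B")))

# x[, "A"] and x[, "B"] are lists; Map() pairs them up row by row and
# always returns a list, whatever the lengths of the results.
diffs <- Map(setdiff, x[, "A"], x[, "B"])

# Bind the list on as a new column; the result is still a list-matrix.
y <- cbind(x, "A-B" = diffs)

y[["r1", "A-B"]]   # names in A but not in B for row r1
```

A row where everything in A also appears in B (r3 here) ends up with a zero-length character vector in the new column.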

esterhazy wrote:
> Yes, thanks for this, this is exactly what I want to do.
> 
> However, I have a remaining problem which is how to get R to understand that
> each entry in my matrix is a vector of names.
> 
> I have been trying to import my text file with the names in each vector of
> names enclosed in quotes and separated by commas, or separated by spaces, or
> without quotes, etc, with no luck. 
> 
> Every time, R seems to consider the vector of names as just one long name.
> 
> In my first column, for example, I have "henry", in the second, "mary",
> "ruth", and in the third "mary", "joseph", and I have no idea how to get R
> to see that "mary", "ruth", for example, is composed of two strings of text,
> rather than just one.
> 
> Thanks for any further help!
> 
> http://old.nabble.com/file/p26305756/ffoexample.txt ffoexample.txt 
> 
> Tony Plate wrote:
>> Nice problem!
>>
>> If I understand you correctly, here's how to do it (with list-based
>> matrices):
>>
>>> set.seed(1)
>>> (x <- matrix(lapply(rpois(10,2)+1, function(k) sample(letters[1:10], size=k)),
>>+              ncol=2, dimnames=list(1:5,c("A","B"))))
>>   A           B          
>> 1 Character,2 Character,5
>> 2 Character,2 Character,5
>> 3 Character,3 Character,3
>> 4 Character,5 Character,3
>> 5 Character,2 "i"        
>>> x[1,1]
>> [[1]]
>> [1] "c" "b"
>>
>>> x[1,2]
>> [[1]]
>> [1] "c" "d" "a" "j" "f"
>>
>>> (y <- cbind(x, "A-B"=apply(x, 1, function(ab) setdiff(ab[[1]], ab[[2]]))))
>>   A           B           A-B        
>> 1 Character,2 Character,5 "b"        
>> 2 Character,2 Character,5 "g"        
>> 3 Character,3 Character,3 Character,3
>> 4 Character,5 Character,3 Character,2
>> 5 Character,2 "i"         Character,2
>>> y[1,3]
>> [[1]]
>> [1] "b"
>>
>> -- Tony Plate
>>
>> esterhazy wrote:
>>> Hi,
>>>
>>> I have a matrix with two columns, and the elements of the matrix are
>>> vectors.
>>>
>>> So for example, in line 3 of column 1 I have a vector
>>> v31=("marc", "robert", "marie").
>>>
>>> What I need to do is to compare all vectors in column 1 and 2, so as to
>>> get, for example, setdiff(v31,v32) into a new column.
>>>
>>> Is there a way to do this in R?
>>>
>>> Thanks!