[R] Looping through multiple sub elements of a list to compare to multiple components of a vector
debra ragland
ragland.debra at yahoo.com
Wed Dec 2 16:39:50 CET 2015
I think I am making this problem harder than it has to be and so I keep getting stuck on what might be a trivial problem.
I have used the seqinr package to load a protein sequence alignment containing 15 protein sequences;
> library(seqinr)
> x = read.alignment("proteins.fasta", format="fasta", forceToLower=FALSE)
This automatically loads in a list of 4 elements including the sequences and other information.
I store the sequences to a new list;
> mylist = x$seq
which returns a character vector of 15 strings.
I have found that if I split the long character strings into individual characters it is easy to use lapply to loop over this list. So I use strsplit;
> list.2 = strsplit(mylist, split = NULL)
From this list I can determine which proteins have changes at certain positions by using:
> lapply(list.2, "[", 10) == "L"
This returns a logical TRUE/FALSE vector showing which elements of the list do or do not have the letter L at position 10.
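As a self-contained illustration of the steps so far, here is a minimal sketch on made-up data. The three short sequences and the names toy, toy.split and has.L are hypothetical stand-ins for the real alignment, which is not shown here; sapply is used in place of lapply only so that the result is already a vector.

```r
# Hypothetical stand-in for x$seq: three short aligned sequences (10 residues each)
toy <- c(seqA = "PQITLWQRPL", seqB = "PQITLWQRPI", seqC = "PQVTLWKRPL")

# Split each string into single characters (one list element per sequence)
toy.split <- strsplit(toy, split = NULL)

# Extract the residue at position 10 from every sequence and compare it to "L"
has.L <- sapply(toy.split, "[", 10) == "L"
print(has.L)
```

Only seqB differs from "L" at position 10 in this toy data, so it is the only FALSE entry.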
Because each of the protein sequences contains 99 amino acids, I want to automate this process so that I do not have to compare positions one at a time. Most of the changes occur between positions 10 and 95. I have a standard character vector that I wish to use for comparison when looping through the list.
Should I perhaps combine everything (the standard "letter"/aa vector and the list of protein sequences) into one list? Or is it better to leave them separate for this comparison? I'm not sure what the output should be, as I need to use it for another statistical test. Would a list of logical vectors be the most suitable output to return?
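To make the question concrete, here is one possible shape for the comparison, sketched on made-up data. Everything in it is an assumption for illustration: the reference string, the three toy sequences, the names ref, seq.split, diff.list and diff.mat, and the position range 3:10 standing in for 10:95. It keeps the standard vector separate from the list and shows the result both as a list of logical vectors and as a logical matrix.

```r
# Hypothetical reference ("standard") sequence, split into single characters
ref <- strsplit("PQITLWQRPL", split = NULL)[[1]]

# Hypothetical stand-in for the 15 aligned protein sequences
seqs <- c(seqA = "PQITLWQRPL", seqB = "PQITLWQRPI", seqC = "PQVTLWKRPL")
seq.split <- strsplit(seqs, split = NULL)

# Positions to compare; with the real 99-residue data this might be 10:95
positions <- 3:10

# One logical vector per sequence: TRUE where the residue differs from the reference
diff.list <- lapply(seq.split, function(s) s[positions] != ref[positions])

# The same information as a logical matrix: one row per sequence, one column per position
diff.mat <- do.call(rbind, diff.list)
colnames(diff.mat) <- positions

print(diff.mat)
print(rowSums(diff.mat))  # number of changes per sequence
```

Whether the list or the matrix is more convenient probably depends on the downstream test; the matrix form makes per-position summaries (colSums) and per-sequence summaries (rowSums) easy to compute.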