[Bioc-devel] alphabetFrequency on AAString
Hervé Pagès
hpages at fhcrc.org
Fri Dec 20 09:32:06 CET 2013
Hi Michael,
On 11/12/2013 11:31 AM, Hervé Pagès wrote:
> Hi Michael,
>
> On 11/12/2013 10:27 AM, Michael Lawrence wrote:
>> Seems like the output could be more consistent with the behavior on
>> DNAStringSet, i.e., the counts could be named.
>>
>>> alphabetFrequency(AAString("CYGGAGTRQ"))
>>   [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>>  [38] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 3 0 0
>>  [75] 0 0 0 0 0 0 0 1 1 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>> [112] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>> [149] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>> [186] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>> [223] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>
> Right. There is actually no specific method for AAString objects. The
> more generic method for XString objects is being called here. I'll
> change this.
This is done in Biostrings 2.31.7:
> alphabetFrequency(x[[4]])
    A     R     N     D     C     Q     E     G     H     I     L     K     M
    3     3     4     3     1     2     3     2     4     4     3     2     1
    F     P     S     T     W     Y     V     U     O     B     Z     X     *
    1     1     4     2     1     3     2     0     0     0     0     0     0
    -     + other
    0     0     0
> alphabetFrequency(x)
      A R N D C Q E G H I L K M F P S T W Y V U O B Z X * - + other
 [1,] 0 2 1 3 5 0 0 2 1 1 0 2 2 1 1 1 1 2 0 0 0 0 0 0 0 0 0 0     0
 [2,] 3 1 1 2 0 0 0 0 1 2 2 3 1 0 0 3 1 0 2 0 0 0 0 0 0 0 0 0     0
 [3,] 1 2 3 3 2 4 0 2 4 3 0 1 3 4 4 5 0 2 3 1 0 0 0 0 0 0 0 0     0
 [4,] 3 3 4 3 1 2 3 2 4 4 3 2 1 1 1 4 2 1 3 2 0 0 0 0 0 0 0 0     0
 [5,] 1 2 1 1 2 2 1 1 0 1 1 2 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0     0
 [6,] 1 0 2 1 0 0 0 2 1 0 2 2 3 2 0 0 1 2 0 0 0 0 0 0 0 0 0 0     0
 [7,] 1 0 2 1 1 1 1 1 0 1 1 0 1 1 0 2 1 1 1 3 0 0 0 0 0 0 0 0     0
 [8,] 0 3 1 1 1 2 0 1 0 1 0 1 3 5 1 2 0 0 2 2 0 0 0 0 0 0 0 0     0
 [9,] 0 1 3 2 1 1 3 1 2 2 0 1 1 0 3 2 2 1 2 3 0 0 0 0 0 0 0 0     0
[10,] 0 0 0 1 0 1 2 1 3 3 0 2 2 1 1 2 3 5 3 1 0 0 0 0 0 0 0 0     0
The reason there is an "other" column is that the amino acid alphabet
is not enforced (yet), so an AAString can still contain letters outside it.
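
To illustrate why an "other" column is needed while the alphabet is not
enforced, here is a minimal base-R sketch. It does not use Biostrings and
is not how alphabetFrequency() is implemented; aa_letters and
count_letters() are hypothetical names made up for this example, and the
letter vector is simply copied from the column names in the output above.

```r
# Hypothetical alphabet, copied from the column names shown above
# (the trailing "other" entry is computed separately below).
aa_letters <- c("A", "R", "N", "D", "C", "Q", "E", "G", "H", "I", "L", "K",
                "M", "F", "P", "S", "T", "W", "Y", "V", "U", "O", "B", "Z",
                "X", "*", "-", "+")

# Illustrative helper, NOT a Biostrings function: count each alphabet
# letter in a plain character string and lump everything else into "other".
count_letters <- function(s, alphabet = aa_letters) {
  chars <- strsplit(s, "")[[1]]
  idx <- match(chars, alphabet)          # NA for letters outside the alphabet
  counts <- tabulate(idx[!is.na(idx)], nbins = length(alphabet))
  names(counts) <- alphabet
  c(counts, other = sum(is.na(idx)))
}

freq <- count_letters("CYGGAGTRQ")
freq[freq > 0]
# A R C Q G T Y
# 1 1 1 1 3 1 1

count_letters("AC#?")[["other"]]
# [1] 2
```

With an enforced alphabet the last call would instead be rejected when the
object is constructed, and the "other" count would always be zero.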
Cheers,
H.
>
> H.
>
>
>>
>> Thanks,
>> Michael
>>
>>
>
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319