[R] More than on loop??

che fadialnaji at live.com
Sat Jan 30 21:38:52 CET 2010


Here are the written instructions as I managed to get them from my
professor; the graphs and data are attached:

The graph below shows an example of the expected outcome of this
coursework. You may produce a better one. The graph for analysing the
motifs of a set of peptides is designed this way:

• the graph is composed of columns of coloured rectangles

• each column corresponds to a residue from “N4” to “C4”. Note that the
eight residues are denoted by “N4”, “N3”, “N2”, “N1”, “C1”, “C2”, “C3”,
“C4”. “N4” means the 4th flanking residue of a cleavage site on the
N-terminal side and “C3” means the 3rd flanking residue of a cleavage
site on the C-terminal side. The cleavage occurs between “N1” and “C1”.

• there are 20 rectangles in each column, corresponding to the 20 amino
acids. The rectangle of an amino acid is taller if that amino acid
occurs more frequently at the residue, for instance, the rectangle of
“S” in the first column for the cleaved peptides.

• the letter of an amino acid is printed within its rectangle. Its font
size depends on the frequency of the amino acid at that residue.

In your package, you need to have the following functions
1. set a colour map using the following or your own design
• colmap<-c("#FFFFFF", "#FFFFCC", "#FFFF99", "#FFFF66", "#FFFF33",
"#FFFF00", "#FFCCFF", "#FFCCCC", "#FFCC99", "#FFCC66", "#FFCC33",
"#FFCC00", "#FF99FF", "#FF99CC", "#FF9999", "#FF9966", "#FF9933",
"#FF9900", "#FF33FF", "#FF33CC")
2. define a set of amino acids using string or other format if you want
• amino.acid<-"ACDEFGHIKLMNPQRSTVWY"

3. read in the given peptide data (“hiv.dat”) using
read.table("../data/hiv.dat",header=TRUE)
• The data I sent to you should not be saved in the same directory where
you save your R code!
• The data is composed of two parts, cleaved (denoted by “cleaved”) and
non-cleaved (denoted by “noncleaved”). The first five lines of the data
are shown below
Peptide Label
TQIMFETF cleaved
GQVNYEEF cleaved
KVFGRCEL noncleaved
VFGRCELA noncleaved
• to access the ith peptide, you can use X$Peptide[i]
• to access the ith label, you can use X$Label[i]
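
A minimal sketch of step 3, in case it helps. It assumes only the four
peptides quoted above and reads them from an inline string instead of
"../data/hiv.dat", so it runs without the attachment; sample.text is a
name made up for this illustration.

```r
# Stand-in for "../data/hiv.dat": only the four peptides quoted above.
sample.text <- "Peptide Label
TQIMFETF cleaved
GQVNYEEF cleaved
KVFGRCEL noncleaved
VFGRCELA noncleaved"

# With the real file this would be:
#   X <- read.table("../data/hiv.dat", header = TRUE)
# stringsAsFactors = FALSE keeps the peptides as plain character strings.
X <- read.table(text = sample.text, header = TRUE, stringsAsFactors = FALSE)

i <- 3
X$Peptide[i]   # the ith peptide
X$Label[i]     # the ith label
```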

4. detect the number of cleaved peptides and the number of non-cleaved
peptides using
• nrow(X)
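
One possible reading of step 4, sketched under the assumption that the
data frame is first split by label and nrow is then applied to each
part. X.pos, X.neg, num.pos and num.neg are names invented here, and
the data frame is typed in by hand from the four sample peptides.

```r
# Toy data frame built from the four sample peptides shown in step 3.
X <- data.frame(
  Peptide = c("TQIMFETF", "GQVNYEEF", "KVFGRCEL", "VFGRCELA"),
  Label   = c("cleaved", "cleaved", "noncleaved", "noncleaved"),
  stringsAsFactors = FALSE
)

# Split by label, then count the rows of each part.
X.pos <- X[X$Label == "cleaved", ]
X.neg <- X[X$Label == "noncleaved", ]
num.pos <- nrow(X.pos)   # number of cleaved peptides
num.neg <- nrow(X.neg)   # number of non-cleaved peptides
```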

5. define two matrices with initialised entries, one for positive
peptides and one for negative peptides
• matrix(0,AA,mer), where AA is the number of amino acids, and mer is
the number of residues detected from data using the nchar function
• both matrices have the same size, the number of rows being equal to
the number of amino acids and the number of columns being equal to the
number of residues in peptides
• name the columns of these two matrices using
– c("N4","N3","N2","N1","C1","C2","C3","C4"),
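
Step 5 could look roughly like this. pos.frequency, neg.frequency and
residues are assumed names, and mer is taken from one sample peptide
here rather than from the real data.

```r
amino.acid <- "ACDEFGHIKLMNPQRSTVWY"
AA  <- nchar(amino.acid)    # number of amino acids
mer <- nchar("TQIMFETF")    # with the data: nchar(as.character(X$Peptide[1]))

residues <- c("N4", "N3", "N2", "N1", "C1", "C2", "C3", "C4")

# Two zero-initialised matrices of identical size.
pos.frequency <- matrix(0, AA, mer)
neg.frequency <- matrix(0, AA, mer)
colnames(pos.frequency) <- residues
colnames(neg.frequency) <- residues
```

Naming the rows with strsplit(amino.acid, "")[[1]] is optional and not
part of the instructions, but it makes the matrices easier to read.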

6. use one three-loop structure to detect the frequency of amino acids
in cleaved peptides and one three-loop structure to detect the frequency
of amino acids in non-cleaved peptides. They should not be mixed in one
three-loop structure. The best way to handle this is to use a function.
An example of the three-loop structure is shown below
for(i in 1:num)  # scan all peptides; num is the number of peptides
{
  for(j in 1:mer)  # scan all residues in a peptide
  {
    for(k in 1:AA)  # scan the 20 amino acids
    {
      # actions
    }
  }
}
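
Here is one way the "actions" might be filled in, wrapped in a function
so the same three loops serve both peptide sets without mixing them.
count.frequency is a hypothetical name and the comparison with substr
is only one possible choice; the peptides are the four sample rows.

```r
# Count how often each amino acid occurs at each residue position.
count.frequency <- function(peptides, amino.acid) {
  peptides <- as.character(peptides)
  num <- length(peptides)       # number of peptides
  mer <- nchar(peptides[1])     # residues per peptide
  AA  <- nchar(amino.acid)      # number of amino acids
  freq <- matrix(0, AA, mer)
  for (i in 1:num) {            # scan all peptides
    for (j in 1:mer) {          # scan all residues in a peptide
      residue <- substr(peptides[i], j, j)
      for (k in 1:AA) {         # scan the 20 amino acids
        if (residue == substr(amino.acid, k, k))
          freq[k, j] <- freq[k, j] + 1
      }
    }
  }
  freq
}

amino.acid <- "ACDEFGHIKLMNPQRSTVWY"
cleaved.frequency    <- count.frequency(c("TQIMFETF", "GQVNYEEF"), amino.acid)
noncleaved.frequency <- count.frequency(c("KVFGRCEL", "VFGRCELA"), amino.acid)
```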

7. make sure that each frequency matrix is converted to percentages,
i.e. each entry in the matrix is divided by the number of cleaved or
non-cleaved peptides and multiplied by 100. This converted frequency is
called the normalised frequency.
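
A tiny worked example of the normalisation in step 7. The counts are
made up (3 amino acids, 2 residues, 4 peptides) purely to show the
arithmetic; X.count and num are assumed names.

```r
# Made-up counts: rows are amino acids, columns are residues.
# matrix() fills column by column, so column 1 is (1, 3, 0).
X.count <- matrix(c(1, 3, 0,
                    2, 0, 2), nrow = 3)
num <- 4                              # number of peptides counted

# Normalised frequency: divide by the peptide count, multiply by 100.
X.frequency <- X.count / num * 100
```

Before rounding, every column of the normalised matrix sums to 100,
because each peptide contributes exactly one amino acid per residue.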

8. detect the maximum height of the normalised frequency over all
residues in cleaved or non-cleaved peptides using
height<-rep(0,mer)
for(j in 1:mer)
height[j]<-sum(round(X.frequency[,j]))
max.height<-max(height)
• Note that the height of each column in a graph (see the graph on 3)
corresponds to the summation of 20 frequencies of 20 amino acids for a
residue.
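
The loop from step 8 can be tried on a small made-up matrix. The
example also shows why the maximum has to be computed at all: after
rounding, a column does not always add up to exactly 100. The colSums
line is only an equivalent shortcut, not part of the instructions.

```r
# Made-up normalised frequencies: 3 amino acids, 2 residues.
X.frequency <- matrix(c(100/3, 100/3, 100/3,
                        50,    50,    0), nrow = 3)
mer <- ncol(X.frequency)

# Loop as given in step 8.
height <- rep(0, mer)
for (j in 1:mer)
  height[j] <- sum(round(X.frequency[, j]))
max.height <- max(height)

# Equivalent vectorised form.
height2 <- colSums(round(X.frequency))
```

Here the first column rounds to 33 + 33 + 33 = 99, the second to 100.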

9. draw a blank plot using the maximum height
• plot(c(0,10*mer),c(0,max.height),col="white", • • •)
• in this blank plot, you can add graphics as discussed below

10. determine the x coordinate, but it is recommended to use i*10 as the
x-coordinate where i indexes the residues. The x-coordinate represents
columns in the graph shown in 3. If there are 8 residues in peptides,
there are 8 columns.

11. determine the y coordinate, which is cumulative (see next item
below). The y-coordinate represents rows in the graph shown in 3. There
are always 20 rows for 20 amino acids. Note that the rows cannot be
aligned because the frequency of an amino acid in a residue varies.
12. draw a rectangle based on the frequency of each residue and each
amino acid
• rect(x,y,x+10,y+round(X.frequency[k,j]),col=colmap[k]), where k
indicates an amino acid and j indicates a residue
• after drawing this rectangle, the y-coordinate “y” should be increased
by round(X.frequency[k,j])
• after one column is drawn for one residue, the x-coordinate “x” should
be increased by 10
13. plot a text at the corresponding position using
• text((x+5),(y+round(X.frequency[k,j])/2),substr(amino.acid,k,k))
14. place two drawings in one plot using the par function
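
Steps 9 to 14 might fit together as in the sketch below. Several things
here are assumptions and not from the instructions: the names
draw.motif, toy.frequency and out.file, starting x at 0 so all columns
fit inside c(0,10*mer), skipping rectangles whose rounded height is 0,
the cex scaling of the letters, and writing to a PDF in tempdir(). The
frequencies are random toy numbers, NOT values from hiv.dat.

```r
colmap <- c("#FFFFFF", "#FFFFCC", "#FFFF99", "#FFFF66", "#FFFF33",
            "#FFFF00", "#FFCCFF", "#FFCCCC", "#FFCC99", "#FFCC66", "#FFCC33",
            "#FFCC00", "#FF99FF", "#FF99CC", "#FF9999", "#FF9966", "#FF9933",
            "#FF9900", "#FF33FF", "#FF33CC")
amino.acid <- "ACDEFGHIKLMNPQRSTVWY"

# Random toy frequencies (each column sums to 100); illustration only.
toy.frequency <- function(seed) {
  set.seed(seed)
  m <- matrix(runif(20 * 8), 20, 8)
  m <- sweep(m, 2, colSums(m), "/") * 100
  colnames(m) <- c("N4", "N3", "N2", "N1", "C1", "C2", "C3", "C4")
  m
}

draw.motif <- function(X.frequency, amino.acid, colmap, main = "") {
  AA  <- nrow(X.frequency)
  mer <- ncol(X.frequency)
  max.height <- max(colSums(round(X.frequency)))
  # Step 9: blank plot sized by the tallest column.
  plot(c(0, 10 * mer), c(0, max.height), col = "white", xaxt = "n",
       xlab = "Residue", ylab = "Normalised frequency (%)", main = main)
  axis(1, at = 10 * (1:mer) - 5, labels = colnames(X.frequency))
  x <- 0
  for (j in 1:mer) {
    y <- 0                               # y is cumulative within a column
    for (k in 1:AA) {
      h <- round(X.frequency[k, j])
      if (h > 0) {
        rect(x, y, x + 10, y + h, col = colmap[k])           # step 12
        text(x + 5, y + h / 2, substr(amino.acid, k, k),     # step 13
             cex = 0.5 + h / 20)         # assumed font scaling
      }
      y <- y + h
    }
    x <- x + 10                          # next residue column
  }
  invisible(max.height)
}

# Step 14: two drawings side by side in one plot.
out.file <- file.path(tempdir(), "motifs.pdf")
pdf(out.file, width = 10, height = 5)
par(mfrow = c(1, 2))
draw.motif(toy.frequency(1), amino.acid, colmap, main = "cleaved")
draw.motif(toy.frequency(2), amino.acid, colmap, main = "noncleaved")
dev.off()
```

For on-screen output, dropping the pdf() and dev.off() lines should be
enough; par(mfrow=c(1,2)) is what puts the two graphs in one plot.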
http://n4.nabble.com/file/n1457645/cleaved.jpg cleaved.jpg 
http://n4.nabble.com/file/n1457645/noncleaved.jpg noncleaved.jpg 
http://n4.nabble.com/file/n1457645/hiv.dat hiv.dat 

