[Bioc-sig-seq] overlaying coverage plots

Martin Morgan mtmorgan at fhcrc.org
Thu Aug 12 21:04:27 CEST 2010


On 08/12/2010 11:47 AM, LU Zen wrote:
> Hi Martin,
> 
> Thanks for your help. This may sound really silly. My understanding of a vector in R is that this is a row of numbers. In my case, I have a table consisting of 2 columns, position vs coverage. So I tried:
> 
>> x<-read.table(file="B6_CAST_s1_cov1.txt", sep="\t", header=F)

I guess here x is now a data.frame. Suppose the position is in the first
column, the coverage at that position in the second column; I'm assuming
that position is 1-based, i.e., that the first nucleotide on the
chromosome is nucleotide number 1, and not nucleotide number 0. You
could create a vector cvg that is as long as the maximum position.

  cvg = numeric(max(x[[1]]))

cvg is initially all 0, and you could set the non-zero values to the
correct coverage with

  cvg[x[[1]]] = x[[2]]

and then

  plotLongVector(cvg)
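
As a rough sketch of the same idea applied to more than one sample (the
helper name to_dense and the small data frames s1 and s2 are made up for
illustration; they stand in for tables returned by read.table()), each
table can be expanded onto a vector of the same length, so that the
samples share one coordinate system:

```r
# Hypothetical helper: expand a two-column table (1-based position,
# coverage) into a dense vector of length 'len'; unlisted positions stay 0.
to_dense <- function(tab, len = max(tab[[1]])) {
  cvg <- numeric(len)          # all zeros to start with
  cvg[tab[[1]]] <- tab[[2]]    # fill in positions that have coverage
  cvg
}

# Made-up stand-ins for two position/coverage tables
s1 <- data.frame(pos = c(2, 3, 7), cov = c(10, 12, 5))
s2 <- data.frame(pos = c(1, 3, 5), cov = c(4, 8, 6))

# Use a common length so index i means the same position in both vectors
len <- max(s1$pos, s2$pos)
cvg1 <- to_dense(s1, len)
cvg2 <- to_dense(s2, len)
```

Each of cvg1 and cvg2 could then be passed to plotLongVector() in its own
panel after a call to par(mfcol=c(2, 1)).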

Your use of quotes below isn't correct: 'x' and 'r' in quotes are
character strings, not the objects x and r. Try working through 'An
Introduction to R', available in a browser after typing

  help.start()
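
The difference between a quoted string and an object name can be seen in
base R with a small made-up vector (the values here are only for
illustration):

```r
x <- c(0, 10, 12, 0)   # a numeric vector bound to the name x

length(x)        # refers to the object x, which has 4 elements
length('x')      # refers to the character string "x", which has 1 element

is.numeric(x)    # TRUE: x holds numbers
is.numeric('x')  # FALSE: 'x' is text, so there is nothing numeric to plot
```

So the calls need the names without quotes, and the Rle should be built
from the coverage vector rather than from the data.frame, e.g. Rle(cvg).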

Martin

>> r<-Rle('x')
>> plotLongVector('r')
> Error in plot.window(...) : invalid 'ylim' value
> 
> Do I need to transpose my data first?
> 
> Thank you.
> 
> Cheers,
> Zen
> 
> 
> 
>> -----Original Message-----
>> From: Martin Morgan [mailto:mtmorgan at fhcrc.org]
>> Sent: 12 August 2010 18:00
>> To: LU Zen
>> Cc: 'bioc-sig-sequencing at r-project.org'
>> Subject: Re: [Bioc-sig-seq] overlaying coverage plots
>>
>> On 08/12/2010 09:20 AM, LU Zen wrote:
>>> I'm trying to overlay coverage plots of individual chromosomes from
>>> different experiments to get a quick overview of probable CNVs. I've
>>> tried using simple plot, ggplot and plotrix packages of R (and I'm a
>>> real novice in R) but it seems that my linux machine with 64GB of
>>> memory is unable to handle the task. I've also reduced my file size
>>> by putting only the coordinates and the coverages derived from
>>> samtools pileup into a single file.
>>>
>>> My understanding is that I should be able to plot the coverage of a
>>> single chromosome with the bioconductor package but is it possible to
>>> overlay multiple plots using the package? I'll be really grateful if
>>> someone can advise on the way to do this.
>>
>> One easy possibility is to use HilbertVis' plotLongVector function with
>> standard R graphics commands. So here's some long data (simulated with
>> another function in HilbertVis)
>>
>>   library(HilbertVis)
>>   x <- makeRandomTestData() # 1e7 entries
>>
>> Then we set up a device so that we'll plot into 4 'rows' and 1 'column',
>> with margins on each plot fairly tight (see ?par)
>>
>>   par(mfcol=c(4, 1), mar=c(1, 4, 2, 2))
>>
>> And then we'll create four plots, showing progressively more extreme peaks
>>
>>   par(mfcol=c(4, 1), mar=c(1, 4, 2, 2))
>>   plotLongVector(x)
>>   plotLongVector(x * (abs(x) > 50))
>>   plotLongVector(x * (abs(x) > 100))
>>   plotLongVector(x * (abs(x) > 200))
>>
>> This would also work with IRanges' Rle objects
>>
>>   library(IRanges)
>>   r <- Rle(x)
>>   plotLongVector(r)
>>   plotLongVector(r * (abs(r) > 50))
>>   plotLongVector(r * (abs(r) > 100))
>>   plotLongVector(r * (abs(r) > 200))
>>
>> One could also easily 'zoom in'
>>
>>   len = length(r)
>>   plotLongVector(r)
>>   plotLongVector(seqselect(r, len/10, 9 * len / 10))
>>   plotLongVector(seqselect(r, len/100, 9 * len / 100))
>>   plotLongVector(seqselect(r, len/1000, 9 * len / 1000))
>>
>> (though here the x-coordinates are not correct).
>>
>> Also of course one might explore the raison d'etre of the package, and
>> its companion HilbertVisGUI
>>
>>   showHilbertImage(hilbertImage(x))
>>
>> Martin
>>
>>>
>>> Thank you.
>>>
>>> Zen
>>
>>
>> --
>> Martin Morgan
>> Computational Biology / Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N.
>> PO Box 19024 Seattle, WA 98109
>>
>> Location: Arnold Building M1 B861
>> Phone: (206) 667-2793
> 




