[Bioc-sig-seq] overlaying coverage plots
Wolfgang Huber
whuber at embl.de
Fri Aug 13 13:50:56 CEST 2010
Lu,
if you type "plotLongVector" into the R interpreter, you will see its
(simple) implementation, and also, see why you currently cannot specify
your own argument for "ylim". I suppose that the implementation will
soon be updated to address this problem, but for the time being, you
could simple define your own copy of that function, and set ylim as you
wish.
Best wishes
Wolfgang.
LU Zen scripsit 13/08/10 13:03:
> Hi Martin,
>
> Thanks for your help again.
>
> While I can zoom in the plot by limiting the range of the x-coordinates, I can't get the ylim to work. The presence of a some exceptionally high coverage values just distords the plot. When I included ylim, this is what I got:
>
>> plotLongVector(cvg, ylim = range(0,100))
> Error in plot.default(NULL, xlab = xlab, ylab = ylab, xlim = offset + :
> formal argument "ylim" matched by multiple actual arguments
>
> Is there any trick that I'm missing here?
>
> Thanks,
> Zen
>
>> -----Original Message-----
>> From: Martin Morgan [mailto:mtmorgan at fhcrc.org]
>> Sent: 12 August 2010 20:04
>> To: LU Zen
>> Cc: 'bioc-sig-sequencing at r-project.org'
>> Subject: Re: [Bioc-sig-seq] overlaying coverage plots
>>
>> On 08/12/2010 11:47 AM, LU Zen wrote:
>>> Hi Martin,
>>>
>>> Thanks for your help. This may sound really silly. My understanding of vector in R is that this is a
>> row of numbers. In my case, I have a table consisting of 2 columns, position vs coverage. So I went
>> tried:
>>>
>>>> x<-read.table(file="B6_CAST_s1_cov1.txt", sep="\t", header=F)
>>
>> I guess here x is now a data.frame. Suppose the position is in the first
>> column, the coverage at that position in the second column; I'm assuming
>> that position is 1-based, i.e., that the first nucleotide on the
>> chromosome is nucleotide number 1, and not nucleotide number 0. You
>> could create a vector cvg that is as long as the maximum position.
>>
>> cvg = numeric(max(x[[1]]))
>>
>> cvg is initially all 0, and you could set the non-zero values to the
>> correct coverage with
>>
>> cvg[x[[1]]] = x[[2]]
>>
>> and then
>>
>> plotLongVector(cvg)
>>
>> Your use of quotes below isn't correct; try working through 'An
>> Introduction to R', available in a browser after typing
>>
>> help.start()
>>
>> Martin
>>
>>>> r<-Rle('x')
>>>> plotLongVector('r')
>>> Error in plot.window(...) : invalid 'ylim' value
>>>
>>> Do I need to transpose my data first?
>>>
>>> Thank you.
>>>
>>> Cheers,
>>> Zen
>>>
>>>
>>>
>>>> -----Original Message-----
>>>> From: Martin Morgan [mailto:mtmorgan at fhcrc.org]
>>>> Sent: 12 August 2010 18:00
>>>> To: LU Zen
>>>> Cc: 'bioc-sig-sequencing at r-project.org'
>>>> Subject: Re: [Bioc-sig-seq] overlaying coverage plots
>>>>
>>>> On 08/12/2010 09:20 AM, LU Zen wrote:
>>>>> I'm trying to overlay coverage plots of individual chromosomes from
>>>>> different experiments to get a quick overview of probable CNVs. I've
>>>>> tried using simple plot, ggplot and plotrix packages of R (and I'm a
>>>>> real novice in R) but it seems that my linux machine with 64GB of
>>>>> memory is unable to handle the task. I've also reduced my file size
>>>>> by putting only the coordinates and the coverages derived from
>>>>> samtools pileup into a single file.
>>>>>
>>>>> My understanding is that I should be able to plot the coverage of a
>>>>> single chromosome with the bioconductor package but is it possible to
>>>>> overlay multiple plots using the package? I'll be really grateful if
>>>>> someone can advice on the way to do this.
>>>>
>>>> One easy possibility is to use HilbertVis' plotLongVector function with
>>>> standard R graphics commands. So here's some long data (simulated with
>>>> another function in HlibertVis)
>>>>
>>>> library(HilbertVis)
>>>> x<- makeRandomTestData() # 1e7 entries
>>>>
>>>> Then we set up a device so that we'll plot into 4 'rows' and 1 'column',
>>>> with margins on each plot fairly tight (see ?par)
>>>>
>>>> par(mfcol=c(4, 1), mar=c(1, 4, 2, 2))
>>>>
>>>> And then we'll create four plots, showing progressively more extreme peaks
>>>>
>>>> par(mfcol=c(4, 1), mar=c(1, 4, 2, 2))
>>>> plotLongVector(x)
>>>> plotLongVector(x * (abs(x)> 50))
>>>> plotLongVector(x * (abs(x)> 100))
>>>> plotLongVector(x * (abs(x)> 200))
>>>>
>>>> This would also work with IRanges' Rle objects
>>>>
>>>> library(IRanges)
>>>> r<- Rle(x)
>>>> plotLongVector(r)
>>>> plotLongVector(r * (abs(r)> 50))
>>>> plotLongVector(r * (abs(r)> 100))
>>>> plotLongVector(r * (abs(r)> 200))
>>>>
>>>> One could also easily 'zoom in'
>>>>
>>>> len = length(r)
>>>> plotLongVector(r)
>>>> plotLongVector(seqselect(r, len/10, 9 * len / 10))
>>>> plotLongVector(seqselect(r, len/100, 9 * len / 100))
>>>> plotLongVector(seqselect(r, len/1000, 9 * len / 1000))
>>>>
>>>> (though here the x-coordinates are not correct).
>>>>
>>>> Also of course one might explore the raison d'etre of the package, and
>>>> its companion HilbertVisGUI
>>>>
>>>> showHilbertImage(hilbertImage(x))
>>>>
>>>> Martin
>>>>
>>>>>
>>>>> Thank you.
>>>>>
>>>>> Zen
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> The University of Edinburgh is a charitable body, registered in
>>>>> Scotland, with registration number SC005336.
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________ Bioc-sig-sequencing
>>>>> mailing list Bioc-sig-sequencing at r-project.org
>>>>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>>>>
>>>>
>>>> --
>>>> Martin Morgan
>>>> Computational Biology / Fred Hutchinson Cancer Research Center
>>>> 1100 Fairview Ave. N.
>>>> PO Box 19024 Seattle, WA 98109
>>>>
>>>> Location: Arnold Building M1 B861
>>>> Phone: (206) 667-2793
>>>
>>
>>
>> --
>> Martin Morgan
>> Computational Biology / Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N.
>> PO Box 19024 Seattle, WA 98109
>>
>> Location: Arnold Building M1 B861
>> Phone: (206) 667-2793
>
--
Wolfgang Huber
EMBL
http://www.embl.de/research/units/genome_biology/huber
More information about the Bioc-sig-sequencing
mailing list