[BioC] makeGappedAlignmentPairs for broken pairs
Ido Tamir
tamir at imp.ac.at
Fri Jan 17 10:50:06 CET 2014
Dear Valerie,
thank you very much for your reply. I can't test the development packages at the moment, I'm still on 2.12.
> param <- ScanBamParam(flag=scanBamFlag(hasUnmappedMate=TRUE))
a)
Will this load only reads with unmapped mates or will this allow both proper paired and unpaired which is what I really want.
reading the files in twice (once for proper pairs, once for single) would be possible, but I prefer reading in once.
b)
will findMateAlignment be extended to allow these unmapped mate reads to be represented in a GAlignmentPairs class?
c)
Do you know when this (2.14) will be released?
thank you very much,
ido
On Jan 16, 2014, at 9:09 PM, Valerie Obenchain <vobencha at fhcrc.org> wrote:
> Hi Ido,
>
> library(GenomicAlignments)
> fl <- system.file("extdata", "ex1.bam", package="Rsamtools")
>
> You can set different flags in ScanBamParam to control what is read in.
> See ?ScanBamParam for details. Here is a 'param' that reads only records with unmapped mates.
>
> param <- ScanBamParam(flag=scanBamFlag(hasUnmappedMate=TRUE))
>
> Two classes can hold paired-end reads, one is GAlignmentPairs and the other is GAlignmentsList. They run the same mate-pairing code but return different objects.
>
> ## GAlignmentPairs:
>
> Oops. Looks like there is a bug in readGAlignmentPairs() when this flag is set. We will fix that.
> > gapairs <- readGAlignmentPairs(fl, param=param)
> Error in .combineBamFlagFilters(bamFlag(param, asInteger = TRUE), flag0) :
> BAM flag filters to combine are incompatible
>
> ## GAlignmentsList:
>
> galist <- readGAlignmentsList(fl, param=param)
>>> galist
>> GAlignmentsList of length 127:
>> [[1]]
>> GAlignments with 1 alignment and 0 metadata columns:
>> seqnames strand cigar qwidth start end width ngap
>> [1] seq2 + 35M 35 1520 1554 35 0
>>
>> [[2]]
>> GAlignments with 1 alignment and 0 metadata columns:
>> seqnames strand cigar qwidth start end width ngap
>> [1] seq1 + 36M 36 6 41 36 0
>
> There is a metadata column on the list that has the mate status. In this case they are all 'unmated'.
> > mcols(galist)
> DataFrame with 127 rows and 1 column
> mate_status
> <factor>
> 1 unmated
> 2 unmated
> 3 unmated
> 4 unmated
> 5 unmated
> ... ...
>
> This code is available in the devel branch.
>
> Valerie
>
>
> > sessionInfo()
> R Under development (unstable) (2014-01-06 r64682)
> Platform: x86_64-unknown-linux-gnu (64-bit)
> ...
>
> other attached packages:
> [1] GenomicAlignments_0.99.11 VariantAnnotation_1.9.28
> [3] Rsamtools_1.15.18 Biostrings_2.31.7
> [5] GenomicRanges_1.15.19 XVector_0.3.6
> [7] IRanges_1.21.19 BiocGenerics_0.9.2
>
> loaded via a namespace (and not attached):
> [1] AnnotationDbi_1.25.9 Biobase_2.23.3 biomaRt_2.19.1
> [4] bitops_1.0-6 BSgenome_1.31.7 DBI_0.2-7
> [7] GenomicFeatures_1.15.4 RCurl_1.95-4.1 RSQLite_0.11.4
> [10] rtracklayer_1.23.6 stats4_3.1.0 tools_3.1.0
> [13] XML_3.98-1.1 zlibbioc_1.9.0
>
>
>
>
> On 01/16/2014 02:57 AM, Ido Tamir wrote:
>> Hi,
>> is there a roadmap for allowing paired end reads where only one is mapped to be reported?
>> Basically I would like to have readGappedAlignmentPairs report them also.
>> One could reread the whole file and take only those reads where the makeGappedAlignmentPairs test
>> was unsuccessfull.
>>
>> thank you very much,
>> ido
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>
>
> --
> Valerie Obenchain
>
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B155
> P.O. Box 19024
> Seattle, WA 98109-1024
>
> E-mail: vobencha at fhcrc.org
> Phone: (206) 667-3158
> Fax: (206) 667-1319
More information about the Bioconductor
mailing list