[BioC] Detection copy number variations with cn.MOPS package
Günter Klambauer
klambauer at bioinf.jku.at
Mon Dec 10 10:22:35 CET 2012
Hello Laura,
For the "CNV regions" I join the individual CNV regions, when they
cluster at some location. Therefore the CNV region can be longer
than the individual CNVs. To obtain the copy number for this region I
summarize the counts from the initial segments. Therefore it
may be that the posterior probability of CN2 is the largest for all
samples. The posterior probabilities for other copy numbers should also
be high.
I suggest you look in the slot "cnvs", "segmentation" or
"integerCopyNumber" -- there you should find copy numbers different from 2.
Another possibility is to lower the parameter "priorImpact". This
parameter enforces the algorithm to explain the data by copy number 2.
Lowering it
leads to more detections. It should be adjusted to each data set.
I hope this helps,
Regards,
Günter
On 12/07/2012 04:35 PM, lpascual wrote:
> Dear group,
>
> I'm trying to detect copy number variations with the cn.MOPS package. I
> have eight different samples (coming from different individuals
> re-sequenced by Solexa). Sequences have been mapped against the
> reference genome with BWA, the genome coverage of my samples ranges
> from 13x to 6x. I have run the cn.mops using the package default parameters.
> However, when I run the algorithm it detects some CNV regions for which
> the copy number is 2 for all the individuals. Does anyone have an
> explanation for this result?
> I paste you here the code I have used and one example of a detected CNV
> where the copy number is 2.
>
> Thanks in advance
>
> Laura
>
> > sessionInfo()
> R version 2.15.1 (2012-06-22)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
> [1] LC_CTYPE=fr_FR.UTF-8 LC_NUMERIC=C
> [3] LC_TIME=fr_FR.UTF-8 LC_COLLATE=fr_FR.UTF-8
> [5] LC_MONETARY=fr_FR.UTF-8 LC_MESSAGES=fr_FR.UTF-8
> [7] LC_PAPER=C LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C
> [11] LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] Rsamtools_1.8.6 Biostrings_2.24.1 cn.mops_1.2.6
> [4] GenomicRanges_1.8.12 IRanges_1.14.4 Biobase_2.16.0
> [7] BiocGenerics_0.2.0
>
> loaded via a namespace (and not attached):
> [1] bitops_1.0-4.1 stats4_2.15.1 zlibbioc_1.2.0
>
> > Countsreads_SL2.40ch02<-getReadCountsFromBAM(BAMFiles,
> c("Cervil","Criollo_new","Ferum_new","LA0147","LA1420","Levovil","Plovdiv","Stupike"),refSeqName=c("SL2.40ch02"),mode="unpaired")
>
> > resCNMOPS_SL2.40ch02<-cn.mops(Countsreads_SL2.40ch02)
>
> > head(params(resCNMOPS_SL2.40ch02),n=11L)
> $method
> [1] "cn.mops"
> $folds
> [1] 0.025 0.500 1.000 1.500 2.000 2.500 3.000 3.500 4.000
> $classes
> [1] "CN0" "CN1" "CN2" "CN3" "CN4" "CN5" "CN6" "CN7" "CN8"
> $priorimpact
> [1] 1
> $cyc
> [1] 20
> $normType
> [1] "poisson"
> $normQu
> [1] 0.25
> $upperThreshold
> [1] 0.5
> $lowerThreshold
> [1] -0.9
> $minWidth
> [1] 3
> $SegmentationParams
> character(0)
>
> > *Countsreads*_SL2.40ch02[1512]
> GRanges with 1 range and 8 elementMetadata cols:
> seqnames ranges strand | Levovil Criollo_new
> Ferum_new Stupike Plovdiv LA1420 Cervil LA0147
> <Rle> <IRanges> <Rle> | <integer> <integer> <integer> <integer>
> <integer> <integer> <integer> <integer>
> [1] SL2.40ch02 [*43819001, 43848000*] * | *112 403
> 164 **1306 1192 1181 907 617*
>
> seqlengths:
> SL2.40ch02
> NA
>
> > *cnvs*(resCNMOPS_SL2.40ch02)[6]
> GRanges with 1 range and 4 elementMetadata cols:
> seqnames ranges strand | sampleName
> median mean *CN*
> <Rle> <IRanges> <Rle> | <factor> <numeric> <numeric> *<character>*
> [1] SL2.40ch02 [*43819001, 43906000*] * | Criollo_new
> 0.5849618 0.63563 *CN2*
>
> seqlengths:
> SL2.40ch02
> NA
>
> > *cnvr*(resCNMOPS_SL2.40ch02)[7]
> GRanges with 1 range and 8 elementMetadata cols:
> seqnames ranges strand | CN.Levovil
> CN.Criollo_new CN.Ferum_new CN.Stupike CN.Plovdiv CN.LA1420 CN.Cervil
> CN.LA0147
> <Rle> <IRanges> <Rle> | <factor> <factor> <factor> <factor> <factor>
> <factor> <factor> <factor>
> [1] SL2.40ch02 [*43819001, 43906000*] * | *CN2 CN2
> CN2 CN2 CN2 CN2 CN2 CN2*
>
> seqlengths:
> SL2.40ch02
> NA
>
>
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
More information about the Bioconductor
mailing list