[BioC] Making table with SNP-ID, allele combination and genotype to verify correctness.
Johannes Gulmann Madsen
johannes at dsr.life.ku.dk
Mon Mar 23 11:37:41 CET 2009
Hello,
I have used as.geneSet in the GeneticsBase package to deal with some SNP data,
and I want to make sure that the package handles it the way I want it to. In
particular, I want to check that the program genotypes the alleles correctly,
by making a table like the one below (its author used the same data as mine,
but had more of it).
> y
gentype genotype
[1,] "0_A2M_DS066406.1_15" "AG" "2"
[2,] "0_A2M_DS066406.1_15" "GG" "3"
[3,] "0_A2M_DS068238.1_4" "AG" "2"
[4,] "0_A2M_DS068238.1_4" "GG" "3"
[5,] "0_A2M_DS068238.1_4" "AA" "1"
[6,] "0_ABCA1_DS062937.1_32" "AG" "2"
[7,] "0_ABCA1_DS062937.1_32" "GG" "3"
[8,] "0_ABCA1_DS062937.1_32" "AA" "1"
[9,] "0_ABCA1_DS073864.1_41" "AG" "2"
[10,] "0_ABCA1_DS073864.1_41" "GG" "3"
[11,] "0_ABCA1_DS073864.1_41" "AA" "1"
[12,] "0_ABCA1_DS073864.1_41_2" "AG" "2"
[13,] "0_ABCA1_DS073864.1_41_2" "GG" "3"
[14,] "0_ABCA1_DS073864.1_41_2" "AA" "1"
[15,] "0_ABCA1_DS078984.1_39" "AT" "2"
[16,] "0_ABCA1_DS078984.1_39" "TT" "3"
[17,] "0_ABCA1_DS078984.1_39" "AA" "1"
[18,] "0_ABCA1_DS082793.1_19" "AG" "2"
[19,] "0_ABCA1_DS082793.1_19" "GG" "3"
[20,] "0_ABCA1_DS082793.1_19" "AA" "1"
[21,] "0_ABCC10_DS066718.1_3" "AC" "2"
[22,] "0_ABCC10_DS066718.1_3" "CC" "3"
[23,] "0_ABCC6_DS063353.1_20" "AC" "2"
[24,] "0_ABCC6_DS063353.1_20" "CC" "3"
[25,] "0_ABCC6_DS063353.1_20" "AA" "1"
[26,] "0_AARSL_DS061819.1_2" "GG" "1"
I'm pretty sure that GeneticsBase handles it the right way, because I can see
that from alleleCount() and alleleLevels(). I already have the table with the
SNP-ID and the allele combination:
> z <- cbind(unlist(genotypeLevels(gs)))
> z
[,1]
X0_A2M_DS066406.1_151 "G/G"
X0_A2M_DS066406.1_152 "G/A"
X0_A2M_DS068238.1_41 "G/G"
X0_A2M_DS068238.1_42 "G/A"
X0_A2M_DS068238.1_43 "A/A"
X0_ABCA1_DS062937.1_321 "G/G"
X0_ABCA1_DS062937.1_322 "G/A"
X0_ABCA1_DS062937.1_323 "A/A"
X0_ABCA1_DS062937.1_324 "NA/NA"
X0_ABCA1_DS073864.1_411 "G/G"
X0_ABCA1_DS073864.1_412 "G/A"
X0_ABCA1_DS073864.1_413 "A/A"
X0_ABCA1_DS073864.1_414 "NA/NA"
X0_ABCA1_DS073864.1_41_21 "A/A"
X0_ABCA1_DS073864.1_41_22 "A/G"
X0_ABCA1_DS073864.1_41_23 "G/G"
X0_ABCA1_DS073864.1_41_24 "NA/NA"
X0_ABCA1_DS078984.1_391 "A/A"
X0_ABCA1_DS078984.1_392 "A/T"
X0_ABCA1_DS078984.1_393 "T/T"
X0_ABCA1_DS078984.1_394 "NA/NA"
X0_ABCA1_DS082793.1_191 "G/G"
X0_ABCA1_DS082793.1_192 "G/A"
X0_ABCA1_DS082793.1_193 "A/A"
X0_ABCC10_DS066718.1_31 "C/C"
X0_ABCC10_DS066718.1_32 "NA/NA"
X0_ABCC10_DS066718.1_33 "C/A"
X0_ABCC6_DS063353.1_201 "A/C"
X0_ABCC6_DS063353.1_202 "A/A"
X0_ABCC6_DS063353.1_203 "C/C"
X0_ABCC6_DS063353.1_204 "NA/NA"
X0_AARSL_DS061819.1_21 "G/G"
X0_AARSL_DS061819.1_22 "NA/NA"
So what I am missing is the column with the genotypes (as in the other table),
which I get from alleleCount(). Can anyone help me with that?
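One possible way to sketch the table I am after, in base R only. This is not GeneticsBase's own method: `geno_levels` is a hypothetical named list shaped like the genotypeLevels(gs) output above (values copied from this post), `code_genotypes` is a made-up helper, and the coding rule (1 + number of copies of the alphabetically second allele, NA for missing calls) is only inferred from the other table, so it is an assumption. It also assumes at most two alleles per SNP.

```r
# Hypothetical input: marker name -> allele combinations, shaped like the
# genotypeLevels(gs) output shown above (values copied from this post).
geno_levels <- list(
  X0_A2M_DS066406.1_15   = c("G/G", "G/A"),
  X0_ABCA1_DS078984.1_39 = c("A/A", "A/T", "T/T", "NA/NA"),
  X0_AARSL_DS061819.1_2  = c("G/G", "NA/NA")
)

# Hypothetical helper (not part of GeneticsBase). Assumed coding rule:
# code = 1 + copies of the alphabetically second allele; missing calls -> NA.
code_genotypes <- function(levels_for_marker) {
  parts <- strsplit(levels_for_marker, "/", fixed = TRUE)
  alleles <- unique(unlist(parts))
  alleles <- sort(alleles[alleles != "NA"])   # drop the "NA" placeholder
  second <- if (length(alleles) > 1) alleles[2] else NA_character_
  vapply(parts, function(p) {
    if (any(p == "NA")) return(NA_integer_)   # "NA/NA" has no genotype code
    1L + sum(p == second, na.rm = TRUE)       # 0, 1 or 2 copies -> 1, 2 or 3
  }, integer(1))
}

# One row per SNP-ID / allele combination, with the inferred genotype code.
snp_table <- do.call(rbind, lapply(names(geno_levels), function(marker) {
  data.frame(
    snp      = marker,
    alleles  = geno_levels[[marker]],
    genotype = code_genotypes(geno_levels[[marker]]),
    stringsAsFactors = FALSE
  )
}))
print(snp_table)
```

The codes produced this way would still need to be compared against what alleleCount() reports for the same markers before trusting them.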
Regards,
Johannes
> sessionInfo()
R version 2.8.0 (2008-10-20)
i386-pc-mingw32
locale:
LC_COLLATE=Danish_Denmark.1252;LC_CTYPE=Danish_Denmark.1252;LC_MONETARY=Danish_Denmark.1252;LC_NUMERIC=C;LC_TIME=Danish_Denmark.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] GeneticsBase_1.8.0 haplo.stats_1.3.8 mvtnorm_0.9-5 xtable_1.5-4 combinat_0.0-6
loaded via a namespace (and not attached):
[1] gdata_2.4.2 gplots_2.6.0 gtools_2.5.0 MASS_7.2-44 tools_2.8.0