[BioC] Making table with SNP-ID, allele combination and genotype to verify correctness.

Johannes Gulmann Madsen johannes at dsr.life.ku.dk
Mon Mar 23 11:37:41 CET 2009


Hello

I have used as.geneSet in the GeneticsBase package to handle some SNP data,
and I want to make sure that the package handles it the way I expect. In
particular, I want to check that the program assigns genotypes to the allele
combinations correctly, by making a table like the one below (made by someone
else from the same kind of data as mine, although he had more data).

> y
                                gentype genotype
 [1,] "0_A2M_DS066406.1_15"     "AG"    "2"
 [2,] "0_A2M_DS066406.1_15"     "GG"    "3"
 [3,] "0_A2M_DS068238.1_4"      "AG"    "2"
 [4,] "0_A2M_DS068238.1_4"      "GG"    "3"
 [5,] "0_A2M_DS068238.1_4"      "AA"    "1"
 [6,] "0_ABCA1_DS062937.1_32"   "AG"    "2"
 [7,] "0_ABCA1_DS062937.1_32"   "GG"    "3"
 [8,] "0_ABCA1_DS062937.1_32"   "AA"    "1"
 [9,] "0_ABCA1_DS073864.1_41"   "AG"    "2"
[10,] "0_ABCA1_DS073864.1_41"   "GG"    "3"
[11,] "0_ABCA1_DS073864.1_41"   "AA"    "1"
[12,] "0_ABCA1_DS073864.1_41_2" "AG"    "2"
[13,] "0_ABCA1_DS073864.1_41_2" "GG"    "3"
[14,] "0_ABCA1_DS073864.1_41_2" "AA"    "1"
[15,] "0_ABCA1_DS078984.1_39"   "AT"    "2"
[16,] "0_ABCA1_DS078984.1_39"   "TT"    "3"
[17,] "0_ABCA1_DS078984.1_39"   "AA"    "1"
[18,] "0_ABCA1_DS082793.1_19"   "AG"    "2"
[19,] "0_ABCA1_DS082793.1_19"   "GG"    "3"
[20,] "0_ABCA1_DS082793.1_19"   "AA"    "1"
[21,] "0_ABCC10_DS066718.1_3"   "AC"    "2"
[22,] "0_ABCC10_DS066718.1_3"   "CC"    "3"
[23,] "0_ABCC6_DS063353.1_20"   "AC"    "2"
[24,] "0_ABCC6_DS063353.1_20"   "CC"    "3"
[25,] "0_ABCC6_DS063353.1_20"   "AA"    "1"
[26,] "0_AARSL_DS061819.1_2"    "GG"    "1"

I'm pretty sure that GeneticsBase handles it the right way, because I can see
that using alleleCount() and alleleLevels(). I already have the table with the
SNP-ID and the allele combination:

> z <- cbind(unlist(genotypeLevels(gs)))
> z
                          [,1]
X0_A2M_DS066406.1_151     "G/G"
X0_A2M_DS066406.1_152     "G/A"
X0_A2M_DS068238.1_41      "G/G"
X0_A2M_DS068238.1_42      "G/A"
X0_A2M_DS068238.1_43      "A/A"
X0_ABCA1_DS062937.1_321   "G/G"
X0_ABCA1_DS062937.1_322   "G/A"
X0_ABCA1_DS062937.1_323   "A/A"
X0_ABCA1_DS062937.1_324   "NA/NA"
X0_ABCA1_DS073864.1_411   "G/G"
X0_ABCA1_DS073864.1_412   "G/A"
X0_ABCA1_DS073864.1_413   "A/A"
X0_ABCA1_DS073864.1_414   "NA/NA"
X0_ABCA1_DS073864.1_41_21 "A/A"
X0_ABCA1_DS073864.1_41_22 "A/G"
X0_ABCA1_DS073864.1_41_23 "G/G"
X0_ABCA1_DS073864.1_41_24 "NA/NA"
X0_ABCA1_DS078984.1_391   "A/A"
X0_ABCA1_DS078984.1_392   "A/T"
X0_ABCA1_DS078984.1_393   "T/T"
X0_ABCA1_DS078984.1_394   "NA/NA"
X0_ABCA1_DS082793.1_191   "G/G"
X0_ABCA1_DS082793.1_192   "G/A"
X0_ABCA1_DS082793.1_193   "A/A"
X0_ABCC10_DS066718.1_31   "C/C"
X0_ABCC10_DS066718.1_32   "NA/NA"
X0_ABCC10_DS066718.1_33   "C/A"
X0_ABCC6_DS063353.1_201   "A/C"
X0_ABCC6_DS063353.1_202   "A/A"
X0_ABCC6_DS063353.1_203   "C/C"
X0_ABCC6_DS063353.1_204   "NA/NA"
X0_AARSL_DS061819.1_21    "G/G"
X0_AARSL_DS061819.1_22    "NA/NA"

So what I am missing is the column with the genotypes (as in the other table),
which I get from alleleCount(). Can anyone help me with that?
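
To make the goal concrete, here is a minimal base-R sketch of the kind of table
I am after. It does not call GeneticsBase at all: geno_levels is a hand-typed
stand-in for genotypeLevels(gs), assumed to be a named list with one character
vector per SNP (inferred from the unlist() output above). The names tab and
ref_count are made up for illustration, and taking the alphabetically first
allele as the one to count is only an assumption; whether this matches what
alleleCount() reports still has to be checked against its real output.

```r
# Stand-in for genotypeLevels(gs): assumed to be a named list,
# one character vector of genotype levels per SNP.
geno_levels <- list(
  "X0_A2M_DS066406.1_15"  = c("G/G", "G/A"),
  "X0_AARSL_DS061819.1_2" = c("G/G", "NA/NA")
)

# Build one small data frame per SNP and stack them.
tab <- do.call(rbind, lapply(names(geno_levels), function(snp) {
  g <- geno_levels[[snp]]
  g <- g[g != "NA/NA"]                        # drop the missing-genotype level
  alleles <- strsplit(g, "/", fixed = TRUE)   # "G/A" -> c("G", "A")
  # Hypothetical choice: count copies of the alphabetically first allele.
  ref <- sort(unique(unlist(alleles)))[1]
  data.frame(
    snp       = rep(snp, length(g)),
    genotype  = g,
    ref_count = vapply(alleles, function(a) sum(a == ref), integer(1)),
    stringsAsFactors = FALSE
  )
}))

print(tab)
```

Looping over names(geno_levels) keeps the SNP-ID intact, whereas unlist()
appends a running index to each name (for example "..._151", "..._152").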

Regards,
Johannes

> sessionInfo()
R version 2.8.0 (2008-10-20)
i386-pc-mingw32

locale:
LC_COLLATE=Danish_Denmark.1252;LC_CTYPE=Danish_Denmark.1252;LC_MONETARY=Danish_Denmark.1252;LC_NUMERIC=C;LC_TIME=Danish_Denmark.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GeneticsBase_1.8.0 haplo.stats_1.3.8  mvtnorm_0.9-5      xtable_1.5-4       combinat_0.0-6

loaded via a namespace (and not attached):
[1] gdata_2.4.2  gplots_2.6.0 gtools_2.5.0 MASS_7.2-44  tools_2.8.0


