[BioC] gwSnpTests in GGtools
francesca casalino
francy.casalino at gmail.com
Wed Nov 30 13:53:55 CET 2011
Thank you so much! It is much clearer now.
Also, the code you suggested for splitting the data into chromosomes
works perfectly.
Thanks again!
2011/11/29 Vincent Carey <stvjc at channing.harvard.edu>:
>
>
> On Tue, Nov 29, 2011 at 10:04 AM, francesca casalino
> <francy.casalino at gmail.com> wrote:
>>
>> Thank you so much for all your valuable information and help.
>>
>> The gwSnpTests now works, since I specified a function as you suggested:
>>
>> f1 = gwSnpTests(probeId("10023813203")~GENDER, ss)
>>
>> However I still find myself having to limit the chromosome data at the
>> plink import stage to chromosome 19, and can't seem to be able to do
>> this after the creation of the smlSet. As you say, the gwSnpTeststakes
>> uses all the SNPs in the smList(smlSet) to run the association, but I
>> am not getting a list organized by chromosome, and I think my problem
>> is the original sml_Set list, which I created doing this:
>>
>> ss<- make_smlSet(es, list("1"=snp.matrix$genotypes))
>>
>
> The above command will create an smlSet with smList of length one. If you
> want to be
> able to separate out chromosomes of SNP, you should take
> snp.matrix$genotypes and do a
> little more work. The columns of this entity are SNP loci. You might be
> able to do the following.
> Make a table, say CMAP, that has column 1 the snp IDs (usually rs numbers,
> perhaps something else) and column
> 2 the chromosome to which each SNP belongs. For simplicity make sure that
> each column is a character vector.
>
> then create the list SMAP = split(CMAP[,1], CMAP[,2])
>
> this will be a list with as many elements as there are chromosomes. The
> contents of each list element are the
> snp residing on the corresponding chromosome, and the names(SMAP) is a
> vector of chromosome names
>
> Now, SLIST = lapply(SMAP, function(x) snp.matrix$genotypes[, x]) # errors
> can occur if SMAP has elements not in colnames ...
> names(SLIST) = names(SMAP)
>
> will create a list of SnpMatrix instances with one SnpMatrix per
> chromosome. you can use SLIST as
> the second argument to make_smlSet, and you can use chrnum() to select
> specific element used in testing
> with gwSnpTests.
>
>>
>> While I have SNPs from all chromosomes in the snp.matrix$genotypes, I
>> guess I am indicating wrongly here that I only have one chromosome. Is
>> it possible to create an smlSet with all the chromosomes, and then
>> select only some using:
>> chr19_20 <- getSS("GGtools", c("19", "20"))
>>
>> Also when I try to find the genesymbol APOE (which is present in the
>> original plink files) I get:
>>
>> f2 = gwSnpTests(genesym("APOE")~GENDER, ss)
>> character(0)
>
>
> This is a convenience feature that will succeed if you have annotation(ss) =
> "[name of a bioconductor .db chip annotation package]" and the symbol you
> ask for is mapped.
>
>>
>> Failed with error: ‘'package' must be of length 1’
>> Error in revmap(get(paste(gsub(".db", "", annpack), "SYMBOL", sep =
>> ""))) :
>> error in evaluating the argument 'x' in selecting a method for
>> function 'revmap': Error in get(paste(gsub(".db", "", annpack),
>> "SYMBOL", sep = "")) :
>> object 'SYMBOL' not found
>>
>> Which could still be due to creating an incorrect sml_Set.
>>
>
> These error messages can be clarified, sorry about this.
>
>>
>> Thank you once more for your all your advices!
>> -f
>>
>> 2011/11/29 Vincent Carey <stvjc at channing.harvard.edu>:
>> >
>> >
>> > On Tue, Nov 29, 2011 at 5:20 AM, francy [guest] <guest at bioconductor.org>
>> > wrote:
>> >>
>> >>
>> >> Hi all,
>> >>
>> >> I am trying to find eQTLs in or around a particular gene with probe ID=
>> >> "10023813203" (gene is APOE). I have first selected the SNPs on only my
>> >> chromosome of interest (chr19), then imported the plink files only for
>> >> this
>> >> chromosome doing this:
>> >>
>> >> snp.matrix<-read.plink("plink.bed", "plink.bim",
>> >> "plink.fam",select.snps=chr19)
>> >>
>> >> I was able to create an expression set (called 'es'), and a sml_Set by
>> >> doing this:
>> >>
>> >> ss<- make_smlSet(es, list("1"=snp.matrix$genotypes))
>> >>
>> >> but I can't seem to go beyond and use this sml_Set to perform the
>> >> association.
>> >>
>> >> When I try this or other combinations of this (e.g. using 'APOE')
>> >> f1 = gwSnpTests(probeId("10023813203"), ss)
>> >>
>> >> Error in function (classes, fdef, mtable) :
>> >> unable to find an inherited method for function "gwSnpTests", for
>> >> signature "character", "smlSet", "missing", "missing"
>> >>
>> >
>> > When you encounter an error of this sort, please check what signatures
>> > are
>> > supported:
>> >
>> >> showMethods("gwSnpTests")
>> > Function: gwSnpTests (package GGtools)
>> > sym="formula", sms="smlSet", cnum="cnumOrMissing", cs="missing"
>> > sym="formula", sms="smlSet", cnum="snpdepth", cs="missing"
>> >
>> > This shows that the first argument should be a formula, and suggests
>> > that
>> > you can have only two arguments if you like.
>> >
>> > If you change your call to
>> >
>> > f1 = gwSnpTests(probeId("10023813203")~1, ss)
>> >
>> > I would expect it to succeed. Note the example code, from R 2.14, which
>> > you
>> > really should be using at this time:
>> >
>> >> hmceuB36.2021 <- getSS("GGtools", c("20", "21")) # construct smlSet
>> >> from prepackaged data
>> >> hmFou = hmceuB36.2021[, which(hmceuB36.2021$isFounder)] # filter
>> >> samples
>> >> to 'founders'
>> >> f1 = gwSnpTests(genesym("CPNE1")~male, hmFou, chrnum(20)) # execute
>> >> simple set of tests
>> >> f1 # there are 119921 tests, so have a concise report
>> > gwSnpScreenResult for gene CPNE1 [probe GI_23397697-A ]
>> >> topSnps(f1) # get top results
>> > p.val
>> > rs17093026 3.735759e-10
>> > rs1118233 1.272958e-09
>> > rs12480408 1.360682e-09
>> > rs6060535 1.360682e-09
>> > rs11696527 1.360682e-09
>> > rs6058303 1.360682e-09
>> > rs6060578 1.360682e-09
>> > rs2425078 1.360682e-09
>> > rs1970357 1.360682e-09
>> > rs7273815 1.806197e-09
>> >
>> >
>> >
>> >
>> >>
>> >> My first question is, why is the 'gwSnpTests' not working?
>> >
>> >
>> > Please use a supported call sequence.
>> >
>> >>
>> >> My second question is, do I have to select the chromosome I am
>> >> interested
>> >> in before creating the sml test? I would have liked to select
>> >> chromosome 19
>> >> after so that I could analyse more than this one chromosome if I wanted
>> >> to…Is this possible?? It seems as the snp.matrix must be in the form
>> >> of a
>> >> list, so maybe I have to create a list of all the chromosomes?
>> >>
>> >> THANK YOU VERY VERY MUCH FOR ANY HELP YOU COULD GIVE ME!! I really
>> >> appreciate it!
>> >>
>> >
>> > The signatures show that you can omit the chromosome if you wish. In
>> > this
>> > case,
>> >
>> >> f2 = gwSnpTests(genesym("CPNE1")~male, hmFou)
>> >> f2
>> > gwSnpScreenResult for gene CPNE1 [probe GI_23397697-A ]
>> >> topSnps(f2)
>> > $`20`
>> > p.val
>> > rs17093026 3.735759e-10
>> > rs1118233 1.272958e-09
>> > rs12480408 1.360682e-09
>> > rs6060535 1.360682e-09
>> > rs11696527 1.360682e-09
>> > rs6058303 1.360682e-09
>> > rs6060578 1.360682e-09
>> > rs2425078 1.360682e-09
>> > rs1970357 1.360682e-09
>> > rs7273815 1.806197e-09
>> >
>> > $`21`
>> > p.val
>> > rs2823672 3.024310e-05
>> > rs4257464 4.340207e-05
>> > rs16994832 4.340207e-05
>> > rs2823676 4.340207e-05
>> > rs2823677 4.340207e-05
>> > rs8131686 4.340207e-05
>> > rs2823683 4.340207e-05
>> > rs238983 4.340207e-05
>> > rs2828436 4.340207e-05
>> > rs2828438 4.340207e-05
>> >
>> > gwSnpTests will operate on all the SNPs in the smList(smlSet) and return
>> > lists organized by
>> > chromosome. Hence the "gw" -- it is possible to compute a genome-wide
>> > search for eQTL if the
>> > smlSet contains SNP from all chromosomes. A few years ago this was
>> > quite
>> > reasonable when we
>> > handled, say, 4 million SNP. Then it became less reasonable when we
>> > started
>> > to work with 8 million
>> > SNP. So the infrastructure changed to deemphasize working with all
>> > chromosomes at once -- thus the
>> > introduction of getSS() to construct the smlSet for a selected set
>> > (typically only one) of chromosomes of SNP.
>> >
>> > Concisely computing and managing results from transcriptome x genome
>> > searches is addressed by
>> > the eqtlTests function and by genewiseFDRtab ... these functions are
>> > under
>> > development to simplify these
>> > tasks, which can be arduous as large imputed SNP panels come into play.
>> > Thus it is relevant to work with
>> > the devel branch as you start hitting limits. I will provide more news
>> > as
>> > work proceeds. As the initial discussion
>> > of this topic occurred on biostar list, I will note for other readers
>> > that a
>> > tutorial on using GGtools with R 2.14 is
>> > available at ismb11gg.wordpress.com, and that the ggtut experimental
>> > data
>> > package underlies the tutorial. Of
>> > note is that ggtut will not pass check with R devel, because some
>> > serialized
>> > objects conflict with revised class
>> > definitions. This will be sorted before too long.
>> >
>> >
>> >>
>> >> -- output of sessionInfo():
>> >
>> >
>> > I strongly advise you to upgrade to R 2.14. My sessionInfo for the runs
>> > above is
>> >
>> > R version 2.14.0 Patched (2011-11-09 r57622)
>> > Platform: x86_64-unknown-linux-gnu (64-bit)
>> >
>> > locale:
>> > [1] LC_CTYPE=en_US.iso88591 LC_NUMERIC=C
>> > [3] LC_TIME=en_US.iso88591 LC_COLLATE=en_US.iso88591
>> > [5] LC_MONETARY=en_US.iso88591 LC_MESSAGES=en_US.iso88591
>> > [7] LC_PAPER=C LC_NAME=C
>> > [9] LC_ADDRESS=C LC_TELEPHONE=C
>> > [11] LC_MEASUREMENT=en_US.iso88591 LC_IDENTIFICATION=C
>> >
>> > attached base packages:
>> > [1] splines stats graphics grDevices datasets tools utils
>> > [8] methods base
>> >
>> > other attached packages:
>> > [1] illuminaHumanv1.db_1.12.1 GGtools_4.1.8
>> > [3] ff_2.2-3 bit_1.1-7
>> > [5] GenomicRanges_1.6.2 org.Hs.eg.db_2.6.4
>> > [7] rtracklayer_1.14.2 RCurl_1.7-0
>> > [9] bitops_1.0-4.1 IRanges_1.12.1
>> > [11] annotate_1.32.0 AnnotationDbi_1.16.2
>> > [13] GGBase_3.15.2 genefilter_1.36.0
>> > [15] RSQLite_0.10.0 DBI_0.2-5
>> > [17] snpStats_1.4.0 Matrix_1.0-1
>> > [19] lattice_0.20-0 survival_2.36-10
>> > [21] BiocGenerics_0.1.0 Biobase_2.14.0
>> > [23] weaver_1.20.0 codetools_0.2-8
>> > [25] digest_0.5.1 BiocInstaller_1.2.1
>> >
>> > loaded via a namespace (and not attached):
>> > [1] Biostrings_2.22.0 BSgenome_1.22.0 grid_2.14.0
>> > Rsamtools_1.7.1
>> > [5] XML_3.4-3 xtable_1.6-0 zlibbioc_1.0.0
>> >
>> >>
>> >>
>> >> > sessionInfo()
>> >> R version 2.13.1 (2011-07-08)
>> >> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>> >>
>> >> locale:
>> >> [1] en_GB.UTF-8/en_GB.UTF-8/C/C/en_GB.UTF-8/en_GB.UTF-8
>> >>
>> >> attached base packages:
>> >> [1] tools splines stats graphics grDevices utils
>> >> datasets
>> >> [8] methods base
>> >>
>> >> other attached packages:
>> >> [1] GGtools_3.10.2 ff_2.2-3 bit_1.1-7
>> >> [4] GenomicRanges_1.4.8 org.Hs.eg.db_2.5.0 rtracklayer_1.12.5
>> >> [7] RCurl_1.6-10 bitops_1.0-4.1 IRanges_1.10.6
>> >> [10] annotate_1.30.1 AnnotationDbi_1.14.1 GGBase_3.12.0
>> >> [13] RSQLite_0.10.0 DBI_0.2-5 snpStats_1.2.1
>> >> [16] Matrix_0.999375-50 lattice_0.19-33 survival_2.36-9
>> >> [19] Biobase_2.12.2
>> >>
>> >> loaded via a namespace (and not attached):
>> >> [1] Biostrings_2.20.4 BSgenome_1.20.1 grid_2.13.1 XML_3.4-3
>> >> [5] xtable_1.5-6
>> >>
>> >>
>> >> --
>> >> Sent via the guest posting facility at bioconductor.org.
>> >>
>> >> _______________________________________________
>> >> Bioconductor mailing list
>> >> Bioconductor at r-project.org
>> >> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> >> Search the archives:
>> >> http://news.gmane.org/gmane.science.biology.informatics.conductor
>> >
>> >
>
>
More information about the Bioconductor
mailing list