[BioC] Newbie question
James MacDonald
jmacdon at med.umich.edu
Tue Jan 18 17:36:48 CET 2011
Hi Rama,
--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826
>>> quantrum75 <quantrum75 at yahoo.com> wrote:
> Hi There,
> I am a newbie to Bioconductor. I am trying to get off the ground with
> regards to my analysis. My question is as follows.
>
> 1) I have a list of 1000-2000 SNP's which are associated with a set of few
> diseases (Hemochromatosis etc).
And what sort of data do you have? RSIDs?
>
> 2) We would like to use Affy6.0 Chip to look for the presence of these SNPs
> experimentally.
>
> 3) Even before we use the chips, How do I verify that these SNP's are
> actually located on the Affy6.0 chip?
There are several ways to do this. For instance, you could download the snpArrayAffy6 table from
http://genome.ucsc.edu/cgi-bin/hgTables?hgsid=183744723&clade=mammal&org=Human&db=hg18&hgta_group=allTables&hgta_track=hg18&hgta_table=snpArrayAffy6&hgta_regionType=range&position=chr1%3A1-48&hgta_outputType=primaryTable&hgta_outFileName=
then import it into R and check how many of your SNPs overlap with it.
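A rough sketch of that download-and-overlap route, under some assumptions: the file name in the comment is hypothetical, the toy data frame stands in for the downloaded table, and the probe IDs and RSIDs in it are made up. Only the `name` and `rsId` column names come from the snpArrayAffy6 table itself.

```r
# Toy stand-in for the downloaded table. With a real download you would
# instead do something like:
#   affy <- read.delim("snpArrayAffy6.txt", stringsAsFactors = FALSE)  # hypothetical file name
affy <- data.frame(
  name = c("SNP_A-0000001", "SNP_A-0000002", "SNP_A-0000003"),  # made-up probe IDs
  rsId = c("rs111", "rs222", "rs333"),                          # made-up RSIDs
  stringsAsFactors = FALSE
)

# Your own list of SNPs (made-up values here)
my_rsids <- c("rs222", "rs333", "rs999")

# How many of your SNPs are on the chip?
n_overlap <- sum(my_rsids %in% affy$rsId)

# Which Affy probe corresponds to each of your SNPs that is on the chip?
mapping <- merge(data.frame(rsId = my_rsids, stringsAsFactors = FALSE),
                 affy, by = "rsId")
print(mapping)
```

merge() keeps only the RSIDs found in both inputs, so the result doubles as a lookup from your RSIDs to Affy probe IDs.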
Or you could query it directly using RMySQL:
> library(RMySQL)
Loading required package: DBI
> con <- dbConnect(MySQL(), user="genome", host="genome-mysql.cse.ucsc.edu", dbname="hg18")
> fakersids <- paste("rs", sample(1000:10000, 200), sep = "")
> head(fakersids)
[1] "rs7579" "rs2720" "rs4184" "rs9439" "rs3179" "rs4682"
> sql <- paste("select name from snpArrayAffy6 where rsId in ('", paste(fakersids, collapse = "','"), "');", sep = "")
> length(dbGetQuery(con, sql)[,1])
[1] 27
> dbGetQuery(con, sql)
            name
1  SNP_A-2276689
2  SNP_A-8607650
3  SNP_A-8288820
4  SNP_A-2077380
5  SNP_A-2122995
6  SNP_A-8336953
7  SNP_A-8364177
<snip>
LOL. So there are 27 of my fake RSIDs on the Affy SNP6 chip.
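The query above returns only the Affy probe names, so it does not say which RSIDs matched. A minimal sketch of one way to split a list into SNPs that are on the chip and SNPs that are not, assuming you also select the rsId column: build_query() and split_by_chip() are hypothetical helper names, and the `hits` data frame is a made-up stand-in for what dbGetQuery() would return.

```r
# Build the SQL string, asking for rsId as well as the probe name
build_query <- function(rsids) {
  paste0("select name, rsId from snpArrayAffy6 where rsId in ('",
         paste(rsids, collapse = "','"), "');")
}

# Split the input RSIDs by whether they appear in the query result
split_by_chip <- function(rsids, hits) {
  list(present = intersect(rsids, hits$rsId),
       absent  = setdiff(rsids, hits$rsId))
}

my_rsids <- c("rs111", "rs222", "rs999")   # made-up RSIDs
sql <- build_query(my_rsids)

# With a live connection this would be: hits <- dbGetQuery(con, sql)
# Made-up result for illustration:
hits <- data.frame(name = c("SNP_A-0000001", "SNP_A-0000002"),
                   rsId = c("rs111", "rs222"),
                   stringsAsFactors = FALSE)

result <- split_by_chip(my_rsids, hits)
print(result)
```

For a list of 1000-2000 SNPs a single `in (...)` query like this should be manageable, though that is an assumption about the server rather than something tested here.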
Best,
Jim
>
> 4) I am assuming I will need to write a small R script of some sorts which
> will obtain the list of my SNP's -> Compare it against a database which has the
> numbers of the all the SNP's on 6.0 chip and append into an empty list the
> ones which are present on the chip and ones that are not.
>
> 5) The problem is, I do not know which software package in bioconductor to
> start using with...There are well over 400 packages and I am confused as to
> which ones to start off with (Affy?, Oligo? Affyparser?).
>
> I understand this is extremely basic, but I would be glad for any
> information regarding a workflow I could adopt, and also any possible
> literature references.
> Thank you.
> Rama