[BioC] Newbie question

James MacDonald jmacdon at med.umich.edu
Tue Jan 18 17:36:48 CET 2011


Hi Rama,
-- 

James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826


>>> quantrum75 <quantrum75 at yahoo.com> wrote:
> Hi There,
> I am a newbie to Bioconductor and am trying to get off the ground with 
> my analysis. My question is as follows. 
> 
> 1) I have a list of 1000-2000 SNPs which are associated with a small set of 
> diseases (hemochromatosis etc.).

And what sort of data do you have? RSIDs?


> 
> 2) We would like to use the Affy 6.0 chip to look for the presence of these 
> SNPs experimentally.
> 
> 3) Even before we use the chips, how do I verify that these SNPs are 
> actually located on the Affy 6.0 chip?

There are numerous ways to do this. You could, for instance, download the snpArrayAffy6 table from the UCSC Table Browser here

http://genome.ucsc.edu/cgi-bin/hgTables?hgsid=183744723&clade=mammal&org=Human&db=hg18&hgta_group=allTables&hgta_track=hg18&hgta_table=snpArrayAffy6&hgta_regionType=range&position=chr1%3A1-48&hgta_outputType=primaryTable&hgta_outFileName=

then import it into R and see how much overlap there is with your list.
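
If you go the download route, the overlap check might look something like the sketch below. It is only an illustration under a few assumptions: the table contents are made up so the example runs on its own (you would point read.delim() at your real download instead), the column names follow the snpArrayAffy6 schema as I understand it (in particular rsId), and my_rsids is a placeholder for your own list.

```r
# Made-up stand-in for the downloaded snpArrayAffy6 table, written to a
# temporary file so the example is self-contained. With a real download you
# would skip this step and pass your own file path to read.delim().
affy_file <- tempfile(fileext = ".txt")
writeLines(c(
  "#bin\tchrom\tchromStart\tchromEnd\tname\tscore\tstrand\tobserved\trsId",
  "585\tchr1\t100\t101\tSNP_A-0000001\t0\t+\tA/G\trs0000001",
  "585\tchr1\t200\t201\tSNP_A-0000002\t0\t+\tC/T\trs0000002",
  "585\tchr1\t300\t301\tSNP_A-0000003\t0\t-\tA/C\trs0000003"
), affy_file)

# Read the tab-delimited table; the first line is used as the header.
affy <- read.delim(affy_file, stringsAsFactors = FALSE)

# Placeholder list of SNPs of interest (not real data).
my_rsids <- c("rs0000001", "rs0000003", "rs9999999")

# Split the list into SNPs that are on the chip and SNPs that are not.
on_chip     <- my_rsids[my_rsids %in% affy$rsId]
not_on_chip <- setdiff(my_rsids, affy$rsId)

on_chip
not_on_chip
```
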

Or you could query it directly using RMySQL:

> library(RMySQL)
Loading required package: DBI
> con <- dbConnect(MySQL(), user="genome", host="genome-mysql.cse.ucsc.edu", dbname="hg18")
> fakersids <- paste("rs", sample(1000:10000, 200), sep = "")
> head(fakersids)
[1] "rs7579" "rs2720" "rs4184" "rs9439" "rs3179" "rs4682"
> sql <- paste("select name from snpArrayAffy6 where rsId in ('", paste(fakersids, collapse = "','"), "');", sep = "")
> length(dbGetQuery(con, sql)[,1])
[1] 27
> dbGetQuery(con, sql)
               name
1     SNP_A-2276689
2     SNP_A-8607650
3     SNP_A-8288820
4     SNP_A-2077380
5     SNP_A-2122995
6     SNP_A-8336953
7     SNP_A-8364177
<snip>


LOL. So there are 27 of my fake RSIDs on the Affy SNP6 chip.
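
To get the two lists you describe in (4), something along these lines might work. This is just a sketch: build_sql() and split_by_chip() are hypothetical helper names, the query asks for rsId as well as name so each probe set can be matched back to its SNP, and the hits data frame is made up to stand in for what dbGetQuery(con, build_sql(my_rsids)) would return.

```r
# Build an IN (...) query like the one above, but also return rsId so that
# each probe set name can be tied back to the SNP it belongs to.
build_sql <- function(rsids) {
  paste0("select name, rsId from snpArrayAffy6 where rsId in ('",
         paste(unique(rsids), collapse = "','"), "');")
}

# Given the requested rsIDs and the query result, separate the SNPs that
# came back from the ones that did not.
split_by_chip <- function(rsids, hits) {
  present <- rsids %in% hits$rsId
  list(on_chip = rsids[present], not_on_chip = rsids[!present])
}

# Placeholder input (not real data).
my_rsids <- c("rs0000001", "rs0000002", "rs0000003")

# Made-up result standing in for dbGetQuery(con, build_sql(my_rsids)).
# The same rsId appears on two rows here to show that counting rows is not
# necessarily the same as counting distinct SNPs.
hits <- data.frame(name = c("SNP_A-0000001", "SNP_A-0000002"),
                   rsId = c("rs0000001", "rs0000001"),
                   stringsAsFactors = FALSE)

res <- split_by_chip(my_rsids, hits)
res$on_chip
res$not_on_chip
```

Counting length(unique(hits$rsId)) rather than the number of rows guards against an rsID being returned more than once.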

Best,

Jim



> 
> 4) I am assuming I will need to write a small R script that takes my list 
> of SNPs, compares it against a database of all the SNPs on the 6.0 chip, 
> and splits it into two lists: the SNPs that are present on the chip and 
> the ones that are not.
> 
> 5) The problem is, I do not know which Bioconductor package to start 
> with. There are well over 400 packages and I am confused as to which 
> ones to begin with (affy? oligo? affxparser?).
> 
> I understand this is extremely basic, but I would be glad for any 
> information regarding a workflow I could adopt, and also any possible 
> literature references.
> Thank you.
> Rama
> 
> 
> 
>       
