[BioC] where to start?
yousef at wistar.org
Wed Apr 20 18:06:19 CEST 2005
I have data that has been preprocessed to give the expression of each
gene: 19200 genes are involved in the experiments, and I have 186
samples. The samples define 32 phenotypes (classes). I would like to find
the significant genes for 10 different combinations of classes and then
find the intersection of those lists of significant genes.
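As an illustration of the last step only, here is a minimal sketch in Python (standard library, not a Bioconductor function) of intersecting several lists of significant genes. The function name `common_genes` and the gene identifiers are hypothetical, and the lists are made up; how each list is obtained is not addressed here.

```python
def common_genes(gene_lists):
    """Return the set of gene IDs present in every list (hypothetical helper)."""
    sets = [set(genes) for genes in gene_lists]
    if not sets:
        return set()
    # Set intersection keeps only IDs shared by all comparisons
    return set.intersection(*sets)

# Made-up significant-gene lists from three class comparisons
lists = [
    ["gene1", "gene2", "gene5"],
    ["gene2", "gene5", "gene9"],
    ["gene5", "gene2"],
]
shared = sorted(common_genes(lists))
```

Using sets means duplicates within one list and the order of genes do not affect the result.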
My problem is how to read this simple data into any Bioconductor
package, since it seems to me that the Bioconductor input formats mostly
require the image format (or I'm missing something here). I want to read
the input file and keep track of the gene ID and the gene name.
So please provide me with a simple example of reading this input format
into any basic Bioconductor package. For simplicity, consider that we have
a table as follows:
GenId GeneName Sample1 Sample2 Sample3 Sample4 Sample5 ......SampleN
Class C1 C1 C2 C3 C4 C1
1 gene1 0.04 0.05 0.06 0.7 0.8 ....... 0.9
Where the second row has the class labels, and from the third row on we
have the gene expression values (just numbers!!).
So I want to read this format into a specific Bioconductor package (say
limma?) and start applying different functions.
So again, I want to know how to read this file into the package.
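To make the layout above concrete, here is a minimal sketch of parsing it, written in Python with the standard library only; it is not a Bioconductor or limma reader. It assumes a whitespace-delimited file, a header row beginning with the two gene columns, a second row that starts with the literal word `Class` followed by one label per sample, and gene names without spaces. The function name `read_expression_table` and the toy values are hypothetical.

```python
import io

def read_expression_table(handle):
    """Parse header row, Class row, then one row per gene (assumed layout)."""
    header = handle.readline().split()
    samples = header[2:]                     # skip the GenId and GeneName columns
    class_row = handle.readline().split()
    if not class_row or class_row[0] != "Class":
        raise ValueError("second row must start with 'Class'")
    classes = class_row[1:]                  # one class label per sample
    if len(classes) != len(samples):
        raise ValueError("need exactly one class label per sample")
    genes = []                               # (gene ID, gene name) for each row
    values = []                              # expression values for each row
    for line in handle:
        fields = line.split()
        if not fields:
            continue                         # ignore blank lines
        if len(fields) != len(samples) + 2:
            raise ValueError("wrong number of columns: " + line.strip())
        genes.append((fields[0], fields[1]))
        values.append([float(x) for x in fields[2:]])
    return samples, classes, genes, values

# Made-up toy data in the assumed layout
toy = io.StringIO(
    "GenId GeneName Sample1 Sample2 Sample3\n"
    "Class C1 C1 C2\n"
    "1 gene1 0.04 0.05 0.06\n"
    "2 gene2 0.7 0.8 0.9\n"
)
samples, classes, genes, values = read_expression_table(toy)
```

Keeping the gene ID and gene name in a list parallel to the numeric rows preserves the annotation while leaving the expression values as plain numbers.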
From: Sean Davis [mailto:sdavis2 at mail.nih.gov]
Sent: Wednesday, April 20, 2005 2:56 AM
To: yousef at wistar.org
Cc: bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] where to start?
On Apr 20, 2005, at 1:56 AM, Malik Yousef wrote:
> I have a gene expression data set built up from rows of gene
> expression values as
> GeneID GeneName Sample1 .......... Samplen
> Category +1 ...........-1
> 1 gene1 0.5 ..............0.67
> 2 gene2 0.34 ............. 0.78
> How could I use Bioconductor to analyze this data set and get the most
> informative genes, classification, clustering, etc.?
You will have to decide what specific questions you want to answer
using your data. To get a sense of what bioconductor has to offer, try
The vignettes give a lot of detail about how to use different packages.
The BioConductor Short Courses are very helpful as a starting place.
When you run into specific problems, ask here. If you want more help
here, you will probably have to be more specific about your data, what
you have tried, and what hasn't worked. Single channel or two-color?
Patient samples or cell lines or something else? Expression or CGH?
How many classes of sample? What are the research