[BioC] how to find out the differentially expressed genes?how to downweight the arrays?
Adaikalavan Ramasamy
ramasamy at cancer.org.uk
Sat Sep 24 08:49:35 CEST 2005
I am not sure if someone has answered you.
First regarding small sample sizes. The following article shows that
LIMMA does a pretty decent job for small sample sizes
http://www3.interscience.wiley.com/cgi-bin/abstract/110492632/ABSTRACT
However, with small sample sizes, poor-quality experiments, outliers and
misclassified samples have a much larger influence. This is an argument
against small sample sizes in general.
Second, you have a highly imbalanced dataset. As I mentioned before, you
need to increase the size of the normal group. So I am not surprised
your results are "not nice", and I suspect they will be quite
sensitive to the choice of test (i.e. not robust).
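
To put a rough number on the imbalance: in a plain two-group comparison with equal per-sample variance, the standard error of the difference in means scales as sqrt(1/n1 + 1/n2). The sketch below is base R only. The helper name `se_factor` and the alternative group sizes are made up for illustration, and the moderated t-statistic in limma is more elaborate than this simple formula.

```r
# Relative standard error of a difference between two group means,
# assuming equal variance in both groups: sqrt(1/n1 + 1/n2).
se_factor <- function(n1, n2) sqrt(1 / n1 + 1 / n2)

current  <- se_factor(14, 3)   # the design in this thread: 14 tumor, 3 normal
more_nrm <- se_factor(14, 14)  # hypothetical: 11 additional normal samples
same_n   <- se_factor(9, 8)    # hypothetical: the same 17 arrays, nearly balanced

round(c(current = current, more_normals = more_nrm, balanced17 = same_n), 2)
```

Under these assumptions the factor is about 0.64 for the 14 vs 3 design and about 0.38 for 14 vs 14, so adding normal samples is where most of the gain in precision would come from.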
Third, you can ask your biologists whether the top k genes are
interesting despite being non-significant. If they find them interesting,
then suggest they find the money to do a larger study.
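
For context on how genes can sit at the top of a ranking and still be non-significant: the adjust="fdr" option in the quoted topTable() call corresponds to the Benjamini-Hochberg adjustment, which base R provides as p.adjust(). The sketch below uses made-up p-values with no real signal, and the probe-set count of 22283 is an assumption about the chip, so it only illustrates the mechanics and says nothing about the actual data.

```r
set.seed(42)
n_probes <- 22283            # ASSUMPTION: approximate number of probe sets on the chip
raw_p <- runif(n_probes)     # made-up p-values for a scenario with no real signal

# Benjamini-Hochberg adjustment ("fdr" is an alias for "BH" in p.adjust)
adj_p <- p.adjust(raw_p, method = "BH")

# The genes with the smallest raw p-values still form a ranked "top k" list,
# even when their adjusted p-values are far from significant.
top <- order(raw_p)[1:5]
data.frame(raw = signif(raw_p[top], 2), adjusted = round(adj_p[top], 2))
```

The ranking itself is unaffected by the adjustment, which is why showing the top k genes to a biologist can still be informative.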
Now answering your specific questions :
1) This depends on the type and number of arrays as well as what else is
running in the background. But I assume you have already done this to
produce the output below, so I am not sure what your question is.
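
As a back-of-the-envelope check on the memory question, the raw intensities alone can be sized as below. The 712 x 712 feature count is an assumption about the HG-U133A chip (suggested by the hgu133acdf entry in the directory listing). Peak usage during ReadAffy() and rma() is likely to be some multiple of this figure, because R makes working copies.

```r
# Rough size of one copy of the raw probe-intensity matrix.
# ASSUMPTION: HG-U133A has 712 x 712 = 506,944 features per array.
n_features       <- 712 * 712
n_arrays         <- 17
bytes_per_double <- 8

raw_mb <- n_features * n_arrays * bytes_per_double / 1024^2
round(raw_mb)  # size in MB of a single copy of the intensity matrix
```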
2) Google gave http://bioinf.wehi.edu.au/folders/arrayweights/ but I
cannot find a related article, which means that it might be in
preparation or in press. You can try asking the first author or look at
the simple example given in help("arrayWeights").
3) From the example given in help("arrayWeights"), it appears that the
weighting is only incorporated during model fitting, so the original
expression values should not be changed.
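
Points 2) and 3) can be sketched as follows, assuming limma is installed. The expression matrix is simulated as a stand-in for exprs(eset), and array 5 is made artificially noisy purely for illustration. arrayWeights() returns one relative weight per array (so every array gets a weight, with poorer arrays receiving smaller ones), and the weights are passed to lmFit() through its weights argument.

```r
library(limma)

set.seed(1)
# Simulated stand-in for exprs(eset): 200 probes x 17 arrays (made-up data)
expr <- matrix(rnorm(200 * 17, mean = 6), nrow = 200, ncol = 17)
expr[, 5] <- expr[, 5] + rnorm(200, sd = 2)  # make array 5 artificially noisy
before <- expr                               # copy to compare against afterwards

tissue <- factor(c(rep("C", 14), rep("N", 3)))
design <- model.matrix(~ tissue)

# One relative quality weight per array, estimated from the model residuals
aw <- arrayWeights(expr, design)

# The weights enter only here, in the weighted linear model fit
fit <- eBayes(lmFit(expr, design, weights = aw))

identical(expr, before)  # TRUE: the expression matrix itself is untouched
round(aw, 2)             # inspect the estimated weights, one per array
```

With real data the same pattern would use the ExpressionSet in place of the simulated matrix. Whether the weighting improves the results will depend on how variable the array quality actually is.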
Regards, Adai
On Thu, 2005-09-15 at 21:27 -0700, weinong han wrote:
> 17 samples (3 normal samples, 14 NPC tumor samples from different patients) were used in my Affymetrix microarray experiments. LIMMA was recommended for analyzing small microarray experiments like this. After computing the moderated t-statistic, I found the results were not so nice. Please see the attachment.
>
> Quality assessment was recommended as one of the first steps of data analysis. In some cases poor-quality arrays will have to be dropped, but an alternative is to downweight the lower-quality arrays using the arrayWeights() function in limma or array-level standard errors from affyPLM.
>
> My Questions:
> 1. My RAM is 512 MB (Windows XP). Is that enough for 17 Affymetrix chips?
> 2. If poor-quality arrays are found, how do I downweight them using the arrayWeights() function in limma or array-level standard errors from affyPLM? Are all chips downweighted, or only the poor-quality arrays?
> 3. Will the downweighting change the original expression values or not?
>
> Any advice and suggestions will be much appreciated.
>
> dir()
> > [1] "G05.CEL" "G09.CEL" "G10.CEL" "G12.CEL" "G15.CEL"
> > [6] "G19.CEL" "GF.CEL" "GM.CEL" "H044.CEL" "H05.CEL"
> >[11] "H07.CEL" "H10.CEL" "H11.CEL" "H14.CEL" "hgu133acdf"
> >[16] "N01.CEL" "N02.CEL" "N03.CEL"
> > > library(limma)
> > > library(affy)
> >Loading required package: Biobase
> >Loading required package: tools
> >Welcome to Bioconductor
> > Vignettes contain introductory material. To view,
> > simply type: openVignette()
> > For details on reading vignettes, see
> > the openVignette help page.
> >Loading required package: reposTools
> > > Data <- ReadAffy()
> > > eset <- rma(Data)
> >Background correcting
> >Normalizing
> >Calculating Expression
> > > pData(eset)
> > sample
> >G05.CEL 1
> >G09.CEL 2
> >G10.CEL 3
> >G12.CEL 4
> >G15.CEL 5
> >G19.CEL 6
> >GF.CEL 7
> >GM.CEL 8
> >H044.CEL 9
> >H05.CEL 10
> >H07.CEL 11
> >H10.CEL 12
> >H11.CEL 13
> >H14.CEL 14
> >N01.CEL 15
> >N02.CEL 16
> >N03.CEL 17
> > > tissue <- c("C","C","C","C","C","C","C","C","C","C","C","C","C","C","N","N","N")
> > > design <- model.matrix(~factor(tissue))
> > > colnames(design) <- c("C", "CvsN")
> > > design
> > C CvsN
> >1 1 0
> >2 1 0
> >3 1 0
> >4 1 0
> >5 1 0
> >6 1 0
> >7 1 0
> >8 1 0
> >9 1 0
> >10 1 0
> >11 1 0
> >12 1 0
> >13 1 0
> >14 1 0
> >15 1 1
> >16 1 1
> >17 1 1
> >attr(,"assign")
> >[1] 0 1
> >attr(,"contrasts")
> >attr(,"contrasts")$"factor(tissue)"
> >[1] "contr.treatment"
> >
> >
> > > fit <-lmFit(eset,design)
> > > fit <-eBayes(fit)
> > > options(digits=2)
> > > topTable(fit,coef=2,n=50,adjust="fdr")
> > ID M A t P.Value B
> >22193 78047_s_at 0.60 7.3 5.3 0.82 -3.4
> >2594 203065_s_at -1.26 6.7 -5.0 0.82 -3.5
> >10680 211245_x_at 0.58 4.9 4.7 1.00 -3.6
> >17919 218554_s_at 0.59 4.7 4.5 1.00 -3.6
> >9431 209945_s_at -0.67 6.1 -4.5 1.00 -3.6
> >4556 205029_s_at 3.09 3.6 4.4 1.00 -3.6
> >4557 205030_at 3.58 4.6 4.3 1.00 -3.6
> >5845 206319_s_at 0.82 4.0 4.3 1.00 -3.7
> >21838 36019_at 0.67 6.7 4.2 1.00 -3.7
> >5209 205682_x_at 0.61 4.8 4.2 1.00 -3.7
> >6791 207266_x_at -0.95 7.8 -4.0 1.00 -3.7
> >21916 38447_at 0.66 7.3 4.0 1.00 -3.7
> >21914 38340_at 0.59 6.3 3.9 1.00 -3.8
> >16241 216871_at 0.59 3.4 3.9 1.00 -3.8
> >982 201454_s_at -0.65 6.2 -3.9 1.00 -3.8
> >22024 46256_at 0.62 7.2 3.9 1.00 -3.8
> >7489 207978_s_at 0.47 4.3 3.8 1.00 -3.8
> >4452 204925_at 0.48 5.0 3.8 1.00 -3.8
> >7121 207600_at 0.48 5.5 3.7 1.00 -3.8
> >12443 213060_s_at 1.41 6.0 3.7 1.00 -3.8
> >1619 202091_at 0.51 3.3 3.7 1.00 -3.8
> >9890 210412_at 0.53 3.5 3.6 1.00 -3.8
> >21922 38707_r_at 0.45 7.8 3.6 1.00 -3.9
> >2715 203187_at 0.59 5.8 3.6 1.00 -3.9
> >3354 203827_at -0.99 5.5 -3.6 1.00 -3.9
> >5340 205813_s_at 0.52 5.8 3.5 1.00 -3.9
> >2445 202916_s_at -0.61 6.1 -3.5 1.00 -3.9
> >18810 219446_at -0.68 5.9 -3.5 1.00 -3.9
> >14010 214632_at -0.54 4.2 -3.4 1.00 -3.9
> >2915 203388_at 0.46 6.2 3.4 1.00 -3.9
> >21936 396_f_at 0.70 7.7 3.4 1.00 -3.9
> >16292 216922_x_at 0.61 3.8 3.4 1.00 -3.9
> >13378 213999_at 0.44 4.5 3.4 1.00 -3.9
> >9642 210158_at 0.58 4.4 3.4 1.00 -3.9
> >19117 219753_at 0.65 5.6 3.4 1.00 -3.9
> >10820 211405_x_at 0.53 5.3 3.4 1.00 -3.9
> >19242 219878_s_at -0.58 4.5 -3.4 1.00 -3.9
> >3275 203748_x_at -0.90 7.9 -3.4 1.00 -3.9
> >16554 217187_at 0.58 5.7 3.4 1.00 -3.9
> >8627 209133_s_at 0.54 4.7 3.3 1.00 -3.9
> >17983 218618_s_at -1.15 8.0 -3.3 1.00 -3.9
> >20977 221615_at 0.50 3.7 3.3 1.00 -3.9
> >18562 219198_at 0.54 5.7 3.3 1.00 -3.9
> >19513 220149_at 0.58 4.8 3.3 1.00 -3.9
> >1770 202242_at 1.04 5.4 3.3 1.00 -3.9
> >10081 210616_s_at -0.56 8.4 -3.3 1.00 -3.9
> >17995 218630_at 0.37 5.4 3.3 1.00 -3.9
> >3018 203491_s_at -0.67 5.1 -3.3 1.00 -3.9
> >10823 211410_x_at 0.56 5.3 3.3 1.00 -3.9
> >16351 216981_x_at 0.57 6.3 3.3 1.00 -3.9
>
>
> Best Regards
>
> Han Weinong
> hanweinong at yahoo.com
>
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
>