[R] Help with the lumi R package

Fri Mar 30 20:10:08 CEST 2012

Hi Amy,

You might repost this question to the Bioconductor mailing list:
you'll get more specialized help there. But to answer your questions
as best I can see inline:

On Fri, Mar 30, 2012 at 12:15 PM, Minyue Wang <mwang3 at ncsu.edu> wrote:
> Hi all,
> My name is Amy, I am a masters student in Bioinformatics at North Carolina
> State University. I am working on a project and I am trying to use the lumi
> R package for microarray data analysis. I have shown the sample code here
> and have questions about modifying the sample code for my own data.
>
>
> lumi package in R, example.lumi, the sample data has 8000 features and 4
> samples
>
> I have highlighted the code I have questions on in red,

Well, the mail server strips HTML so we can't see those, but I think I
know where you are talking about.

> my data has 4
> different types of samples, each repeated 6 times, so a total of 24 samples
> and about 48,000 rows. how should I identify my sampleType in my case? also
> what does colnames(design) <- c('100:0', '95:5-100:0') do, which columns
> exactly does it take into consideration? Thanks!

This will hit all the columns -- however, if you don't give enough
names, it throws an error. For example,

x <- matrix(1:6, nrow = 2); colnames(x) <- letters[1:3]; rownames(x)
<-LETTERS[1:2]
print(x)

colnames(x) <- "A" # Problem!
colnames(x) <- c("X","Y", "Z") # Good to go.

>
>
> so the sample code i'm trying to follow is below:
>
> ###################################################
>
> ### code chunk number 30: filtering
>
> ###################################################
>
> presentCount <- detectionCall(example.lumi)
>
> selDataMatrix <- dataMatrix[presentCount > 0,]
>
> probeList <- rownames(selDataMatrix)
>
>
>
>
>
> ###################################################
>
> ### code chunk number 31: Identify differentially expressed genes
>
> ###################################################
>
> ## Specify the sample type
>
> sampleType <- c('100:0', '95:5', '100:0', '95:5')

I'm not sure where you got this code, but if I were using it, I'd
rewrite it as: sampleType <- rep(c("100:0", "95:5"), 2) and then you
can just change those to specify sampleType (after making my other
change below)

>
> if (require(limma)) {
>
>                ##  compare '95:5' and '100:0'
>
>                design <- model.matrix(~ factor(sampleType))
>
>                colnames(design) <- c('100:0', '95:5-100:0')

colnames(design) <- sampleType[1:2]

>
>                fit <- lmFit(selDataMatrix, design)
>
>                fit <- eBayes(fit)
>
>                ## Add gene symbols to gene properties
>
>                if (require(lumiHumanAll.db) & require(annotate)) {
>
>               geneSymbol <- getSYMBOL(probeList, 'lumiHumanAll.db')
>
>               geneName <- sapply(lookUp(probeList, 'lumiHumanAll.db',
> 'GENENAME'), function(x) x[1])
>
>               fit$genes <- data.frame(ID= probeList,
> geneSymbol=geneSymbol, geneName=geneName, stringsAsFactors=FALSE)
>
>          }
>
>                ## print the top 10 genes
>
>                print(topTable(fit, coef=colnames(design), adjust='fdr',
> number=10))

Change here too!

Hope that helps, [But no promises it's perfect -- i'm not a geneticist!]

Michael

>
>
>
>                ## get significant gene list with FDR adjusted p.values
> less than 0.01
>
>                p.adj <-
> p.adjust(fit$p.value[,2])
>
>                sigGene.adj <- probeList[ p.adj < 0.01]
>
>                ## without FDR adjustment
>
>                sigGene <- probeList[ fit$p.value[,2] < 0.01]
>
> }
>
>
>
> --
> - Amy W.
>
> --
> Minyue Wang (Amy)
> Graduate student, Bioinformatics
> mwang3 at ncsu.edu
> 919-5210893
>
>        [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.