[BioC] nsFilter and GSEA
Robert Gentleman
rgentlem at fhcrc.org
Fri Jan 11 17:50:55 CET 2008
Hi,
It looks like something fairly odd is going on, and that we are not
seeing all of the code that is being run.
What chip are you using? What is very odd is that in your first
example 1098 "duplicate" probes are found, but in the second run only 3.
Basically this cannot happen (since the probes are the same) and
suggests that some piece of code has manipulated the names, and at that
point I think fairly bad things are going to happen. So this would be
one place to try and fix things.
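
A minimal sketch of how one might check for altered names, assuming the
probe set IDs are available as the row names of each expression matrix.
The matrices and IDs below are made up for illustration; on real data the
two vectors would come from something like featureNames(eset.mas) and
featureNames(eset.rma).

```r
# Toy stand-ins for the two expression matrices; the row names play the
# role of probe set IDs. All IDs here are hypothetical.
ids <- c("1000_at", "1001_at", "1002_at", "1003_at")
m.mas <- matrix(1:8, nrow = 4, dimnames = list(ids, c("s1", "s2")))
m.rma <- m.mas
rownames(m.rma)[2] <- "1001_at.1"   # simulate a name mangled by earlier code

# TRUE only if both objects carry the same IDs in the same order
same.names <- identical(rownames(m.mas), rownames(m.rma))

# IDs present in one object but not in the other
only.mas <- setdiff(rownames(m.mas), rownames(m.rma))
only.rma <- setdiff(rownames(m.rma), rownames(m.mas))
```

If same.names is FALSE, only.mas and only.rma show which IDs differ, which
should help narrow down where the names were changed.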
Second, nsFilter filters by default at the median, so you should
retain about half of your probe sets. You lose far more than that (you
didn't tell us the chip, so I can't be sure), and it looks like all of the
values are corrupt for that example as well.
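
To illustrate what filtering at the median means, here is a rough sketch
on simulated data. It only mimics the idea of a variance filter with IQR
as the spread measure and a 0.5 quantile cutoff; it is not nsFilter's
actual implementation, and the matrix x is made up.

```r
set.seed(1)
# 1000 simulated probe sets measured on 15 samples
x <- matrix(rnorm(1000 * 15), nrow = 1000)

# Per-probe-set spread, measured by the interquartile range
spread <- apply(x, 1, IQR)

# Keep only probe sets whose spread exceeds the median spread
keep <- spread > quantile(spread, 0.5)
frac.kept <- mean(keep)
```

On well-behaved data frac.kept comes out close to 0.5, so a fraction far
from that suggests a problem with the input values rather than with the
cutoff.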
So, I think that you are looking in the wrong place. Your problem is
probably earlier on.
best wishes
Robert
Paolo Innocenti wrote:
> Hi again,
>
> I tried with a different normalisation method, and I was pretty
> surprised by the results:
>
> > eset.mas <- mas5(mydata)
> background correction: mas
> PM/MM correction : mas
> expression values: mas
> background correcting...done.
> 14010 ids to be processed
> | |
> |####################|
> > eset.mas.f <- nsFilter(eset.mas)
> > eset.mas.f$filter.log
> $numDupsRemoved
> [1] 1098
>
> $numLowVar
> [1] 1
>
> $feature.exclude
> [1] 3
>
> $numRemoved.ENTREZID
> [1] 786
>
> > eset.rma <- rma(mydata)
> Background correcting
> Normalizing
> Calculating Expression
> > eset.rma.f <- nsFilter(eset.rma)
> > eset.rma.f$filter.log
> $numDupsRemoved
> [1] 3
>
> $numLowVar
> [1] 13047
>
> $feature.exclude
> [1] 3
>
> $numRemoved.ENTREZID
> [1] 786
>
> > dim(eset.rma.f$eset)
> Features Samples
> 171 15
> > dim(eset.mas.f$eset)
> Features Samples
> 12122 15
>
> I don't understand how this is possible. Any suggestions about what to do?
> Should I lower the cutoff for the rma, or does that processing method not
> work for my dataset?
>
> Paolo
> PS: I tried also a really low cutoff, but the situation doesn't change,
> unless I choose a cutoff=0.1:
>
> > eset.filter <- nsFilter(eset,var.cutoff=0.2)
> > eset.filter$filter.log
> $numDupsRemoved
> [1] 69
>
> $numLowVar
> [1] 10560
>
> $feature.exclude
> [1] 3
>
> $numRemoved.ENTREZID
> [1] 786
>
--
Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
PO Box 19024
Seattle, Washington 98109-1024
206-667-7700
rgentlem at fhcrc.org