[BioC] custom affy chip, mm values with NA (affy, makecdfenv)
James Bullard
bullard at berkeley.edu
Wed May 4 07:03:19 CEST 2005
Sorry to respond to my own thread, but I believe I have some more
insight on what is happening. From my previous post I was having some
problems with some mm values that were NA. I checked which probesets were
affected and got the following:
> indexProbes(affybatch, which = "both", "02280106180000.6757_pPM_GC")
$"02280106180000.6757_pPM_GC"
[1] 505546 NA
I have exactly 5 genes where this is true (pm = index, mm = NA). When I
examined the code for make.cdf.env there is a block:
if(length(mm)==0)
return(cbind(pm=pm, rep(NA,length(pm))))
else
return(cbind(pm=pm,mm=mm))
I put some print statements in there and the mm length is 0 for my five
genes. I assumed since the code was checking and handling this case that
it probably meant that NA values in the mm field were okay, but now I am
not sure that I can say either way. On a possibly related note, I
realized that the cbind was giving me many warnings saying that the
lengths of the columns don't match, i.e.:
1: number of rows of result
is not a multiple of vector length (arg 2) in: cbind(pm = pm, mm = mm)
This occurs in the cbind above for a fair number of the probesets. I
had assumed (again, probably incorrectly) that because the code wasn't
even checking whether they were the same length, these were
*ignorable* warnings. Now, upon typing that thought, I realize that it is
probably the opposite, but I am pretty sure I don't know enough about cdf
files to say what this indicates or how I can correct the problem.
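To make the two symptoms concrete, here is a minimal sketch in base R only. It does not use the real CDF: the environment, probeset names, and index values below are made up, and na.mm.sets is a hypothetical helper, not part of affy or makecdfenv. It assumes, as the make.cdf.env block above suggests, that each probeset in the environment is stored as a matrix with pm and mm columns.

```r
# Toy stand-in for a CDF environment (assumption: each probeset name maps
# to a two-column matrix of pm/mm indices; the values here are invented).
toy.env <- new.env()
assign("set_ok", cbind(pm = c(11, 12, 13), mm = c(21, 22, 23)), envir = toy.env)
assign("set_no_mm", cbind(pm = 31, mm = NA), envir = toy.env)

# Hypothetical helper: names of probesets whose mm column contains an NA.
na.mm.sets <- function(env) {
  has.na <- vapply(ls(env),
                   function(id) any(is.na(get(id, envir = env)[, "mm"])),
                   logical(1))
  names(has.na)[has.na]
}
na.mm.sets(toy.env)

# What cbind does when pm and mm differ in length: the shorter vector is
# recycled to fill the rows, and R warns when the number of rows is not a
# multiple of the shorter length.
pm <- c(101, 102, 103)
mm <- c(201, 202)
bad <- withCallingHandlers(
  cbind(pm = pm, mm = mm),
  warning = function(w) {
    message("cbind warned: ", conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
bad
```

In the toy case the third mm entry is a recycled copy of the first, so if the same thing happens in the real environment, some mm indices in those probesets would not correspond to the pm probe in the same row.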
thanks again, jim
James Bullard wrote:
>
> I have a custom affy chip (I don't know what more information would be
> relevant here, as I am new to this). I am attempting to perform
> background correction using the bg.correct.mas function and am running
> into problems because of NA values in a very small number (5 or so) of
> the mm values (at least, I think this is why I am running into problems).
>
> First, I noticed that none of the bg.correct.*, normalize.* methods
> have an na.rm parameter - this seems to indicate to me that having NA
> values in pm, mm matrices is not expected and therefore I am reading
> in the data incorrectly. Is this true? I took it for granted that it
> was not true, and have been trying to exclude them after the fact.
>
> First, to get the data into R I do:
>
> > cdf.env <- make.cdf.env("Mar_12_2004.cdf")
> > affybatch <- read.affybatch(filenames = c("T1.CEL"))
> > affybatch@cdfName <- "cdf.env"
>
> This occurs without incident (save the warning: Incompatible phenoData
> object. Created a new one.)
>
> So then I want to do the following:
>
> > bg.correct.mas(affybatch)
> Error in as.vector(data) : NA/NaN/Inf in foreign function call (arg 1)
>
> So... My first thought was to find all probesets with NA values, and
> then remove them from the AffyBatch object (please excuse the code's
> ugliness, I am just trying to make it work for now):
>
> # Returns the ids of probesets with an NA among their pm or mm values.
> remove.na.vals <- function(affybatch) {
>     na.probesets <- NULL
>
>     for (ps in probeset(affybatch)) {
>         if (is.na(sum(pm(ps))) || is.na(sum(mm(ps)))) {
>             na.probesets <- c(na.probesets, ps@id)
>         }
>     }
>     na.probesets
> }
>
> So using the above function I do the following:
>
> ab.probes <- probeset(affybatch, setdiff(geneNames(affybatch),
> remove.na.vals(affybatch)))
>
> This gives me a list of probeset objects which have no NA values in
> either pm or mm column. I then want to create/modify the AffyBatch
> object to use just these probesets. I cannot set the pm, mm values
> because they have different dimensions. I am sure there are
> alternate/superior solutions to this problem. As I said before, I am
> new to Bioconductor, so I may be on the wrong track
> entirely. Some information which might be important
> is below, Thanks in advance. Jim
>
> R 2.0.1
> bioconductor 1.5
> affy_1.5.8-1
> makecdfenv_1.4.8
> debian 2.6 (sarge)
>