[BioC] error in matchprobes package

yzhang at vbi.vt.edu yzhang at vbi.vt.edu
Thu Sep 13 15:58:29 CEST 2007


Dear Jim:

Thank you very much for your help. I ran the debugger and got the following
output. Since I am new to Bioconductor, I don't have a clue how to read
these messages. Could you help me?


> debug(.lgExtraParanoia)
> makeProbePackage("ehis1a520285f", version="1.0", build=FALSE, check=FALSE,
    force=TRUE, maintainer="yan zhang< yzhang at vbi.vt.edu>", species="ehis")
Importing the data.
debugging in: .lgExtraParanoia(pt, cdfname)
debug: {
    do.call("library", list(cdfname))
    thecdf <- as.environment(paste("package", cdfname, sep = ":"))[[cdfname]]
    probesetnames = ls(thecdf)
    pm1 = unlist(lapply(probesetnames, function(ps) {
        thecdf[[ps]][, 1]
    }))
    mm1 = unlist(lapply(probesetnames, function(ps) {
        thecdf[[ps]][, 2]
    }))
    psnm1 = unlist(lapply(probesetnames, function(ps) {
        rep(ps, nrow(thecdf[[ps]]))
    }))
    tab = table(mm1 - pm1)
    sizex = as.numeric(names(tab))[max(tab) == tab]
    pm2 = pt$y * sizex + pt$x + 1
    mm2 = (pt$y + 1) * sizex + pt$x + 1
    psnm2 = pt[["Probe.Set.Name"]]
    z1 = z2 = rep(NA, max(pm1, mm1, pm2, mm2))
    z1[pm1] = z1[mm1] = psnm1
    z2[pm2] = z2[mm2] = psnm2
    diffprob = which(z1 != z2)
    if (length(diffprob) > 0) {
        cat("***************************************************************************\n",
            "Found different probe set names in 'CDF package' and 'probe package' for\n",
            length(diffprob), "probes.\n")
        for (i in 1:min(10, length(diffprob))) cat(z1[diffprob[i]],
            z2[diffprob[i]], "\n")
        cat("If you consider this mismatch insignificant, you may want to rerun this\n",
            "function with 'comparewithcdf  = FALSE'. Otherwise, you'll need to\n",
            "figure out the reason for this!\n")
        stop("Stopped")
    }
    invisible(TRUE)
}
Browse[1]>
debug: do.call("library", list(cdfname))
Browse[1]>
debug: thecdf <- as.environment(paste("package", cdfname, sep = ":"))[[cdfname]]
Browse[1]>
debug: probesetnames = ls(thecdf)
Browse[1]>
debug: pm1 = unlist(lapply(probesetnames, function(ps) {
    thecdf[[ps]][, 1]
}))
Browse[1]>
debug: mm1 = unlist(lapply(probesetnames, function(ps) {
    thecdf[[ps]][, 2]
}))
Browse[1]>
debug: psnm1 = unlist(lapply(probesetnames, function(ps) {
    rep(ps, nrow(thecdf[[ps]]))
}))
Browse[1]>
debug: tab = table(mm1 - pm1)
Browse[1]>
debug: sizex = as.numeric(names(tab))[max(tab) == tab]
Browse[1]>
debug: pm2 = pt$y * sizex + pt$x + 1
Browse[1]>
debug: mm2 = (pt$y + 1) * sizex + pt$x + 1
Browse[1]>
debug: psnm2 = pt[["Probe.Set.Name"]]
Browse[1]>
debug: z1 = z2 = rep(NA, max(pm1, mm1, pm2, mm2))
Browse[1]>
Error in rep(NA, max(pm1, mm1, pm2, mm2)) :
        invalid 'times' argument
In addition: Warning messages:
1: NAs introduced by coercion in: as.integer.default(dat[[2]])
2: NAs introduced by coercion in: as.integer.default(dat[[3]])
3: NAs introduced by coercion in: as.integer.default(dat[[4]])
>
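
The last step of the log above can be reproduced outside the package. This is a minimal sketch, assuming the header line of the probe_tab file was read as a data row (as the three coercion warnings suggest). The vector x_raw and its values are made up for illustration and are not taken from the actual file.

```r
# Hypothetical stand-in for the x column of a probe_tab file whose header
# line was read in as data; the values are invented for illustration.
x_raw <- c("Probe X", "10", "11", "12")

# Coercing to integer turns the header string into NA (R warns about this;
# the warning is suppressed here to keep the output quiet).
x <- suppressWarnings(as.integer(x_raw))

# A single NA is enough to make max() return NA ...
n <- max(x)

# ... and rep() refuses NA as its 'times' argument, which stops the function.
result <- tryCatch(rep(NA, n), error = function(e) conditionMessage(e))
print(result)

# Once the NA (the header row) is excluded, the length is usable again.
n_ok <- max(x, na.rm = TRUE)
z <- rep(NA, n_ok)
```

At the Browse[1]> prompt the same check can be made on the real variables, for example with any(is.na(pm2)), before stepping into the rep() line.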

On Wed, September 12, 2007 4:01 pm, James W. MacDonald wrote:
>

>
> Yan Zhang wrote:
>
>> jim:
>>
>>
>> I was wrong. That chip does have MM probes; I just checked with the mm
>> function in the affy package. I thought it was PM-only because the probe
>> sequence file contains only PM probes. Do you have any suggestions for
>> resolving that error message?
>
> Sure. You have two choices. You can add comparewithcdf=FALSE to your
> call to makeProbePackage(), which will eliminate the error because you
> will no longer be comparing to the cdf. This is the simplest answer, but
> regrettably the most dangerous as well.
>
> Otherwise, you could
>
>
> debug(.lgExtraParanoia)
>
> before running makeProbePackage(), and then step through that function,
> looking at what you get for pm1, mm1, pm2, and mm2 to see why you are
> getting the error in the first place. I have to assume one of those
> variables is ending up as an NA (usually this happens because there aren't
> any MMs). Then you will have to figure out what to do with this
> information.
>
> Best,
>
>
> Jim
>
>
>
>>
>> best yan
>>
>> James W. MacDonald wrote:
>>
>>
>>> Hi Yan,
>>>
>>>
>>> First, please don't take things off-list. The archives are intended
>>> to be a resource, and if the questions/answers become private then we
>>> have less of a resource.
>>>
>>> Yan Zhang wrote:
>>>
>>>
>>>> Thank you very much for your response.
>>>> Yes, that chip only has PM. Then, what can I do?
>>>> I need to solve this problem in order to continue.
>>>> As for the warning messages, can I just ignore them? I doubt it,
>>>> because later, when I use GCRMA, those NAs will cause trouble in the
>>>> compute.infinite function. What can I do? Can I just delete the header
>>>> of the probe sequence file?
>>>
>>>
>>> You won't be able to do GCRMA with a PM-only chip. GCRMA uses the MM
>>> probes to compute a background estimate, and if you don't have MM
>>> probes you won't be able to do that.
>>>
>>> As for the second question (which is a moot point now), you don't
>>> want to delete the head of the probe_tab file. As I mentioned in my
>>> earlier reply you would need to use the devel version of matchprobes
>>> with R-2.6.0alpha.
>>>
>>>
>>> Best,
>>>
>>>
>>> Jim
>>>
>>>
>>>
>>>>
>>>> best yan
>>>>
>>>> James W. MacDonald wrote:
>>>>
>>>>
>>>>> Hi Yan,
>>>>>
>>>>>
>>>>> yzhang at vbi.vt.edu wrote:
>>>>>
>>>>>> When I use the makeProbePackage function in the newest version of
>>>>>> the matchprobes package (1.8.1), I get the following error message:
>>>>>>
>>>>>> > makeProbePackage("ehis1a520285f", version="1.0", species="ehis",
>>>>>>     maintainer="yanzhang<yzhang at vbi.vt.edu>", build=FALSE,
>>>>>>     check=FALSE, force=TRUE)
>>>>>> Importing the data.
>>>>>> Error in rep(NA, max(pm1, mm1, pm2, mm2)) :
>>>>>>         invalid 'times' argument
>>>>>> In addition: Warning messages:
>>>>>> 1: NAs introduced by coercion in: as.integer.default(dat[[2]])
>>>>>> 2: NAs introduced by coercion in: as.integer.default(dat[[3]])
>>>>>> 3: NAs introduced by coercion in: as.integer.default(dat[[4]])
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> The error comes from code that compares the probeset IDs from the
>>>>>  probe package with the cdf package, and IIRC this happens when
>>>>> you have a PM-only chip. Is this chip PM-only?
>>>>>
>>>>> The warnings come from an unfortunate change that was made to
>>>>> getProbeDataAffy() that I have fixed in the devel version (and
>>>>> have no idea right now why I didn't push to the release as
>>>>> well...). The problem stems from the fact that you are reading in
>>>>> the whole probe_tab file, including the header. When the (x,y)
>>>>> coordinates and probe interrogation position data are coerced to
>>>>> integer, the first value for each is character, which is coerced
>>>>> to a NA.
>>>>>
>>>>> The release branch is no longer being built, so I cannot push a
>>>>> fix that will end up being available. The easiest thing for you to
>>>>> do is upgrade your R to 2.6.0 alpha and use the devel version of
>>>>> matchprobes.
>>>>>
>>>>> Best,
>>>>>
>>>>>
>>>>> Jim
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> I don't have this problem if I use the old version (1.0.22).
>>>>>> Does anyone know what causes this?
>>>>>>
>>>>>>
>>>>>> best yan
>>>>>>
>>>>>> _______________________________________________
>>>>>> Bioconductor mailing list
>>>>>> Bioconductor at stat.math.ethz.ch
>>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>>> Search the archives:
>>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>>
>>>>>
>>>>>
>>>
>
> --
> James W. MacDonald, M.S.
> Biostatistician
> Affymetrix and cDNA Microarray Core
> University of Michigan Cancer Center
> 1500 E. Medical Center Drive
> 7410 CCGC
> Ann Arbor MI 48109
> 734-647-5623
>
>


