[BioC] how to revert to an older limma version?

Wed Jul 11 20:36:54 CEST 2007

Hi Maya -- Providing sessionInfo and a transcript of your session
really helped. Please see the comments below. I am responding to the
list, so that others may benefit.

Martin

"Maya Bercovich" <MayaB at tauex.tau.ac.il> writes:

> -----Original Message-----
> From: Martin Morgan [mailto:mtmorgan at fhcrc.org] 
> Sent: 10 July, 2007 3:58 PM
> To: Maya Bercovich
> Cc: Seth Falcon; bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] how to revert to an older limma version?
>
> Hi Maya -- Here are my suggestions. Most important, cut and paste
> command and results from your R session, so that we can see what is
> going on
>
> 1. Please, please include the output of sessionInfo(). You can cut and
>    paste this from your R session into the email. Do this after you
>    have reproduced the error. 
>
>> sessionInfo()
> R version 2.5.0 (2007-04-23)
> i686-redhat-linux-gnu
>
> locale:
> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
>
> attached base packages:
> [1] "tools"     "tcltk"     "stats"     "graphics"  "grDevices" "utils"
> [7] "datasets"  "methods"   "base"
>
> other attached packages:
>     Biobase     reldist      marray   tkWidgets      DynDoc widgetTools
>    "1.14.0"     "1.5-5"    "1.14.0"    "1.14.0"    "1.14.0"    "1.12.0"
>       limma
>    "2.10.5"

This is really helpful. Here's where my system starts: 

> sessionInfo()
R version 2.6.0 Under development (unstable) (2007-07-09 r42160) 
x86_64-unknown-linux-gnu 

locale:
LC_CTYPE=en_US;LC_NUMERIC=C;LC_TIME=en_US;LC_COLLATE=en_US;LC_MONETARY=en_US;LC_MESSAGES=en_US;LC_PAPER=en_US;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US;LC_IDENTIFICATION=C

Notice that these are en_US, whereas yours are en_US.UTF-8.

> 2. Report the exact command that causes the error, and error message.
>    Do this by copying and pasting the relevant portions of your R
>    session. For instance, we do not yet know whether you provide your
>    own column names, or what type of files you are trying to read.
>
> The Data file is a GenePix output gpr file
>
> The commands I'm running and error message:
>
>
>> library(limma)
>> library(marray)
> Loading required package: tkWidgets
> Loading required package: widgetTools
> Loading required package: tcltk
> Loading Tcl/Tk interface ... done
> Loading required package: DynDoc
>> library(reldist)
> Relative Distribution Methods
> Version 1.5-5 created on April 1, 2006.
> copyright (c) 2003, Mark S. Handcock, University of Washington
>                     Martina Morris, University of Washington
> Type help(package="reldist") to get started.
>> library(Biobase)
> Loading required package: tools
>
> Welcome to Bioconductor
>
>   Vignettes contain introductory material. To view, type
>   'openVignette()'. To cite Bioconductor, see
>   'citation("Biobase")' and for packages 'citation(pkgname)'.
>
>> setwd("/mnt/lifestore/Biotech/DannyS_Shared/Users/Revital/ssdp/")
>> path<-"/mnt/lifestore/Biotech/DannyS_Shared/Users/Revital/ssdp/"
>>
>> targets.RG1 <- readTargets("ssdpA270607.txt")
>>
>> targets.RG2 <- readTargets("ssdpB270607.txt")
>>
>> RG1 <- read.maimages(targets.RG1$Filename,
> columns=list(Rf="F635Median",Gf="F532Median",Rb="B635Median",Gb="B532Median"),
> path=path)
> Error in readGenericHeader(fullname, columns = columns, sep = sep) :
>         Specified column headings not found in file
> In addition: Warning message:
> input string 1 is invalid in this locale in: grep(pattern, x,
> ignore.case, extended, value, fixed, useBytes)

Thanks for the files you forwarded (not included in this
response). When I do

> setwd("~/tmp")
> library(limma)
> targets.RG1 <- readTargets("ssdpA270607.txt")
> fname <- targets.RG1$Filename[[1]]
> fname
[1] "B12Z0471_A.gpr"

The file name has lower-case 'gpr', but the file you sent has
upper-case GPR, so

> columns <- list(Rf="F635Median",Gf="F532Median",
+   Rb="B635Median",Gb="B532Median")
> read.maimages(fname, columns=columns)
Error in file(file, "r") : unable to open connection
In addition: Warning message:
In file(file, "r") :
  cannot open file 'B12Z0471_A.gpr', reason 'No such file or directory'

on the other hand

> fname <- toupper(fname)
> res <- read.maimages(fname, columns=columns)
Read B12Z0471_A.GPR 

So far so good. One point is that these are GenePix files, so a
better way to read them might be

> res <- read.maimages(fname, "genepix.custom", columns=columns)
Custom background: LocalFeature	 
Read B12Z0471_A.GPR 

This reads information about the printer as well, which can be useful
during normalization.
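
On the file-name mismatch above: rather than upper-casing the whole
name, one could look the file up without regard to case. This is only
a minimal sketch, assuming all files sit in one directory;
find_file_ci is a hypothetical helper, not part of limma.

```r
# Hypothetical helper (not part of limma): return the on-disk name of the
# single file in 'dir' that matches 'name' when case is ignored.
find_file_ci <- function(name, dir = ".") {
    files <- list.files(dir)
    hit <- files[tolower(files) == tolower(name)]
    if (length(hit) != 1L)
        stop("expected exactly one match for '", name,
             "', found ", length(hit))
    hit
}

# Small demonstration in a temporary directory
demo_dir <- tempfile("gpr")
dir.create(demo_dir)
file.create(file.path(demo_dir, "B12Z0471_A.GPR"))
found <- find_file_ci("B12Z0471_A.gpr", demo_dir)
```

Something like sapply(targets.RG1$Filename, find_file_ci, dir = path)
should then give names that can be passed to read.maimages.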

I now change my system locale to be like yours

> Sys.setlocale("LC_ALL", "en_US.UTF-8")
[1] "LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US;LC_PAPER=en_US;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US;LC_IDENTIFICATION=C"

and try to read the files

> read.maimages(fname, columns=columns)
Error in readGenericHeader(fullname, columns = columns, sep = sep) : 
  Specified column headings not found in file
In addition: Warning message:
In grep(a, txt) : input string 1 is invalid in this locale

Ah ha! This seems to be the problem. So set the locale to "en_US"

> Sys.setlocale("LC_ALL", "en_US")
[1] "LC_CTYPE=en_US;LC_NUMERIC=C;LC_TIME=en_US;LC_COLLATE=en_US;LC_MONETARY=en_US;LC_MESSAGES=en_US;LC_PAPER=en_US;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US;LC_IDENTIFICATION=C"
> res <- read.maimages(fname, columns=columns)
Read B12Z0471_A.GPR 

Does this work for you? Alternatively here's an interesting solution
that 'just works':

> Sys.setlocale("LC_ALL", "en_US.UTF-8")
[1] "LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US;LC_PAPER=en_US;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US;LC_IDENTIFICATION=C"
> res <- read.maimages(fname, "genepix.custom", columns=columns)
Custom background: LocalFeature	 
Read B12Z0471_A.GPR 
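
If changing the locale for the whole session is undesirable, the
switch can probably be limited to a single call. The sketch below
assumes that only the LC_CTYPE category matters for the grep warning;
with_ctype is a hypothetical helper, not an existing R or limma
function.

```r
# Hypothetical helper: evaluate 'expr' with LC_CTYPE temporarily set to
# 'locale', then restore the previous setting even if 'expr' fails.
with_ctype <- function(locale, expr) {
    old <- Sys.getlocale("LC_CTYPE")
    on.exit(Sys.setlocale("LC_CTYPE", old), add = TRUE)
    Sys.setlocale("LC_CTYPE", locale)
    force(expr)   # 'expr' is lazy, so it is evaluated under the new locale
}

# Demonstration with the "C" locale, which should exist on any system
before <- Sys.getlocale("LC_CTYPE")
inside <- with_ctype("C", Sys.getlocale("LC_CTYPE"))
after  <- Sys.getlocale("LC_CTYPE")

# Possible use with limma (not run here):
# res <- with_ctype("en_US", read.maimages(fname, columns = columns))
```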

I'll just delve a little further, and see if we can figure out where
that warning message is coming from:

> options(warn=2)
> res <- read.maimages(fname, columns=columns)
Error in grep(a, txt) : 
  (converted from warning) input string 1 is invalid in this locale

Setting warn=2 causes the warning to become an error. Here's where the
error occurs (I've edited frame 2 with [...] to shorten the
presentation):

> traceback()
8: doWithOneRestart(return(expr), restart)
7: withOneRestart(expr, restarts[[1]])
6: withRestarts({
       .Internal(.signalCondition(simpleWarning(msg, call), msg, 
           call))
       .Internal(.dfltWarn(msg, call))
   }, muffleWarning = function() NULL)
5: .signalSimpleWarning("input string 1 is invalid in this locale", 
       quote(grep(a, txt)))
4: grep(a, txt)
3: readGenericHeader(fullname, columns = columns, sep = sep)
2: switch(source2, quantarray = {
[...]
   }, {
       skip <- readGenericHeader(fullname, columns = columns, sep = sep)$NHeaderRecords
       obj <- read.columns(fullname, required.col, text.to.search, 
           skip = skip, sep = sep, quote = quote, as.is = TRUE, 
           fill = TRUE, flush = TRUE, ...)
       nspots <- nrow(obj)
   })
1: read.maimages(fname, columns = columns)
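
The same warn=2 technique can be tried without limma or the data
files. In this sketch noisy() is a made-up stand-in for any function
that emits a warning, and the previous option value is restored
afterwards.

```r
# Stand-in for a function that warns: "abc" cannot be coerced to integer
noisy <- function(x) as.integer(x)

old <- options(warn = 2)          # warnings are now raised as errors
msg <- tryCatch(noisy("abc"),
                error = function(e) conditionMessage(e))
options(old)                      # restore the previous setting

# 'msg' holds the error message generated from the warning; in an
# interactive session, calling traceback() right after the uncaught
# error shows the call stack, as in the listing above.
```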

Frame 4 is where the grep statement is; it's inside
readGenericHeader. Here's readGenericHeader:

> readGenericHeader
function (file, columns, sep = "\t") 
{
    if (missing(columns) || !length(columns)) 
        stop("must specify column headings to find")
    columns <- protectMetachar(as.character(columns))
    if (!length(columns)) 
        stop("column headings must be specified")
    con <- file(file, "r")
    on.exit(close(con))
    out <- list()
    Found <- FALSE
    i <- 0
    repeat {
        i <- i + 1
        txt <- readLines(con, n = 1)
        if (!length(txt)) 
            stop("Specified column headings not found in file")
        Found <- TRUE
        for (a in columns) Found <- Found && length(grep(a, txt))
        if (Found) 
            break
    }
    out$NHeaderRecords <- i - 1
    out$ColumnNames <- strsplit(txt, split = sep)[[1]]
    out
}
<environment: namespace:limma>

The 'grep' is toward the end, and is in a loop that looks at each
column name and compares it with the tab-delimited line of headings
from the gpr file. A little bit more snooping shows that the header
line has a field 'Rgn R^2 (635/532)', where the 'R^2' is rendered with
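a superscripted '2'. This causes the problem. In the UTF-8 locale R
displays the character as the raw byte "\xb2". That byte is the latin1
(ISO-8859-1) code for superscript two; on its own it is not a valid
UTF-8 sequence, which would explain the 'invalid in this locale'
warning.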
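
The byte in question can be inspected in isolation. This sketch uses
a made-up string containing only the header field quoted above, not
the real file; useBytes = TRUE is shown only as one possible way to
match against such a string, not as something limma does.

```r
x <- "Rgn R\xb2 (635/532)"        # header field with the raw byte 0xb2

bad  <- validUTF8(x)              # is the byte sequence valid UTF-8?
y    <- iconv(x, from = "latin1", to = "UTF-8")
good <- validUTF8(y)              # and after converting from latin1?

# Bytes of superscript two once it has been converted to UTF-8
sup2 <- charToRaw(iconv("\xb2", from = "latin1", to = "UTF-8"))

# Byte-wise matching does not depend on the string being valid in the
# current locale
hits <- grep("Rgn", x, useBytes = TRUE)
```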

The header line is read in a few lines above, with

        txt <- readLines(con, n = 1)

The help page for readLines indicates that there is an argument
'encoding'. We 'know' (experience, I guess) that the file is in
'latin1', and in fact changing the readLines call to

        txt <- readLines(con, n = 1, encoding="latin1")

allows readGenericHeader to work correctly:

> res <- readGenericHeader(fname, columns)
>
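
The effect of the encoding argument can be checked on a throwaway
file. This is a sketch with a made-up one-line header, not the real
gpr file. According to the readLines help page, the argument marks the
strings as latin1 rather than re-encoding the input.

```r
# Write a one-line, latin1-encoded header containing the byte 0xb2
hdr <- c(charToRaw("F635Median\tRgn R"), as.raw(0xb2),
         charToRaw(" (635/532)\n"))
tmp <- tempfile(fileext = ".gpr")
writeBin(hdr, tmp)

con <- file(tmp, "r")
txt <- readLines(con, n = 1, encoding = "latin1")
close(con)

enc     <- Encoding(txt)                      # how R has marked the string
n_hits  <- length(grep("F635Median", txt))    # the search used by limma
utf8_ok <- validUTF8(enc2utf8(txt))           # converts cleanly to UTF-8
unlink(tmp)
```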

I really don't know if this is the 'right' long-term solution for 
limma or other package maintainers.

Martin

> 3. After the error occurs, run the command traceback() and include the
>    results. This shows where the error likely occurred.
>
> After I got the error:
>
>> oldOpt=options(warn=2)
>> traceback()
> 4: stop("Specified column headings not found in file")
> 3: readGenericHeader(fullname, columns = columns, sep = sep)
> 2: switch(source2, quantarray = {
>        firstfield <- scan(fullname, what = "", sep = "\t", flush = TRUE,
>            quiet = TRUE, blank.lines.skip = FALSE, multi.line = FALSE,
>            allowEscapes = FALSE)
>        skip <- grep("Begin Data", firstfield)
>        if (length(skip) == 0)
>            stop("Cannot find \"Begin Data\" in image output file")
>        nspots <- grep("End Data", firstfield) - skip - 2
>        obj <- read.columns(fullname, required.col, text.to.search,
>            skip = skip, sep = sep, quote = quote, as.is = TRUE,
>            fill = TRUE, nrows = nspots, flush = TRUE, ...)
>    }, arrayvision = {
>        skip <- 1
>        cn <- scan(fullname, what = "", sep = sep, quote = quote,
>            skip = 1, nlines = 1, quiet = TRUE, allowEscape = FALSE)
>        fg <- grep(" Dens - ", cn)
>        if (length(fg) != 2)
>            stop(paste("Cannot find foreground columns in", fullname))
>        bg <- grep("^Bkgd$", cn)
>        if (length(bg) != 2)
>            stop(paste("Cannot find background columns in", fullname))
>        columns <- list(R = fg[1], Rb = bg[1], G = fg[2], Gb = bg[2])
>        obj <- read.columns(fullname, required.col, text.to.search,
>            skip = skip, sep = sep, quote = quote, as.is = TRUE,
>            fill = TRUE, flush = TRUE, ...)
>        fg <- grep(" Dens - ", names(obj))
>        bg <- grep("^Bkgd$", names(obj))
>        columns <- list(R = fg[1], Rb = bg[1], G = fg[2], Gb = bg[2])
>        nspots <- nrow(obj)
>    }, bluefuse = {
>        skip <- readGenericHeader(fullname, columns = c(columns$G,
>            columns$R))$NHeaderRecords
>        obj <- read.columns(fullname, required.col, text.to.search,
>            skip = skip, sep = sep, quote = quote, as.is = TRUE,
>            fill = TRUE, flush = TRUE, ...)
>        nspots <- nrow(obj)
>    }, genepix = {
>        h <- readGPRHeader(fullname)
>        if (verbose && source == "genepix.custom")
>            cat("Custom background:", h$Background, "\n")
>        skip <- h$NHeaderRecords
>        obj <- read.columns(fullname, required.col, text.to.search,
>            skip = skip, sep = sep, quote = quote, as.is = TRUE,
>            fill = TRUE, flush = TRUE, ...)
>        nspots <- nrow(obj)
>    }, smd = {
>        skip <- readSMDHeader(fullname)$NHeaderRecords
>        obj <- read.columns(fullname, required.col, text.to.search,
>            skip = skip, sep = sep, quote = quote, as.is = TRUE,
>            fill = TRUE, flush = TRUE, ...)
>        nspots <- nrow(obj)
>    }, {
>        skip <- readGenericHeader(fullname, columns = columns, sep = sep)$NHeaderRecords
>        obj <- read.columns(fullname, required.col, text.to.search,
>            skip = skip, sep = sep, quote = quote, as.is = TRUE,
>            fill = TRUE, flush = TRUE, ...)
>        nspots <- nrow(obj)
>    })
> 1: read.maimages(targets.RG1$Filename, columns = list(Rf = "F635Median",
>        Gf = "F532Median", Rb = "B635Median", Gb = "B532Median"),
>        path = path)
>
>
> 4. evaluate the command
>
>> oldOpt = options(warn=2)
>
>    (this will cause the warning to become an error), rerun the command
>    and report the results of traceback(). This will indicate where the
>    suspicious warning about 'grep' occurs.
>
> 5. Does the error occur when only some files are used for input, or
>    does it occur with any file? If it is with only some files, then
>    can you verify that the column names are present in those files?
>    Can you determine the character encoding of those file, for
>    instance by opening them in a browser such as firefox and looking
>    at View --> Character encoding.
>
> This occurs with any file. I attached an example of one of them
> (B12Z0471_A.GPR), and one of the targets files, in case you would like
> to make a test run. The column names are present.
> Character encoding: Western (ISO-8859-1).
>
>
> Thanks for your assistance.
>
> Martin   
>
> "Maya Bercovich" <MayaB at tauex.tau.ac.il> writes:
>
>> -----Original Message-----
>> From: Seth Falcon [mailto:sfalcon at fhcrc.org] 
>> Sent: 09 July, 2007 11:18 PM
>> To: Maya Bercovich
>> Cc: Marcus Davy; Kasper Daniel Hansen; bioconductor at stat.math.ethz.ch
>> Subject: Re: [BioC] how to revert to an older limma version?
>>
>> "Maya Bercovich" <MayaB at tauex.tau.ac.il> writes:
>>
>>> See below and thank you so much.
>>
>> In general, I would recommend using the most recent version of limma.
>> It would be helpful to include the output of sessionInfo() after the
>> error occurs.  The error message does suggest a locale or encoding
>> mismatch.  Can you try setting your locale to "C":
>>
>>    Sys.setlocale(locale="C")
>>
>> I tried it, and I still get the same error. Any more suggestions?
>>
>> Appreciate your assistance,
>>
>> Maya
>>
>>
>>
>> + seth
>>
>> -- 
>> Seth Falcon | Computational Biology | Fred Hutchinson Cancer Research
>> Center
>> http://bioconductor.org
>>
>
> -- 
> Martin Morgan
> Bioconductor / Computational Biology
> http://bioconductor.org
>
>

-- 
Martin Morgan
Bioconductor / Computational Biology
http://bioconductor.org