[Bioc-devel] NEWS May 2007
jmacdon at med.umich.edu
Fri Jun 8 16:18:53 CEST 2007
remove the core parts of the quantile normalization code which
operate on matrix objects. This code has been moved to a new
Fixed featureNames for AffyBatch objects
core c medianpolish code removed. Now calls preprocessCore for
Doc fix - mention how to retrieve se.exprs from mas5calls in
the mas5calls help page
fix for small issue with closing files in gzipped binary CEL
fix SET_VECTOR_ELT/SET_STRING_ELT problem in read.cdffile.list
with text cdf files
Fixed uses of @exprs, @se.exprs, @cdfName, @weights in plot.R.
Similarly in Normalize.R for similar references. Fixed bug in
IntensityHistogramAll function. Fixed bug in HeatDiagramFile
menu commands. Changelog updated.
removing core quantile normalization code moved to
remove the core medianpolish code. Use the code in
preprocessCore for this purpose instead
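For readers unfamiliar with the routine being relocated, median polish can be illustrated with base R's medpolish() from the stats package. This is only a sketch of the technique on an invented toy matrix; it is not the C implementation that moved to preprocessCore, and the way the summary is formed below is an assumption made for illustration.

```r
# Toy probe-by-array matrix of log2 intensities (values invented for illustration).
x <- matrix(c(8,  9, 10,
              7,  8,  9,
              9, 10, 11,
              6,  7,  8),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("probe", 1:4), paste0("array", 1:3)))

# Tukey's median polish: alternately sweep out row and column medians.
fit <- medpolish(x, trace.iter = FALSE)

# One summary value per array: overall effect plus the column (array) effect.
expr <- fit$overall + fit$col
print(expr)
```

Because the toy matrix is exactly additive (each row is a constant shift of the others), the fit leaves no residuals, which makes it easy to check by hand.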
move the core RLM C code to preprocessCore
Moved dist2() from affyQCReport to genefilter
fix bug related to KEGG EC number extraction
Removed setGeneric() statements for methods GOID, Term,
Synonym, Secondary, Definition and Ontology: these statements
are now in AnnotationDbi (>= 0.0.69).
Reorganized the AnnObj class hierarchy (with a lot of class
renaming). Added minimal infrastructure for GO_DB
schema. Minor bug fix (and reorganization) in the code that
generates the man pages of an ann db pkg from its template.
Some new generics: left.db_table, left.colname,
right.db_table, right.colname. Added the L2Rpath slot to
AnnDbMap objects: will replace current slots leftTable,
leftCol, rightTable, rightCol and join so at some point they
will be removed. Started to re-implement some methods of the
low-level API to take advantage of the L2Rpath slot.
Almost completed the 'join' to 'L2Rpath' migration. Removed
slots leftTable, leftCol, rightTable, rightCol and join from
AnnDbMap objects. Modified definition of the HGU95AV2_DB
schema (in R/schema.HGU95AV2_DB.R) to make use of the new
AnnDbMap slots (schema definition is now shorter and easier to
read/modify). Still broken: AG_DB, YEAST2_DB, YEAST_DB and
AFFYHUEX_DB schemas + the AnnDbTable class.
The as.character(), toList() and as.list() methods were broken
when the data frame returned by toTable has duplicated names
(this can now happen with the GO_DB schema e.g. with
16/19 maps of the GO_DB schema are ready for testing. Replaced
the GoAnnDbMap class by 2 classes: GoAnnDbMap (one single
right table) and Go3AnnDbMap (3 right tables). Old GoAnnDbMap
class corresponds to new Go3AnnDbMap class. Same changes for
the corresponding reverse classes.
Added R/GOTerm.R from annotate (with "GOTerms" replaced by
Removed Ontology method for signature="ANY" since (1) not
clear it is useful and (2) it is already defined in annotate.
Added the "filter" feature (stored in new slots Lfilter and
Rfilter for AnnDbMap objects): 12 maps use it in the GO_DB
schema. Did some basic testing but need to do more.
Renamed "GOTerm" class -> "GONode" (more appropriate).
Completed another important move: the "L2Rpath" slot
(AnnDbMap class) is now a list of L2Rbricks objects. This
change allows the representation of maps that did not fit in
the previous model (e.g. GOTERM, GOOBSOLETE, GOSYNONYM). The
implementation of the low-level API was modified
accordingly. All predefined maps for the HGU95AV2_DB and GO_DB
schemas now use this new format (the other schemas will follow
soon). The number of GO_DB maps that are ready is still
16/19. However the last 3 maps have been added but are not yet
fully functional (only the low-level API works on them). Added
4 new generics to the low-level API: tagnames, colnames,
left.filter and right.filter.
Renamed the following generics: left.db_table -> left.table,
right.db_table -> right.table, mapped.left.names ->
left.mappedNames, mapped.right.names -> right.mappedNames,
mapped.names -> mappedNames, count.mapped.left.names ->
count.left.mappedNames, count.mapped.right.names ->
count.right.mappedNames and count.mapped.names ->
count.mappedNames. This breaks some examples in the HGU95AV2DB
package template so all the HGU95AV2_DB-based packages need to
Added a "sample" method for AnnDbMap objects. I put it in
AnnDbMap-envirAPI.R because it is _not_ low-level: it is built
on top of the "as.list" method which is itself _not_ low-level
(as.list is defined in AnnDbMap-envirAPI.R).
The GO_DB schema is almost ready: 18/19 maps are fully
functional. The GOOBSOLETE map is also ready in theory but
because it uses queries like "SELECT NULL AS blabla FROM ...",
then it crashes RSQLite :-/.
Use a temporary workaround for the GOOBSOLETE crash (see
commit 24833) so now all maps in the GO_DB schema are fully
functional (still needs more testing). Some improvements to
the GONode class and methods (added an "initialize" method) so
that mono-valued and multi-valued slots are treated
Added the "FlatBimap" class and its API: "ncol", "colnames",
"left.colname", "right.colname", "left.names", "right.names",
"left.length", "right.length", "left.mappedNames",
"count.right.mappedNames", "nrow", "dim", "links", "nlinks",
"head", "tail" and "fold". Added some of them to the AnnDbObj
low-level API: "ncol", "dim", "links", "nlinks" ("links" and
"nlinks" not ready yet). Also added the "flatten" method to
this API as a replacement for "toTable", the difference being
that "flatten" returns a FlatBimap object instead of a naked
data frame ("toTable" will be deprecated soon). All the
"as.list" methods for AnnDbMap objects (envir-like API) now
use "flatten" instead of "toTable" for retrieving the data
from the database.
Added the BimapAPI0 interface (the common interface to
FlatBimap and to AnnDbMap objects). The "FlatBimap" and
"AnnDbMap" classes extend it. Renamed the "nlinks" generic ->
"count.links". Reorganized a little bit the code in
R/FlatBimap.R and R/AnnDbObj-lowAPI.R + improved the
comments. Added a new test, checkProperty0(), in
This package computes quality metrics on AffyBatch,
ExpressionSet and NChannelSet objects containing microarray
data from any platform, one or two channels. The results are
designed to allow the user to rapidly assess the quality
of a set of arrays.
bug fix to path argument in readIllumina() - added warning
message to readIllumina() when text or tif files are not
- BeadLevelList has a new slot 'annotation' for storing the
  annotation package name.
- changes to readIllumina() to allow data in text files to be
  stored in BeadLevelList instead of having to use images
  (useImages argument) to get the intensities. New arguments
  annoPkg and singleChannel have also been added to allow users
  to specify the relevant annotation package (if there is one -
  expression only at present) and the type of data (one channel
  or two). The type of background correction and normalization
  is also now recorded in the 'arrayInfo' slot.
- numBeads has a new argument 'array'
- createBeadSummaryData now passes on 'annotation' argument
  from BeadLevelList to ExpressionSetIllumina object.
- new BLData.rda added with 'annotation' slot
- removed unused argument 'identify' from plotMA, plotXY and
  plotMAXY functions
- updated man pages for most functions to improve examples,
  descriptions and match new argument 'annoPkg' to
  readBeadSummaryData
- updated readBeadSummaryData man page
- added beadInfo argument to readIllumina() which fills in the
  beadAnno slot if supplied. Updated man page to reflect this
  change
- added file argument to example in readQC man page
Removed various redundant functions. Updated vignettes and
example data sets. Added introductory vignette to work with
LOH / genotype reporting
setting up Interactive determination of Copy numbers
Bugfix (closing unopened ofstream); small memory management
Added informative message for missing boost libraries
Small change to bgx.Rd example; fixed memory leak in
gene buffer overflow fix; plotDEHistogram fix/cleanup; added
comment in example
More informative error message when AnnotatedDataFrame fails
Separate setMethod and function definition for
annotatedDataFrameFrom. This allows methods for new classes
(e.g., BufferedMatrix) to reuse code without 'tricking' method
dispatch. Function definitions remain unexported, to
discourage the end user from accessing them directly.
When implementing $ methods, don't rely on [[ doing partial
matching. Hopefully, [[ will stop partial matching at some
point in the near future. To retain the partial matching
semantics of $ methods, we can no longer rely on [[. In most
cases, all cases here, the solution is to call "$"(x, n) in
the method definition.
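The distinction this entry relies on can be seen on a plain list. The sketch below assumes a current version of R, where `[[` matches names exactly by default; the list and its element names are invented for illustration and are not taken from the package.

```r
# A plain list standing in for an object with named components.
x <- list(sampleNames = c("a", "b"), annotation = "hgu95av2")

# `$` on a list accepts a unique partial name.
by_dollar <- x$samp

# `[[` matches exactly by default, so the partial name finds nothing.
by_bracket <- x[["samp"]]

# Partial matching with `[[` has to be requested explicitly.
by_bracket_partial <- x[["samp", exact = FALSE]]

print(by_dollar)
print(by_bracket)
print(by_bracket_partial)
```

A `$` method that wants to keep the partial-matching behaviour therefore cannot simply forward to `[[` with its default arguments.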
Fix bug in $ methods. The previous patch that implemented $
methods using $ instead of [[ was broken. It seems that $
stops dispatching after entering the first method or ???. This
patch passes a basic test, but relies on an unfortunate
Add NChannelSet class
* Stores data from N-channel (e.g., two-color) experiments in
  an eSet-derived object. Matrices in assay data correspond to
  different channels, with rows in each matrix representing
  features and columns samples. phenoData varMetadata has a
  column 'channel' indicating which channel the phenodata is
  associated with (either an assay data member, or the special
  symbol _ALL_ representing phenotypic data common to all
  assays).
* Object creation like ExpressionSet (see inst/UnitTest and
  man pages)
* channel(object, <channel>) creates an ExpressionSet object
* selectChannels(object, <channels>) subsets NChannelSet by
  channel.
* Additional miscellaneous fixes:
  - document tidy
  - AnnotatedDataFrame ... arguments can be used to specify
    varMetadata information:
    adf[[covX, labelDescription="Covariate X", channel="G"]] =
Use 'unsafeSetSlot' within eSet replacement methods. This
relies on the _current implementation_ of S4 generics making a
copy before entering the replacement method -- incoming
objects have exactly one reference, and hence can be modified
'in place'. This saves 2-4 copies of the object, but is
terrible programming practice (relying on implementation
detail; direct slot assignment, ...) and the problem is only
partially 'real' (when assayData is an environment or
lockedEnvironment, the 'big' data isn't being copied anyway).
Safer unsafeSetSlot. "sampleNames<-"(...) allowed the
unsafeSetSlot to seep out. Instead, only use unsafeSetSlot
after triggering a copy _within_ the replacement
method. Idioms like phenoData(object) <- pd (appear to)
trigger an extra copy compared to object@phenoData <- pd, so
use direct slot access at crucial points internally
Refactor ScalarCharacter: add ScalarInteger, ScalarNumeric.
There is now a ScalarObject class that handles the basic
validation. There is also a factory function, mkScalar, that
creates a Scalar<type> instance of the appropriate type.
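The pattern described here (a base class that validates length, plus a factory that picks the subclass) can be sketched with the methods package alone. The class names and the toyMkScalar() function below are hypothetical stand-ins chosen for illustration; they are not the package's actual definitions.

```r
library(methods)

# Hypothetical virtual base class: the only rule is "exactly one element".
setClass("ToyScalarObject", representation("VIRTUAL"),
         validity = function(object) {
           if (length(object) != 1L) "object must have length 1" else TRUE
         })

# One concrete class per basic type, each inheriting the length check.
setClass("ToyScalarCharacter", contains = c("ToyScalarObject", "character"))
setClass("ToyScalarInteger",   contains = c("ToyScalarObject", "integer"))
setClass("ToyScalarNumeric",   contains = c("ToyScalarObject", "numeric"))

# Hypothetical factory: choose the class from the type of the input.
toyMkScalar <- function(x) {
  cls <- switch(class(x)[1L],
                character = "ToyScalarCharacter",
                integer   = "ToyScalarInteger",
                numeric   = "ToyScalarNumeric",
                stop("no scalar class for type ", class(x)[1L]))
  new(cls, x)  # validity is checked when the object is initialized
}

s <- toyMkScalar("hgu95av2")
bad <- tryCatch(toyMkScalar(c(1L, 2L)), error = function(e) e)
```

Keeping the validation in one base class means adding a new scalar type is a one-line class definition plus one entry in the factory.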
Added new function read.AnnotatedDataFrame2 that hopefully
will replace read.AnnotatedDataFrame.
Replaced the "CharBuffer" and "IntBuffer" classes by "XRaw"
(external raw vector) and "XInteger" (external integer vector)
respectively. The "XRaw" class is the RAWSXP-based replacement
for the previously CHARSXP-based "CharBuffer" class. The
"XInteger" class is the same as the "IntBuffer" class: just a
renaming. Also followed Seth's recommendation to not use
allocString anymore for creating new CHARSXP objects: now I
use mkChar() everywhere for this.
Fixed some breakage introduced by the "CharBuffer to XRaw"
migration started at r24995.
Add exports and doc for DatPkg class and subclasses. This will
allow others to properly subclass the HyperGParams class, but
still needs more thought. These classes might be candidates to
move into the annotate package.
Fix bug in applyByCategory, use rownames not colnames
In makeChrBandGraph, check that we have a human chip. The
current implementation can only handle the parsing of human
chromosome band annotations. For now, we fail early with an
error message if we are given an annotation data package that
isn't for Homo sapiens.
Add cb_children function, use min.expected instead of
geneIdsByCategory didn't respect conditional, fix and refactor
Refactored methods so that more are implemented for
HyperGResultBase and rely on condGeneIdUniverse geneIds being
defined. This should reduce code duplication and make the
handling of conditional test results more uniform.
makeChrBandInciMat now returns gene sets on rows, genes on
columns. This is more convenient for application of GSEA
Small tweak to ttperm, use known length to init list, not NULL
Add gseaperm function. This function provides a convenient
interface for obtaining permutation based p-values for a GSEA
analysis based on a t-statistic.
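The general idea of a permutation-based p-value for a gene set scored by t-statistics can be sketched in base R as follows. This is not gseaperm's code or interface: the simulated data, the helper names row_t() and set_stat(), and the choice of set statistic (sum of t-statistics divided by the square root of the set size) are all assumptions made for illustration.

```r
set.seed(1)

# Simulated expression matrix: genes in rows, samples in columns.
n_genes <- 50
n_samples <- 12
expr <- matrix(rnorm(n_genes * n_samples), n_genes, n_samples)
group <- factor(rep(c("A", "B"), each = 6))

# Hypothetical gene set: the first 10 genes, shifted upward in group B.
in_set <- seq_len(10)
expr[in_set, group == "B"] <- expr[in_set, group == "B"] + 2

# Per-gene equal-variance two-sample t-statistics (B minus A).
row_t <- function(x, g) {
  a <- x[, g == levels(g)[1], drop = FALSE]
  b <- x[, g == levels(g)[2], drop = FALSE]
  na <- ncol(a)
  nb <- ncol(b)
  sp2 <- ((na - 1) * apply(a, 1, var) + (nb - 1) * apply(b, 1, var)) /
    (na + nb - 2)
  (rowMeans(b) - rowMeans(a)) / sqrt(sp2 * (1 / na + 1 / nb))
}

# Gene set statistic: scaled sum of the member genes' t-statistics.
set_stat <- function(x, g, idx) sum(row_t(x, g)[idx]) / sqrt(length(idx))

obs <- set_stat(expr, group, in_set)

# Null distribution: recompute the statistic with shuffled sample labels.
perm <- replicate(200, set_stat(expr, sample(group), in_set))

# Upper-tail p-value, with the +1 correction so it is never exactly 0.
p_upper <- (sum(perm >= obs) + 1) / (length(perm) + 1)
print(p_upper)
```

Permuting the sample labels, rather than the genes, preserves the correlation between genes within the set.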
removed the dependency on Matrix
gseaperm optionally uses Matrix so it moves to Suggests. We'd
like to be able to import Matrix, but there are issues with
that at the moment.
The plate normalization in normalizeChannels.R is now done
BEFORE calling fun. I have also simplified its code and man
Image() bug fix: new("Image", .Data=smth) did not work with new
R. Corrected as res <- new("Image"); res@.Data = smth - works
Added: image moments and moment invariants, rotation angle and
elongation etc; IndexedImage class for object detection
problems. Corrected: some documentation, "}" problems in
several man pages. Updated: normalize method transferred to C
enabling per-frame normalizations; image.Image, hist.Image
transferred to S4 to enable inherited use by IndexedImage;
windows dll recompiled with the newest releases of R-2.5.0,
GTK 2.10.11-1 and ImageMagick 6.3.4-1
'display' for IndexedImage's now normalizes the image by
choose.image now allows specifying whether images are to be read
as grayscale (with 16-bit color precision if IM compiled so) or
as true color (with 8 bits per color, thus losing image info
if images are subsequently converted to grayscale)
critical bug correction (obvious only in R2.6): in the C code
Image objects were created incorrectly! Basically they were
created as arrays onto which additional attributes (including
class name) were attached. In R2.6 this led to R ignoring [
operators redefined for Image although class(obj) still shows
Image! Now corrected throughout.
filter is no longer a slot in the Image; it is now supplied in
the resize function (not used anywhere else)
fixed cex having no effect in gene.strip
fixed bug in .X.to.probeset functions
fixed missing rownames in pc()
Added probeset/exon/transcript/gene translation functions for
Ensembl ESTs and genescan predictions, fixed the use of unique
in translation functions
fix read.FCS function to support NA under the keywords
ANASTART ANAEND of the header section
improve read.FCS function to read a sample of records; add the
Add the which.lines argument for the read.FCS function
Add missing slot transformId in the transforms object so that
Add splitScaleTransformation (for transformML). Modify read.FCS
to remove random sampling but leave which.lines. Add parentId
argument in class Filter. Fix rectangle gate boundary: allow
Min>=Max like in GateML standard, return empty gate
added a vignette about visualizing filters
moved ecdfplot to use different idiom based on match.call(),
which is kinda necessary to support nonstandard evaluation of
Replaced calls to xy2i() with calls to xy2indices().
Add feature.exclude argument to nsFilter feature.exclude
allows the user to specify a character vector of regular
expressions. Probe sets (featureNames) that match one of the
regular expressions are removed during the filtering
process. This is especially useful for removing quality
control probe sets.
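The kind of filtering this entry describes can be sketched in base R: drop every feature name that matches at least one regular expression. The probe set names and the "^AFFX" pattern below are illustrative assumptions, and the code is a stand-alone sketch rather than nsFilter's implementation.

```r
# Invented probe set names; the AFFX-prefixed ones mimic control probe sets.
feature_names <- c("1007_s_at", "1053_at", "AFFX-BioB-5_at",
                   "AFFX-CreX-3_at", "117_at")

# Character vector of regular expressions to exclude (assumed pattern).
exclude_patterns <- c("^AFFX")

# TRUE for every name that matches at least one pattern.
matches_any <- Reduce(`|`,
                      lapply(exclude_patterns, grepl, x = feature_names),
                      rep(FALSE, length(feature_names)))

kept <- feature_names[!matches_any]
print(kept)
```

Starting the reduction from an all-FALSE vector keeps the code correct when the vector of patterns is empty, in which case nothing is excluded.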
I moved dist2 from affyQCReport to genefilter
fixed two bugs: (1) 'file' in the file 'readGenes.ped.R'
should be 'filename' (2) the function 'getFounders' should be
called without condition in 'getLD.R'
(1) added some code to LD functions to check if ped is
validated. (2) fixed a bug in test.R (the 'sampleInfo' should
have at least 5 columns)
fixed a bug in the function 'getFounders'
Remove Suggests on Biobase. Instead, we now check to see if
Biobase is already loaded and then (on Windows) attempt to add
the vignette to the GUI menu. This results in a warning from R
CMD check, but seems the best compromise for now.
Use R_VERSION to work around interface change to Rf_duplicated.
This patch allows the graph package to compile against R-2.5
This package provides classes and methods to support Gene
Set Enrichment Analysis (GSEA). In particular, the
GeneSet class provides a common data structure for
representing gene sets.
Bug fix to read.columns() to stop spurious warning message
when text.to.search has length greater than one.
Fixes to as.matrix for ExpressionSet and LumiBatch objects.
as.matrix methods for ExpressionSet added to NAMESPACE
as.matrix method for vsn objects added
Small fix to escape underscores in probe file name in package
man page so they will pass R CMD check.
Better handling of creation of new strings, following recent
changes to mkChar
Fix memory leak. Calls to CallocCharBuf need a matching call to
adding justSNPRMA - less memory intensive / keeping
BufferedMatrix only for SNP chips, everything else uses matrix
/ adding dependence on preprocessCore
The ExpressionSet error generated loading exon data and large
tab delimited files was removed
Bug related to Targets files with FileNames containing "-"
Fixed bug in table output from GOenrichment function
added the possibility to locate a legend in ocPlotPCA.R
Affymetrix APT tools could be used to upload gene/exon level
probe set summaries on oneChannelGUI.
The visualization of samples in the PCA space was improved to
fit a high number of samples. Legend was disabled. Exon data,
loaded using the APT implementation, are consistent with the gene
meta data. A bug related to unbalanced experimental design was
fixed. The PDMCLASS package was implemented.
Fixing a warning due to the absence of the
OpenCDFandTargetsfiles.R function derived from affylmGUI
affyPLM was temporarily removed from description to allow the
A library of core preprocessing routines for various
packages (affy, affyPLM, oligo, etc.)
Benjamin Milo Bolstad
output node names only for articulationPoints
update output format for biConnComp
update .gxl files to get rid of warnings
add parsers for PSI-MI 2.5 XML format
update parseInteractor function: don't pick intact ID when
refType is isoform-parent
add a new slot "confidenceValue" in class "intactInteraction"
* improve "show" method to print the total number of
  interactors, interactions, or complexes in each entry
* fix a bug related to extracting inhibitor intact IDs for
use UniProt ID instead of IntAct ID to refer interactors from
I am streamlining the code so that the output of getMips and
getGO are identical...this will help to produce one uniform
Added rowWilcoxon, faster than the current method for (at
least) less than 100 observations.
Workaround for ramp/gcc bug (optimization lowered to -O1 for
ramp.c). Several improvements for findpeaks.centWave. Fixed bug
in joinOverlappingPeaks(). Updated ramp.c to v1.38 (from CVS).
Removed maxGaussErr option in findpeaks.centWave. Fixed bug in
joinOverlappingPeaks(). Fixed bug in findpeaks.centWave where
rt was not assigned correctly
Removed maxGaussErr option in findpeaks.centWave
Removed maxGaussErr option in findpeaks.centWave, removed
debug output in joinOverlappingPeaks
Fixed mzdata problem under windows (ramp.c v1.39 from CVS)
Removed workaround for ramp/gcc bug