[Bioc-devel] NEWS May 2007

James MacDonald jmacdon at med.umich.edu
Fri Jun 8 16:18:53 CEST 2007

	May 2007


	remove the core parts of the quantile normalization code which
	operate on matrix objects. This code has been moved to a new
	package preprocessCore.

	Fixed featureNames for AffyBatch objects.
	Core C medianpolish code removed. Now calls preprocessCore for
	this purpose.

	Doc fix - mention how to retrieve se.exprs from mas5calls in
	the mas5calls help page


	fix for small issue with closing files in gzipped binary CEL
	file workflow

	fix SET_VECTOR_ELT/SET_STRING_ELT problem in read.cdffile.list
	with text cdf files


	Fixed uses of @exprs, @se.exprs, @cdfName, @weights in plot.R.
	Similarly in Normalize.R for similar references. Fixed bug in
	IntensityHistogramAll function. Fixed bug in HeatDiagramFile
	menu commands. Changelog updated.


	removing core quantile normalization code moved to

	remove the core medianpolish code. Use the code in
	preprocessCore for this purpose instead

	move the core RLM C code to preprocessCore


	Moved dist2() from affyQCReport to genefilter


	fix bug related to KEGG EC number extraction

	Removed setGeneric() statements for methods GOID, Term,
	Synonym, Secondary, Definition and Ontology: these statements
	are now in AnnotationDbi (>= 0.0.69).


	Reorganized the AnnObj class hierarchy (with a lot of class
	renaming). Added minimal infrastructure for GO_DB
	schema. Minor bug fix (and reorganization) in the code that
	generates the man pages of an ann db pkg from its template.

	Some new generics: left.db_table, left.colname,
	right.db_table, right.colname. Added the L2Rpath slot to
	AnnDbMap objects: will replace current slots leftTable,
	leftCol, rightTable, rightCol and join so at some point they
	will be removed. Started to re-implement some methods of the
	low-level API to take advantage of the L2Rpath slot.

	Almost completed the 'join' to 'L2Rpath' migration. Removed
	slots leftTable, leftCol, rightTable, rightCol and join from
	AnnDbMap objects. Modified definition of the HGU95AV2_DB
	schema (in R/schema.HGU95AV2_DB.R) to make use of the new
	AnnDbMap slots (schema definition is now shorter and easier to
	read/modify).  Still broken: AG_DB, YEAST2_DB, YEAST_DB and
	AFFYHUEX_DB schemas + the AnnDbTable class.

	The as.character(), toList() and as.list() methods were broken
	when the data frame returned by toTable has duplicated names
	(this can now happen with the GO_DB schema e.g. with
	toTable(GOBPPARENTS, "GO:0000001")).

	16/19 maps of the GO_DB schema are ready for testing. Replaced
	the GoAnnDbMap class by 2 classes: GoAnnDbMap (one single
	right table) and Go3AnnDbMap (3 right tables). Old GoAnnDbMap
	class corresponds to new Go3AnnDbMap class. Same changes for
	the corresponding reverse classes.

	Added R/GOTerm.R from annotate (with "GOTerms" replaced by

	Removed Ontology method for signature="ANY" since (1) not
	clear it is useful and (2) it is already defined in annotate.

	Added the "filter" feature (stored in new slots Lfilter and
	Rfilter for AnnDbMap objects): 12 maps use it in the GO_DB
	schema. Did some basic testing but need to do more.

	Renamed "GOTerm" class -> "GONode" (more appropriate).

	Completed another important move: the "L2Rpath" slot
	(AnnDbMap class) is now a list of L2Rbricks objects. This
	change allows the representation of maps that did not fit in
	the previous model (e.g. GOTERM, GOOBSOLETE, GOSYNONYM). The
	implementation of the low-level API was modified
	accordingly. All predefined maps for the HGU95AV2_DB and GO_DB
	schemas now use this new format (the other schemas will follow
	soon). The number of GO_DB maps that are ready is still
	16/19. However the last 3 maps have been added but are not yet
	fully functional (only the low-level API works on them). Added
	4 new generics to the low-level API: tagnames, colnames,
	left.filter and right.filter.

	Renamed the following generics: left.db_table -> left.table,
	right.db_table -> right.table, mapped.left.names ->
	left.mappedNames, mapped.right.names -> right.mappedNames,
	mapped.names -> mappedNames, count.mapped.left.names ->
	count.left.mappedNames, count.mapped.right.names ->
	count.right.mappedNames and count.mapped.names ->
	count.mappedNames. This breaks some examples in the HGU95AV2DB
	package template so all the HGU95AV2_DB-based packages need to
	be remade.

	Added a "sample" method for AnnDbMap objects. I put it in
	AnnDbMap-envirAPI.R because it is _not_ low-level: it is built
	on top of the "as.list" method which is itself _not_ low-level
	(as.list is defined in AnnDbMap-envirAPI.R).

	The GO_DB schema is almost ready: 18/19 maps are fully
	functional. The GOOBSOLETE map is also ready in theory, but
	it crashes RSQLite because it uses queries like
	"SELECT NULL AS blabla FROM ..." :-/.

	Use a temporary workaround for the GOOBSOLETE crash (see
	commit 24833) so now all maps in the GO_DB schema are fully
	functional (still needs more testing). Some improvements to
	the GONode class and methods (added an "initialize" method) so
	that mono-valued and multi-valued slots are treated

	Added the "FlatBimap" class and its API: "ncol", "colnames",
	"left.colname", "right.colname", "left.names", "right.names",
	"left.length", "right.length", "left.mappedNames",
	"right.mappedNames", "count.left.mappedNames",
	"count.right.mappedNames", "nrow", "dim", "links", "nlinks",
	"head", "tail" and "fold". Added some of them to the AnnDbObj
	low-level API: "ncol", "dim", "links", "nlinks" ("links" and
	"nlinks" not ready yet). Also added the "flatten" method to
	this API as a replacement for "toTable", the difference being
	that "flatten" returns a FlatBimap object instead of a naked
	data frame ("toTable" will be deprecated soon). All the
	"as.list" methods for AnnDbMap objects (envir-like API) now
	use "flatten" instead of "toTable" for retrieving the data
	from the database.

	Added the BimapAPI0 interface (the common interface to
	FlatBimap and to AnnDbMap objects). The "FlatBimap" and
	"AnnDbMap" classes extend it. Renamed the "nlinks" generic ->
	"count.links". Reorganized the code in R/FlatBimap.R and
	R/AnnDbObj-lowAPI.R a little bit + improved the
	comments. Added a new test, checkProperty0(), in

	New Package:
	    This package computes quality metrics on AffyBatch,
	    ExpressionSet and NChannelSet objects containing microarray
	    data from any platform, one or two channels. The results are
	    designed to allow the user to rapidly assess the quality
	    of a set of arrays.
		Audrey Kauffmann


	 - bug fix to path argument in readIllumina()
	 - added warning message to readIllumina() when text or tif
	   files are not
	 - BeadLevelList has a new slot 'annotation' for storing the
	   annotation package name.
	 - changes to readIllumina() to allow data in text files to be
	   stored in BeadLevelList instead of having to use images
	   (useImages argument) to get the intensities. New arguments
	   annoPkg and singleChannel have also been added to allow
	   users to specify the relevant annotation package (if there
	   is one - expression only at present) and the type of data
	   (one channel or two). The type of background correction and
	   normalization is also now recorded in the 'arrayInfo' slot.
	 - numBeads has a new argument 'array'
	 - createBeadSummaryData now passes on 'annotation' argument
	   from BeadLevelList to ExpressionSetIllumina object.
	 - new BLData.rda added with 'annotation' slot
	 - removed unused argument 'identify' from plotMA, plotXY and
	   plotMAXY functions
	 - updated man pages for most functions to improve examples,
	   descriptions and match

	 - new argument 'annoPkg' to readBeadSummaryData
	 - updated readBeadSummaryData man page

	 - added beadInfo argument to readIllumina() which fills in the
	   beadAnno slot if supplied. Updated man page to reflect this
	   change
	 - added file argument to example in readQC man page (was
	   missing)

	 Removed various redundant functions. Updated vignettes and
	 example data sets. Added introductory vignette to work with
	 vignette() function.


	LOH / genotype reporting

	setting up Interactive determination of Copy numbers


	Bugfix (closing unopened ofstream); small memory management

	Added informative message for missing boost libraries

	Small change to bgx.Rd example; fixed memory leak in

	gene buffer overflow fix; plotDEHistogram fix/cleanup; added
	comment in example

	More informative error message when AnnotatedDataFrame fails
	to initialize

	Separate setMethod and function definition for
	annotatedDataFrameFrom. This allows methods for new classes
	(e.g., BufferedMatrix) to reuse code without 'tricking' method
	dispatch.  Function definitions remain unexported, to
	discourage the end user from accessing them directly.

	When implementing $ methods, don't rely on [[ doing partial
	matching. Hopefully, [[ will stop partial matching at some
	point in the near future. To retain the partial matching
	semantics of $ methods, we can no longer rely on [[. In most
	cases, all cases here, the solution is to call "$"(x, n) in
	the method definition.
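
	A minimal base R sketch of the behaviour this entry is about,
	using a plain list rather than a Biobase object (the element
	name is made up). In current versions of R, [[ on a list
	matches names exactly unless exact = FALSE is given, while $
	still matches partially:

```r
# A plain list standing in for an object with named components
# (hypothetical data, not a Biobase class).
x <- list(sampleNames = c("a", "b"))

via_dollar  <- x$samp                      # `$` matches the name partially
via_bracket <- x[["samp"]]                 # `[[` matches exactly by default, so NULL
via_loose   <- x[["samp", exact = FALSE]]  # partial matching must be requested
```

	A $ method that simply forwards to [[ would therefore lose the
	partial matching that users of $ expect.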

	Fix bug in $ methods. The previous patch that implemented $
	methods using $ instead of [[ was broken. It seems that $
	stops dispatching after entering the first method or ???. This
	patch passes a basic test, but relies on an unfortunate
	eval/substitute hack.

	Add NChannelSet class
	* Stores data from N-channel (e.g., two-color) experiments in
	  an eSet-derived object. Matrices in assay data correspond to
	  different channels, with rows in each matrix representing
	  features and columns samples. phenoData varMetadata has a
	  column 'channel' indicating which channel the phenodata is
	  associated with (either an assay data member, or the special
	  symbol _ALL_ representing phenotypic data common to all
	  assays).
	* Object creation like ExpressionSet (see inst/UnitTest and
	  man pages)
	* channel(object, <channel>) creates an ExpressionSet object
	* selectChannels(object, <channels>) subsets NChannelSet by
	  channel.
	* Additional miscellaneous fixes:
	  - document tidy
	  - AnnotatedDataFrame ... arguments can be used to specify
	    varMetadata information:
	    adf[[covX, labelDescription="Covariate X", channel="G"]] =

	Use 'unsafeSetSlot' within eSet replacement methods. This
	relies on the _current implementation_ of S4 generics making a
	copy before entering the replacement method -- incoming
	objects have exactly one reference, and hence can be modified
	'in place'. This saves 2-4 copies of the object, but is
	terrible programming practice (relying on implementation
	detail; direct slot assignment, ...) and the problem is only
	partially 'real' (when assayData is an environment or
	lockedEnvironment, the 'big' data isn't being copied anyway).
	Safer unsafeSetSlot. "sampleNames<-"(...) allowed the
	unsafeSetSlot to seep out. Instead, only use unsafeSetSlot
	after triggering a copy _within_ the replacement
	method. Idioms like phenoData(object) <- pd (appear to)
	trigger an extra copy compared to object@phenoData <- pd, so
	use direct slot access at crucial points internally

	Refactor ScalarCharacter: add ScalarInteger, ScalarNumeric.
	There is now a ScalarObject class that handles the basic
	validation. There is also a factory function, mkScalar, that
	creates a Scalar<type> instance of the appropriate type.
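
	The pattern described above (a shared base class doing the
	validation, plus a factory choosing the concrete class) might
	look roughly like the following. This is only a sketch under
	assumptions: every class and function name ending in "Sketch"
	or "_sketch" is invented here and is not the Biobase code.

```r
library(methods)

# Virtual base class carrying the shared "length one" validation.
# All class and function names here are illustrative, not Biobase's.
setClass("ScalarSketch", representation("VIRTUAL"),
         validity = function(object) {
           if (length(object) != 1L) "must have length one" else TRUE
         })
setClass("ScalarCharacterSketch", contains = c("ScalarSketch", "character"))
setClass("ScalarIntegerSketch",   contains = c("ScalarSketch", "integer"))
setClass("ScalarNumericSketch",   contains = c("ScalarSketch", "numeric"))

# Factory: pick the wrapper class from the storage type of the input.
mk_scalar_sketch <- function(obj) {
  cls <- switch(typeof(obj),
                character = "ScalarCharacterSketch",
                integer   = "ScalarIntegerSketch",
                double    = "ScalarNumericSketch",
                stop("no scalar class for type ", typeof(obj)))
  new(cls, obj)
}

s <- mk_scalar_sketch("hgu95av2")
```

	Because the validity function lives on the virtual base class,
	each concrete class inherits the length check without
	repeating it.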

	Added new function read.AnnotatedDataFrame2 that hopefully
	will replace read.AnnotatedDataFrame.


	Replaced the "CharBuffer" and "IntBuffer" classes by "XRaw"
	(external raw vector) and "XInteger" (external integer vector)
	respectively. The "XRaw" class is the RAWSXP-based replacement
	for the previously CHARSXP-based "CharBuffer" class. The
	"XInteger" class is the same as the "IntBuffer" class: just a
	renaming. Also followed Seth's recommendation to not use
	allocString anymore for creating new CHARSXP objects: now I
	use mkChar() everywhere for this.

	Fixed some breakage introduced by the "CharBuffer to XRaw"
	migration started at r24995.

	Add exports and doc for DatPkg class and subclasses. This will
	allow others to properly subclass the HyperGParams class, but
	still needs more thought. These classes might be candidates to
	move into the annotate package.

	Fix bug in applyByCategory, use rownames not colnames.
	In makeChrBandGraph, check that we have a human chip. The
	current implementation can only handle the parsing of human
	chromosome band annotations. For now, we fail early with an
	error message if we are given an annotation data package that
	isn't for Homo sapiens.

	Add cb_children function, use min.expected instead of

	geneIdsByCategory didn't respect conditional, fix and refactor.
	Refactored methods so that more are implemented for
	HyperGResultBase and rely on condGeneIdUniverse geneIds being
	defined. This should reduce code duplication and make the
	handling of conditional test results more uniform.
	makeChrBandInciMat now returns gene sets on rows, genes on
	columns. This is more convenient for application of GSEA

	Small tweak to ttperm, use known length to init list, not NULL.
	Add gseaperm function. This function provides a convenient
	interface for obtaining permutation based p-values for a GSEA
	analysis based on a t-statistic.
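
	The ttperm tweak mentioned above, creating a list at its known
	length instead of growing it from an empty value, can be
	illustrated in base R as follows. The loop body is a
	placeholder and is not taken from ttperm.

```r
n_perm <- 5L

# Growing from an empty value: the list is extended on every assignment.
grown <- list()
for (i in seq_len(n_perm)) grown[[i]] <- i^2

# Preallocating: the list is created once at its final length.
prealloc <- vector("list", n_perm)
for (i in seq_len(n_perm)) prealloc[[i]] <- i^2
```

	Both loops produce the same result; the preallocated version
	avoids repeatedly extending the list.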

	removed the dependency on Matrix

	gseaperm optionally uses Matrix so it moves to Suggests. We'd
	like to be able to import Matrix, but there are issues with
	that at the moment.


	The plate normalization in normalizeChannels.R is now done
	BEFORE calling fun. I have also simplified its code and man


	Image() bug fix: new("Image", .Data=smth) did not work with new
	R!!!! Corrected as res<-new("Image"); res@.Data=smth - works

	Added: image moments and moment invariants, rotation angle and
	elongation etc; IndexedImage class for object detection
	problems. Corrected: some documentation, "}" problems in
	several man pages. Updated: normalize method transferred to C
	enabling per-frame normalizations; image.Image, hist.Image
	transferred to S4 to enable inherited use by IndexedImage;
	windows dll recompiled with the newest releases of R-2.5.0,
	GTK 2.10.11-1 and ImageMagick 6.3.4-1

	'display' for IndexedImage's now normalizes the image by

	choose.image now allows specifying whether images are to be
	read as grayscale (with 16-bit color precision if IM compiled
	so) or as true color (with 8-bit per color, thus losing image
	info if images are subsequently converted to grayscale)

	critical bug correction (obvious only in R2.6): in the C code
	Image objects were created incorrectly! Basically they were
	created as arrays onto which additional attributes (including
	class name) were attached. In R2.6 this led to R ignoring [
	operators redefined for Image although class(obj) still shows
	Image! Now corrected throughout.

	filter is no longer a slot in the Image, it is now supplied in
	the resize function (not used anywhere else)


	fixed cex having no effect in gene.strip
	fixed bug in .X.to.probeset functions

	fixed missing rownames in pc()

	Added probeset/exon/transcript/gene translation functions for
	Ensembl ESTs and genescan predictions, fixed the use of unique
	in translation functions


	fix read.FCS function to support NA under the keywords
	ANASTART ANAEND of the header section

	improve read.FCS function to read a sample of records; add the
	read.FCSheader function

	Add the which.lines argument for the read.FCS function

	Add missing slot transformId in the transforms object so that
	transformML works

	Add splitScaleTransformation (for transformML). Modify read.FCS
	to remove random sampling but leave which.lines. Add parentId
	argument in class Filter. Fix rectangle gate boundary: allow
	Min>=Max like in GateML standard, return empty gate


	added a vignette about visualizing filters

	moved ecdfplot to use different idiom based on match.call(),
	which is kinda necessary to support nonstandard evaluation of
	groups etc.


	Replaced calls to xy2i() with calls to xy2indices().


	Add feature.exclude argument to nsFilter. feature.exclude
	allows the user to specify a character vector of regular
	expressions. Probe sets (featureNames) that match one of the
	regular expressions are removed during the filtering
	process. This is especially useful for removing quality
	control probe sets.
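
	The kind of filtering described above can be sketched in base
	R. This is not the nsFilter implementation: the helper name
	drop_matching and the probe set names are made up for
	illustration, and "^AFFX" is used only as an example pattern
	for control probe sets.

```r
# Hypothetical helper: drop every name matching at least one regular expression.
drop_matching <- function(feature_names, patterns) {
  hit <- rep(FALSE, length(feature_names))
  for (p in patterns) hit <- hit | grepl(p, feature_names)
  feature_names[!hit]
}

# Made-up feature names; the AFFX-prefixed ones mimic control probe sets.
probe_sets <- c("1007_s_at", "AFFX-BioB-5_at", "1053_at", "AFFX-hum_alu_at")
kept <- drop_matching(probe_sets, c("^AFFX"))
```

	Passing several patterns removes a feature as soon as any one
	of them matches; an empty pattern vector removes nothing.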

	I moved dist2 from affyQCReport to genefilter


	fixed two bugs: (1) 'file' in the file 'readGenes.ped.R'
	should be 'filename' (2) the function 'getFounders' should be
	called without condition in 'getLD.R'

	(1) added some code to LD functions to check if ped is
	validated. (2) fixed a bug in test.R (the 'sampleInfo' should
	have at least 5 columns)

	fixed a bug in the function 'getFounders'


	Remove Suggests on Biobase. Instead, we now check to see if
	Biobase is already loaded and then (on Windows) attempt to add
	the vignette to the GUI menu. This results in a warning from R
	CMD check, but seems the best compromise for now.

	Use R_VERSION to work around interface change to Rf_duplicated.
	This patch allows the graph package to compile against R-2.5
	and R-2.6


	New Package:
	    This package provides classes and methods to support Gene
	    Set Enrichment Analysis (GSEA).  In particular, the
	    GeneSet class provides a common data structure for
	    representing gene sets.
	    Biocore team


	Bug fix to read.columns() to stop spurious warning message
	when text.to.search has length greater than one.

	Fixes to as.matrix for ExpressionSet and LumiBatch objects.

	as.matrix methods for ExpressionSet added to NAMESPACE

	as.matrix method for vsn objects added


	Small fix to escape underscores in probe file name in package
	man page so they will pass R CMD check.

	Better handling of creation of new strings, following recent
	changes to mkChar
	Fix memory leak. Calls to CallocCharBuf need a matching call to

	adding justSNPRMA - less memory intensive / keeping
	BufferedMatrix only for SNP chips, everything else uses matrix
	/ adding dependence on preprocessCore


	The ExpressionSet error generated loading exon data and large
	tab delimited files was removed

	Bug related to Targets files with FileNames containing "-"

	Fixed bug in table output from GOenrichment function
	added the possibility to locate a legend in ocPlotPCA.R

	Affymetrix APT tools could be used to upload gene/exon level
	probe set summaries on oneChannelGUI.

	The visualization of samples in the PCA space was improved to
	fit a high number of samples. Legend was disabled. Exon data,
	loaded using the APT implementation, are consistent with the
	gene meta data. A bug related to unbalanced experimental
	design was fixed. The PDMCLASS package was implemented.

	Fixing a warning due to the absence of the
	OpenCDFandTargetsfiles.R function derived from affylmGUI

	affyPLM was temporarily removed from description to allow the
	Windows build


	New Package: 
	    A library of core preprocessing routines for various
	    packages (affy, affyPLM, oligo, etc.)
	    Benjamin Milo Bolstad


	output node names only for articulationPoints
	update output format for biConnComp

	remove sp.between.old

	update .gxl files to get rid of warnings


	add parsers for PSI-MI 2.5 XML format

	update parseInteractor function: don't pick intact ID when
	refType is isoform-parent
	add a new slot "confidenceValue" in class "intactInteraction"
	* improve "show" method to print the total number of
	  interactors, interactions, or complexes in each entry
	* fix a bug related to extracting inhibitor intact IDs for
	use UniProt ID instead of IntAct ID to refer to interactors from


	 I am streamlining the code so that the output of getMips and
	 getGO are identical...this will help to produce one uniform
	 output style


	Added rowWilcoxon, faster than the current method for (at
	least) less than 100 observations.
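
	As a rough illustration of why a row-wise rank-sum statistic
	can be computed faster than by calling wilcox.test() once per
	row, the sketch below ranks each row and sums the ranks of one
	group. It is not the rowWilcoxon implementation; the data and
	variable names are made up.

```r
set.seed(1)
x <- matrix(rnorm(40), nrow = 4)        # 4 features, 10 samples (made-up data)
group <- rep(c(TRUE, FALSE), each = 5)  # first 5 samples form group 1

# Rank within each row, then sum the ranks of group 1 and subtract
# the minimum possible rank sum to get the W statistic.
ranks <- t(apply(x, 1, rank))
n1 <- sum(group)
w_fast <- rowSums(ranks[, group, drop = FALSE]) - n1 * (n1 + 1) / 2

# Reference: one wilcox.test() call per row.
w_ref <- apply(x, 1, function(r) unname(wilcox.test(r[group], r[!group])$statistic))
```

	Both vectors hold the same W statistic per row; the first
	avoids the per-call overhead of wilcox.test().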


	 Workaround for ramp/gcc bug (optimization lowered to -O1 for
	 ramp.c). Several improvements for findpeaks.centWave. Fixed
	 bug in joinOverlappingPeaks(). Updated ramp.c to v1.38 (from
	 CVS)

	 Removed maxGaussErr option in findpeaks.centWave. Fixed bug in
	 joinOverlappingPeaks(). Fixed bug in findpeaks.centWave where
	 rt was not assigned correctly

	 Removed maxGaussErr option in findpeaks.centWave

	 Removed maxGaussErr option in findpeaks.centWave, removed
	 debug output in joinOverlappingPeaks

	 Fixed mzdata problem under Windows (ramp.c v1.39 from CVS).
	 Removed workaround for ramp/gcc bug

