[Bioc-devel] Bioconductor 3.5 is released
Obenchain, Valerie
Valerie.Obenchain at RoswellPark.org
Wed Apr 26 02:31:46 CEST 2017
As usual, a few corrections:
- number of software packages is 1383, not 1384
- number of experiment data packages is 315, not 316
- updated section for 'Deprecated and Defunct'
Deprecated and Defunct Packages
===============================
One software package (betr) was removed from this release (after being
deprecated in BioC 3.4).
Nine software packages (AtlasRDF, coRNAi, saps, MeSHSim, GENE.E, mmnet,
CopyNumber450k, GEOsearch, pdmclass) are deprecated in this release and
will be removed in BioC 3.6.
Two experiment data packages (encoDnaseI, ggtut) were removed from this
release (after being deprecated in BioC 3.4).
One experiment data package (CopyNumber450kData) is deprecated in this
release and will be removed in BioC 3.6.
Valerie
On 04/25/2017 03:40 PM, Obenchain, Valerie wrote:
> April 25, 2017
>
> Bioconductors:
>
> We are pleased to announce Bioconductor 3.5, consisting of 1384
> software packages, 316 experiment data packages, and 911 annotation
> packages.
>
> There are 88 new software packages, and many updates and improvements
> to existing packages; Bioconductor 3.5 is compatible with R 3.4,
> and is supported on Linux, 32- and 64-bit Windows, and Mac OS X. This
> release will include an updated Bioconductor [Amazon Machine Image][1]
> and [Docker containers][2].
>
> Visit [https://bioconductor.org][3]
> for details and downloads.
>
> [1]: https://bioconductor.org/help/bioconductor-cloud-ami/
> [2]: https://bioconductor.org/help/docker/
> [3]: https://bioconductor.org
>
> Contents
> --------
>
> * [Getting Started with Bioconductor
> 3.5](#getting-started-with-bioconductor-35)
> * [New Software Packages](#new-software-packages)
> * [NEWS from new and existing
> packages](#news-from-new-and-existing-packages)
> * [Deprecated and Defunct Packages](#deprecated-and-defunct-packages)
>
> Getting Started with Bioconductor 3.5
> ======================================
>
> To update to or install Bioconductor 3.5:
>
> 1. Install R 3.4. Bioconductor 3.5 has been designed expressly for
> this version of R.
>
> 2. Follow the instructions at
> [http://bioconductor.org/install/](http://bioconductor.org/install/).
>
> New Software Packages
> =====================
>
> There are 88 new software packages in this release of Bioconductor.
>
> - [AnnotationFilter](https://bioconductor.org/packages/AnnotationFilter)
> This package provides classes and other infrastructure to implement
> filters for manipulating Bioconductor annotation resources. The
> filters will be used by ensembldb, Organism.dplyr, and other
> packages.
>
> - [ATACseqQC](https://bioconductor.org/packages/ATACseqQC) ATAC-seq,
> an assay for Transposase-Accessible Chromatin using sequencing, is
> a rapid and sensitive method for chromatin accessibility analysis.
> It was developed as an alternative method to MNase-seq, FAIRE-seq
> and DNase-seq. Compared to the other methods, ATAC-seq requires
> less biological sample and less processing time. In the process of
> analyzing several ATAC-seq datasets produced in our labs, we
> learned some of the unique aspects of quality assessment for
> ATAC-seq data. To help users quickly assess whether their ATAC-seq
> experiment is successful, we developed the ATACseqQC package,
> partially following the guidelines published in Nature Methods in
> 2013 (Greenleaf et al.), including diagnostic plots of fragment
> size distribution, proportion of mitochondrial reads, nucleosome
> positioning pattern, and CTCF or other transcription factor
> footprints.
>
> - [banocc](https://bioconductor.org/packages/banocc) BAnOCC is a
> package designed for compositional data, where each sample sums to
> one. It infers the approximate covariance of the unconstrained data
> using a Bayesian model coded with `rstan`. It provides as output
> the `stanfit` object as well as posterior median and credible
> interval estimates for each correlation element.
>
> - [basecallQC](https://bioconductor.org/packages/basecallQC) The
> basecallQC package provides tools to work with Illumina bcl2Fastq
> (versions >= 2.1.7) software. Prior to basecalling and
> demultiplexing using the bcl2Fastq software, basecallQC functions
> allow the user to update Illumina sample sheets from versions <=
> 1.8.9 to >= 2.1.7 standards, clean sample sheets of common problems
> such as invalid sample names and IDs, and create read and index
> basemasks and the bcl2Fastq command. Following the generation of
> basecalled and demultiplexed data, the basecallQC package allows
> the user to generate HTML tables, plots and a self-contained report
> of summary metrics from Illumina XML output files.
>
> - [BiocFileCache](https://bioconductor.org/packages/BiocFileCache)
> This package creates a persistent on-disk cache of files that the
> user can add, update, and retrieve. It is useful for managing
> resources (such as custom TxDb objects) that are costly or
> difficult to create, web resources, and data files used across
> sessions.
>
> - [BioCor](https://bioconductor.org/packages/BioCor) Calculates
> functional similarities based on the pathways described on KEGG and
> REACTOME or in gene sets. These similarities can be calculated for
> pathways or gene sets, genes, or clusters and combined with other
> similarities. They can be used to improve networks, gene selection,
> relationship testing, and so on.
>
> - [BioMedR](https://bioconductor.org/packages/BioMedR) BioMedR is
> an R/Bioconductor package for generating various molecular
> representations of chemicals, proteins, DNAs/RNAs and their
> interactions.
>
> - [biotmle](https://bioconductor.org/packages/biotmle) This package
> facilitates the discovery of biomarkers from biological sequencing
> data (e.g., microarrays, RNA-seq) based on the associations of
> potential biomarkers with exposure and outcome variables by
> implementing an estimation procedure that combines a generalization
> of the moderated t-statistic with asymptotically linear statistical
> parameters estimated via targeted minimum loss-based estimation
> (TMLE).
>
> - [BLMA](https://bioconductor.org/packages/BLMA) Suite of tools for
> bi-level meta-analysis. The package can be used in a wide range of
> applications, including general hypothesis testing, differential
> expression analysis, functional analysis, and pathway analysis.
>
> - [BPRMeth](https://bioconductor.org/packages/BPRMeth) The BPRMeth
> package uses the Binomial Probit Regression likelihood to model
> methylation profiles and extract higher order features. These
> features precisely quantify notions of the shape of a methylation
> profile. Using these higher order features across promoter-proximal
> regions, we construct a powerful predictor of gene expression.
> Also, these features are used to cluster proximal-promoter regions
> using the EM algorithm.
>
> - [branchpointer](https://bioconductor.org/packages/branchpointer)
> Predicts branchpoint probability for sites in intronic branchpoint
> windows. Queries can be supplied as intronic regions or, to
> evaluate the effects of mutations, as SNPs.
>
> - [BUMHMM](https://bioconductor.org/packages/BUMHMM) This is a
> probabilistic modelling pipeline for computing per-nucleotide
> posterior probabilities of modification from the data collected in
> structure probing experiments. The model supports multiple
> experimental replicates and empirically corrects coverage- and
> sequence-dependent biases. The model utilises the measure of a
> "drop-off rate" for each nucleotide, which is compared between
> replicates through a log-ratio (LDR). The LDRs between control
> replicates define a null distribution of variability in drop-off
> rate observed by chance, and LDRs between treatment and control
> replicates are compared to this distribution. The resulting empirical
> p-values (probability of being "drawn" from the null distribution)
> are used as observations in a Hidden Markov Model with a
> Beta-Uniform Mixture model used as an emission model. The resulting
> posterior probabilities indicate the probability of a nucleotide
> having been modified in a structure probing experiment.
>
> - [CATALYST](https://bioconductor.org/packages/CATALYST) Mass
> cytometry (CyTOF) uses heavy metal isotopes rather than fluorescent
> tags as reporters to label antibodies, thereby substantially
> decreasing spectral overlap and allowing for examination of over 50
> parameters at the single cell level. While spectral overlap is
> significantly less pronounced in CyTOF than flow cytometry,
> spillover due to detection sensitivity, isotopic impurities, and
> oxide formation can impede data interpretability. We designed
> CATALYST (Cytometry dATa anALYSis Tools) to provide a pipeline for
> preprocessing of cytometry data, including i) normalization using
> bead standards, ii) single-cell deconvolution, and iii) bead-based
> compensation.
>
> - [cellbaseR](https://bioconductor.org/packages/cellbaseR) This R
> package makes use of the exhaustive RESTful Web service API that
> has been implemented for the CellBase database. It enables
> researchers to query and obtain a wealth of biological information
> from a single database, saving a lot of time. Another benefit is
> that researchers can easily make queries about different biological
> topics and link all this information together as all information is
> integrated.
>
> - [cellscape](https://bioconductor.org/packages/cellscape) CellScape
> facilitates interactive browsing of single cell clonal evolution
> datasets. The tool requires two main inputs: (i) the genomic
> content of each single cell in the form of either copy number
> segments or targeted mutation values, and (ii) a single cell
> phylogeny. Phylogenetic formats can vary from dendrogram-like
> phylogenies with leaf nodes to evolutionary model-derived
> phylogenies with observed or latent internal nodes. The CellScape
> phylogeny is flexibly input as a table of source-target edges to
> support arbitrary representations, where each node may or may not
> have associated genomic data. The output of CellScape is an
> interactive interface displaying a single cell phylogeny and a
> cell-by-locus genomic heatmap representing the mutation status in
> each cell for each locus.
>
> - [chimeraviz](https://bioconductor.org/packages/chimeraviz)
> chimeraviz manages data from fusion gene finders and provides
> useful visualization tools.
>
> - [ChIPexoQual](https://bioconductor.org/packages/ChIPexoQual)
> Package with a quality control pipeline for ChIP-exo/nexus data.
>
> - [clusterSeq](https://bioconductor.org/packages/clusterSeq)
> Identification of clusters of co-expressed genes based on their
> expression across multiple (replicated) biological samples.
>
> - [coseq](https://bioconductor.org/packages/coseq) Co-expression
> analysis for expression profiles arising from high-throughput
> sequencing data. Feature (e.g., gene) profiles are clustered using
> adapted transformations and mixture models or a K-means algorithm,
> and model selection criteria (to choose an appropriate number of
> clusters) are provided.
>
> - [cydar](https://bioconductor.org/packages/cydar) Identifies
> differentially abundant populations between samples and groups in
> mass cytometry data. Provides methods for counting cells into
> hyperspheres, controlling the spatial false discovery rate, and
> visualizing changes in abundance in the high-dimensional marker
> space.
>
> - [DaMiRseq](https://bioconductor.org/packages/DaMiRseq) The DaMiRseq
> package offers a tidy pipeline of data mining procedures to
> identify transcriptional biomarkers and exploit them for
> classification purposes. The package accepts any kind of data
> presented as a table of raw counts and allows including covariates
> that occur with the experimental setting. A series of functions
> enable the user to clean up the data by filtering genomic features
> and samples, to adjust data by identifying and removing the
> unwanted source of variation (i.e. batches and confounding factors)
> and to select the best predictors for modeling. Finally, a
> "Stacking" ensemble learning technique is applied to build a
> robust classification model. Every step includes a checkpoint that
> the user may exploit to assess the effects of data management by
> looking at diagnostic plots, such as clustering and heatmaps, RLE
> boxplots, MDS or correlation plots.
>
> - [DelayedArray](https://bioconductor.org/packages/DelayedArray)
> Wrapping an array-like object (typically an on-disk object) in a
> DelayedArray object allows one to perform common array operations
> on it without loading the object in memory. In order to reduce
> memory usage and optimize performance, operations on the object are
> either delayed or executed using a block processing mechanism. Note
> that this also works on in-memory array-like objects like DataFrame
> objects (typically with Rle columns), Matrix objects, and ordinary
> arrays and data frames.
>
> - [discordant](https://bioconductor.org/packages/discordant)
> Discordant is a method to determine differential correlation of
> molecular feature pairs from -omics data using mixture models.
> The algorithm is explained further in Siska et al.
>
> - [DMRScan](https://bioconductor.org/packages/DMRScan) This package
> detects significant differentially methylated regions (for both
> qualitative and quantitative traits), using a scan statistic with
> underlying Poisson heuristics. The scan statistic will depend on a
> sequence of window sizes (# of CpGs within each window) and on a
> threshold for each window size. This threshold can be calculated by
> three different means: i) analytically using the Siegmund et al.
> (2012) solution (preferred), ii) importance sampling as suggested
> by Zhang (2008), and iii) a full MCMC modeling of the data, choosing
> between a number of different options for modeling the dependency
> between each CpG.
>
> - [epiNEM](https://bioconductor.org/packages/epiNEM) epiNEM is an
> extension of the original Nested Effects Models (NEM). EpiNEM is
> able to take into account double knockouts and infer more complex
> network signalling pathways.
>
> - [EventPointer](https://bioconductor.org/packages/EventPointer)
> EventPointer is an R package to identify alternative splicing
> events that involve either simple (case-control experiment) or
> complex experimental designs such as time course experiments and
> studies including paired samples. The algorithm can be used to
> analyze data from either junction arrays (Affymetrix Arrays) or
> sequencing data (RNA-Seq). The software returns a data.frame with
> the detected alternative splicing events: gene name, type of event
> (cassette, alternative 3', etc.), genomic position, statistical
> significance and increment of the percent spliced in (Delta PSI)
> for all the events. The algorithm can generate a series of files to
> visualize the detected alternative splicing events in IGV. This
> eases the interpretation of results and the design of primers for
> standard PCR validation.
>
> - [flowTime](https://bioconductor.org/packages/flowTime) This package
> was developed for analysis of both dynamic and steady state
> experiments examining the function of gene regulatory networks in
> yeast (strain W303) expressing fluorescent reporter proteins using
> BD Accuri C6 and SORP cytometers. However, the functions are for
> the most part general and may be adapted for analysis of other
> organisms using other flow cytometers. Functions in this package
> facilitate the annotation of flow cytometry data with experimental
> metadata, as is requisite for dissemination and general
> ease-of-use. Functions for creating, saving and loading gate sets
> are also included. In the past, we have typically generated summary
> statistics for each flowset for each timepoint and then annotated
> and analyzed these summary statistics. This method loses a great
> deal of the power that comes from the large amounts of individual
> cell data generated in flow cytometry, by essentially collapsing
> this data into a bulk measurement after subsetting. In addition to
> these summary functions, this package also contains functions to
> facilitate annotation and analysis of steady-state or time-lapse
> data utilizing all of the data collected from the thousands of
> individual cells in each sample.
>
> - [funtooNorm](https://bioconductor.org/packages/funtooNorm) Provides
> a function to normalize Illumina Infinium Human Methylation 450
> BeadChip (Illumina 450K), correcting for tissue and/or cell type.
>
> - [GA4GHclient](https://bioconductor.org/packages/GA4GHclient)
> GA4GHclient provides an easy way to access public data servers
> through the Global Alliance for Genomics and Health (GA4GH)
> genomics API. It provides low-level access to the GA4GH API and
> translates response data into Bioconductor-based class objects.
>
> - [gcapc](https://bioconductor.org/packages/gcapc) Peak calling for
> ChIP-seq data with consideration of potential GC bias in sequencing
> reads. GC bias is first estimated with generalized linear mixture
> models using a weighted GC strategy, then applied to peak
> significance estimation.
>
> - [geneClassifiers](https://bioconductor.org/packages/geneClassifiers)
> This package aims for easily accessible application of classifiers
> that have been published in the literature, using an ExpressionSet
> as input.
>
> - [GenomicDataCommons](https://bioconductor.org/packages/GenomicDataCommons)
> Programmatically access the NIH / NCI Genomic Data Commons RESTful
> service.
>
> - [GenomicScores](https://bioconductor.org/packages/GenomicScores)
> Provide infrastructure to store and access genomewide
> position-specific scores within R and Bioconductor.
>
> - [GISPA](https://bioconductor.org/packages/GISPA) GISPA is a method
> intended for researchers who are interested in defining gene sets
> with a similar, a priori specified molecular profile. The GISPA
> method has been previously published in Nucleic Acids Research
> (Kowalski et al., 2016; PMID: 26826710).
>
> - [goSTAG](https://bioconductor.org/packages/goSTAG) Gene lists
> derived from the results of genomic analyses are rich in biological
> information. For instance, differentially expressed genes (DEGs)
> from a microarray or RNA-Seq analysis are related functionally in
> terms of their response to a treatment or condition. Gene lists can
> vary in size, up to several thousand genes, depending on the
> robustness of the perturbations or how widely different the
> conditions are biologically. Systematically associating biological
> relatedness between hundreds or thousands of genes is impractical
> when done by manually curating the annotation and function of
> each gene. Over-representation analysis (ORA) of genes was
> developed to identify biological themes. Given a Gene Ontology (GO)
> and an annotation of genes that indicates the categories each one
> fits into, the significance of the over-representation of the genes
> within the ontological categories is determined by a Fisher's exact
> test or modeling according to a hypergeometric distribution.
> Comparing a small number of enriched biological categories for a
> few samples is manageable using Venn diagrams or other means for
> assessing overlaps. However, with hundreds of enriched categories
> and many samples, the comparisons are laborious. Furthermore, if
> there are enriched categories that are shared between samples,
> trying to represent a common theme across them is highly
> subjective. goSTAG uses GO subtrees to tag and annotate genes
> within a set. goSTAG visualizes the similarities between the
> over-representation of DEGs by clustering the p-values from the
> enrichment statistical tests and labels clusters with the GO term
> that has the most paths to the root within the subtree generated
> from all the GO terms in the cluster.
>
> - [GRridge](https://bioconductor.org/packages/GRridge) This package
> allows the use of multiple sources of co-data (e.g. external
> p-values, gene lists, annotation) to improve prediction of binary,
> continuous and survival response using (logistic, linear or Cox)
> group-regularized ridge regression. It also facilitates post-hoc
> variable selection and prediction diagnostics by cross-validation
> using ROC curves and AUC.
>
> - [heatmaps](https://bioconductor.org/packages/heatmaps) This package
> provides functions for plotting heatmaps of genome-wide data across
> genomic intervals, such as ChIP-seq signals at peaks or across
> promoters. Many functions are also provided for investigating
> sequence features.
>
> - [hicrep](https://bioconductor.org/packages/hicrep) Hi-C is a
> powerful technology for studying genome-wide chromatin
> interactions. However, current methods for assessing Hi-C data
> reproducibility can produce misleading results because they ignore
> spatial features in Hi-C data, such as domain structure and
> distance-dependence. We present a novel reproducibility measure
> that systematically takes these features into consideration. This
> measure can assess pairwise differences between Hi-C matrices under
> a wide range of settings, and can be used to determine optimal
> sequencing depth. Compared to existing approaches, it consistently
> shows higher accuracy in distinguishing subtle differences in
> reproducibility and depicting interrelationships of cell lineages.
> The R package `hicrep` implements our approach.
>
> - [ideal](https://bioconductor.org/packages/ideal) This package
> provides functions for an Interactive Differential Expression
> AnaLysis of RNA-sequencing datasets, to quickly and effectively
> extract information downstream of the differential expression
> step. A Shiny application encapsulates the whole package.
>
> - [IMAS](https://bioconductor.org/packages/IMAS) Integrative analysis
> of Multi-omics data for Alternative splicing.
>
> - [ImpulseDE2](https://bioconductor.org/packages/ImpulseDE2)
> ImpulseDE2 is a differential expression algorithm for longitudinal
> count data sets which arise in sequencing experiments such as
> RNA-seq, ChIP-seq, ATAC-seq and DNaseI-seq. ImpulseDE2 is based on
> a negative binomial noise model with dispersion trend smoothing by
> DESeq2 and uses the impulse model to constrain the mean expression
> trajectory of each gene. The impulse model was empirically found to
> fit global expression changes in cells after environmental and
> developmental stimuli and is therefore appropriate in most cell
> biological scenarios. The constraint on the mean expression
> trajectory prevents overfitting to small expression fluctuations.
> In addition, because of its fixed number of parameters, ImpulseDE2
> has higher statistical testing power than generalized linear
> model-based differential expression algorithms which fit time as a
> categorical variable if more than six time points are sampled.
>
> - [IntEREst](https://bioconductor.org/packages/IntEREst) This package
> performs Intron-Exon Retention analysis on RNA-seq data (.bam
> files).
>
> - [IWTomics](https://bioconductor.org/packages/IWTomics)
> Implementation of the Interval-Wise Testing (IWT) for omics data.
> This inferential procedure tests for differences in "Omics" data
> between two groups of genomic regions (or between a group of
> genomic regions and a reference center of symmetry), and does not
> require fixing location and scale at the outset.
>
> - [karyoploteR](https://bioconductor.org/packages/karyoploteR)
> karyoploteR creates karyotype plots of arbitrary genomes and offers
> a complete set of functions to plot arbitrary data on them. It
> mimics many R base graphics functions, coupling them with a
> coordinate change function automatically mapping the chromosome and
> data coordinates into the plot coordinates. In addition to the
> provided data plotting functions, it is easy to add new ones.
>
> - [Logolas](https://bioconductor.org/packages/Logolas) Produces logo
> plots of a variety of symbols and names comprising English
> letters, numerals and punctuation. Can be used for sequence
> motif generation, mutation pattern generation, protein amino acid
> generation and symbol strength representation in any generic
> context.
>
> - [mapscape](https://bioconductor.org/packages/mapscape) MapScape
> integrates clonal prevalence, clonal hierarchy, anatomic and
> mutational information to provide interactive visualization of
> spatial clonal evolution. There are four inputs to MapScape: (i)
> the clonal phylogeny, (ii) clonal prevalences, (iii) an image
> reference, which may be a medical image or drawing and (iv) pixel
> locations for each sample on the referenced image. Optionally,
> MapScape can accept a data table of mutations for each clone and
> their variant allele frequencies in each sample. The output of
> MapScape consists of a cropped anatomical image surrounded by two
> representations of each tumour sample. The first, a cellular
> aggregate, visually displays the prevalence of each clone. The
> second shows a skeleton of the clonal phylogeny while highlighting
> only those clones present in the sample. Together, these
> representations enable the analyst to visualize the distribution of
> clones throughout anatomic space.
>
> - [MaxContrastProjection](https://bioconductor.org/packages/MaxContrastProjection)
> A problem when recording 3D fluorescent microscopy images is how to
> properly present these results in 2D. Maximum intensity projections
> are a popular method to determine the focal plane of each pixel in
> the image. The problem with this approach, however, is that
> out-of-focus elements will still be visible, making edges and fine
> structures difficult to detect. This package aims to resolve this
> problem by using the contrast around a given pixel to determine the
> focal plane, allowing for a much cleaner structure detection than
> would be otherwise possible. For convenience, this package also
> contains functions to perform various other types of projections,
> including a maximum intensity projection.
>
> - [MCbiclust](https://bioconductor.org/packages/MCbiclust) Custom-made
> algorithm and associated methods for finding, visualising and
> analysing biclusters in large gene expression data sets. Given a
> supplied gene set of size n, the algorithm finds the maximum
> strength correlation matrix containing m samples from the data set.
>
> - [metavizr](https://bioconductor.org/packages/metavizr) This package
> provides Websocket communication to the metaviz web app
> (http://metaviz.cbcb.umd.edu) for interactive visualization of
> metagenomics data. Objects in R/bioc interactive sessions can be
> displayed in plots and data can be explored using a facetzoom
> visualization. Fundamental Bioconductor data structures are
> supported (e.g., MRexperiment objects), while providing an easy
> mechanism to support other data structures. Visualizations (using
> d3.js) can be easily added to the web app as well.
>
> - [methylInheritance](https://bioconductor.org/packages/methylInheritance)
> Permutation analysis, based on Monte Carlo sampling, for testing
> the hypothesis that the number of conserved differentially
> methylated elements, between several generations, is associated with
> an effect inherited from a treatment and that a stochastic effect
> can be dismissed.
>
> - [MIGSA](https://bioconductor.org/packages/MIGSA) Massive and
> Integrative Gene Set Analysis. The MIGSA package allows the user
> to perform a massive and integrative gene set analysis over several
> expression and gene sets simultaneously. It provides a common gene
> expression analytic framework that grants a comprehensive and
> coherent analysis. Only minimal user parameter setting is required
> to perform both singular and gene set enrichment analyses in an
> integrative manner by means of the best available methods, i.e.
> dEnricher and mGSZ, respectively. The greatest strength of this
> big omics data tool is the availability of several functions to
> explore, analyze and visualize its results in order to facilitate
> the data mining task over huge information sources. The MIGSA
> package also provides several functions that make it easy to load
> the most up-to-date gene sets from several repositories.
>
> - [mimager](https://bioconductor.org/packages/mimager) Easily
> visualize and inspect microarrays for spatial artifacts.
>
> - [motifcounter](https://bioconductor.org/packages/motifcounter)
> 'motifcounter' provides functionality to compute the statistics
> related with motif matching and counting of motif matches in DNA
> sequences. As an input, 'motifcounter' requires a motif in terms of
> a position frequency matrix (PFM). Furthermore, a set of DNA
> sequences is required to estimate a higher-order background model
> (BGM). The package provides functions to investigate the
> per-position and per-strand log-likelihood scores between the PFM
> and the BGM across a given sequence or set of sequences.
> Furthermore, the package facilitates motif matching based on an
> automatically derived score threshold. To this end, the distribution
> of scores is efficiently determined and the score threshold is
> chosen for a user-prescribed significance level. This allows the
> false positive rate to be controlled. Moreover, 'motifcounter'
> implements a motif match enrichment test based on the number of
> motif matches that are expected in random DNA sequences. Motif
> enrichment is facilitated by either a compound Poisson
> approximation or a combinatorial approximation of the motif match
> counts. Both models take higher-order background models, the
> motif's self-similarity, and hits on both DNA strands into account.
> The package is particularly useful for long motifs and/or relaxed
> choices of score thresholds, because the implemented algorithms
> efficiently bypass the need for enumerating a (potentially huge)
> set of DNA words that can give rise to a motif match.
>
> - [msgbsR](https://bioconductor.org/packages/msgbsR) Pipeline for the
> analysis of an MS-GBS experiment.
>
> - [multiOmicsViz](https://bioconductor.org/packages/multiOmicsViz)
> Calculate the Spearman correlation between the source omics data
> and other target omics data, identify the significant correlations
> and plot the significant correlations on a heat map in which the
> x-axis and y-axis are ordered by the chromosomal location.
>
> - [MWASTools](https://bioconductor.org/packages/MWASTools) MWASTools
> provides a complete pipeline to perform metabolome-wide association
> studies. Key functionalities of the package include: quality
> control analysis of metabonomic data; MWAS using different
> association models (partial correlations; generalized linear
> models); model validation using non-parametric bootstrapping;
> visualization of MWAS results; NMR metabolite identification using
> STOCSY.
>
> - [NADfinder](https://bioconductor.org/packages/NADfinder) Call peaks
> for two samples: target and control. It will count the reads for
> tiles of the genome and then convert them to ratios. The ratios will
> be corrected and smoothed. A z-score is calculated for each
> counting window over the background. The peaks will be detected
> based on the z-scores.
>
> - [netReg](https://bioconductor.org/packages/netReg) netReg fits
> linear regression models using network-penalization. Graph prior
> knowledge, in the form of biological networks, is
> incorporated into the likelihood of the linear model. The networks
> describe biological relationships such as co-regulation or
> dependency of the same transcription factors/metabolites/etc.
> yielding a part sparse and part smooth solution for coefficient
> profiles.
>
> - [Organism.dplyr](https://bioconductor.org/packages/Organism.dplyr)
> This package provides an alternative interface to Bioconductor
> 'annotation' resources, in particular the gene identifier mapping
> functionality of the 'org' packages (e.g., org.Hs.eg.db) and the
> genome coordinate functionality of the 'TxDb' packages (e.g.,
> TxDb.Hsapiens.UCSC.hg38.knownGene).
>
> - [pathprint](https://bioconductor.org/packages/pathprint) Algorithms
> to convert a gene expression array provided as an expression table
> or a GEO reference to a 'pathway fingerprint', a vector of discrete
> ternary scores representing high (1), low (-1) or insignificant (0)
> expression in a suite of pathways.
>
> - [pgca](https://bioconductor.org/packages/pgca) Protein Group Code
> Algorithm (PGCA) is a computationally inexpensive algorithm to
> merge protein summaries from multiple quantitative proteomics
> experiments. The algorithm connects two or more groups with
> overlapping accession numbers. In some cases, pairwise groups are
> mutually exclusive but they may still be connected by another group
> (or set of groups) with overlapping accession numbers. Thus, groups
> created by PGCA from multiple experimental runs (i.e., global
> groups) are called "connected" groups. These identified global
> protein groups enable the analysis of quantitative data available
> for protein groups instead of unique protein identifiers.
>
> - [phosphonormalizer](https://bioconductor.org/packages/phosphonormalizer)
> It uses the overlap between enriched and non-enriched datasets to
> compensate for the bias introduced in global phosphorylation after
> applying median normalization.
>
> - [POST](https://bioconductor.org/packages/POST) Perform orthogonal
> projection of high dimensional data of a set, and statistical
> modeling of phenotype with projected vectors as predictors.
>
> - [PPInfer](https://bioconductor.org/packages/PPInfer) Interactions
> between proteins occur in many, if not most, biological processes.
> Most proteins perform their functions in networks associated with
> other proteins and other biomolecules. This fact has motivated the
> development of a variety of experimental methods for the
> identification of protein interactions. This variety has in turn
> ushered in the development of numerous different computational
> approaches for modeling and predicting protein interactions.
> Sometimes an experiment is aimed at identifying proteins closely
> related to some interesting proteins. A network based statistical
> learning method is used to infer the putative functions of proteins
> from the known functions of their neighboring proteins on a PPI
> network. This package identifies such proteins often involved in
> the same or similar biological functions.
>
> - [RaggedExperiment](https://bioconductor.org/packages/RaggedExperiment)
> This package provides a flexible representation of copy number,
> mutation, and other data that fit into the ragged array schema for
> genomic location data. The basic representation of such data
> provides a rectangular flat table interface to the user with range
> information in the rows and samples/specimen in the columns.
>
> - [ramwas](https://bioconductor.org/packages/ramwas) RaMWAS provides
> a complete toolset for methylome-wide association studies (MWAS).
> It is specifically designed for data from enrichment based
> methylation assays, but can be applied to other data as well. The
> analysis pipeline includes seven steps: (1) scanning aligned reads
> from BAM files, (2) calculation of quality control measures, (3)
> creation of methylation score (coverage) matrix, (4) principal
> component analysis for capturing batch effects and detection of
> outliers, (5) association analysis with respect to phenotypes of
> interest while correcting for top PCs and known covariates, (6)
> annotation of significant findings, and (7) multi-marker analysis
> (methylation risk score) using elastic net. Additionally, RaMWAS
> includes tools for joint analysis of methylation and genotype data.
>
> - [REMP](https://bioconductor.org/packages/REMP) Machine
> learning-based tools to predict DNA methylation of locus-specific
> repetitive elements (RE) by learning surrounding genetic and
> epigenetic information. These tools provide genomewide and
> single-base resolution of DNA methylation prediction on RE that are
> difficult to measure using array-based or sequencing-based
> platforms, which enables epigenome-wide association study (EWAS)
> and differentially methylated region (DMR) analysis on RE.
>
> - [RITAN](https://bioconductor.org/packages/RITAN) Tools for
> comprehensive gene set enrichment and extraction of multi-resource
> high confidence subnetworks.
>
> - [RIVER](https://bioconductor.org/packages/RIVER) An implementation
> of a probabilistic modeling framework that jointly analyzes
> personal genome and transcriptome data to estimate the probability
> that a variant has regulatory impact in that individual. It is
> based on a generative model that assumes that genomic annotations,
> such as the location of a variant with respect to regulatory
> elements, determine the prior probability that variant is a
> functional regulatory variant, which is an unobserved variable. The
> functional regulatory variant status then influences whether nearby
> genes are likely to display outlier levels of gene expression in
> that person. See the RIVER website for more information,
> documentation and examples.
>
> - [RJMCMCNucleosomes](https://bioconductor.org/packages/RJMCMCNucleosomes)
> This package does nucleosome positioning using informative
> Multinomial-Dirichlet prior in a t-mixture with reversible jump
> estimation of nucleosome positions for genome-wide profiling.
>
> - [RnaSeqGeneEdgeRQL](https://bioconductor.org/packages/RnaSeqGeneEdgeRQL)
> A workflow package for RNA-Seq experiments.
>
> - [rqt](https://bioconductor.org/packages/rqt) Despite the recent
> advances of modern GWAS methods, calculating an effect size and
> corresponding p-value for a whole gene rather than for a single
> variant remains an important problem. The R package rqt offers
> gene-level GWAS meta-analysis. For more
> information, see: "Gene-set association tests for next-generation
> sequencing data" by Lee et al (2016), Bioinformatics, 32(17),
> i611-i619, <doi:10.1093/bioinformatics/btw429>.
>
> - [RTNduals](https://bioconductor.org/packages/RTNduals) RTNduals is
> a tool that searches for possible co-regulatory loops between
> regulon pairs generated by the RTN package. It compares the shared
> targets in order to infer 'dual regulons', a new concept that tests
> whether regulon pairs agree on the predicted downstream effects.
>
> - [samExploreR](https://bioconductor.org/packages/samExploreR) This R
> package is designed for subsampling procedures that simulate
> sequencing experiments with reduced sequencing depth. This package
> can be used to analyze data generated from all major sequencing
> platforms such as Illumina GA, HiSeq, MiSeq, Roche GS-FLX, ABI
> SOLiD and LifeTech Ion PGM Proton sequencers. It supports multiple
> operating systems including Linux, Mac OS X, FreeBSD and Solaris.
> It was developed using Rsubread.
>
> - [sampleClassifier](https://bioconductor.org/packages/sampleClassifier)
> The package is designed to classify gene expression profiles.
>
> - [scDD](https://bioconductor.org/packages/scDD) This package
> implements a method to analyze single-cell RNA-seq data utilizing
> flexible Dirichlet Process mixture models. Genes with differential
> distributions of expression are classified into several interesting
> patterns of differences between two conditions. The package also
> includes functions for simulating data with these patterns from
> negative binomial distributions.
>
> - [scone](https://bioconductor.org/packages/scone) SCONE is an R
> package for comparing and ranking the performance of different
> normalization schemes for single-cell RNA-seq and other
> high-throughput analyses.
>
> - [semisup](https://bioconductor.org/packages/semisup) This R
> package moves away from testing interaction terms, and moves
> towards testing whether an individual SNP is involved in any
> interaction. This reduces the multiple testing burden to one test
> per SNP, and allows for interactions with unobserved factors.
> Analysing one SNP at a time, it splits the individuals into two
> groups, based on the number of minor alleles. If the quantitative
> trait differs in mean between the two groups, the SNP has a main
> effect. If the quantitative trait differs in distribution between
> some individuals in one group and all other individuals, it
> possibly has an interactive effect. Implicitly, the membership
> probabilities may suggest potential interacting variables.
>
> - [sparseDOSSA](https://bioconductor.org/packages/sparseDOSSA) The
> package provides a model-based Bayesian method to characterize
> and simulate microbiome data. sparseDOSSA's model captures the
> marginal distribution of each microbial feature as a truncated,
> zero-inflated log-normal distribution, with parameters distributed
> as a parent log-normal distribution. The model can be effectively
> fit to reference microbial datasets in order to parameterize their
> microbes and communities, or to simulate synthetic datasets of
> similar population structure. Most importantly, it allows users to
> include both known feature-feature and feature-metadata correlation
> structures and thus provides a gold standard to enable benchmarking
> of statistical methods for metagenomic data analysis.
>
> - [splatter](https://bioconductor.org/packages/splatter) Splatter is
> a package for the simulation of single-cell RNA sequencing count
> data. It provides a simple interface for creating complex
> simulations that are reproducible and well-documented. Parameters
> can be estimated from real data and functions are provided for
> comparing real and simulated datasets.
>
> - [STROMA4](https://bioconductor.org/packages/STROMA4) This package
> estimates four stromal properties, identified in triple-negative
> breast cancer (TNBC) patients, for each patient in a gene
> expression dataset. These stromal property assignments can be
> combined to subtype patients. The four properties represent the
> presence of different cells in the stroma: T-cells (T), B-cells
> (B), stromal infiltrating epithelial cells (E), and desmoplasia
> (D). Additionally, this package can be used to estimate generative
> properties for the Lehmann subtypes, an alternative TNBC subtyping
> scheme (PMID: 21633166).
>
> - [swfdr](https://bioconductor.org/packages/swfdr) This package
> allows users to estimate the science-wise false discovery rate from
> Jager and Leek, "Empirical estimates suggest most published medical
> research is true," 2013, Biostatistics, using an EM approach due to
> the presence of rounding and censoring. It also allows users to
> estimate the proportion of true null hypotheses in the presence of
> covariates, using a regression framework, as per Boca and Leek, "A
> regression framework for the proportion of true null hypotheses,"
> 2015, bioRxiv preprint.
>
> - [TCGAbiolinksGUI](https://bioconductor.org/packages/TCGAbiolinksGUI)
> TCGAbiolinksGUI: A Graphical User Interface to analyze cancer
> molecular and clinical data. A demo version of the GUI is available
> at https://tcgabiolinksgui.shinyapps.io/tcgabiolinks/
>
> - [TCseq](https://bioconductor.org/packages/TCseq) Quantitative and
> differential analysis of epigenomic and transcriptomic time course
> sequencing data, clustering analysis and visualization of temporal
> patterns of time course data.
>
> - [timescape](https://bioconductor.org/packages/timescape) TimeScape
> is an automated tool for navigating temporal clonal evolution data.
> The key attributes of this implementation involve the enumeration
> of clones, their evolutionary relationships and their shifting
> dynamics over time. TimeScape requires two inputs: (i) the clonal
> phylogeny and (ii) the clonal prevalences. Optionally, TimeScape
> accepts a data table of targeted mutations observed in each clone
> and their allele prevalences over time. The output is the TimeScape
> plot showing clonal prevalence vertically, time horizontally, and
> the plot height optionally encoding tumour volume during
> tumour-shrinking events. At each sampling time point (denoted by a
> faint white line), the height of each clone accurately reflects its
> proportionate prevalence. These prevalences form the anchors for
> Bézier curves that visually represent the dynamic transitions
> between time points.
>
> - [treeio](https://bioconductor.org/packages/treeio) Base classes and
> functions for parsing and exporting phylogenetic trees.
>
> - [TSRchitect](https://bioconductor.org/packages/TSRchitect) In
> recent years, large-scale transcriptional sequence data has yielded
> considerable insights into the nature of gene expression and
> regulation in eukaryotes. Techniques that identify the 5' end of
> mRNAs, most notably CAGE, have mapped the promoter landscape across
> a number of model organisms. Due to the variability of TSS
> distributions and the transcriptional noise present in datasets,
> precisely identifying the active promoter(s) for genes from these
> datasets is not straightforward. TSRchitect allows the user to
> efficiently identify the putative promoter (the transcription start
> region, or TSR) from a variety of TSS profiling data types,
> including both single-end (e.g. CAGE) as well as paired-end
> (RAMPAGE, PEAT). Along with the coordinates of identified TSRs,
> TSRchitect also calculates the width, abundance and Shape Index,
> and handles biological replicates for expression profiling.
> Finally, TSRchitect imports annotation files, allowing the user to
> associate identified promoters with genes and other genomic
> features. Three detailed examples of TSRchitect's utility are
> provided in the User's Guide, included with this package.
>
> - [twoddpcr](https://bioconductor.org/packages/twoddpcr) The twoddpcr
> package takes Droplet Digital PCR (ddPCR) droplet amplitude data
> from Bio-Rad's QuantaSoft and can classify the droplets. A summary
> of the positive/negative droplet counts can be generated, which can
> then be used to estimate the number of molecules using the Poisson
> distribution. This is the first open source package that
> facilitates the automatic classification of general two channel
> ddPCR data. Previous work includes 'definetherain' (Jones et al.,
> 2014) and 'ddpcRquant' (Trypsteen et al., 2015) which both handle
> one channel ddPCR experiments only. The 'ddpcr' package available
> on CRAN (Attali et al., 2016) supports automatic gating of a
> specific class of two channel ddPCR experiments only.
>
> - [wiggleplotr](https://bioconductor.org/packages/wiggleplotr) Tools
> to visualise read coverage from sequencing experiments together
> with genomic annotations (genes, transcripts, peaks). Introns of
> long transcripts can be rescaled to a fixed length for better
> visualisation of exonic read coverage.
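>

The twoddpcr entry above mentions estimating the number of molecules from positive/negative droplet counts using the Poisson distribution. As a rough illustration of that idea only, here is a minimal Python sketch. It is not the package's own R code or API: the function names are hypothetical, and it assumes the droplets have already been classified as positive or negative for a single target.

```python
import math


def poisson_copies_per_droplet(positive, negative):
    """Estimate the mean number of target molecules per droplet.

    Under a Poisson model, the probability that a droplet holds zero
    molecules is exp(-lam), so lam = -ln(negative / total).
    """
    total = positive + negative
    if total <= 0:
        raise ValueError("at least one droplet is required")
    if negative <= 0:
        # With no negative droplets the estimate diverges.
        raise ValueError("at least one negative droplet is required")
    return -math.log(negative / total)


def estimate_molecules(positive, negative):
    """Estimate the total number of molecules across all counted droplets."""
    lam = poisson_copies_per_droplet(positive, negative)
    return lam * (positive + negative)


if __name__ == "__main__":
    # Hypothetical counts: half of 10,000 droplets are positive.
    print(poisson_copies_per_droplet(5000, 5000))
    print(estimate_molecules(5000, 5000))
```

The estimate is larger than the raw fraction of positive droplets because a positive droplet may hold more than one molecule. Converting it to a concentration would also require the droplet volume, which this sketch leaves out.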
>
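The pgca entry above describes connecting protein groups that share accession numbers, including groups that are linked only through an intermediate group. One way to picture this is as finding connected components with a union-find structure. The Python sketch below is an illustration under that assumption, not pgca's actual implementation or interface; `connect_groups` and the example accessions are hypothetical.

```python
from collections import defaultdict


def connect_groups(groups):
    """Merge groups of accessions that overlap, directly or transitively.

    `groups` is a list of sets of accession strings. Returns a list of
    sorted accession lists, one per connected ("global") group.
    """
    parent = list(range(len(groups)))

    def find(i):
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            # Keep the smaller index as root so output order is stable.
            parent[max(ri, rj)] = min(ri, rj)

    first_seen = {}  # accession -> index of the first group containing it
    for idx, accessions in enumerate(groups):
        for acc in accessions:
            if acc in first_seen:
                union(idx, first_seen[acc])
            else:
                first_seen[acc] = idx

    merged = defaultdict(set)
    for idx, accessions in enumerate(groups):
        merged[find(idx)].update(accessions)
    return [sorted(accs) for _, accs in sorted(merged.items())]


# Groups 0 and 1 are mutually exclusive but are bridged by group 2.
groups = [{"P1", "P2"}, {"P3", "P4"}, {"P2", "P3"}, {"P9"}]
connected = connect_groups(groups)
print(connected)
```

In this example the first two groups share no accession, yet they end up in the same connected group because the third group overlaps both. This is the situation the entry describes.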
> NEWS from new and existing packages
> ===================================
>
> There is too much NEWS to include here; see the full release
> announcement at
>
> https://bioconductor.org/news/bioc_3_5_release/
>
>
> Deprecated and Defunct Packages
> ===================================
>
> Seven software packages (seqplots, ssviz, stepwiseCM, segmentSeq,
> EWCE, anamiR, IdMappingRetrieval) were marked as deprecated, to be
> fixed or removed in the next release.
>
> Nine previously deprecated software packages (coRNAi, saps, MeSHSim,
> GENE.E, mmnet, CopyNumber450k, AtlasRDF, GEOsearch, pdmclass) were
> removed from the release.
>
>
>
>
> This email message may contain legally privileged and/or confidential information. If you are not the intended recipient(s), or the employee or agent responsible for the delivery of this message to the intended recipient(s), you are hereby notified that any disclosure, copying, distribution, or use of this email message is prohibited. If you have received this message in error, please notify the sender immediately by e-mail and delete this email message from your computer. Thank you.
> _______________________________________________
> Bioc-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>