[Bioc-devel] Bioconductor 3.5 is released

Obenchain, Valerie Valerie.Obenchain at RoswellPark.org
Wed Apr 26 00:39:13 CEST 2017

April 25, 2017


We are pleased to announce Bioconductor 3.5, consisting of 1384
software packages, 316 experiment data packages, and 911 annotation
packages.

There are 88 new software packages, and many updates and improvements
to existing packages; Bioconductor 3.5 is compatible with R 3.4,
and is supported on Linux, 32- and 64-bit Windows, and Mac OS X.  This
release will include an updated Bioconductor [Amazon Machine Image][1]
and [Docker containers][2].

Visit [https://bioconductor.org][3]
for details and downloads.

[1]: https://bioconductor.org/help/bioconductor-cloud-ami/
[2]: https://bioconductor.org/help/docker/
[3]: https://bioconductor.org


* [Getting Started with Bioconductor
* [New Software Packages](#new-software-packages)
* [NEWS from new and existing
* [Deprecated and Defunct Packages](#deprecated-and-defunct-packages)

Getting Started with Bioconductor 3.5

To update to or install Bioconductor 3.5:

1. Install R 3.4.  Bioconductor 3.5 has been designed expressly for
   this version of R.

2. Follow the instructions at

New Software Packages

There are 88 new software packages in this release of Bioconductor.

- [AnnotationFilter](https://bioconductor.org/packages/AnnotationFilter)
  This package provides class and other infrastructure to implement
  filters for manipulating Bioconductor annotation resources. The
  filters will be used by ensembldb, Organism.dplyr, and other

- [ATACseqQC](https://bioconductor.org/packages/ATACseqQC) ATAC-seq,
  an assay for Transposase-Accessible Chromatin using sequencing, is
  a rapid and sensitive method for chromatin accessibility analysis.
  It was developed as an alternative method to MNase-seq, FAIRE-seq
  and DNAse-seq. Compared to the other methods, ATAC-seq requires
  less biological sample and less processing time. In the
  process of analyzing several ATAC-seq datasets produced in our labs,
  we learned some of the unique aspects of the quality assessment for
  ATAC-seq data. To help users quickly assess whether their
  ATAC-seq experiment is successful, we developed the ATACseqQC package,
  partially following the guideline published in Nature Methods 2013
  (Greenleaf et al.), including diagnostic plots of fragment size
  distribution, proportion of mitochondrial reads, nucleosome
  positioning pattern, and CTCF or other Transcription Factor

- [banocc](https://bioconductor.org/packages/banocc) BAnOCC is a
  package designed for compositional data, where each sample sums to
  one. It infers the approximate covariance of the unconstrained data
  using a Bayesian model coded with `rstan`. It provides as output
  the `stanfit` object as well as posterior median and credible
  interval estimates for each correlation element.

- [basecallQC](https://bioconductor.org/packages/basecallQC) The
  basecallQC package provides tools to work with Illumina bcl2Fastq
  (versions >= 2.1.7) software. Prior to basecalling and
  demultiplexing using the bcl2Fastq software, basecallQC functions
  allow the user to update Illumina sample sheets from versions <=
  1.8.9 to >= 2.1.7 standards, clean sample sheets of common problems
  such as invalid sample names and IDs, create read and index
  basemasks and the bcl2Fastq command. Following the generation of
  basecalled and demultiplexed data, the basecallQC package allows
  the user to generate HTML tables, plots and a self-contained report
  of summary metrics from Illumina XML output files.

- [BiocFileCache](https://bioconductor.org/packages/BiocFileCache)
  This package creates a persistent on-disk cache of files that the
  user can add, update, and retrieve. It is useful for managing
  resources (such as custom TxDb objects) that are costly or
  difficult to create, web resources, and data files used across

- [BioCor](https://bioconductor.org/packages/BioCor) Calculates
  functional similarities based on the pathways described on KEGG and
  REACTOME or in gene sets. These similarities can be calculated for
  pathways or gene sets, genes, or clusters and combined with other
  similarities. They can be used to improve networks, gene selection,
  testing relationships...

- [BioMedR](https://bioconductor.org/packages/BioMedR) BioMedR is an
  R/Bioconductor package that generates various
  molecular representations for chemicals, proteins, DNAs/RNAs and
  their interactions.

- [biotmle](https://bioconductor.org/packages/biotmle) This package
  facilitates the discovery of biomarkers from biological sequencing
  data (e.g., microarrays, RNA-seq) based on the associations of
  potential biomarkers with exposure and outcome variables by
  implementing an estimation procedure that combines a generalization
  of the moderated t-statistic with asymptotically linear statistical
  parameters estimated via targeted minimum loss-based estimation

- [BLMA](https://bioconductor.org/packages/BLMA) Suite of tools for
  bi-level meta-analysis. The package can be used in a wide range of
  applications, including general hypothesis testing, differential
  expression analysis, functional analysis, and pathway analysis.

- [BPRMeth](https://bioconductor.org/packages/BPRMeth) The BPRMeth
  package uses the Binomial Probit Regression likelihood to model
  methylation profiles and extract higher order features. These
  features precisely quantify notions of the shape of a methylation
  profile. Using these higher order features across promoter-proximal
  regions, we construct a powerful predictor of gene expression.
  Also, these features are used to cluster proximal-promoter regions
  using the EM algorithm.

- [branchpointer](https://bioconductor.org/packages/branchpointer)
  Predicts branchpoint probability for sites in intronic branchpoint
  windows. Queries can be supplied as intronic regions or, to
  evaluate the effects of mutations, as SNPs.

- [BUMHMM](https://bioconductor.org/packages/BUMHMM) This is a
  probabilistic modelling pipeline for computing per-nucleotide
  posterior probabilities of modification from the data collected in
  structure probing experiments. The model supports multiple
  experimental replicates and empirically corrects coverage- and
  sequence-dependent biases. The model utilises the measure of a
  "drop-off rate" for each nucleotide, which is compared between
  replicates through a log-ratio (LDR). The LDRs between control
  replicates define a null distribution of variability in drop-off
  rate observed by chance, and LDRs between treatment and control
  replicates are compared to this distribution. Resulting empirical
  p-values (probability of being "drawn" from the null distribution)
  are used as observations in a Hidden Markov Model with a
  Beta-Uniform Mixture model used as an emission model. The resulting
  posterior probabilities indicate the probability of a nucleotide
  having been modified in a structure probing experiment.

- [CATALYST](https://bioconductor.org/packages/CATALYST) Mass
  cytometry (CyTOF) uses heavy metal isotopes rather than fluorescent
  tags as reporters to label antibodies, thereby substantially
  decreasing spectral overlap and allowing for examination of over 50
  parameters at the single cell level. While spectral overlap is
  significantly less pronounced in CyTOF than flow cytometry,
  spillover due to detection sensitivity, isotopic impurities, and
  oxide formation can impede data interpretability. We designed
  CATALYST (Cytometry dATa anALYSis Tools) to provide a pipeline for
  preprocessing of cytometry data, including i) normalization using
  bead standards, ii) single-cell deconvolution, and iii) bead-based

- [cellbaseR](https://bioconductor.org/packages/cellbaseR) This R
  package makes use of the exhaustive RESTful Web service API that
  has been implemented for the CellBase database. It enables
  researchers to query and obtain a wealth of biological information
  from a single database, saving a lot of time. Another benefit is
  that researchers can easily make queries about different biological
  topics and link all this information together as all information is

- [cellscape](https://bioconductor.org/packages/cellscape) CellScape
  facilitates interactive browsing of single cell clonal evolution
  datasets. The tool requires two main inputs: (i) the genomic
  content of each single cell in the form of either copy number
  segments or targeted mutation values, and (ii) a single cell
  phylogeny. Phylogenetic formats can vary from dendrogram-like
  phylogenies with leaf nodes to evolutionary model-derived
  phylogenies with observed or latent internal nodes. The CellScape
  phylogeny is flexibly input as a table of source-target edges to
  support arbitrary representations, where each node may or may not
  have associated genomic data. The output of CellScape is an
  interactive interface displaying a single cell phylogeny and a
  cell-by-locus genomic heatmap representing the mutation status in
  each cell for each locus.

- [chimeraviz](https://bioconductor.org/packages/chimeraviz)
  chimeraviz manages data from fusion gene finders and provides
  useful visualization tools.

- [ChIPexoQual](https://bioconductor.org/packages/ChIPexoQual)
  Package with a quality control pipeline for ChIP-exo/nexus data.

- [clusterSeq](https://bioconductor.org/packages/clusterSeq)
  Identification of clusters of co-expressed genes based on their
  expression across multiple (replicated) biological samples.

- [coseq](https://bioconductor.org/packages/coseq) Co-expression
  analysis for expression profiles arising from high-throughput
  sequencing data. Feature (e.g., gene) profiles are clustered using
  adapted transformations and mixture models or a K-means algorithm,
  and model selection criteria (to choose an appropriate number of
  clusters) are provided.

- [cydar](https://bioconductor.org/packages/cydar) Identifies
  differentially abundant populations between samples and groups in
  mass cytometry data. Provides methods for counting cells into
  hyperspheres, controlling the spatial false discovery rate, and
  visualizing changes in abundance in the high-dimensional marker

- [DaMiRseq](https://bioconductor.org/packages/DaMiRseq) The DaMiRseq
  package offers a tidy pipeline of data mining procedures to
  identify transcriptional biomarkers and exploit them for
  classification purposes. The package accepts any kind of data
  presented as a table of raw counts and allows including covariates
  that occur with the experimental setting. A series of functions
  enable the user to clean up the data by filtering genomic features
  and samples, to adjust data by identifying and removing the
  unwanted source of variation (i.e. batches and confounding factors)
  and to select the best predictors for modeling. Finally, a
  "Stacking" ensemble learning technique is applied to build a
  robust classification model. Every step includes a checkpoint that
  the user may exploit to assess the effects of data management by
  looking at diagnostic plots, such as clustering and heatmaps, RLE
  boxplots, MDS or correlation plots.

- [DelayedArray](https://bioconductor.org/packages/DelayedArray)
  Wrapping an array-like object (typically an on-disk object) in a
  DelayedArray object allows one to perform common array operations
  on it without loading the object in memory. In order to reduce
  memory usage and optimize performance, operations on the object are
  either delayed or executed using a block processing mechanism. Note
  that this also works on in-memory array-like objects like DataFrame
  objects (typically with Rle columns), Matrix objects, and ordinary
  arrays and data frames.

- [discordant](https://bioconductor.org/packages/discordant)
  Discordant is a method to determine differential correlation of
  molecular feature pairs from -omics data using mixture models.
  The algorithm is explained further in Siska et al.

- [DMRScan](https://bioconductor.org/packages/DMRScan) This package
  detects significant differentially methylated regions (for both
  qualitative and quantitative traits), using a scan statistic with
  underlying Poisson heuristics. The scan statistic will depend on a
  sequence of window sizes (# of CpGs within each window) and on a
  threshold for each window size. This threshold can be calculated by
  three different means: i) analytically using the Siegmund et al. (2012)
  solution (preferred), ii) importance sampling as suggested by
  Zhang (2008), and iii) a full MCMC modeling of the data, choosing
  between a number of different options for modeling the dependency
  between each CpG.

- [epiNEM](https://bioconductor.org/packages/epiNEM) epiNEM is an
  extension of the original Nested Effects Models (NEM). EpiNEM is
  able to take into account double knockouts and infer more complex
  network signalling pathways.

- [EventPointer](https://bioconductor.org/packages/EventPointer)
  EventPointer is an R package to identify alternative splicing
  events that involve either simple (case-control experiment) or
  complex experimental designs such as time course experiments and
  studies including paired-samples. The algorithm can be used to
  analyze data from either junction arrays (Affymetrix Arrays) or
  sequencing data (RNA-Seq). The software returns a data.frame with
  the detected alternative splicing events: gene name, type of event
  (cassette, alternative 3', etc.), genomic position, statistical
  significance and increment of the percent spliced in (Delta PSI)
  for all the events. The algorithm can generate a series of files to
  visualize the detected alternative splicing events in IGV. This
  eases the interpretation of results and the design of primers for
  standard PCR validation.

- [flowTime](https://bioconductor.org/packages/flowTime) This package
  was developed for analysis of both dynamic and steady state
  experiments examining the function of gene regulatory networks in
  yeast (strain W303) expressing fluorescent reporter proteins using
  BD Accuri C6 and SORP cytometers. However, the functions are for
  the most part general and may be adapted for analysis of other
  organisms using other flow cytometers. Functions in this package
  facilitate the annotation of flow cytometry data with experimental
  metadata, as is requisite for dissemination and general
  ease-of-use. Functions for creating, saving and loading gate sets
  are also included. In the past, we have typically generated summary
  statistics for each flowset for each timepoint and then annotated
  and analyzed these summary statistics. This method loses a great
  deal of the power that comes from the large amounts of individual
  cell data generated in flow cytometry, by essentially collapsing
  this data into a bulk measurement after subsetting. In addition to
  these summary functions, this package also contains functions to
  facilitate annotation and analysis of steady-state or time-lapse
  data utilizing all of the data collected from the thousands of
  individual cells in each sample.

- [funtooNorm](https://bioconductor.org/packages/funtooNorm) Provides
  a function to normalize Illumina Infinium Human Methylation 450
  BeadChip (Illumina 450K), correcting for tissue and/or cell type.

- [GA4GHclient](https://bioconductor.org/packages/GA4GHclient)
  GA4GHclient provides an easy way to access public data servers
  through the Global Alliance for Genomics and Health (GA4GH) genomics
  API. It provides low-level access to the GA4GH API and translates
  response data into Bioconductor-based class objects.

- [gcapc](https://bioconductor.org/packages/gcapc) Peak calling for
  ChIP-seq data with consideration of potential GC bias in sequencing
  reads. GC bias is first estimated with generalized linear mixture
  models using a weighted GC strategy, then applied to peak
  significance estimation.

  This package aims for easily accessible application of classifiers
  that have been published in the literature, using an ExpressionSet as

- [GenomicDataCommons](https://bioconductor.org/packages/GenomicDataCommons)
  Programmatically access the NIH / NCI Genomic Data Commons RESTful

- [GenomicScores](https://bioconductor.org/packages/GenomicScores)
  Provide infrastructure to store and access genomewide
  position-specific scores within R and Bioconductor.

- [GISPA](https://bioconductor.org/packages/GISPA) GISPA is a method
  intended for researchers who are interested in defining gene
  sets with a similar, a priori specified molecular profile. The GISPA
  method has been previously published in Nucleic Acids Research
  (Kowalski et al., 2016; PMID: 26826710).

- [goSTAG](https://bioconductor.org/packages/goSTAG) Gene lists
  derived from the results of genomic analyses are rich in biological
  information. For instance, differentially expressed genes (DEGs)
  from a microarray or RNA-Seq analysis are related functionally in
  terms of their response to a treatment or condition. Gene lists can
  vary in size, up to several thousand genes, depending on the
  robustness of the perturbations or how widely different the
  conditions are biologically. Systematically associating biological
  relatedness between hundreds or thousands of genes
  is impractical when done by manually curating the annotation and function of
  each gene. Over-representation analysis (ORA) of genes was
  developed to identify biological themes. Given a Gene Ontology (GO)
  and an annotation of genes that indicates the categories each one
  fits into, the significance of the over-representation of the genes
  within the ontological categories is determined by a Fisher's exact
  test or modeling according to a hypergeometric distribution.
  Comparing a small number of enriched biological categories for a
  few samples is manageable using Venn diagrams or other means for
  assessing overlaps. However, with hundreds of enriched categories
  and many samples, the comparisons are laborious. Furthermore, if
  there are enriched categories that are shared between samples,
  trying to represent a common theme across them is highly
  subjective. goSTAG uses GO subtrees to tag and annotate genes
  within a set. goSTAG visualizes the similarities between the
  over-representation of DEGs by clustering the p-values from the
  enrichment statistical tests and labels clusters with the GO term
  that has the most paths to the root within the subtree generated
  from all the GO terms in the cluster.

- [GRridge](https://bioconductor.org/packages/GRridge) This package
  allows the use of multiple sources of co-data (e.g. external
  p-values, gene lists, annotation) to improve prediction of binary,
  continuous and survival response using (logistic, linear or Cox)
  group-regularized ridge regression. It also facilitates post-hoc
  variable selection and prediction diagnostics by cross-validation
  using ROC curves and AUC.

- [heatmaps](https://bioconductor.org/packages/heatmaps) This package
  provides functions for plotting heatmaps of genome-wide data across
  genomic intervals, such as ChIP-seq signals at peaks or across
  promoters. Many functions are also provided for investigating
  sequence features.

- [hicrep](https://bioconductor.org/packages/hicrep) Hi-C is a
  powerful technology for studying genome-wide chromatin
  interactions. However, current methods for assessing Hi-C data
  reproducibility can produce misleading results because they ignore
  spatial features in Hi-C data, such as domain structure and
  distance-dependence. We present a novel reproducibility measure
  that systematically takes these features into consideration. This
  measure can assess pairwise differences between Hi-C matrices under
  a wide range of settings, and can be used to determine optimal
  sequencing depth. Compared to existing approaches, it consistently
  shows higher accuracy in distinguishing subtle differences in
  reproducibility and depicting interrelationships of cell lineages.
  This R package `hicrep` implements our

- [ideal](https://bioconductor.org/packages/ideal) This package
  provides functions for an Interactive Differential Expression
  AnaLysis of RNA-sequencing datasets, to quickly and
  effectively extract information downstream of the differential
  expression step. A Shiny application encapsulates the whole package.

- [IMAS](https://bioconductor.org/packages/IMAS) Integrative analysis
  of Multi-omics data for Alternative splicing.

- [ImpulseDE2](https://bioconductor.org/packages/ImpulseDE2)
  ImpulseDE2 is a differential expression algorithm for longitudinal
  count data sets which arise in sequencing experiments such as
  RNA-seq, ChIP-seq, ATAC-seq and DNaseI-seq. ImpulseDE2 is based on
  a negative binomial noise model with dispersion trend smoothing by
  DESeq2 and uses the impulse model to constrain the mean expression
  trajectory of each gene. The impulse model was empirically found to
  fit global expression changes in cells after environmental and
  developmental stimuli and is therefore appropriate in most cell
  biological scenarios. The constraint on the mean expression
  trajectory prevents overfitting to small expression fluctuations.
  Secondly, ImpulseDE2 has higher statistical testing power than
  generalized linear model-based differential expression algorithms
  which fit time as a categorical variable if more than six time
  points are sampled because of the fixed number of parameters.

- [IntEREst](https://bioconductor.org/packages/IntEREst) This package
  performs Intron-Exon Retention analysis on RNA-seq data (.bam

- [IWTomics](https://bioconductor.org/packages/IWTomics)
  Implementation of the Interval-Wise Testing (IWT) for omics data.
  This inferential procedure tests for differences in "Omics" data
  between two groups of genomic regions (or between a group of
  genomic regions and a reference center of symmetry), and does not
  require fixing location and scale at the outset.

- [karyoploteR](https://bioconductor.org/packages/karyoploteR)
  karyoploteR creates karyotype plots of arbitrary genomes and offers
  a complete set of functions to plot arbitrary data on them. It
  mimics many R base graphics functions coupling them with a
  coordinate change function automatically mapping the chromosome and
  data coordinates into the plot coordinates. In addition to the
  provided data plotting functions, it is easy to add new ones.

- [Logolas](https://bioconductor.org/packages/Logolas) Produces logo
  plots of a variety of symbols and names comprising English
  letters, numbers and punctuation. Can be used for sequence
  motif generation, mutation pattern generation, protein amino acid
  generation and symbol strength representation in any generic

- [mapscape](https://bioconductor.org/packages/mapscape) MapScape
  integrates clonal prevalence, clonal hierarchy, anatomic and
  mutational information to provide interactive visualization of
  spatial clonal evolution. There are four inputs to MapScape: (i)
  the clonal phylogeny, (ii) clonal prevalences, (iii) an image
  reference, which may be a medical image or drawing and (iv) pixel
  locations for each sample on the referenced image. Optionally,
  MapScape can accept a data table of mutations for each clone and
  their variant allele frequencies in each sample. The output of
  MapScape consists of a cropped anatomical image surrounded by two
  representations of each tumour sample. The first, a cellular
  aggregate, visually displays the prevalence of each clone. The
  second shows a skeleton of the clonal phylogeny while highlighting
  only those clones present in the sample. Together, these
  representations enable the analyst to visualize the distribution of
  clones throughout anatomic space.

  A problem when recording 3D fluorescent microscopy images is how to
  properly present these results in 2D. Maximum intensity projections
  are a popular method to determine the focal plane of each pixel in
  the image. The problem with this approach, however, is that
  out-of-focus elements will still be visible, making edges and fine
  structures difficult to detect. This package aims to resolve this
  problem by using the contrast around a given pixel to determine the
  focal plane, allowing for a much cleaner structure detection than
  would be otherwise possible. For convenience, this package also
  contains functions to perform various other types of projections,
  including a maximum intensity projection.

- [MCbiclust](https://bioconductor.org/packages/MCbiclust) Custom
  made algorithm and associated methods for finding, visualising and
  analysing biclusters in large gene expression data sets. Given a
  supplied gene set of size n, the algorithm finds the maximum
  strength correlation matrix containing m samples from the data set.

- [metavizr](https://bioconductor.org/packages/metavizr) This package
  provides Websocket communication to the metaviz web app
  (http://metaviz.cbcb.umd.edu) for interactive visualization of
  metagenomics data. Objects in R/bioc interactive sessions can be
  displayed in plots and data can be explored using a facetzoom
  visualization. Fundamental Bioconductor data structures are
  supported (e.g., MRexperiment objects), while providing an easy
  mechanism to support other data structures. Visualizations (using
  d3.js) can be easily added to the web app as well.

- [methylInheritance](https://bioconductor.org/packages/methylInheritance)
  Permutation analysis, based on Monte Carlo sampling, for testing
  the hypothesis that the number of conserved differentially
  methylated elements, between several generations, is associated with
  an effect inherited from a treatment and that stochastic effect can
  be dismissed.

- [MIGSA](https://bioconductor.org/packages/MIGSA) Massive and
  Integrative Gene Set Analysis. The MIGSA package allows users to perform
  a massive and integrative gene set analysis over several expression
  and gene sets simultaneously. It provides a common gene expression
  analytic framework that grants a comprehensive and coherent
  analysis. Only a minimal user parameter setting is required to
  perform both singular and gene set enrichment analyses in an
  integrative manner by means of the best available methods, i.e.
  dEnricher and mGSZ, respectively. The greatest strengths of this big
  omics data tool are the availability of several functions to
  explore, analyze and visualize its results in order to facilitate
  the data mining task over huge information sources. The MIGSA package
  also provides several functions that allow users to easily load the most
  up-to-date gene sets from several repositories.

- [mimager](https://bioconductor.org/packages/mimager) Easily
  visualize and inspect microarrays for spatial artifacts.

- [motifcounter](https://bioconductor.org/packages/motifcounter)
  'motifcounter' provides functionality to compute the statistics
  related with motif matching and counting of motif matches in DNA
  sequences. As an input, 'motifcounter' requires a motif in terms of
  a position frequency matrix (PFM). Furthermore, a set of DNA
  sequences is required to estimate a higher-order background model
  (BGM). The package provides functions to investigate the
  per-position and per-strand log-likelihood scores between the PFM
  and the BGM across a given sequence or set of sequences.
  Furthermore, the package facilitates motif matching based on an
  automatically derived score threshold. To this end the distribution
  of scores is efficiently determined and the score threshold is
  chosen for a user-prescribed significance level. This allows
  control of the false positive rate. Moreover, 'motifcounter'
  implements a motif match enrichment test based on the number of
  motif matches that are expected in random DNA sequences. Motif
  enrichment is facilitated by either a compound Poisson
  approximation or a combinatorial approximation of the motif match
  counts. Both models take higher-order background models, the
  motif's self-similarity, and hits on both DNA strands into account.
  The package is in particular useful for long motifs and/or relaxed
  choices of score thresholds, because the implemented algorithms
  efficiently bypass the need for enumerating a (potentially huge)
  set of DNA words that can give rise to a motif match.

- [msgbsR](https://bioconductor.org/packages/msgbsR) Pipeline for the
  analysis of an MS-GBS experiment.

- [multiOmicsViz](https://bioconductor.org/packages/multiOmicsViz)
  Calculate the Spearman correlation between the source omics data
  and other target omics data, identify the significant correlations
  and plot the significant correlations on a heat map in which the
  x-axis and y-axis are ordered by the chromosomal location.

- [MWASTools](https://bioconductor.org/packages/MWASTools) MWASTools
  provides a complete pipeline to perform metabolome-wide association
  studies. Key functionalities of the package include: quality
  control analysis of metabonomic data; MWAS using different
  association models (partial correlations; generalized linear
  models); model validation using non-parametric bootstrapping;
  visualization of MWAS results; NMR metabolite identification using

- [NADfinder](https://bioconductor.org/packages/NADfinder) Call peaks
  for two samples: target and control. It counts the reads for
  tiles of the genome and then converts them to ratios. The ratios are
  corrected and smoothed. Z-scores are calculated for each
  counting window over the background. The peaks are detected
  based on z-scores.

- [netReg](https://bioconductor.org/packages/netReg) netReg fits
  linear regression models using network-penalization. Graph prior
  knowledge, in the form of biological networks, is
  incorporated into the likelihood of the linear model. The networks
  describe biological relationships such as co-regulation or
  dependency of the same transcription factors/metabolites/etc.
  yielding a part sparse and part smooth solution for coefficient

- [Organism.dplyr](https://bioconductor.org/packages/Organism.dplyr)
  This package provides an alternative interface to Bioconductor
  'annotation' resources, in particular the gene identifier mapping
  functionality of the 'org' packages (e.g., org.Hs.eg.db) and the
  genome coordinate functionality of the 'TxDb' packages (e.g.,

- [pathprint](https://bioconductor.org/packages/pathprint) Algorithms
  to convert a gene expression array provided as an expression table
  or a GEO reference to a 'pathway fingerprint', a vector of discrete
  ternary scores representing high (1), low (-1) or insignificant (0)
  expression in a suite of pathways.

- [pgca](https://bioconductor.org/packages/pgca) Protein Group Code
  Algorithm (PGCA) is a computationally inexpensive algorithm to
  merge protein summaries from multiple experimental quantitative
  proteomics data. The algorithm connects two or more groups with
  overlapping accession numbers. In some cases, pairwise groups are
  mutually exclusive but they may still be connected by another group
  (or set of groups) with overlapping accession numbers. Thus, groups
  created by PGCA from multiple experimental runs (i.e., global
  groups) are called "connected" groups. These identified global
  protein groups enable the analysis of quantitative data available
  for protein groups instead of unique protein identifiers.

- [phosphonormalizer](https://bioconductor.org/packages/phosphonormalizer)
  It uses the overlap between enriched and non-enriched datasets to
  compensate for the bias introduced in global phosphorylation after
  applying median normalization.

- [POST](https://bioconductor.org/packages/POST) Perform orthogonal
  projection of high dimensional data of a set, and statistical
  modeling of phenotype with projected vectors as predictors.

- [PPInfer](https://bioconductor.org/packages/PPInfer) Interactions
  between proteins occur in many, if not most, biological processes.
  Most proteins perform their functions in networks associated with
  other proteins and other biomolecules. This fact has motivated the
  development of a variety of experimental methods for the
  identification of protein interactions. This variety has in turn
  ushered in the development of numerous different computational
  approaches for modeling and predicting protein interactions.
  Sometimes an experiment is aimed at identifying proteins closely
  related to some interesting proteins. A network based statistical
  learning method is used to infer the putative functions of proteins
  from the known functions of their neighboring proteins on a PPI
  network. This package identifies such proteins often involved in
  the same or similar biological functions.

- [RaggedExperiment](https://bioconductor.org/packages/RaggedExperiment)
  This package provides a flexible representation of copy number,
  mutation, and other data that fit into the ragged array schema for
  genomic location data. The basic representation of such data
  provides a rectangular flat table interface to the user with range
  information in the rows and samples/specimens in the columns.

- [ramwas](https://bioconductor.org/packages/ramwas) RaMWAS provides
  a complete toolset for methylome-wide association studies (MWAS).
  It is specifically designed for data from enrichment based
  methylation assays, but can be applied to other data as well. The
  analysis pipeline includes seven steps: (1) scanning aligned reads
  from BAM files, (2) calculation of quality control measures, (3)
  creation of methylation score (coverage) matrix, (4) principal
  component analysis for capturing batch effects and detection of
  outliers, (5) association analysis with respect to phenotypes of
  interest while correcting for top PCs and known covariates, (6)
  annotation of significant findings, and (7) multi-marker analysis
  (methylation risk score) using elastic net. Additionally, RaMWAS
  includes tools for joint analysis of methylation and genotype data.

- [REMP](https://bioconductor.org/packages/REMP) Machine
  learning-based tools to predict DNA methylation of locus-specific
  repetitive elements (RE) by learning surrounding genetic and
  epigenetic information. These tools provide genomewide and
  single-base resolution of DNA methylation prediction on RE that are
  difficult to measure using array-based or sequencing-based
  platforms, which enables epigenome-wide association study (EWAS)
  and differentially methylated region (DMR) analysis on RE.

- [RITAN](https://bioconductor.org/packages/RITAN) Tools for
  comprehensive gene set enrichment and extraction of multi-resource
  high confidence subnetworks.

- [RIVER](https://bioconductor.org/packages/RIVER) An implementation
  of a probabilistic modeling framework that jointly analyzes
  personal genome and transcriptome data to estimate the probability
  that a variant has regulatory impact in that individual. It is
  based on a generative model that assumes that genomic annotations,
  such as the location of a variant with respect to regulatory
  elements, determine the prior probability that the variant is a
  functional regulatory variant, which is an unobserved variable. The
  functional regulatory variant status then influences whether nearby
  genes are likely to display outlier levels of gene expression in
  that person. See the RIVER website for more information,
  documentation and examples.

- [RJMCMCNucleosomes](https://bioconductor.org/packages/RJMCMCNucleosomes)
  This package does nucleosome positioning using an informative
  Multinomial-Dirichlet prior in a t-mixture with reversible jump
  estimation of nucleosome positions for genome-wide profiling.

- [RnaSeqGeneEdgeRQL](https://bioconductor.org/packages/RnaSeqGeneEdgeRQL)
  A workflow package for RNA-Seq experiments

- [rqt](https://bioconductor.org/packages/rqt) Despite recent
  advances in GWAS methods, calculating an effect size and
  corresponding p-value for a whole gene rather than for a single
  variant remains an important problem. The R package rqt offers
  gene-level GWAS meta-analysis. For more
  information, see: "Gene-set association tests for next-generation
  sequencing data" by Lee et al (2016), Bioinformatics, 32(17),
  i611-i619, <doi:10.1093/bioinformatics/btw429>.

- [RTNduals](https://bioconductor.org/packages/RTNduals) RTNduals is
  a tool that searches for possible co-regulatory loops between
  regulon pairs generated by the RTN package. It compares the shared
  targets in order to infer 'dual regulons', a new concept that tests
  whether regulon pairs agree on the predicted downstream effects.

- [samExploreR](https://bioconductor.org/packages/samExploreR) This R
  package provides a subsampling procedure to simulate
  sequencing experiments with reduced sequencing depth. This package
  can be used to analyze data generated from all major sequencing
  platforms such as Illumina GA, HiSeq, MiSeq, Roche GS-FLX, ABI
  SOLiD and LifeTech Ion PGM Proton sequencers. It supports multiple
  operating systems including Linux, Mac OS X, FreeBSD and Solaris.
  It was developed using Rsubread.

- [sampleClassifier](https://bioconductor.org/packages/sampleClassifier)
  The package is designed to classify gene expression profiles.

- [scDD](https://bioconductor.org/packages/scDD) This package
  implements a method to analyze single-cell RNA-seq data utilizing
  flexible Dirichlet Process mixture models. Genes with differential
  distributions of expression are classified into several interesting
  patterns of differences between two conditions. The package also
  includes functions for simulating data with these patterns from
  negative binomial distributions.

- [scone](https://bioconductor.org/packages/scone) SCONE is an R
  package for comparing and ranking the performance of different
  normalization schemes for single-cell RNA-seq and other
  high-throughput analyses.

- [semisup](https://bioconductor.org/packages/semisup) This R
  package moves away from testing interaction terms, and moves
  towards testing whether an individual SNP is involved in any
  interaction. This reduces the multiple testing burden to one test
  per SNP, and allows for interactions with unobserved factors.
  Analysing one SNP at a time, it splits the individuals into two
  groups, based on the number of minor alleles. If the quantitative
  trait differs in mean between the two groups, the SNP has a main
  effect. If the quantitative trait differs in distribution between
  some individuals in one group and all other individuals, it
  possibly has an interactive effect. Implicitly, the membership
  probabilities may suggest potential interacting variables.

- [sparseDOSSA](https://bioconductor.org/packages/sparseDOSSA) This
  package provides a model-based Bayesian method to characterize
  and simulate microbiome data. sparseDOSSA's model captures the
  marginal distribution of each microbial feature as a truncated,
  zero-inflated log-normal distribution, with parameters distributed
  as a parent log-normal distribution. The model can be effectively
  fit to reference microbial datasets in order to parameterize their
  microbes and communities, or to simulate synthetic datasets of
  similar population structure. Most importantly, it allows users to
  include both known feature-feature and feature-metadata correlation
  structures and thus provides a gold standard to enable benchmarking
  of statistical methods for metagenomic data analysis.

- [splatter](https://bioconductor.org/packages/splatter) Splatter is
  a package for the simulation of single-cell RNA sequencing count
  data. It provides a simple interface for creating complex
  simulations that are reproducible and well-documented. Parameters
  can be estimated from real data and functions are provided for
  comparing real and simulated datasets.

- [STROMA4](https://bioconductor.org/packages/STROMA4) This package
  estimates four stromal properties identified in TNBC patients for
  each patient in a gene expression dataset. These stromal property
  assignments can be combined to subtype patients. These four stromal
  properties were identified in triple-negative breast cancer (TNBC)
  patients and represent the presence of different cells in the
  stroma: T-cells (T), B-cells (B), stromal infiltrating epithelial
  cells (E), and desmoplasia (D). Additionally, this package can
  be used to estimate generative properties for the Lehmann subtypes,
  an alternative TNBC subtyping scheme (PMID: 21633166).

- [swfdr](https://bioconductor.org/packages/swfdr) This package
  allows users to estimate the science-wise false discovery rate from
  Jager and Leek, "Empirical estimates suggest most published medical
  research is true," 2013, Biostatistics, using an EM approach due to
  the presence of rounding and censoring. It also allows users to
  estimate the proportion of true null hypotheses in the presence of
  covariates, using a regression framework, as per Boca and Leek, "A
  regression framework for the proportion of true null hypotheses,"
  2015, bioRxiv preprint.

- [TCGAbiolinksGUI](https://bioconductor.org/packages/TCGAbiolinksGUI)
  "TCGAbiolinksGUI: A Graphical User Interface to analyze cancer
  molecular and clinical data. A demo version of GUI is found in

- [TCseq](https://bioconductor.org/packages/TCseq) Quantitative and
  differential analysis of epigenomic and transcriptomic time course
  sequencing data, clustering analysis and visualization of temporal
  patterns of time course data.

- [timescape](https://bioconductor.org/packages/timescape) TimeScape
  is an automated tool for navigating temporal clonal evolution data.
  The key attributes of this implementation involve the enumeration
  of clones, their evolutionary relationships and their shifting
  dynamics over time. TimeScape requires two inputs: (i) the clonal
  phylogeny and (ii) the clonal prevalences. Optionally, TimeScape
  accepts a data table of targeted mutations observed in each clone
  and their allele prevalences over time. The output is the TimeScape
  plot showing clonal prevalence vertically, time horizontally, and
  the plot height optionally encoding tumour volume during
  tumour-shrinking events. At each sampling time point (denoted by a
  faint white line), the height of each clone accurately reflects its
  proportionate prevalence. These prevalences form the anchors for
  bezier curves that visually represent the dynamic transitions
  between time points.

- [treeio](https://bioconductor.org/packages/treeio) Base classes and
  functions for parsing and exporting phylogenetic trees.

- [TSRchitect](https://bioconductor.org/packages/TSRchitect) In
  recent years, large-scale transcriptional sequence data has yielded
  considerable insights into the nature of gene expression and
  regulation in eukaryotes. Techniques that identify the 5' end of
  mRNAs, most notably CAGE, have mapped the promoter landscape across
  a number of model organisms. Due to the variability of TSS
  distributions and the transcriptional noise present in datasets,
  precisely identifying the active promoter(s) for genes from these
  datasets is not straightforward. TSRchitect allows the user to
  efficiently identify the putative promoter (the transcription start
  region, or TSR) from a variety of TSS profiling data types,
  including both single-end (e.g. CAGE) as well as paired-end
  (RAMPAGE, PEAT). Along with the coordinates of identified TSRs,
  TSRchitect also calculates the width, abundance and Shape Index,
  and handles biological replicates for expression profiling.
  Finally, TSRchitect imports annotation files, allowing the user to
  associate identified promoters with genes and other genomic
  features. Three detailed examples of TSRchitect's utility are
  provided in the User's Guide, included with this package.

- [twoddpcr](https://bioconductor.org/packages/twoddpcr) The twoddpcr
  package takes Droplet Digital PCR (ddPCR) droplet amplitude data
  from Bio-Rad's QuantaSoft and can classify the droplets. A summary
  of the positive/negative droplet counts can be generated, which can
  then be used to estimate the number of molecules using the Poisson
  distribution. This is the first open source package that
  facilitates the automatic classification of general two channel
  ddPCR data. Previous work includes 'definetherain' (Jones et al.,
  2014) and 'ddpcRquant' (Trypsteen et al., 2015) which both handle
  one channel ddPCR experiments only. The 'ddpcr' package available
  on CRAN (Attali et al., 2016) supports automatic gating of a
  specific class of two channel ddPCR experiments only.

- [wiggleplotr](https://bioconductor.org/packages/wiggleplotr) Tools
  to visualise read coverage from sequencing experiments together
  with genomic annotations (genes, transcripts, peaks). Introns of
  long transcripts can be rescaled to a fixed length for better
  visualisation of exonic read coverage.
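
The pgca entry above describes connecting protein groups that share
accession numbers, including groups that are mutually exclusive but
linked through a third group. A minimal sketch of that idea, using a
union-find structure, is shown here. This is not the pgca package's
code or API; the function name `connect_groups` and the example
accessions are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def connect_groups(groups):
    """Merge groups of accessions that overlap directly or transitively.

    `groups` is a list of sets of accession strings (hypothetical input
    format). Returns a sorted list of sorted accession lists.
    """
    parent = list(range(len(groups)))

    def find(i):
        # Follow parent links to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[max(ri, rj)] = min(ri, rj)

    # Any accession seen in two groups links those groups together.
    first_seen = {}
    for idx, accessions in enumerate(groups):
        for acc in accessions:
            if acc in first_seen:
                union(idx, first_seen[acc])
            else:
                first_seen[acc] = idx

    # Collect all accessions belonging to each connected component.
    merged = defaultdict(set)
    for idx, accessions in enumerate(groups):
        merged[find(idx)].update(accessions)
    return sorted(sorted(g) for g in merged.values())


# Groups 0 and 1 share no accession, but group 2 overlaps both.
groups = [{"P1", "P2"}, {"P3", "P4"}, {"P2", "P3"}, {"P9"}]
connected = connect_groups(groups)
```

In this example the first three groups collapse into one "connected"
group, while the group containing only `P9` stays separate.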

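The twoddpcr entry above mentions estimating the number of molecules
from positive/negative droplet counts using the Poisson distribution.
A minimal sketch of the standard Poisson correction follows, assuming
molecules are distributed independently across droplets. It is not
taken from the twoddpcr package; the function name `poisson_copies`
and the droplet counts are hypothetical.

```python
import math


def poisson_copies(positive, negative):
    """Estimate mean copies per droplet and total molecules.

    Under a Poisson model, the fraction of negative (empty) droplets is
    exp(-lambda), so lambda = -ln(negative / total).
    """
    total = positive + negative
    if total <= 0 or negative <= 0:
        # With no negative droplets the estimate is undefined.
        raise ValueError("need at least one negative droplet")
    mean_per_droplet = -math.log(negative / total)
    return mean_per_droplet, mean_per_droplet * total


# Hypothetical counts: 5000 positive and 15000 negative droplets.
lam, molecules = poisson_copies(positive=5000, negative=15000)
```

The estimated number of molecules exceeds the number of positive
droplets because some droplets contain more than one molecule.
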
NEWS from new and existing packages

There is too much NEWS to include here; see the full release
announcement at


Deprecated and Defunct Packages

Seven software packages (seqplots, ssviz, stepwiseCM, segmentSeq,
EWCE, anamiR, IdMappingRetrieval) were marked as deprecated, to be
fixed or removed in the next release.

Nine previously deprecated software packages (coRNAi, saps, MeSHSim,
GENE.E, mmnet, CopyNumber450k, AtlasRDF, GEOsearch, pdmclass) were
removed from the release.
