[Bioc-devel] Bioconductor 3.4 is released

Hervé Pagès hpages at fredhutch.org
Tue Oct 18 23:21:31 CEST 2016


Thanks to all the developers for your contribution to the project!

-------------------------------------------------------------------

October 18, 2016

Bioconductors:

We are pleased to announce Bioconductor 3.4, consisting of 1294
software packages, 309 experiment data packages, and 933
up-to-date annotation packages.

There are 100 new software packages, and many updates and improvements
to existing packages; Bioconductor 3.4 is compatible with R 3.3,
and is supported on Linux, 32- and 64-bit Windows, and Mac OS X.  This
release will include an updated Bioconductor Amazon Machine Image[1]
and Docker containers[2].

Visit http://bioconductor.org[3] for details and downloads.

[1]: http://bioconductor.org/help/bioconductor-cloud-ami/
[2]: http://bioconductor.org/help/docker/
[3]: http://bioconductor.org

Contents
--------

* Getting Started with Bioconductor 3.4
* New Software Packages
* NEWS from new and existing packages
* Deprecated and Defunct Packages

Getting Started with Bioconductor 3.4
======================================

To update to or install Bioconductor 3.4:

1. Install R 3.3 (>= 3.3.1 recommended).  Bioconductor 3.4 has been
    designed expressly for this version of R.

2. Follow the instructions at http://bioconductor.org/install/

New Software Packages
=====================

There are 100 new software packages in this release of Bioconductor.

alpine - Fragment sequence bias modeling and correction for RNA-seq 
transcript abundance estimation.

AMOUNTAIN - A purely data-driven gene network, the weighted gene 
co-expression network (WGCN), can be constructed from expression 
profiles alone. Different layers in such networks may represent 
different time points, multiple conditions or various species. AMOUNTAIN 
aims to search for active modules in multi-layer WGCNs using a 
continuous optimization approach.

anamiR - This package is intended to identify potential miRNA-target 
gene interactions from miRNA and mRNA expression data. It contains 
functions for statistical tests, databases of miRNA-target gene 
interactions, and functional analysis.

Anaquin - The project is intended to support the use of sequins 
(synthetic sequencing spike-in controls) owned and made available by the 
Garvan Institute of Medical Research. The goal is to provide a standard 
open source library for quantitative analysis, modelling and 
visualization of spike-in controls.

annotatr - Given a set of genomic sites/regions (e.g. ChIP-seq peaks, 
CpGs, differentially methylated CpGs or regions, SNPs, etc.) it is often 
of interest to investigate the intersecting genomic annotations. Such 
annotations include those relating to gene models (promoters, 5'UTRs, 
exons, introns, and 3'UTRs), CpGs (CpG islands, CpG shores, CpG 
shelves), or regulatory sequences such as enhancers. The annotatr 
package provides an easy way to summarize and visualize the intersection 
of genomic sites/regions with genomic annotations.

ASAFE - Given admixed individuals' bi-allelic SNP genotypes and ancestry 
pairs (where each ancestry can take one of three values) for multiple 
SNPs, perform an EM algorithm to deal with the fact that SNP genotypes 
are unphased with respect to ancestry pairs, in order to estimate 
ancestry-specific allele frequencies for all SNPs.

ASpli - Integrative pipeline for the analysis of alternative splicing 
using RNA-seq.

BaalChIP - The package offers functions to process multiple ChIP-seq BAM 
files and detect allele-specific events. Computes allele counts at 
individual variants (SNPs/SNVs), implements extensive QC steps to remove 
problematic variants, and utilizes a Bayesian framework to identify 
statistically significant allele-specific events. BaalChIP is able to 
account for copy number differences between the two alleles, a known 
phenotypical feature of cancer samples.

BayesKnockdown - A simple, fast Bayesian method for computing posterior 
probabilities for relationships between a single predictor variable and 
multiple potential outcome variables, incorporating prior probabilities 
of relationships. In the context of knockdown experiments, the predictor 
variable is the knocked-down gene, while the other genes are potential 
targets. Can also be used for differential expression/2-class data.

bigmelon - Methods for working with Illumina arrays using gdsfmt.

bioCancer - bioCancer is a Shiny app for interactively visualizing and 
analysing multi-assay cancer genomic data.

BiocWorkflowTools - Provides functions to ease the transition between 
Rmarkdown and LaTeX documents when authoring a Bioconductor Workflow.

CancerInSilico - The CancerInSilico package provides an R interface for 
running mathematical models of tumor progression. This package has the 
underlying models implemented in C++ and the output and analysis 
features implemented in R.

CancerSubtypes - CancerSubtypes integrates current common computational 
biology methods for cancer subtype identification and provides a 
standardized framework for cancer subtype analysis based on genomic 
datasets.

ccmap - Finds drugs and drug combinations that are predicted to reverse 
or mimic gene expression signatures. These drugs might reverse diseases 
or mimic healthy lifestyles.

CCPROMISE - Performs canonical correlation between two forms of high 
dimensional genetic data, and associates the first component of each form 
of data with a specific biologically interesting pattern of associations 
with multiple endpoints. A probe level analysis is also implemented.

CellMapper - Infers cell type-specific expression based on co-expression 
similarity with known cell type marker genes. Can make accurate 
predictions using publicly available expression data, even when a cell 
type has not been isolated before.

chromstaR - This package implements functions for combinatorial and 
differential analysis of ChIP-seq data. It includes uni- and 
multivariate peak-calling, export to genome browser viewable files, and 
functions for enrichment analyses.

clusterExperiment - This package provides functions for running and 
comparing many different clusterings of single-cell sequencing data.

covEB - Uses Bayesian methods to estimate correlation matrices, assuming 
that they can be written and estimated as block diagonal matrices. These 
block diagonal matrices are determined using a shrinkage parameter: 
values below this parameter are set to zero.

covRNA - This package provides the analysis methods fourth-corner and RLQ 
analysis for large-scale transcriptomic data.

crisprseekplus - Bioinformatics platform containing interface to work 
with offTargetAnalysis and compare2Sequences in the CRISPRseek package, 
and GUIDEseqAnalysis.

crossmeta - Implements cross-platform and cross-species meta-analyses 
of Affymetrix, Illumina, and Agilent microarray data. This package 
automates common tasks such as downloading, normalizing, and annotating 
raw GEO data. A user interface makes it easy to select control and 
treatment samples for each contrast and study. This input is used for 
subsequent surrogate variable analysis (which models unaccounted-for 
sources of variation) and differential expression analysis. Final 
meta-analysis of 
differential expression values can include genes measured in only a 
subset of studies.

ctsGE - Methodology for supervised clustering of potentially many 
predictor variables, such as genes, in time series datasets. 
Provides functions that help the user assign genes to a predefined set 
of model profiles.

CVE - Shiny app for interactive variant prioritisation in precision 
cancer medicine. The input file for CVE is the output file of the 
recently released Oncotator Variant Annotation tool summarising 
variant-centric information from 14 different publicly available 
resources relevant for cancer research. Interactive prioritisation in 
CVE is based on known germline and cancer variants, DNA repair genes and 
functional prediction scores. An optional feature of CVE is the 
exploration of the tumour-specific pathway context that is facilitated 
using co-expression modules generated from publicly available 
transcriptome data. Finally, druggability of prioritised variants is 
assessed using the Drug Gene Interaction Database (DGIdb).

CytoML - This package is designed to use GatingML2.0 as the standard 
format to exchange gated data with other software platforms.

DeepBlueR - Accessing the DeepBlue Epigenetics Data Server through R.

DEsubs - DEsubs is a network-based systems biology package that extracts 
disease-perturbed subpathways within a pathway network as recorded by 
RNA-seq experiments. It contains an extensive and customizable framework 
covering a broad range of operation modes at all stages of the 
subpathway analysis, enabling a case-specific approach. The operation 
modes refer to the pathway network construction and processing, the 
subpathway extraction, visualization and enrichment analysis with regard 
to various biological and pharmacological features. Its capabilities 
render it a tool-guide for both the modeler and experimentalist for the 
identification of more robust systems-level biomarkers for complex diseases.

Director - Director is an R package designed to streamline the 
visualization of molecular effects in regulatory cascades. It utilizes 
the R package htmltools and a modified Sankey plugin of the JavaScript 
library D3 to provide a fast and easy, browser-enabled solution to 
discovering potentially interesting downstream effects of regulatory 
and/or co-expressed molecules. The diagrams are robust, interactive, and 
packaged as highly-portable HTML files that eliminate the need for 
third-party software to view. This enables a straightforward approach 
for scientists to interpret the data produced, and bioinformatics 
developers an alternative means to present relevant data.

dSimer - dSimer is an R package which provides computation of nine 
methods for measuring disease-disease similarity, including a standard 
cosine similarity measure and eight function-based methods. The disease 
similarity matrix obtained from these nine methods can be visualized 
as a heatmap or a network. Biological data widely used in 
disease-disease association studies are also provided by dSimer.
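
As a rough illustration of the standard cosine similarity measure 
mentioned in the dSimer entry, the Python sketch below compares two 
diseases represented as binary gene-association vectors. The vectors and 
names are made up for illustration, and this is not dSimer's own API.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)

# Hypothetical gene-association profiles (1 = gene linked to the disease).
disease_a = [1, 1, 0, 1, 0]
disease_b = [1, 0, 0, 1, 1]
similarity = cosine_similarity(disease_a, disease_b)
print(round(similarity, 3))
```

The two toy profiles share two of their three associated genes, so the 
similarity is 2/3.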

eegc - This package has been developed to evaluate cellular engineering 
processes for direct differentiation of stem cells or conversion 
(transdifferentiation) of somatic cells to primary cells based on high 
throughput gene expression data screened either by DNA microarray or RNA 
sequencing. The package takes gene expression profiles as inputs from 
three types of samples: (i) somatic or stem cells to be 
(trans)differentiated (input of the engineering process), (ii) induced 
cells to be evaluated (output of the engineering process) and (iii) 
target primary cells (reference for the output). The package performs 
differential gene expression analysis for each pair-wise sample 
comparison to identify and evaluate the transcriptional differences 
among the 3 types of samples (input, output, reference). The ideal goal 
is to have induced and primary reference cells showing overlapping 
profiles, both very different from the original cells.

esetVis - Utility functions for visualization of ExpressionSet (or 
SummarizedExperiment) Bioconductor objects, including spectral map, t-SNE 
and linear discriminant analysis. Static plots via the ggplot2 package or 
interactive plots via the ggvis or rbokeh packages are available.

ExperimentHub - This package provides a client for the Bioconductor 
ExperimentHub web resource. ExperimentHub provides a central location 
where curated data from experiments, publications or training courses 
can be accessed. Each resource has associated metadata, tags and date of 
modification. The client creates and manages a local cache of files 
retrieved enabling quick and reproducible access.

ExperimentHubData - Functions to add metadata to ExperimentHub db and 
resource files to AWS S3 buckets.

fCCAC - An application of functional canonical correlation analysis to 
assess covariance of nucleic acid sequencing datasets such as chromatin 
immunoprecipitation followed by deep sequencing (ChIP-seq).

fgsea - The package implements an algorithm for fast gene set enrichment 
analysis. The fast algorithm makes it possible to run more permutations 
and obtain more fine-grained p-values, which in turn allows the use of 
accurate standard approaches to multiple hypothesis correction.

FitHiC - Fit-Hi-C is a tool for assigning statistical confidence 
estimates to intra-chromosomal contact maps produced by genome-wide 
genome architecture assays such as Hi-C.

flowPloidy - Determine sample ploidy via flow cytometry histogram 
analysis. Reads Flow Cytometry Standard (FCS) files via the flowCore 
Bioconductor package, and provides functions for determining the DNA 
ploidy of samples based on internal standards.

FunChIP - Preprocessing and smoothing of ChIP-Seq peaks and efficient 
implementation of the k-mean alignment algorithm to classify them.

GAprediction - GAprediction predicts gestational age using Illumina 
HumanMethylation450 CpG data.

gCrisprTools - Set of tools for evaluating pooled high-throughput 
screening experiments, typically employing CRISPR/Cas9 or shRNA 
expression cassettes. Contains methods for interrogating library and 
cassette behavior within an experiment, identifying differentially 
abundant cassettes, aggregating signals to identify candidate targets 
for empirical validation, hypothesis testing, and comprehensive reporting.

GEM - Tools for genome-wide analysis of EWAS, methQTL and GxE.

geneAttribution - Identification of the most likely gene or genes 
through which variation at a given genomic locus in the human genome 
acts. The most basic functionality assumes that the closer a gene is to 
the input locus, the more likely the gene is to be causative. 
Additionally, any empirical data that links genomic regions to genes 
(e.g. eQTL or genome conformation data) can be used if it is supplied in 
the UCSC .BED file format.
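
The distance assumption in the geneAttribution entry can be sketched in 
Python as follows. The exponential decay and its rate are assumptions 
chosen purely for illustration (the entry only states that closer genes 
are considered more likely), and the gene names are hypothetical.

```python
import math

def distance_weights(gene_distances, decay=1e-5):
    """Turn gene-to-locus distances (bp) into normalized likelihood weights.

    Assumes, for illustration only, that likelihood decays exponentially
    with distance; `decay` is a hypothetical rate per base pair.
    """
    raw = {gene: math.exp(-decay * dist) for gene, dist in gene_distances.items()}
    total = sum(raw.values())
    return {gene: weight / total for gene, weight in raw.items()}

# Hypothetical distances from the input locus to three nearby genes.
weights = distance_weights({"geneA": 1_000, "geneB": 50_000, "geneC": 200_000})
for gene, weight in sorted(weights.items(), key=lambda item: -item[1]):
    print(gene, round(weight, 3))
```

Additional evidence such as eQTL data could, under the same assumptions, 
be folded in by multiplying the raw weights before normalizing.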

GeneGeneInteR - The aim of this package is to propose several methods 
for testing gene-gene interaction in case-control association studies. 
Such a test can be done by aggregating SNP-SNP interaction tests 
performed at the SNP level (SSI) or by using gene-gene multidimensional 
methods (GGI). The package also proposes tools for a graphic 
display of the results.

geneplast - Geneplast is designed for evolutionary and plasticity 
analysis based on orthologous groups distribution in a given species 
tree. It uses Shannon information theory and orthologs abundance to 
estimate the Evolutionary Plasticity Index. Additionally, it implements 
the Bridge algorithm to determine the evolutionary root of a given gene 
based on its orthologs distribution.
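
The geneplast entry does not give the formula for the Evolutionary 
Plasticity Index, so the Python sketch below only shows the Shannon 
entropy building block it refers to, applied to made-up ortholog counts 
per species. The normalization by the maximum entropy is an assumption 
for readability, not geneplast's definition.

```python
import math

def shannon_entropy(counts):
    """Shannon entropy (natural log) of a vector of non-negative counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    probs = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in probs)

def normalized_diversity(counts):
    """Entropy scaled to [0, 1] by the maximum possible for len(counts)."""
    if len(counts) < 2:
        return 0.0
    return shannon_entropy(counts) / math.log(len(counts))

# Hypothetical ortholog counts for one orthologous group in four species.
even_group = [2, 2, 2, 2]      # evenly spread across species
skewed_group = [8, 0, 0, 0]    # found in a single species only
print(normalized_diversity(even_group), normalized_diversity(skewed_group))
```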

geneXtendeR - geneXtendeR is designed to optimally annotate a histone 
modification ChIP-seq peak input file with functionally important 
genomic features (e.g., genes associated with peaks) based on 
optimization calculations.  geneXtendeR optimally extends the boundaries 
of every gene in a genome by some genomic distance (in DNA base pairs) 
for the purpose of flexibly incorporating cis-regulatory elements 
(CREs), such as enhancers and promoters, as well as downstream elements 
that are important to the function of the gene relative to an epigenetic 
histone modification ChIP-seq dataset. geneXtendeR computes optimal gene 
extensions tailored to the broadness of the specific epigenetic mark 
(e.g., H3K9me1, H3K27me3), as determined by a user-supplied ChIP-seq 
peak input file. As such, geneXtendeR maximizes the signal-to-noise 
ratio of locating genes closest to and directly under peaks. By 
performing a computational expansion of this nature, ChIP-seq reads that 
would initially not map strictly to a specific gene can now be optimally 
mapped to the regulatory regions of the gene, thereby implicating the 
gene as a potential candidate, and thereby making the ChIP-seq 
experiment more successful. Such an approach becomes particularly 
important when working with epigenetic histone modifications that have 
inherently broad peaks.
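
The idea of extending gene boundaries and checking how many peaks then 
fall under a gene can be sketched in Python as below. This is a 
simplified illustration, not geneXtendeR's implementation: it assumes a 
single chromosome, a symmetric extension on both sides, and made-up 
coordinates.

```python
def extend(gene, distance):
    """Extend a (start, end) gene interval by `distance` bp on both sides."""
    start, end = gene
    return (max(0, start - distance), end + distance)

def count_peaks_under_genes(genes, peaks, distance):
    """Count peaks overlapping at least one gene extended by `distance` bp."""
    extended = [extend(gene, distance) for gene in genes]
    return sum(
        any(p_start <= g_end and p_end >= g_start for g_start, g_end in extended)
        for p_start, p_end in peaks
    )

# Hypothetical gene and peak coordinates on one chromosome.
genes = [(10_000, 12_000), (40_000, 45_000)]
peaks = [(9_500, 9_800), (13_500, 13_900), (30_000, 30_400), (44_000, 46_000)]
for distance in (0, 500, 2_000):
    print(distance, count_peaks_under_genes(genes, peaks, distance))
```

Comparing the counts across candidate distances gives a crude view of how 
much each extension gains, which is the kind of trade-off the 
optimization described above addresses.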

GOpro - Find the most characteristic gene ontology terms for groups of 
human genes. This package was created as a part of the thesis which was 
developed under the auspices of MI^2 Group (http://mi2.mini.pw.edu.pl/, 
https://github.com/geneticsMiNIng).

GRmetrics - Functions for calculating and visualizing growth-rate 
inhibition (GR) metrics.

HelloRanges - Translates bedtools command-line invocations to R code 
calling functions from the Bioconductor *Ranges infrastructure. This is 
intended to educate novice Bioconductor users and to compare the syntax 
and semantics of the two frameworks.

ImpulseDE - ImpulseDE is suited to capture single impulse-like patterns 
in high throughput time series datasets. By fitting a representative 
impulse model to each gene, it reports differentially expressed genes 
whether across time points in a single experiment or between two time 
courses from two experiments. To optimize the running time, the code 
makes use of clustering steps and multi-threading.

IPO - The outcome of XCMS data processing strongly depends on the 
parameter settings. IPO (`Isotopologue Parameter Optimization`) is a 
parameter optimization tool that is applicable for different kinds of 
samples and liquid chromatography coupled to high resolution mass 
spectrometry devices, fast and free of labeling steps. IPO uses natural, 
stable 13C isotopes to calculate a peak picking score. Retention time 
correction is optimized by minimizing the relative retention time 
differences within features and grouping parameters are optimized by 
maximizing the number of features showing exactly one peak from each 
injection of a pooled sample. The different parameter settings are 
achieved by design of experiment. The resulting scores are evaluated 
using response surface models.

KEGGlincs - See what is going on 'under the hood' of KEGG pathways by 
explicitly re-creating the pathway maps from information obtained from 
KGML files.

LINC - This package provides methods to compute co-expression networks 
of lincRNAs and protein-coding genes. Biological terms associated with 
the sets of protein-coding genes predict the biological contexts of 
lincRNAs according to the 'Guilty by Association' approach.

LOBSTAHS - LOBSTAHS is a multifunction package for screening, 
annotation, and putative identification of mass spectral features in 
large, HPLC-MS lipid datasets. In silico data for a wide range of 
lipids, oxidized lipids, and oxylipins can be generated from 
user-supplied structural criteria with a database generation function. 
LOBSTAHS then applies these databases to assign putative compound 
identities to features in any high-mass accuracy dataset that has been 
processed using xcms and CAMERA. Users can then apply a series of 
orthogonal screening criteria based on adduct ion formation patterns, 
chromatographic retention time, and other properties, to evaluate and 
assign confidence scores to this list of preliminary assignments. During 
the screening routine, LOBSTAHS rejects assignments that do not meet the 
specified criteria, identifies potential isomers and isobars, and 
assigns a variety of annotation codes to assist the user in evaluating 
the accuracy of each assignment.

M3Drop - This package fits a Michaelis-Menten model to the pattern of 
dropouts in single-cell RNA-seq data. This model is used as a null to 
identify significantly variable (i.e. differentially expressed) genes 
for use in downstream analysis, such as clustering cells.
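
A minimal Python sketch of the Michaelis-Menten idea follows, assuming 
the dropout rate of a gene with mean expression S is modelled as 
1 - S / (K + S). The crude grid search, the candidate values of K and 
the toy data are all assumptions for illustration and do not reflect how 
M3Drop fits the model.

```python
def mm_dropout(mean_expr, k):
    """Expected dropout rate under a Michaelis-Menten curve: 1 - S / (K + S)."""
    return 1.0 - mean_expr / (k + mean_expr)

def fit_k(means, dropouts, candidates):
    """Pick the candidate K with the smallest squared error (crude grid search)."""
    def sse(k):
        return sum((mm_dropout(s, k) - d) ** 2 for s, d in zip(means, dropouts))
    return min(candidates, key=sse)

# Toy data: mean expression and observed dropout fraction per gene.
means = [1.0, 5.0, 10.0, 50.0]
dropouts = [0.91, 0.67, 0.50, 0.17]
k = fit_k(means, dropouts, candidates=[1, 2, 5, 10, 20, 50])

# Genes with dropout rates well above the fitted curve would be candidates
# for "significantly variable" under this null.
residuals = [d - mm_dropout(s, k) for s, d in zip(means, dropouts)]
print(k, [round(r, 3) for r in residuals])
```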

MADSEQ - The MADSEQ package provides a group of hierarchical Bayesian 
models for the detection of mosaic aneuploidy, the inference of the type 
of aneuploidy and also for the quantification of the fraction of 
aneuploid cells in the sample.

maftools - Analyze and visualize Mutation Annotation Format (MAF) files 
from large scale sequencing studies. This package provides various 
functions to perform most commonly used analyses in cancer genomics and 
to create feature-rich, customizable visualizations with minimal effort.

MAST - Methods and models for handling zero-inflated single cell assay data.

matter - Memory-efficient reading, writing, and manipulation of 
structured binary data on disk as vectors, matrices, and arrays. This 
package is designed to be used as a back-end for Cardinal for working 
with high-resolution mass spectrometry imaging data.

meshes - MeSH (Medical Subject Headings) is the NLM controlled 
vocabulary used to manually index articles for MEDLINE/PubMed. MeSH 
terms are associated with Entrez Gene IDs by three methods: gendoo, 
gene2pubmed and RBBH. This association is fundamental for enrichment and 
semantic analyses. meshes supports enrichment analysis 
(over-representation and gene set enrichment analysis) of gene list or 
whole expression profile. The semantic comparisons of MeSH terms provide 
quantitative ways to compute similarities between genes and gene groups. 
meshes implements five methods proposed by Resnik, Schlicker, Jiang, 
Lin and Wang respectively and supports more than 70 species.

MetaboSignal - MetaboSignal is an R package that allows merging, 
analyzing and customizing metabolic and signaling KEGG pathways. It is a 
network-based approach designed to explore the topological relationship 
between genes (signaling- or enzymatic-genes) and metabolites, 
representing a powerful tool to investigate the genetic landscape and 
regulatory networks of metabolic phenotypes.

MetCirc - MetCirc comprises a workflow to interactively explore 
metabolomics data: create MSP, bin m/z values, calculate similarity 
between precursors and visualise similarities.

methylKit - methylKit is an R package for DNA methylation analysis and 
annotation from high-throughput bisulfite sequencing. The package is 
designed to deal with sequencing data from RRBS and its variants, but 
also target-capture methods and whole genome bisulfite sequencing. It 
also has functions to analyze base-pair resolution 5hmC data from 
experimental protocols such as oxBS-Seq and TAB-Seq. Perl is needed to 
read SAM files only.

MGFR - The package is designed to detect marker genes from RNA-seq data.

MODA - MODA can be used to estimate and construct condition-specific 
gene co-expression networks, and identify differentially expressed 
subnetworks as conserved or condition specific modules which are 
potentially associated with relevant biological processes.

MoonlightR - Motivation: The understanding of cancer mechanism requires 
the identification of genes playing a role in the development of the 
pathology and the characterization of their role (notably oncogenes and 
tumor suppressors). Results: We present an R/Bioconductor package called 
MoonlightR which returns a list of candidate driver genes for specific 
cancer types on the basis of TCGA expression data. The method first 
infers gene regulatory networks and then carries out a functional 
enrichment analysis (FEA) (implementing an upstream regulator analysis, 
URA) to score the importance of well-known biological processes with 
respect to the studied cancer type. Eventually, by means of random 
forests, MoonlightR predicts two specific roles for the candidate driver 
genes: i) tumor suppressor genes (TSGs) and ii) oncogenes (OCGs). As a 
consequence, this methodology does not only identify genes playing a 
dual role (e.g. TSG in one cancer type and OCG in another) but also 
helps in elucidating the biological processes underlying their specific 
roles. In particular, MoonlightR can be used to discover OCGs and TSGs 
in the same cancer type. This may help in answering the question whether 
some genes change role between early stages (I, II) and late stages 
(III, IV) in breast cancer. In the future, this analysis could be useful 
to determine the causes of different resistances to chemotherapeutic 
treatments.

msPurity - Assess the contribution of the targeted precursor in 
fragmentation acquired or anticipated isolation windows using a metric 
called "precursor purity". Also provides simple processing steps 
(averaging, filtering, blank subtraction, etc) for DI-MS data. Works for 
both LC-MS(/MS) and DI-MS(/MS) data.

MultiAssayExperiment - Develop an integrative environment where multiple 
assays are managed and preprocessed for genomic data analysis.

MutationalPatterns - An extensive toolset for the characterization and 
visualization of a wide range of mutational patterns in base 
substitution data.

netprioR - A model for semi-supervised prioritisation of genes 
integrating network data, phenotypes and additional prior knowledge 
about TP and TN gene labels from the literature or experts.

normr - Robust normalization and difference calling procedures for 
ChIP-seq and alike data. Read counts are modeled jointly as a binomial 
mixture model with a user-specified number of components. A fitted 
background estimate accounts for the effect of enrichment in certain 
regions and, therefore, represents an appropriate null hypothesis. This 
robust background is used to identify significantly enriched or depleted 
regions.
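
To illustrate what modelling read counts as a binomial mixture can look 
like, here is a small expectation-maximization sketch in Python. It 
assumes each bin has a ChIP count and a control count, and that the ChIP 
count given the bin total is binomial with a component-specific rate. 
The data, starting values and fixed iteration count are made up, and 
normr's actual fitting and testing procedure is not reproduced here.

```python
import math

def binom_logpmf(k, n, p):
    """Log-probability of k successes in n trials with success rate p."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + k * math.log(p) + (n - k) * math.log(1 - p)

def fit_binomial_mixture(chip, control, init_thetas, n_iter=50):
    """EM for a binomial mixture over per-bin (ChIP, control) read counts."""
    thetas = list(init_thetas)
    weights = [1.0 / len(thetas)] * len(thetas)
    totals = [s + r for s, r in zip(chip, control)]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each bin.
        resp = []
        for s, n in zip(chip, totals):
            logp = [math.log(w) + binom_logpmf(s, n, t)
                    for w, t in zip(weights, thetas)]
            top = max(logp)
            unnorm = [math.exp(v - top) for v in logp]
            z = sum(unnorm)
            resp.append([u / z for u in unnorm])
        # M-step: re-estimate mixing weights and success rates.
        for j in range(len(thetas)):
            col = [row[j] for row in resp]
            weights[j] = sum(col) / len(chip)
            thetas[j] = (sum(c * s for c, s in zip(col, chip))
                         / sum(c * n for c, n in zip(col, totals)))
    return weights, thetas, resp

# Toy bins: five background-like bins and three enriched-looking bins.
chip = [10, 11, 9, 10, 30, 29, 31, 10]
control = [10] * 8
weights, thetas, resp = fit_binomial_mixture(chip, control, init_thetas=[0.4, 0.8])
print([round(t, 2) for t in thetas])
```

In this sketch the component with the lower rate plays the role of the 
background, and the per-bin posteriors in `resp` indicate which bins look 
enriched relative to it.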

PathoStat - The purpose of this package is to perform Statistical 
Microbiome Analysis on metagenomics results from sequencing data 
samples. In particular, it supports analyses on the PathoScope generated 
report files. PathoStat provides various functionalities including 
Relative Abundance charts, Diversity estimates and plots, tests of 
Differential Abundance, Time Series visualization, and Core OTU analysis.

PharmacoGx - Contains a set of functions to perform large-scale analysis 
of pharmacogenomic data.

philr - PhILR is short for Phylogenetic Isometric Log-Ratio Transform. 
This package provides functions for the analysis of compositional data 
(e.g., data representing proportions of different variables/parts). 
Specifically this package allows analysis of compositional data where 
the parts can be related through a phylogenetic tree (as is common in 
microbiota survey data) and makes available the Isometric Log Ratio 
transform built from the phylogenetic tree and utilizing a weighted 
reference measure.
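
For readers unfamiliar with the Isometric Log Ratio transform, the 
Python sketch below computes a single unweighted ILR balance between two 
groups of parts, such as the taxa on either side of one internal node of 
a tree. The grouping and abundances are made up, and the weighted 
reference measure mentioned in the philr entry is deliberately left out.

```python
import math

def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

def ilr_balance(numerator_parts, denominator_parts):
    """Unweighted ILR balance between two groups of compositional parts."""
    r, s = len(numerator_parts), len(denominator_parts)
    scale = math.sqrt(r * s / (r + s))
    ratio = geometric_mean(numerator_parts) / geometric_mean(denominator_parts)
    return scale * math.log(ratio)

# Hypothetical relative abundances of taxa below the two children of a node.
left_clade = [0.30, 0.20]
right_clade = [0.25, 0.25]
print(round(ilr_balance(left_clade, right_clade), 4))
```

Because only ratios enter the balance, rescaling all parts by a common 
factor leaves the result unchanged, which is why the transform suits 
proportion data.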

Pi - Priority index or Pi is developed as a genomic-led target 
prioritisation system, with the focus on leveraging human genetic data 
to prioritise potential drug targets at the gene, pathway and network 
level. The long term goal is to use such information to enhance 
early-stage target validation. Based on evidence of disease association 
from genome-wide association studies (GWAS), this prioritisation system 
is able to generate evidence to support identification of the specific 
modulated genes (seed genes) that are responsible for the genetic 
association signal by utilising knowledge of linkage disequilibrium 
(co-inherited genetic variants), distance of associated variants from 
the gene, and evidence of independent genetic association with gene 
expression in disease-relevant tissues, cell types and states. Seed 
genes are scored in an integrative way, quantifying the genetic 
influence. Scored seed genes are subsequently used as baits to rank seed 
genes plus additional (non-seed) genes; this is achieved by iteratively 
exploring the global connectivity of a gene interaction network. Genes 
with the highest priority are further used to identify/prioritise 
pathways that are significantly enriched with highly prioritised genes. 
Prioritised genes are also used to identify a gene network 
interconnecting highly prioritised genes and a minimal number of less 
prioritised genes (which act as linkers bringing together highly 
prioritised genes).

Pigengene - Pigengene package provides an efficient way to infer 
biological signatures from gene expression profiles. The signatures are 
independent from the underlying platform, e.g., the input can be 
microarray or RNA Seq data. It can even infer the signatures using data 
from one platform, and evaluate them on the other. Pigengene identifies 
the modules (clusters) of highly coexpressed genes using coexpression 
network analysis, summarizes the biological information of each module 
in an eigengene, learns a Bayesian network that models the probabilistic 
dependencies between modules, and builds a decision tree based on the 
expression of eigengenes.

proFIA - Flow Injection Analysis coupled to High-Resolution Mass 
Spectrometry is a promising approach for high-throughput metabolomics. 
FIA-HRMS data, however, cannot be pre-processed with current software 
tools which rely on liquid chromatography separation, or handle low 
resolution data only. Here we present the proFIA package, which 
implements a new methodology to pre-process FIA-HRMS raw data (netCDF, 
mzData, mzXML, and mzML) including noise modelling and injection peak 
reconstruction, and generate the peak table. The workflow includes noise 
modelling, band detection and filtering then signal matching and missing 
value imputation. The peak table can then be exported as a .tsv file for 
further analysis. Visualisations to assess the quality of the data and 
of the signal made are easily produced.

psichomics - Automatically retrieve data from RNA-Seq sources such as 
The Cancer Genome Atlas or load your own files and process the data. 
This tool allows you to analyse and visualise alternative splicing.

qsea - qsea (quantitative sequencing enrichment analysis) was developed 
as the successor of the MEDIPS package for analyzing data derived from 
methylated DNA immunoprecipitation (MeDIP) experiments followed by 
sequencing (MeDIP-seq). However, qsea provides several functionalities 
for the analysis of other kinds of quantitative sequencing data (e.g. 
ChIP-seq, MBD-seq, CMS-seq and others) including calculation of 
differential enrichment between groups of samples.

RCAS - RCAS is an automated system that provides dynamic genome 
annotations for custom input files that contain transcriptomic regions. 
Such transcriptomic regions could be, for instance, peak regions 
detected by CLIP-Seq analysis that detect protein-RNA interactions, RNA 
modifications (alias the epitranscriptome), CAGE-tag locations, or any 
other collection of target regions at the level of the transcriptome. 
RCAS is designed as a reporting tool for the functional analysis of 
RNA-binding sites detected by high-throughput experiments. It takes as 
input a BED format file containing the genomic coordinates of the RNA 
binding sites and a GTF file that contains the genomic annotation 
features usually provided by publicly available databases such as 
Ensembl and UCSC. RCAS performs overlap operations between the genomic 
coordinates of the RNA binding sites and the genomic annotation features 
and produces in-depth annotation summaries such as the distribution of 
binding sites with respect to gene features (exons, introns, 5'/3' UTR 
regions, exon-intron boundaries, promoter regions, and whole 
transcripts). Moreover, by detecting the collection of targeted 
transcripts, RCAS can produce functional annotation tables for 
enriched gene sets (annotated by the Molecular Signatures Database) and 
GO terms. To address one of the most important questions that arise 
during protein-RNA interaction analysis, RCAS has a module for detecting 
sequence motifs enriched in the targeted regions of the transcriptome. A 
full interactive report in HTML format can be generated that contains 
interactive figures and tables that are ready for publication purposes.

rDGIdb - The rDGIdb package provides a wrapper for the Drug Gene 
Interaction Database (DGIdb). For simplicity, the wrapper query function 
and output resembles the user interface and results format provided on 
the DGIdb website (http://dgidb.genome.wustl.edu/).

readat - This package contains functionality to import, transform and 
annotate data from ADAT files generated by the SomaLogic SOMAscan platform.

recount - Explore and download data from the recount project available 
at https://jhubiostatistics.shinyapps.io/recount/. Using the recount 
package you can download RangedSummarizedExperiment objects at the gene, 
exon or exon-exon junctions level, the raw counts, the phenotype 
metadata used, the URLs to the sample coverage bigWig files or the mean 
coverage bigWig file for a particular study. The 
RangedSummarizedExperiment objects can be used by different packages for 
performing differential expression analysis. Using 
http://bioconductor.org/packages/derfinder you can perform 
annotation-agnostic differential expression analyses with the data from 
the recount project as described at 
http://biorxiv.org/content/early/2016/08/08/068478.

regsplice - Statistical methods for detection of differential exon usage 
in RNA-seq and exon microarray data sets, using L1 regularization 
(lasso) to improve power.

sights - SIGHTS is a suite of normalization methods, statistical tests, 
and diagnostic graphical tools for high throughput screening (HTS) 
assays. HTS assays use microtitre plates to screen large libraries of 
compounds for their biological, chemical, or biochemical activity.

signeR - The signeR package provides an empirical Bayesian approach to 
mutational signature discovery. It is designed to analyze single 
nucleotide variation (SNV) counts in cancer genomes, but can also be 
applied to other features. Functionalities to characterize 
signatures or genome samples according to exposure patterns are also 
provided.

SIMLR - Single-cell RNA-seq technologies enable high throughput gene 
expression measurement of individual cells, and allow the discovery of 
heterogeneity within cell populations. Measurement of cell-to-cell gene 
expression similarity is critical to identification, visualization and 
analysis of cell populations. However, single-cell data introduce 
challenges to conventional measures of gene expression similarity 
because of the high level of noise, outliers and dropouts. We develop a 
novel similarity-learning framework, SIMLR (Single-cell Interpretation 
via Multi-kernel LeaRning), which learns an appropriate distance metric 
from the data for dimension reduction, clustering and visualization. 
SIMLR is capable of separating known subpopulations more accurately in 
single-cell data sets than do existing dimension reduction methods. 
Additionally, SIMLR demonstrates high sensitivity and accuracy on 
high-throughput peripheral blood mononuclear cells (PBMC) data sets 
generated by the GemCode single-cell technology from 10x Genomics.

SNPediaR - SNPediaR provides some tools for downloading and parsing data 
from the SNPedia web site <http://www.snpedia.com>. The implemented 
functions allow users to import the wiki text available in SNPedia pages 
and to extract the most relevant information out of them. If some 
information in the downloaded pages is not automatically processed by 
the library functions, users can easily implement their own parsers to 
access it in an efficient way.

SPLINTER - SPLINTER provides tools to analyze alternative splicing 
sites, interpret outcomes based on sequence information, select and 
design primers for site validation and give a visual representation of 
the event to guide downstream experiments.

SRGnet - We developed SRGnet to analyze synergistic regulatory 
mechanisms in transcriptome profiles that act to enhance the overall 
cell response to combinations of mutations, drugs or environmental 
exposures. This package can be used to identify regulatory modules 
downstream of synergistic response genes, prioritize synergistic 
regulatory genes that may be potential intervention targets, and 
contextualize gene perturbation experiments.

StarBioTrek - StarBioTrek provides methods to measure pathway activity 
and cross-talk among pathways, also integrating information from 
network data.

statTarget - An easy-to-use tool that provides a graphical user 
interface for quality-control-based signal shift correction, 
integration of metabolomic data from multi-batch experiments, and 
comprehensive statistical analysis in non-targeted or targeted 
metabolomics.

SVAPLSseq - The package contains functions that are intended for the 
identification of differentially expressed genes between two groups of 
samples from RNAseq data after adjusting for various hidden biological 
and technical factors of variability.

switchde - Inference and detection of switch-like differential 
expression across single-cell RNA-seq trajectories.

synergyfinder - Efficient implementations for all the popular synergy 
scoring models for drug combinations, including HSA, Loewe, Bliss and 
ZIP, and visualization of the synergy scores as either a two-dimensional 
or a three-dimensional interaction surface over the dose matrix.
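
As a rough illustration of two of the reference models named above, the 
sketch below scores a single dose pair under Bliss independence and 
highest single agent (HSA). It is not the synergyfinder interface: the 
function names and response values are hypothetical, and responses are 
assumed to be fractional inhibitions between 0 and 1. Loewe and ZIP are 
left out because they need fitted dose-response curves.

```python
def bliss_expected(y_a, y_b):
    """Expected combined inhibition if the two drugs act independently."""
    return y_a + y_b - y_a * y_b


def hsa_expected(y_a, y_b):
    """Expected combined inhibition under the highest-single-agent model."""
    return max(y_a, y_b)


def synergy_excess(observed, expected):
    """Positive values suggest synergy, negative values antagonism."""
    return observed - expected


# Hypothetical fractional inhibitions for drug A, drug B and the combination.
y_a, y_b, y_ab = 0.30, 0.40, 0.70

bliss_score = synergy_excess(y_ab, bliss_expected(y_a, y_b))
hsa_score = synergy_excess(y_ab, hsa_expected(y_a, y_b))

print(f"Bliss excess: {bliss_score:.2f}")
print(f"HSA excess:   {hsa_score:.2f}")
```

Applying the same calculation to every cell of a dose matrix gives the 
kind of score surface that the package description refers to.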

TVTB - The package provides S4 classes and methods to filter, summarise 
and visualise genetic variation data stored in VCF files. In particular, 
the package extends the FilterRules class (S4Vectors package) to define 
new classes of filter rules applicable to the various slots of VCF 
objects. Functionalities are integrated and demonstrated in a Shiny 
web-application, the Shiny Variant Explorer (tSVE).

uSORT - This package is designed to uncover the intrinsic cell 
progression path from single-cell RNA-seq data. It incorporates data 
pre-processing, preliminary PCA gene selection, preliminary cell 
ordering, feature selection, refined cell ordering, and post-analysis 
interpretation and visualization.

yamss - Tools to analyze and visualize high-throughput metabolomics data 
acquired using chromatography-mass spectrometry. These tools preprocess 
data in a way that enables reliable and powerful differential analysis.

YAPSA - This package provides functions and routines useful in the 
analysis of somatic signatures (cf. L. Alexandrov et al., Nature 2013). 
In particular, functions to perform a signature analysis with known 
signatures (LCD = linear combination decomposition) and a signature 
analysis on a stratified mutational catalogue (SMC = stratify mutational 
catalogue) are provided.

yarn - Expedite large RNA-Seq analyses using a combination of previously 
developed tools. YARN is meant to make it easier for the user to 
perform basic mis-annotation quality control, filtering, and 
condition-aware normalization. YARN leverages many Bioconductor tools 
and statistical techniques to account for the large heterogeneity and 
sparsity found in very large RNA-seq experiments.

NEWS from new and existing packages
===================================

There is too much NEWS to include here; see the full release 
announcement at

   https://bioconductor.org/news/bioc_3_4_release/

Deprecated and Defunct Packages
===============================

1 software package (betr) was marked as deprecated, to be removed in the 
next release.

17 previously deprecated software packages were removed from this release.


