Type: | Package |
Title: | Finding Rhythms Using Extended Circadian Harmonic Oscillators (ECHO) |
Version: | 4.0.1 |
Description: | Provides a function (echo_find()) designed to find rhythms from data using extended harmonic oscillators. For more information, see H. De los Santos et al. (2020) <doi:10.1093/bioinformatics/btz617>. |
License: | MIT + file LICENSE |
Encoding: | UTF-8 |
LazyData: | true |
Imports: | minpack.lm (≥ 1.2.1), boot (≥ 1.3-22) |
URL: | https://github.com/delosh653/ECHO |
RoxygenNote: | 6.1.1 |
Suggests: | knitr, rmarkdown, ggplot2 |
VignetteBuilder: | knitr |
NeedsCompilation: | no |
Packaged: | 2020-06-10 13:45:25 UTC; delosh |
Author: | Hannah De los Santos [aut], Emily Collins [aut], Kristin Bennett [aut], Jennifer Hurley [aut, cre], R Development Core Team [aut] |
Maintainer: | Jennifer Hurley <hurlej2@rpi.edu> |
Repository: | CRAN |
Date/Publication: | 2020-06-10 19:50:14 UTC |
echo.find: Provides a function (echo_find) designed to find rhythms from data using extended harmonic oscillators.
Description
To read more about our initial work on this project and to cite us, see "Circadian Rhythms in Neurospora Exhibit Biologically Relevant Driven and Damped Harmonic Oscillations" by H. De los Santos et al. (2017).
Function to calculate the results for all genes using the extended circadian harmonic oscillator (ECHO) method.
Description
Function to calculate the results for all genes using the extended circadian harmonic oscillator (ECHO) method.
Usage
echo_find(genes, begin, end, resol, num_reps, low = 1, high = 2,
run_all_per, paired, rem_unexpr, rem_unexpr_amt = 70,
rem_unexpr_amt_below = 0, is_normal, is_de_linear_trend, is_smooth,
run_conf = F, which_conf = "Bootstrap", harm_cut = 0.03,
over_cut = 0.15, seed = 30)
Arguments
genes |
data frame of genes with the following specifications: first row is column labels, first column has gene labels/names, and all other columns have expression data. This expression data must be ordered by time point then by replicate, and must have evenly spaced time points. Any missing data must have cells left blank. |
begin |
first time point for dataset |
end |
last time point for dataset |
resol |
resolution of time points |
num_reps |
number of replicates |
low |
lower limit when looking for rhythms, in hours. May be unused if finding rhythms of any length within the time course (run_all_per is TRUE). |
high |
upper limit when looking for rhythms, in hours. May be unused if finding rhythms of any length within the time course (run_all_per is TRUE). |
run_all_per |
boolean that indicates whether rhythms of any length within the time course should be searched for. |
paired |
if replicate data, whether the replicates are related (paired) or not (unpaired) |
rem_unexpr |
boolean indicating whether genes with less than rem_unexpr_amt percent expression should not be considered |
rem_unexpr_amt |
percentage of expression for which genes should not be considered if rem_unexpr is TRUE |
rem_unexpr_amt_below |
cutoff for expression |
is_normal |
boolean that indicates whether data should be normalized or not |
is_de_linear_trend |
boolean that indicates whether linear trends should be removed from data or not |
is_smooth |
boolean that indicates whether data should be smoothed or not |
run_conf |
boolean of whether or not to run confidence intervals |
which_conf |
string of which type of confidence interval to compute ("Bootstrap" or "Jackknife") |
harm_cut |
positive number indicating the cutoff for a gene to be considered harmonic |
over_cut |
positive number indicating the cutoff for a gene to be considered repressed/overexpressed |
seed |
number for random seed to fix for bootstrapping for confidence intervals |
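The layout expected for genes, and how begin, end, resol, and num_reps relate to its column count, can be sketched in base R. This is a minimal illustration rather than part of the package: the data frame toy_genes, its random values, the gene labels, and the smaller time range are all made up for the example.

```r
# Hypothetical toy input; values are random and purely illustrative.
begin <- 2; end <- 10; resol <- 2; num_reps <- 3

# Evenly spaced time points implied by begin, end, and resol.
timen <- seq(begin, end, by = resol)

# Columns ordered by time point, then by replicate:
# CT2.1, CT2.2, CT2.3, CT4.1, ...
col_names <- paste0("CT", rep(timen, each = num_reps), ".",
                    rep(seq_len(num_reps), times = length(timen)))

set.seed(1)
toy_genes <- data.frame(
  Gene.Name = c("geneA", "geneB"),
  matrix(runif(2 * length(col_names)), nrow = 2,
         dimnames = list(NULL, col_names)),
  stringsAsFactors = FALSE
)

# A missing observation is stored as NA (a blank numeric cell in a CSV
# is read as NA by read.csv).
toy_genes[1, "CT4.2"] <- NA

# Sanity check: one label column plus one column per time point and replicate.
expected_cols <- 1 + length(timen) * num_reps
stopifnot(ncol(toy_genes) == expected_cols)
```

A check like the last line is a cheap way to catch a mismatch between the data and the begin, end, resol, and num_reps values before calling echo_find().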
Value
results, a data frame which contains:
Gene Name |
gene name |
Convergence |
deprecated result, always 0, will be removed in future versions |
Iterations |
deprecated result, always 0, will be removed in future versions |
Amplitude.Change.Coefficient |
Amplitude change coefficient value for fit |
Oscillation Type |
Type of oscillation (damped, driven, etc.) |
Initial.Amplitude |
Initial amplitude value for fit |
Radian.Frequency |
Radian frequency for fit |
Period |
Period for fit (in time units) |
Phase Shift |
Phase shift for fit (radians) |
Hours Shifted |
Phase shift for fit (hours) |
Equilibrium Value |
Equilibrium shift for fit |
Slope |
Slope value of original data, if linear baseline is removed |
Tau |
Kendall's tau between original and fitted values |
P-value |
P-value calculated based on Kendall's tau |
BH Adj P-Value |
Benjamini-Hochberg adjusted p-values |
BY Adj P-Value |
Benjamini-Yekutieli adjusted p-values |
CI.PARAM.Low |
Lower confidence interval bound for all parameters, if calculated |
CI.PARAM.High |
Upper confidence interval bound for all parameters, if calculated |
Original TPX.Y |
Processed values for gene expression at time point X, replicate Y |
Fitted TPX |
Fitted values for gene expression at time point X |
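Several of the returned statistics correspond to standard quantities that can be reproduced with base R. The sketch below uses invented numbers and shows the general techniques (Kendall's tau via cor.test(), Benjamini-Hochberg and Benjamini-Yekutieli adjustment via p.adjust(), and the usual relation between period and radian frequency). It is not necessarily how echo_find() computes these columns internally.

```r
# Toy original and fitted values for a single gene (illustrative only).
original   <- c(1.0, 2.1, 2.9, 2.2, 1.1, 0.2, -0.8, 0.1, 1.2, 2.0)
fitted_val <- c(1.1, 2.0, 3.0, 2.0, 1.0, 0.0, -1.0, 0.0, 1.0, 2.0)

# Kendall's tau between original and fitted values, with its p-value.
# exact = FALSE uses the normal approximation, which tolerates ties.
kt   <- cor.test(original, fitted_val, method = "kendall", exact = FALSE)
tau  <- unname(kt$estimate)
pval <- kt$p.value

# Multiple-testing adjustment across genes (toy p-values, one per gene).
p_values <- c(0.001, 0.02, 0.04, 0.30)
bh <- p.adjust(p_values, method = "BH")  # Benjamini-Hochberg
by <- p.adjust(p_values, method = "BY")  # Benjamini-Yekutieli

# Standard relation between radian frequency and period:
# period = 2 * pi / omega, in the time units of the data.
omega  <- 2 * pi / 24
period <- 2 * pi / omega
```

The Benjamini-Yekutieli values are never smaller than the Benjamini-Hochberg ones, so filtering on the BY column is the more conservative choice.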
Examples
# for more elaboration, please see the vignette
# "expressions" is the example echo.find data frame
# long example; may take a while to run
echo_find(genes = expressions, begin = 2, end = 48, resol = 2,
num_reps = 3, low = 20, high = 26, run_all_per = FALSE,
paired = FALSE, rem_unexpr = FALSE, rem_unexpr_amt = 70, rem_unexpr_amt_below=0,
is_normal = FALSE, is_de_linear_trend = FALSE, is_smooth = FALSE)
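Once results are available, a common next step is to keep only genes that pass a significance cutoff. The sketch below assumes the column names listed under Value and uses a small stand-in data frame with invented values, so that it runs without the package; the 0.05 threshold is an illustrative choice, not a recommendation from the package authors.

```r
# Stand-in for the data frame returned by echo_find(); values are invented.
results <- data.frame(
  "Gene Name"      = c("geneA", "geneB", "geneC"),
  "Period"         = c(23.8, 31.2, 24.5),
  "BH Adj P-Value" = c(0.003, 0.40, 0.02),
  check.names = FALSE, stringsAsFactors = FALSE
)

# Hypothetical significance threshold.
alpha <- 0.05

# Keep genes with a non-missing adjusted p-value below the threshold.
adj_p <- results[["BH Adj P-Value"]]
keep  <- !is.na(adj_p) & adj_p < alpha

# Sort the remaining genes from most to least significant.
rhythmic <- results[keep, ]
rhythmic <- rhythmic[order(rhythmic[["BH Adj P-Value"]]), ]
rhythmic[["Gene Name"]]
```

Because the column names contain spaces, they are accessed with [[ ]] rather than $.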
Synthetic expression data for 12 genes.
Description
A dataset containing the names and expression values for 12 synthetically generated samples. This example data has time points from 2 to 48 hours with 2 hour resolution and 3 replicates. Random missing data is also included. Synthetic data was created by randomly selecting parameters for the extended harmonic oscillator equation (see journal paper link in vignette for the equation), then adding random uniform noise to each expression.
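The generation procedure described above can be sketched as follows. The model form used here, x(t) = A * exp(-gamma * t / 2) * cos(omega * t + phi) + y, is our reading of the cited paper and should be checked against it. The function name echo_curve, the parameter values, and the noise range are assumptions made for illustration and are not the values used to build this dataset.

```r
# Assumed model form (see the cited paper for the authoritative equation):
#   x(t) = A * exp(-gamma * t / 2) * cos(omega * t + phi) + y
echo_curve <- function(t, A, gamma, omega, phi, y) {
  A * exp(-gamma * t / 2) * cos(omega * t + phi) + y
}

set.seed(30)
timen    <- seq(2, 48, by = 2)  # time points: 2 to 48 hours, 2 hour resolution
num_reps <- 3

# Noise-free curve for one gene; parameter values are hypothetical.
clean <- echo_curve(timen, A = 2, gamma = 0.02, omega = 2 * pi / 24,
                    phi = 0, y = 5)

# Add uniform noise independently to each replicate. sapply() returns a
# num_reps x length(timen) matrix (rows = replicates, columns = time points).
noisy <- sapply(clean, function(v) v + runif(num_reps, min = -0.5, max = 0.5))

# Flatten column by column, which gives the order "time point, then replicate".
gene_row <- as.vector(noisy)
length(gene_row)
```

With a positive gamma the amplitude shrinks over time (a damped oscillation), while a negative gamma makes it grow.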
Usage
expressions
Format
A data frame with 12 rows and 73 variables (column 1: sample labels; columns 2 to 73: numerical values for gene expression in the format CTX.Y (time point X, replicate Y)).
Details
Note the data format: the first column has gene labels/names, and all other columns have expression data. This expression data is ordered by time point then by replicate, and has evenly spaced time points. Any missing data has cells left blank.
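Because columns are ordered by time point and then by replicate, replicates of the same time point sit next to each other and can be grouped by position. The sketch below averages replicates per time point on a small made-up data frame in the same layout; the object toy and its values are hypothetical, and the real expressions dataset would use 24 time points instead of 3.

```r
begin <- 2; end <- 6; resol <- 2; num_reps <- 3
timen <- seq(begin, end, by = resol)

# Toy data in the documented layout, with one missing value (NA).
toy <- data.frame(
  Gene.Name = c("geneA", "geneB"),
  CT2.1 = c(1, 10), CT2.2 = c(2, 10),  CT2.3 = c(3, 10),
  CT4.1 = c(4, 20), CT4.2 = c(NA, 20), CT4.3 = c(6, 20),
  CT6.1 = c(7, 30), CT6.2 = c(8, 30),  CT6.3 = c(9, 30),
  stringsAsFactors = FALSE
)

# Expression values only (drop the label column).
expr <- as.matrix(toy[, -1])

# Index of the time point each expression column belongs to: 1 1 1 2 2 2 3 3 3
tp_index <- rep(seq_along(timen), each = num_reps)

# Mean over replicates at each time point, ignoring missing values.
avg <- sapply(seq_along(timen), function(i) {
  rowMeans(expr[, tp_index == i, drop = FALSE], na.rm = TRUE)
})
dimnames(avg) <- list(toy$Gene.Name, paste0("CT", timen))
avg
```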