Outputs of the FIRE pipeline

The FIRE pipeline writes the outputs for each sample to the directory results/{sample}/. Files are labeled with the sample name ({sample}) and the version of the FIRE pipeline ({v}). The following files and directories are generated:

| Output | Description |
| --- | --- |
| {sample}-fire-{v}-filtered.cram | CRAM file containing all the data used in the FIRE pipeline. |
| {sample}-fire-{v}-peaks.bed.gz | BED file containing the FIRE peak calls. |
| {sample}-fire-{v}-hap-differences.bed.gz | BED file containing the results of searching for haplotype-selective peaks. |
| {sample}-fire-{v}-pileup.bed.gz | BED file containing per-base information on the number of FIREs, MSPs, nucleosomes, coverage, and more. |
| {sample}-fire-{v}-qc.tbl.gz | Table containing quality control metrics for the FIRE CRAM. |
| trackHub-{v}/ | Directory containing a UCSC trackHub for visualizing all the results of the FIRE pipeline. |
| additional-outputs-{v}/ | Directory containing additional outputs from the FIRE pipeline. |
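
The naming scheme above can be expressed as a small helper, which may be convenient when scripting downstream analyses. This is a minimal sketch: the function name `fire_output_paths`, the dictionary keys, and the example sample and version strings are illustrative assumptions and are not part of the FIRE pipeline.

```python
from pathlib import Path


def fire_output_paths(sample, version, results_dir="results"):
    """Build the expected top-level FIRE output paths for one sample.

    `sample` and `version` fill the {sample} and {v} placeholders.
    The dictionary keys are hypothetical labels chosen for this example.
    """
    base = Path(results_dir) / sample
    prefix = f"{sample}-fire-{version}"
    return {
        "cram": base / f"{prefix}-filtered.cram",
        "peaks": base / f"{prefix}-peaks.bed.gz",
        "hap_differences": base / f"{prefix}-hap-differences.bed.gz",
        "pileup": base / f"{prefix}-pileup.bed.gz",
        "qc": base / f"{prefix}-qc.tbl.gz",
        "trackhub": base / f"trackHub-{version}",
        "additional_outputs": base / f"additional-outputs-{version}",
    }
```

For example, `fire_output_paths("GM12878", "v0.1")["peaks"]` points at the peaks file for a hypothetical sample named GM12878 and a hypothetical version string v0.1.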

More details on the individual outputs

The {sample}-fire-{v}-filtered.cram file

The CRAM file contains all the data used in the FIRE pipeline and can be viewed with IGV or other genome browsers. To reduce the file size, per-base sequencing quality scores are removed, since the FIRE pipeline does not use them. Reads with insufficient m6A signal are removed as well. The CRAM file is sorted and indexed.

The {sample}-fire-{v}-peaks.bed.gz file

This is the peak file for the FIRE method. Peaks are called by identifying local maxima of the FIRE score (see methods) that have FDR values below a threshold. By default, the pipeline reports peaks at a 5% FDR threshold. Once a local maximum is identified, the start and end positions of the peak are set to the median start and median end positions of the underlying FIRE elements. We also calculate wide peaks and report them in the additional-outputs-{v}/ directory. Wide peaks are formed by taking the union of the FIRE peaks and all regions below the FDR threshold, and then merging resulting regions that are within one nucleosome (147 bp) of one another.
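
The two interval operations described above, median peak boundaries and merging of nearby regions, can be sketched as follows. This is an illustration and not the pipeline's implementation: the function names are hypothetical, intervals are assumed to be (start, end) pairs on a single chromosome, a gap of exactly 147 bp is assumed to count as "within one nucleosome", and non-integer medians are truncated.

```python
from statistics import median

NUCLEOSOME_BP = 147  # merge distance stated in the text


def peak_bounds(fire_elements):
    """Median start and median end of the FIRE elements under a local maximum.

    `fire_elements` is a list of (start, end) pairs. Truncating a
    non-integer median with int() is an assumption of this sketch.
    """
    starts = [start for start, _ in fire_elements]
    ends = [end for _, end in fire_elements]
    return int(median(starts)), int(median(ends))


def merge_within(intervals, max_gap=NUCLEOSOME_BP):
    """Merge (start, end) intervals whose gap is at most `max_gap` bp."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start - merged[-1][1] <= max_gap:
            # Close enough to the previous region: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(interval) for interval in merged]
```

Passing the union of the peak intervals and the sub-threshold regions to `merge_within` would give wide peaks in the sense described above, under the stated assumptions.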

The FIRE peaks file has the following columns:

| Column | Description |
| --- | --- |
| #chrom | Chromosome of the peak |
| peak_start | Start of the peak |
| peak_end | End of the peak |
| start | Start of the maximum of the peak |
| end | End of the maximum of the peak |
| coverage | Coverage of the peak |
| fire_coverage | Coverage of the FIREs in the peak |
| score | FIRE score of the peak (see methods) |
| nuc_coverage | Coverage of the nucleosomes in the peak |
| msp_coverage | Coverage of the MSPs in the peak |
| .*_{H1,H2} | Repeats of previous columns but specific for the two haplotypes |
| FDR | False discovery rate of the peak |
| log_FDR | -10*log10 of the FDR |
| FIRE_size_mean | Mean size of the FIREs in the peak |
| FIRE_size_ssd | Standard deviation of the size of the FIREs in the peak |
| FIRE_start_ssd | Standard deviation of the start of the FIREs in the peak |
| FIRE_end_ssd | Standard deviation of the end of the FIREs in the peak |
| pass_coverage | Whether the peak passes coverage filters |
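
As a rough illustration of working with these columns, the sketch below reads the peaks file with the Python standard library and recomputes the log_FDR transform. It assumes the file is tab-delimited and that its first line is the header beginning with #chrom; the function names are hypothetical.

```python
import csv
import gzip
import math


def read_fire_peaks(path):
    """Yield one dict per peak, keyed by the column names in the header line.

    Assumes a gzip-compressed, tab-delimited file whose first line is the
    header, so the chromosome column is keyed as "#chrom".
    """
    with gzip.open(path, "rt", newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            yield row


def log_fdr(fdr):
    """Return -10 * log10(FDR), as described for the log_FDR column."""
    return -10 * math.log10(fdr)
```

All values are returned as strings, so numeric columns need an explicit conversion such as `float(row["FDR"])`. Note that `log_fdr` raises a ValueError for an FDR of exactly 0.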

The {sample}-fire-{v}-hap-differences.bed.gz file

This file primarily contains the same columns as the FIRE peaks file but additionally has a p_value column with the results of a Fisher's exact test for the difference in coverage between the two haplotypes, and a p_adjust column with the Benjamini-Hochberg adjusted p-value. See the methods for more details.
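
To make the two statistics concrete, here is a self-contained sketch of a two-sided Fisher's exact test on a 2x2 table followed by a Benjamini-Hochberg adjustment. It is not the pipeline's code. In particular, how the 2x2 table is built is an assumption: one plausible layout is FIRE and non-FIRE read counts for each haplotype, for example [[fire_H1, other_H1], [fire_H2, other_H2]], but the methods should be consulted for the actual definition.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # The small relative tolerance guards against floating point ties.
    total = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= observed * (1 + 1e-7)
    )
    return min(1.0, total)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted
```

Under these assumptions, applying `fisher_exact_two_sided` to each peak and then `benjamini_hochberg` to the resulting list would produce values analogous to the p_value and p_adjust columns.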

The {sample}-fire-{v}-pileup.bed.gz file

This is a BED file containing per-base information on the number of FIREs, MSPs, nucleosomes, coverage, and more. The columns are calculated using ft-pileup, and more details can be found in the ft-pileup documentation.

The {sample}-fire-{v}-qc.tbl.gz file

This file contains quality control metrics for the FIRE CRAM. The results are directly created by ft-qc and more details can be found in the ft-qc documentation.

The trackHub-{v}/ directory

The trackHub-{v}/ directory contains a UCSC trackHub for visualizing all the results of the FIRE pipeline. A description of the trackHub can be found in trackHub/fire-description.html. To load the trackHub into UCSC, upload the trackHub directory to a public-facing website and then enter the URL of its hub.txt file in the UCSC trackHub browser.
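
Once the directory is hosted, a direct link to the hub can also be built by passing the hub.txt URL to the UCSC Genome Browser through its hubUrl query parameter. The sketch below assumes a hypothetical hosting URL and the hg38 assembly; substitute the location and reference genome that apply to your data.

```python
from urllib.parse import urlencode


def ucsc_hub_link(hub_txt_url, genome="hg38"):
    """Build a UCSC Genome Browser link that loads the given hub.txt.

    `genome` is the UCSC assembly name; "hg38" is only a default chosen
    for this example.
    """
    query = urlencode({"db": genome, "hubUrl": hub_txt_url})
    return f"https://genome.ucsc.edu/cgi-bin/hgTracks?{query}"


# Hypothetical public location of the uploaded trackHub directory.
link = ucsc_hub_link("https://example.org/fire/trackHub-v0.1/hub.txt")
```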

A copy of the trackHub description can be found here.

The additional-outputs-{v}/ directory

The additional-outputs-{v}/ directory contains the following files: TODO