flowcraft.templates.trimmomatic module

Purpose

This module is intended to execute trimmomatic on paired-end FastQ files.

Expected input

The following variables are expected whether using NextFlow or the main() executor.

  • sample_id : Sample identification string.
    • e.g.: 'SampleA'
  • fastq_pair : Pair of FastQ file paths.
    • e.g.: 'SampleA_1.fastq.gz SampleA_2.fastq.gz'
  • trim_range : Crop range detected using FastQC.
    • e.g.: '15 151'
  • opts : List of options for trimmomatic, given in the order
    [trim_sliding_window, trim_leading, trim_trailing, trim_min_length]
    • e.g.: '["5:20", "3", "3", "55"]'
  • phred : List of guessed phred values for each sample
    • e.g.: '[SampleA: 33, SampleB: 33]'
  • clear : If ‘true’, remove the input fastq files at the end of the
    component run, but only if the files are in the work directory.
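
As an illustration of how the trim_range and opts inputs above could be turned into Trimmomatic steps, here is a minimal sketch. The function name build_trimmomatic_steps is hypothetical, and the mapping of trim_range to HEADCROP/CROP, the JSON decoding of opts, and the step order are assumptions rather than the module's confirmed behaviour. The step names themselves (SLIDINGWINDOW, LEADING, TRAILING, CROP, HEADCROP, MINLEN) are standard Trimmomatic steps.

```python
import json


def build_trimmomatic_steps(trim_range, opts):
    """Translate the template inputs into Trimmomatic step strings.

    Hypothetical helper: the mapping below is an assumption based on the
    input descriptions, not the module's confirmed implementation.
    """
    # trim_range arrives as a whitespace-separated string, e.g. '15 151'
    crop_start, crop_end = trim_range.split()
    # opts arrives as a string-encoded list, e.g. '["5:20", "3", "3", "55"]'
    sliding_window, leading, trailing, min_length = json.loads(opts)
    return [
        "SLIDINGWINDOW:{}".format(sliding_window),
        "LEADING:{}".format(leading),
        "TRAILING:{}".format(trailing),
        "CROP:{}".format(crop_end),
        "HEADCROP:{}".format(crop_start),
        "MINLEN:{}".format(min_length),
    ]


steps = build_trimmomatic_steps("15 151", '["5:20", "3", "3", "55"]')
print(" ".join(steps))
```

The returned strings would be appended to the trimmomatic command after the input and output file arguments.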

Generated output

The generated outputs are files, each containing an object, usually a string. (Values within ${} are substituted by the corresponding variable.)

  • ${sample_id}_*P*: Pair of paired FastQ files generated by Trimmomatic
    • e.g.: 'SampleA_1_P.fastq.gz SampleA_2_P.fastq.gz'
  • trimmomatic_status: Stores the status of the trimmomatic run. If it was successfully executed, it stores ‘pass’. Otherwise, it stores the STDERR message.
    • e.g.: 'pass'

Code documentation

flowcraft.templates.trimmomatic.parse_log(log_file)[source]

Retrieves some statistics from a single Trimmomatic log file.

This function parses Trimmomatic’s log file and stores some trimming statistics in an OrderedDict object. This object contains the following keys:

  • clean_len: Total length after trimming.
  • total_trim: Total trimmed base pairs.
  • total_trim_perc: Total trimmed base pairs in percentage.
  • 5trim: Total base pairs trimmed at 5’ end.
  • 3trim: Total base pairs trimmed at 3’ end.

Parameters:
log_file : str

Path to trimmomatic log file.

Returns:
x : OrderedDict

Object storing the trimming statistics.
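
The statistics listed for parse_log() could be accumulated roughly as in the sketch below. It assumes the log follows Trimmomatic's -trimlog layout, where each line ends with four integers: surviving length, amount trimmed from the start, position of the last surviving base, and amount trimmed from the end. The function name parse_log_lines and the percentage formula (trimmed bases over trimmed plus surviving bases) are assumptions for illustration; the sketch takes an iterable of lines instead of a file path so it can be run without a log file.

```python
from collections import OrderedDict


def parse_log_lines(lines):
    """Accumulate trimming statistics from trimlog-style lines (sketch)."""
    stats = OrderedDict([
        ("clean_len", 0),
        ("total_trim", 0),
        ("total_trim_perc", 0),
        ("5trim", 0),
        ("3trim", 0),
    ])
    for line in lines:
        if not line.strip():
            continue
        # Read names may contain spaces, so only the last four fields are used
        clean_len, trim_5, _last_base, trim_3 = (
            int(x) for x in line.split()[-4:]
        )
        stats["clean_len"] += clean_len
        stats["5trim"] += trim_5
        stats["3trim"] += trim_3
        stats["total_trim"] += trim_5 + trim_3
    # Assumed definition: trimmed bases as a share of all original bases
    total_len = stats["clean_len"] + stats["total_trim"]
    if total_len:
        stats["total_trim_perc"] = round(
            stats["total_trim"] / total_len * 100, 2
        )
    return stats


example = parse_log_lines([
    "readA 1:N:0 100 5 105 15",
    "readB 1:N:0 50 0 50 0",
])
print(dict(example))
```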

flowcraft.templates.trimmomatic.write_report(storage_dic, output_file, sample_id)[source]

Writes a report from multiple samples.

Parameters:
storage_dic : dict or OrderedDict

Storage containing the trimming statistics. See parse_log() for its generation.

output_file : str

Path where the output file will be generated.

sample_id : str

Sample identification string.

flowcraft.templates.trimmomatic.trimmomatic_log(log_file, sample_id)[source]

flowcraft.templates.trimmomatic.clean_up(fastq_pairs, clear)[source]

Cleans the working directory of unwanted temporary files.
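
The 'clear' input is described as removing input FastQ files only when they sit in the work directory. One way such a guard might look is sketched below. The function name clean_up_sketch, the explicit work_dir argument, and the use of os.path.commonpath are all assumptions made for illustration; the module's own check may differ.

```python
import os
import tempfile


def clean_up_sketch(fastq_pairs, clear, work_dir):
    """Remove input FastQ files only when they resolve inside work_dir."""
    removed = []
    if clear != "true":
        return removed
    work_root = os.path.realpath(work_dir)
    for fq in fastq_pairs:
        # Resolve symlinks so that links pointing outside are left alone
        real_path = os.path.realpath(fq)
        inside_work = os.path.commonpath([work_root, real_path]) == work_root
        if os.path.isfile(real_path) and inside_work:
            os.remove(real_path)
            removed.append(real_path)
    return removed


# Demo with one file inside the work directory and one outside it
with tempfile.TemporaryDirectory() as work_dir, \
        tempfile.TemporaryDirectory() as other_dir:
    inside = os.path.join(work_dir, "SampleA_1.fastq.gz")
    outside = os.path.join(other_dir, "SampleA_2.fastq.gz")
    for path in (inside, outside):
        open(path, "w").close()
    removed = clean_up_sketch([inside, outside], "true", work_dir)
    inside_exists = os.path.exists(inside)
    outside_exists = os.path.exists(outside)
```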

flowcraft.templates.trimmomatic.merge_default_adapters()[source]

Merges the default adapters file in the trimmomatic adapters directory.

Returns:
str

Path with the merged adapters file.
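
A merge of this kind could be sketched as a plain concatenation of FASTA files. The function name merge_adapters_sketch, the explicit directory and output path arguments, and the '*.fa' file pattern are assumptions for illustration; the real function takes no arguments and locates the adapters directory itself.

```python
import glob
import os
import tempfile


def merge_adapters_sketch(adapters_dir, merged_path):
    """Concatenate every *.fa file in adapters_dir into merged_path."""
    with open(merged_path, "w") as out_fh:
        for adapter_file in sorted(
                glob.glob(os.path.join(adapters_dir, "*.fa"))):
            with open(adapter_file) as in_fh:
                content = in_fh.read()
            # Make sure each file ends with a newline so records stay apart
            if content and not content.endswith("\n"):
                content += "\n"
            out_fh.write(content)
    return merged_path


# Demo with two small adapter files in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    for name, seq in (("a.fa", ">A\nACGT\n"), ("b.fa", ">B\nTTGA")):
        with open(os.path.join(tmp, name), "w") as fh:
            fh.write(seq)
    merged = merge_adapters_sketch(tmp, os.path.join(tmp, "merged.fasta"))
    with open(merged) as fh:
        merged_text = fh.read()
```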

flowcraft.templates.trimmomatic.run_trimmomatic(cli, logfile, sample_id)[source]

Runs trimmomatic command.

Parameters:
cli : list

List containing the trimmomatic command.

logfile : str

Path to the file where trimmomatic writes its log.

sample_id : str

Sample identification string.
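
The trimmomatic_status output is described as holding ‘pass’ on success and the STDERR message otherwise. A minimal sketch of that pattern follows. The function name run_and_record_status and its status_file argument are hypothetical, and the demo runs the Python interpreter in place of trimmomatic so that it works without Trimmomatic installed.

```python
import os
import subprocess
import sys
import tempfile


def run_and_record_status(cli, status_file):
    """Run a command list and record 'pass' or its STDERR (sketch)."""
    proc = subprocess.run(cli, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    stderr = proc.stderr.decode("utf8", errors="replace")
    with open(status_file, "w") as fh:
        # A zero exit code is treated as success here (an assumption)
        fh.write("pass" if proc.returncode == 0 else stderr)
    return proc.returncode


# Demo: one succeeding and one failing stand-in command
with tempfile.TemporaryDirectory() as tmp:
    status_path = os.path.join(tmp, "trimmomatic_status")

    ok_code = run_and_record_status(
        [sys.executable, "-c", "print('done')"], status_path)
    with open(status_path) as fh:
        ok_status = fh.read()

    fail_code = run_and_record_status(
        [sys.executable, "-c",
         "import sys; sys.stderr.write('boom'); sys.exit(1)"],
        status_path)
    with open(status_path) as fh:
        fail_status = fh.read()
```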