flowcraft.templates.trimmomatic module
Purpose
This module is intended to execute Trimmomatic on paired-end FastQ files.
Expected input
The following variables are expected whether using Nextflow or the main() executor.
- sample_id : Sample identification string.
    - e.g.: 'SampleA'
- fastq_pair : Pair of FastQ file paths.
    - e.g.: 'SampleA_1.fastq.gz SampleA_2.fastq.gz'
- trim_range : Crop range detected using FastQC.
    - e.g.: '15 151'
- opts : List of options for Trimmomatic.
    - e.g.: '["5:20", "3", "3", "55"]'
    - e.g.: '[trim_sliding_window, trim_leading, trim_trailing, trim_min_length]'
- phred : List of guessed phred values for each sample.
    - e.g.: '[SampleA: 33, SampleB: 33]'
- clear : If ‘true’, remove the input FastQ files at the end of the component run, IF THE FILES ARE IN THE WORK DIRECTORY.
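As a rough illustration of how these string inputs could map onto a Trimmomatic paired-end command line, here is a minimal sketch. The helper name build_trimmomatic_args is hypothetical and is not part of this module. The sketch assumes that opts arrives in the JSON-like form of the first example, that phred has already been resolved to a single integer for the sample, and that the two values of trim_range are used as HEADCROP and CROP respectively. The step names (CROP, HEADCROP, SLIDINGWINDOW, LEADING, TRAILING, MINLEN) are standard Trimmomatic steps.

```python
import json


def build_trimmomatic_args(sample_id, fastq_pair, trim_range, opts, phred):
    """Hypothetical helper: turn the string inputs into Trimmomatic PE arguments."""
    fastq_files = fastq_pair.split()      # two input FastQ paths
    head, tail = trim_range.split()       # e.g. "15 151"
    # Assumes the JSON-like form of `opts`, e.g. '["5:20", "3", "3", "55"]'
    window, leading, trailing, min_len = json.loads(opts)

    args = ["PE", "-phred{}".format(phred)]
    args += fastq_files
    # Trimmomatic PE expects outputs in the order 1P, 1U, 2P, 2U
    for mate in ("1", "2"):
        for kind in ("P", "U"):
            args.append("{}_{}_{}.fastq.gz".format(sample_id, mate, kind))
    # Assumption: trim_range is applied as CROP (end) and HEADCROP (start)
    args += [
        "CROP:{}".format(tail),
        "HEADCROP:{}".format(head),
        "SLIDINGWINDOW:{}".format(window),
        "LEADING:{}".format(leading),
        "TRAILING:{}".format(trailing),
        "MINLEN:{}".format(min_len),
    ]
    return args
```

The returned list would still need to be prefixed with the actual Trimmomatic launcher (for example a java -jar invocation), which depends on the local installation.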
Generated output
The generated outputs are files that contain an object, usually a string. (Values within ${} are substituted by the corresponding variable.)
- ${sample_id}_*P* : Pair of paired FastQ files generated by Trimmomatic.
    - e.g.: 'SampleA_1_P.fastq.gz SampleA_2_P.fastq.gz'
- trimmomatic_status : Stores the status of the Trimmomatic run. If it was successfully executed, it stores ‘pass’. Otherwise, it stores the STDERR message.
    - e.g.: 'pass'
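The status file described above can be sketched as follows. The function name write_status and its parameters are hypothetical; the sketch only assumes that the caller has the exit code and the captured STDERR text of the Trimmomatic process.

```python
def write_status(returncode, stderr, status_file="trimmomatic_status"):
    """Write 'pass' on success, otherwise the captured STDERR text."""
    # Assumption: a zero exit code means Trimmomatic ran successfully
    message = "pass" if returncode == 0 else stderr
    with open(status_file, "w") as fh:
        fh.write(message)
    return message
```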
Code documentation
flowcraft.templates.trimmomatic.parse_log(log_file)
Retrieves some statistics from a single Trimmomatic log file.
This function parses Trimmomatic’s log file and stores some trimming statistics in an OrderedDict object. This object contains the following keys:
- clean_len : Total length after trimming.
- total_trim : Total trimmed base pairs.
- total_trim_perc : Total trimmed base pairs in percentage.
- 5trim : Total base pairs trimmed at 5’ end.
- 3trim : Total base pairs trimmed at 3’ end.
Parameters:
- log_file : str
    Path to trimmomatic log file.
Returns:
- x : OrderedDict
    Object storing the trimming statistics.
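To make the keys above concrete, here is a minimal sketch of how such statistics could be accumulated. It is not the module's own implementation. It assumes the log was produced with Trimmomatic's -trimlog option, where each line ends with four integers: surviving length, bases trimmed from the start, position of the last surviving base, and bases trimmed from the end. The function name parse_trimlog is hypothetical.

```python
from collections import OrderedDict


def parse_trimlog(log_file):
    """Accumulate trimming statistics from a Trimmomatic -trimlog file."""
    stats = OrderedDict([
        ("clean_len", 0),
        ("total_trim", 0),
        ("total_trim_perc", 0),
        ("5trim", 0),
        ("3trim", 0),
    ])
    with open(log_file) as fh:
        for line in fh:
            fields = line.split()
            if len(fields) < 5:
                continue  # skip blank or malformed lines
            # Read names may contain spaces, so take the last four columns
            length, start_trim, _last_pos, end_trim = (
                int(x) for x in fields[-4:]
            )
            stats["clean_len"] += length
            stats["5trim"] += start_trim
            stats["3trim"] += end_trim
            stats["total_trim"] += start_trim + end_trim

    total = stats["clean_len"] + stats["total_trim"]
    if total:
        stats["total_trim_perc"] = round(stats["total_trim"] / total * 100, 2)
    return stats
```

Reads that were dropped entirely are logged with zeros in all four columns, so in this sketch they contribute nothing to the totals.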
flowcraft.templates.trimmomatic.write_report(storage_dic, output_file, sample_id)
Writes a report from multiple samples.
Parameters:
- storage_dic : dict or OrderedDict
    Storage containing the trimming statistics. See parse_log() for its generation.
- output_file : str
    Path where the output file will be generated.
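A report writer along these lines might look like the sketch below. The CSV layout, the column names, and the function name write_report_sketch are assumptions made for illustration; the documentation above does not specify the report format. The sketch also assumes that storage_dic maps each sample name to a dictionary with the keys produced by parse_log(), and it leaves out the sample_id parameter, whose role is not described here.

```python
import csv


def write_report_sketch(storage_dic, output_file):
    """Write one CSV row of trimming statistics per sample (assumed layout)."""
    fields = ["clean_len", "total_trim", "total_trim_perc", "5trim", "3trim"]
    with open(output_file, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["Sample"] + fields)
        for sample, stats in storage_dic.items():
            writer.writerow([sample] + [stats[f] for f in fields])
```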