Pipeline reporting
This section describes how the reports of a FlowCraft pipeline are generated and collected at the end of a run. These reports can then be sent to the FlowCraft web application where the results are visualized.
Important
Note that if the nextflow process reports add new types of data, one or more React components need to be added to the web application for them to be rendered.
Data collection
The data for the pipeline reports is collected from three dotfiles in each nextflow process (they should be present in each work sub directory):
- .report.json: Contains report data (See Report JSON for more information).
- .versions: Contains information about the versions of the software used (See Versions for more information).
- .command.trace: Contains resource usage information.
The .command.trace file is generated by nextflow when the trace scope is active. The .report.json and .versions files are specific to FlowCraft pipelines.
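As a quick sanity check, the presence of the three dotfiles in a work sub directory can be verified with a few lines of Python. This is a minimal sketch and not part of FlowCraft: the helper name `missing_dotfiles` is hypothetical, and only the three file names come from this section.

```python
from pathlib import Path

# The three dotfiles named in this section.
EXPECTED_DOTFILES = (".report.json", ".versions", ".command.trace")


def missing_dotfiles(work_subdir):
    """Return the expected dotfiles that are absent from one work sub directory.

    Hypothetical helper for illustration only; it is not a FlowCraft function.
    """
    work_subdir = Path(work_subdir)
    return [name for name in EXPECTED_DOTFILES
            if not (work_subdir / name).is_file()]
```

Calling `missing_dotfiles()` on the work sub directory of a process returns an empty list when all three files are present. A missing .command.trace usually points to the trace scope being inactive, as noted above.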
Generation of dotfiles
Empty .report.json and .versions dotfiles are automatically generated by the {% include "post.txt" ignore missing %} placeholder, specified in the Create process template section. Using this placeholder in your processes is all that is needed.
Collection of dotfiles
The .report.json, .versions and .command.trace files are automatically collected and sent to dedicated report channels in the pipeline by the {%- include "compiler_channels.txt" ignore missing -%} placeholder, specified in the process creation section. Placing this placeholder in your processes will generate the following line in the output channel specification:
set {{ sample_id|default("sample_id") }}, val("{{ task_name }}_{{ pid }}"), val("{{ pid }}"), file(".report.json"), file(".versions"), file(".command.trace") into REPORT_{{task_name}}_{{ pid }}
This line collects several pieces of metadata associated with the process, along with the three dotfiles.
Compilation of dotfiles
As mentioned in the previous section, the dotfiles and other relevant metadata are sent through special report channels to a FlowCraft component that is responsible for compiling all the information and generating a single report file at the end of each pipeline run.
This component is specified in flowcraft.generator.templates.report_compiler.nf
and it consists of two nextflow processes:
First, the report process receives the data from each executed process that sends report data and runs the flowcraft/bin/prepare_reports.py script on that data. This script simply merges the metadata and the dotfile information into a single JSON file. This file contains the following keys:
- reportJson: The data in the .report.json file.
- versions: The data in the .versions file.
- trace: The data in the .command.trace file.
- processId: The process ID.
- pipelineId: The pipeline ID, which defaults to one unless specified in the parameters.
- projectid: The project ID, which defaults to one unless specified in the parameters.
- userId: The user ID, which defaults to one unless specified in the parameters.
- username: The user name, which defaults to user unless specified in the parameters.
- processName: The name of the flowcraft component.
- workdir: The work directory where the process was executed.
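The shape of the per-process JSON can be illustrated with a short Python sketch. This is not the code of prepare_reports.py: the function name `build_process_report` and its argument names are hypothetical, the dotfile contents are assumed to be already parsed by the caller, and representing the "one" defaults as the integer 1 is an assumption. Only the key names and default values are taken from the list above.

```python
import json


def build_process_report(report_json, versions, trace, process_id,
                         process_name, workdir, pipeline_id=1,
                         project_id=1, user_id=1, username="user"):
    """Merge parsed dotfile contents and metadata into a single dict.

    Illustrative only: the key names follow the list above, everything
    else (argument names, value types) is an assumption.
    """
    return {
        "reportJson": report_json,    # parsed content of .report.json
        "versions": versions,         # parsed content of .versions
        "trace": trace,               # parsed content of .command.trace
        "processId": process_id,
        "pipelineId": pipeline_id,    # defaults to one
        "projectid": project_id,      # defaults to one
        "userId": user_id,            # defaults to one
        "username": username,         # defaults to "user"
        "processName": process_name,
        "workdir": workdir,
    }


# Example with made-up values; the dict serializes directly to JSON.
example = build_process_report(
    report_json={}, versions=[], trace={}, process_id="1",
    process_name="example_component", workdir="/path/to/work/dir")
print(json.dumps(example, indent=2))
```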
Second, all JSON files created in the process above are merged into a single reports JSON file. This file has the following structure:
reportJSON = { "data": { "results": [<array of report JSONs>] } }
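The merge step can be sketched in Python as follows. This is a minimal illustration under the assumption that each per-process file holds one JSON object; the function name `merge_reports` is hypothetical and the actual FlowCraft implementation may differ.

```python
import json


def merge_reports(json_paths):
    """Load each per-process JSON file and wrap them in the final structure.

    Hypothetical helper; only the {"data": {"results": [...]}} layout is
    taken from the documentation above.
    """
    results = []
    for path in json_paths:
        with open(path) as fh:
            results.append(json.load(fh))
    return {"data": {"results": results}}
```

The order of the entries in `results` follows the order of `json_paths`, so sorting the paths before the call gives a reproducible report file.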