Pipeline reporting

This section describes how the reports of a FlowCraft pipeline are generated and collected at the end of a run. These reports can then be sent to the FlowCraft web application where the results are visualized.

Important

Note that if the reports of a nextflow process add new types of data, one or more React components must be added to the web application so that the new data can be rendered.

Data collection

The data for the pipeline reports is collected from three dotfiles in each nextflow process (they should be present in each work subdirectory):

  • .report.json: Contains report data (See Report JSON for more information).
  • .versions: Contains information about the versions of the software used (See Versions for more information).
  • .command.trace: Contains resource usage information.

The .command.trace file is generated by nextflow when the trace scope is active. The .report.json and .versions files are specific to FlowCraft pipelines.
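As a quick sanity check, the presence of the three dotfiles in a work subdirectory can be verified with a few lines of Python. This is only a minimal sketch and not part of FlowCraft: the function name missing_dotfiles is hypothetical, and the file names are taken from the list above.

```python
from pathlib import Path

# Dotfile names as listed in this section.
REPORT_DOTFILES = (".report.json", ".versions", ".command.trace")


def missing_dotfiles(task_dir):
    """Return the expected dotfiles that are absent from one work subdirectory.

    Hypothetical helper for illustration; it only checks that the files
    exist, not that their content is valid.
    """
    task_dir = Path(task_dir)
    return [name for name in REPORT_DOTFILES
            if not (task_dir / name).is_file()]
```

An empty list means that all three dotfiles are present. A missing .command.trace usually suggests that the trace scope was not active for the run.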

Generation of dotfiles

Empty .report.json and .versions dotfiles are automatically generated by the {% include "post.txt" ignore missing %} placeholder, specified in the Create process template section. Using this placeholder in your processes is all that is needed.

Collection of dotfiles

The .report.json, .versions and .command.trace files are automatically collected and sent to dedicated report channels in the pipeline by the {%- include "compiler_channels.txt" ignore missing -%} placeholder, specified in the process creation section. Placing this placeholder in your processes will generate the following line in the output channel specification:

set {{ sample_id|default("sample_id") }}, val("{{ task_name }}_{{ pid }}"), val("{{ pid }}"), file(".report.json"), file(".versions"), file(".command.trace") into REPORT_{{task_name}}_{{ pid }}

This line collects several pieces of metadata associated with the process (the sample ID, the task name and the process ID) along with the three dotfiles.

Compilation of dotfiles

As mentioned in the previous section, the dotfiles and other relevant metadata are sent through special report channels to a FlowCraft component that is responsible for compiling all the information and generating a single report file at the end of each pipeline run.

This component is specified in flowcraft.generator.templates.report_compiler.nf and it consists of two nextflow processes:

  • First, the report process receives the data from each executed process that sends report data and runs the flowcraft/bin/prepare_reports.py script on that data. This script merges the metadata and the dotfile information into a single JSON file. This file contains the following keys:

    • reportJson: The data in .report.json file.
    • versions: The data in .versions file.
    • trace: The data in .command.trace file.
    • processId: The process ID.
    • pipelineId: The pipeline ID that defaults to one, unless specified in the parameters.
    • projectid: The project ID that defaults to one, unless specified in the parameters.
    • userId: The user ID that defaults to one, unless specified in the parameters.
    • username: The user name that defaults to user, unless specified in the parameters.
    • processName: The name of the flowcraft component.
    • workdir: The work directory where the process was executed.
  • Second, all JSON files created in the process above are merged and a single reports JSON file is created. This file has the following structure:

    reportJSON = {
        "data": {
            "results": [<array of report JSONs>]
        }
    }
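
The two steps above can be sketched in Python as follows. This is a simplified illustration and not the actual code of prepare_reports.py or of the report compiler: the function names build_process_report and compile_reports are hypothetical, and the default values follow the key descriptions listed above.

```python
import json


def build_process_report(report_json, versions, trace, process_id,
                         process_name, workdir, pipeline_id=1,
                         project_id=1, user_id=1, username="user"):
    """Merge the dotfile data and metadata of one process into one dict.

    Hypothetical stand-in for the first step; the keys mirror the list
    in this section.
    """
    return {
        "reportJson": report_json,
        "versions": versions,
        "trace": trace,
        "processId": process_id,
        "pipelineId": pipeline_id,
        "projectid": project_id,
        "userId": user_id,
        "username": username,
        "processName": process_name,
        "workdir": workdir,
    }


def compile_reports(process_reports):
    """Wrap all per-process reports in the final structure (second step)."""
    return {"data": {"results": list(process_reports)}}


def write_reports(process_reports, path):
    """Write the compiled report to a single JSON file."""
    with open(path, "w") as fh:
        json.dump(compile_reports(process_reports), fh)
```

Each entry of the results array is therefore one per-process JSON object, so a consumer such as the web application can iterate over data.results and read the reportJson, versions and trace keys of each process.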