Pipeline inspection
FlowCraft offers an inspect mode for tracking the progress of a nextflow pipeline either directly in a terminal (overview) or by broadcasting information to the FlowCraft web application (broadcast).
Note
This mode was designed for nextflow pipelines generated by FlowCraft. It should be possible to inspect any nextflow pipeline, provided that the requirements below are met, but compatibility is not guaranteed.
How it works: Simply run flowcraft inspect -m <mode> in the directory
where the pipeline is running. In either run mode, FlowCraft will keep running
(until you cancel it) and continuously update the progress of the pipeline. If
the pipeline is interrupted or fails for some reason, FlowCraft should be able
to reset the inspection automatically when the pipeline is resumed.
Requirements for inspect
While the inspect mode is running, it will parse the information written into two files that are generated by nextflow:
- .nextflow.log: The log file that is automatically generated by nextflow.
- trace file: The trace file that is generated by nextflow when using the -with-trace option. By default, it searches for the pipeline_stats.txt file, but this can be changed using the -i option.
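
Before starting an inspection, it can help to confirm that both files are present in the run directory. The following is a minimal sketch and not part of FlowCraft: the helper name missing_inspect_inputs is hypothetical, and it only checks for the two file names mentioned above.

```python
from pathlib import Path


def missing_inspect_inputs(run_dir=".", trace_file="pipeline_stats.txt"):
    """Return the names of the required files that are absent from run_dir.

    Hypothetical helper for illustration; FlowCraft performs its own checks.
    """
    run_dir = Path(run_dir)
    # The nextflow log plus the trace file (default name taken from the text above).
    required = [".nextflow.log", trace_file]
    return [name for name in required if not (run_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_inspect_inputs()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("Both files found; inspect mode should be able to start.")
```

If the trace file was written under a different name, pass that name as trace_file, mirroring what the -i option does for flowcraft inspect.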
Trace fields
FlowCraft parses several fields of the trace file, but only a few are mandatory for its execution. If an optional field is missing from the trace file, the corresponding information will simply not appear on the terminal or web app. Nevertheless, to take full advantage of the inspect mode, the following trace fields should be present:
- Mandatory:
  - tag: The tag of the nextflow process. FlowCraft assumes that this is a string with only the sample name (e.g.: SampleA). While this is not strictly required, providing strings with other information (e.g.: Running bowtie for sampleA) may result in some inconsistencies in the inspection.
  - task_id: The task ID is used to skip entries that have already been parsed.
- Optional:
  - hash: Used to get the work directory of the process execution.
  - cpus, %cpu, memory, rss, rchar and wchar: Used for statistics of computational resources.
Note
Any additional fields present in the trace file are ignored.
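
To illustrate how these fields might be consumed, here is a minimal sketch that reads a trace file, checks for the mandatory fields and skips entries whose task_id has already been seen. It assumes a tab-separated trace with a header row. The function name parse_new_entries, the seen_ids set and the returned dictionaries are assumptions made for illustration and do not reflect FlowCraft's actual implementation.

```python
import csv
import io

MANDATORY = ("tag", "task_id")
OPTIONAL = ("hash", "cpus", "%cpu", "memory", "rss", "rchar", "wchar")


def parse_new_entries(handle, seen_ids):
    """Yield one dict per trace entry whose task_id is not yet in seen_ids.

    handle is any open text file; seen_ids is a set that is updated in place.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    header = reader.fieldnames or []
    missing = [f for f in MANDATORY if f not in header]
    if missing:
        raise ValueError("trace file lacks mandatory fields: " + ", ".join(missing))
    for row in reader:
        task_id = row["task_id"]
        if task_id in seen_ids:
            continue  # already parsed on a previous pass
        seen_ids.add(task_id)
        entry = {"tag": row["tag"], "task_id": task_id}
        # Optional fields are copied only when the column exists;
        # any other column in the trace file is ignored.
        entry.update({f: row[f] for f in OPTIONAL if f in header})
        yield entry


if __name__ == "__main__":
    sample = "task_id\thash\ttag\tstatus\n1\tab/12cd34\tSampleA\tCOMPLETED\n"
    seen = set()
    print(list(parse_new_entries(io.StringIO(sample), seen)))
    # A second pass over the same content yields nothing new.
    print(list(parse_new_entries(io.StringIO(sample), seen)))
```

Keeping the set of seen task IDs outside the function lets the same file be re-read on every refresh without reporting an entry twice.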
Usage
    flowcraft inspect --help

    usage: flowcraft inspect [-h] [-i TRACE_FILE] [-r REFRESH_RATE]
                             [-m {overview,broadcast}] [-u URL] [--pretty]

    optional arguments:
      -h, --help            show this help message and exit
      -i TRACE_FILE         Specify the nextflow trace file.
      -r REFRESH_RATE       Set the refresh frequency for the continuous inspect
                            functions
      -m {overview,broadcast}, --mode {overview,broadcast}
                            Specify the inspection run mode.
      -u URL, --url URL     Specify the URL to where the data should be broadcast
      --pretty              Pretty inspection mode that removes usual reporting
                            processes.

- -i: Used to specify the path to the trace file that should be parsed. By default, FlowCraft will try to parse the pipeline_stats.txt file in the current working directory.
- -r: Sets the time interval in seconds between each parsing of the relevant nextflow files. By default it is set to 0.01.
- -m: The inspection mode. overview is the terminal display, while broadcast sends the data to FlowCraft’s web service.
- -u: The URL of FlowCraft’s web service. By default it is already set to the main service and you do not need to specify it. It is only useful when the service is running on localhost or on another custom instance.
- --pretty: By default the inspection shows the progress of all processes in the pipeline. Using this option filters the processes to the most relevant ones of FlowCraft’s pipelines.
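
When the inspection is launched from a script, the options above can be assembled programmatically. The sketch below builds the argument list for flowcraft inspect using only the flags documented here. The helper name build_inspect_command and its parameters are hypothetical, and the subprocess call is left commented out so that nothing is executed by accident.

```python
def build_inspect_command(mode="overview", trace_file=None, refresh_rate=None,
                          url=None, pretty=False):
    """Return the flowcraft inspect command as a list of arguments.

    Hypothetical helper; options left as None are omitted so that
    FlowCraft's own defaults apply.
    """
    if mode not in ("overview", "broadcast"):
        raise ValueError("mode must be 'overview' or 'broadcast'")
    cmd = ["flowcraft", "inspect", "-m", mode]
    if trace_file is not None:
        cmd += ["-i", str(trace_file)]
    if refresh_rate is not None:
        cmd += ["-r", str(refresh_rate)]
    if url is not None:
        cmd += ["-u", url]
    if pretty:
        cmd.append("--pretty")
    return cmd


if __name__ == "__main__":
    cmd = build_inspect_command(mode="overview", trace_file="pipeline_stats.txt",
                                refresh_rate=1, pretty=True)
    print(" ".join(cmd))
    # To actually start the inspection (blocks until cancelled):
    # import subprocess; subprocess.run(cmd, check=True)
```

Returning a list rather than a single string avoids shell quoting issues if the trace file path contains spaces.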