flowcraft.generator.pipeline_parser module
-
flowcraft.generator.pipeline_parser.guess_process(query_str, process_map)
Guesses which process was intended when a string is not available in process_map. If the string has typos and is somewhat similar (50%) to any process available in flowcraft, it prints information to the terminal suggesting the most similar processes available in flowcraft.
Parameters: - query_str: str
The string of the process with potential typos
- process_map: dict
The dictionary that contains all the available processes
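A minimal sketch of the kind of matching described above, assuming the similarity measure is the ratio from Python's difflib.SequenceMatcher (the documentation only states the 50% threshold). The helper name suggest_processes and the example process names are hypothetical, and the sketch returns the suggestions instead of printing them.

```python
from difflib import SequenceMatcher


def suggest_processes(query_str, process_map, threshold=0.5):
    """Return names in process_map that resemble query_str.

    Hypothetical helper, not flowcraft's implementation.
    """
    scored = []
    for name in process_map:
        ratio = SequenceMatcher(None, query_str, name).ratio()
        if ratio >= threshold:
            scored.append((ratio, name))
    # Most similar candidates first
    return [name for ratio, name in sorted(scored, reverse=True)]


# Example process names are placeholders
process_map = {"fastqc": None, "trimmomatic": None, "spades": None}
suggestions = suggest_processes("fastqcc", process_map)
print(suggestions)
```

Returning a sorted list keeps the sketch easy to test; a caller could print the entries to reproduce the terminal hints.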
-
flowcraft.generator.pipeline_parser.remove_inner_forks(text)
Recursively removes nested brackets.
This function is used to remove nested brackets from fork strings using regular expressions.
Parameters: - text: str
The string that contains brackets with inner forks to be removed
Returns: - text: str
the string with only the processes that are not in inner forks, thus the processes that belong to a given fork.
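One way to remove nested brackets with regular expressions is to delete the innermost bracket pairs repeatedly until none remain. The sketch below illustrates that idea under the name strip_bracketed, which is hypothetical; it is not taken from the flowcraft source.

```python
import re


def strip_bracketed(text):
    """Delete every '(...)' group, innermost first, until none is left."""
    n = 1
    while n:
        # The pattern only matches brackets with no brackets inside,
        # so each pass peels off one nesting level.
        text, n = re.subn(r"\([^()]*\)", "", text)
    return text


fork_body = "A (B | C (D | E)) | F"
result = strip_bracketed(fork_body)
print(result.split())
```

The loop stops on the first pass that makes no substitution, so the number of passes equals the nesting depth plus one.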
-
flowcraft.generator.pipeline_parser.empty_tasks(p_string)
Function to check if the pipeline string is empty or has an empty string.
Parameters: - p_string: str
- String with the definition of the pipeline, e.g.::
‘processA processB processC(ProcessD | ProcessE)’
-
flowcraft.generator.pipeline_parser.brackets_but_no_lanes(p_string)
Function to check if a LANE_TOKEN is provided but no fork is initiated.
Parameters: - p_string: str
- String with the definition of the pipeline, e.g.::
‘processA processB processC(ProcessD | ProcessE)’
-
flowcraft.generator.pipeline_parser.brackets_insanity_check(p_string)
This function checks whether the numbers of ‘(’ and ‘)’ characters differ, which indicates that some forks are poorly constructed.
Parameters: - p_string: str
- String with the definition of the pipeline, e.g.::
‘processA processB processC(ProcessD | ProcessE)’
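The count comparison described above can be sketched as follows. The function name check_bracket_counts and the use of ValueError are assumptions for illustration; the documentation does not say how the real check reports a failure.

```python
def check_bracket_counts(p_string):
    """Raise ValueError when '(' and ')' appear a different number of times."""
    opened = p_string.count("(")
    closed = p_string.count(")")
    if opened != closed:
        raise ValueError(
            "Unbalanced forks: {} '(' versus {} ')'".format(opened, closed))


check_bracket_counts("processA processB processC(ProcessD | ProcessE)")

try:
    check_bracket_counts("processA (processB | processC")
    unbalanced_detected = False
except ValueError:
    unbalanced_detected = True
```

Comparing counts does not catch brackets in the wrong order, such as ‘)(’, so a check like this is only a first filter.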
-
flowcraft.generator.pipeline_parser.lane_char_insanity_check(p_string)
This function performs a sanity check for multiple ‘|’ characters between two processes.
Parameters: - p_string: str
- String with the definition of the pipeline, e.g.::
‘processA processB processC(ProcessD | ProcessE)’
-
flowcraft.generator.pipeline_parser.final_char_insanity_check(p_string)
This function checks if the lane token is the last element of the pipeline string.
Parameters: - p_string: str
- String with the definition of the pipeline, e.g.::
‘processA processB processC(ProcessD | ProcessE)’
-
flowcraft.generator.pipeline_parser.fork_procs_insanity_check(p_string)
This function checks if the pipeline string contains a process between the fork start or end token and the separator (lane) token. It checks for the absence of processes in one of the branches of the fork [‘|)’ and ‘(|’] and for the existence of a process before starting a fork (in an inner fork) [‘|(’].
Parameters: - p_string: str
- String with the definition of the pipeline, e.g.::
‘processA processB processC(ProcessD | ProcessE)’
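The lane-token checks described so far (repeated ‘|’, a trailing ‘|’, and the ‘|)’, ‘(|’ and ‘|(’ patterns) all amount to looking for a short token sequence in the pipeline string. The sketch below assumes whitespace is ignored before matching and collects messages instead of raising; the function name and messages are hypothetical.

```python
def find_token_problems(p_string):
    """Return a list of problems found in the placement of lane tokens."""
    # Assumption: whitespace carries no meaning for these checks
    compact = "".join(p_string.split())
    problems = []
    if "||" in compact:
        problems.append("duplicated lane token")
    if compact.endswith("|"):
        problems.append("lane token at end of string")
    for pattern in ("|)", "(|", "|("):
        if pattern in compact:
            problems.append("missing process near '{}'".format(pattern))
    return problems


print(find_token_problems("processA (processB | processC)"))
print(find_token_problems("processA (processB | )"))
```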
-
flowcraft.generator.pipeline_parser.start_proc_insanity_check(p_string)
This function checks if there is a starting process after the beginning of each fork. It checks for duplicated start tokens [‘((’].
Parameters: - p_string: str
- String with the definition of the pipeline, e.g.::
‘processA processB processC(ProcessD | ProcessE)’
-
flowcraft.generator.pipeline_parser.late_proc_insanity_check(p_string)
This function checks if there are processes after the close token. It searches for everything that isn’t “|” or “)” after a “)” token.
Parameters: - p_string: str
- String with the definition of the pipeline, e.g.::
‘processA processB processC(ProcessD | ProcessE)’
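The search for anything other than “|” or “)” after a “)” maps naturally onto a negated character class. A minimal sketch, assuming whitespace is removed first; has_late_process is a hypothetical name and returns a boolean, whereas the documentation does not say how the real check reports the problem.

```python
import re


def has_late_process(p_string):
    """Return True if anything other than '|' or ')' follows a ')'."""
    compact = "".join(p_string.split())  # assumption: ignore whitespace
    return re.search(r"\)[^|)]", compact) is not None


clean = has_late_process("processA (processB | processC)")
late = has_late_process("processA (processB | processC) processD")
print(clean, late)
```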
-
flowcraft.generator.pipeline_parser.inner_fork_insanity_checks(pipeline_string)
This function performs two sanity checks on the pipeline string. The first check ensures that each fork contains a lane token ‘|’, while the second check looks for duplicated processes within the same fork.
Parameters: - pipeline_string: str
- String with the definition of the pipeline, e.g.::
‘processA processB processC(ProcessD | ProcessE)’
-
flowcraft.generator.pipeline_parser.insanity_checks(pipeline_str)
Wrapper that performs all sanity checks on the pipeline string.
Parameters: - pipeline_str : str
String with the pipeline definition
-
flowcraft.generator.pipeline_parser.parse_pipeline(pipeline_str)
Parses a pipeline string into a list of dictionaries with the connections between processes.
Parameters: - pipeline_str : str
- String with the definition of the pipeline, e.g.::
‘processA processB processC(ProcessD | ProcessE)’
Returns: - pipeline_links : list
-
flowcraft.generator.pipeline_parser.get_source_lane(fork_process, pipeline_list)
Returns the lane of the last process that matches fork_process.
Parameters: - fork_process : list
List of processes before the fork.
- pipeline_list : list
List with the pipeline connection dictionaries.
Returns: - int
Lane of the last process that matches fork_process
-
flowcraft.generator.pipeline_parser.get_lanes(lanes_str)
From a raw pipeline string, gets a list of lanes from the start of the current fork.
When the pipeline is being parsed, it is split at every fork position. The string to the right of the fork position is provided to this function. Its job is to retrieve the lanes that result from that fork, ignoring any nested forks.
Parameters: - lanes_str : str
Pipeline string after a fork split
Returns: - lanes : list
List of lists, with the list of processes for each lane
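Retrieving the lanes of the current fork while ignoring nested forks can be done with a simple depth counter. The sketch below is an illustration under that assumption, not the flowcraft implementation; split_lanes is a hypothetical name, and its input is the text to the right of the fork's opening ‘(’.

```python
def split_lanes(lanes_str):
    """Split the text after a fork start into lanes, skipping nested forks."""
    lanes = []
    current = []
    depth = 0  # nesting level relative to the current fork
    for char in lanes_str:
        if char == "(":
            depth += 1
        elif char == ")":
            if depth == 0:
                break  # closing bracket of the current fork
            depth -= 1
        elif char == "|" and depth == 0:
            lanes.append("".join(current).split())
            current = []
        elif depth == 0:
            current.append(char)
    lanes.append("".join(current).split())
    return lanes


lanes = split_lanes("B C (D | E) | F G) H")
print(lanes)
```

Characters inside a nested fork are skipped entirely, so only the processes that belong directly to the current fork end up in each lane.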
-
flowcraft.generator.pipeline_parser.linear_connection(plist, lane)
Connects a linear list of processes into a list of dictionaries.
Parameters: - plist : list
List with process names. This list should contain at least two entries.
- lane : int
Corresponding lane of the processes
Returns: - res : list
List of dictionaries with the links between processes
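A linear connection pairs each process with the one that follows it. In the sketch below, the dictionary layout (input and output keys, each holding a process and a lane) is an assumption made for illustration; the documentation only states that the result is a list of dictionaries with the links between processes. The name connect_linear is hypothetical.

```python
def connect_linear(plist, lane):
    """Link consecutive processes of plist within a single lane."""
    res = []
    # zip pairs every process with its successor
    for source, sink in zip(plist, plist[1:]):
        res.append({
            "input": {"process": source, "lane": lane},    # assumed layout
            "output": {"process": sink, "lane": lane},
        })
    return res


links = connect_linear(["processA", "processB", "processC"], lane=1)
for link in links:
    print(link["input"]["process"], "->", link["output"]["process"])
```

With fewer than two entries there is nothing to pair, which is consistent with the note that plist should contain at least two entries.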
-
flowcraft.generator.pipeline_parser.fork_connection(source, sink, source_lane, lane)
Makes the connection between a process and the first processes in the lanes to which it forks.
The lane argument should correspond to the lane of the source process. For each lane in sink, the lane counter will increase.
Parameters: - source : str
Name of the process that is forking
- sink : list
List of the processes where the source will fork to. Each element corresponds to the start of a lane.
- source_lane : int
Lane of the forking process
- lane : int
Lane of the source process
Returns: - res : list
List of dictionaries with the links between processes
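The lane counter behaviour can be sketched as below: every element of sink opens a new lane, numbered upward from lane. As with the other sketches, the dictionary layout and the name connect_fork are assumptions for illustration only.

```python
def connect_fork(source, sink, source_lane, lane):
    """Link source to the first process of each new lane."""
    res = []
    lane_counter = lane
    for process in sink:
        lane_counter += 1  # each forked lane gets the next lane number
        res.append({
            "input": {"process": source, "lane": source_lane},  # assumed layout
            "output": {"process": process, "lane": lane_counter},
        })
    return res


fork_links = connect_fork("processC", ["ProcessD", "ProcessE"], 1, 1)
for link in fork_links:
    print(link["output"]["process"], "in lane", link["output"]["lane"])
```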
-
flowcraft.generator.pipeline_parser.linear_lane_connection(lane_list, lane)
Parameters: - lane_list : list
Each element should correspond to a list of processes for a given lane
- lane : int
Lane counter before the fork start
Returns: - res : list
List of dictionaries with the links between processes
-
flowcraft.generator.pipeline_parser.add_unique_identifiers(pipeline_str)
Returns the pipeline string with unique identifiers and a dictionary with references between the unique keys and the original values.
Parameters: - pipeline_str : str
Pipeline string
Returns: - str
Pipeline string with unique identifiers
- dict
Match between process unique values and original names
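Unique identifiers make it possible to tell apart two occurrences of the same process name. The sketch below appends a running index to every process name; the ‘_index’ suffix format and the name tag_processes are assumptions, since the documentation does not describe what the identifiers look like.

```python
import re


def tag_processes(pipeline_str):
    """Return the tagged pipeline string and a tag-to-name mapping."""
    mapping = {}

    def _tag(match):
        # The current mapping size serves as the running index
        unique = "{}_{}".format(match.group(0), len(mapping))
        mapping[unique] = match.group(0)
        return unique

    # A process name is any run of characters that is not
    # whitespace, a bracket or the lane token
    tagged = re.sub(r"[^\s()|]+", _tag, pipeline_str)
    return tagged, mapping


tagged, mapping = tag_processes("A B (A | C)")
print(tagged)
```

Both occurrences of A receive different tags, while the mapping still points each tag back to the original name.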
-
flowcraft.generator.pipeline_parser.remove_unique_identifiers(identifiers_to_tags, pipeline_links)
Removes unique identifiers and adds the original process names to the already parsed pipelines.
Parameters: - identifiers_to_tags : dict
Match between unique process identifiers and process names
- pipeline_links: list
Parsed pipeline list with unique identifiers
Returns: - list
Pipeline list with original identifiers
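Restoring the original names is a lookup over every link in the parsed list. A minimal sketch, assuming each link is a dictionary with input and output entries that carry a process name (this layout is not stated in the documentation); restore_names is a hypothetical name.

```python
def restore_names(identifiers_to_tags, pipeline_links):
    """Return new links with unique identifiers replaced by process names."""
    restored = []
    for link in pipeline_links:
        new_link = {}
        for side in ("input", "output"):
            endpoint = dict(link[side])  # copy so the input list is untouched
            # Keep the current name when no mapping exists for it
            endpoint["process"] = identifiers_to_tags.get(
                endpoint["process"], endpoint["process"])
            new_link[side] = endpoint
        restored.append(new_link)
    return restored


identifiers_to_tags = {"A_0": "A", "B_1": "B"}
tagged_links = [{"input": {"process": "A_0", "lane": 1},
                 "output": {"process": "B_1", "lane": 1}}]
restored = restore_names(identifiers_to_tags, tagged_links)
print(restored[0]["input"]["process"], "->", restored[0]["output"]["process"])
```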