TaskTrigger Refactor #2303

oliver-sanders · 2017-05-23T14:44:58Z

At the moment taskdefs store triggers in an inefficient manner:

Scheduler(...).config.taskdef.triggers = {
    <cylc.cycling.SequenceBase>:  [
        [
            [{label: <cylc.task_trigger.TaskTrigger>, ...}, expression], ...
        ], ...
    ], ...
}

In this data structure the task names and qualifiers are stored three times:

expression = 'task_name_colon_succeeded | task_name_colon_failed'
label = 'task_name_colon_succeeded'
task_trigger = TaskTrigger('task_name', qualifier='succeeded', ...)

This pull removes this duplication of information:

The expression is now a nested list of TaskTrigger objects and conditional characters e.g. [<TaskTrigger>, '&', <TaskTrigger>]
Labels have been retired.
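To make the change concrete, here is a minimal sketch contrasting the two layouts. The `TaskTrigger` below is a simplified namedtuple stand-in, not the real `cylc.task_trigger.TaskTrigger` class, and `evaluate` is a hypothetical helper written only for this illustration; it is not part of the pull request.

```python
from collections import namedtuple

# Simplified stand-in for cylc.task_trigger.TaskTrigger (assumption: the
# real class carries more fields than these two).
TaskTrigger = namedtuple('TaskTrigger', ['task_name', 'qualifier'])

# Old layout: the name and qualifier appear in the expression string,
# in the label, and again in the TaskTrigger object.
old_expression = 'foo_colon_succeeded | foo_colon_failed'
old_labels = {
    'foo_colon_succeeded': TaskTrigger('foo', 'succeeded'),
    'foo_colon_failed': TaskTrigger('foo', 'failed'),
}

# New layout: the expression is a nested list holding the TaskTrigger
# objects directly, interleaved with conditional characters.
new_expression = [
    TaskTrigger('foo', 'succeeded'), '|', TaskTrigger('foo', 'failed')]


def evaluate(expr, satisfied):
    """Hypothetical helper: evaluate a nested expression.

    `satisfied` is the set of TaskTrigger objects that have been met.
    Nested lists play the role of brackets in the old string form.
    """
    tokens = []
    for item in expr:
        if isinstance(item, list):
            tokens.append(str(evaluate(item, satisfied)))
        elif isinstance(item, TaskTrigger):
            tokens.append(str(item in satisfied))
        else:
            tokens.append({'&': 'and', '|': 'or'}[item])
    # Only the literals True/False and the words and/or reach eval here.
    return eval(' '.join(tokens))
```

With this layout no label lookup is needed: the objects in the list are the triggers themselves.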

For suites with complex conditional dependencies this has a large effect on memory usage. For the suite mentioned in #2291 the Scheduler object (post configure) goes from 511 MB down to 146 MB, and validation shows a 22% reduction in maximum memory (associated with a 3% rise in CPU time):

Version	Run	Elapsed Time (s)	CPU Time - Total (s)	Max Memory (kb)
master	u-al307-validate	554.0	554.2	1584548.0
task-trigger-refactor	u-al307-validate	568.7	569.1	1240128.0

For suites with simple dependencies there is a smaller saving. This pull reduces the memory usage of the complex suite by about 4%. A plot (not reproduced here) shows the scaling results for the diamond suite.

Changes:

Obsoletes task message offsets, following "Clean up task output message handling" #1761 (see the comment on #1761).
Task labels (i.e. foo_colon_succeeded) have been removed. The Prerequisite class now uses the task message alone.
The Prerequisite class has been stripped of dead weight.
The TaskTrigger class has been reduced down to a data object.
TaskTrigger objects are cached so that no duplicates are created.
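The caching described in the last point could look roughly like the sketch below. This is an illustration under assumptions, not the code in the pull request: the `get` classmethod, the `_cache` dictionary and the two-field constructor are all hypothetical names chosen for the example.

```python
class TaskTrigger(object):
    """Simplified data object (assumption: the real class has more fields)."""

    __slots__ = ('task_name', 'qualifier')
    _cache = {}  # hypothetical cache keyed on the constructor arguments

    def __init__(self, task_name, qualifier):
        self.task_name = task_name
        self.qualifier = qualifier

    @classmethod
    def get(cls, task_name, qualifier='succeeded'):
        """Return the shared instance for these arguments, creating it once."""
        key = (task_name, qualifier)
        try:
            return cls._cache[key]
        except KeyError:
            trigger = cls._cache[key] = cls(task_name, qualifier)
            return trigger
```

Every expression that refers to the same task and qualifier then holds a reference to one shared object rather than its own copy, which is where the saving for heavily conditional graphs would come from.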

oliver-sanders · 2017-05-23T14:55:40Z

Just to record this information. The main memory users in the SuiteConfig object for the "extremely complex" suite:

Before

404218752 taskdefs
 18233392 cfg
 14393992 pcfg
  2178320 sequences

After

74439888 taskdefs
18233456 cfg
14393992 pcfg
 2177616 sequences

oliver-sanders · 2017-05-23T14:57:01Z

lib/cylc/config.py

-    ("\+", "_plus_"),
-]
+# Message trigger offset regex(es).
+BCOMPAT_MSG_RE_C5 = re.compile(r'^(.*)\[\s*T\s*(([+-])\s*(\d+))?\s*\](.*)$')


Testing for message offsets is impacting validation, so we should remove this as soon as we are confident that it is no longer needed. In the meantime, can we remove the cylc5 regex?

Yes we can remove any cylc-5 back compat code now.

matthewrmshin

Some initial comments.

matthewrmshin · 2017-05-23T14:54:38Z

lib/cylc/conditional_simplifier.py



 class ConditionalSimplifier(object):
    """A class to simplify logical expressions"""
+    RE_CONDITIONALS = "(&|\||\(|\))"


Can we compile this regular expression?

matthewrmshin · 2017-05-23T14:57:23Z

lib/cylc/config.py

+                for message in outputs.values():
+                    if regex.match(message):
+                        raise SuiteConfigError(
+                            'ERROR: Message trigger offsets are obsolete.')


Can we avoid looping over the messages twice?

for message in outputs.values():
    if BCOMPAT_MSG_RE_C5.match(message) or BCOMPAT_MSG_RE_C6.match(message):
        # ...
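A self-contained sketch of the single-pass check suggested here. The cylc 5 pattern is copied from the diff earlier in this thread; the cylc 6 pattern is a placeholder invented for the example (the real one is not shown in this conversation), and `SuiteConfigError` and `check_message_offsets` are stand-ins defined locally.

```python
import re

# Cylc 5 pattern, copied from the diff earlier in the thread.
BCOMPAT_MSG_RE_C5 = re.compile(r'^(.*)\[\s*T\s*(([+-])\s*(\d+))?\s*\](.*)$')
# Placeholder only: the real cylc 6 pattern is not shown in this thread.
BCOMPAT_MSG_RE_C6 = re.compile(r'^(.*)\[\s*[+-]?\s*P[^\]]*\](.*)$')


class SuiteConfigError(Exception):
    """Local stand-in for the cylc exception of the same name."""


def check_message_offsets(outputs):
    """Raise if any output message contains an obsolete offset.

    `outputs` maps output names to message strings; each message is
    tested against both patterns in a single loop.
    """
    for message in outputs.values():
        if (BCOMPAT_MSG_RE_C5.match(message) or
                BCOMPAT_MSG_RE_C6.match(message)):
            raise SuiteConfigError(
                'ERROR: Message trigger offsets are obsolete.')
```

Because `or` short-circuits, the second pattern is only tried when the first does not match.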

matthewrmshin · 2017-05-23T15:00:15Z

lib/cylc/prerequisite.py

-        m = re.match(self.__class__.CYCLE_POINT_RE, message)
-        if m:
-            self.target_point_strings.append(m.groups()[0])
+        match = re.match(self.__class__.CYCLE_POINT_RE, message)


Can just do:

match = self.CYCLE_POINT_RE.match(message)

since CYCLE_POINT_RE is already compiled.
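As a small illustration of calling `match` on a precompiled class attribute: the pattern below is an assumption made for the example (the real `CYCLE_POINT_RE` is not shown in this thread), and the `add` method is a cut-down stand-in for the code in the diff.

```python
import re


class Prerequisite(object):
    # Hypothetical pattern; the real CYCLE_POINT_RE is not shown here.
    CYCLE_POINT_RE = re.compile(r'^\w+\.(\S+) .*$')

    def __init__(self):
        self.target_point_strings = []

    def add(self, message):
        # A compiled pattern has its own match(); no need for re.match().
        match = self.CYCLE_POINT_RE.match(message)
        if match:
            self.target_point_strings.append(match.groups()[0])
```

Inside a method the attribute has to be reached through `self` (or the class), since a bare `CYCLE_POINT_RE` is not in scope there.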

matthewrmshin · 2017-05-23T15:05:25Z

lib/cylc/task_trigger.py

+
+        """
+        cpre = Prerequisite(point, tdef.start_point)
+        for task_trigger in self.task_triggers:


This block can probably do with some extra comments.

matthewrmshin · 2017-05-23T15:06:26Z

lib/cylc/taskdef.py

-            yield (key, re.sub('\[.*\]', str(new_point), msg))
+        """Yield task message outputs for initialisation of TaskOutputs."""
+        for key, msg in self.outputs:
+            yield (key, re.sub('\[.*\]', str(point), msg))


Do we still need this substitution?
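For reference, a sketch of what the substitution in question does, written as a free function rather than the `taskdef` method. The function name and the sample messages are illustrative assumptions.

```python
import re


def get_outputs(outputs, point):
    """Yield (key, message) pairs with any bracketed span replaced by point.

    `outputs` is an iterable of (key, message) pairs, as in the diff.
    """
    for key, msg in outputs:
        yield (key, re.sub(r'\[.*\]', str(point), msg))


# Hypothetical messages, for illustration only.
result = list(get_outputs(
    [('ready', 'data ready for []'), ('done', 'all done')],
    '20170523T00'))
```

Messages without brackets pass through unchanged. Note that `.*` is greedy, so a message containing two bracketed spans has everything from the first `[` to the last `]` replaced in one go.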

oliver-sanders · 2017-05-23T16:06:37Z

Will address the database lock test failures tomorrow.

hjoliver · 2017-05-24T05:45:06Z

lib/cylc/conditional_simplifier.py

        """Convert a logical expression in a nested list back to a string"""
        flattened = copy.deepcopy(expr)
        for i in range(len(flattened)):
            if isinstance(flattened[i], list):
-                flattened[i] = self.flatten_nested_expr(flattened[i])
+                flattened[i] = cls.flatten_nested_expr(
+                    flattened[i])


Spurious change?
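For context, a runnable sketch of the method being discussed, in its `cls` form. The loop is taken from the diff; the final join and the surrounding brackets are an assumption, since the return statement is not shown in this thread.

```python
import copy


class ConditionalSimplifier(object):
    """A class to simplify logical expressions"""

    @classmethod
    def flatten_nested_expr(cls, expr):
        """Convert a logical expression in a nested list back to a string"""
        flattened = copy.deepcopy(expr)
        for i in range(len(flattened)):
            if isinstance(flattened[i], list):
                flattened[i] = cls.flatten_nested_expr(flattened[i])
        # Assumption: how the pieces are joined is not shown in the diff.
        return '(' + ' '.join(flattened) + ')'
```

The deep copy means the caller's nested list is left untouched while sub-lists are replaced by their string form.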

matthewrmshin

Some more minor style comments. Change tested as working in my environment.

matthewrmshin · 2017-05-24T13:42:44Z

lib/cylc/config.py

+                if lnode.output:
+                    qualifier = TaskTrigger.get_trigger_name(lnode.output)
+                else:
+                    qualifier = TASK_OUTPUT_SUCCEEDED


if outputs and lnode.output in outputs:
    # Task message.
    qualifier = outputs[lnode.output]
elif lnode.output:
    # Built-in qualifier.
    qualifier = TaskTrigger.get_trigger_name(lnode.output)
else:
    qualifier = TASK_OUTPUT_SUCCEEDED

A slightly better style? (This lines up all the assignment statements of qualifier.)

matthewrmshin · 2017-05-24T13:49:51Z

lib/cylc/conditional_simplifier.py



 class ConditionalSimplifier(object):
    """A class to simplify logical expressions"""
+    RE_CONDITIONALS = re.compile("(&|\||\(|\))")


The combination of backslash escapes, pipes (or-logic) and brackets (capture) makes the regular expression very difficult to read. Perhaps it would be better to capture a character set in square brackets, like this: r'([&|()])'?
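A quick check that the two spellings behave the same when used to split an expression. The sample expression and variable names are made up for the example; the first pattern is the one from the diff, written as a raw string.

```python
import re

# Pattern from the diff (alternation with escapes), as a raw string.
OLD_RE = re.compile(r"(&|\||\(|\))")
# Suggested character-class form.
NEW_RE = re.compile(r'([&|()])')

expression = 'a & (b | c)'
old_tokens = OLD_RE.split(expression)
new_tokens = NEW_RE.split(expression)
```

Inside a character class `|`, `(` and `)` are literal, so no escaping is needed, and the capture group keeps the delimiters in the result of `split` in both cases.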

hjoliver

Nice.

TaskTrigger refactor.

a371810

oliver-sanders added the efficiency For notable efficiency improvements label May 23, 2017

oliver-sanders added this to the next release milestone May 23, 2017

oliver-sanders self-assigned this May 23, 2017

oliver-sanders requested review from matthewrmshin and hjoliver May 23, 2017 14:45

matthewrmshin reviewed May 23, 2017

Addressed feedback.

3896539

hjoliver reviewed May 24, 2017

oliver-sanders added 4 commits May 24, 2017 12:13

Reverted "spurious" change.

7697159

Fixed database lock tests.

52462d1

Removed cylc 5 message offset regex.

4dbc1dc

Moved task message offset check and added test.

cfa6bcf

matthewrmshin reviewed May 24, 2017

Addressed feedback.

df0c1fa

matthewrmshin approved these changes May 24, 2017

hjoliver approved these changes May 25, 2017

hjoliver merged commit ca7a953 into cylc:master May 25, 2017

oliver-sanders deleted the task-trigger-refactor branch December 14, 2017 12:06

oliver-sanders mentioned this pull request Sep 9, 2021

graph parser: remove string duplication #4400

Open
