Improve performance of `st.domains()` #4202

tybug · 2024-12-14T21:23:34Z

Closes #4201. My benchmark results:

Master

Field     Time:     Compared to:  Time difference:
Baseline  0.000182  N/A
Text      0.353401  baseline      0.353219
Domains   3.109082  baseline      3.108901
URLs      3.952818  domains       0.843736
Emails    3.208230  domains       0.099147
Fake      0.065587  domains       -3.043495

This PR

Field     Time:     Compared to:  Time difference:
Baseline  0.000170  N/A
Text      0.363252  baseline      0.363082
Domains   1.905849  baseline      1.905679
URLs      2.744583  domains       0.838734
Emails    2.032297  domains       0.126449
Fake      0.067094  domains       -1.838755
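
To put the two tables side by side, here is a small sketch that computes the relative change for the three domain-based fields. The numbers are copied from the tables above; the variable names are only illustrative.

```python
# Times copied from the "Master" and "This PR" tables above.
master = {"Domains": 3.109082, "URLs": 3.952818, "Emails": 3.208230}
this_pr = {"Domains": 1.905849, "URLs": 2.744583, "Emails": 2.032297}

# Speedup factor (old time / new time) and fraction of time saved per field.
speedups = {field: master[field] / this_pr[field] for field in master}
reductions = {field: 1 - this_pr[field] / master[field] for field in master}

for field in master:
    print(f"{field:8s} {speedups[field]:.2f}x faster, {reductions[field]:.1%} less time")
```

The overhead of URLs and Emails relative to Domains is roughly the same in both tables, which is consistent with the gain coming from the domains strategy itself.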

tybug · 2024-12-14T21:25:18Z

hypothesis-python/src/hypothesis/provisional.py

+        @st.composite
+        def recase_randomly(draw, tld):
+            tld = list(tld)
+            changes = draw(st.tuples(*(st.booleans() for _ in range(len(tld)))))
+            for i, change_case in enumerate(changes):
+                if change_case:
+                    tld[i] = tld[i].lower() if tld[i].isupper() else tld[i].upper()
+            return "".join(tld)
+
+        self.domain_strategy = (
            st.sampled_from(get_top_level_domains())


The previous `flatmap` dynamically created many uncacheable strategies. `st.tuples` isn't especially cacheable in general either, but it caches reasonably well when the length is the only thing that can vary.

    .flatmap(
        lambda tld: st.tuples(
            *(st.sampled_from([c.lower(), c.upper()]) for c in tld)
        ).map("".join)
    )
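
As a rough, standard-library-only illustration of the difference (not Hypothesis's actual caching machinery), the sketch below models each approach by the "key" that identifies the strategy it would build. `SAMPLE_TLDS`, `recase`, and the two key sets are hypothetical names introduced for this example. The old shape depends on every character of the TLD, while the new shape depends only on its length.

```python
from typing import Sequence

def recase(tld: str, flips: Sequence[bool]) -> str:
    """Pure version of the recasing step: swap the case wherever a flag is set."""
    chars = list(tld)
    for i, flip in enumerate(flips):
        if flip:
            chars[i] = chars[i].lower() if chars[i].isupper() else chars[i].upper()
    return "".join(chars)

# Hypothetical sample; the real list comes from get_top_level_domains().
SAMPLE_TLDS = ["COM", "ORG", "NET", "INFO", "MUSEUM"]

# Old shape: the per-character choices differ for every distinct TLD.
keys_by_characters = {
    tuple((c.lower(), c.upper()) for c in tld) for tld in SAMPLE_TLDS
}

# New shape: a tuple of booleans is fully described by its length.
keys_by_length = {len(tld) for tld in SAMPLE_TLDS}

print(len(keys_by_characters), "distinct character-based keys")
print(len(keys_by_length), "distinct length-based keys")
print(recase("COM", (True, False, True)))
```

With a realistic TLD list the gap is much wider, since there are far more distinct TLDs than distinct TLD lengths.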

Most of the performance gain came from moving the strategy definitions to `__init__`, though changing the `flatmap` here also helped somewhat.
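
The `__init__` point can be sketched in isolation. This is a toy model, assuming the strategies were previously constructed on every draw; the class names and the `str.upper` stand-in are hypothetical and are not the code in `provisional.py`.

```python
class RebuildEveryDraw:
    """Constructs its helper inside the draw path, so the cost repeats per draw."""

    def __init__(self):
        self.builds = 0

    def _build_helper(self):
        self.builds += 1
        return str.upper  # stand-in for an expensive-to-construct strategy

    def do_draw(self, label):
        return self._build_helper()(label)


class BuildOnce:
    """Constructs its helper once in __init__ and reuses it on every draw."""

    def __init__(self):
        self.builds = 0
        self.helper = self._build_helper()

    def _build_helper(self):
        self.builds += 1
        return str.upper  # same stand-in as above

    def do_draw(self, label):
        return self.helper(label)


per_draw, once = RebuildEveryDraw(), BuildOnce()
for _ in range(100):
    per_draw.do_draw("com")
    once.do_draw("com")

print(per_draw.builds, "builds when constructing per draw")
print(once.builds, "build when constructing in __init__")
```

Both classes return the same values; only the number of constructions differs.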

Zac-HD

Thanks, @tybug!

tybug mentioned this pull request Dec 14, 2024

The domain strategy is very slow #4201

Closed

cache strategies to improve st.domains performance

0be2401

tybug force-pushed the domains branch from a0b7937 to 0be2401 Compare December 14, 2024 21:35

fix link

3ed3d47

Stranger6667 reviewed Dec 15, 2024

hypothesis-python/src/hypothesis/provisional.py (resolved)

Zac-HD approved these changes Dec 19, 2024

Zac-HD merged commit a6a166a into HypothesisWorks:master Dec 19, 2024
49 checks passed

tybug deleted the domains branch December 19, 2024 04:24
