Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enum creation non-linear in the number of values #89580

Closed
olliemath mannequin opened this issue Oct 8, 2021 · 5 comments
Closed

Enum creation non-linear in the number of values #89580

olliemath mannequin opened this issue Oct 8, 2021 · 5 comments
Assignees
Labels
3.10 only security fixes 3.11 only security fixes performance Performance or resource usage

Comments

@olliemath
Copy link
Mannequin

olliemath mannequin commented Oct 8, 2021

BPO 45417
Nosy @warsaw, @arigo, @cfbolz, @ethanfurman, @mattip, @olliemath
PRs
  • bpo-45417: fix quadratic behaviour when creating an enum #28907
  • Files
  • huge.py: A single very large enum
  • quadratic-fix-2.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/ethanfurman'
    closed_at = None
    created_at = <Date 2021-10-08.23:52:33.688>
    labels = ['3.11', '3.10', 'performance']
    title = 'Enum creation non-linear in the number of values'
    updated_at = <Date 2021-10-15.16:07:46.518>
    user = 'https://github.com/olliemath'

    bugs.python.org fields:

    activity = <Date 2021-10-15.16:07:46.518>
    actor = 'ethan.furman'
    assignee = 'ethan.furman'
    closed = False
    closed_date = None
    closer = None
    components = []
    creation = <Date 2021-10-08.23:52:33.688>
    creator = 'olliemath'
    dependencies = []
    files = ['50332', '50341']
    hgrepos = []
    issue_num = 45417
    keywords = ['patch']
    message_count = 5.0
    messages = ['403512', '403585', '403592', '403751', '403947']
    nosy_count = 7.0
    nosy_names = ['barry', 'arigo', 'Carl.Friedrich.Bolz', 'eli.bendersky', 'ethan.furman', 'mattip', 'olliemath']
    pr_nums = ['28907']
    priority = 'normal'
    resolution = None
    stage = 'backport needed'
    status = 'open'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue45417'
    versions = ['Python 3.10', 'Python 3.11']

    @olliemath
    Copy link
    Mannequin Author

    olliemath mannequin commented Oct 8, 2021

    Creating large enums takes a significant amount of time. Moreover this appears to be nonlinear in the number of entries in the enum. Locally, importing a single python file and taking this to the extreme:

    1000 entries - 0.058s
    10000 entries - 4.327s

    This is partially addressed by https://bugs.python.org/issue38659 and I can confirm that using @_simple_enum does not have this problem. But it seems like that API is only intended for internal use and the 'happy path' for user-defined enums is still not good.

    Note that it is not simply parsing the file / creating the instances, it is to do with the cardinality. Creating 100 enums with 100 entries each is far faster than a single 10000 entry enum.

    @olliemath olliemath mannequin added 3.7 (EOL) end of life 3.10 only security fixes 3.11 only security fixes performance Performance or resource usage labels Oct 8, 2021
    @arigo
    Copy link
    Mannequin

    arigo mannequin commented Oct 10, 2021

    The timing is clearly quadratic:

    number of attributes time
    1500 0.24s
    3000 0.94s
    6000 3.74s
    12000 15.57s

    Pressing Ctrl-C in the middle of the execution of the largest examples points directly to the cause: when we consider the next attribute, we loop over all previous ones at enum.py:238.

    @mattip
    Copy link
    Contributor

    mattip commented Oct 10, 2021

    Over at PyPy arigo authored and we applied this fix to the as-yet unreleased pypy3.8. Note that it cannot be applied directly to CPython as sets are not ordered. Does the fix break anything? Tests in Lib/test/test_enum.py passed after applying it.

    @Fidget-Spinner Fidget-Spinner added 3.9 only security fixes and removed 3.7 (EOL) end of life labels Oct 10, 2021
    @cfbolz
    Copy link
    Mannequin

    cfbolz mannequin commented Oct 12, 2021

    I fixed the reliance of set being insertion ordered in pypy and opened a pull request.

    @ethanfurman
    Copy link
    Member

    New changeset b2af211 by Carl Friedrich Bolz-Tereick in branch 'main':
    bpo-45417: [Enum] fix quadratic behavior during creation (GH-28907)
    b2af211

    @ethanfurman ethanfurman removed the 3.9 only security fixes label Oct 15, 2021
    @ethanfurman ethanfurman self-assigned this Oct 15, 2021
    @ethanfurman ethanfurman removed the 3.9 only security fixes label Oct 15, 2021
    @ethanfurman ethanfurman self-assigned this Oct 15, 2021
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.10 only security fixes 3.11 only security fixes performance Performance or resource usage
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants