Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid temporary varargs tuple creation in argument passing #90370

Closed
colorfulappl mannequin opened this issue Dec 31, 2021 · 18 comments · Fixed by #126064
Closed

Avoid temporary varargs tuple creation in argument passing #90370

colorfulappl mannequin opened this issue Dec 31, 2021 · 18 comments · Fixed by #126064
Labels
3.14 new features, bugs and security fixes performance Performance or resource usage topic-argument-clinic

Comments

@colorfulappl
Copy link
Mannequin

colorfulappl mannequin commented Dec 31, 2021

BPO 46212
Nosy @larryhastings, @pablogsal, @isidentical, @erlend-aasland, @colorfulappl
PRs
  • gh-90370: Avoid temporary varargs tuple creation in argument passing #30312
  • Files
  • bench-print.py
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2021-12-31.10:05:47.113>
    labels = ['3.11', 'expert-argument-clinic', 'performance']
    title = 'Avoid temporary `varargs` tuple creation in argument passing'
    updated_at = <Date 2022-01-21.14:06:02.177>
    user = 'https://github.com/colorfulappl'

    bugs.python.org fields:

    activity = <Date 2022-01-21.14:06:02.177>
    actor = 'erlendaasland'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Argument Clinic']
    creation = <Date 2021-12-31.10:05:47.113>
    creator = 'colorfulappl'
    dependencies = []
    files = ['50533']
    hgrepos = []
    issue_num = 46212
    keywords = ['patch']
    message_count = 6.0
    messages = ['409412', '409413', '409414', '409659', '411129', '411130']
    nosy_count = 5.0
    nosy_names = ['larry', 'pablogsal', 'BTaskaya', 'erlendaasland', 'colorfulappl']
    pr_nums = ['30312']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue46212'
    versions = ['Python 3.11']

    Linked PRs

    @colorfulappl
    Copy link
    Mannequin Author

    colorfulappl mannequin commented Dec 31, 2021

    When "Augument Clinic generated code" are parsing arguments, the args are packed to a tuple before passing to callee. This may be unnecessary.

    Pass a raw pointer which points to on-stack varargs, and a varargssize integer to indicate how many varargs are passed, can save the time of tuple creation/destruction and value copy.

    @colorfulappl colorfulappl mannequin added 3.11 only security fixes topic-argument-clinic performance Performance or resource usage labels Dec 31, 2021
    @colorfulappl
    Copy link
    Mannequin Author

    colorfulappl mannequin commented Dec 31, 2021

    I wrote some microbenchs.

    Patch: b68176d

    Environment:
    macOS 12.1
    clang 13.0.0
    configure with --enable-optimizations

    Result on microbench:

    +--------------------------------------------+-------------------------+------------------------+
    | Benchmark                                  | ./opt_baseline/res.json | ./opt_patched/res.json |
    +============================================+=========================+========================+
    | print(a, b, c)                             | 917 ns                  | 820 ns: 1.12x faster   |
    +--------------------------------------------+-------------------------+------------------------+
    | print(a, b, c, *v)                         | 1.56 us                 | 1.62 us: 1.04x slower  |
    +--------------------------------------------+-------------------------+------------------------+
    | print(a, sep='', file=stdout)              | 376 ns                  | 295 ns: 1.27x faster   |
    +--------------------------------------------+-------------------------+------------------------+
    | print(*v, sep='', flush=True, file=stdout) | 2.02 us                 | 1.94 us: 1.04x faster  |
    +--------------------------------------------+-------------------------+------------------------+
    | Geometric mean                             | (ref)                   | 1.05x faster           |
    +--------------------------------------------+-------------------------+------------------------+
    
    Benchmark hidden because not significant (3): print(a), print(a, sep='', flush=True, file=stdout), print(a, b, c, *v, sep='', flush=True, file=stdout)
    

    @erlend-aasland
    Copy link
    Contributor

    Note that _PyArg_UnpackKeywordsWithVararg is defined with PyAPI_FUNC. Changing its argument spec is strictly a backwards incompatible change, IIUC.

    @colorfulappl
    Copy link
    Mannequin Author

    colorfulappl mannequin commented Jan 4, 2022

    I am a rookie in Python, did not notice changing PyAPI_FUNC means breaking backward compatibility.

    I have reverted _PyArg_UnpackKeywordsWithVararg and committed again.

    @isidentical
    Copy link
    Member

    Note that _PyArg_UnpackKeywordsWithVararg is defined with PyAPI_FUNC. Changing its argument spec is strictly a backwards incompatible change, IIUC.

    AFAIK we have committed _PyArg_UnpackKeywordsWithVararg on 3.11 alpha, so I think it should be fine. Also CC: @pablogsal

    @erlend-aasland
    Copy link
    Contributor

    AFAIK we have committed _PyArg_UnpackKeywordsWithVararg on 3.11 alpha, so I think it should be fine.

    I see, so no ABI worries then.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @iritkatriel iritkatriel added 3.12 bugs and security fixes and removed 3.11 only security fixes labels Sep 7, 2022
    @skirpichev skirpichev self-assigned this Oct 26, 2024
    @skirpichev skirpichev added 3.14 new features, bugs and security fixes and removed 3.12 bugs and security fixes labels Oct 26, 2024
    @skirpichev
    Copy link
    Member

    It seems #30312 was abandoned by author. I'll continue this work.

    skirpichev added a commit to skirpichev/cpython that referenced this issue Oct 28, 2024
    Current patch partially address issue, working in case when all
    arguments either positional-only or vararg.
    
    This also converts gcd(), lcm() and hypot() functions of the math module
    to the Argument Clinic.  Fix issue python#101123.
    
    Objects/setobject.c and Modules/gcmodule.c adapted.  This fixes slight
    performance regression for set methods, introduced by
    python#115112:
    
    | Benchmark            | ref    | patch                |
    |----------------------|:------:|:--------------------:|
    | set().update(s1, s2) | 354 ns | 264 ns: 1.34x faster |
    
    Benchmark code:
    ```py
    import pyperf
    s1, s2 = {1}, {2}
    runner = pyperf.Runner()
    runner.bench_func('set().update(s1, s2)', set().update, s1, s2)
    ```
    @skirpichev
    Copy link
    Member

    #126064 is ready for review (partially address issue)

    erlend-aasland pushed a commit that referenced this issue Oct 31, 2024
    …#126064)
    
    Avoid temporary tuple creation when all arguments either positional-only
    or vararg.
    
    Objects/setobject.c and Modules/gcmodule.c adapted. This fixes slight
    performance regression for set methods, introduced by gh-115112.
    @skirpichev
    Copy link
    Member

    @erlend-aasland, I'm not sure we can mark this a fixed issue.

    The merged pr covers only part of the original proposal (e.g. it doesn't work for the OP example with print()).

    @serhiy-storchaka
    Copy link
    Member

    I planned to came to this issue in few steps:

    1. Current gh-122943: Rework support of var-positional parameter in Argument Clinic #122945.
    2. Move the code for var-positional parameter to a separate converter (this is not easy, because it needs more parameters).
    3. Split that converter into two converters -- 'tuple' and 'array' to support different representations.

    There may be other intermediate steps. I tried to do this in one step, but it was too complicated.

    Now, #126064 created conflicts with #122945. I spent a day for this, and see a light at the end of tunnel, but the simplest way to resolve conflict is to revert #126064. Then merge #122945, then continue the initial plan. This will take a time.

    Alternatively, I can fuse all these steps in #122945, but the result will be larger and very dirty, because I would need to use dirty tricks like using globals to pass values of local variables between function calls in three or four different modules.

    @erlend-aasland
    Copy link
    Contributor

    Reverting #126064 will also mean reverting #126235. Are there any others that would need reverting, @skirpichev?

    @skirpichev
    Copy link
    Member

    Reverting #126064 will also mean reverting #126235.

    Not necessary. Reverting #126064 will just introduce some performance regression for converted functions. But as @serhiy-storchaka planned to address this issue In The Right Way - this will be eventually fixed.

    Are there any others that would need reverting

    None, as far as I know.

    @erlend-aasland
    Copy link
    Contributor

    Sounds good. Serhiy, please go ahead.

    @skirpichev
    Copy link
    Member

    (Given the size (few lines) of #126064, I don't see big problem to resolve merge conflicts. I'll try to do this. Serhiy, feel free to ignore these efforts.)

    @erlend-aasland
    Copy link
    Contributor

    (Given the size (few lines) of #126064, I don't see big problem to resolve merge conflicts. I'll try to do this. Serhiy, feel free to ignore these efforts.)

    IMO, this would be the best option; if we can avoid the revert churn that would be great.

    @skirpichev
    Copy link
    Member

    So, if this is not urgent, give me a day.

    @skirpichev
    Copy link
    Member

    I think, this is fixed by #122945 and #126560.

    @erlend-aasland
    Copy link
    Contributor

    Thanks, @colorfulappl, for the proposal and initial work; thanks Serhiy and Sergey for the PRs 🍰 👏 🥳

    picnixz pushed a commit to picnixz/cpython that referenced this issue Dec 8, 2024
    …arargs (python#126064)
    
    Avoid temporary tuple creation when all arguments either positional-only
    or vararg.
    
    Objects/setobject.c and Modules/gcmodule.c adapted. This fixes slight
    performance regression for set methods, introduced by pythongh-115112.
    ebonnal pushed a commit to ebonnal/cpython that referenced this issue Jan 12, 2025
    …arargs (python#126064)
    
    Avoid temporary tuple creation when all arguments either positional-only
    or vararg.
    
    Objects/setobject.c and Modules/gcmodule.c adapted. This fixes slight
    performance regression for set methods, introduced by pythongh-115112.
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.14 new features, bugs and security fixes performance Performance or resource usage topic-argument-clinic
    Projects
    None yet
    Development

    Successfully merging a pull request may close this issue.

    5 participants