gh-109047: concurrent.futures catches RuntimeError #109810

Merged on Sep 29, 2023 (2 commits)

@vstinner vstinner commented Sep 25, 2023

concurrent.futures: The 'executor manager thread' now catches
PythonFinalizationError, it calls terminate_broken(). The exception
occurs while Python is being finalized when adding an item to the
'call queue' tries to create a new 'queue feeder' thread.

Add test_python_finalization_error() to test_concurrent_futures.

concurrent.futures._ExecutorManagerThread changes:

  • terminate_broken() no longer calls shutdown_workers() since the
    queue is no longer working (the read and write ends of the
    queue pipe are closed).
  • terminate_broken() now terminates child processes.
  • wait_result_broken_or_wakeup() now uses the short form of
    traceback.format_exception().

multiprocessing.Queue changes:

  • Add _terminate_broken() method.
  • _start_thread() sets _thread to None on exception to prevent
    leaking "dangling threads" even if the thread was not started
    yet.
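
The _start_thread() change can be pictured with the following minimal sketch. It is not the multiprocessing.Queue code: FeederOwner, FailingThread and thread_factory are hypothetical names used only to show the "reset the attribute, then re-raise" pattern.

```python
import threading


class FeederOwner:
    """Hypothetical stand-in for an object that lazily starts a helper thread."""

    def __init__(self, thread_factory=threading.Thread):
        self._thread_factory = thread_factory
        self._thread = None

    def _start_thread(self):
        self._thread = self._thread_factory(target=lambda: None, daemon=True)
        try:
            self._thread.start()
        except BaseException:
            # Forget the thread that never started, so later cleanup code
            # does not treat it as a live ("dangling") thread, then re-raise.
            self._thread = None
            raise


class FailingThread:
    """Test double whose start() fails, as thread creation may during finalization."""

    def __init__(self, *args, **kwargs):
        pass

    def start(self):
        raise RuntimeError("can't create new thread at interpreter shutdown")


owner = FeederOwner(thread_factory=FailingThread)
try:
    owner._start_thread()
    start_failed = False
except RuntimeError:
    start_failed = True
```

After the failed start, owner._thread is None again, so nothing is left behind for a join or a dangling-thread check to trip over.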

📚 Documentation preview 📚: https://cpython-previews--109810.org.readthedocs.build/

@vstinner
Member Author

This PR relies on PR #109809, which adds the PythonFinalizationError exception.

test_concurrent_futures.test_python_finalization_error() is still unstable in my current draft implementation. Example:

ERROR: test_python_finalization_error (test.test_concurrent_futures.test_process_pool.ProcessPoolForkserverProcessPoolExecutorTest.test_python_finalization_error)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/vstinner/python/main/Lib/test/test_concurrent_futures/test_process_pool.py", line 209, in test_python_finalization_error
    list(executor.map(mul, [(2, 3)] * 10))
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vstinner/python/main/Lib/concurrent/futures/process.py", line 850, in map
    results = super().map(partial(_process_chunk, fn),
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vstinner/python/main/Lib/concurrent/futures/_base.py", line 608, in map
    fs = [self.submit(fn, *args) for args in zip(*iterables)]
          ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vstinner/python/main/Lib/concurrent/futures/process.py", line 821, in submit
    self._adjust_process_count()
  File "/home/vstinner/python/main/Lib/concurrent/futures/process.py", line 780, in _adjust_process_count
    self._spawn_process()
  File "/home/vstinner/python/main/Lib/concurrent/futures/process.py", line 798, in _spawn_process
    p.start()
  File "/home/vstinner/python/main/Lib/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
                  ^^^^^^^^^^^^^^^^^
  File "/home/vstinner/python/main/Lib/multiprocessing/context.py", line 301, in _Popen
    return Popen(process_obj)
           ^^^^^^^^^^^^^^^^^^
  File "/home/vstinner/python/main/Lib/multiprocessing/popen_forkserver.py", line 35, in __init__
    super().__init__(process_obj)
  File "/home/vstinner/python/main/Lib/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/home/vstinner/python/main/Lib/multiprocessing/popen_forkserver.py", line 47, in _launch
    reduction.dump(process_obj, buf)
  File "/home/vstinner/python/main/Lib/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
  File "/home/vstinner/python/main/Lib/multiprocessing/connection.py", line 1173, in reduce_connection
    df = reduction.DupFd(conn.fileno())
                         ^^^^^^^^^^^^^
  File "/home/vstinner/python/main/Lib/multiprocessing/connection.py", line 172, in fileno
    self._check_closed()
  File "/home/vstinner/python/main/Lib/multiprocessing/connection.py", line 138, in _check_closed
    raise OSError("handle is closed")
OSError: handle is closed
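
The final frames of this traceback can be reproduced in isolation. The sketch below only shows that a closed multiprocessing connection refuses fileno(); it does not reproduce the test failure itself or the race that closed the pipe.

```python
import multiprocessing

# One-way pipe: reader is the receiving end, writer the sending end.
reader, writer = multiprocessing.Pipe(duplex=False)
fd_before_close = reader.fileno()  # valid while the connection is open

reader.close()
try:
    reader.fileno()
    error_message = None
except OSError as exc:
    # Same error as in the traceback: the handle was already closed.
    error_message = str(exc)

writer.close()
```

Pickling a process object that still references such a closed connection goes through fileno() and fails in the same way.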

@vstinner
Member Author

Another error, because a pipe file descriptor is closed:

ERROR: test_python_finalization_error (test.test_concurrent_futures.test_process_pool.ProcessPoolSpawnProcessPoolExecutorTest.test_python_finalization_error)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/vstinner/python/main/Lib/test/test_concurrent_futures/test_process_pool.py", line 209, in test_python_finalization_error
    list(executor.map(mul, [(2, 3)] * 10))
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vstinner/python/main/Lib/concurrent/futures/process.py", line 850, in map
    results = super().map(partial(_process_chunk, fn),
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vstinner/python/main/Lib/concurrent/futures/_base.py", line 608, in map
    fs = [self.submit(fn, *args) for args in zip(*iterables)]
          ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vstinner/python/main/Lib/concurrent/futures/process.py", line 821, in submit
    self._adjust_process_count()
  File "/home/vstinner/python/main/Lib/concurrent/futures/process.py", line 780, in _adjust_process_count
    self._spawn_process()
  File "/home/vstinner/python/main/Lib/concurrent/futures/process.py", line 798, in _spawn_process
    p.start()
  File "/home/vstinner/python/main/Lib/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
                  ^^^^^^^^^^^^^^^^^
  File "/home/vstinner/python/main/Lib/multiprocessing/context.py", line 289, in _Popen
    return Popen(process_obj)
           ^^^^^^^^^^^^^^^^^^
  File "/home/vstinner/python/main/Lib/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/home/vstinner/python/main/Lib/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/home/vstinner/python/main/Lib/multiprocessing/popen_spawn_posix.py", line 58, in _launch
    self.pid = util.spawnv_passfds(spawn.get_executable(),
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vstinner/python/main/Lib/multiprocessing/util.py", line 453, in spawnv_passfds
    return _posixsubprocess.fork_exec(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: bad value(s) in fds_to_keep

@vstinner vstinner force-pushed the cf_finalization_error branch from 1722fb1 to 3137689 Compare September 29, 2023 18:20
@vstinner vstinner marked this pull request as ready for review September 29, 2023 18:20
@vstinner vstinner force-pushed the cf_finalization_error branch 2 times, most recently from a77c413 to a035765 Compare September 29, 2023 18:35
…ueue()

concurrent.futures: The *executor manager thread* now catches
exceptions when adding an item to the *call queue*. During Python
finalization, creating a new thread can now raise RuntimeError. Catch
the exception and call terminate_broken() in this case.

Add test_python_finalization_error() to test_concurrent_futures.

concurrent.futures._ExecutorManagerThread changes:

* terminate_broken() no longer calls shutdown_workers() since the
  queue is no longer working (the read and write ends of the
  queue pipe are closed).
* terminate_broken() now terminates child processes.
* wait_result_broken_or_wakeup() now uses the short form (1 argument,
  not 3) of traceback.format_exception().
* _ExecutorManagerThread.terminate_broken() now holds shutdown_lock
  to prevent race conditions with ProcessPoolExecutor.submit().

multiprocessing.Queue changes:

* Add _terminate_broken() method.
* _start_thread() sets _thread to None on exception to prevent
  leaking "dangling threads" even if the thread was not started
  yet.
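
The behavior described in this commit message can be sketched roughly as follows. This is a simplified model under stated assumptions, not the concurrent.futures implementation: ManagerSketch, BrokenPool, failing_put and the pending dict are hypothetical, and child process termination is left out.

```python
import threading
from concurrent.futures import Future


class BrokenPool(RuntimeError):
    """Hypothetical error set on futures that can no longer run."""


class ManagerSketch:
    def __init__(self, put_on_call_queue):
        self.shutdown_lock = threading.Lock()
        self.pending = {}  # work id -> Future
        self.broken = False
        self._put = put_on_call_queue

    def submit(self, work_id, future):
        # submit() and terminate_broken() take the same lock, so a
        # submission cannot slip in while the pool is being torn down.
        with self.shutdown_lock:
            if self.broken:
                raise BrokenPool("pool is broken")
            self.pending[work_id] = future

    def add_call_item_to_queue(self, work_id):
        try:
            self._put(work_id)
        except BaseException as exc:
            # For example RuntimeError when a thread cannot be created
            # during Python finalization.
            self.terminate_broken(exc)

    def terminate_broken(self, cause):
        with self.shutdown_lock:
            self.broken = True
            for future in self.pending.values():
                error = BrokenPool("call queue is unusable")
                error.__cause__ = cause
                future.set_exception(error)
            self.pending.clear()


def failing_put(item):
    raise RuntimeError("can't create new thread at interpreter shutdown")


manager = ManagerSketch(failing_put)
future = Future()
manager.submit(1, future)
manager.add_call_item_to_queue(1)
```

In this model the pending future fails with BrokenPool instead of hanging forever, and any later submit() is rejected because the broken flag was set under the same lock.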
@vstinner vstinner force-pushed the cf_finalization_error branch from a035765 to f94feb1 Compare September 29, 2023 18:37
@vstinner vstinner enabled auto-merge (squash) September 29, 2023 18:43
@vstinner vstinner added the needs backport to 3.12 bug and security fixes label Sep 29, 2023
@vstinner vstinner changed the title gh-109047: concurrent.futures catches PythonFinalizationError gh-109047: concurrent.futures catches RuntimeError Sep 29, 2023
@vstinner vstinner merged commit 6351842 into python:main Sep 29, 2023
@vstinner vstinner deleted the cf_finalization_error branch September 29, 2023 19:31
@miss-islington
Contributor

Thanks @vstinner for the PR 🌮🎉.. I'm working now to backport this PR to: 3.12.
🐍🍒⛏🤖

@miss-islington
Contributor

Sorry, @vstinner, I could not cleanly backport this to 3.12 due to a conflict.
Please backport using cherry_picker on the command line.

cherry_picker 635184212179b0511768ea1cd57256e134ba2d75 3.12

vstinner added a commit to vstinner/cpython that referenced this pull request Sep 29, 2023
…ython#109810)

concurrent.futures: The *executor manager thread* now catches
exceptions when adding an item to the *call queue*. During Python
finalization, creating a new thread can now raise RuntimeError. Catch
the exception and call terminate_broken() in this case.

Add test_python_finalization_error() to test_concurrent_futures.

concurrent.futures._ExecutorManagerThread changes:

* terminate_broken() no longer calls shutdown_workers() since the
  call queue is no longer working (read and write ends of
  the queue pipe are closed).
* terminate_broken() now terminates child processes instead of
  only waiting for them to complete.
* _ExecutorManagerThread.terminate_broken() now holds shutdown_lock
  to prevent race conditions with ProcessPoolExecutor.submit().

multiprocessing.Queue changes:

* Add _terminate_broken() method.
* _start_thread() sets _thread to None on exception to prevent
  leaking "dangling threads" even if the thread was not started
  yet.

(cherry picked from commit 6351842)
@bedevere-app

bedevere-app bot commented Sep 29, 2023

GH-110126 is a backport of this pull request to the 3.12 branch.

@bedevere-bot

⚠️⚠️⚠️ Buildbot failure ⚠️⚠️⚠️

Hi! The buildbot ARM64 macOS 3.x has failed when building commit 6351842.

What do you need to do:

  1. Don't panic.
  2. Check the buildbot page in the devguide if you don't know what the buildbots are or how they work.
  3. Go to the page of the buildbot that failed (https://buildbot.python.org/all/#builders/725/builds/5797) and take a look at the build logs.
  4. Check if the failure is related to this commit (6351842) or if it is a false positive.
  5. If the failure is related to this commit, please, reflect that on the issue and make a new Pull Request with a fix.

You can take a look at the buildbot page here:

https://buildbot.python.org/all/#builders/725/builds/5797

Failed tests:

  • test.test_concurrent_futures.test_deadlock

Summary of the results of the build (if available):

==

Click to see traceback logs
remote: Enumerating objects: 30, done.        
remote: Counting objects: 100% (30/30), done.        
remote: Compressing objects: 100% (16/16), done.        
remote: Total 16 (delta 13), reused 2 (delta 0), pack-reused 0        
From https://github.com/python/cpython
 * branch                  main       -> FETCH_HEAD
Note: switching to '635184212179b0511768ea1cd57256e134ba2d75'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:

  git switch -c <new-branch-name>

Or undo this operation with:

  git switch -

Turn off this advice by setting config variable advice.detachedHead to false

HEAD is now at 6351842121 gh-109047: concurrent.futures catches PythonFinalizationError (#109810)
Switched to and reset branch 'main'

In file included from ./Modules/_tkinter.c:52:
In file included from /opt/homebrew/Cellar/tcl-tk/8.6.13_5/include/tcl-tk/tk.h:99:
/opt/homebrew/Cellar/tcl-tk/8.6.13_5/include/tcl-tk/X11/Xlib.h:131:21: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
        int (*free_private)();  /* called to free private storage */
                           ^
                            void
/opt/homebrew/Cellar/tcl-tk/8.6.13_5/include/tcl-tk/X11/Xlib.h:334:33: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
        struct _XImage *(*create_image)();
                                       ^
                                        void
/opt/homebrew/Cellar/tcl-tk/8.6.13_5/include/tcl-tk/X11/Xlib.h:453:23: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
        XID (*resource_alloc)(); /* allocator function */
                             ^
                              void
/opt/homebrew/Cellar/tcl-tk/8.6.13_5/include/tcl-tk/X11/Xlib.h:471:20: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
        int (*synchandler)();   /* Synchronization handler */
                          ^
                           void
/opt/homebrew/Cellar/tcl-tk/8.6.13_5/include/tcl-tk/X11/Xlib.h:496:24: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
        Bool (*event_vec[128])();  /* vector for wire to event */
                              ^
                               void
/opt/homebrew/Cellar/tcl-tk/8.6.13_5/include/tcl-tk/X11/Xlib.h:497:25: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
        Status (*wire_vec[128])(); /* vector for event to wire */
                               ^
                                void
/opt/homebrew/Cellar/tcl-tk/8.6.13_5/include/tcl-tk/X11/Xlib.h:509:20: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
        Bool (**error_vec)();      /* vector for wire to error */
                          ^
                           void
/opt/homebrew/Cellar/tcl-tk/8.6.13_5/include/tcl-tk/X11/Xlib.h:522:25: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
        int (*savedsynchandler)(); /* user synchandler when Xlib usurps */
                               ^
                                void
/opt/homebrew/Cellar/tcl-tk/8.6.13_5/include/tcl-tk/X11/Xlib.h:1053:24: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
typedef void (*XIMProc)();
                       ^
                        void
In file included from ./Modules/tkappinit.c:17:
In file included from /opt/homebrew/Cellar/tcl-tk/8.6.13_5/include/tcl-tk/tk.h:99:
/opt/homebrew/Cellar/tcl-tk/8.6.13_5/include/tcl-tk/X11/Xlib.h:131:21: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
        int (*free_private)();  /* called to free private storage */
                           ^
                            void
/opt/homebrew/Cellar/tcl-tk/8.6.13_5/include/tcl-tk/X11/Xlib.h:334:33: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
        struct _XImage *(*create_image)();
                                       ^
                                        void
/opt/homebrew/Cellar/tcl-tk/8.6.13_5/include/tcl-tk/X11/Xlib.h:453:23: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
        XID (*resource_alloc)(); /* allocator function */
                             ^
                              void
/opt/homebrew/Cellar/tcl-tk/8.6.13_5/include/tcl-tk/X11/Xlib.h:471:20: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
        int (*synchandler)();   /* Synchronization handler */
                          ^
                           void
/opt/homebrew/Cellar/tcl-tk/8.6.13_5/include/tcl-tk/X11/Xlib.h:496:24: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
        Bool (*event_vec[128])();  /* vector for wire to event */
                              ^
                               void
/opt/homebrew/Cellar/tcl-tk/8.6.13_5/include/tcl-tk/X11/Xlib.h:497:25: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
        Status (*wire_vec[128])(); /* vector for event to wire */
                               ^
                                void
/opt/homebrew/Cellar/tcl-tk/8.6.13_5/include/tcl-tk/X11/Xlib.h:509:20: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
        Bool (**error_vec)();      /* vector for wire to error */
                          ^
                           void
/opt/homebrew/Cellar/tcl-tk/8.6.13_5/include/tcl-tk/X11/Xlib.h:522:25: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
        int (*savedsynchandler)(); /* user synchandler when Xlib usurps */
                               ^
                                void
/opt/homebrew/Cellar/tcl-tk/8.6.13_5/include/tcl-tk/X11/Xlib.h:1053:24: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
typedef void (*XIMProc)();
                       ^
                        void
9 warnings generated.
9 warnings generated.

make: *** [buildbottest] Error 5

Yhg1s pushed a commit that referenced this pull request Oct 2, 2023
…110126)

gh-109047: concurrent.futures catches PythonFinalizationError (#109810)

(cherry picked from commit 6351842)
Glyphack pushed a commit to Glyphack/cpython that referenced this pull request Sep 2, 2024
…ython#109810)
