
Possible memory leak in Python 3.11.0 #98

Closed · juanfem opened this issue Feb 6, 2023 · 8 comments

juanfem (Contributor) commented Feb 6, 2023

I know that Python 3.11 is not yet fully supported, but I think this is something you should have a look at (or prevent people from using p4p on py3.11).

I found a memory leak in one of my programs after upgrading to Python 3.11, and I finally traced it to p4p (or pvxs). It seems there have been some changes in the C API, and a quick search reveals that more libraries are affected...

To reproduce it, you just need to create a PV monitor and you'll see how memory is leaked at every update.

I tried both the thread and asyncio clients with the same outcome. See plots below.
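For reference, the growth can be measured even without a live IOC: the sketch below is pure stdlib and uses a stand-in callback (it does not talk to a real PV, since that needs a running server), but it shows the tracemalloc technique for confirming that retained memory scales with the number of updates.

```python
import tracemalloc

# Stand-in for a monitor callback: on an affected interpreter each PV
# update leaks a little memory. Here we simulate that growth by retaining
# a 1 KiB buffer per "update" so the measurement technique is visible.
def on_update(retained, value):
    retained.append(bytes(1024))

tracemalloc.start()
retained = []
for update in range(1000):
    on_update(retained, update)

current, _peak = tracemalloc.get_traced_memory()
tracemalloc.stop()
print(current >= 1000 * 1024)  # True: usage grows linearly with update count
```

The same measurement wrapped around a real `Context.monitor()` loop (or an external tool like psrecord, as used later in this thread) is what produces plots like the ones attached below.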

Python 3.11:

  • Asyncio: [plot: mem_asyncio_60s_py311]
  • Thread: [plot: mem_thread_60s_py311]

Python 3.10:

  • Asyncio: [plot: mem_asyncio_60s_py310]
  • Thread: [plot: mem_thread_60s_py310]

I am also attaching some files to reproduce the issue on Docker:
p4p_mem_leak_test.zip

To run it, first build the image with:

$ docker compose build

Then start a soft IOC with

$ docker compose up p4p_ioc

To test the asyncio client:

$ docker compose up p4p_test_asyncio

And to test the thread client:

$ docker compose up p4p_test_thread

You can also run the python scripts locally if you prefer.

To test on Python 3.10, just change the version in the environment.yml file and build the Docker image again.

mdavidsaver (Member) commented:

@juanfem Have you run the unittest suite? This includes some tests intended to detect some kinds of problems with reference counting.

Also, do your tests for py3.10 vs. 3.11 use the same version of Cython?

@mdavidsaver mdavidsaver self-assigned this Feb 6, 2023
@mdavidsaver mdavidsaver added the bug label Feb 6, 2023
mdavidsaver (Member) commented:

Also, P4P includes some tools for detecting and troubleshooting resource leaks.

import sys
from p4p.disect import periodic
periodic(period=10.0, file=sys.stderr)

This will periodically print information about changes in the number of objects of various types, e.g.:

# Types 633 -> 634
New Types
  <class 'prompt_toolkit.input.vt100_parser._Flush'> 1
Known Types
  <class '_asyncio.Task'> 7 delta -20
...

Some kinds of resource leaks manifest as a consistently positive "delta".

Another, less well developed, tool is p4p.listRefs(), which returns instance counts for some of the C++ classes.
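The idea behind `periodic()` can be illustrated with the stdlib alone. This sketch (the helper name `type_counts` is mine, not part of p4p) diffs per-type live-object counts between two snapshots, which is the same signal `periodic()` reports:

```python
import gc
from collections import Counter

def type_counts():
    # Count live, GC-tracked objects grouped by type name.
    return Counter(type(o).__name__ for o in gc.get_objects())

before = type_counts()
leaked = [[] for _ in range(1000)]  # simulate a container leak: 1000 new lists
after = type_counts()

# Report only the types whose count changed between snapshots.
delta = {name: after[name] - before[name]
         for name in after if after[name] != before[name]}
print(delta.get('list', 0) >= 1000)  # True: a consistently positive delta
```

Note that `gc.get_objects()` only sees GC-tracked Python objects, which is why a leak in native code (as turned out to be the case here) can slip past this kind of check.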

juanfem (Contributor, Author) commented Feb 6, 2023

Only one test fails, but it seems to be an unrelated error:

======================================================================
ERROR: p4p.test.test_asyncio (nose2.loader.ModuleImportFailure.p4p.test.test_asyncio)
----------------------------------------------------------------------
ImportError: Failed to import test module: p4p.test.test_asyncio
Traceback (most recent call last):
  File "/opt/conda/envs/p4p_py311_test/lib/python3.11/site-packages/nose2/plugins/loader/discovery.py", line 204, in _find_tests_in_file
    module = util.module_from_name(module_name)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/p4p_py311_test/lib/python3.11/site-packages/nose2/util.py", line 76, in module_from_name
    __import__(name)
  File "/opt/conda/envs/p4p_py311_test/lib/python3.11/site-packages/p4p/test/test_asyncio.py", line 10, in <module>
    from .asynciotest import *
  File "/opt/conda/envs/p4p_py311_test/lib/python3.11/site-packages/p4p/test/asynciotest.py", line 61, in <module>
    class TestGPM(AsyncTest):
  File "/opt/conda/envs/p4p_py311_test/lib/python3.11/site-packages/p4p/test/asynciotest.py", line 123, in TestGPM
    @asyncio.coroutine
     ^^^^^^^^^^^^^^^^^
AttributeError: module 'asyncio' has no attribute 'coroutine'


----------------------------------------------------------------------
Ran 142 tests in 56.751s

FAILED (errors=1, skipped=2)

This error is caused by @asyncio.coroutine being removed in Python 3.11.
It can be fixed by removing that decorator from test/asynciotest.py, declaring the coroutines with async def, and replacing yield from with await.
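For reference, the change described above looks roughly like this (`wait_done` is a made-up coroutine name for illustration, not the actual test code):

```python
import asyncio

# Before (generator-based coroutine; the decorator was removed in 3.11,
# so this raises AttributeError at class-definition time):
#
#     @asyncio.coroutine
#     def wait_done():
#         yield from asyncio.sleep(0)
#         return 'done'

# After (native coroutine; works on Python 3.5+):
async def wait_done():
    await asyncio.sleep(0)
    return 'done'

print(asyncio.run(wait_done()))  # done
```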

I installed p4p using pip and I don't have Cython installed. Should I have it?

mdavidsaver (Member) commented:

> This error is related to @asyncio.coroutine being removed in Python 3.11.

Ah, I must have missed that one w/ #82.

> Only one test fails, but it seems to be an unrelated error:

Agreed. This probably rules out a simple reference-counting omission in either the hand-written or the Cython-generated C/C++ code. At this point I suspect a resource leak through some container growing without bound. I think running the disect monitor is the next thing to try.

Also, do you see any error, warning, or otherwise unusual or simply different messages being logged?

(I strongly recommend always enabling logging at WARNING level or above.)

mdavidsaver (Member) commented:

> I am also attaching some files to reproduce the issue on Docker:

I have easy access to podman, not docker. I can install P4P in an instance of docker.io/library/python:3.11, along with the necessary bits to use psrecord. Unfortunately, I'm not able to reproduce your result.

[plot: psrecord memory usage while running the test scripts]

Calling p4p.disect.periodic() just before the sleep() loop doesn't show any net change in instance counts for python objects.

I modified test_thread.py to count the number of times cb() is called. 305831 in 60 seconds seems a good non-zero number. I guess this means these scripts are running correctly?

# python --version
Python 3.11.1
# pip --version
pip 22.3.1 from /usr/local/lib/python3.11/site-packages/pip (python 3.11)
# pip freeze
contourpy==1.0.7
cycler==0.11.0
epicscorelibs==7.0.7.99.0.0
fonttools==4.38.0
kiwisolver==1.4.4
matplotlib==3.6.3
nose2==0.12.0
numpy==1.24.2
p4p==4.1.5
packaging==23.0
Pillow==9.4.0
ply==3.11
psrecord==1.2
psutil==5.9.4
pvxslibs==1.1.2
pyparsing==3.0.9
python-dateutil==2.8.2
setuptools-dso==2.7
six==1.16.0

juanfem (Contributor, Author) commented Feb 7, 2023

Interesting, the main difference in your setup is the Python version. I am using 3.11.0, which is the latest available in conda, while you used 3.11.1.

I just ran the tests on one of the Python official Docker images with 3.11.1 as you did and the leak is gone. It seems the memory leak only affects 3.11.0. I suspect the leak was related to this bug that was fixed in 3.11.1.
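Since only one patch release turned out to be affected, one possible safeguard for deployments is a simple interpreter-version check. The helper below is a sketch of that idea, not part of p4p:

```python
import sys

def is_affected(version=sys.version_info):
    """True only for the affected CPython release, 3.11.0 exactly
    (the leak reported here was fixed in 3.11.1)."""
    return tuple(version[:3]) == (3, 11, 0)

print(is_affected((3, 11, 0)))  # True
print(is_affected((3, 11, 1)))  # False
```

A long-running service could log a warning (or refuse to start) when `is_affected()` is true.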

juanfem (Contributor, Author) commented Feb 7, 2023

By the way, yesterday I also tested p4p.disect.periodic() on 3.11.0. It didn't show any sign of memory leaks.
This is one of the reasons I believe the CPython issue was the source of the leak.

mdavidsaver (Member) commented:

> I suspect the leak was related to this bug that was fixed in 3.11.1.

A leak on GIL lock/unlock would certainly explain what you do and don't observe. I guess 3.11.0 has to be considered unusable for long-running processes?

Absent further input, I will consider this issue closed as "not my bug" :) Thanks for taking the time to test and report.

@mdavidsaver mdavidsaver changed the title Possible memory leak in Python 3.11 Possible memory leak in Python 3.11.0 Feb 7, 2023