Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python references that go out of scope might not release the memory allocated by _upb_Arena_SlowMalloc #14571

Closed
tvalentyn opened this issue Oct 13, 2023 · 15 comments

Comments

@tvalentyn
Copy link

tvalentyn commented Oct 13, 2023

Repro:

  1. In a clean environment, run pip install apache-beam==2.51.0.
  2. Download + unpack pipeline.pb (see attached https://github.com/protocolbuffers/upb/files/12898151/pipeline.pb.zip).
  3. Create a memleak_repro.py
from apache_beam.portability.api import beam_runner_api_pb2
from apache_beam.io.gcp import gcsio
from apache_beam.io.gcp import gcsfilesystem

if len(sys.argv) <= 1:
  print("Specify proto path!")
  sys.exit(1)

path = sys.argv[1]
if path.startswith('gs'):
  open = gcsio.GcsIO().open

with open(path, 'rb') as f:
  pipeline = beam_runner_api_pb2.Pipeline()
  pipeline.ParseFromString(f.read())

for transform in pipeline.components.transforms:
  for _ in range(100000):
    reference = pipeline.components.transforms[transform].outputs
  1. Run python memleak_repro.py pipeline.pb, observe RAM usage in top/htop, etc, increasing the number of iterations as necessary to keep the process running longer.

  2. For profiling, run:

pip install memray

memray run -o output.bin --force memleak_repro.py pipeline.pb ; memray table --leak output.bin -o table.html --force ; memray flamegraph --leak output.bin -o flamegraph.html -f

  1. Open table.html, double click on the Size column, note ~36MB reported leaked memory in memleak_repro.py:20

Leak doesn't happen with PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python set.

Leak can also be observed with C-stack tools, such as tcmalloc+pprof:

apt install google-pprof
LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libtcmalloc.so.4:$LD_PRELOAD  HEAPPROFILE=`pwd`/  python memleak_repro.py pipeline.pb
google-pprof /home/valentyn/.pyenv/versions/py310/bin/python --inuse_space  _219720.0005.heap  --base=_219720.0004.heap 
Using local file /home/valentyn/.pyenv/versions/py310/bin/python.
Using local file _219720.0005.heap.
Welcome to pprof!  For help, type 'help'.
(pprof) top
Total: 1152.0 MB
  1152.0 100.0% 100.0%   1152.0 100.0% _upb_Arena_SlowMalloc
(pprof) 

valgrind+massif:

apt install valgrind
PYTHONMALLOC=malloc  valgrind  --tool=massif   --verbose  /home/valentyn/.pyenv/versions/py310/bin/python memleak_repro.py pipeline.pb

ms_print massif.out.341396 

...
->18.09% (817,040B) 0x59A17F1: _upb_Arena_SlowMalloc (in /home/valentyn/.pyenv/versions/3.10.4/envs/py310/lib/python3.10/site-packages/google/_upb/_message.abi3.so)
| ->07.00% (316,176B) 0x59A0051: upb_strtable_insert (in /home/valentyn/.pyenv/versions/3.10.4/envs/py310/lib/python3.10/site-packages/google/_upb/_message.abi3.so)
| | ->06.05% (273,072B) 0x599FDDF: upb_strtable_resize (in /home/valentyn/.pyenv/versions/3.10.4/envs/py310/lib/python3.10/site-packages/google/_upb/_message.abi3.so)
| | | ->06.05% (273,072B) 0x599FED6: upb_strtable_insert (in /home/valentyn/.pyenv/versions/3.10.4/envs/py310/lib/python3.10/site-packages/google/_upb/_message.abi3.so)
| | |   ->04.17% (188,096B) 0x598F0D7: _upb_DefPool_InsertSym (in /home/valentyn/.pyenv/versions/3.10.4/envs/py310/lib/python3.10/site-packages/google/_upb/_message.abi3.so)
| | |   | ->03.89% (175,568B) 0x5990B19: _upb_EnumValueDefs_New (in /home/valentyn/.pyenv/versions/3.10.4/envs/py310/lib/python3.10/site-packages/google/_upb/_message.abi3.so)
| | |   | | ->03.89% (175,568B) 0x5990332: _upb_EnumDefs_New (in /home/valentyn/.pyenv/versions/3.10.4/envs/py310/lib/python3.10/site-packages/google/_upb/_message.abi3.so)
| | |   | |   ->03.89% (175,568B) 0x5995356: _upb_MessageDefs_New (in /home/valentyn/.pyenv/versions/3.10.4/envs/py310/lib/python3.10/site-packages/google/_upb/_message.abi3.so)
| | |   | |     ->03.89% (175,568B) 0x5992E88: _upb_FileDef_Create (in /home/valentyn/.pyenv/versions/3.10.4/envs/py310/lib/python3.10/site-packages/google/_upb/_message.abi3.so)
| | |   | |       ->03.89% (175,568B) 0x598FC4B: upb_DefBuilder_AddFileToPool (in /home/valentyn/.pyenv/versions/3.10.4/envs/py310/lib/python3.10/site-packages/google/_upb/_message.abi3.so)
| | |   | |         ->03.89% (175,568B) 0x598F699: _upb_DefPool_AddFile (in /home/valentyn/.pyenv/versions/3.10.4/envs/py310/lib/python3.10/site-packages/google/_upb/_message.abi3.so)
| | |   | |           ->03.89% (175,568B) 0x598165C: PyUpb_DescriptorPool_DoAddSerializedFile (in /home/valentyn/.pyenv/versions/3.10.4/envs/py310/lib/python3.10/site-packages/google/_upb/_message.abi3.so)
| | |   | |             ->03.89% (175,568B) 0x31A7B1: method_vectorcall_O (descrobject.c:460)
| | |   | |               ->03.89% (175,568B) 0x16687A: _PyObject_VectorcallTstate (abstract.h:114)
| | |   | |                 ->03.89% (175,568B) 0x16687A: PyObject_Vectorcall (abstract.h:123)
| | |   | |                   ->03.89% (175,568B) 0x16687A: call_function (ceval.c:5867)
| | |   | |                     ->03.89% (175,568B) 0x16687A: _PyEval_EvalFrameDefault (ceval.c:4198)
| | |   | |                       ->03.89% (175,568B) 0x22F7B4: _PyEval_EvalFrame (pycore_ceval.h:46)
| | |   | |                         ->03.89% (175,568B) 0x22F7B4: _PyEval_Vector (ceval.c:5065)
| | |   | |                           ->03.89% (175,568B) 0x22F7B4: PyEval_EvalCode (ceval.c:1134)
| | |   | |                             ->03.89% (175,568B) 0x3593F8: builtin_exec_impl (bltinmodule.c:1056)
| | |   | |                               ->03.89% (175,568B) 0x3593F8: builtin_exec (bltinmodule.c.h:371)

...

Using memray in --native mode also points to _upb_Arena_SlowMalloc .

Command line: memray run --native -o output.bin --force memleak_repro.py pipeline.pb ; memray table --leak output.bin -o table.html --force ; memray flamegraph --leak output.bin -o flamegraph.html -f

@tvalentyn
Copy link
Author

cc: @haberman @ericsalo

@tvalentyn
Copy link
Author

pipeline.pb.zip

@tvalentyn
Copy link
Author

@anandolee
Copy link
Contributor

@tvalentyn can you help create a repo that not depend on apache-beam? Thanks

@tvalentyn
Copy link
Author

I am OOO atm and likely won't get to it before next week, but cc'ing: @AnandInguva in case he has time and interested.

@tvalentyn
Copy link
Author

Interesting. I gave it a try, and I was not able to recreate the leak in isolation while using essentially the same proto.

I'll check whether this has to do with how beam compiles proto stubs or something else.

@haberman
Copy link
Member

I'll check whether this has to do with how beam compiles proto stubs or something else.

That would be really helpful, thanks!

@tvalentyn
Copy link
Author

tvalentyn commented Oct 21, 2023

Got a smaller repro. Appears that there some interaction between protobuf and dill. Beam uses old version of dill, but the issue is reproducible even with latest dill. Good news is there is some work underway for Beam to not depend on Dill or at least be able to use cloudpickle as default..

pip install dill protobuf grpcio-tools dill memray 
import dill # Leak doesn't happen without this import.
import sys
from google.protobuf import text_format

proto_def="""
syntax = "proto3";

message MessageWithAMap {
  map<string, string> some_map = 1;
}
"""

with open("repro.proto", "w") as f:
  f.write(proto_def)

from grpc_tools import protoc
args = [
    sys.executable, # unused
    "repro.proto",
    "--proto_path", ".", "--python_out", "."]
protoc.main(args)

import repro_pb2
from repro_pb2 import MessageWithAMap

proto="""
some_map {
  key: "Length of this key increases the leak"
  value:"No observable impact on the leak"
}
"""

message = MessageWithAMap()
text_format.Parse(proto, message)

key = list(message.some_map)[0]
for _ in range(500000):
  unused_values = message.some_map[key] # memory leaks here
rm -f *pb2.py* ; memray run  -o output.bin --force memleak_repro.py  &&  memray table --leak output.bin -o table.html --force

@tvalentyn
Copy link
Author

tvalentyn commented Oct 23, 2023

import dill # Leak doesn't happen without this import.

Leak correlates with this line: https://github.com/uqfoundation/dill/blob/f66ed3b602d6e9b942542491614dfd18622583d7/dill/_dill.py#L72C32-L72C32

Here is a repro that doesn't use dill:

import __main__ as _main_module # Leak doesn't happen without this import.
import sys
from google.protobuf import text_format

proto_def="""
syntax = "proto3";

message MessageWithAMap {
  map<string, string> some_map = 1;
}
"""

with open("repro.proto", "w") as f:
  f.write(proto_def)

from grpc_tools import protoc
args = [
    sys.executable, # unused
    "repro.proto",
    "--proto_path", ".", "--python_out", "."]
protoc.main(args)

from repro_pb2 import MessageWithAMap

proto="""
some_map {
  key: "Length of this key increases the leak"
  value:"No observable impact on the leak"
}
"""

message = MessageWithAMap()
text_format.Parse(proto, message)

key = list(message.some_map)[0]
for _ in range(500000):
  unused_values = message.some_map[key] # memory leaks here

@tvalentyn
Copy link
Author

Leak doesn't happen with protobuf==3.20.3 or with PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python.

@haberman
Copy link
Member

I think I see the root cause. In map lookup, we use the message's arena to convert the Python key object to raw C data that we can use to perform the lookup in upb. But this data will never be freed until the message is freed:

protobuf/python/map.c

Lines 203 to 205 in 022d223

upb_Arena* arena = PyUpb_Arena_Get(self->arena);
upb_MessageValue u_key, u_val;
if (!PyUpb_PyToUpb(key, key_f, &u_key, arena)) return NULL;

We should use a temporary arena to perform the conversion. Only at the point where we are allocating non-temporary data should we use the message's arena.

@tvalentyn
Copy link
Author

tvalentyn commented Oct 23, 2023

That explains why import __main__ as _main_module has an effect. Leak is still happening without this statement, but this statement makes it easier to register the leak with the profiler as references to message persist longer.

@tvalentyn
Copy link
Author

Indeed, in top memory keeps growing without import __main__ as _main_module and memray also registers the leak when used without --leak flag.

@tvalentyn
Copy link
Author

tvalentyn commented Oct 23, 2023

@tvalentyn
Copy link
Author

tvalentyn commented Oct 23, 2023

And probably several other places in map.c where PyUpb_PyToUpb is used.

@haberman haberman transferred this issue from protocolbuffers/upb Oct 31, 2023
copybara-service bot pushed a commit that referenced this issue Nov 1, 2023
Previously we were allocating memory on the message's arena every time we performed a `map[key]` or `map.get(key)` operation.  This is unnecessary, as the key's data is only needed ephemerally, for the duration of the lookup, and we can therefore alias the Python object's string data instead of copying it.

This required fixing a bug in the convert.c operation.  Previously in the `arena==NULL` case, if the user passes a bytes object instead of a unicode string, the code would return a pointer to a temporary Python object that had already been freed, leading to use-after-free.  I fixed this by referencing the bytes object's data directly, and using utf8_range to verify the UTF-8.

Fixes: #14571
PiperOrigin-RevId: 578311252
copybara-service bot pushed a commit that referenced this issue Nov 1, 2023
Previously we were allocating memory on the message's arena every time we performed a `map[key]` or `map.get(key)` operation.  This is unnecessary, as the key's data is only needed ephemerally, for the duration of the lookup, and we can therefore alias the Python object's string data instead of copying it.

This required fixing a bug in the convert.c operation.  Previously in the `arena==NULL` case, if the user passes a bytes object instead of a unicode string, the code would return a pointer to a temporary Python object that had already been freed, leading to use-after-free.  I fixed this by referencing the bytes object's data directly, and using utf8_range to verify the UTF-8.

Fixes: #14571
PiperOrigin-RevId: 578311252
haberman added a commit that referenced this issue Nov 1, 2023
Previously we were allocating memory on the message's arena every time we performed a `map[key]` or `map.get(key)` operation.  This is unnecessary, as the key's data is only needed ephemerally, for the duration of the lookup, and we can therefore alias the Python object's string data instead of copying it.

This required fixing a bug in the convert.c operation.  Previously in the `arena==NULL` case, if the user passes a bytes object instead of a unicode string, the code would return a pointer to a temporary Python object that had already been freed, leading to use-after-free.  I fixed this by referencing the bytes object's data directly, and using utf8_range to verify the UTF-8.

Fixes: #14571
PiperOrigin-RevId: 578563555
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants