Skip to content

Commit

Permalink
Fix IDSelector memory leak with SWIG, add unit tests (#3810)
Browse files Browse the repository at this point in the history
Summary:
Pull Request resolved: #3810

Pull Request resolved: #3704

Background
--
Issue: #2996

Prior attempted fix: https://github.com/facebookresearch/faiss/pull/3007/files#diff-d704f33c46a4ef2936c8cf2a66b05f1993e25e79ee5c19d4b63c4e0cf46b0a42

Partially copied from #3139 with an additional unit test.

It is a confirmed and reproducible memory leak every time. There is a workaround. See the comments on #2996.

Relevant SWIG docs: https://www.swig.org/Doc4.1/SWIGDocumentation.html#Typemaps_nn2

Current status
--

`buck test faiss/tests:test_search_params -- test_ownership_2`

Test prints:

without SWIG fix:
`[0.49, 0.82, 1.15, 1.39, 1.67, 1.88, 2.16, 2.36, 2.65, 2.85, 3.13, 3.34, 3.62, 3.82, 4.11, 4.31, 4.6, 4.8, 5.08, 5.28]`

with SWIG fix:
`[0.52, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.71, 0.71, 0.71, 0.71, 0.71, 0.71, 0.71]`

Note: This test is not ideal. Ideally we could enable ASAN to catch these memory leaks. But ASAN does not seem to catch these Python memory leaks. Example:
```
def test_ownership_3(self):
        d = 32
        quantizer = faiss.IndexFlatL2(d)
        quantizer.this.own(False)
```
The above test produces no ASAN error, even though the `quantizer` object leaks.

Why change HNSW test?
--
This fix causes the HNSW test to fail with heap-use-after-free. This is because the index.storage.get_distance_computer() somehow gets freed during clone_index, but only when reassigning to the same variable like `index.storage = clone(index.storage)`. I checked in https://fburl.com/code/qw6fznjt, and it is non-null before returning on the CPP side.

After adding the temp variable, I also had to set `index.own_fields = False`, otherwise we get a heap-use-after-free again due to it being deleted already.

Differential Revision: D61992599
  • Loading branch information
Michael Norris authored and facebook-github-bot committed Sep 12, 2024
1 parent a166e13 commit 299f0b3
Show file tree
Hide file tree
Showing 4 changed files with 39 additions and 26 deletions.
25 changes: 1 addition & 24 deletions faiss/python/class_wrappers.py
Original file line number Diff line number Diff line change
Expand Up @@ -1170,28 +1170,6 @@ def add_to_referenced_objects(self, ref):
else:
self.referenced_objects.append(ref)

class RememberSwigOwnership:
"""
SWIG's seattr transfers ownership of SWIG wrapped objects to the class
(btw this seems to contradict https://www.swig.org/Doc1.3/Python.html#Python_nn22
31.4.2)
This interferes with how we manage ownership: with the referenced_objects
table. Therefore, we reset the thisown field in this context manager.
"""

def __init__(self, obj):
self.obj = obj

def __enter__(self):
if hasattr(self.obj, "thisown"):
self.old_thisown = self.obj.thisown
else:
self.old_thisown = None

def __exit__(self, *ignored):
if self.old_thisown is not None:
self.obj.thisown = self.old_thisown


def handle_SearchParameters(the_class):
""" this wrapper is to enable initializations of the form
Expand All @@ -1206,8 +1184,7 @@ def replacement_init(self, **args):
self.original_init()
for k, v in args.items():
assert hasattr(self, k)
with RememberSwigOwnership(v):
setattr(self, k, v)
setattr(self, k, v)
if type(v) not in (int, float, bool, str):
add_to_referenced_objects(self, v)

Expand Down
14 changes: 14 additions & 0 deletions faiss/python/swigfaiss.swig
Original file line number Diff line number Diff line change
Expand Up @@ -46,6 +46,20 @@ typedef uint64_t size_t;
* the language includes for their respective matrix libraries.
*******************************************************************/

/********************************************************
* Don't transfer ownership when assigning to a struct field
********************************************************/

// https://github.com/swig/swig/issues/2709
// override default behavior
%typemap(in, noblock=1) SWIGTYPE * (void *argp = 0, int res = 0) {
res = SWIG_ConvertPtr($input, &argp,$descriptor, /* $disown | */ %convertptr_flags);
if (!SWIG_IsOK(res)) {
%argument_fail(res, "$type", $symname, $argnum);
}
$1 = %reinterpret_cast(argp, $ltype);
}

%{


Expand Down
8 changes: 6 additions & 2 deletions tests/test_graph_based.py
Original file line number Diff line number Diff line change
Expand Up @@ -154,8 +154,12 @@ def test_io_no_storage(self):
faiss.serialize_index(index2))

# add storage afterwards
index.storage = faiss.clone_index(index.storage)
index.own_fields = True
# Trying to call index.storage.get_distance_computer() after cloning
# will crash, unless we assign to an intermediate variable.
# We also get a heap-use-after-free unless we set own_fields to False.
temp = faiss.clone_index(index.storage)
index.storage = temp
index.own_fields = False

Dnew, Inew = index.search(self.xq, 5)
np.testing.assert_array_equal(Dnew, Dref)
Expand Down
18 changes: 18 additions & 0 deletions tests/test_search_params.py
Original file line number Diff line number Diff line change
Expand Up @@ -397,6 +397,24 @@ def test_ownership(self):
self.assertTrue(sel1.this.own())
self.assertTrue(sel2.this.own())

def test_ownership_2(self):
res = []
for _ in range(20):
params = faiss.SearchParameters()
subset = np.arange(0, 5000000)
params.sel = faiss.IDSelectorBatch(subset)
mem_usage = faiss.get_mem_usage_kb() / 1024**2
res.append(round(mem_usage, 2))
print(res)
self.assertTrue(res[-1] < 0.9)

def test_ownership_3(self):
subset = np.arange(0, 5000000)
sel = faiss.IDSelectorBatch(subset)
assert sel.this.own() # True: correct
_ = faiss.SearchParameters(sel=sel)
assert sel.this.own() # False: why???


class TestSelectorCallback(unittest.TestCase):

Expand Down

0 comments on commit 299f0b3

Please sign in to comment.