Creating two IndexPQ instances back to back in a Python script, with the specific parameters shown below, causes the script to be killed unexpectedly.
Platform
OS: Ubuntu 24.04 LTS
Faiss version: faiss-gpu 1.8.0
Installed from: Anaconda
Faiss compilation options:
Running on:
[✔] CPU
[ ] GPU
Interface:
[ ] C++
[✔] Python
Reproduction instructions
import faiss
import numpy as np

# Set parameters
d = 554
M = 1
nbits = 56
metric = faiss.METRIC_INNER_PRODUCT

index0 = faiss.IndexPQ(d, M, nbits, metric)
# Killed occurs here
index1 = faiss.IndexPQ(d, M, nbits, metric)

np.random.seed(0)
nb = 10000
nq = 1
xb = np.random.random((nb, d)).astype('float32')
xq = np.random.random((nq, d)).astype('float32')

index1.train(xb)
index1.add(xb)

k = 5
D, I = index1.search(xq, k)
print("Index:\n", I)
print("Distance:\n", D)
Running the above Python script ends with the process being killed for no apparent reason, with the following output:
Killed
We have also observed that the failure mode depends on the platform.
When the above Python script runs on a server with Ubuntu 24.04 LTS (Intel Core i7-11700 CPU and 64 GB of memory), it gets killed.
However, when the script runs on a laptop with Ubuntu 22.04.3 LTS in WSL (AMD Ryzen 5 4600H CPU and 16 GB of memory), it raises the following error instead:
Traceback (most recent call last):
  File "/mnt/d/faiss/bug.py", line 12, in <module>
    index0 = faiss.IndexPQ(d, M, nbits, metric)
  File "/home/xxx/anaconda3/envs/faiss/lib/python3.10/site-packages/faiss/swigfaiss_avx2.py", line 5063, in __init__
    _swigfaiss_avx2.IndexPQ_swiginit(self, _swigfaiss_avx2.new_IndexPQ(*args))
MemoryError: std::bad_alloc
@qwevdb I have played around with this and created more failure scenarios. I was able to reproduce the problem on the same Intel machine, both with bad_alloc and with another failure. I'm posting the script below to help the team get started. It looks like nbits values of 56, 58, 60, 62, 63 (and potentially others) cause the error messages you reported. In my tests, Killed came back with an exit code of 137, which indicates the process was killed for running out of memory and aligns with the tests in the script below. As a workaround, you can use a higher or lower nbits to unblock yourself for the time being. Thanks!
import faiss
import numpy as np
import psutil

# Set parameters
d = 554
M = 1
nbits = 56
metric = faiss.METRIC_INNER_PRODUCT

# print starting memory used
print("Base memory used: %s" % psutil.virtual_memory().used)

# create first index that will eat up about 35GB of memory
index0 = faiss.IndexPQ(554, 1, 56, metric)
print("After first index: %s" % psutil.virtual_memory().used)

# create the second index that will require virtually no additional memory
index1 = faiss.IndexPQ(554, 2, 64, metric)
print("After second index: %s" % psutil.virtual_memory().used)

# and this will cause bad alloc
faiss.IndexPQ(554, 1, 60, metric)

# and this will cause a swap error
faiss.IndexPQ(554, 1, 63, metric)
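For context on the exit code of 137 mentioned above: shells report a process that died from a signal as 128 plus the signal number, so 137 corresponds to signal 9 (SIGKILL), which is what the Linux OOM killer sends. The sketch below decodes such a status when the reproduction is launched from a wrapper script. The helper name `describe_exit` is hypothetical and not part of Faiss; note that Python's `subprocess` module reports the same event as a negative return code (-9) rather than 137.

```python
import signal


def describe_exit(returncode: int) -> str:
    """Explain a process exit status (hypothetical helper, not part of Faiss)."""
    if returncode < 0:
        # subprocess convention: -N means "terminated by signal N"
        signum = -returncode
    elif returncode > 128:
        # shell convention: 128 + N means "terminated by signal N"
        signum = returncode - 128
    else:
        return f"exited normally with status {returncode}"
    try:
        name = signal.Signals(signum).name
    except ValueError:
        # Not a signal number known on this platform
        name = f"signal {signum}"
    return f"terminated by {name}"


print(describe_exit(137))  # status as seen from a shell
print(describe_exit(-9))   # status as seen from subprocess
```

A SIGKILL status alone does not prove an out-of-memory kill; on Linux the kernel log is the place to confirm it.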
Hi @qwevdb, after internal discussion, we added #3833 which sets the nbits maximum to 24 for IndexPQ. We noticed your number of subquantizers per vector (M) is 1. You can try to increase the number of subquantizers and decrease nbits for the same compression.
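The trade-off suggested above can be checked with simple arithmetic: a PQ code takes M * nbits bits per vector, while the codebook stores 2^nbits centroids for each of the M subquantizers, each of dimension d / M, which comes to d * 2^nbits float32 values in total. The following sketch lists the (M, nbits) pairs that keep the same code size under the new limit of 24. The helper names are hypothetical, the requirement that M divides d reflects how PQ splits the vector, and the padded dimension 560 is only an illustrative assumption, since 554 has very few divisors.

```python
def pq_codebook_bytes(d: int, nbits: int) -> int:
    """Codebook size in bytes: d * 2**nbits float32 values (4 bytes each)."""
    return d * (1 << nbits) * 4


def equivalent_configs(d: int, total_bits: int, max_nbits: int = 24):
    """List (M, nbits) pairs with M * nbits == total_bits and M dividing d.

    Hypothetical helper; max_nbits=24 mirrors the limit mentioned in the thread.
    """
    configs = []
    for M in range(1, d + 1):
        if d % M == 0 and total_bits % M == 0:
            nbits = total_bits // M
            if 1 <= nbits <= max_nbits:
                configs.append((M, nbits))
    return configs


# With d = 554 there is no 56-bit configuration under the limit,
# because 554 is only divisible by 1, 2, 277 and 554.
print(equivalent_configs(554, 56))

# With an illustrative padded dimension of 560, several options exist.
for M, nbits in equivalent_configs(560, 56):
    print(M, nbits, pq_codebook_bytes(560, nbits), "bytes of codebook")
```

Under these assumptions, spreading the bits over more subquantizers shrinks the codebook dramatically, because its size grows exponentially in nbits but does not depend on M.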
Actually, anything above nbits = 31 will cause integer overflow for size_t. The nbits = 64 that Ramil tried above didn't increase memory usage because it overflowed twice back down to 0. The nbits = 56 overflowed but was still large enough to OOM with 2 of them. (thanks @mengdilin for the investigation)
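One way to make sense of the memory figures reported in this thread is to model the number of centroids as `1 << nbits` computed on a 32-bit integer, where x86 hardware uses only the low 5 bits of the shift count. This is an assumption made for illustration and is not confirmed by the thread; shifting by more than the type width is undefined behaviour in C++, so actual results can differ by compiler and platform. The function names below are hypothetical.

```python
def modeled_ksub(nbits: int) -> int:
    """Model of a 32-bit `1 << nbits` on x86 (assumption: shift count masked to 5 bits)."""
    return 1 << (nbits & 31)


def modeled_codebook_gib(d: int, nbits: int) -> float:
    """Codebook size in GiB under the model: d * ksub float32 values."""
    return d * modeled_ksub(nbits) * 4 / 2**30


# nbits = 63 is left out: the model does not capture the sign of a 32-bit result.
for nbits in (56, 60, 64):
    print(nbits, modeled_ksub(nbits), modeled_codebook_gib(554, nbits))
```

Under this model, nbits = 56 behaves like nbits = 24 and needs roughly 35 GiB per index, nbits = 60 behaves like nbits = 28 and needs hundreds of GiB, and nbits = 64 wraps around to a single centroid, which lines up with the observations in the script above.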