Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python 3.11 torch.jit.script segmentation fault #8459

Closed
dkurt opened this issue Feb 19, 2024 · 4 comments
Closed

Python 3.11 torch.jit.script segmentation fault #8459

dkurt opened this issue Feb 19, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@dkurt
Copy link

dkurt commented Feb 19, 2024

Describe the bug

Segmentation fault when import asr:

$ python -c "from nemo.collections.asr.models.msdd_models import NeuralDiarizer"
Segmentation fault (core dumped)

faulthandler reports a problem with @torch.jit.script compilation:

@torch.jit.script
def stitch_cluster_labels(Y_old: torch.Tensor, Y_new: torch.Tensor) -> torch.Tensor:

Fatal Python error: Segmentation fault

Current thread 0x00007f0df18b7000 (most recent call first):
  File "/home/dkurtaev/NeMo/.venv/lib/python3.11/site-packages/torch/jit/_script.py", line 1399 in script
  File "/home/dkurtaev/NeMo/.venv/lib/python3.11/site-packages/torch/jit/_recursive.py", line 1003 in try_compile_fn
  File "/home/dkurtaev/NeMo/.venv/lib/python3.11/site-packages/torch/jit/_script.py", line 1395 in script
  File "/home/dkurtaev/NeMo/.venv/lib/python3.11/site-packages/torch/jit/_recursive.py", line 1003 in try_compile_fn
  File "/home/dkurtaev/NeMo/.venv/lib/python3.11/site-packages/torch/jit/_script.py", line 1395 in script
  File "/home/dkurtaev/NeMo/.venv/lib/python3.11/site-packages/torch/jit/_recursive.py", line 1003 in try_compile_fn
  File "/home/dkurtaev/NeMo/.venv/lib/python3.11/site-packages/torch/jit/_script.py", line 1395 in script
  File "/home/dkurtaev/NeMo/nemo/collections/asr/parts/utils/online_clustering.py", line 119 in <module>
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 940 in exec_module
  File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "/home/dkurtaev/NeMo/nemo/collections/asr/parts/utils/longform_clustering.py", line 24 in <module>
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 940 in exec_module
  File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "/home/dkurtaev/NeMo/nemo/collections/asr/parts/utils/speaker_utils.py", line 31 in <module>
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 940 in exec_module
  File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "/home/dkurtaev/NeMo/nemo/collections/asr/parts/utils/manifest_utils.py", line 25 in <module>
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 940 in exec_module
  File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1234 in _handle_fromlist
  File "/home/dkurtaev/NeMo/nemo/collections/asr/models/aed_multitask_models.py", line 44 in <module>
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 940 in exec_module
  File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "/home/dkurtaev/NeMo/nemo/collections/asr/models/__init__.py", line 15 in <module>
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 940 in exec_module
  File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1234 in _handle_fromlist
  File "/home/dkurtaev/NeMo/nemo/collections/asr/__init__.py", line 15 in <module>
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 940 in exec_module
  File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1128 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1128 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "/home/dkurtaev/audio_analytics/run.py", line 54 in <module>

Extension modules: numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, torch._C, torch._C._fft, torch._C._linalg, torch._C._nested, torch._C._nn, torch._C._sparse, torch._C._special, yaml._yaml, matplotlib._c_internal_utils, PIL._imaging, matplotlib._path, kiwisolver._cext, matplotlib._image, regex._regex, charset_normalizer.md, sentencepiece._sentencepiece, google._upb._message, psutil._psutil_linux, psutil._psutil_posix, _cffi_backend, scipy._lib._ccallback_c, scipy.signal._sigtools, scipy.linalg._fblas, scipy.linalg._flapack, scipy.linalg.cython_lapack, scipy.linalg._cythonized_array_utils, scipy.linalg._solve_toeplitz, scipy.linalg._flinalg, scipy.linalg._decomp_lu_cython, scipy.linalg._matfuncs_sqrtm_triu, scipy.linalg.cython_blas, scipy.linalg._matfuncs_expm, scipy.linalg._decomp_update, scipy.sparse._sparsetools, _csparsetools, scipy.sparse._csparsetools, scipy.sparse.linalg._dsolve._superlu, scipy.sparse.linalg._eigen.arpack._arpack, scipy.sparse.csgraph._tools, scipy.sparse.csgraph._shortest_path, scipy.sparse.csgraph._traversal, scipy.sparse.csgraph._min_spanning_tree, scipy.sparse.csgraph._flow, scipy.sparse.csgraph._matching, scipy.sparse.csgraph._reordering, scipy.special._ufuncs_cxx, scipy.special._ufuncs, scipy.special._specfun, scipy.special._comb, scipy.special._ellip_harm_2, scipy._lib._uarray._uarray, scipy.signal._max_len_seq_inner, scipy.signal._upfirdn_apply, scipy.signal._spline, scipy.spatial._ckdtree, scipy._lib.messagestream, scipy.spatial._qhull, scipy.spatial._voronoi, scipy.spatial._distance_wrap, scipy.spatial._hausdorff, scipy.spatial.transform._rotation, scipy.integrate._odepack, scipy.integrate._quadpack, scipy.integrate._vode, scipy.integrate._dop, scipy.integrate._lsoda, scipy.optimize._minpack2, scipy.optimize._group_columns, scipy.optimize._trlib._trlib, scipy.optimize._lbfgsb, _moduleTNC, scipy.optimize._moduleTNC, scipy.optimize._cobyla, scipy.optimize._slsqp, scipy.optimize._minpack, scipy.optimize._lsq.givens_elimination, scipy.optimize._zeros, scipy.optimize._highs.cython.src._highs_wrapper, scipy.optimize._highs._highs_wrapper, scipy.optimize._highs.cython.src._highs_constants, scipy.optimize._highs._highs_constants, scipy.linalg._interpolative, scipy.optimize._bglu_dense, scipy.optimize._lsap, scipy.optimize._direct, scipy.interpolate._fitpack, scipy.interpolate.dfitpack, scipy.interpolate._bspl, scipy.interpolate._ppoly, scipy.interpolate.interpnd, scipy.interpolate._rbfinterp_pythran, scipy.interpolate._rgi_cython, scipy.ndimage._nd_image, _ni_label, scipy.ndimage._ni_label, scipy.signal._sosfilt, scipy.signal._spectral, scipy.special.cython_special, scipy.stats._stats, scipy.stats.beta_ufunc, scipy.stats._boost.beta_ufunc, scipy.stats.binom_ufunc, scipy.stats._boost.binom_ufunc, scipy.stats.nbinom_ufunc, scipy.stats._boost.nbinom_ufunc, scipy.stats.hypergeom_ufunc, scipy.stats._boost.hypergeom_ufunc, scipy.stats.ncf_ufunc, scipy.stats._boost.ncf_ufunc, scipy.stats.ncx2_ufunc, scipy.stats._boost.ncx2_ufunc, scipy.stats.nct_ufunc, scipy.stats._boost.nct_ufunc, scipy.stats.skewnorm_ufunc, scipy.stats._boost.skewnorm_ufunc, scipy.stats.invgauss_ufunc, scipy.stats._boost.invgauss_ufunc, scipy.stats._biasedurn, scipy.stats._levy_stable.levyst, scipy.stats._stats_pythran, scipy.stats._ansari_swilk_statistics, scipy.stats._sobol, scipy.stats._qmc_cy, scipy.stats._mvn, scipy.stats._rcont.rcont, scipy.stats._unuran.unuran_wrapper, scipy.signal._peak_finding_utils, pyarrow.lib, pyarrow._hdfsio, pandas._libs.tslibs.ccalendar, pandas._libs.tslibs.np_datetime, pandas._libs.tslibs.dtypes, pandas._libs.tslibs.base, pandas._libs.tslibs.nattype, pandas._libs.tslibs.timezones, pandas._libs.tslibs.fields, pandas._libs.tslibs.timedeltas, pandas._libs.tslibs.tzconversion, pandas._libs.tslibs.timestamps, pandas._libs.properties, pandas._libs.tslibs.offsets, pandas._libs.tslibs.strptime, pandas._libs.tslibs.parsing, pandas._libs.tslibs.conversion, pandas._libs.tslibs.period, pandas._libs.tslibs.vectorized, pandas._libs.ops_dispatch, pandas._libs.missing, pandas._libs.hashtable, pandas._libs.algos, pandas._libs.interval, pandas._libs.lib, pyarrow._compute, pandas._libs.ops, pandas._libs.hashing, pandas._libs.arrays, pandas._libs.tslib, pandas._libs.sparse, pandas._libs.internals, pandas._libs.indexing, pandas._libs.index, pandas._libs.writers, pandas._libs.join, pandas._libs.window.aggregations, pandas._libs.window.indexers, pandas._libs.reshape, pandas._libs.groupby, pandas._libs.json, pandas._libs.parsers, pandas._libs.testing, numba.core.typeconv._typeconv, numba._helperlib, numba._dynfunc, numba._dispatcher, numba.core.runtime._nrt_python, numba.np.ufunc._internal, numba.experimental.jitclass._box, numba.mviewbuf, numba.types.itertools, cytoolz.utils, cytoolz.itertoolz, cytoolz.functoolz, cytoolz.dicttoolz, cytoolz.recipes (total: 197)
Segmentation fault (core dumped)

When use Python 3.10 or Python 3.11 with PYTORCH_JIT=0 import passes.

Steps/Code to reproduce bug

python3.11 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -r requirements/requirements_lightning.txt
pip install -r requirements/requirements_asr.txt
pip install -r requirements/requirements.txt
pip install -r requirements/requirements_common.txt
@dkurt dkurt added the bug Something isn't working label Feb 19, 2024
@dkurt
Copy link
Author

dkurt commented Feb 19, 2024

Probably related: pytorch/pytorch#86566

@titu1994
Copy link
Collaborator

Yikes. Fyi @tango4j @nithinraok @artbataev

@artbataev
Copy link
Collaborator

@dkurt What version of PyTorch do you use? What is the environment (OS, conda/pip, etc)?

@dkurt
Copy link
Author

dkurt commented Feb 20, 2024

@artbataev, Extremely apologize. Segmentation fault reproduced with Python 3.11.0rc1 and PyTorch 2.1.0/2.2.0. After upgrade to Python 3.11.7 everything works fine.

@dkurt dkurt closed this as not planned Won't fix, can't repro, duplicate, stale Feb 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants