Unable to install hdbscan on colab. #600
Seeing this on our CI builds now as well:

error: subprocess-exited-with-error
× Building wheel for hdbscan (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [168 lines of output]
running bdist_wheel
running build
running build_py
creating build
creating build/lib.linux-x86_64-cpython-38
creating build/lib.linux-x86_64-cpython-38/hdbscan
copying hdbscan/validity.py -> build/lib.linux-x86_64-cpython-38/hdbscan
copying hdbscan/plots.py -> build/lib.linux-x86_64-cpython-38/hdbscan
copying hdbscan/flat.py -> build/lib.linux-x86_64-cpython-38/hdbscan
copying hdbscan/prediction.py -> build/lib.linux-x86_64-cpython-38/hdbscan
copying hdbscan/hdbscan_.py -> build/lib.linux-x86_64-cpython-38/hdbscan
copying hdbscan/__init__.py -> build/lib.linux-x86_64-cpython-38/hdbscan
copying hdbscan/robust_single_linkage_.py -> build/lib.linux-x86_64-cpython-38/hdbscan
creating build/lib.linux-x86_64-cpython-38/hdbscan/tests
copying hdbscan/tests/test_rsl.py -> build/lib.linux-x86_64-cpython-38/hdbscan/tests
copying hdbscan/tests/test_prediction_utils.py -> build/lib.linux-x86_64-cpython-38/hdbscan/tests
copying hdbscan/tests/test_flat.py -> build/lib.linux-x86_64-cpython-38/hdbscan/tests
copying hdbscan/tests/__init__.py -> build/lib.linux-x86_64-cpython-38/hdbscan/tests
copying hdbscan/tests/test_hdbscan.py -> build/lib.linux-x86_64-cpython-38/hdbscan/tests
running build_ext
Compiling hdbscan/_hdbscan_tree.pyx because it changed.
[1/1] Cythonizing hdbscan/_hdbscan_tree.pyx
building 'hdbscan._hdbscan_tree' extension
creating build/temp.linux-x86_64-cpython-38
creating build/temp.linux-x86_64-cpython-38/hdbscan
gcc -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -I/home/runner/.local/share/hatch/env/virtual/arize-phoenix/C8K4HrkP/type/include -I/opt/hostedtoolcache/Python/3.8.17/x64/include/python3.8 -I/tmp/pip-build-env-0_kdszx7/overlay/lib/python3.8/site-packages/numpy/core/include -c hdbscan/_hdbscan_tree.c -o build/temp.linux-x86_64-cpython-38/hdbscan/_hdbscan_tree.o
In file included from /tmp/pip-build-env-0_kdszx7/overlay/lib/python3.8/site-packages/numpy/core/include/numpy/ndarraytypes.h:1830,
from /tmp/pip-build-env-0_kdszx7/overlay/lib/python3.8/site-packages/numpy/core/include/numpy/ndarrayobject.h:12,
from /tmp/pip-build-env-0_kdszx7/overlay/lib/python3.8/site-packages/numpy/core/include/numpy/arrayobject.h:4,
from hdbscan/_hdbscan_tree.c:1097:
/tmp/pip-build-env-0_kdszx7/overlay/lib/python3.8/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:17:2: warning: #warning "Using deprecated NumPy API, disable it with " "#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
17 | #warning "Using deprecated NumPy API, disable it with " \
| ^~~~~~~
gcc -shared -Wl,--rpath=/opt/hostedtoolcache/Python/3.8.17/x64/lib -Wl,--rpath=/opt/hostedtoolcache/Python/3.8.17/x64/lib build/temp.linux-x86_64-cpython-38/hdbscan/_hdbscan_tree.o -L/opt/hostedtoolcache/Python/3.8.17/x64/lib -o build/lib.linux-x86_64-cpython-38/hdbscan/_hdbscan_tree.cpython-38-x86_64-linux-gnu.so
/tmp/pip-build-env-0_kdszx7/overlay/lib/python3.8/site-packages/Cython/Compiler/Main.py:381: FutureWarning: Cython directive 'language_level' not set, using '3str' for now (Py3). This has changed from earlier releases! File: /tmp/pip-install-sir9k2dg/hdbscan_aa682700701c41ffa445f31aed278805/hdbscan/_hdbscan_tree.pyx
tree = Parsing.p_module(s, pxd, full_module_name)
/tmp/pip-build-env-0_kdszx7/overlay/lib/python3.8/site-packages/Cython/Compiler/Main.py:381: FutureWarning: Cython directive 'language_level' not set, using '3str' for now (Py3). This has changed from earlier releases! File: /tmp/pip-install-sir9k2dg/hdbscan_aa682700701c41ffa445f31aed278805/hdbscan/_hdbscan_linkage.pyx
tree = Parsing.p_module(s, pxd, full_module_name)
Error compiling Cython file:
------------------------------------------------------------
...
import numpy as np
cimport numpy as np
from libc.float cimport DBL_MAX
from dist_metrics cimport DistanceMetric
^
------------------------------------------------------------
hdbscan/_hdbscan_linkage.pyx:12:0: 'dist_metrics.pxd' not found
Error compiling Cython file:
------------------------------------------------------------
...
import numpy as np
cimport numpy as np
from libc.float cimport DBL_MAX
from dist_metrics cimport DistanceMetric
^
------------------------------------------------------------
hdbscan/_hdbscan_linkage.pyx:12:0: 'dist_metrics/DistanceMetric.pxd' not found
Error compiling Cython file:
------------------------------------------------------------
...
cpdef np.ndarray[np.double_t, ndim=2] mst_linkage_core_vector(
np.ndarray[np.double_t, ndim=2, mode='c'] raw_data,
np.ndarray[np.double_t, ndim=1, mode='c'] core_distances,
DistanceMetric dist_metric,
^
------------------------------------------------------------
hdbscan/_hdbscan_linkage.pyx:58:8: 'DistanceMetric' is not a type identifier
Error compiling Cython file:
------------------------------------------------------------
...
continue
right_value = current_distances[j]
right_source = current_sources[j]
left_value = dist_metric.dist(&raw_data_ptr[num_features *
^
------------------------------------------------------------
hdbscan/_hdbscan_linkage.pyx:129:42: Cannot convert 'double_t *' to Python object
Error compiling Cython file:
------------------------------------------------------------
...
right_value = current_distances[j]
right_source = current_sources[j]
left_value = dist_metric.dist(&raw_data_ptr[num_features *
current_node],
&raw_data_ptr[num_features * j],
^
------------------------------------------------------------
hdbscan/_hdbscan_linkage.pyx:131:42: Cannot convert 'double_t *' to Python object
Compiling hdbscan/_hdbscan_linkage.pyx because it changed.
[1/1] Cythonizing hdbscan/_hdbscan_linkage.pyx
Traceback (most recent call last):
File "/home/runner/.local/share/hatch/env/virtual/arize-phoenix/C8K4HrkP/type/lib/python3.8/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
main()
File "/home/runner/.local/share/hatch/env/virtual/arize-phoenix/C8K4HrkP/type/lib/python3.8/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
json_out['return_val'] = hook(**hook_input['kwargs'])
File "/home/runner/.local/share/hatch/env/virtual/arize-phoenix/C8K4HrkP/type/lib/python3.8/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 251, in build_wheel
return _build_backend().build_wheel(wheel_directory, config_settings,
File "/tmp/pip-build-env-0_kdszx7/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 416, in build_wheel
return self._build_with_temp_dir(['bdist_wheel'], '.whl',
File "/tmp/pip-build-env-0_kdszx7/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 401, in _build_with_temp_dir
self.run_setup()
File "/tmp/pip-build-env-0_kdszx7/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 487, in run_setup
super(_BuildMetaLegacyBackend,
File "/tmp/pip-build-env-0_kdszx7/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 338, in run_setup
exec(code, locals())
File "<string>", line 96, in <module>
File "/tmp/pip-build-env-0_kdszx7/overlay/lib/python3.8/site-packages/setuptools/__init__.py", line 107, in setup
return distutils.core.setup(**attrs)
File "/tmp/pip-build-env-0_kdszx7/overlay/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 185, in setup
return run_commands(dist)
File "/tmp/pip-build-env-0_kdszx7/overlay/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
dist.run_commands()
File "/tmp/pip-build-env-0_kdszx7/overlay/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
self.run_command(cmd)
File "/tmp/pip-build-env-0_kdszx7/overlay/lib/python3.8/site-packages/setuptools/dist.py", line 1234, in run_command
super().run_command(command)
File "/tmp/pip-build-env-0_kdszx7/overlay/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/tmp/pip-build-env-0_kdszx7/overlay/lib/python3.8/site-packages/wheel/bdist_wheel.py", line 343, in run
self.run_command("build")
File "/tmp/pip-build-env-0_kdszx7/overlay/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
self.distribution.run_command(command)
File "/tmp/pip-build-env-0_kdszx7/overlay/lib/python3.8/site-packages/setuptools/dist.py", line 1234, in run_command
super().run_command(command)
File "/tmp/pip-build-env-0_kdszx7/overlay/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/tmp/pip-build-env-0_kdszx7/overlay/lib/python3.8/site-packages/setuptools/_distutils/command/build.py", line 131, in run
self.run_command(cmd_name)
File "/tmp/pip-build-env-0_kdszx7/overlay/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
self.distribution.run_command(command)
File "/tmp/pip-build-env-0_kdszx7/overlay/lib/python3.8/site-packages/setuptools/dist.py", line 1234, in run_command
super().run_command(command)
File "/tmp/pip-build-env-0_kdszx7/overlay/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "<string>", line 26, in run
File "/tmp/pip-build-env-0_kdszx7/overlay/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 345, in run
self.build_extensions()
File "/tmp/pip-build-env-0_kdszx7/overlay/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 467, in build_extensions
self._build_extensions_serial()
File "/tmp/pip-build-env-0_kdszx7/overlay/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 493, in _build_extensions_serial
self.build_extension(ext)
File "/tmp/pip-build-env-0_kdszx7/overlay/lib/python3.8/site-packages/Cython/Distutils/build_ext.py", line 122, in build_extension
new_ext = cythonize(
File "/tmp/pip-build-env-0_kdszx7/overlay/lib/python3.8/site-packages/Cython/Build/Dependencies.py", line 1134, in cythonize
cythonize_one(*args)
File "/tmp/pip-build-env-0_kdszx7/overlay/lib/python3.8/site-packages/Cython/Build/Dependencies.py", line 1301, in cythonize_one
raise CompileError(None, pyx_file)
Cython.Compiler.Errors.CompileError: hdbscan/_hdbscan_linkage.pyx
[end of output]
We're seeing the same issue since today, on Linux x86-64 with Python 3.10. I noticed there weren't any wheels before, so I'm assuming we were building hdbscan from source all along. I'm not sure what change is now causing the build failure.
Having this problem as well. Installing using Poetry, with no changes to the lock file. It was working last week.
This is also causing issues in Databricks. Cython released a new major version (3.0.0) a few hours ago, so there might be an issue with that on these managed environments. https://pypi.org/project/Cython/#history EDIT: Databricks runtime 10.4 LTS has issues; 11.3 LTS and 12.2 LTS work fine.
Downgrading Cython to the previous release is not working for me; still the same error.
I suggested Cython only because of the timing of their new major release and these errors popping up; it might not be related.
Downgrading Cython to 0.29.36 is also not working for me.
Having the same issue on Kaggle notebooks.
There was a recent scikit-learn release that changed some internals that hdbscan relied on (which prompted the 0.8.30 release to fix those). It's possible that this is the issue; can you check what scikit-learn version you have?
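For anyone unsure what they have installed, a quick way to report the versions of the packages discussed in this thread (a generic sketch, not anything hdbscan-specific):

```python
# Print the installed versions of the packages discussed in this thread.
# Packages that are not installed are reported instead of raising.
from importlib.metadata import version, PackageNotFoundError

for pkg in ("hdbscan", "scikit-learn", "Cython", "numpy"):
    try:
        print(f"{pkg}=={version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg} is not installed")
```

Pasting the output of this into a comment makes version-compatibility questions like the one above much quicker to answer.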
scikit-learn==1.2.2
Same issue on Ubuntu 18.04 using the Docker image python:3.8.12.
I'm at a bit of a loss, especially if 0.8.29 is also no longer building. I can at least reproduce this locally, but it is unclear how to fix things, since nothing that is currently breaking has changed in quite some time -- so it isn't clear why it is breaking at all.
Okay, I poked at the obvious things in terms of module-name resolution issues, and that seems to have fixed the problem locally. I don't understand what changed, or indeed why this particular change is now required, but given the scale of the issues people are having I'm going to push those changes out as a 0.8.31 release, and hopefully that solves the problem for some people.
I have an idea: this might be caused by isolated builds. When I install the package, it pulls down the most recent version of
(comment is being updated as I'm testing my hypothesis...)
The new patch kind of solved the issue for me. https://github.com/scikit-learn-contrib/hdbscan/releases/tag/0.8.31
Confirming working on
@nchepanov I believe you are correct; while the changes made allowed Cython 3 to build hdbscan, there seem to be further issues at runtime. Until I have time to figure out and work through all the changes that Cython 3 requires, I have added a "<3" requirement for Cython. That seems to resolve all the issues as far as I can tell. I've pushed that out as 0.8.32, and hopefully that can keep things afloat for a while. Thanks to everyone for flagging the issue and for the help tracking down the source of the problem.
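For anyone maintaining their own Cython-based package and wanting the same kind of pin, the "<3" build requirement lives in the build-system table of pyproject.toml. This fragment is illustrative only, not hdbscan's actual configuration:

```toml
# Illustrative pyproject.toml fragment (not hdbscan's real file):
# pip reads this table when creating the isolated build environment,
# so the pin applies even though the user never installs Cython directly.
[build-system]
requires = [
    "setuptools",
    "wheel",
    "cython<3",  # stay on the 0.29.x series until Cython 3 is supported
    "numpy",
]
build-backend = "setuptools.build_meta"
```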
This is more what I was thinking. I did remember something about isolated builds, but could not locate it in the Python docs.
pip does not respect installed versions of packages in
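The build-isolation behaviour mentioned above can be worked around by pre-installing a pinned Cython and telling pip not to create an isolated build environment. This is a hedged sketch of the workaround discussed in the thread, not an officially documented fix, and the pins are illustrative:

```shell
# pip's isolated build environment pulls the newest Cython available,
# ignoring whatever is already installed. Pre-install the build
# dependencies, then disable isolation so the pinned Cython is used
# to compile the extensions.
pip install "cython<3" numpy
pip install --no-build-isolation hdbscan
```

The trade-off is that with `--no-build-isolation` you become responsible for having all build dependencies present beforehand.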
0.8.31 is not working for me. I'm running hdbscan inside a dockerized application, with scikit-learn==1.2.2, and getting the following error:

`Traceback (most recent call last):
File "/usr/src/app/modules/cluster.py", line 26, in fit
File "/usr/local/lib/python3.10/dist-packages/hdbscan/hdbscan_.py", line 1205, in fit
File "/usr/local/lib/python3.10/dist-packages/hdbscan/hdbscan_.py", line 884, in hdbscan
File "/usr/local/lib/python3.10/dist-packages/hdbscan/hdbscan_.py", line 78, in _tree_to_labels
File "hdbscan/_hdbscan_tree.pyx", line 43, in hdbscan._hdbscan_tree.condense_tree
File "hdbscan/_hdbscan_tree.pyx", line 114, in hdbscan._hdbscan_tree.condense_tree
TypeError: 'numpy.float64' object cannot be interpreted as an integer`
This is also, unfortunately, the same runtime exception I'm hitting with 0.8.32.
So I definitely saw that runtime error with 0.8.31; in testing, it disappeared with 0.8.32. If it is still an issue in 0.8.32 then that's not so good. I was getting all green on the test suite: https://dev.azure.com/lelandmcinnes/HDBSCAN%20builds/_build/results?buildId=901&view=results so I'm not sure what the lingering issue is. Perhaps try a clean re-install of 0.8.32?
I am sorry, but if there is anything I can do to further identify the cause, please let me know.
Hey there, I'm facing a similar error. I was following the whole thread, and it was impossible for me to find a solution, so I will post the whole error stack trace. I've installed hdbscan 0.8.33, and this error is raised just when applying BERTopic's fit_transform. I hope someone can help me find a solution.

`---------------------------------------------------------------------------
File ~/anaconda3/envs/PYTRC_1/lib/python3.10/site-packages/hdbscan/hdbscan_.py:1205, in HDBSCAN.fit(self, X, y)
File ~/anaconda3/envs/PYTRC_1/lib/python3.10/site-packages/hdbscan/hdbscan_.py:884, in hdbscan(X, min_cluster_size, min_samples, alpha, cluster_selection_epsilon, max_cluster_size, metric, p, leaf_size, algorithm, memory, approx_min_span_tree, gen_min_span_tree, core_dist_n_jobs, cluster_selection_method, allow_single_cluster, match_reference_implementation, **kwargs)
File ~/anaconda3/envs/PYTRC_1/lib/python3.10/site-packages/hdbscan/hdbscan_.py:78, in _tree_to_labels(X, single_linkage_tree, min_cluster_size, cluster_selection_method, allow_single_cluster, match_reference_implementation, cluster_selection_epsilon, max_cluster_size)
File hdbscan/_hdbscan_tree.pyx:43, in hdbscan._hdbscan_tree.condense_tree()
File hdbscan/_hdbscan_tree.pyx:114, in hdbscan._hdbscan_tree.condense_tree()
TypeError: 'numpy.float64' object cannot be interpreted as an integer

During handling of the above exception, another exception occurred:

TypeError Traceback (most recent call last)
File ~/anaconda3/envs/PYTRC_1/lib/python3.10/site-packages/bertopic/_bertopic.py:389, in BERTopic.fit_transform(self, documents, embeddings, images, y)
File ~/anaconda3/envs/PYTRC_1/lib/python3.10/site-packages/bertopic/_bertopic.py:3220, in BERTopic.cluster_embeddings(self, umap_embeddings, documents, partial_fit, y)
File ~/anaconda3/envs/PYTRC_1/lib/python3.10/site-packages/hdbscan/hdbscan_.py:1205, in HDBSCAN.fit(self, X, y)
File ~/anaconda3/envs/PYTRC_1/lib/python3.10/site-packages/hdbscan/hdbscan_.py:884, in hdbscan(X, min_cluster_size, min_samples, alpha, cluster_selection_epsilon, max_cluster_size, metric, p, leaf_size, algorithm, memory, approx_min_span_tree, gen_min_span_tree, core_dist_n_jobs, cluster_selection_method, allow_single_cluster, match_reference_implementation, **kwargs)
File ~/anaconda3/envs/PYTRC_1/lib/python3.10/site-packages/hdbscan/hdbscan_.py:78, in _tree_to_labels(X, single_linkage_tree, min_cluster_size, cluster_selection_method, allow_single_cluster, match_reference_implementation, cluster_selection_epsilon, max_cluster_size)
File hdbscan/_hdbscan_tree.pyx:43, in hdbscan._hdbscan_tree.condense_tree()
File hdbscan/_hdbscan_tree.pyx:114, in hdbscan._hdbscan_tree.condense_tree()
TypeError: 'numpy.float64' object cannot be interpreted as an integer`
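The error class in the tracebacks above is easy to reproduce outside hdbscan. A NumPy float scalar cannot be used where CPython insists on an integer, which is what the compiled condense_tree appears to trip over; this minimal sketch (not hdbscan code) shows the failure and the usual call-site fix:

```python
import numpy as np

# A NumPy float scalar where an int is required raises exactly this error.
n = np.float64(5.0)
try:
    list(range(n))
except TypeError as exc:
    print(exc)  # 'numpy.float64' object cannot be interpreted as an integer

# An explicit cast at the call site is the usual fix:
print(list(range(int(n))))  # → [0, 1, 2, 3, 4]
```

This is why the bug surfaces at runtime rather than at build time: the value only becomes a float when the tree arrays are indexed.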
Hello everyone, I'm encountering the ongoing issue of "'numpy.float64' object cannot be interpreted as an integer" persistently with the function get_clusters() on my local setup. I'm utilizing the following package versions:
Unfortunately, the solutions proposed earlier have not yielded positive results. Are there any recent developments or updates within these packages that could potentially address this problem? Alternatively, could someone suggest a combination of package versions that might prove effective? I've noticed potential compatibility concerns between HDBScan, BERTopic, Cython, Python versions, and more. Your insights would be greatly appreciated. Thank you.
Thank you for this informative issue. I understand from reading this that changing
Hi, I have a similar issue: the ongoing TypeError: 'numpy.float64' object cannot be interpreted as an integer. My packages: Python 3.10.11, hdbscan 0.8.33, BERTopic 0.15.0. I installed the packages in a fresh conda environment and could run from bertopic import BERTopic, but running topic_model = BERTopic() raises the TypeError every time. Does someone have an idea how to solve this issue?
@nchepanov Could I ask: do you first install hdbscan, and then cython>=0.27<3? I keep hitting this error when running the BERTopic package and can't get rid of it.
I was experiencing a similar issue with a Streamlit app deployed on the Streamlit community server, where I had previously pinned hdbscan == 0.8.28 in the requirements file; with 0.8.33 it is working again.
Still experiencing this with
Today I found the following error message when trying to install hdbscan on Colab.
It worked fine when I installed it last week.
I also tried to install the previous version of hdbscan (0.8.29), but it still failed.