Describe the bug
When running the NLP tutorial and opening the 'text_embedding' link in the final step, the page loads but crashes after a few seconds when the program attempts to cluster the embeddings.
The problem might be that the data type of the vectors passed to fit_predict is not right.
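As a sanity check, the dtype of the vectors can be inspected and cast before the data reaches Phoenix. This is a minimal sketch with a random stand-in array, not the tutorial's actual embeddings:

```python
import numpy as np

# Hypothetical stand-in for the embedding matrix the tutorial produces;
# the real vectors come from the Phoenix NLP tutorial dataframe.
embeddings = np.random.rand(1000, 768)
print(embeddings.dtype)  # float64 -- NumPy's default for random floats

# One workaround attempt: cast explicitly before handing the data to Phoenix.
embeddings_f32 = embeddings.astype(np.float32)
print(embeddings_f32.dtype)  # float32
```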
Error stack
'numpy.float64' object cannot be interpreted as an integer
GraphQL request:18:7
18 | UMAPPoints(timeRange: $timeRange, minDist: $minDist, nNeighbors: $nNeighbo
| ^
| rs, nSamples: $nSamples, minClusterSize: $minClusterSize, clusterMinSamples: $cl
Traceback (most recent call last):
File "c:\Users\ju\Code\Pentalog\jiratool\.phoenix-venv\Lib\site-packages\graphql\execution\execute.py", line 521, in execute_field
result = resolve_fn(source, info, **args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "c:\Users\ju\Code\Pentalog\jiratool\.phoenix-venv\Lib\site-packages\strawberry\schema\schema_converter.py", line 597, in _resolver
return _get_result_with_extensions(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "c:\Users\ju\Code\Pentalog\jiratool\.phoenix-venv\Lib\site-packages\strawberry\schema\schema_converter.py", line 583, in extension_resolver
return reduce(
^^^^^^^
File "c:\Users\ju\Code\Pentalog\jiratool\.phoenix-venv\Lib\site-packages\strawberry\schema\schema_converter.py", line 578, in wrapped_get_result
return _get_result(
^^^^^^^^^^^^
File "c:\Users\ju\Code\Pentalog\jiratool\.phoenix-venv\Lib\site-packages\strawberry\schema\schema_converter.py", line 539, in _get_result
return field.get_result(
^^^^^^^^^^^^^^^^^
File "c:\Users\ju\Code\Pentalog\jiratool\.phoenix-venv\Lib\site-packages\strawberry\field.py", line 177, in get_result
return self.base_resolver(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "c:\Users\ju\Code\Pentalog\jiratool\.phoenix-venv\Lib\site-packages\strawberry\types\fields\resolver.py", line 187, in __call__
return self.wrapped_func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "c:\Users\ju\Code\Pentalog\jiratool\.phoenix-venv\Lib\site-packages\phoenix\server\api\types\EmbeddingDimension.py", line 414, in UMAPPoints
).generate(data, n_components=n_components)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "c:\Users\ju\Code\Pentalog\jiratool\.phoenix-venv\Lib\site-packages\phoenix\pointcloud\pointcloud.py", line 67, in generate
clusters = self.clustersFinder.find_clusters(projections)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "c:\Users\ju\Code\Pentalog\jiratool\.phoenix-venv\Lib\site-packages\phoenix\pointcloud\clustering.py", line 21, in find_clusters
cluster_ids: npt.NDArray[np.int_] = HDBSCAN(**asdict(self)).fit_predict(mat)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "c:\Users\ju\Code\Pentalog\jiratool\.phoenix-venv\Lib\site-packages\hdbscan\hdbscan_.py", line 1243, in fit_predict
self.fit(X)
File "c:\Users\ju\Code\Pentalog\jiratool\.phoenix-venv\Lib\site-packages\hdbscan\hdbscan_.py", line 1205, in fit
) = hdbscan(clean_data, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "c:\Users\ju\Code\Pentalog\jiratool\.phoenix-venv\Lib\site-packages\hdbscan\hdbscan_.py", line 884, in hdbscan
_tree_to_labels(
File "c:\Users\ju\Code\Pentalog\jiratool\.phoenix-venv\Lib\site-packages\hdbscan\hdbscan_.py", line 80, in _tree_to_labels
labels, probabilities, stabilities = get_clusters(
^^^^^^^^^^^^^
File "hdbscan\\_hdbscan_tree.pyx", line 659, in hdbscan._hdbscan_tree.get_clusters
File "hdbscan\\_hdbscan_tree.pyx", line 733, in hdbscan._hdbscan_tree.get_clusters
TypeError: 'numpy.float64' object cannot be interpreted as an integer
Stack (most recent call last):
File "C:\Users\ju\.pyenv\pyenv-win\versions\3.11.1\Lib\threading.py", line 995, in _bootstrap
self._bootstrap_inner()
File "C:\Users\ju\.pyenv\pyenv-win\versions\3.11.1\Lib\threading.py", line 1038, in _bootstrap_inner
self.run()
File "C:\Users\ju\.pyenv\pyenv-win\versions\3.11.1\Lib\threading.py", line 975, in run
self._target(*self._args, **self._kwargs)
File "c:\Users\ju\Code\Pentalog\jiratool\.phoenix-venv\Lib\site-packages\uvicorn\server.py", line 61, in run
return asyncio.run(self.serve(sockets=sockets))
File "C:\Users\ju\.pyenv\pyenv-win\versions\3.11.1\Lib\asyncio\runners.py", line 190, in run
return runner.run(main)
File "C:\Users\ju\.pyenv\pyenv-win\versions\3.11.1\Lib\asyncio\runners.py", line 118, in run
return self._loop.run_until_complete(task)
File "C:\Users\ju\.pyenv\pyenv-win\versions\3.11.1\Lib\asyncio\base_events.py", line 640, in run_until_complete
self.run_forever()
File "C:\Users\ju\.pyenv\pyenv-win\versions\3.11.1\Lib\asyncio\base_events.py", line 607, in run_forever
self._run_once()
File "C:\Users\ju\.pyenv\pyenv-win\versions\3.11.1\Lib\asyncio\base_events.py", line 1919, in _run_once
handle._run()
File "C:\Users\ju\.pyenv\pyenv-win\versions\3.11.1\Lib\asyncio\events.py", line 80, in _run
self._context.run(self._callback, *self._args)
File "c:\Users\ju\Code\Pentalog\jiratool\.phoenix-venv\Lib\site-packages\starlette\middleware\base.py", line 166, in coro
await self.app(scope, receive_or_disconnect, send_no_error)
File "c:\Users\ju\Code\Pentalog\jiratool\.phoenix-venv\Lib\site-packages\starlette\middleware\exceptions.py", line 62, in __call__
await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
File "c:\Users\ju\Code\Pentalog\jiratool\.phoenix-venv\Lib\site-packages\starlette\_exception_handler.py", line 44, in wrapped_app
await app(scope, receive, sender)
File "c:\Users\ju\Code\Pentalog\jiratool\.phoenix-venv\Lib\site-packages\starlette\routing.py", line 746, in __call__
await route.handle(scope, receive, send)
File "c:\Users\ju\Code\Pentalog\jiratool\.phoenix-venv\Lib\site-packages\starlette\routing.py", line 288, in handle
await self.app(scope, receive, send)
File "c:\Users\ju\Code\Pentalog\jiratool\.phoenix-venv\Lib\site-packages\strawberry\asgi\__init__.py", line 111, in __call__
return await self.handle_http(scope, receive, send)
File "c:\Users\ju\Code\Pentalog\jiratool\.phoenix-venv\Lib\site-packages\strawberry\asgi\__init__.py", line 178, in handle_http
response = await self.run(request)
File "c:\Users\ju\Code\Pentalog\jiratool\.phoenix-venv\Lib\site-packages\strawberry\http\async_base_view.py", line 176, in run
result = await self.execute_operation(
File "c:\Users\ju\Code\Pentalog\jiratool\.phoenix-venv\Lib\site-packages\strawberry\http\async_base_view.py", line 115, in execute_operation
return await self.schema.execute(
File "c:\Users\ju\Code\Pentalog\jiratool\.phoenix-venv\Lib\site-packages\strawberry\schema\schema.py", line 248, in execute
result = await execute(
File "c:\Users\ju\Code\Pentalog\jiratool\.phoenix-venv\Lib\site-packages\strawberry\schema\execute.py", line 156, in execute
process_errors(result.errors, execution_context)
To Reproduce
Steps to reproduce the behavior:
1. Run the NLP tutorial
2. Click on 'text_embedding' (the sketch below reproduces the same error outside Phoenix)
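The underlying error can also be reproduced with a bare HDBSCAN call on an affected hdbscan build; this is a sketch, not the exact call Phoenix makes, and the parameters are illustrative:

```python
import numpy as np
from hdbscan import HDBSCAN

# Stand-in for the 2-D UMAP projections Phoenix clusters; values are random
# and only meant to exercise the same fit_predict code path.
points = np.random.rand(500, 2)

# On an hdbscan build affected by the Cython 3 issue this call raises:
# TypeError: 'numpy.float64' object cannot be interpreted as an integer
labels = HDBSCAN(min_cluster_size=10).fit_predict(points)
print(labels[:10])
```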
Expected behavior
The page should match the screenshot shown in the tutorial.
Environment (please complete the following information):
OS: Windows 11
Notebook Runtime: Jupyter notebook & VS Code notebooks
Browser: Chrome 114.0.5735.199
Version: 0.0.30
@Kydlaw Thanks so much for filing a bug report! From looking at your stack trace, I believe you are hitting a bug in HDBSCAN that surfaced when Cython 3 launched. See scikit-learn-contrib/hdbscan#600 (comment)
We worked closely with the authors to get the issue resolved within HDBSCAN and have since pinned HDBSCAN more aggressively.
We believe it's fixed in 0.8.33 (see scikit-learn-contrib/hdbscan@7611cfe). Let us know if it's still broken after upgrading to that version. We can work with the authors to try to bottom it out!
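A quick way to confirm which build is installed before and after upgrading (a small sketch; the 0.8.33 threshold comes from the commit linked above, and the version parse assumes a plain X.Y.Z string):

```python
from importlib.metadata import version

# Check the hdbscan build installed in the Phoenix virtual environment.
installed = version("hdbscan")
print(f"hdbscan {installed}")

# 0.8.33 is the release believed to contain the fix; older builds may
# still hit the Cython 3 issue.
if tuple(int(p) for p in installed.split(".")[:3]) < (0, 8, 33):
    print("Consider upgrading: pip install --upgrade 'hdbscan>=0.8.33'")
```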