[ci] [python-package] [dask] some CI jobs broken, others skipping Dask tests #6365

Closed
jameslamb opened this issue Mar 17, 2024 · 4 comments

@jameslamb
Collaborator

jameslamb commented Mar 17, 2024

Description

All macOS and Linux regular CI jobs are failing with this error:

```text
Traceback (most recent call last):
created a Dask LocalCluster
  File "/Users/runner/work/LightGBM/LightGBM/examples/python-guide/dask/binary-classification.py", line 27, in <module>
distributing training data on the Dask cluster
    dask_model.fit(dX, dy)
beginning training
  File "/Users/runner/.local/lib/python3.9/site-packages/lightgbm/dask.py", line 1190, in fit
    self._lgb_dask_fit(
  File "/Users/runner/.local/lib/python3.9/site-packages/lightgbm/dask.py", line 1060, in _lgb_dask_fit
    raise LightGBMError("dask is required for lightgbm.dask")
lightgbm.basic.LightGBMError: dask is required for lightgbm.dask
```

Example recent build: ([build link](https://dev.azure.com/lightgbm-ci/lightgbm-ci/_build/results?buildId=15909&view=logs&j=c28dceab-947a-5848-c21f-eef3695e5f11&t=fa158246-17e2-53d4-5936-86070edbaacf)).

Others, like the `* sdist` jobs, are technically being marked successful, but all the `test_dask.py` tests are being skipped, and I see this in the logs:

```text
../../../home/AzDevOps_azpcontainer/miniforge/envs/test-env/lib/python3.11/site-packages/dask/dataframe/__init__.py:31
  /home/AzDevOps_azpcontainer/miniforge/envs/test-env/lib/python3.11/site-packages/dask/dataframe/__init__.py:31: FutureWarning: 
  Dask dataframe query planning is disabled because dask-expr is not installed.
  
  You can install it with `pip install dask[dataframe]` or `conda install dask`.
  This will raise in a future version.
```

Example recent build: (build link)

Reproducible example

Look at any Linux or macOS CI jobs in this project, such as those linked above.

Additional Comments

Why would this happen on macOS?

The dask unit tests are skipped on macOS:

if not platform.startswith("linux"):
    pytest.skip("lightgbm.dask is currently supported in Linux environments", allow_module_level=True)

But the lightgbm.dask examples from https://github.com/microsoft/LightGBM/tree/master/examples/python-guide/dask are still run on macOS for CI jobs with names like * regular.

for f in *.py **/*.py; do python $f || exit 1; done # run all examples

What might the root cause be?

The latest release of dask, v2024.3.1, was published to PyPI and conda-forge 2 days ago.

As of those releases, dask-expr is not a required runtime dependency of dask.

However, it seems that it now IS a requirement if you want to use dask.dataframe 🙃

docker run \
    --rm \
    -it python:3.10 \
    bash

pip install 'dask==2024.3.1' 'pandas==2.2.1'
python -c "from dask.dataframe import DataFrame"
ModuleNotFoundError: No module named 'dask_expr'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.10/site-packages/dask/dataframe/__init__.py", line 110, in <module>
    raise ImportError(msg) from e
ImportError: Dask dataframe requirements are not installed.

Since dask-expr wasn't added as a runtime dependency of the dask-core package on conda-forge, importing dask.dataframe in lightgbm's CI fails, which is probably what's causing these errors.

It IS a runtime dependency of the dask package on conda-forge (link).

So I guess our options for LightGBM's CI environments, assuming we want to keep getting conda packages for dask and distributed, are:

  • keep installing dask-core conda-forge package + pip install dask-expr
  • keep installing dask-core conda-forge package + install dask-expr conda-forge package
  • switch from dask-core conda-forge package to dask conda-forge package
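Whichever option is chosen, it may help to check what is actually present in a given environment. The following is a minimal sketch, not part of LightGBM's CI: the helper name `missing_modules` is hypothetical, and the module names passed to it are the ones that appear in the tracebacks in this thread. It uses `importlib.util.find_spec()`, which looks a module up without executing it, so it does not trigger import-time errors.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    missing = []
    for name in names:
        # find_spec() returns None for an absent top-level module and does
        # not execute the module, so import-time errors are not triggered.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Module names taken from the tracebacks in this thread.
    print(missing_modules(["dask", "dask_expr", "pyarrow"]))
```

An empty list means all of the listed modules are discoverable. It does not guarantee that importing them succeeds.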

References

Discussions about how this was rolled out:

dask-expr also adds a requirement of pandas>=2.0

@jameslamb
Collaborator Author

After switching from the dask-core package to the dask package on conda-forge, the installation issues here were resolved.

But I now see a new issue:

ValueError: For a 1d array, columns must be a scalar or single element list

Example stacktrace:

tests/python_package_test/test_dask.py:222: in _accuracy_score
    return da.average(dy_true == dy_pred).compute()
/opt/miniforge/envs/test-env/lib/python3.9/site-packages/dask_expr/_collection.py:160: in _wrap_expr_op
    other = from_dask_array(
/opt/miniforge/envs/test-env/lib/python3.9/site-packages/dask_expr/_collection.py:4730: in from_dask_array
    df = from_dask_array(x, columns=columns, index=index, meta=meta)
/opt/miniforge/envs/test-env/lib/python3.9/site-packages/dask/dataframe/io/io.py:433: in from_dask_array
    meta = _meta_from_array(x, columns, index, meta=meta)

build link: https://dev.azure.com/lightgbm-ci/lightgbm-ci/_build/results?buildId=15911&view=logs&j=c28dceab-947a-5848-c21f-eef3695e5f11&t=fa158246-17e2-53d4-5936-86070edbaacf

@jameslamb
Collaborator Author

Think I've identified the root cause of the issue, and documented it over in dask/dask#11006.

To unblock CI, I'm gonna try just .compute()-ing the inputs in test functions like these:

def _accuracy_score(dy_true, dy_pred):
    return da.average(dy_true == dy_pred).compute()

Those functions aren't testing anything Dask-specific like chunking, types, etc. ... just values. And they're operating on small test inputs, so the extra memory usage from .compute()-ing them should be negligible.

@jameslamb
Collaborator Author

The Dask tests are now running and passing on #6357 🎉

(build link)

But I found another problem... the check that we run to ensure that lightgbm can be imported even if its optional dependencies are unavailable is failing, like this:

ValueError: Must install dask-expr to activate query planning.
Full traceback:
Traceback (most recent call last):
  File "/opt/miniforge/envs/test-env/lib/python3.9/site-packages/dask/dataframe/__init__.py", line 22, in _dask_expr_enabled
    import dask_expr  # noqa: F401
  File "/opt/miniforge/envs/test-env/lib/python3.9/site-packages/dask_expr/__init__.py", line 3, in <module>
    from dask_expr import _version, datasets
  File "/opt/miniforge/envs/test-env/lib/python3.9/site-packages/dask_expr/datasets.py", line 8, in <module>
    from dask_expr._collection import new_collection
  File "/opt/miniforge/envs/test-env/lib/python3.9/site-packages/dask_expr/_collection.py", line 15, in <module>
    import pyarrow as pa
ModuleNotFoundError: No module named 'pyarrow'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/AzDevOps_azpcontainer/.local/lib/python3.9/site-packages/lightgbm/__init__.py", line 8, in <module>
    from .basic import Booster, Dataset, Sequence, register_logger
  File "/home/AzDevOps_azpcontainer/.local/lib/python3.9/site-packages/lightgbm/basic.py", line 21, in <module>
    from .compat import (
  File "/home/AzDevOps_azpcontainer/.local/lib/python3.9/site-packages/lightgbm/compat.py", line 162, in <module>
    from dask.dataframe import DataFrame as dask_DataFrame
  File "/opt/miniforge/envs/test-env/lib/python3.9/site-packages/dask/dataframe/__init__.py", line 87, in <module>
    if _dask_expr_enabled():
  File "/opt/miniforge/envs/test-env/lib/python3.9/site-packages/dask/dataframe/__init__.py", line 24, in _dask_expr_enabled
    raise ValueError("Must install dask-expr to activate query planning.")
ValueError: Must install dask-expr to activate query planning.

dask is raising an import-time ValueError, which lightgbm's import strategy doesn't catch.

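try:
    from dask.dataframe import DataFrame as dask_DataFrame  # other dask imports omitted from this excerpt
    DASK_INSTALLED = True
except ImportError:
    DASK_INSTALLED = False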

I put up a proposed change at dask/dask#11007 to fix that.
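To make the failure mode concrete: a guard that only catches ImportError lets any other import-time exception propagate. The sketch below is an illustration of a more defensive guard, not what LightGBM or dask adopted (the fix proposed in this thread was on the dask side). The function name `optional_import_ok` and its `extra_errors` parameter are hypothetical.

```python
import importlib


def optional_import_ok(module_name, extra_errors=(ValueError,)):
    """Return True if `module_name` imports cleanly.

    Returns False if the import fails with ImportError or with one of
    `extra_errors`. Any other exception propagates to the caller.
    """
    caught = (ImportError,) + tuple(extra_errors)
    try:
        importlib.import_module(module_name)
    except caught:
        return False
    return True


if __name__ == "__main__":
    # Treat an optional dependency as unavailable if it fails at import time
    # with either ImportError or ValueError.
    DASK_INSTALLED = optional_import_ok("dask.dataframe")
    print(DASK_INSTALLED)
```

Catching extra exception types hides real bugs in the imported library, which is one reason fixing the exception type upstream may be the cleaner route.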

@jameslamb
Collaborator Author

This was resolved by #6357.
