[AIR] Tracking issues in AIR examples tested on Windows #27851
Comments
It might make sense to split these up into groups: fails to run to completion, emits internal deprecation warnings, emits other warnings, fails to produce the correct answer. Some examples may appear in more than one group.
In an effort to make the notes more useful, I've grouped the issues I encountered above together so that we can more easily track them and check them off the list as they are addressed. They're grouped into internal warnings, errors, other warnings, and other todo items. I've labeled problems that I think are specific to a single example with the name of the affected example. @mattip Hopefully this is useful, but if you still think it would be good to group the examples by the problems encountered, I can do that too.
Internal warnings [All four are fixed by #28315]
Other warnings
Errors - fails to run
Affects the following examples:
Other todo items
DetailsExtra
Example information to include near the top of the notebook:
# import comet_ml at the top of your file
from comet_ml import Experiment
import os

comet_project = "ray_air_example"

# Create an experiment with your API key
experiment = Experiment(
    api_key=os.environ['COMET_API_KEY'],
    project_name=comet_project,
    workspace="your_user_name",
)

# Rest of the example goes here
experiment.end()
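If COMET_API_KEY is not set, the os.environ lookup above fails with a bare KeyError, which is not very helpful inside a notebook. A minimal sketch of a clearer check follows; require_env is a hypothetical helper name, not part of comet_ml or Ray, and the error wording is only a suggestion.

```python
import os


def require_env(name):
    """Return the value of environment variable `name`.

    Hypothetical helper: raises a descriptive error instead of a bare
    KeyError when the variable is missing or empty.
    """
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"Set the {name} environment variable before running this notebook."
        )
    return value


# Intended use in the notebook (left as a comment so this sketch runs
# even when the variable is not set):
# api_key = require_env("COMET_API_KEY")
```

The same helper would work for WANDB_API_KEY in the wandb example.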
import wandb
wandb_project = "ray_air_example"
entity = "peytondmurray"
wandb.init(project=wandb_project, entity=entity)
from ray.ml.preprocessors import Chain, OrdinalEncoder, SimpleImputer
changes to
from ray.data.preprocessors import Chain, OrdinalEncoder, SimpleImputer
and
from ray.ml.checkpoint import Checkpoint
from ray.ml.predictors.integrations.xgboost import XGBoostPredictor
changes to
from ray.air.checkpoint import Checkpoint
from ray.train.xgboost import XGBoostPredictor
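For notebooks that have to run against both the old and the new module layout, one option is a small fallback import. This is only a sketch: import_first_available is a hypothetical helper, not part of Ray, and the demonstration uses stand-in module names so that it runs without Ray installed.

```python
import importlib


def import_first_available(module_names):
    """Return the first module in `module_names` that can be imported.

    Hypothetical helper: candidates are tried in order, so list the
    preferred (newest) module path first.
    """
    errors = []
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError as exc:  # also covers ModuleNotFoundError
            errors.append(f"{name}: {exc}")
    raise ImportError(
        "None of the candidates could be imported: " + "; ".join(errors)
    )


# Demonstration with stand-in names: the first does not exist, the second does.
module = import_first_available(["module_that_does_not_exist_xyz", "json"])
print(module.__name__)
```

With Ray installed, the candidates for the Checkpoint import would be "ray.air.checkpoint" followed by "ray.ml.checkpoint", and the class would then be read from the returned module.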
The "Bad file descriptor" in |
Does the rllib initialization error in |
Thanks! This was quite an effort!
@mattip I just tried running the I also tried creating an environment from scratch by doing pip install -ve . .[rllib] .[air] .[tune] .[data] gym tensorflow pygame ( Here's the output of Outdated packages
Edit: it looks like at least for |
What happened + What you expected to happen
I've spent some time testing the AIR examples on Windows. This issue is intended to track issues encountered while running through the examples manually. Each example has its own collapsible section - let me know if a better format for these notes would be preferred.
1. torch_image_example
2. convert_existing_pytorch_code_to_ray_air
No additional notes, example worked as intended.
3. tfx_tabular_train_to_serve
During the fit several warnings were generated before the fit failed.
Deprecation warnings:
placement_group_parameter
object_store_memory
placement_group
placement_group_bundle_index
placement_group_capture_child_tasks
pandas deprecation warning:
4. huggingface_text_classification
When running result = trainer.fit() the fit fails with the following error:
5. sklearn_example
No additional notes, example worked as intended.
6. xgboost_example
ray.worker.get_resource_ids deprecation warning
A stack trace is printed as a result of ray.worker.get_resource_ids being called; python/ray/__init__.py prints the stack for several of these deprecation warnings - see ray/python/ray/__init__.py, line 201 in ea47d97
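For anyone reproducing these notes, the warnings an example emits can be recorded and counted instead of read off the notebook output, which makes it easier to see which ones repeat across examples. The following is a minimal sketch using only the standard warnings module; legacy_call is a hypothetical stand-in for a deprecated API such as ray.worker.get_resource_ids, and run_and_count_warnings is likewise an illustrative name.

```python
import warnings
from collections import Counter


def legacy_call():
    # Hypothetical stand-in for a deprecated API call.
    warnings.warn("legacy_call is deprecated", DeprecationWarning, stacklevel=2)
    return 42


def run_and_count_warnings(fn):
    """Run `fn` and return (result, Counter of (category name, message))."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every occurrence, not just the first
        result = fn()
    counts = Counter((w.category.__name__, str(w.message)) for w in caught)
    return result, counts


def example():
    # Stand-in for an example notebook cell that hits the deprecated API twice.
    legacy_call()
    legacy_call()
    return "done"


result, counts = run_and_count_warnings(example)
for (category, message), n in counts.items():
    print(f"{n}x {category}: {message}")
```

Note that this only sees warnings raised in the driver process; warnings emitted inside Ray worker processes would not be captured this way.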
7. analyze_tuning_results
Same DeprecationWarning about ray.worker.get_resource_ids as in xgboost_example
Same call to traceback.print_stack as in xgboost_example
8. lightgbm_example
2*num_workers + 1. Not sure if this is intended.
Same DeprecationWarning about ray.worker.get_resource_ids as in xgboost_example
Same call to traceback.print_stack as in xgboost_example
9. torch_incremental_learning
During the first training step:
No parallelism argument was directly specified in the example. However, a warning about a parallelism argument was generated for some internal call:
pandas warnings generated about setting with copy:
ray.serve.api deprecation warning
ray.serve.deployment deprecation warning
Also, the same errors about setting on a copy of a dataframe and read-only numpy arrays appeared in the second training step.
10. rl_serving_example
flatbuffers uses the deprecated imp module. The flatbuffers maintainers have not made a release to handle this, see New Python release to avoid Deprecation Warning google/flatbuffers#6957. Likely nothing to be done here, as new versions of flatbuffers have not been released to PyPI:
keras uses deprecated pillow functions. This is fixed by a recent commit to keras: Fix usage of deprecated Pillow interpolation methods keras-team/keras#16746, so will likely go away in the future when this dependency is updated:
rllib deprecation warnings due to invalid escape sequences at the start of fit:
Deprecation warnings - same as in tfx_tabular_train_to_serve example:
placement_group_parameter
object_store_memory
placement_group
placement_group_bundle_index
placement_group_capture_child_tasks
Deprecation warnings coming from gym related to using CartPole-v0:
rllib appears to be trying to initialize an object with a shape given by a tensorflow.python.framework.tensor_shape.Dimension, when it should use integers:
11. rl_online_example
rllib.agents.marwil deprecation warning:
Deprecation warnings - same as in tfx_tabular_train_to_serve example:
placement_group_parameter
object_store_memory
placement_group
placement_group_bundle_index
placement_group_capture_child_tasks
Same flatbuffers, keras, and CartPole-v0 warnings as in rl_serving_example:
flatbuffers use of imp
keras calling deprecated pillow functions
CartPole-v0 out of date
Fails to train due to the same Dimension error as in rl_serving_example:
12. rl_offline_example
Same errors as in the rl_online_example:
Same flatbuffers, keras, and CartPole-v0 warnings as in rl_online_example:
flatbuffers use of imp
keras calling deprecated pillow functions
CartPole-v0 out of date
placement_group_parameter
object_store_memory
placement_group
placement_group_bundle_index
placement_group_capture_child_tasks
rllib.agents.marwil
Fit fails because of same tensorflow.python.framework.tensor_shape.Dimension error as in rl_online_example
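On the invalid escape sequence warnings reported for rllib in rl_serving_example: warnings of this kind typically come from a backslash sequence such as \d inside a normal (non-raw) string literal, and the usual fix is a raw string. The sketch below reproduces the warning with the standard library only; compile_warnings is a hypothetical helper and the pattern strings are made up for illustration, not taken from rllib. The warning category depends on the Python version (DeprecationWarning in older releases, SyntaxWarning from 3.12).

```python
import warnings


def compile_warnings(source):
    """Compile `source` and return the names of the warning categories emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<example>", "exec")
    return [w.category.__name__ for w in caught]


# Source text containing "\d" in a normal string literal: invalid escape sequence.
bad_source = r'pattern = "\d+"'
# Same literal written as a raw string: no escape processing, no warning.
good_source = r'pattern = r"\d+"'

print("normal string:", compile_warnings(bad_source))
print("raw string:   ", compile_warnings(good_source))
```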
13. upload_to_comet_ml
Same DeprecationWarning about ray.worker.get_resource_ids as in xgboost_example
Same call to traceback.print_stack as in xgboost_example
Set the COMET_API_KEY environment variable. Defining comet_project at the top of the notebook also avoids having it undefined further down. For example:
Call experiment.end() at the end of the notebook
14. upload_to_wandb
Same DeprecationWarning about ray.worker.get_resource_ids as in xgboost_example
Same call to traceback.print_stack as in xgboost_example
Set the WANDB_API_KEY environment variable to your API key
Call wandb.init, taking care to set the entity - the API docs say it's optional, but I kept getting an error until I set it:
15. feast_example
comet_ml deprecation warning about imp module (?):
Deprecation warnings:
placement_group_parameter
object_store_memory
placement_group
placement_group_bundle_index
placement_group_capture_child_tasks
Same DeprecationWarning about ray.worker.get_resource_ids as in xgboost_example
Same call to traceback.print_stack as in xgboost_example
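Some broken Ray imports:
from ray.ml.preprocessors import Chain, OrdinalEncoder, SimpleImputer
needs to be changed to
from ray.data.preprocessors import Chain, OrdinalEncoder, SimpleImputer
and
from ray.ml.checkpoint import Checkpoint
from ray.ml.predictors.integrations.xgboost import XGBoostPredictor
needs to be changed to
from ray.air.checkpoint import Checkpoint
from ray.train.xgboost import XGBoostPredictor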
Versions / Dependencies
platform.platform(): Linux-5.10.16.3-microsoft-standard-WSL2-x86_64-with-glibc2.31
The latest development version of ray (410fe1b) was installed from master via "pip install -ve ." All other Python dependencies for the examples were installed through pip, as needed. No GPU was used in testing the examples.
Environment
Reproduction script
Here are the notebooks for the examples I ran. Some of them may vary slightly from the published examples; most of the time I bumped up the number of CPUs. In cases where an example couldn't run without being fixed, I made the changes needed to make it run, if possible.
ipynbs.zip
Issue Severity
Low: It annoys or frustrates me.