Error when appending to existing run through remote tracker #2125

Closed
jiyuanq opened this issue Aug 31, 2022 · 5 comments
Labels
help wanted Extra attention is needed phase / shipped Issue phase: shipped type / bug Issue type: something isn't working
@jiyuanq

jiyuanq commented Aug 31, 2022

🐛 Bug

  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/trainer.py", line 1286, in run_stage
    return self._run_evaluate()
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/trainer.py", line 1334, in _run_evaluate
    eval_loop_results = self._evaluation_loop.run()
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/loops/base.py", line 151, in run
    output = self.on_run_end()
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 137, in on_run_end
    eval_loop_results = self.trainer.logger_connector.update_eval_epoch_metrics()
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/connectors/logger_connector/logger_connector.py", line 182, in update_eval_epoch_metrics
    self.log_metrics(metrics["log"])
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/connectors/logger_connector/logger_connector.py", line 121, in log_metrics
    self.trainer.logger.save()
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/loggers/base.py", line 317, in save
    self._finalize_agg_metrics()
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/loggers/base.py", line 152, in _finalize_agg_metrics
    self.log_metrics(metrics=metrics_to_log, step=agg_step)
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/utilities/distributed.py", line 50, in wrapped_fn
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/aim/sdk/adapters/pytorch_lightning.py", line 117, in log_metrics
    self.experiment.track(v, name=name, step=step, context=context)
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/loggers/base.py", line 43, in experiment
    return get_experiment() or DummyExperiment()
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/utilities/distributed.py", line 50, in wrapped_fn
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/loggers/base.py", line 41, in get_experiment
    return fn(self)
  File "/usr/local/lib/python3.8/dist-packages/aim/sdk/adapters/pytorch_lightning.py", line 64, in experiment
    self._run = Run(
  File "/usr/local/lib/python3.8/dist-packages/aim/sdk/run.py", line 345, in __init__
    self._prepare_resource_tracker(system_tracking_interval, capture_terminal_logs)
  File "/usr/local/lib/python3.8/dist-packages/aim/sdk/run.py", line 407, in _prepare_resource_tracker
    current_logs = self.get_terminal_logs()
  File "/usr/local/lib/python3.8/dist-packages/aim/sdk/run.py", line 595, in get_terminal_logs
    return self._get_sequence('logs', 'logs', Context({}))
  File "/usr/local/lib/python3.8/dist-packages/aim/sdk/run.py", line 638, in _get_sequence
    return sequence if bool(sequence) else None
  File "/usr/local/lib/python3.8/dist-packages/aim/sdk/sequence.py", line 332, in __bool__
    return bool(self.values)
  File "aim/storage/treearrayview.py", line 47, in aim.storage.treearrayview.TreeArrayView.__bool__
  File "aim/storage/treearrayview.py", line 41, in aim.storage.treearrayview.TreeArrayView.__len__
  File "aim/storage/treearrayview.py", line 117, in aim.storage.treearrayview.TreeArrayView.last_idx
  File "/usr/local/lib/python3.8/dist-packages/aim/storage/treeviewproxy.py", line 254, in last_key
    return self.tree.last_key(self.absolute_path(path))
  File "/usr/local/lib/python3.8/dist-packages/aim/storage/treeviewproxy.py", line 149, in last_key
    return self._rpc_client.run_instruction(self._hash, self._handler, 'last', (path,))
  File "/usr/local/lib/python3.8/dist-packages/aim/ext/transport/client.py", line 111, in run_instruction
    return self._run_read_instructions(queue_id, resource, method, args)
  File "/usr/local/lib/python3.8/dist-packages/aim/ext/transport/client.py", line 136, in _run_read_instructions
    raise_exception(status_msg.header.exception)
  File "/usr/local/lib/python3.8/dist-packages/aim/ext/transport/message_utils.py", line 76, in raise_exception
    raise exception(*args) if args else exception()
AttributeError: 'aim.storage.containertreeview.ContainerTreeView' object has no attribute 'last'

To reproduce

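With PyTorch Lightning, first run training, then run evaluation with the same AimLogger and the same run hash, so that evaluation appends to the existing run through the remote tracker.

I checked the code (https://github.com/aimhubio/aim/blob/main/aim/storage/treeviewproxy.py#L149). Is this actually a typo? The proxy sends the method name 'last', but the error shows that ContainerTreeView has no such attribute, so I guess it should be 'last_key' instead of 'last'.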
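To illustrate the suspected failure mode, here is a minimal sketch of name-based dispatch. The classes and functions below (`ContainerTreeViewStub`, `run_instruction`) are hypothetical stand-ins written for this example, not Aim's actual implementation; the only assumption taken from the traceback is that the remote side resolves the requested method by its string name.

```python
class ContainerTreeViewStub:
    """Hypothetical stand-in for the server-side tree view (not Aim's class)."""

    def __init__(self, data):
        self._data = dict(data)

    def last_key(self, path=()):
        # Return the largest key stored in the view.
        return max(self._data)


def run_instruction(resource, method, args):
    """Hypothetical dispatch: look the method up by name, then call it."""
    return getattr(resource, method)(*args)


view = ContainerTreeViewStub({0: "a", 1: "b", 2: "c"})

# A misspelled method name only fails at call time, on the receiving side.
try:
    run_instruction(view, "last", ((),))
except AttributeError as exc:
    error = str(exc)

# The correctly spelled name resolves and returns the last key.
result = run_instruction(view, "last_key", ((),))
```

Because the method name travels as a plain string, neither the interpreter nor a linter flags the mismatch; it surfaces only when that code path is exercised against a remote repo.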

Expected behavior

It should work correctly, as it does in the non-remote tracker case.

Environment

  • Aim Version: 3.13.0
  • Python version: 3.8.10
  • OS (e.g., Linux): Ubuntu 20.04
@jiyuanq jiyuanq added help wanted Extra attention is needed type / bug Issue type: something isn't working labels Aug 31, 2022
@gorarakelyan
Contributor

@mihran113 could you please take a look at this?

@mihran113
Contributor

Hey @jiyuanq. Thanks a lot for the report. That seems to be a typo actually. Will do the fix ASAP and we'll ship it with the next patch release.

@mihran113 mihran113 self-assigned this Aug 31, 2022
@mihran113 mihran113 moved this to Current in Aim roadmap Aug 31, 2022
@mihran113 mihran113 added this to the v3.13.x milestone Aug 31, 2022
@mihran113 mihran113 added phase / review-needed Issue phase: issues that are done and needs review phase / ready-to-go Issue phase: issues that are merged and will be included in the upcoming release and removed phase / review-needed Issue phase: issues that are done and needs review labels Aug 31, 2022
@jiyuanq
Author

jiyuanq commented Sep 1, 2022

Thank you for the quick fix! Do we have a test case that covers this scenario? If not, it might be a good idea to add one.

@mihran113
Contributor

Unfortunately not; we don't have test cases for remote tracking yet.
We'll make sure to cover this scenario when we add them. 🙌
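One possible shape for such a test, as a rough sketch only: route a proxy through an in-process fake client and compare its results with a local object. All names here (`LocalTree`, `InProcessClient`, `TreeProxy`, `check_proxy_matches_local`) are hypothetical and do not correspond to Aim's real classes or test suite.

```python
class LocalTree:
    """Hypothetical local (non-remote) tree with key lookups."""

    def __init__(self, data):
        self._data = dict(data)

    def first_key(self, path=()):
        return min(self._data)

    def last_key(self, path=()):
        return max(self._data)


class InProcessClient:
    """Hypothetical fake RPC client: dispatches by method name, no network."""

    def __init__(self, resource):
        self._resource = resource

    def run_instruction(self, method, args):
        return getattr(self._resource, method)(*args)


class TreeProxy:
    """Hypothetical proxy that forwards each call as a (name, args) pair."""

    def __init__(self, client):
        self._client = client

    def first_key(self, path=()):
        return self._client.run_instruction("first_key", (path,))

    def last_key(self, path=()):
        return self._client.run_instruction("last_key", (path,))


def check_proxy_matches_local(data, names=("first_key", "last_key")):
    """Return the proxy methods that fail or disagree with the local tree."""
    local = LocalTree(data)
    proxy = TreeProxy(InProcessClient(LocalTree(data)))
    mismatches = []
    for name in names:
        try:
            if getattr(proxy, name)() != getattr(local, name)():
                mismatches.append(name)
        except AttributeError:
            # A misspelled forwarded name ends up here.
            mismatches.append(name)
    return mismatches
```

A check like this would have reported `last_key` as a mismatch if the proxy had forwarded the name `'last'`, without needing a running remote tracking server.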

@alberttorosyan alberttorosyan added phase / shipped Issue phase: shipped and removed phase / ready-to-go Issue phase: issues that are merged and will be included in the upcoming release labels Sep 5, 2022
@gorarakelyan
Contributor

Hey @jiyuanq. The fix has been shipped with Aim v3.13.1. Thanks for reporting!

@gorarakelyan gorarakelyan moved this from Current to Done in Aim roadmap Sep 12, 2022