OS Platform and Distribution (e.g., Linux Ubuntu 16.04): 16.04
Ray installed from (source or binary): wheels
Ray version: master
Python version: 3.6
Exact command to reproduce: Run atari-a2c in tuned_examples.
Describe the problem
In the middle of a Tune experiment, Redis seems to drop the connection (this is a single-node setup).
Source code / logs
== Status ==
Using FIFO scheduling algorithm.
Resources requested: 12/12 CPUs, 4/4 GPUs
Result logdir: /root/ray_results/gym/pusher-2d/image-reach/20180818-image-reach-spatial-softmax-lsp-4-lsp
PENDING trials:
- mujoco-runner_4_discount=0.99,arm_goal_distance_cost_coeff=1.0,image_size=32x32x3,seed=5: PENDING
RUNNING trials:
- mujoco-runner_0_discount=0.99,arm_goal_distance_cost_coeff=1.0,image_size=32x32x3,seed=1: RUNNING [pid=7045], 27849 s, 607 ts, -101 acc
- mujoco-runner_1_discount=0.99,arm_goal_distance_cost_coeff=1.0,image_size=32x32x3,seed=2: RUNNING [pid=7047], 27832 s, 629 ts, -68.9 acc
- mujoco-runner_2_discount=0.99,arm_goal_distance_cost_coeff=1.0,image_size=32x32x3,seed=3: RUNNING [pid=7051], 27827 s, 614 ts, -57.2 acc
- mujoco-runner_3_discount=0.99,arm_goal_distance_cost_coeff=1.0,image_size=32x32x3,seed=4: RUNNING [pid=7049], 27817 s, 630 ts, -118 acc
Traceback (most recent call last):
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 911, in _process_task
self._store_outputs_in_objstore(return_object_ids, outputs)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 839, in _store_outputs_in_objstore
self.put_object(object_ids[i], outputs[i])
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 368, in put_object
self.store_and_register(object_id, value)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 303, in store_and_register
serialization_context=self.serialization_context)
File "pyarrow/_plasma.pyx", line 396, in pyarrow._plasma.PlasmaClient.put
File "pyarrow/_plasma.pyx", line 300, in pyarrow._plasma.PlasmaClient.create
File "pyarrow/error.pxi", line 83, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: Encountered unexpected EOF
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/workers/default_worker.py", line 69, in <module>
ray.worker.global_worker.main_loop()
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 1044, in main_loop
self._wait_for_and_process_task(task)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 1003, in _wait_for_and_process_task
self._process_task(task)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 915, in _process_task
ray.utils.format_error_message(traceback.format_exc()))
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 925, in _handle_process_task_failure
self._store_outputs_in_objstore(return_object_ids, failure_objects)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 839, in _store_outputs_in_objstore
self.put_object(object_ids[i], outputs[i])
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 368, in put_object
self.store_and_register(object_id, value)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 303, in store_and_register
serialization_context=self.serialization_context)
File "pyarrow/_plasma.pyx", line 396, in pyarrow._plasma.PlasmaClient.put
File "pyarrow/_plasma.pyx", line 300, in pyarrow._plasma.PlasmaClient.create
File "pyarrow/error.pxi", line 83, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: Broken pipe
This error is unexpected and should not have happened. Somehow a worker
crashed in an unanticipated way causing the main_loop to throw an exception,
which is being caught in "python/ray/workers/default_worker.py".
Traceback (most recent call last):
File "/opt/conda/envs/softlearning/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/opt/conda/envs/softlearning/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/root/softqlearning-private/examples/mujoco_all_ray.py", line 223, in <module>
main()
File "/root/softqlearning-private/examples/mujoco_all_ray.py", line 218, in main
for policy, variant_spec in zip(args.policy, variant_specs)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/tune/tune.py", line 91, in run_experiments
runner.step()
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/tune/trial_runner.py", line 103, in step
self._process_events()
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/tune/trial_runner.py", line 252, in _process_events
[result_id], _ = ray.wait(list(self._running))
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 2879, in wait
object_id_strs, timeout, num_returns)
File "pyarrow/_plasma.pyx", line 590, in pyarrow._plasma.PlasmaClient.wait
File "pyarrow/error.pxi", line 83, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: Encountered unexpected EOF
sys:1: ResourceWarning: unclosed file <_io.TextIOWrapper name='/root/ray_results/gym/pusher-2d/image-reach/20180818-image-reach-spatial-softmax-lsp-4-lsp/mujoco-runner_3_discount=0.99,arm_goal_distance_cost_coeff=1.0,image_size=32x32x3,seed=4_2018-08-18_23-09-00h15_vgl_/progress.csv' mode='w' encoding='UTF-8'>
sys:1: ResourceWarning: unclosed file <_io.TextIOWrapper name='/root/ray_results/gym/pusher-2d/image-reach/20180818-image-reach-spatial-softmax-lsp-4-lsp/mujoco-runner_3_discount=0.99,arm_goal_distance_cost_coeff=1.0,image_size=32x32x3,seed=4_2018-08-18_23-09-00h15_vgl_/result.json' mode='w' encoding='UTF-8'>
sys:1: ResourceWarning: unclosed file <_io.TextIOWrapper name='/root/ray_results/gym/pusher-2d/image-reach/20180818-image-reach-spatial-softmax-lsp-4-lsp/mujoco-runner_2_discount=0.99,arm_goal_distance_cost_coeff=1.0,image_size=32x32x3,seed=3_2018-08-18_23-09-00fxuq4tqx/progress.csv' mode='w' encoding='UTF-8'>
sys:1: ResourceWarning: unclosed file <_io.TextIOWrapper name='/root/ray_results/gym/pusher-2d/image-reach/20180818-image-reach-spatial-softmax-lsp-4-lsp/mujoco-runner_2_discount=0.99,arm_goal_distance_cost_coeff=1.0,image_size=32x32x3,seed=3_2018-08-18_23-09-00fxuq4tqx/result.json' mode='w' encoding='UTF-8'>
sys:1: ResourceWarning: unclosed file <_io.TextIOWrapper name='/root/ray_results/gym/pusher-2d/image-reach/20180818-image-reach-spatial-softmax-lsp-4-lsp/mujoco-runner_1_discount=0.99,arm_goal_distance_cost_coeff=1.0,image_size=32x32x3,seed=2_2018-08-18_23-09-00h2mqxkgy/progress.csv' mode='w' encoding='UTF-8'>
sys:1: ResourceWarning: unclosed file <_io.TextIOWrapper name='/root/ray_results/gym/pusher-2d/image-reach/20180818-image-reach-spatial-softmax-lsp-4-lsp/mujoco-runner_1_discount=0.99,arm_goal_distance_cost_coeff=1.0,image_size=32x32x3,seed=2_2018-08-18_23-09-00h2mqxkgy/result.json' mode='w' encoding='UTF-8'>
sys:1: ResourceWarning: unclosed file <_io.TextIOWrapper name='/root/ray_results/gym/pusher-2d/image-reach/20180818-image-reach-spatial-softmax-lsp-4-lsp/mujoco-runner_0_discount=0.99,arm_goal_distance_cost_coeff=1.0,image_size=32x32x3,seed=1_2018-08-18_23-09-000paimgmv/progress.csv' mode='w' encoding='UTF-8'>
sys:1: ResourceWarning: unclosed file <_io.TextIOWrapper name='/root/ray_results/gym/pusher-2d/image-reach/20180818-image-reach-spatial-softmax-lsp-4-lsp/mujoco-runner_0_discount=0.99,arm_goal_distance_cost_coeff=1.0,image_size=32x32x3,seed=1_2018-08-18_23-09-000paimgmv/result.json' mode='w' encoding='UTF-8'>
Traceback (most recent call last):
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 484, in connect
sock = self._connect()
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 541, in _connect
raise err
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 529, in _connect
sock.connect(socket_address)
ConnectionRefusedError: [Errno 111] Connection refused
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/client.py", line 667, in execute_command
connection.send_command(*args)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 610, in send_command
self.send_packed_command(self.pack_command(*args))
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 585, in send_packed_command
self.connect()
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 489, in connect
raise ConnectionError(self._error_message(e))
redis.exceptions.ConnectionError: Error 111 connecting to 172.17.0.2:25143. Connection refused.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 484, in connect
sock = self._connect()
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 541, in _connect
raise err
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 529, in _connect
sock.connect(socket_address)
ConnectionRefusedError: [Errno 111] Connection refused
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 2183, in connect
ray.services.check_version_info(worker.redis_client)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/services.py", line 382, in check_version_info
redis_reply = redis_client.get("VERSION_INFO")
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/client.py", line 976, in get
return self.execute_command('GET', name)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/client.py", line 673, in execute_command
connection.send_command(*args)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 610, in send_command
self.send_packed_command(self.pack_command(*args))
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 585, in send_packed_command
self.connect()
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 489, in connect
raise ConnectionError(self._error_message(e))
redis.exceptions.ConnectionError: Error 111 connecting to 172.17.0.2:25143. Connection refused.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 484, in connect
sock = self._connect()
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 541, in _connect
raise err
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 529, in _connect
sock.connect(socket_address)
ConnectionRefusedError: [Errno 111] Connection refused
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/client.py", line 667, in execute_command
connection.send_command(*args)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 610, in send_command
self.send_packed_command(self.pack_command(*args))
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 585, in send_packed_command
self.connect()
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 489, in connect
raise ConnectionError(self._error_message(e))
redis.exceptions.ConnectionError: Error 111 connecting to 172.17.0.2:25143. Connection refused.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 484, in connect
sock = self._connect()
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 541, in _connect
raise err
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 529, in _connect
sock.connect(socket_address)
ConnectionRefusedError: [Errno 111] Connection refused
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/workers/default_worker.py", line 55, in <module>
info, mode=ray.WORKER_MODE, use_raylet=(args.raylet_name is not None))
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 2194, in connect
driver_id=None)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/utils.py", line 117, in push_error_to_driver_through_redis
"data": data
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/client.py", line 2011, in hmset
return self.execute_command('HMSET', name, *items)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/client.py", line 673, in execute_command
connection.send_command(*args)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 610, in send_command
self.send_packed_command(self.pack_command(*args))
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 585, in send_packed_command
self.connect()
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 489, in connect
raise ConnectionError(self._error_message(e))
redis.exceptions.ConnectionError: Error 111 connecting to 172.17.0.2:25143. Connection refused.
Traceback (most recent call last):
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 911, in _process_task
self._store_outputs_in_objstore(return_object_ids, outputs)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 839, in _store_outputs_in_objstore
self.put_object(object_ids[i], outputs[i])
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 368, in put_object
self.store_and_register(object_id, value)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 303, in store_and_register
serialization_context=self.serialization_context)
File "pyarrow/_plasma.pyx", line 396, in pyarrow._plasma.PlasmaClient.put
File "pyarrow/_plasma.pyx", line 300, in pyarrow._plasma.PlasmaClient.create
File "pyarrow/error.pxi", line 83, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: Broken pipe
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/workers/default_worker.py", line 69, in <module>
ray.worker.global_worker.main_loop()
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 1044, in main_loop
self._wait_for_and_process_task(task)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 1003, in _wait_for_and_process_task
self._process_task(task)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 915, in _process_task
ray.utils.format_error_message(traceback.format_exc()))
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 925, in _handle_process_task_failure
self._store_outputs_in_objstore(return_object_ids, failure_objects)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 839, in _store_outputs_in_objstore
self.put_object(object_ids[i], outputs[i])
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 368, in put_object
self.store_and_register(object_id, value)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 303, in store_and_register
serialization_context=self.serialization_context)
File "pyarrow/_plasma.pyx", line 396, in pyarrow._plasma.PlasmaClient.put
File "pyarrow/_plasma.pyx", line 300, in pyarrow._plasma.PlasmaClient.create
File "pyarrow/error.pxi", line 83, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: Broken pipe
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 177, in _read_from_socket
raise socket.error(SERVER_CLOSED_CONNECTION_ERROR)
OSError: Connection closed by server.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/client.py", line 668, in execute_command
return self.parse_response(connection, command_name, **options)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/client.py", line 680, in parse_response
response = connection.read_response()
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 624, in read_response
response = self._parser.read_response()
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 284, in read_response
response = self._buffer.readline()
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 216, in readline
self._read_from_socket()
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 191, in _read_from_socket
(e.args,))
redis.exceptions.ConnectionError: Error while reading from socket: ('Connection closed by server.',)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 484, in connect
sock = self._connect()
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 541, in _connect
raise err
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 529, in _connect
sock.connect(socket_address)
ConnectionRefusedError: [Errno 111] Connection refused
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/workers/default_worker.py", line 76, in <module>
driver_id=None)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/utils.py", line 76, in push_error_to_driver
"data": data
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/client.py", line 2011, in hmset
return self.execute_command('HMSET', name, *items)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/client.py", line 673, in execute_command
connection.send_command(*args)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 610, in send_command
self.send_packed_command(self.pack_command(*args))
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 585, in send_packed_command
self.connect()
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 489, in connect
raise ConnectionError(self._error_message(e))
redis.exceptions.ConnectionError: Error 111 connecting to 172.17.0.2:25143. Connection refused.
sys:1: ResourceWarning: unclosed file <_io.TextIOWrapper name='/root/ray_results/gym/pusher-2d/image-reach/20180818-image-reach-spatial-softmax-lsp-4-lsp/mujoco-runner_2_discount=0.99,arm_goal_distance_cost_coeff=1.0,image_size=32x32x3,seed=3_2018-08-18_23-09-00fxuq4tqx/rllab-logger/debug.log' mode='a' encoding='UTF-8'>
sys:1: ResourceWarning: unclosed file <_io.TextIOWrapper name='/root/ray_results/gym/pusher-2d/image-reach/20180818-image-reach-spatial-softmax-lsp-4-lsp/mujoco-runner_2_discount=0.99,arm_goal_distance_cost_coeff=1.0,image_size=32x32x3,seed=3_2018-08-18_23-09-00fxuq4tqx/rllab-logger/progress.csv' mode='w' encoding='UTF-8'>
Traceback (most recent call last):
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 911, in _process_task
self._store_outputs_in_objstore(return_object_ids, outputs)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 839, in _store_outputs_in_objstore
self.put_object(object_ids[i], outputs[i])
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 368, in put_object
self.store_and_register(object_id, value)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 303, in store_and_register
serialization_context=self.serialization_context)
File "pyarrow/_plasma.pyx", line 396, in pyarrow._plasma.PlasmaClient.put
File "pyarrow/_plasma.pyx", line 300, in pyarrow._plasma.PlasmaClient.create
File "pyarrow/error.pxi", line 83, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: Broken pipe
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/workers/default_worker.py", line 69, in <module>
ray.worker.global_worker.main_loop()
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 1044, in main_loop
self._wait_for_and_process_task(task)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 1003, in _wait_for_and_process_task
self._process_task(task)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 915, in _process_task
ray.utils.format_error_message(traceback.format_exc()))
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 925, in _handle_process_task_failure
self._store_outputs_in_objstore(return_object_ids, failure_objects)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 839, in _store_outputs_in_objstore
self.put_object(object_ids[i], outputs[i])
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 368, in put_object
self.store_and_register(object_id, value)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 303, in store_and_register
serialization_context=self.serialization_context)
File "pyarrow/_plasma.pyx", line 396, in pyarrow._plasma.PlasmaClient.put
File "pyarrow/_plasma.pyx", line 300, in pyarrow._plasma.PlasmaClient.create
File "pyarrow/error.pxi", line 83, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: Broken pipe
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 177, in _read_from_socket
raise socket.error(SERVER_CLOSED_CONNECTION_ERROR)
OSError: Connection closed by server.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/client.py", line 668, in execute_command
return self.parse_response(connection, command_name, **options)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/client.py", line 680, in parse_response
response = connection.read_response()
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 624, in read_response
response = self._parser.read_response()
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 284, in read_response
response = self._buffer.readline()
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 216, in readline
self._read_from_socket()
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 191, in _read_from_socket
(e.args,))
redis.exceptions.ConnectionError: Error while reading from socket: ('Connection closed by server.',)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 484, in connect
sock = self._connect()
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 541, in _connect
raise err
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 529, in _connect
sock.connect(socket_address)
ConnectionRefusedError: [Errno 111] Connection refused
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/workers/default_worker.py", line 76, in <module>
driver_id=None)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/utils.py", line 76, in push_error_to_driver
"data": data
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/client.py", line 2011, in hmset
return self.execute_command('HMSET', name, *items)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/client.py", line 673, in execute_command
connection.send_command(*args)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 610, in send_command
self.send_packed_command(self.pack_command(*args))
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 585, in send_packed_command
self.connect()
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 489, in connect
raise ConnectionError(self._error_message(e))
redis.exceptions.ConnectionError: Error 111 connecting to 172.17.0.2:25143. Connection refused.
sys:1: ResourceWarning: unclosed file <_io.TextIOWrapper name='/root/ray_results/gym/pusher-2d/image-reach/20180818-image-reach-spatial-softmax-lsp-4-lsp/mujoco-runner_1_discount=0.99,arm_goal_distance_cost_coeff=1.0,image_size=32x32x3,seed=2_2018-08-18_23-09-00h2mqxkgy/rllab-logger/debug.log' mode='a' encoding='UTF-8'>
sys:1: ResourceWarning: unclosed file <_io.TextIOWrapper name='/root/ray_results/gym/pusher-2d/image-reach/20180818-image-reach-spatial-softmax-lsp-4-lsp/mujoco-runner_1_discount=0.99,arm_goal_distance_cost_coeff=1.0,image_size=32x32x3,seed=2_2018-08-18_23-09-00h2mqxkgy/rllab-logger/progress.csv' mode='w' encoding='UTF-8'>
Traceback (most recent call last):
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 911, in _process_task
self._store_outputs_in_objstore(return_object_ids, outputs)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 839, in _store_outputs_in_objstore
self.put_object(object_ids[i], outputs[i])
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 368, in put_object
self.store_and_register(object_id, value)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 303, in store_and_register
serialization_context=self.serialization_context)
File "pyarrow/_plasma.pyx", line 396, in pyarrow._plasma.PlasmaClient.put
File "pyarrow/_plasma.pyx", line 300, in pyarrow._plasma.PlasmaClient.create
File "pyarrow/error.pxi", line 83, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: Broken pipe
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/workers/default_worker.py", line 69, in <module>
ray.worker.global_worker.main_loop()
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 1044, in main_loop
self._wait_for_and_process_task(task)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 1003, in _wait_for_and_process_task
self._process_task(task)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 915, in _process_task
ray.utils.format_error_message(traceback.format_exc()))
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 925, in _handle_process_task_failure
self._store_outputs_in_objstore(return_object_ids, failure_objects)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 839, in _store_outputs_in_objstore
self.put_object(object_ids[i], outputs[i])
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 368, in put_object
self.store_and_register(object_id, value)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/worker.py", line 303, in store_and_register
serialization_context=self.serialization_context)
File "pyarrow/_plasma.pyx", line 396, in pyarrow._plasma.PlasmaClient.put
File "pyarrow/_plasma.pyx", line 300, in pyarrow._plasma.PlasmaClient.create
File "pyarrow/error.pxi", line 83, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: Broken pipe
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 177, in _read_from_socket
raise socket.error(SERVER_CLOSED_CONNECTION_ERROR)
OSError: Connection closed by server.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/client.py", line 668, in execute_command
return self.parse_response(connection, command_name, **options)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/client.py", line 680, in parse_response
response = connection.read_response()
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 624, in read_response
response = self._parser.read_response()
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 284, in read_response
response = self._buffer.readline()
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 216, in readline
self._read_from_socket()
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 191, in _read_from_socket
(e.args,))
redis.exceptions.ConnectionError: Error while reading from socket: ('Connection closed by server.',)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 484, in connect
sock = self._connect()
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 541, in _connect
raise err
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 529, in _connect
sock.connect(socket_address)
ConnectionRefusedError: [Errno 111] Connection refused
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/workers/default_worker.py", line 76, in <module>
driver_id=None)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/ray/utils.py", line 76, in push_error_to_driver
"data": data
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/client.py", line 2011, in hmset
return self.execute_command('HMSET', name, *items)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/client.py", line 673, in execute_command
connection.send_command(*args)
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 610, in send_command
self.send_packed_command(self.pack_command(*args))
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 585, in send_packed_command
self.connect()
File "/opt/conda/envs/softlearning/lib/python3.6/site-packages/redis/connection.py", line 489, in connect
raise ConnectionError(self._error_message(e))
redis.exceptions.ConnectionError: Error 111 connecting to 172.17.0.2:25143. Connection refused.
sys:1: ResourceWarning: unclosed file <_io.TextIOWrapper name='/root/ray_results/gym/pusher-2d/image-reach/20180818-image-reach-spatial-softmax-lsp-4-lsp/mujoco-runner_0_discount=0.99,arm_goal_distance_cost_coeff=1.0,image_size=32x32x3,seed=1_2018-08-18_23-09-000paimgmv/rllab-logger/debug.log' mode='a' encoding='UTF-8'>
sys:1: ResourceWarning: unclosed file <_io.TextIOWrapper name='/root/ray_results/gym/pusher-2d/image-reach/20180818-image-reach-spatial-softmax-lsp-4-lsp/mujoco-runner_0_discount=0.99,arm_goal_distance_cost_coeff=1.0,image_size=32x32x3,seed=1_2018-08-18_23-09-000paimgmv/rllab-logger/progress.csv' mode='w' encoding='UTF-8'>
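The repeated `Error 111 ... Connection refused` lines mean that nothing was accepting TCP connections on the Redis port when the workers tried to reconnect. That fits the Redis server process having exited rather than a single connection being dropped, though the logs above do not confirm it. A minimal sketch for checking this from the same machine, using only the standard library (`port_is_open` is a hypothetical helper name, not part of Ray or redis-py):

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) can be established.

    Hypothetical helper for illustration; it only tests whether something
    is listening on the port, not whether that something is Redis.
    """
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError,
        # errno 111 on Linux) when nothing is listening on the port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For the run above, `port_is_open("172.17.0.2", 25143)` would use the address from the traceback. A `False` result while the experiment is still supposed to be running would point at the Redis process itself being gone.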