Connection Timeout when calling DVC Push to Azure remote storage #42

Closed
markFriel opened this issue Mar 1, 2023 · 5 comments

Bug Report

Description

When trying to push tracked files to remote storage on Azure, I get an error where the connection times out. The process appears to hang while checking for the existence of objects on the remote storage.

Reproduce

  1. dvc init
  2. dvc remote add -d remoteregistry azure://container_name/path
  3. dvc add dataset.zip
  4. dvc remote modify remoteregistry --local connection_string <connection_string>
  5. dvc push

Output

2023-03-01 16:30:42,551 DEBUG: v2.45.1 (pip), CPython 3.9.16 on Windows-10-10.0.19044-SP0
2023-03-01 16:30:42,551 DEBUG: command: C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\Scripts\dvc push datasets/test_folder/cb54000282dd4e3891aa8057adc092ff.jpg.dvc -v
2023-03-01 16:30:44,192 DEBUG: Preparing to transfer data from 'C:\NLP_Project\autodocs_dataset_creation\.dvc\cache' to 'autodocs-classifiers-datasets/'
2023-03-01 16:30:44,192 DEBUG: Preparing to collect status from 'autodocs-classifiers-datasets/'
2023-03-01 16:30:44,192 DEBUG: Collecting status from 'autodocs-classifiers-datasets/'
2023-03-01 16:30:44,194 DEBUG: Querying 1 oids via object_exists
2023-03-01 16:33:26,160 ERROR: unexpected error - Connection timeout to host https://storage_account.blob.core.windows.net/autodocs-classifiers-datasets/39/6dfb4c4cbe20ce15cd7ad4a569dd95: Connection timeout to host https://storage_account.blob.core.windows.net/autodocs-classifiers-datasets/39/6dfb4c4cbe20ce15cd7ad4a569dd95:
Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\aiohttp\connector.py", line 980, in _wrap_create_connection
    return await self._loop.create_connection(*args, **kwargs)  # type: ignore[return-value]  # noqa
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\asyncio\base_events.py", line 1050, in create_connection
    sock = await self._connect_sock(
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\asyncio\base_events.py", line 961, in _connect_sock
    await self.sock_connect(sock, address)
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\asyncio\selector_events.py", line 500, in sock_connect
    return await fut
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\aiohttp\client.py", line 536, in _request
    conn = await self._connector.connect(
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\aiohttp\connector.py", line 540, in connect
    proto = await self._create_connection(req, traces, timeout)
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\aiohttp\connector.py", line 901, in _create_connection
    _, proto = await self._create_direct_connection(req, traces, timeout)
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\aiohttp\connector.py", line 1175, in _create_direct_connection
    transp, proto = await self._wrap_create_connection(
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\aiohttp\connector.py", line 980, in _wrap_create_connection
    return await self._loop.create_connection(*args, **kwargs)  # type: ignore[return-value]  # noqa
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\async_timeout\__init__.py", line 129, in __aexit__
    self._do_exit(exc_type)
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\async_timeout\__init__.py", line 212, in _do_exit
    raise asyncio.TimeoutError
asyncio.exceptions.TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\azure\core\pipeline\transport\_aiohttp.py", line 257, in send
    result = await self.session.request(  # type: ignore
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\aiohttp\client.py", line 540, in _request
    raise ServerTimeoutError(
aiohttp.client_exceptions.ServerTimeoutError: Connection timeout to host https://storage_account.blob.core.windows.net/autodocs-classifiers-datasets/39/6dfb4c4cbe20ce15cd7ad4a569dd95

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\dvc\cli\__init__.py", line 210, in main
    ret = cmd.do_run()
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\dvc\cli\command.py", line 26, in do_run
    return self.run()
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\dvc\commands\data_sync.py", line 59, in run
    processed_files_count = self.repo.push(
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\dvc\repo\__init__.py", line 58, in wrapper
    return f(repo, *args, **kwargs)
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\dvc\repo\push.py", line 89, in push
    result = self.cloud.push(
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\dvc\data_cloud.py", line 154, in push
    return self.transfer(
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\dvc\data_cloud.py", line 135, in transfer
    return transfer(src_odb, dest_odb, objs, **kwargs)
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\dvc_data\hashfile\transfer.py", line 203, in transfer
    status = compare_status(
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\dvc_data\hashfile\status.py", line 178, in compare_status
    dest_exists, dest_missing = status(
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\dvc_data\hashfile\status.py", line 149, in status
    odb.oids_exist(hashes, jobs=jobs, progress=pbar.callback)
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\dvc_objects\db.py", line 412, in oids_exist
    return list(wrap_iter(remote_oids, callback))
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\dvc_objects\db.py", line 36, in wrap_iter
    for index, item in enumerate(iterable, start=1):
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\dvc_objects\db.py", line 358, in list_oids_exists
    in_remote = self.fs.exists(paths, batch_size=jobs)
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\dvc_objects\fs\base.py", line 345, in exists
    return fut.result()
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\concurrent\futures\_base.py", line 446, in result
    return self.__get_result()
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\concurrent\futures\_base.py", line 391, in __get_result
    raise self._exception
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\dvc_objects\executors.py", line 134, in batch_coros
    result = fut.result()
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\adlfs\spec.py", line 1410, in _exists
    if await bc.exists(version_id=version_id):
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\azure\core\tracing\decorator_async.py", line 79, in wrapper_use_tracer
    return await func(*args, **kwargs)
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\azure\storage\blob\aio\_blob_client_async.py", line 672, in exists
    await self._client.blob.get_properties(
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\azure\core\tracing\decorator_async.py", line 79, in wrapper_use_tracer
    return await func(*args, **kwargs)
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\azure\storage\blob\_generated\aio\operations\_blob_operations.py", line 473, in get_properties
    pipeline_response = await self._client._pipeline.run(  # type: ignore # pylint: disable=protected-access
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\azure\core\pipeline\_base_async.py", line 200, in run
    return await first_node.send(pipeline_request)
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\azure\core\pipeline\_base_async.py", line 68, in send
    response = await self.next.send(request)  # type: ignore
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\azure\core\pipeline\_base_async.py", line 68, in send
    response = await self.next.send(request)  # type: ignore
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\azure\core\pipeline\_base_async.py", line 68, in send
    response = await self.next.send(request)  # type: ignore
  [Previous line repeated 5 more times]
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\azure\core\pipeline\policies\_redirect_async.py", line 62, in send
    response = await self.next.send(request)
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\azure\core\pipeline\_base_async.py", line 68, in send
    response = await self.next.send(request)  # type: ignore
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\azure\storage\blob\_shared\policies_async.py", line 137, in send
    raise err
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\azure\storage\blob\_shared\policies_async.py", line 111, in send
    response = await self.next.send(request)
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\azure\core\pipeline\_base_async.py", line 68, in send
    response = await self.next.send(request)  # type: ignore
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\azure\storage\blob\_shared\policies_async.py", line 64, in send
    response = await self.next.send(request)
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\azure\core\pipeline\_base_async.py", line 68, in send
    response = await self.next.send(request)  # type: ignore
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\azure\core\pipeline\_base_async.py", line 68, in send
    response = await self.next.send(request)  # type: ignore
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\azure\core\pipeline\_base_async.py", line 101, in send
    await self._sender.send(request.http_request, **request.context.options),
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\azure\storage\blob\_shared\base_client_async.py", line 176, in send
    return await self._transport.send(request, **kwargs)
  File "C:\ProgramData\Anaconda3\envs\autodocs_dataset_creation\lib\site-packages\azure\core\pipeline\transport\_aiohttp.py", line 289, in send
    raise ServiceRequestError(err, error=err) from err
azure.core.exceptions.ServiceRequestError: Connection timeout to host https://<storage_account>.blob.core.windows.net/container_name/39/6dfb4c4cbe20ce15cd7ad4a569dd95
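
The innermost frames of the traceback (`sock_connect`, then `asyncio.TimeoutError`) suggest the failure happens while opening the TCP connection, before any Azure authentication takes place. A minimal sketch of a reachability probe that is independent of DVC and the Azure SDK, using only the standard library; the function name `probe_tcp` and its defaults are hypothetical and not part of DVC or adlfs:

```python
import socket
import time


def probe_tcp(host: str, port: int = 443, timeout: float = 10.0):
    """Try to open a TCP connection; return (reachable, seconds_elapsed)."""
    start = time.monotonic()
    try:
        # create_connection resolves the host name and applies `timeout`
        # to the connect attempt.
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start
    except OSError:
        # Timeouts, refused connections and DNS failures are all OSError
        # subclasses, so they are reported the same way here.
        return False, time.monotonic() - start
```

Calling `probe_tcp("<storage_account>.blob.core.windows.net")` from the affected machine, with the real account name filled in, would show whether port 443 is reachable at all. If it is not, a proxy or firewall is a more likely cause than DVC itself.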

Expected

I expect the data to be pushed to the remote storage.

Environment information

Platform: Python 3.9.16 on Windows-10-10.0.19044-SP0

Subprojects:
dvc_data = 0.40.3
dvc_objects = 0.19.3
dvc_render = 0.2.0
dvc_task = 0.1.11
dvclive = 2.1.0
scmrepo = 0.1.11

Supports:
azure (adlfs = 2023.1.0, knack = 0.10.1, azure-identity = 1.12.0),
http (aiohttp = 3.8.4, aiohttp-retry = 2.8.3),
https (aiohttp = 3.8.4, aiohttp-retry = 2.8.3)
Cache types: hardlink, symlink
Cache directory: NTFS on C:
Caches: local
Remotes: azure
Workspace directory: NTFS on C:
Repo: dvc, git

Additional Information (if any):

@efiop efiop transferred this issue from iterative/dvc Mar 8, 2023

efiop commented Mar 8, 2023

@markFriel Does az cli work for you? Are you able to upload large files with those credentials to that container?

efiop commented Oct 2, 2023

Got another report, this time about a read timeout.

The first thing we should do here is add support for the available timeout options (e.g. in upload_blob), so we can at least tweak them and see whether increasing them helps.
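
For illustration, a rough sketch of what those options look like when calling the `azure-storage-blob` SDK directly. DVC reaches the SDK through adlfs, so this is not how DVC wires them up. The helper names `timeout_kwargs` and `upload_with_timeouts` and the chosen values are assumptions; `connection_timeout` and `read_timeout` are client-side keyword arguments of the SDK, and `timeout` on `upload_blob` is the server-side operation timeout.

```python
from typing import Any, Dict


def timeout_kwargs(connection_timeout: int = 120, read_timeout: int = 600) -> Dict[str, Any]:
    """Build client-side timeout keywords (in seconds) for azure-storage-blob."""
    if connection_timeout <= 0 or read_timeout <= 0:
        raise ValueError("timeouts must be positive numbers of seconds")
    return {"connection_timeout": connection_timeout, "read_timeout": read_timeout}


def upload_with_timeouts(conn_str: str, container: str, blob_name: str, data: bytes,
                         connection_timeout: int = 120, read_timeout: int = 600,
                         server_timeout: int = 300) -> None:
    """Upload one blob with explicit timeouts (hypothetical helper)."""
    # Imported lazily so timeout_kwargs() stays usable without the SDK installed.
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient.from_connection_string(
        conn_str, **timeout_kwargs(connection_timeout, read_timeout)
    )
    blob = service.get_blob_client(container=container, blob=blob_name)
    # `timeout` is the server-side limit for this single operation; it is
    # separate from the client-side values passed to the client above.
    blob.upload_blob(data, overwrite=True, timeout=server_timeout)
```

Running `upload_with_timeouts` with the same connection string and container as the DVC remote, first with small and then with larger values, would indicate whether raising the timeouts alone makes the transfer succeed.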

@efiop efiop self-assigned this Oct 2, 2023
@efiop efiop added this to DVC Oct 2, 2023
@github-project-automation github-project-automation bot moved this to Backlog in DVC Oct 2, 2023
@efiop efiop moved this from Backlog to In Progress in DVC Oct 2, 2023

pezosanta commented Oct 17, 2023

Hi @efiop,

We have encountered this exact same issue (azure remote, dvc pull, same error messages etc.). Is there any progress on this issue or recommended workarounds (we are forced to use Azure remotes)?

DVC doctor:

DVC version: 3.23.0 (pip)
-------------------------
Platform: Python 3.9.13 on Windows-10-10.0.19045-SP0
Subprojects:
        dvc_data = 2.16.4
        dvc_objects = 1.0.1
        dvc_render = 0.5.3
        dvc_task = 0.3.0
        scmrepo = 1.3.1
Supports:
        azure (adlfs = 2023.8.0, knack = 0.11.0, azure-identity = 1.14.0),
        http (aiohttp = 3.8.5, aiohttp-retry = 2.8.3),
        https (aiohttp = 3.8.5, aiohttp-retry = 2.8.3)
Config:
        Global: C:\Users\<my_user>\AppData\Local\iterative\dvc
        System: C:\ProgramData\iterative\dvc
Cache types: <https://error.dvc.org/no-dvc-cache>
Caches: local
Remotes: azure
Workspace directory: NTFS on C:\
Repo: dvc, git
Repo.site_cache_dir: C:\ProgramData\iterative\dvc\Cache\repo\5de647d54e519ea4a0c1849b6d45760a

efiop commented Oct 17, 2023

@pezosanta I actually just merged iterative/dvc#10027 today, which exposes timeout config options (see iterative/dvc.org#4930). Would you be able to try out upstream dvc and see if tweaking those config options helps you?

efiop commented Oct 17, 2023

Closing for now as fixed (iterative/dvc#10027).

@efiop efiop closed this as completed Oct 17, 2023
@github-project-automation github-project-automation bot moved this from In Progress to Done in DVC Oct 17, 2023