Update multipart download path to first write to temp files #10347

mch2 merged 5 commits into opensearch-project:main from the readcontext branch on Oct 5, 2023

mch2 (Member) commented Oct 3, 2023

Description

This change updates ReadContextListener to write parts to a temp location first and keep them there until all parts have been received. A partially written file therefore never appears under the final name, so it cannot lead to corruption. The temp location is prefixed with a random UUID, which ensures there is no collision on retries.
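
The pattern described above can be sketched roughly as follows. This is a minimal illustration, not the actual ReadContextListener code: the class `TempFileWriteSketch`, the method `writeParts`, and the temp-name layout are hypothetical, and the parts are written sequentially here for simplicity, whereas the real download path may handle them differently.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;
import java.util.List;
import java.util.UUID;

// Hypothetical helper for illustration only; names are assumptions, not OpenSearch APIs.
final class TempFileWriteSketch {

    /** Writes all parts to a UUID-prefixed temp file, then renames it onto the final name. */
    static Path writeParts(Path dir, String fileName, List<byte[]> parts) throws IOException {
        // A random UUID prefix keeps a retry from colliding with a leftover temp file.
        Path temp = dir.resolve(UUID.randomUUID() + "." + fileName);
        Path target = dir.resolve(fileName);
        try {
            try (FileChannel out = FileChannel.open(temp,
                    StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE)) {
                for (byte[] part : parts) {
                    ByteBuffer buf = ByteBuffer.wrap(part);
                    while (buf.hasRemaining()) {
                        out.write(buf);
                    }
                }
                out.force(true); // flush file contents before the rename exposes them
            }
            // The final name only appears once every part has been written.
            Files.move(temp, target, StandardCopyOption.ATOMIC_MOVE);
            syncDirectory(dir);
            return target;
        } finally {
            Files.deleteIfExists(temp); // cleans up on failure; no-op after a successful move
        }
    }

    /** Best-effort fsync of the directory so the rename itself is durable. */
    private static void syncDirectory(Path dir) {
        try (FileChannel ch = FileChannel.open(dir, StandardOpenOption.READ)) {
            ch.force(true);
        } catch (IOException e) {
            // Some platforms cannot open a directory this way; ignored in this sketch.
        }
    }
}
```

A caller would invoke something like `TempFileWriteSketch.writeParts(dir, "segment.bin", parts)`. If any write fails, only the UUID-prefixed temp file is affected and it is deleted, so readers of the final name see either the complete file or nothing.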

Related Issues

Resolves #9784

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)
  • GitHub issue/PR created in OpenSearch documentation repo for the required public documentation changes (#[Issue/PR number])

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

github-actions bot (Contributor) commented Oct 3, 2023

Compatibility status:

Checks if related components are compatible with change dfc8110

Incompatible components

Skipped components

Compatible components

Compatible components: [https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/custom-codecs.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/reporting.git]

github-actions bot (Contributor) commented Oct 4, 2023

Gradle Check (Jenkins) Run Completed with:

andrross (Member) commented Oct 4, 2023

My changes in #10349 will definitely conflict here, but it should be relatively easy to port the code.

This change updates ReadContextListener to first write parts to a temp location
until all parts have been received.

Signed-off-by: Marc Handalian <handalm@amazon.com>

Signed-off-by: Marc Handalian <handalm@amazon.com>
mch2 added 2 commits October 4, 2023 16:21
Signed-off-by: Marc Handalian <handalm@amazon.com>
Signed-off-by: Marc Handalian <handalm@amazon.com>

Signed-off-by: Marc Handalian <handalm@amazon.com>
codecov bot commented Oct 4, 2023

Codecov Report

Merging #10347 (8d7bcab) into main (28f185b) will decrease coverage by 0.01%.
Report is 1 commit behind head on main.
The diff coverage is 75.00%.

❗ Current head 8d7bcab differs from pull request most recent head dfc8110. Consider uploading reports for the commit dfc8110 to get more accurate results

@@             Coverage Diff              @@
##               main   #10347      +/-   ##
============================================
- Coverage     71.09%   71.08%   -0.01%     
- Complexity    58209    58275      +66     
============================================
  Files          4830     4830              
  Lines        274840   274858      +18     
  Branches      40048    40048              
============================================
+ Hits         195384   195385       +1     
- Misses        63052    63116      +64     
+ Partials      16404    16357      -47     
Files                                                     Coverage Δ
...tore/stream/read/listener/ReadContextListener.java     79.22% <75.00%> (-2.14%) ⬇️

... and 452 files with indirect coverage changes

@mch2 mch2 added the backport 2.x Backport to 2.x branch label Oct 5, 2023
@mch2 mch2 merged commit ca0dae6 into opensearch-project:main Oct 5, 2023
@mch2 mch2 deleted the readcontext branch October 5, 2023 01:19
opensearch-trigger-bot (Contributor) commented

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.x
# Create a new branch
git switch --create backport/backport-10347-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 ca0dae61cb1845f2a2543d051a00be946635fc83
# Push it to GitHub
git push --set-upstream origin backport/backport-10347-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-10347-to-2.x.

andrross pushed a commit to andrross/OpenSearch that referenced this pull request Oct 5, 2023
…ch-project#10347)

* Update multipart download path to write to temp files.

This change updates ReadContextListener to first write parts to a temp location
until all parts have been received.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Suppress forbidden IOUtils.fsync

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Remove unnecessary logging format

Signed-off-by: Marc Handalian <handalm@amazon.com>

* sync directory after file rename

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Remove flaky threadpool terminate test

Signed-off-by: Marc Handalian <handalm@amazon.com>

---------

Signed-off-by: Marc Handalian <handalm@amazon.com>
(cherry picked from commit ca0dae6)
mch2 added a commit that referenced this pull request Oct 5, 2023
…10389)

(cherry picked from commit ca0dae6)

Co-authored-by: Marc Handalian <handalm@amazon.com>
deshsidd pushed a commit to deshsidd/OpenSearch that referenced this pull request Oct 9, 2023
…ch-project#10347)

vikasvb90 pushed a commit to vikasvb90/OpenSearch that referenced this pull request Oct 10, 2023
…ch-project#10347)

shiv0408 pushed a commit to Gaurav614/OpenSearch that referenced this pull request Apr 25, 2024
…ch-project#10347)

Signed-off-by: Shivansh Arora <hishiv@amazon.com>
Labels
  • backport 2.x (Backport to 2.x branch)
  • backport-failed
  • bug (Something isn't working)
  • distributed framework
  • skip-changelog
  • v2.11.0 (Issues and PRs related to version 2.11.0)
Development

Successfully merging this pull request may close these issues.

Ensure FilePartWriter can't leave partially written files
3 participants