core-clp: Add CLI command to extract a compressed file as IR. #420

Merged
merged 75 commits into from
Jun 12, 2024

Conversation

haiqi96
Contributor

@haiqi96 haiqi96 commented May 30, 2024

References

based on #417

Description

This change adds an IR decompression execution path to the clp executable.

The PR contains two notable changes:

  1. The PR introduces a new command, clp i. The command lets the user decompress a file split into one or more IR files by providing the orig_file_id and a message index. It also lets the user pick a custom threshold for the uncompressed IR size and a directory to temporarily write IRs to.
  2. Since the message_index and the orig_file_id uniquely identify a file split, we implemented simplified decompression logic in IrDecompression.cpp. Compared to the decompression.cpp,
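
To illustrate why the (orig_file_id, message index) pair is enough to pick out a single file split, here is a minimal sketch. It assumes each split records the index of its first message and its message count. The `FileSplit` struct, its field names, and `find_file_split` are hypothetical and are not taken from this PR or from CLP's archive metadata.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical per-split metadata; the field names are illustrative only and
// are not CLP's actual schema.
struct FileSplit {
    std::string orig_file_id;    // ID shared by all splits of one original file
    std::uint64_t begin_msg_ix;  // index, in the original file, of the split's first message
    std::uint64_t num_msgs;      // number of messages stored in this split
};

// Returns the position in `splits` of the split that belongs to `orig_file_id`
// and contains message `msg_ix`, or std::nullopt if no split matches.
std::optional<std::size_t> find_file_split(
        std::vector<FileSplit> const& splits,
        std::string const& orig_file_id,
        std::uint64_t msg_ix
) {
    for (std::size_t i = 0; i < splits.size(); ++i) {
        auto const& split = splits[i];
        if (split.orig_file_id != orig_file_id) {
            continue;
        }
        // Assumption: splits of one file cover disjoint message ranges, so at
        // most one split can contain `msg_ix`.
        if (msg_ix >= split.begin_msg_ix && msg_ix - split.begin_msg_ix < split.num_msgs) {
            return i;
        }
    }
    return std::nullopt;
}
```

Under these assumptions, a file stored as splits covering messages [0, 100) and [100, 150) resolves message index 120 to the second split, and any index past 149 resolves to no split at all.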

Validation performed

To validate the functionality, we compressed a 64MB file into archive(s). We then decompressed it into multiple IRs, decoded and concatenated them, and did a binary comparison with the original file.

We used two configurations to cover all the possible cases:

  1. Compressed a 64MB Hadoop log using a smaller encoded file size and archive size, such that the original file is split into 3 splits across 2 archives. We then decompressed all 3 IRs by running clp 3 times, using a different message index each time.

  2. Compressed the 64MB Hadoop log using default settings, so only one file and one archive were generated. We then decompressed the IR using a 32MB threshold, generating 3 IRs on disk.

@kirkrodrigues kirkrodrigues marked this pull request as ready for review June 10, 2024 14:27
@kirkrodrigues kirkrodrigues self-requested a review June 10, 2024 21:52
@haiqi96 haiqi96 requested a review from kirkrodrigues June 11, 2024 21:59
haiqi96 and others added 4 commits June 11, 2024 21:09
Member

@kirkrodrigues kirkrodrigues left a comment

For the PR title, how about:

core-clp: Add CLI command to extract a file from an archive as IR.

@haiqi96
Contributor Author

haiqi96 commented Jun 12, 2024

For the PR title, how about:

core-clp: Add CLI command to extract a file from an archive as IR.

how about core-clp: Add CLI command to extract a compressed file as IR.

"An archive" gives me the impression that the user needs to specify an archive.

@kirkrodrigues
Member

For the PR title, how about:
core-clp: Add CLI command to extract a file from an archive as IR.

how about core-clp: Add CLI command to extract a compressed file as IR.

sgtm

@haiqi96 haiqi96 changed the title from "core-clp: add Archive to IR decompression as a command line option for clp" to "core-clp: Add CLI command to extract a compressed file as IR." on Jun 12, 2024
@haiqi96 haiqi96 merged commit d5fcd6b into y-scope:main Jun 12, 2024
11 checks passed
@haiqi96 haiqi96 deleted the ArchiveToIRCmd branch June 28, 2024 14:43
jackluo923 pushed a commit to jackluo923/clp that referenced this pull request Dec 4, 2024