Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(clp::streaming_compression): Update compressors/decompressors to accept generic WriterInterface/ReaderInterface for improved flexibility over FileWriter/FileReader. #696

Merged
merged 6 commits into from
Jan 28, 2025

Conversation

gibber9809
Copy link
Contributor

@gibber9809 gibber9809 commented Jan 27, 2025

Description

This PR modifies all of the clp::Decompressor classes to accept a generic clp::ReaderInterface instead of only clp::FileReader. This change is necessary to allow use-cases like directly decompressing the output of a clp::NetworkReader. Likewise we modify the clp::Compressor classes to accept a generic clp::WriterInterface.

We also make some small modifications to zstd::Decompressor to avoid throwing on open when the passed clp::ReaderInterface is already at EOF. This allows open followed by no reads or reads of size zero to succeed as they should.

Validation performed

  • Validated that tests pass
  • Validated that zstd::Decompressor can be used to decompress output from a clp::NetworkReader or clp::FileReader

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced a more flexible reader interface for decompression, allowing for compatibility with various input sources.
    • Updated the compression interfaces to utilize a generalized writer interface, enhancing output handling flexibility.
  • Refactor

    • Streamlined decompression and compression processes by replacing file-specific methods with more abstract interfaces.
    • Improved input and output handling across multiple components, enhancing overall system versatility.
    • Enhanced error handling in the close() method for file operations.

Copy link
Contributor

coderabbitai bot commented Jan 27, 2025

Walkthrough

The pull request introduces a significant refactoring of the decompression and compression components across multiple files. The primary change involves replacing FileReader and FileWriter with a more generic ReaderInterface and WriterInterface, respectively, in the streaming compression classes. This modification generalizes the input and output handling mechanisms, allowing for more flexible data source and sink management across different decompression and compression implementations in the passthrough and zstd namespaces.

Changes

File Change Summary
components/core/src/clp/streaming_compression/Decompressor.hpp Updated method signature to use ReaderInterface instead of FileReader.
components/core/src/clp/streaming_compression/passthrough/Decompressor.hpp Replaced FileReader with ReaderInterface, updated method signature, member variables, and enum entry.
components/core/src/clp/streaming_compression/passthrough/Decompressor.cpp Modified input handling to work with ReaderInterface, updated method calls and variable names.
components/core/src/clp/streaming_compression/zstd/Decompressor.hpp Replaced file-specific variables with more generic reader-based variables and updated method signature and enum entry.
components/core/src/clp/streaming_compression/zstd/Decompressor.cpp Updated method implementations to use ReaderInterface, modified reading and seeking logic.
components/core/src/clp/streaming_archive/writer/Segment.hpp Included FileWriter.hpp and updated close() method documentation for error handling.
components/core/src/clp/streaming_compression/Compressor.hpp Updated method signature to use WriterInterface instead of FileWriter and removed FileWriter.hpp import.
components/core/src/clp/streaming_compression/lzma/Compressor.cpp Changed parameter types to WriterInterface in multiple methods and updated internal variable references.
components/core/src/clp/streaming_compression/lzma/Compressor.hpp Updated method signatures and member variables to use WriterInterface instead of FileWriter.
components/core/src/clp/streaming_compression/passthrough/Compressor.cpp Renamed member variable and updated open method parameter to WriterInterface.
components/core/src/clp/streaming_compression/passthrough/Compressor.hpp Changed FileWriter to WriterInterface in method signatures and internal variable declarations.
components/core/src/clp/streaming_compression/zstd/Compressor.cpp Updated method signatures and internal logic to use WriterInterface instead of FileWriter.
components/core/src/clp/streaming_compression/zstd/Compressor.hpp Changed method signatures to accept WriterInterface and updated member variable types.

Sequence Diagram

sequenceDiagram
    participant Client
    participant Decompressor
    participant ReaderInterface
    
    Client->>Decompressor: open(reader, buffer_capacity)
    Decompressor->>ReaderInterface: try_read()
    ReaderInterface-->>Decompressor: return data
    Decompressor->>Decompressor: Process data
    Client->>Decompressor: close()
Loading

Possibly related PRs


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 677bd03 and fde74f4.

📒 Files selected for processing (1)
  • components/core/src/clp/streaming_compression/lzma/Compressor.hpp (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • components/core/src/clp/streaming_compression/lzma/Compressor.hpp
⏰ Context from checks skipped due to timeout of 90000ms (12)
  • GitHub Check: ubuntu-focal-static-linked-bins
  • GitHub Check: centos-stream-9-static-linked-bins
  • GitHub Check: ubuntu-jammy-static-linked-bins
  • GitHub Check: ubuntu-focal-dynamic-linked-bins
  • GitHub Check: centos-stream-9-dynamic-linked-bins
  • GitHub Check: ubuntu-jammy-dynamic-linked-bins
  • GitHub Check: build-macos (macos-14, false)
  • GitHub Check: build-macos (macos-13, false)
  • GitHub Check: lint-check (ubuntu-latest)
  • GitHub Check: build-macos (macos-13, true)
  • GitHub Check: lint-check (macos-latest)
  • GitHub Check: build (macos-latest)

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
components/core/src/clp/streaming_compression/passthrough/Decompressor.cpp (1)

88-95: Consider adding bounds checking for read_buffer_capacity.

While the implementation is correct, consider validating that read_buffer_capacity is non-zero to prevent potential issues.

 void Decompressor::open(ReaderInterface& reader, size_t read_buffer_capacity) {
     if (InputType::NotInitialized != m_input_type) {
         throw OperationFailed(ErrorCode_NotReady, __FILENAME__, __LINE__);
     }
 
+    if (0 == read_buffer_capacity) {
+        throw OperationFailed(ErrorCode_BadParam, __FILENAME__, __LINE__);
+    }
+
     m_reader = &reader;
     m_decompressed_stream_pos = 0;
     m_input_type = InputType::ReaderInterface;
 }
components/core/src/clp/streaming_compression/zstd/Decompressor.hpp (1)

129-133: Consider documenting buffer management strategy.

The member variables related to buffer management lack documentation explaining their lifecycle and ownership semantics.

Add documentation for these members:

+    // Reader-related variables
+    /// Pointer to the underlying reader (non-owning)
     ReaderInterface* m_reader;
+    /// Initial position in the reader when opened
     size_t m_reader_initial_pos;
+    /// Buffer for reading compressed data from the reader
     std::unique_ptr<char[]> m_read_buffer;
+    /// Current number of bytes in the read buffer
     size_t m_read_buffer_length;
+    /// Maximum capacity of the read buffer
     size_t m_read_buffer_capacity;
components/core/src/clp/streaming_compression/zstd/Decompressor.cpp (1)

Line range hint 66-73: Update boolean expression to match coding guidelines

The error handling should follow the project's coding guidelines for boolean expressions.

-    if (ErrorCode_Success != error_code) {
-        if (ErrorCode_EndOfFile == error_code) {
+    if (false == (ErrorCode_Success == error_code)) {
+        if (false == (ErrorCode_EndOfFile != error_code)) {
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 162acd5 and 53a1118.

📒 Files selected for processing (5)
  • components/core/src/clp/streaming_compression/Decompressor.hpp (1 hunks)
  • components/core/src/clp/streaming_compression/passthrough/Decompressor.cpp (4 hunks)
  • components/core/src/clp/streaming_compression/passthrough/Decompressor.hpp (3 hunks)
  • components/core/src/clp/streaming_compression/zstd/Decompressor.cpp (6 hunks)
  • components/core/src/clp/streaming_compression/zstd/Decompressor.hpp (4 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
components/core/src/clp/streaming_compression/Decompressor.hpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

components/core/src/clp/streaming_compression/passthrough/Decompressor.cpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

components/core/src/clp/streaming_compression/zstd/Decompressor.cpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

components/core/src/clp/streaming_compression/passthrough/Decompressor.hpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

components/core/src/clp/streaming_compression/zstd/Decompressor.hpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

⏰ Context from checks skipped due to timeout of 90000ms (8)
  • GitHub Check: ubuntu-focal-static-linked-bins
  • GitHub Check: ubuntu-focal-dynamic-linked-bins
  • GitHub Check: ubuntu-jammy-static-linked-bins
  • GitHub Check: ubuntu-jammy-dynamic-linked-bins
  • GitHub Check: centos-stream-9-static-linked-bins
  • GitHub Check: centos-stream-9-dynamic-linked-bins
  • GitHub Check: lint-check (ubuntu-latest)
  • GitHub Check: lint-check (macos-latest)
🔇 Additional comments (15)
components/core/src/clp/streaming_compression/Decompressor.hpp (1)

44-48: LGTM! Clear documentation and appropriate parameter names.

The changes effectively generalize the interface while maintaining clear documentation.

components/core/src/clp/streaming_compression/passthrough/Decompressor.hpp (3)

72-72: LGTM! Method signature aligns with base class.

The updated signature properly implements the virtual method from the base class.


93-93: LGTM! InputType enum updated appropriately.

The enum value name change from File to ReaderInterface better reflects its purpose.


99-99: Consider initializing m_reader in the constructor.

The member variable is declared but not initialized in the constructor. Consider adding it to prevent potential null pointer dereference.

components/core/src/clp/streaming_compression/passthrough/Decompressor.cpp (3)

Line range hint 26-32: LGTM! Proper error handling in try_read.

The implementation correctly propagates errors from the reader and updates the stream position.


Line range hint 52-57: LGTM! Proper error handling in try_seek_from_begin.

The implementation correctly propagates errors from the reader's seek operation.


104-105: LGTM! Proper cleanup in close method.

The implementation correctly nullifies the reader pointer when closing.

components/core/src/clp/streaming_compression/zstd/Decompressor.hpp (3)

9-9: LGTM! Proper include for ReaderInterface.

The include statement is correctly placed and replaces the FileReader include.


76-76: LGTM! Method signature aligns with base class.

The updated signature properly implements the virtual method from the base class.


112-112: LGTM! InputType enum updated appropriately.

The enum value name change from File to ReaderInterface better reflects its purpose.

components/core/src/clp/streaming_compression/zstd/Decompressor.cpp (5)

11-19: LGTM: Constructor initialization list properly updated

The initialization list correctly initializes all member variables needed for the ReaderInterface implementation.


Line range hint 61-81: LGTM: Robust implementation of ReaderInterface reading

The implementation properly handles buffer reading and EOF conditions, maintaining consistency with other input types.


163-180: LGTM: Well-implemented open method for ReaderInterface

The implementation properly handles:

  • Error checking for reader state
  • Safe buffer allocation
  • Initial position tracking

190-194: LGTM: Proper cleanup in close method

The implementation correctly cleans up all resources and resets member variables for the ReaderInterface case.


239-246: LGTM: Proper stream reset implementation

The implementation correctly handles:

  • Reader position reset
  • Buffer state reset
  • EOF condition as specified in PR objectives

Copy link
Member

@LinZhihao-723 LinZhihao-723 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we also update Compressor to support reader interface in this PR to make feature parity?

Comment on lines 11 to 19
: ::clp::streaming_compression::Decompressor{CompressorType::ZSTD},
m_input_type{InputType::NotInitialized},
m_decompression_stream{nullptr},
m_reader{nullptr},
m_reader_initial_pos{0},
m_read_buffer_length{0},
m_read_buffer_capacity{0},
m_decompressed_stream_pos{0},
m_unused_decompressed_stream_block_size{0} {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we're touching here anyway, would u mind if we move these initializations to the header file where they're defined?

@gibber9809 gibber9809 changed the title feat(clp): Modify Decompressor sub-classes to accept clp::ReaderInterface. feat(clp): Modify Decompressor sub-classes to accept clp::ReaderInterface; modify Compressor sub-classes to accept clp::WriterInterface. Jan 27, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🔭 Outside diff range comments (1)
components/core/src/clp/streaming_archive/writer/Segment.hpp (1)

Line range hint 89-89: Consider using WriterInterface instead of FileWriter.

The member variable m_file_writer is currently typed as FileWriter. To align with the PR's objective of using generic interfaces, consider updating this to use WriterInterface instead. This would make the Segment class more flexible, allowing it to work with different types of writers (similar to how the PR is updating decompressors to work with different types of readers).

🧹 Nitpick comments (8)
components/core/src/clp/streaming_compression/zstd/Compressor.hpp (1)

75-78: Update parameter names in documentation to match implementation.

The parameter names in the documentation (writer) should match the parameter names used in the method signatures for consistency and clarity.

Apply this diff to update the documentation:

     /**
      * Initializes the compression stream with the default compression level
-     * @param writer
+     * @param writer Writer interface to write compressed data to
      */
     auto open(WriterInterface& writer) -> void override {
         this->open(writer, cDefaultCompressionLevel);
     }

     /**
      * Initializes the compression stream with the given compression level
-     * @param writer
-     * @param compression_level
+     * @param writer Writer interface to write compressed data to
+     * @param compression_level Compression level to use
      */
     auto open(WriterInterface& writer, int compression_level) -> void;

Also applies to: 83-86

components/core/src/clp/streaming_compression/zstd/Compressor.cpp (1)

29-29: Consider adding documentation about WriterInterface requirements.

It would be helpful to document any specific requirements or expectations for the WriterInterface implementation (e.g., thread safety, blocking behaviour, error handling).

components/core/src/clp/streaming_compression/lzma/Compressor.hpp (2)

82-86: Enhance method documentation for consistency.

The open() method's documentation is quite brief compared to other methods in the file. Consider adding more details about the method's purpose and behaviour.

     /**
-     * Open the compression stream for encoding to the writer.
+     * Opens the compression stream for encoding to the specified writer.
      *
+     * @throw OperationFailed if the compressor is already open or if
+     *        initialization fails
      * @param writer The writer interface to compress data to
      */

18-21: Consider documenting thread safety guarantees.

Since this class handles I/O operations through an interface, it would be helpful to document thread safety guarantees or requirements, particularly regarding concurrent access to the writer.

 /**
  * Implements a LZMA compressor that compresses byte input data to a clp::WriterInterface.
+ *
+ * Thread Safety: This class is not thread-safe. Concurrent access to the writer
+ * must be externally synchronized.
  */
components/core/src/clp/streaming_compression/passthrough/Compressor.hpp (1)

73-75: Update parameter documentation.

The documentation comment for the open method should be updated to describe the writer parameter.

Apply this diff to improve the documentation:

     /**
      * Initializes the compression stream
-     * @param writer
+     * @param writer The writer interface to use for output
      */
     auto open(WriterInterface& writer) -> void override;
components/core/src/clp/streaming_compression/zstd/Decompressor.cpp (3)

Line range hint 52-72: Simplify the nested condition in the ReaderInterface case.

The nested condition checking for EOF can be simplified for better readability.

Consider this simplified version:

                 case InputType::ReaderInterface: {
                     auto error_code = m_reader->try_read(
                             reinterpret_cast<char*>(m_read_buffer.get()),
                             m_read_buffer_capacity,
                             m_read_buffer_length
                     );
                     if (ErrorCode_Success != error_code) {
-                        if (ErrorCode_EndOfFile == error_code) {
-                            num_bytes_read = decompressed_stream_block.pos;
-                            if (0 == decompressed_stream_block.pos) {
-                                return ErrorCode_EndOfFile;
-                            } else {
-                                return ErrorCode_Success;
-                            }
-                        } else {
+                        if (ErrorCode_EndOfFile == error_code && decompressed_stream_block.pos > 0) {
+                            num_bytes_read = decompressed_stream_block.pos;
+                            return ErrorCode_Success;
+                        }
+                        num_bytes_read = decompressed_stream_block.pos;
+                        if (ErrorCode_EndOfFile == error_code) {
+                            return ErrorCode_EndOfFile;
+                        }
                         return error_code;
-                        }
                     }

161-165: Improve readability of the error handling condition.

The condition checking for success or EOF can be made more readable.

Consider this more readable version:

-    if (auto const rc = m_reader->try_get_pos(m_reader_initial_pos);
-        false == (ErrorCode_Success == rc || ErrorCode_EndOfFile == rc))
-    {
-        throw OperationFailed(rc, __FILENAME__, __LINE__);
-    }
+    auto const rc = m_reader->try_get_pos(m_reader_initial_pos);
+    if (rc != ErrorCode_Success && rc != ErrorCode_EndOfFile) {
+        throw OperationFailed(rc, __FILENAME__, __LINE__);
+    }

231-235: Apply consistent error handling style.

Similar to the open method, the condition checking for success or EOF can be made more readable.

Consider this more readable version:

-        if (auto const rc = m_reader->try_seek_from_begin(m_reader_initial_pos);
-            false == (ErrorCode_Success == rc || ErrorCode_EndOfFile == rc))
-        {
-            throw OperationFailed(rc, __FILENAME__, __LINE__);
-        }
+        auto const rc = m_reader->try_seek_from_begin(m_reader_initial_pos);
+        if (rc != ErrorCode_Success && rc != ErrorCode_EndOfFile) {
+            throw OperationFailed(rc, __FILENAME__, __LINE__);
+        }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 602d084 and 677bd03.

📒 Files selected for processing (10)
  • components/core/src/clp/streaming_archive/writer/Segment.hpp (1 hunks)
  • components/core/src/clp/streaming_compression/Compressor.hpp (1 hunks)
  • components/core/src/clp/streaming_compression/lzma/Compressor.cpp (5 hunks)
  • components/core/src/clp/streaming_compression/lzma/Compressor.hpp (4 hunks)
  • components/core/src/clp/streaming_compression/passthrough/Compressor.cpp (2 hunks)
  • components/core/src/clp/streaming_compression/passthrough/Compressor.hpp (2 hunks)
  • components/core/src/clp/streaming_compression/zstd/Compressor.cpp (7 hunks)
  • components/core/src/clp/streaming_compression/zstd/Compressor.hpp (3 hunks)
  • components/core/src/clp/streaming_compression/zstd/Decompressor.cpp (6 hunks)
  • components/core/src/clp/streaming_compression/zstd/Decompressor.hpp (4 hunks)
🧰 Additional context used
📓 Path-based instructions (10)
components/core/src/clp/streaming_archive/writer/Segment.hpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

components/core/src/clp/streaming_compression/passthrough/Compressor.cpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

components/core/src/clp/streaming_compression/zstd/Decompressor.cpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

components/core/src/clp/streaming_compression/passthrough/Compressor.hpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

components/core/src/clp/streaming_compression/zstd/Compressor.hpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

components/core/src/clp/streaming_compression/lzma/Compressor.hpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

components/core/src/clp/streaming_compression/zstd/Compressor.cpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

components/core/src/clp/streaming_compression/lzma/Compressor.cpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

components/core/src/clp/streaming_compression/Compressor.hpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

components/core/src/clp/streaming_compression/zstd/Decompressor.hpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

📓 Learnings (1)
components/core/src/clp/streaming_compression/lzma/Compressor.cpp (1)
Learnt from: Bill-hbrhbr
PR: y-scope/clp#614
File: components/core/src/clp/streaming_compression/lzma/Compressor.hpp:65-75
Timestamp: 2024-12-19T01:40:13.272Z
Learning: "try_get_pos" method in `Compressor.hpp` does not throw exceptions, as clarified by Bill-hbrhbr.
⏰ Context from checks skipped due to timeout of 90000ms (12)
  • GitHub Check: centos-stream-9-static-linked-bins
  • GitHub Check: build-macos (macos-14, false)
  • GitHub Check: centos-stream-9-dynamic-linked-bins
  • GitHub Check: ubuntu-focal-static-linked-bins
  • GitHub Check: build-macos (macos-13, false)
  • GitHub Check: ubuntu-focal-dynamic-linked-bins
  • GitHub Check: build-macos (macos-13, true)
  • GitHub Check: ubuntu-jammy-static-linked-bins
  • GitHub Check: ubuntu-jammy-dynamic-linked-bins
  • GitHub Check: lint-check (ubuntu-latest)
  • GitHub Check: lint-check (macos-latest)
  • GitHub Check: build (macos-latest)
🔇 Additional comments (30)
components/core/src/clp/streaming_compression/zstd/Compressor.hpp (2)

11-11: LGTM! Include statement changes are appropriate.

The replacement of FileWriter.hpp with WriterInterface.hpp aligns with the PR's objective of generalizing the interface.


95-95: LGTM! Member variable changes are appropriate.

The renaming of the member variable and its type change to WriterInterface* properly reflects its more generic nature while maintaining the same initialization pattern.

components/core/src/clp/streaming_compression/lzma/Compressor.cpp (7)

14-14: LGTM! Include statement properly added.

The addition of WriterInterface.hpp is necessary and correctly placed with other local includes.


Line range hint 17-33: LGTM! Method signature and implementation properly updated.

The open method correctly handles the WriterInterface parameter and maintains proper error checking.


Line range hint 35-47: LGTM! Proper cleanup in close method.

The close method correctly handles the WriterInterface member and maintains proper resource cleanup sequence.


Line range hint 49-62: LGTM! Write method properly handles data streaming.

The write method maintains proper error checking and correctly updates the stream position.


64-69: LGTM! Flush method maintains proper state checks.

The flush method correctly verifies initialization state before proceeding.


Line range hint 71-76: LGTM! Position tracking properly implemented.

The try_get_pos method correctly follows the non-throwing behaviour as confirmed by previous learnings.


Line range hint 146-150: LGTM! Buffer flushing properly implemented.

The flush_stream_output_block_buffer method correctly uses the WriterInterface to write compressed data.

components/core/src/clp/streaming_compression/zstd/Compressor.cpp (3)

10-10: LGTM! Include statement follows proper ordering.

The addition of WriterInterface.hpp follows the established include pattern, properly placed with other local includes.


29-30: LGTM! Consistent implementation of WriterInterface throughout the class.

The changes properly implement the generic WriterInterface while maintaining consistent error handling. The member variable name change better reflects its generic nature.

Also applies to: 50-50, 59-59, 125-125


89-92: LGTM! Consistent writer interface usage across all methods.

The WriterInterface is properly utilized for all write operations, maintaining the existing error handling and type safety through appropriate static_cast usage.

Also applies to: 116-119, 149-152

components/core/src/clp/streaming_compression/lzma/Compressor.hpp (3)

12-12: LGTM! Include and documentation updates align with interface changes.

The addition of WriterInterface header and updated class documentation accurately reflect the transition from file-specific to interface-based output handling.

Also applies to: 18-18


61-64: LGTM! Documentation accurately reflects WriterInterface usage.

The flush() method documentation correctly describes the interaction with WriterInterface.


221-221: LGTM! Member variable updated appropriately.

The member variable type and name changes appropriately reflect the transition to the more generic WriterInterface.

components/core/src/clp/streaming_compression/passthrough/Compressor.hpp (3)

8-8: LGTM! Include statement updated correctly.

The include statement has been properly updated to use the generic WriterInterface header, aligning with the PR's objective of moving away from concrete FileWriter implementations.


79-79: LGTM! Member variable updated correctly.

The member variable has been properly updated to use WriterInterface* type with an appropriate name change to reflect its more generic nature.


Line range hint 52-67: Verify error handling in write and flush methods.

Since we're now accepting a generic WriterInterface, we should ensure that the write and flush methods properly handle potential errors from different implementations of WriterInterface.

Let's check the implementation:

✅ Verification successful

Error handling in write and flush methods is properly implemented

The implementation correctly handles:

  • Uninitialized writer state (ErrorCode_NotInit)
  • Invalid input validation (ErrorCode_BadParam)
  • Empty data cases
  • Error propagation from the underlying writer
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for write and flush implementations to verify error handling
rg -A 10 "auto.*write\(.*\).*->.*void.*override" components/core/src/clp/streaming_compression/passthrough/
rg -A 10 "auto.*flush\(.*\).*->.*void.*override" components/core/src/clp/streaming_compression/passthrough/

Length of output: 2554


Script:

#!/bin/bash
# Search for the implementation file and class definition
fd Compressor.cpp components/core/src/clp/streaming_compression/passthrough/ --exec cat {}

Length of output: 1468

components/core/src/clp/streaming_compression/passthrough/Compressor.cpp (6)

7-7: LGTM! Include statement updated appropriately.

The change from FileWriter.hpp to WriterInterface.hpp aligns with the PR's objective of using a more generic interface.


11-11: LGTM! Proper null checks and error handling.

The changes maintain robust error handling while following the coding guidelines for null checks.

Also applies to: 23-23


27-27: LGTM! Consistent error handling and member variable usage.

The flush method maintains consistency with the write method in terms of error handling and member variable usage.

Also applies to: 31-31


35-35: LGTM! Appropriate error handling pattern.

The method correctly uses ErrorCode returns for error handling and maintains consistent null checks.

Also applies to: 39-39


43-43: LGTM! Proper cleanup.

The close method correctly nullifies the writer reference.


46-47: LGTM! Successfully generalized the interface.

The open method now accepts a generic WriterInterface, which aligns with the PR's objective of supporting different writer implementations.

components/core/src/clp/streaming_compression/zstd/Decompressor.hpp (4)

9-9: LGTM!

The include statement for ReaderInterface.hpp is correctly placed and necessary for the class implementation.


76-76: LGTM!

The method signature change from FileReader to ReaderInterface aligns with the PR objectives and enables more flexible input handling.


112-112: LGTM!

The enum update from File to ReaderInterface correctly reflects the architectural change.


123-138: Well-structured member variable declarations with default initializers.

The default initializers improve code safety by ensuring variables have known states, and the naming is more generic and better reflects their purpose.

components/core/src/clp/streaming_archive/writer/Segment.hpp (1)

9-9: Consider using WriterInterface.hpp instead.

Given the PR's objective to move towards generic interfaces, consider including WriterInterface.hpp instead of FileWriter.hpp. This would align with the broader changes being made across the codebase.

Let's verify the availability and usage of WriterInterface:

✅ Verification successful

The review comment is correct - WriterInterface.hpp should be used instead of FileWriter.hpp

The codebase shows a clear pattern of using WriterInterface, particularly in the compression components. The Segment class passes its FileWriter to compressors that expect WriterInterface, indicating it should also use the interface. This change would:

  • Align with the PR's objective of moving towards interfaces
  • Match the existing pattern where compressors work with WriterInterface
  • Improve flexibility by removing direct dependency on FileWriter implementation
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if WriterInterface.hpp exists and is being used elsewhere
echo "Checking for WriterInterface header..."
fd WriterInterface.hpp

echo -e "\nChecking for WriterInterface usage..."
rg "WriterInterface" -A 5

Length of output: 31492


Script:

#!/bin/bash
# Check Segment.hpp implementation and similar files
echo "=== Checking Segment.hpp implementation ==="
rg -A 10 "class Segment" "components/core/src/clp/streaming_archive/writer/Segment.hpp"

echo -e "\n=== Checking FileWriter usage in Segment.hpp ==="
rg "FileWriter" "components/core/src/clp/streaming_archive/writer/Segment.hpp"

echo -e "\n=== Looking for similar patterns in other files ==="
rg "m_file_writer" -A 2

Length of output: 14405

components/core/src/clp/streaming_compression/Compressor.hpp (1)

72-74: LGTM! Verify derived class implementations.

The change to use WriterInterface instead of FileWriter makes the interface more generic and aligns with the PR objectives. This will enable use cases like writing to network streams.

Let's verify that all derived classes have been updated to match the new signature:

✅ Verification successful

All implementations have been properly updated to use WriterInterface

The change from FileWriter to WriterInterface has been consistently applied across all derived compressor implementations in the clp::streaming_compression namespace:

  • LZMA Compressor
  • Passthrough Compressor
  • ZSTD Compressor
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify that all derived classes implement the new open method signature

# Search for classes that inherit from Compressor
ast-grep --pattern 'class $_ : $$$public$$$Compressor$$$'

# Search for open method implementations to verify the parameter type
rg "void\s+open\s*\([^)]*\)" --type cpp

Length of output: 7340


Script:

#!/bin/bash
# Search for classes inheriting from Compressor with namespace consideration
ast-grep --pattern 'class $_ : $$$public $$$clp::streaming_compression::Compressor$$$'
ast-grep --pattern 'class $_ : $$$public $$$clp::Compressor$$$'

# Get content of specific compressor files for detailed analysis
echo "=== ZstdCompressor.hpp ==="
cat components/core/src/clp_s/ZstdCompressor.hpp
echo -e "\n=== streaming_compression/zstd/Compressor.hpp ==="
cat components/core/src/glt/streaming_compression/zstd/Compressor.hpp
echo -e "\n=== streaming_compression/passthrough/Compressor.hpp ==="
cat components/core/src/glt/streaming_compression/passthrough/Compressor.hpp

Length of output: 8021


Script:

#!/bin/bash
# Find the base Compressor class definition and namespace relationships
echo "=== Base Compressor Class ==="
cat components/core/src/clp/streaming_compression/Compressor.hpp

echo -e "\n=== CLP_S Compressor Class ==="
cat components/core/src/clp_s/Compressor.hpp

echo -e "\n=== GLT Compressor Class ==="
cat components/core/src/glt/streaming_compression/Compressor.hpp

Length of output: 5381


Script:

#!/bin/bash
# Find implementations of clp::streaming_compression::Compressor
echo "=== Searching for implementations ==="
# Look for files in clp/streaming_compression that might contain implementations
fd . components/core/src/clp/streaming_compression --type f -e hpp -e cpp

# Search for includes of our Compressor header
echo -e "\n=== Files including our Compressor ==="
rg -l '#include.*clp/streaming_compression/Compressor\.hpp' --type cpp

Length of output: 1388


Script:

#!/bin/bash
# Check the implementation files
echo "=== LZMA Compressor ==="
cat components/core/src/clp/streaming_compression/lzma/Compressor.hpp

echo -e "\n=== Passthrough Compressor ==="
cat components/core/src/clp/streaming_compression/passthrough/Compressor.hpp

echo -e "\n=== ZSTD Compressor ==="
cat components/core/src/clp/streaming_compression/zstd/Compressor.hpp

Length of output: 14391

@LinZhihao-723
Copy link
Member

Should be almost ready to merge after fixing these comments

Co-authored-by: Lin Zhihao <59785146+LinZhihao-723@users.noreply.github.com>
Copy link
Member

@LinZhihao-723 LinZhihao-723 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Can we update the PR description to include changes in compressors?
  • For the PR title, how about:
feat(clp::streaming_compression): Update compressors/decompressors to accept generic `WriterInterface`/`ReaderInterface` for improved flexibility over `FileWriter`/`FileReader`.

@gibber9809 gibber9809 changed the title feat(clp): Modify Decompressor sub-classes to accept clp::ReaderInterface; modify Compressor sub-classes to accept clp::WriterInterface. feat(clp::streaming_compression): Update compressors/decompressors to accept generic WriterInterface/ReaderInterface for improved flexibility over FileWriter/FileReader. Jan 28, 2025
@gibber9809 gibber9809 merged commit 9eca636 into y-scope:main Jan 28, 2025
22 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants