ZipArchive: Apply strategy pattern depending on ZipArchiveMode #61820
@adamsitnik I'm going to run the benchmarks. Do you know if there's potential for regressions due to using an abstract class? I mainly added it to avoid duplicating code, because there were many cases where two modes shared logic (for example, read and update do one thing, but create does another). But I wouldn't mind having some duplicate code if we have to avoid abstract overrides to prevent perf regressions.
Your ZipArchiveStrategy is currently just abstracting the Mode, as opposed to FileStreamStrategy, which abstracts the OS, sync/async, buffering, and whether the type is base or derived.
I would suggest going with a more lightweight approach for abstracting Mode and just separating the logic for each mode into static methods.
If there are other factors that support having the abstraction, then cool, but right now, the Mode is not enough IMO.
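To make the static-method suggestion concrete, here is a minimal sketch, not the actual ZipArchive code. The class `ZipArchiveModeHelpers` and its methods are hypothetical names introduced only for illustration; the per-mode stream requirements (readable for Read, writable for Create, readable, writable, and seekable for Update) follow the documented `ZipArchive` constructor behavior.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper (assumption, not part of the PR): one static method per
// mode, selected by a switch, so no strategy object has to be allocated.
internal static class ZipArchiveModeHelpers
{
    // Dispatches to the mode-specific validation routine.
    internal static void ValidateStream(Stream stream, ZipArchiveMode mode)
    {
        switch (mode)
        {
            case ZipArchiveMode.Read:
                ValidateForRead(stream);
                break;
            case ZipArchiveMode.Create:
                ValidateForCreate(stream);
                break;
            case ZipArchiveMode.Update:
                ValidateForUpdate(stream);
                break;
            default:
                throw new ArgumentOutOfRangeException(nameof(mode));
        }
    }

    // Read mode only needs a readable stream.
    private static void ValidateForRead(Stream stream)
    {
        if (!stream.CanRead)
            throw new ArgumentException("Read mode requires a readable stream.", nameof(stream));
    }

    // Create mode only needs a writable stream.
    private static void ValidateForCreate(Stream stream)
    {
        if (!stream.CanWrite)
            throw new ArgumentException("Create mode requires a writable stream.", nameof(stream));
    }

    // Update mode needs to read, write, and seek.
    private static void ValidateForUpdate(Stream stream)
    {
        if (!stream.CanRead || !stream.CanWrite || !stream.CanSeek)
            throw new ArgumentException("Update mode requires a readable, writable, seekable stream.", nameof(stream));
    }
}
```

The trade-off in this sketch is that logic shared by two modes has to be factored into a common private helper by hand, rather than being inherited from a base class.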
@@ -262,19 +262,19 @@ public long Length
/// <exception cref="ObjectDisposedException">The ZipArchive that this entry belongs to has been disposed.</exception>
public void Delete()
{
-    if (_archive == null)
+    if (_strategy == null)
_strategy is not nullable.
Good point. I'll check if there are other similar existing conditions I can simplify.
namespace System.IO.Compression
{
    internal interface IZipArchiveStrategy
Why an interface?
It seems I don't need it. The abstract class alone is enough. I'll remove it.
namespace System.IO.Compression
{
    internal class ZipArchiveCreateStrategy : ZipArchiveStrategy
nit: seal all the leaf classes.
- internal class ZipArchiveCreateStrategy : ZipArchiveStrategy
+ internal sealed class ZipArchiveCreateStrategy : ZipArchiveStrategy
...ibraries/System.IO.Compression/src/System/IO/Compression/ZipStrategies/ZipArchiveStrategy.cs
private Stream? _backingStream;
private byte[]? _archiveComment;
private Encoding? _entryNameEncoding;
internal IZipArchiveStrategy Strategy { get; }
This is a new allocation.
It's similar to what we have in FileStream. If we find places where we can improve perf, that can compensate for this extra allocation. Unfortunately, because of the inheritance from the abstract class, I cannot switch to using structs. Do you have any alternatives?
I generally agree with @jozkee here; I'm not seeing what this abstraction is really buying. With FileStream, we were contending with Windows vs Unix, buffered vs unbuffered, pipe vs file, async vs sync, etc., and the matrix of all of those is a large part of what made the implementation untenable. In contrast, here there appears to be just a single dimension, and in each of the concrete implementations, most of the members are empty/minimal. I'm not against an internal abstraction here if it helps more than it hurts, but I'm not seeing it at the moment. You mention this will help better address future issues; can you elaborate?
3 things come to mind:
A good alternative @jozkee gave me is to add specialized methods instead, and split the methods into partial files depending on the method. I like the option since I don't have strong arguments for points 2 and 3 yet. So until I can confirm we have an additional abstraction layer besides How does that sound?
Thanks for the additional details. My suggestion is to hold off on such a refactoring and the introduction of an internal abstraction until there's an actual need, i.e. do it in the same PR where it's being used. Otherwise, we're speculating about what the abstraction should be and whether it's helpful without actually being able to take it for a spin. In this regard it's the same as introducing new public abstractions: you want to both consume them and implement them several times to have confidence in their shape. Obviously we have more flexibility with an internal implementation detail, but it's similar.
Ok thanks. I'll continue investigating the
The ZipArchiveMode parameter that is passed to the ZipArchive constructor plays an important role in how we decide to manipulate the zip stream that is passed to the same constructor.

Currently, the ZipArchive code heavily intermixes the logic for reading, creating, and updating, which makes it difficult to debug, diagnose and fix bugs, or extend the class with new features.

This PR organizes the ZipArchive code using the strategy pattern, like we did recently with FileStream, to help reduce those problems considerably.

This change would make it easier to address some zip-related issues that we might consider for .NET 7.0:

If this proposed PR is accepted, then the next step would be to also separate concerns in ZipArchiveEntry, since it also intermixes create/read/update logic, although not as heavily as ZipArchive.
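For readers unfamiliar with the pattern, the shape being proposed can be sketched roughly as follows. This is an illustration under assumptions, not the code from the PR: only the type names `ZipArchiveStrategy` and `ZipArchiveCreateStrategy` appear in the diff, while `ZipArchiveReadStrategy`, `ZipArchiveUpdateStrategy`, the `For` factory, and the two capability properties are hypothetical. The capabilities reflect the documented `ZipArchive` behavior: entries cannot be created in Read mode or enumerated in Create mode.

```csharp
using System;
using System.IO.Compression;

// Abstract base holding the mode-dependent behavior (members are hypothetical).
internal abstract class ZipArchiveStrategy
{
    internal abstract bool CanReadEntries { get; }
    internal abstract bool CanAddEntries { get; }

    // Hypothetical factory: picks one strategy per ZipArchiveMode.
    internal static ZipArchiveStrategy For(ZipArchiveMode mode) => mode switch
    {
        ZipArchiveMode.Read => new ZipArchiveReadStrategy(),
        ZipArchiveMode.Create => new ZipArchiveCreateStrategy(),
        ZipArchiveMode.Update => new ZipArchiveUpdateStrategy(),
        _ => throw new ArgumentOutOfRangeException(nameof(mode)),
    };
}

// Leaf classes are sealed, as requested in the review.
internal sealed class ZipArchiveReadStrategy : ZipArchiveStrategy
{
    internal override bool CanReadEntries => true;
    internal override bool CanAddEntries => false;
}

internal sealed class ZipArchiveCreateStrategy : ZipArchiveStrategy
{
    internal override bool CanReadEntries => false;
    internal override bool CanAddEntries => true;
}

internal sealed class ZipArchiveUpdateStrategy : ZipArchiveStrategy
{
    internal override bool CanReadEntries => true;
    internal override bool CanAddEntries => true;
}
```

Note that each archive instance in this sketch allocates one strategy object, which is the extra allocation raised during review.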