-
-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use Literal
for compression
in zipfile
#9346
Conversation
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: We could use
Literal
for the module constants, e.g.ZIP_STORED
,ZIP_DEFLATED
, etc. I can add another commit to this PR if required.
We need to either make both changes together, or not at all. Otherwise, type checkers will emit false positives on code like this, due to the fact that they will infer ZIP_DEFLATED
as being an int
, while the function requires a Literal
:
import zipfile
zip_output = zipfile.ZipFile("some/path.zip", "w", zipfile.ZIP_DEFLATED, allowZip64=True)
This is the cause of the pip
error in the mypy_primer
output: https://github.com/pypa/pip/blob/5f3f592c4581a059ff9de0fb8052ef5c6ef25fd4/src/pip/_internal/req/req_install.py#L714-L719
Note: I'm not yet sure I'm sold on the idea of making a change here at all, due to the risk of false-positive errors being emitted. But in order to see the impact of the change, we'll need to see what it looks like if we change the module-level constants to Literal
values as well.
Yes I've been checking the reported |
This comment has been minimized.
This comment has been minimized.
Related code failing: # vision (https://github.com/pytorch/vision)
_ZIP_COMPRESSION_MAP: Dict[str, int] = {
".bz2": zipfile.ZIP_BZIP2,
".xz": zipfile.ZIP_LZMA,
}
def _extract_zip(from_path: str, to_path: str, compression: Optional[str]) -> None:
with zipfile.ZipFile(
from_path, "r", compression=_ZIP_COMPRESSION_MAP[compression] if compression else zipfile.ZIP_STORED
) as zip:
zip.extractall(to_path) # jinja (https://github.com/pallets/jinja)
zip_file = ZipFile(
target, "w", dict(deflated=ZIP_DEFLATED, stored=ZIP_STORED)[zip]
) This would require a manual fix on these repos. So this is still breaking things, but imo it would result in safer code, as |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I think we need to bite the bullet and need to make these kind of changes rather sooner than later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly LGTM too, though I'm not sure it makes as much sense for ZIP64_LIMIT
, ZIP_FILECOUNT_LIMIT
and ZIP_MAX_COMMENT
to be made Literal
s as it does for the other constants. Those are very big numbers, and the constants serve a different purpose to ZIP_STORED
, ZIP_DEFLATED
, ZIP_BZIP2
and ZIP_LZMA
.
I agree. Personally, I would have left them as |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Viicos!
The stubtest failures are due to the recent release of Python 3.10.9, and are unrelated to this PR. |
Diff from mypy_primer, showing the effect of this PR on open source code: vision (https://github.com/pytorch/vision)
+ torchvision/datasets/utils.py:301: error: Argument "compression" to "ZipFile" has incompatible type "int"; expected "Literal[0, 8, 12, 14]" [arg-type]
jinja (https://github.com/pallets/jinja)
+ src/jinja2/environment.py:864: error: Argument 3 to "ZipFile" has incompatible type "int"; expected "Literal[0, 8, 12, 14]" [arg-type]
|
This reverts commit 034cfab. The commit exposed the value of the compression constants, which seem to be implementation details, in the public API. I don't see anything in the documentation about the values of the constants such as `ZIP_STORED`: https://docs.python.org/3/library/zipfile.html?highlight=zipfile#zipfile.ZipFile Example where this makes a difference: ``` from typing import Literal import zipfile def f1(p: str, compression: int) -> None: """Error: compression should have the type Literal[0, 8, 12, 14]""" zipfile.ZipFile(p, compression=compression) def f2(p: str, compression: Literal[0, 8, 12, 14]) -> None: """Works, but cryptic and exposes internal implementation details""" zipfile.ZipFile(p, compression=compression) ``` The values are of constants need to be explicitly specified if somebody wants to wrap `zipfipe.ZipFile`, which arguably exposes implementation details in a problematic fashion. Here is a real-world example where this caused a regression: https://github.com/pytorch/vision/blob/main/torchvision/datasets/utils.py#L301
This reverts commit 034cfab. The commit exposed the value of the compression constants, which seem to be implementation details, in the public API. I don't see anything in the documentation about the values of the constants such as `ZIP_STORED`: https://docs.python.org/3/library/zipfile.html?highlight=zipfile#zipfile.ZipFile Example where this makes a difference: ``` from typing import Literal import zipfile def f1(p: str, compression: int) -> None: """Error: compression should have the type Literal[0, 8, 12, 14]""" zipfile.ZipFile(p, compression=compression) def f2(p: str, compression: Literal[0, 8, 12, 14]) -> None: """Works, but cryptic and exposes internal implementation details""" zipfile.ZipFile(p, compression=compression) ``` The values are of constants need to be explicitly specified if somebody wants to wrap `zipfipe.ZipFile`, which arguably exposes implementation details in a problematic fashion. Here is a real-world example where this caused a regression: https://github.com/pytorch/vision/blob/main/torchvision/datasets/utils.py#L301
According to docs:
Literal
could be used for thecompresslevel
parameter as well:But this would require extra overloads for each
compression
value, so considering the fact that there's already a split between3.8
and3.11
this would make things more complicated.Note: We could use
Literal
for the module constants, e.g.ZIP_STORED
,ZIP_DEFLATED
, etc. I can add another commit to this PR if required.