-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix incosistent hashing #674
Changes from 6 commits
ed544b8
f253d37
65cd68b
dce3cea
310d0d4
fb89dfe
d66d8d1
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -22,14 +22,12 @@ | |
import hashlib | ||
import io | ||
import os | ||
import re | ||
from pathlib import Path | ||
from typing import Any, Dict, Generator, Optional, Sized, Tuple, cast | ||
|
||
import base58 | ||
|
||
from aea.helpers.cid import to_v1 | ||
from aea.helpers.io import open_file | ||
from aea.helpers.ipfs.utils import _protobuf_python_implementation | ||
|
||
|
||
|
@@ -43,35 +41,13 @@ | |
from aea.helpers.ipfs.pb.merkledag_pb2 import PBNode # type: ignore | ||
|
||
|
||
def _dos2unix(file_content: bytes) -> bytes: | ||
""" | ||
Replace occurrences of Windows line terminator CR/LF with only LF. | ||
|
||
:param file_content: the content of the file. | ||
:return: the same content but with the line terminator | ||
""" | ||
return re.sub(b"\r\n", b"\n", file_content, flags=re.M) | ||
|
||
|
||
def _is_text(file_path: str) -> bool: | ||
"""Check if a file can be read as text or not.""" | ||
try: | ||
with open_file(file_path, "r") as f: | ||
f.read() | ||
return True | ||
except UnicodeDecodeError: | ||
return False | ||
|
||
|
||
def _read(file_path: str) -> bytes: | ||
"""Read a file, replacing Windows line endings if it is a text file.""" | ||
is_text = _is_text(file_path) | ||
"""Read and verify the file is not empty.""" | ||
with open(file_path, "rb") as file: | ||
file_b = file.read() | ||
if is_text: | ||
file_b = _dos2unix(file_b) | ||
|
||
return file_b | ||
data = file.read() | ||
if len(data) == 0: | ||
raise ValueError(f"File cannot be empty: {file_path}") | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @angrybayblade why can not be empty? looks like empty file is a common case. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The IPFS daemon treats empty files differently, if you push an empty file to the IPFS node it'll produce different hash compared to using the IPFSHashTool, now the reason behind this unclear I tried looking at the IPFS docs but could not find anything related. So for now we just raise an error when a user tries to hash an empty file so they don't run into these mismatching hashes issue when publishing the packages There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @angrybayblade ok, let stay as is. |
||
return data | ||
|
||
|
||
def chunks(data: Sized, size: int) -> Generator: | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the logic which caused the inconsistent hash issues when locking and publishing packages, not sure why this was here in the first place but removing this logic fixes the issue and does not cause any new issue AFAIC so can be removed safely