-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
extract out the connector changelog modification out of the bump_version code #34586
extract out the connector changelog modification out of the bump_version code #34586
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎ 1 Ignored Deployment
|
This stack of pull requests is managed by Graphite. Learn more about stacking. Join @stephane-airbyte and the rest of your teammates on Graphite |
0ea0a74
to
f3301f8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The overall logic looks perfect to me. I made a couple of suggestion to boost reusability and make it a bit more refactor proof.
def __init__(self, changelog_file_path: Path) -> None: | ||
self.changelog_file_path = changelog_file_path | ||
if not changelog_file_path.exists(): | ||
raise Exception(f"File {changelog_file_path} does not exist") | ||
if not changelog_file_path.is_file(): | ||
raise Exception(f"File {changelog_file_path} is not a regular file") | ||
self.original_markdown_lines = changelog_file_path.read_text().splitlines() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we pass the original_markdown_lines
as a constructor parameter and avoid path checks logic there?
I'd love to avoid direct filesystem access if possible.
I'd prefer the changelog read to happen from a dagger directory object. This will help us make this Changelog agnostic of the underlying filesystem / path: a dagger directory could come from a container, a remote git repo or the host fs.
Feel free to not change how the documentation file path is fetched, but please change the constructor to take the raw markdown string as an input instead of the file path
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
changelog_entry_re = ( | ||
"^\\| *(?P<version>[0-9]+\\.[0-9+]+\\.[0-9]+?) *\\| *" | ||
+ "(?P<day>[0-9]{4}-[0-9]{2}-[0-9]{2}) *\\| *" | ||
+ "\\[?(?P<pr_number1>[0-9]*)\\]? ?\\(https://github.com/airbytehq/airbyte/pull/(?P<pr_number2>[0-9]*)\\) *\\| *" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a github_repo_url
parameter to the constructor which defaults to a constant built from this one?
It'll make it less coupled to our repo if we allow airbyte-ci
to run on different repo anytime
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
) | ||
return "\n".join(new_lines) + "\n" | ||
|
||
def __parse_markdown(self) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a docstring describing the parsing logic?
changelog_header_line_index = line_index | ||
break | ||
if changelog_header_line_index == -1: | ||
raise Exception("Could not find the changelog section table in the documentation file.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you create a custom ChangelogParsing
exception? Exception
is too broad 😀
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
self.old_entries.add(changelog_entry) | ||
|
||
|
||
class ChangelogEntry: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use a frozen dataclass there? It'll make the class leaner and make it explicitely a stateless object
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no idea what a "frozen dataclass" is... I guess I have some reading to do
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can consider Dataclass to be models / POJO. They're could frozen when they're immutable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
self.pr_number = int(pr_number) | ||
self.comment = comment | ||
|
||
def to_str(self) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def to_str(self) -> str: | |
def to_markdown(self) -> str: |
or
def to_str(self) -> str: | |
def __str__(self) -> str: |
if you'd like to call str(changelog_entry)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
) | ||
return "\n".join(new_lines) + "\n" | ||
|
||
def __parse_markdown(self) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make this a public static or class method which would return a list of changelog entry?
I'd find it more elegant to call it this way in the contrusctor:
self.old_entries = self.parse_markdown(original_markdown_line)
Making it public and a static/class method can simplify testing and reuse in other contexts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the parse_markdown does a lot more than just returning the old entries. It also sets the start and end lines of the changelog. I guess we could redo that in the to_markdown
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is exactly the approach I suggested before, thank you so much!
7b3473a
to
9338a2e
Compare
@alafanechere I believe I addressed all your comments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@stephane-airbyte the code looks good to me now! Thank you for the changes. I added a minor suggestion.
I manually tried it on a couple of python connectors and realized:
- The error reporting was not working. I fixed it in 1c91364
- The exception thrown is
The changelog table in the documentation file is missing the header delimiter
.
Can you have a look to make sure the majority of our python connectors can be bumped with this command?
I tried:
airbyte-ci connectors --name=source-stripe bump_version minor 123 "new stuff"
return self.__str__().__hash__() | ||
|
||
|
||
def parse_markdown(markdown_lines: list[str], github_repo: str) -> [int, set[ChangelogEntry]]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def parse_markdown(markdown_lines: list[str], github_repo: str) -> [int, set[ChangelogEntry]]: | |
def parse_changelog_in_markdown(markdown_lines: list[str], github_repo: str) -> [int, set[ChangelogEntry]]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please add a docstring?
1c91364
to
b12bbd7
Compare
502a276
to
4b43fa9
Compare
4b43fa9
to
723d101
Compare
723d101
to
848e8a7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fixed type checks which failed the CI.
I tried the command on python connectors and it works as expected 👌
Thanks Stephane for this improvement!
848e8a7
to
51659f3
Compare
We want to move the changelog mutation out of the connectors command, so we can reuse it for java CDK.
all connectors were tested, and the only ones that were failing are fixed in #36039