Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-104683: Argument clinic: make format_docstring() a method on Function objects #107840

Closed

Conversation

AlexWaygood
Copy link
Member

@AlexWaygood AlexWaygood commented Aug 10, 2023

format_docstring() only ever modifies the internal state of the Function object that is self.function on the DSLParser. It makes much more sense for it to be an instance method on Function objects rather than DSLParser objects.

@erlend-aasland
Copy link
Contributor

Could you add tests for @text_signature first? There are a lot of untested paths here.

@serhiy-storchaka
Copy link
Member

I am not sure that it is a good idea. Function is a small data class. Its function is to store the data related to a function. It has few simple properties which represent that data in different forms. format_docstring() is very different. It is not idempotent. It can (and should) only be called once after processing the clinic input for the function. It can be a part of the parser, or the part of the renderer, or an independent utility, but it does not fit as a Function method.

I do not see benefits from this change.

@erlend-aasland
Copy link
Contributor

It can be a part of the parser, or the part of the renderer, or an independent utility, but it does not fit as a Function method.

IMO, it makes most sense to render (and format) the docstring in render_function(). Currently, format_docstring() is pretty complex. It may make more sense to try and clean it up first.

Regarding where the docstring render code should live: perhaps a DocstringRenderer class that can take care of this is the best option. I see the value of keeping the Function class small.

No matter where it ends up, I think it makes sense to tear this functionality out of the parser.

@AlexWaygood
Copy link
Member Author

I do not see benefits from this change.

Since the format_docstring() method solely exists to modify the internal state of Function objects, and since it nearly entirely only used the internal state of Function objects, I felt like it would better fit the OOP principle of encapsulation to have it as a method on Function objects rather than as a method on DSLParser objects. The benefit I was trying to achieve was improved code readability and maintainability; the PR achieves small simplifications of the code in several places.

However, it is true that this PR does not fix any known bugs, so if we disagree that this improves readability, then it should be abandoned. It's also true that this would be a significant increase in the complexity of the Function class, which does currently mostly serve as a simple "data holder" class.

It is not idempotent. It can (and should) only be called once after processing the clinic input for the function.

It's true that these facts about format_docstring() differ from the other methods on Function objects currently. Other than that, however, I'm not sure I agree that they're good reasons, in and of themselves, for format_docstring() not to be a method on Function. The "footgun" whereby format_docstring could accidentally be called at any point during the state machine, but should only be called in one specific place during the state machine, already exists in the code's current form, where the method exists on the DSLParser class.

@serhiy-storchaka
Copy link
Member

The problem is that Function.docstring serves different functions during its lifetime. It is a buffer to accumulate docstring lines while processing the clinic input. It is an input to format_docstring() used as a template for generating docstring. It is a generated docstring after calling format_docstring().

format_docstring() itself has two functions: parse and validate input docstring lines and generate a docstring. The former part should stay in the parser, the later part should be moved to the renderer.

The Function object passed from the parser to the renderer should have the following attributes: summary -- one line, prologue -- the part which will be inserted between the summary and the parameters list (it is empty in all current applications), epilogue -- the part which will be inserted between the summary and the parameters list. Then the renderer will combine a docstring from signature, summary, prologue, parameters and epilogue, in this order.

@AlexWaygood
Copy link
Member Author

Thanks @serhiy-storchaka, that seems like a solid analysis. Let's leave this for now, then, and save it for a more principled refactor.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants