Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-29237: Create enum for pstats sorting options #5103

Merged
merged 9 commits into from
Jan 26, 2018

Conversation

mwidjaja
Copy link
Contributor

@mwidjaja mwidjaja commented Jan 5, 2018

Copy link
Member

@ethanfurman ethanfurman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Enum member names need to be in all caps: SortKey.TIME.

Where several values mean the same thing, have one canonical member, and the rest duplicates, e.g.:
CUMULATIVE = 'cumulative'
CUMTIME = 'cumulative'

@bedevere-bot
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@@ -151,11 +151,17 @@ might try the following sort calls::
p.sort_stats('name')
p.print_stats()

p.sort_stats(SortKey.name)
p.print_stats()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello and thanks for the PR!

Here I would recomment keeping only the example with the enum, since it’s the way that we want to recommend.

To match the import style (see import pstats in the previous code block), the example would need to be p.sort_stats(pstats.SortKey.NAME) (assuming the upper-case fix asked by the other reviewer is done).

I wonder if the enum should be defined on the Stats class, to make code shorter: p.sort_stats(p.SortKey.NAME). Maybe worth asking on the ticket to get other people’s opinion.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would change the import to

from pstats import Stats, SortKey

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pstats module is all about stats processing, so there is no need to hide SortKey inside the Stats class.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the comment. Re: 'Here I would recomment keeping only the example with the enum, since it’s the way that we want to recommend.' I could do that, but I have also received comment about having both example as the string argument is still valid. Showing only the enum would make the doc incomplete.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes there is a need to also show an example with a string form, but not all code examples need both forms.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mwidjaja

When you make the duplicate SortKeys actually duplicates, please put the better name first -- something like:

class SortKey(str, Enum):
    calls = 'calls'
    ncalls = 'calls'
    cumulative = 'cumulative'
    cumtime = 'cumulative'
    module = 'module'
    file = 'module'
    filename = 'module'
    line = 'line'
    name = 'name'
    nfl = 'nfl'
    pcalls = 'pcalls'
    stdname = 'stdname'
    time = 'time'
    tottime = 'time'

@mwidjaja
Copy link
Contributor Author

mwidjaja commented Jan 5, 2018

@ethanfurman:

Thank you for your review. Regarding the request to change CUMTIME to 'cumulative', wouldn't this break the existing code that uses 'cumtime' as the sort criteria? I guess I can internally map the string 'cumtime' to 'cumulative' to allow it. If you have other suggestion, let me know.

@ethanfurman
Copy link
Member

ethanfurman commented Jan 5, 2018 via email

@mwidjaja
Copy link
Contributor Author

mwidjaja commented Jan 6, 2018

@ethanfurman:
I do not see that path. Would you point me to the string-based code path?
I see this path for both arg type:

for word in field:
    sort_tuple = sort_tuple + sort_arg_defs[word][0]
    self.sort_type += connector + sort_arg_defs[word][1]
    connector = ", "

p.print_stats()

The first call will actually sort the list by function name, and the second call
will print out the statistics. The following are some interesting calls to
experiment with::

p.sort_stats('cumulative').print_stats(10)
p.sort_stats(SortKey.cumulative).print_stats(10)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think showing both ways in these examples is not useful and could create confusion.
It would be enough to have one example here and the changes that you already did in the method docs later in this file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. unless there're objections from anyone else, I'll remove them and just keep the one in the method doc.

@mwidjaja
Copy link
Contributor Author

@ethanfurman, would you be able to give me some hints on your comments about separate code path. Thank you.

@mwidjaja
Copy link
Contributor Author

I have made the requested changes; please review again.

@bedevere-bot
Copy link

Thanks for making the requested changes!

@ethanfurman: please review the changes made to this pull request.

@mwidjaja
Copy link
Contributor Author

@merwok, @ethanfurman : If there's anything else I need to do, let me know, otherwise this PR is ready for another review. Thank you

@@ -148,14 +148,14 @@ entries according to the standard module/line/name string that is printed. The
:meth:`~pstats.Stats.print_stats` method printed out all the statistics. You
might try the following sort calls::

p.sort_stats('name')
p.sort_stats(SortKey.name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If people follow the code examples here, SortKey will be undefined. It needs to be imported.

@@ -424,6 +428,8 @@ Analysis of the profiler data is done using the :class:`~pstats.Stats` class.

.. For compatibility with the old profiler.

.. versionadded:: 3.7
Added the SortKey enums.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

enum*

Lib/pstats.py Outdated
"pcalls" : (((0,-1), ), "primitive call count"),
"stdname" : (((7, 1), ), "standard name"),
"time" : (((2,-1), ), "internal time"),
"tottime" : (((2,-1), ), "internal time"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we still have to support string arguments, and the SortKey members are also strings, I would just leave the original code alone here.

Lib/pstats.py Outdated
STDNAME = 'stdname'
TIME = 'time'
TOTTIME = 'tottime'

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My initial understanding of this was flawed. Since we need to support the old string values we can't just have CUMTIME and CUMULATIVE map to "cumulative" because then "cumtime" is gone (unless we do extra work, which I see you did).

What we need here is a MultiValueEnum, so SortKey.CUMULATIVE maps to both "cumulative" and "cumtime".

The Enum should look like this:

class SortKey(str, Enum):

    CALLS = 'calls', 'ncalls'
    CUMULATIVE = 'cumulative', "cumtime"
    MODULE = 'module', 'filename', 'file'
    LINE = 'line'
    NAME = 'name'
    NFL = 'nfl'
    PCALLS = 'pcalls'
    STDNAME = 'stdname'
    TIME = 'time', 'tottime'

    def __new__(cls, *values):
        obj = object.__new__(cls)
        # first value is canonical value
        obj._value_ = values[0]
        for other_value in values[1:]:
            cls._value2member_map_[other_value] = obj
        obj._all_values = values
        return obj

Operations like SortKey('file') will return SortKey.MODULE. The downside is that SortKey.FILE is undefined, but I think that is an acceptable trade-off since we are still supporting the original string values, and those values can be used to get a correct SortKey member.

Lib/pstats.py Outdated
else:
for word in field:
if isinstance(word, str) and word == 'cumtime':
word = 'cumulative'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the new SortKey enum, these three lines are no longer needed.


def test_sort_stats_string(self):
for arg in SortKey:
arg = arg.value
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When dealing with Enums, member is a better name than arg.

# 'file' sorting criteria will not work because it creates
# ambiquity with 'filename'
if arg == 'file':
continue
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand -- 'file', 'filename', and 'module' should all create the same sort -- what ambiguity is there?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While what you're saying is essentially correct, however as it is currently implemented (without any of my changes), we're allowed to abbreviate the string argument into the sort_stats as long as the abbreviation is unambiguous. For example, sort_stats would work if either 'filename' or 'filena' string is passed in. But, if string 'file' is passed, then sort_stats can't differentiate between 'filename' or 'file' and sort_stats will fail with KeyError exception. We should probably remove 'file'. I can do that, if you agree. I'm guessing that no existing code is using 'file' argument at this time.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

file and filename map to the same stat, so if we removed file and somebody passed in file it should still work.

Remove "file" and write a test to make sure it still works as intended.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Will make another pass to remove "file".

self.stats.sort_arg_dict_default[arg_str][-1])

def test_sort_stats_string(self):
for arg in SortKey:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is the string test, we should be iterating through all the possible strings (no longer a given with the new SortKey enum). Listing them all is probably the best way to go:

for arg in ('calls', 'ncalls', 'cumtime', 'cumulative', ... ):

Copy link
Member

@ethanfurman ethanfurman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some changes with regards to how the SortKey enum is created, and the cascading changes from that.

My apologies for my initial misunderstanding of the problem.

@bedevere-bot
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@bedevere-bot
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@mwidjaja
Copy link
Contributor Author

I have made the requested changes; please review again.

@bedevere-bot
Copy link

Thanks for making the requested changes!

@ethanfurman: please review the changes made to this pull request.

@@ -393,7 +393,7 @@ Analysis of the profiler data is done using the :class:`~pstats.Stats` class.
+------------------+---------------------+----------------------+
| ``'filename'`` | N/A | file name |
+------------------+---------------------+----------------------+
| ``'module'`` | SortKey.MODULE | file name |
| ``'module'`` | SortKey.FILENAME | file name |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move SortKey.FILENAME up one line to the 'filename' entry, change the 'module' entry to N/A, and we're done! Thank you!

@bedevere-bot
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@mwidjaja
Copy link
Contributor Author

Thank you @ethanfurman @merwok for the review comments.

I have made the requested changes; please review again.

@bedevere-bot
Copy link

Thanks for making the requested changes!

@ethanfurman: please review the changes made to this pull request.

@ethanfurman ethanfurman merged commit 863b1e4 into python:master Jan 26, 2018
@ethanfurman ethanfurman self-assigned this Jan 26, 2018
@mwidjaja mwidjaja deleted the fix-bpo-29237 branch January 26, 2018 13:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants