Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DOC: Correct misuse of term high-cardinality in docs. #29811

Merged
merged 3 commits into from
Nov 26, 2019

Conversation

ebrassell
Copy link
Contributor

@ebrassell ebrassell commented Nov 23, 2019

  • closes #xxxx
  • tests added / passed
  • passes black pandas
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

Sorry, something went wrong.

Verified

This commit was signed with the committer’s verified signature.
adevade Andréas Lundgren
@ebrassell ebrassell changed the title Correct misuse of high-cardinality Correct misuse of term high-cardinality in docs. Nov 23, 2019
@ebrassell ebrassell changed the title Correct misuse of term high-cardinality in docs. DOC: Correct misuse of term high-cardinality in docs. Nov 23, 2019
@jbrockmendel
Copy link
Member

cardinality seems like a weird term to use here. can you suggest a clearer one?

@jorisvandenbossche
Copy link
Member

It's indeed a very unclear term if you don't know it, but it's also commonly used in the machine learning community for this purpose (so it's certainly a correct term, at least with the fix here).
Since it is explain afterwards, I am fine with keeping it. But maybe switching it around (first the more easy explanation, then the technical term might be better)? Eg This is especially true for text data with only few unique values (low-cardinality data).

@jorisvandenbossche jorisvandenbossche added this to the 1.0 milestone Nov 23, 2019

Verified

This commit was signed with the committer’s verified signature.
adevade Andréas Lundgren
@ebrassell
Copy link
Contributor Author

I agree that cardinality is the correct term here and certainly expect the target audience to be familiar with its proper usage, but I've reworded it as suggested for the benefit of those that might not be familiar with the term.

Verified

This commit was signed with the committer’s verified signature.
adevade Andréas Lundgren
Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@jbrockmendel
Copy link
Member

LGTM

@jreback
Copy link
Contributor

jreback commented Nov 25, 2019

@gfyoung can you commit your suggestion and merge.

@gfyoung
Copy link
Member

gfyoung commented Nov 26, 2019

@jreback : This PR didn't allow for maintainers to make changes.

I will merge this and follow-up with the grammar fix.

@gfyoung gfyoung merged commit db60ab6 into pandas-dev:master Nov 26, 2019
@gfyoung
Copy link
Member

gfyoung commented Nov 26, 2019

Thanks @ebrassell !

gfyoung added a commit to forking-repos/pandas that referenced this pull request Nov 26, 2019

Verified

This commit was signed with the committer’s verified signature.
adevade Andréas Lundgren
Follow-up to:

pandas-dev#29811
jbrockmendel pushed a commit that referenced this pull request Nov 27, 2019
proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019
proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019
proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019
proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants