levels!(): enhance performance #360

Merged
merged 2 commits into from
Aug 19, 2021

Conversation

alyst
Contributor

@alyst alyst commented Jul 10, 2021

Enhance the performance of levels!() by avoiding redundant checks (duplicate new levels, level intersection, recoding, etc.).
These checks become expensive as the number of levels grows.
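
As a rough illustration of the idea (not the code from this PR), the checks can be folded into a single pass that runs before anything is mutated, so a failure leaves the old state untouched. The function name `plan_relevel` and its arguments are hypothetical, and the sketch uses only Base Julia rather than the CategoricalArrays internals:

```julia
# Hypothetical sketch, not the implementation in this PR.
# oldlevels: current levels; newlevels: requested levels;
# used: the levels that actually occur in the data.
function plan_relevel(oldlevels::Vector{T}, newlevels::Vector{T}, used::Set{T}) where T
    # One pass over newlevels builds the lookup and detects duplicates.
    newindex = Dict{T,Int}()
    for (i, lev) in enumerate(newlevels)
        haskey(newindex, lev) && throw(ArgumentError("duplicate level: $(repr(lev))"))
        newindex[lev] = i
    end
    # Every level present in the data must survive the change.
    for lev in used
        haskey(newindex, lev) || throw(ArgumentError("cannot drop used level: $(repr(lev))"))
    end
    # Recode table: old index -> new index (0 marks a dropped, unused level).
    return [get(newindex, lev, 0) for lev in oldlevels]
end

recode = plan_relevel(["b", "a"], ["e", "a", "b"], Set(["a", "b"]))
```

Because the function only throws or returns a recode table, the caller can apply the table afterwards and never needs to restore anything when validation fails.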

@bkamins
Member

bkamins commented Jul 10, 2021

@alyst - please bear with us if the review of this PR is delayed a bit. If you do not get any feedback in 10 days, please bump and I will review (but @nalimilan is a better person to do this, and I am sure he will eventually comment on it). Thank you!

@alyst
Contributor Author

alyst commented Aug 13, 2021

gentle bump

Member

@nalimilan nalimilan left a comment

Sorry for the delay!

Comment on lines 219 to 225
# check that x is restored correctly when dropping levels is not allowed
@test_throws ArgumentError levels!(x, ["e", "c"])
@test x == ["c", "b", "b"]
@test levels(x) == ["e", "a", "b", "c"]

@test levels!(x, ["e", "c"], allowmissing=true) === x
@test levels(x) == ["e", "c"]
@test x[1] === CategoricalValue(x.pool, 2)
@test x[2] === missing
@test x[3] === missing
@test levels(x) == ["e", "c"]
Member

Also apply these changes to 11_array.jl.

Contributor Author

done. I've addressed your latest review (thanks!) and squashed the commits.

test/11_array.jl Outdated
Comment on lines 190 to 200
@test_throws ArgumentError levels!(x, ["a"])
# check that x is restored correctly when dropping levels is not allowed
@test x == ["a", "b", "b"]
@test levels(x) == ["b", "a"]

@test_throws ArgumentError levels!(x, ["e", "b"])

@test_throws ArgumentError levels!(x, ["e", "a", "b", "a"])
# once again check that x is restored correctly when dropping levels is not allowed
@test x == ["a", "b", "b"]
@test levels(x) == ["b", "a"]
Member

Sorry to be a pain, but there are equivalent lines in 12_missingarray.jl: can you keep them in sync? :-)

Contributor Author

No problem :) Done

test levels!() exceptions handling and rollback
@nalimilan nalimilan merged commit d9402f5 into JuliaData:master Aug 19, 2021