Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature/SetColumnDtype #53

Merged

Conversation

davidhopkinson26
Copy link
Contributor

added SetColumnDtype transformer, test module and example

Transformer to set column dtypes

@davidhopkinson26 davidhopkinson26 changed the title added SetColumnDtype transformer, test module and example feature/SetColumnDtype Dec 13, 2022

def __init__(self, columns, dtype):

super().__init__(columns, copy=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs a string input check

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dtype can be a string or a dtype object. Have added clarity to docstring in 36a9dde


return X

def __validate_dtype(self, dtype: str):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my opinion it would be be nicer to do this validation without having to use a try/except clause - I think that the errors from pd.api.types.pandas_dtype are maybe expressive enough. Not sure that having the freedom to add self.classname is worth adding a try except. (feel free to ignore this one as it's quite picky)

image

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The self.classname in the error can be really handy when you have dozens of steps in a pipeline. Always nice to know where things are breaking. I agree it is slightly overwrought but since it has been done now I'm inclined to leave it in.

from tubular.misc import SetColumnDtype


class TestSetColumnDtype(object):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests are not in the usual TestInit, TestTransform etc format

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reorganised in cf59b9c

@@ -0,0 +1,447 @@
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that this could do with more markdown - not that clear what is going on in the examples

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added in 2874ad5

@davidhopkinson26 davidhopkinson26 merged commit 68af2a8 into azukds:develop Jan 17, 2023
@davidhopkinson26 davidhopkinson26 deleted the feature/setDtypeTransformer branch January 17, 2023 12:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants