Commit

Revert "v0.3.3 changes into main (#64)"
This reverts commit 0098637.
davidhopkinson26 authored Mar 15, 2023
1 parent 0098637 commit cf158de
Showing 61 changed files with 366 additions and 7,886 deletions.
34 changes: 0 additions & 34 deletions CHANGELOG.rst
@@ -16,40 +16,6 @@ Subsections for each version can be one of the following;

Each individual change should have a link to the pull request after the description of the change.

0.3.3 (2023-01-19)
------------------

Added
^^^^^
- added support for prior mean encoding (regularised encodings) `#46 <https://github.com/lvgig/tubular/pull/46>`_

- added support for weights to mean, median and mode imputers `#47 <https://github.com/lvgig/tubular/pull/47>`_

- added classname() method to BaseTransformer and prefixed all errors with classname call for easier debugging `#48 <https://github.com/lvgig/tubular/pull/48>`_

- added DatetimeInfoExtractor transformer in ``tubular/dates.py``associated tests with ``tests/dates/test_DatetimeInfoExtractor.py`` and examples with ``examples/dates/DatetimeInfoExtractor.ipynb`` `#49 <https://github.com/lvgig/tubular/pull/49>`_

- added DatetimeSinusoidCalculator in ``tubular/dates.py``associated tests with ``tests/dates/test_DatetimeSinusoidCalculator.py`` and examples with ``examples/dates/DatetimeSinusoidCalculator.ipynb`` `#50 <https://github.com/lvgig/tubular/pull/50>`_

- added TwoColumnOperatorTransformer in ``tubular/numeric.py``associated tests with ``tests/numeric/test_TwoColumnOperatorTransformer.py`` and examples with ``examples/dates/TwoColumnOperatorTransformer.ipynb`` `#51 <https://github.com/lvgig/tubular/pull/51>`_

- added StringConcatenator in ``tubular/strings.py``associated tests with ``tests/strings/test_StringConcatenator.py`` and examples with ``examples/strings/StringConcatenator.ipynb`` `#52 <https://github.com/lvgig/tubular/pull/52>`_

- added SetColumnDtype in ``tubular/misc.py``associated tests with ``tests/misc/test_StringConcatenator.py`` and examples with ``examples/strings/StringConcatenator.ipynb`` `#53 <https://github.com/lvgig/tubular/pull/53>`_

- added waring to MappingTransformer in ``tubular/mapping.py`` for unexpected changes in dtype `#54 <https://github.com/lvgig/tubular/pull/54>`_

- added new module ``tubular/comparison.py`` containing EqualityChecker. Also added associated tests with ``tests/comparison/test_EqualityChecker.py`` and examples with ``examples/comparison/EqualityChecker.ipynb`` `#55 <https://github.com/lvgig/tubular/pull/55>`_

- added PCATransformer in ``tubular/numeric.py``associated tests with ``tests/misc/test_PCATransformer.py`` and examples with ``examples/numeric/PCATransformer.ipynb`` `#57 <https://github.com/lvgig/tubular/pull/57>`_

Fixed
^^^^^
- updated black version to 22.3.0 and flake8 version to 5.0.4 to fix compatibility issues `#45 <https://github.com/lvgig/tubular/pull/45>`_

- removed **kwargs argument from BaseTransfomer in ``tubular/base.py``to avoid silent erroring if incorrect arguments passed to transformers. Fixed a few tests which were revealed to have incorrect arguments passed by change `#56 <https://github.com/lvgig/tubular/pull/56>`_

0.3.2 (2022-01-13)
------------------

2 changes: 1 addition & 1 deletion CONTRIBUTING.rst
@@ -43,7 +43,7 @@ General
^^^^^^^

- Please try and keep each pull request to one change or feature only
- Make sure to update the `changelog <https://github.com/lvgig/tubular/blob/main/CHANGELOG.rst>`_ with details of your change
- Make sure to update the `changelog <https://github.com/lvgig/test-aide/blob/main/CHANGELOG.rst>`_ with details of your change

Code formatting
^^^^^^^^^^^^^^^
16 changes: 1 addition & 15 deletions docs/source/api.rst
@@ -20,14 +20,6 @@ capping module

capping.CappingTransformer
capping.OutOfRangeNullTransformer

comparison module
------------------

.. autosummary::
:toctree: api/

comparison.EqualityChecker

dates module
------------------
@@ -40,8 +32,6 @@ dates module
dates.DateDiffLeapYearTransformer
dates.SeriesDtMethodTransformer
dates.ToDatetimeTransformer
dates.DatetimeInfoExtractor
dates.DatetimeSinusoidCalculator

imputers module
------------------
@@ -77,7 +67,6 @@ misc module
:toctree: api/

misc.SetValueTransformer
misc.SetColumnDtype

nominal module
------------------
@@ -99,11 +88,9 @@ numeric module
:toctree: api/

numeric.LogTransformer
numeric.CutTransformer
numeric.TwoColumnOperatorTransformer
numeric.CutTransformer
numeric.ScalingTransformer
numeric.InteractionTransformer
numeric.PCATransformer

strings module
------------------
@@ -112,4 +99,3 @@ strings module
:toctree: api/

strings.SeriesStrMethodTransformer
strings.StringConcatenator
38 changes: 11 additions & 27 deletions docs/source/quick-start.rst
@@ -1,11 +1,10 @@
Quick Start
====================
|logo|

Welcome to the quick start guide for tubular!
Welcome to the quick start guide for |logo| !

.. |logo| image:: ../../logo.png
:height: 200px
:height: 50px

Installation
--------------------
@@ -16,6 +15,7 @@ The easiest way to get ``tubular`` is to install directly from ``pypi``;
pip install tubular
.. important::

Thanks for installing tubular! We hope you find it useful!

@@ -54,24 +54,20 @@ The standard `OutOfRangeNullTransformer <https://tubular.readthedocs.io/en/lates
Dates
^^^^^

This module contains transformers to deal with datetime columns.
This module contains transformers to deal with date columns.

Date differencing is available - accounting for leap years `DateDiffLeapYearTransformer <https://tubular.readthedocs.io/en/latest/api/tubular.dates.DateDiffLeapYearTransformer.html>`_ or not `DateDifferenceTransformer <https://tubular.readthedocs.io/en/latest/api/tubular.dates.DateDifferenceTransformer.html>`_.
Date differencing is available - accounting for leap years (`DateDiffLeapYearTransformer <https://tubular.readthedocs.io/en/latest/api/tubular.dates.DateDiffLeapYearTransformer.html>`_) or not (`DateDifferenceTransformer <https://tubular.readthedocs.io/en/latest/api/tubular.dates.DateDifferenceTransformer.html>`_).

The `BetweenDatesTransformer <https://tubular.readthedocs.io/en/latest/api/tubular.dates.BetweenDatesTransformer.html>`_ calculates if one date falls between two others.

The `ToDatetimeTransformer <https://tubular.readthedocs.io/en/latest/api/tubular.dates.ToDatetimeTransformer.html>`_ converts columns to datetime type.

The `SeriesDtMethodTransformer <https://tubular.readthedocs.io/en/latest/api/tubular.dates.SeriesDtMethodTransformer.html>`_ allows the user to use `pandas.Series.dt <https://pandas.pydata.org/docs/reference/api/pandas.Series.dt.html>`_ methods in a similar way to `base.DataFrameMethodTransformer <https://tubular.readthedocs.io/en/latest/api/tubular.base.DataFrameMethodTransformer.html>`_.

The `DatetimeInfoExtractor <https://tubular.readthedocs.io/en/latest/api/tubular.dates.DatetimeInfoExtractor.html>`_ allows the user to extract datetime info such as the time of day or month from a datetime field.

The `DatetimeSinusoidCalculator <https://tubular.readthedocs.io/en/latest/api/tubular.dates.DatetimeSinusoidCalculator.html>`_ derives a feature in a dataframe by calculating the sine or cosine of a datetime column.
Finally the `SeriesDtMethodTransformer <https://tubular.readthedocs.io/en/latest/api/tubular.dates.SeriesDtMethodTransformer.html>`_ allows the user to use `pandas.Series.dt <https://pandas.pydata.org/docs/reference/api/pandas.Series.dt.html>`_ methods in a similar way to `base.DataFrameMethodTransformer <https://tubular.readthedocs.io/en/latest/api/tubular.base.DataFrameMethodTransformer.html>`_.

Imputers
^^^^^^^^

This module contains standard imputation techniques - mean, median mode as well as `NearestMeanResponseImputer <https://tubular.readthedocs.io/en/feature-version_0_3_0/api/tubular.imputers.NearestMeanResponseImputer.html>`_ which imputes with the value which is closest to the ``null`` values in terms of average response. All of these support weights.
This module contains standard imputation techniques - mean, median mode as well as `NearestMeanResponseImputer <https://tubular.readthedocs.io/en/feature-version_0_3_0/api/tubular.imputers.NearestMeanResponseImputer.html>`_ which imputes with the value which is closest to the ``null`` values in terms of average response.

The `NullIndicator <https://tubular.readthedocs.io/en/feature-version_0_3_0/api/tubular.imputers.NullIndicator.html>`_ is used to create binary indicators of where ``null`` values are present in a column.

@@ -87,36 +83,24 @@ The `CrossColumnMappingTransformer <https://tubular.readthedocs.io/en/latest/api
Misc
^^^^

The misc module contains transformers which do not fit into other categories.

`SetValueTransformer <https://tubular.readthedocs.io/en/latest/api/tubular.misc.SetValueTransformer.html>`_ creates a constant column with arbitrary value.

`SetDtype <https://tubular.readthedocs.io/en/latest/api/tubular.misc.SetDtype.html>`_ allows the user to set the dtype of a column.
The misc module currently contains only one transformer, `SetValueTransformer <https://tubular.readthedocs.io/en/latest/api/tubular.misc.SetValueTransformer.html>`_, which creates a constant column with arbitrary value.

Nominal
^^^^^^^

This module contains categorical encoding techniques.

There are respone encoding techniques such as `MeanResponseTransformer <https://tubular.readthedocs.io/en/latest/api/tubular.nominal.MeanResponseTransformer.html>`_, one hot encoding `OneHotEncodingTransformer <https://tubular.readthedocs.io/en/latest/api/tubular.nominal.OneHotEncodingTransformer.html>`_ and grouping of infrequently occuring levels `GroupRareLevelsTransformer <https://tubular.readthedocs.io/en/latest/api/tubular.nominal.GroupRareLevelsTransformer.html>`_.

`MeanResponseTransformer <https://tubular.readthedocs.io/en/latest/api/tubular.nominal.MeanResponseTransformer.html>`_ also supports regularisation of encodings using a prior.
There are respone encoding techniques such as `MeanResponseTransformer <https://tubular.readthedocs.io/en/latest/api/tubular.nominal.MeanResponseTransformer.html>`_, one hot encoding (`OneHotEncodingTransformer <https://tubular.readthedocs.io/en/latest/api/tubular.nominal.OneHotEncodingTransformer.html>`_) and grouping of infrequently occuring levels (`GroupRareLevelsTransformer <https://tubular.readthedocs.io/en/latest/api/tubular.nominal.GroupRareLevelsTransformer.html>`_).

Numeric
^^^^^^^

This module contains numeric transformations - cut `CutTransformer <https://tubular.readthedocs.io/en/latest/api/tubular.numeric.CutTransformer.html>`_, log `LogTransformer <https://tubular.readthedocs.io/en/latest/api/tubular.numeric.LogTransformer.html>`_, and scaling `ScalingTransformer <https://tubular.readthedocs.io/en/latest/api/tubular.numeric.ScalingTransformer.html>`_.

`TwoColumnOperatorTransformer <https://tubular.readthedocs.io/en/latest/api/tubular.numeric.TwoColumnOperatorTransformer.html>`_ allows a user to apply operations to two colmns using methods from `pandas.DataFrame method <https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html>`_ which require a multiple columns (e.g. add, subtract, multiply etc

It also contains `InteractionTransformer <https://tubular.readthedocs.io/en/latest/api/tubular.numeric.InteractionTransformer.html>`_ and `PCATransformer <https://tubular.readthedocs.io/en/latest/api/tubular.numeric.PCATransformer.html>`_ which create interaction terms and pca components.
This module contains numeric transformations - cut (`CutTransformer <https://tubular.readthedocs.io/en/latest/api/tubular.numeric.CutTransformer.html>`_), log (`LogTransformer <https://tubular.readthedocs.io/en/latest/api/tubular.numeric.LogTransformer.html>`_) and scaling (`ScalingTransformer <https://tubular.readthedocs.io/en/latest/api/tubular.numeric.ScalingTransformer.html>`_).

Strings
^^^^^^^

The strings module contains useful transformers for working with strings. `SeriesStrMethodTransformer <https://tubular.readthedocs.io/en/latest/api/tubular.strings.SeriesStrMethodTransformer.html>`_, allows the user to access `pandas.Series.str <https://pandas.pydata.org/docs/reference/api/pandas.Series.str.html>`_ methods within ``tubular``. `StringConcatenator <https://tubular.readthedocs.io/en/latest/api/tubular.strings.StringConcatenator.html>`_ allows a user to concatenate multiple columns together of varied dtype into a string output.


The strings module contains a single transformer, `SeriesStrMethodTransformer <https://tubular.readthedocs.io/en/latest/api/tubular.strings.SeriesStrMethodTransformer.html>`_, that allows the user to access `pandas.Series.str <https://pandas.pydata.org/docs/reference/api/pandas.Series.str.html>`_ methods within ``tubular``.

Reporting an issue
---------------------------------

