Functional PLS for dimensionality reduction and regression #548

Merged: 54 commits (Nov 7, 2023)
Commits
92ca069
Merge remote-tracking branch 'origin/feature/speedUpPenalization' int…
Ddelval May 7, 2023
245bb3a
Initial PLS implementation
Ddelval May 7, 2023
96a12a8
Fix some style issues
Ddelval May 7, 2023
548594c
Fix test execution
Ddelval May 7, 2023
0891dcf
Rename test file
Ddelval May 8, 2023
eed4be0
Add pls tests
Ddelval Jun 17, 2023
d6569c2
Change default weights
Ddelval Jun 17, 2023
73e6689
Refactor PLS to improve typing
Ddelval Jun 17, 2023
49748fc
Fix style issues in _fpls
Ddelval Jun 17, 2023
4ceb414
Check transform methods in test
Ddelval Jun 21, 2023
b6280fc
Cleanup fpls implementation
Ddelval Jun 21, 2023
d16299c
Fix typing and linting errors
Ddelval Jun 22, 2023
a1773e2
Use mean method of FData
Ddelval Jun 22, 2023
dea6013
Redesign fpls blocks with generics
Ddelval Jun 23, 2023
0fe647f
Move centering operations into FPLSBlock
Ddelval Jun 23, 2023
2430376
Simplify FPLS regression implementation
Ddelval Jun 23, 2023
5481f63
Add individual transform methods
Ddelval Jun 23, 2023
238d622
Fix typing and style issues
Ddelval Jun 23, 2023
906c2bd
Merge branch 'develop' into feature/pls
Ddelval Jun 23, 2023
beeb58c
Fix FPLSRegression test
Ddelval Jun 23, 2023
0dab123
Merge branch 'develop' into feature/pls
Ddelval Jun 24, 2023
95c2eab
Merge remote-tracking branch 'origin/develop' into feature/pls
Ddelval Jun 24, 2023
e171549
Ignore linter warning in init
Ddelval Jun 24, 2023
d3e4178
Specify ignore type
Ddelval Jun 24, 2023
ec5471c
Test different mypy command
Ddelval Jun 24, 2023
b0d8212
Revert "Test different mypy command"
Ddelval Jun 24, 2023
6ebdb92
Merge branch 'develop' into feature/pls
Ddelval Jun 25, 2023
9c35a8c
Add an epsilon to normalization calculations
Ddelval Jun 26, 2023
0d96714
Fix bugs
Ddelval Jun 26, 2023
1080dcf
Tidy fpls implementation
Ddelval Jun 26, 2023
6b61c17
Complete fpls documentation
Ddelval Jun 27, 2023
b24f677
Tidy fpls implementation
Ddelval Jun 30, 2023
80879ed
Add doctest to PLS classes
Ddelval Jul 31, 2023
7cc738c
Merge remote-tracking branch 'origin/develop' into feature/pls
Ddelval Jul 31, 2023
28d48b2
In the new version of sklearn the coefficients are transposed
Ddelval Jul 31, 2023
7f0c84d
Fix documentation generation for PLS
Ddelval Aug 1, 2023
24fdc4c
Move NIPALS into the FPLS class.
Ddelval Oct 28, 2023
fb366d3
FPLSBlock subclasses for each representation
Ddelval Oct 28, 2023
ec1890f
Merge remote-tracking branch 'origin/develop' into feature/pls
Ddelval Oct 28, 2023
5c63ea7
Improve documentation
Ddelval Oct 28, 2023
04731f2
Ignore WPS413: Found bad magic module function
Ddelval Oct 28, 2023
508709b
Address review comments
Ddelval Oct 28, 2023
9341220
Improve the behaviour when the number of components is not adequate.
Ddelval Oct 28, 2023
aa20749
Improve tolerance definition.
Ddelval Oct 28, 2023
9387a7c
Add warning when convergence is reached
Ddelval Oct 28, 2023
9c032cc
Sort imports
Ddelval Oct 28, 2023
a7783e0
Check that the warning is being generated in the test.
Ddelval Oct 28, 2023
c0fdc20
Fix style error
Ddelval Oct 28, 2023
4c89c5a
Remove unnecessary noqa
Ddelval Nov 1, 2023
65eb007
Create new FData instead of copying
Ddelval Nov 1, 2023
74b3ea2
Extract the maximum number of components by default
Ddelval Nov 1, 2023
ab7f91c
Fix deprecation warning
Ddelval Nov 1, 2023
e7c79af
Use all components in regression as well
Ddelval Nov 1, 2023
e0c4ab5
Merge branch 'develop' into feature/pls
vnmabus Nov 7, 2023
15 changes: 14 additions & 1 deletion docs/modules/ml/regression.rst
@@ -57,4 +57,17 @@ regression is fitted using the coefficients of the functions in said basis.
.. autosummary::
    :toctree: autosummary

    skfda.ml.regression.FPCARegression
    skfda.ml.regression.FPCARegression

FPLS regression
-----------------
This module includes the implementation of FPLS (Functional Partial Least Squares)
regression. This implementation accepts either functional or multivariate data as the regressor and the response.
FPLS regression consists in performing the FPLS dimensionality reduction algorithm
but using a regression deflation strategy.


.. autosummary::
    :toctree: autosummary

    skfda.ml.regression.FPLSRegression
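
For context (not part of the diff), a minimal usage sketch of the regressor documented above, mirroring the Tecator doctest later in this PR; the train/test split and the score call are assumptions about the usual sklearn-style API rather than anything shown in the diff:

from sklearn.model_selection import train_test_split

from skfda.datasets import fetch_tecator
from skfda.ml.regression import FPLSRegression

# Functional spectra form the X block, scalar responses the Y block.
X, y = fetch_tecator(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Keep two latent components, as in the class doctest.
fpls = FPLSRegression(n_components=2)
fpls.fit(X_train, y_train)
y_pred = fpls.predict(X_test)    # multivariate response, returned as a NumPy array
r2 = fpls.score(X_test, y_test)  # assumed sklearn-style R^2 score from the regressor mixin
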
7 changes: 5 additions & 2 deletions docs/modules/preprocessing/dim_reduction.rst
@@ -36,9 +36,12 @@ Other dimensionality reduction methods construct new features from
existing ones. For example, in functional principal component
analysis, we project the data samples into a smaller sample of
functions that preserve most of the original
variance.
variance. Similarly, in functional partial least squares, we project
the data samples into a smaller sample of functions that preserve most
of the covariance between the two data blocks.

.. autosummary::
    :toctree: autosummary

    skfda.preprocessing.dim_reduction.FPCA
    skfda.preprocessing.dim_reduction.FPCA
    skfda.preprocessing.dim_reduction.FPLS
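
For context (not part of the diff), a sketch of the corresponding dimensionality-reduction use of FPLS; fit and transform_x are the methods this PR itself relies on in FPLSRegression.predict, while the exact shape of the returned scores is an assumption:

from skfda.datasets import fetch_tecator
from skfda.preprocessing.dim_reduction import FPLS

X, y = fetch_tecator(return_X_y=True)

# Project the functional X block onto latent components chosen to
# preserve covariance with the Y block (here, 3 components).
fpls = FPLS(n_components=3)
fpls.fit(X, y)
x_scores = fpls.transform_x(X)  # assumed: array of shape (n_samples, 3)
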
2 changes: 2 additions & 0 deletions setup.cfg
@@ -84,6 +84,8 @@ ignore =
    WPS507,
    # Comparison with not is not the same as with equality
    WPS520,
    # Found bad magic module function: {0}
    WPS413

per-file-ignores =
    __init__.py:
2 changes: 2 additions & 0 deletions skfda/ml/regression/__init__.py
@@ -10,6 +10,7 @@
"_kernel_regression": ["KernelRegression"],
"_linear_regression": ["LinearRegression"],
"_fpca_regression": ["FPCARegression"],
"_fpls_regression": ["FPLSRegression"],
"_neighbors_regression": [
"KNeighborsRegressor",
"RadiusNeighborsRegressor",
@@ -19,6 +20,7 @@

if TYPE_CHECKING:
    from ._fpca_regression import FPCARegression
    from ._fpls_regression import FPLSRegression as FPLSRegression
    from ._historical_linear_model import (
        HistoricalLinearRegression as HistoricalLinearRegression,
    )
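
For context (not part of the diff), the point of pairing the lazy-loader entry with the TYPE_CHECKING import is that the same name resolves lazily at runtime and statically for type checkers; a small check of the runtime side, under the assumption that skfda keeps its usual lazy-import setup:

import skfda.ml.regression as regression

# The attribute is materialized on first access via the lazy loader...
from skfda.ml.regression import FPLSRegression

# ...and both access paths name the same class.
assert regression.FPLSRegression is FPLSRegression
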
118 changes: 118 additions & 0 deletions skfda/ml/regression/_fpls_regression.py
@@ -0,0 +1,118 @@
from __future__ import annotations

from typing import Any, TypeVar, Union

from sklearn.utils.validation import check_is_fitted

from ..._utils._sklearn_adapter import BaseEstimator, RegressorMixin
from ...misc.regularization import L2Regularization
from ...preprocessing.dim_reduction import FPLS
from ...representation import FDataGrid
from ...representation.basis import Basis, FDataBasis
from ...typing._numpy import NDArrayFloat

InputType = TypeVar(
    "InputType",
    bound=Union[FDataGrid, FDataBasis, NDArrayFloat],
)

OutputType = TypeVar(
    "OutputType",
    bound=Union[FDataGrid, FDataBasis, NDArrayFloat],
)


class FPLSRegression(
    BaseEstimator,
    RegressorMixin[InputType, OutputType],
):
    r"""
    Regression using Functional Partial Least Squares.

    Parameters:
        n_components: Number of components to keep.
            By default all available components are utilized.
        regularization_X: Regularization for the calculation of the X weights.
        weight_basis_X: Basis to use for the X block. Only
            applicable if X is a FDataBasis. Otherwise it must be None.
        weight_basis_Y: Basis to use for the Y block. Only
            applicable if Y is a FDataBasis. Otherwise it must be None.

    Attributes:
        coef\_: Coefficients of the linear model.
        fpls\_: FPLS object used to fit the model.

    Examples:
        Fit a FPLS regression model with two components.

        >>> from skfda.ml.regression import FPLSRegression
        >>> from skfda.datasets import fetch_tecator
        >>> from skfda.representation import FDataGrid
        >>> from skfda.typing._numpy import NDArrayFloat

        >>> X, y = fetch_tecator(return_X_y=True)
        >>> fpls = FPLSRegression[FDataGrid, NDArrayFloat](n_components=2)
        >>> fpls = fpls.fit(X, y)

    """

    def __init__(
        self,
        n_components: int | None = None,
        regularization_X: L2Regularization[Any] | None = None,
        weight_basis_X: Basis | None = None,
        weight_basis_Y: Basis | None = None,
        _integration_weights_X: NDArrayFloat | None = None,
        _integration_weights_Y: NDArrayFloat | None = None,
    ) -> None:
        self.n_components = n_components
        self._integration_weights_X = _integration_weights_X
        self._integration_weights_Y = _integration_weights_Y
        self.regularization_X = regularization_X
        self.weight_basis_X = weight_basis_X
        self.weight_basis_Y = weight_basis_Y

    def fit(
        self,
        X: InputType,
        y: OutputType,
    ) -> FPLSRegression[InputType, OutputType]:
        """
        Fit the model using the data for both blocks.

        Args:
            X: Data of the X block
            y: Data of the Y block

        Returns:
            self
        """
        self.fpls_ = FPLS[InputType, OutputType](
            n_components=self.n_components,
            regularization_X=self.regularization_X,
            component_basis_X=self.weight_basis_X,
            component_basis_Y=self.weight_basis_Y,
            _integration_weights_X=self._integration_weights_X,
            _integration_weights_Y=self._integration_weights_Y,
            _deflation_mode="reg",
        )

        self.fpls_.fit(X, y)

        self.coef_ = (
            self.fpls_.x_rotations_matrix_
            @ self.fpls_.y_loadings_matrix_.T
        )
        return self

    def predict(self, X: InputType) -> OutputType:
        """Predict using the model.

        Args:
            X: Data to predict.

        Returns:
            Predicted values.
        """
        check_is_fitted(self)
        return self.fpls_.inverse_transform_y(self.fpls_.transform_x(X))
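
For context (not part of the diff), predict above is just the composition of the fitted FPLS projections, so the following sketch should give matching results; it assumes the Tecator response stays a NumPy array so the two outputs can be compared directly:

import numpy as np

from skfda.datasets import fetch_tecator
from skfda.ml.regression import FPLSRegression

X, y = fetch_tecator(return_X_y=True)
model = FPLSRegression(n_components=2).fit(X, y)

# Route 1: the public API.
y_hat = model.predict(X)

# Route 2: the same composition done by hand on the fitted FPLS object.
y_hat_manual = model.fpls_.inverse_transform_y(model.fpls_.transform_x(X))

np.testing.assert_allclose(y_hat, y_hat_manual)
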
8 changes: 6 additions & 2 deletions skfda/preprocessing/dim_reduction/__init__.py
@@ -13,13 +13,17 @@
    ],
    submod_attrs={
        "_fpca": ["FPCA"],
        "_neighbor_transforms": ["KNeighborsTransformer"]
        "_fpls": ["FPLS"],
        "_neighbor_transforms": ["KNeighborsTransformer"],
    },
)

if TYPE_CHECKING:
    from ._fpca import FPCA as FPCA
    from ._neighbor_transforms import KNeighborsTransformer as KNeighborsTransformer
    from ._fpls import FPLS as FPLS
    from ._neighbor_transforms import (
        KNeighborsTransformer as KNeighborsTransformer,
    )


def __getattr__(name: str) -> Any: