CARTE: Pretraining and Transfer for Tabular Learning

This repository contains the implementation of the paper CARTE: Pretraining and Transfer for Tabular Learning.

CARTE is a pretrained model for tabular data. It treats each table row as a star graph and trains a graph transformer on top of this representation.
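To make the star-graph idea concrete, here is a minimal sketch in plain Python. It is only an illustration, not the library's internal representation: the `StarGraph` class, the `row_to_star_graph` helper, and the example row are hypothetical, and skipping missing cells is an assumption of this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class StarGraph:
    """One table row: a central node linked to one leaf per non-missing cell."""
    center: str
    # Each edge is (column_name, cell_value): the column labels the edge,
    # the cell value is the leaf node.
    edges: list = field(default_factory=list)


def row_to_star_graph(row: dict, center: str = "row") -> StarGraph:
    """Build a star graph from a row given as {column_name: value}."""
    graph = StarGraph(center=center)
    for column, value in row.items():
        if value is None:  # assumption of this sketch: missing cells add no leaf
            continue
        graph.edges.append((column, value))
    return graph


# Hypothetical row, made up for illustration
example_row = {"name": "Chardonnay", "country": "France", "price": 12.5, "vintage": None}
graph = row_to_star_graph(example_row)
print(graph.center, graph.edges)
```

Every leaf connects only to the central node, which is what makes the graph a star: columns never link to each other directly.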

Colab Examples (give it a try):

CARTERegressor on Wine Poland dataset
CARTEClassifier on Spotify dataset

Other datasets are available for testing: datasets

Warning

This library is currently in a phase of active development. All features are subject to change without prior notice. If you are interested in collaborating, please feel free to reach out by opening an issue or starting a discussion.

01 Install 🚀

The library has been tested on Linux, macOS and Windows.

CARTE-AI can be installed from PyPI:

pip install carte-ai
pip install huggingface_hub

Post installation check

After a successful installation, you should be able to import the module without errors:

import carte_ai
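
If you prefer a check that reports missing packages instead of raising an ImportError, something like the following sketch may help. It uses only the standard library; `check_installed` is a hypothetical helper name, and the module list simply mirrors the imports used in this README.

```python
import importlib.util


def check_installed(module_names):
    """Return {module_name: bool} telling whether each top-level module can be located."""
    return {name: importlib.util.find_spec(name) is not None for name in module_names}


status = check_installed(["carte_ai", "huggingface_hub", "fasttext"])
for name, found in status.items():
    print(f"{name}: {'found' if found else 'MISSING'}")
```

This only confirms that the modules can be located; it does not import them or verify their versions.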

02 CARTE-AI example on sampled data step by step ➡️

1️⃣ Load the Data 💽

import pandas as pd
from carte_ai.data.load_data import wina_pl  # explicit import instead of a wildcard

num_train = 128  # Example: set the number of training groups/entities
random_state = 1  # Set a random seed for reproducibility
X_train, X_test, y_train, y_test = wina_pl(num_train, random_state)
print("Wina Poland dataset:", X_train.shape, X_test.shape)
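
The two parameters above control how many rows go to training and make the split repeatable. The sketch below shows that idea with the standard library only. It is not how `wina_pl` is implemented; `seeded_split` is a hypothetical helper and the rows are stand-in integers.

```python
import random


def seeded_split(rows, num_train, random_state):
    """Shuffle row indices with a fixed seed and keep the first num_train for training."""
    indices = list(range(len(rows)))
    random.Random(random_state).shuffle(indices)  # local RNG, global state untouched
    train_idx, test_idx = indices[:num_train], indices[num_train:]
    return [rows[i] for i in train_idx], [rows[i] for i in test_idx]


rows = list(range(1000))  # stand-in for table rows
train_a, test_a = seeded_split(rows, num_train=128, random_state=1)
train_b, _ = seeded_split(rows, num_train=128, random_state=1)

# Same seed, same split: the two training sets are identical
print(len(train_a), len(test_a), train_a == train_b)
```

Changing `random_state` gives a different but equally reproducible split, which is useful when averaging scores over several runs.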

2️⃣ Convert Table 2 Graph 🪵

The basic steps are:

preprocess the raw data
load the prepared data and configs, then set the train/test split
generate a graph for each table entry (row) using the Table2GraphTransformer
create an estimator and run inference

import fasttext
from huggingface_hub import hf_hub_download
from carte_ai import Table2GraphTransformer

model_path = hf_hub_download(repo_id="hi-paris/fastText", filename="cc.en.300.bin")

preprocessor = Table2GraphTransformer(fasttext_model_path=model_path)

# Fit and transform the training data
X_train = preprocessor.fit_transform(X_train, y=y_train)

# Transform the test data
X_test = preprocessor.transform(X_test)
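
The snippet above calls fit_transform on the training data and only transform on the test data. The toy example below sketches why that pattern matters, using a hypothetical `ToyStandardizer` that has nothing to do with the internals of Table2GraphTransformer: whatever is learned during fitting comes from the training rows alone and is then reused unchanged on the test rows.

```python
from statistics import mean, pstdev


class ToyStandardizer:
    """Toy fit/transform object: learns statistics in fit, reuses them in transform."""

    def fit(self, values):
        self.mean_ = mean(values)
        self.std_ = pstdev(values) or 1.0  # fall back to 1.0 for constant columns
        return self

    def transform(self, values):
        return [(v - self.mean_) / self.std_ for v in values]

    def fit_transform(self, values):
        return self.fit(values).transform(values)


# Made-up numbers for illustration
train_prices = [10.0, 20.0, 30.0]
test_prices = [20.0, 40.0]

scaler = ToyStandardizer()
train_scaled = scaler.fit_transform(train_prices)  # statistics come from training data only
test_scaled = scaler.transform(test_prices)        # test data reuses them, nothing is re-fitted
print(train_scaled, test_scaled)
```

Fitting on the test data as well would leak information about it into the preprocessing step and make the evaluation optimistic.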

3️⃣ Make Predictions🔮

For learning, CARTE currently exposes the scikit-learn interface (fit/predict). The process is:

Define parameters
Set the estimator
Run 'fit' to train the model and 'predict' to make predictions

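from sklearn.metrics import r2_score
from carte_ai import CARTERegressor, CARTEClassifier
# config_directory is used below but was not imported in the original snippet.
# The import path here is an assumption based on the package layout; adjust it if needed.
from carte_ai.configs.directory import config_directory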

# Define some parameters
fixed_params = dict()
fixed_params["num_model"] = 10 # 10 models for the bagging strategy
fixed_params["disable_pbar"] = False # set to True to hide the progress bar
fixed_params["random_state"] = 0
fixed_params["device"] = "cpu"
fixed_params["n_jobs"] = 10
fixed_params["pretrained_model_path"] = config_directory["pretrained_model"]


# Define the estimator and run fit/predict

estimator = CARTERegressor(**fixed_params) # CARTERegressor for Regression
estimator.fit(X=X_train, y=y_train)
y_pred = estimator.predict(X_test)

# Obtain the r2 score on predictions

score = r2_score(y_test, y_pred)
print(f"\nThe R2 score for CARTE: {score:.4f}")
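
For readers unfamiliar with the two ideas used above, here is a small standard-library sketch of the R2 formula and of combining several models by averaging. The numbers are toy values, not results, and combining predictions with a plain mean is an assumption of this sketch that may differ from what CARTE does internally with `num_model`. Both helper functions are hypothetical.

```python
def average_predictions(per_model_predictions):
    """Average predictions sample by sample across models (one inner list per model)."""
    n_models = len(per_model_predictions)
    return [sum(sample) / n_models for sample in zip(*per_model_predictions)]


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (y_true must not be constant)."""
    y_mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy targets and two toy "models" whose errors point in opposite directions
y_true = [1.0, 2.0, 3.0, 4.0]
model_outputs = [
    [1.2, 1.8, 3.2, 3.8],
    [0.8, 2.2, 2.8, 4.2],
]
ensemble = average_predictions(model_outputs)
print(f"R2 of one model: {r2(y_true, model_outputs[0]):.4f}")
print(f"R2 of the mean:  {r2(y_true, ensemble):.4f}")
```

In this contrived case the individual errors cancel exactly, so the averaged prediction scores higher than either model alone. Real ensembles usually gain less, but the mechanism is the same.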

03 Reproducing paper results ⚙️

➡️ See the installation instructions for the paper setup

04 Contribute to the package 🚀

➡️ Read the contribution guidelines

05 Star History ⭐️

06 CARTE-AI references 📚

@article{kim2024carte,
  title={CARTE: pretraining and transfer for tabular learning},
  author={Kim, Myung Jun and Grinsztajn, L{\'e}o and Varoquaux, Ga{\"e}l},
  journal={arXiv preprint arXiv:2402.16785},
  year={2024}
}
