Skip to content

tubo213/nsw-with-optimal-transport

Repository files navigation

Fast solution to the fair ranking problem using the Sinkhorn algorithm

This repository contains the code used for the experiments in "Fast solution to the fair ranking problem using the Sinkhorn algorithm" (PRICAI 2024)

Abstract

In two-sided marketplaces such as online flea markets, recommender systems for providing consumers with personalized item rankings play a key role in promoting transactions between providers and consumers. Meanwhile, two-sided marketplaces face the problem of balancing consumer satisfaction and fairness among items to stimulate activity of item providers. Saito and Joachims (2022) devised an impact-based fair ranking method for maximizing the Nash social welfare based on fair division; however, this method, which requires solving a large-scale constrained nonlinear optimization problem, is very difficult to apply to practical-scale recommender systems. We thus propose a fast solution to the impact-based fair ranking problem. We first transform the fair ranking problem into an unconstrained optimization problem and then design a gradient ascent method that repeatedly executes the Sinkhorn algorithm. Experimental results demonstrate that our algorithm provides fair rankings of high quality and is about 1000 times faster than application of commercial optimization software.

Build Environment

1. install rye

install documentation

MacOS

curl -sSf https://rye-up.com/get | bash
echo 'source "$HOME/.rye/env"' >> ~/.zshrc
source ~/.zshrc

Linux

curl -sSf https://rye-up.com/get | bash
echo 'source "$HOME/.rye/env"' >> ~/.bashrc
source ~/.bashrc

Windows
see install documentation

Set the dependency resolution method to uv

rye config --set-bool behavior.use-uv=true

2. Create virtual environment

First, Please match the index url to the environment at hand to use torch.

# pyproject.toml
dependencies = [
    ...,
    "torch == 2.3.0+cu121", # change this to match your environment
]

[[tool.rye.sources]]
name = "torch"
url = "https://download.pytorch.org/whl/cu121" # change this to match your environment
type = "index"

Second, create virtual environment.

rye sync

3. Activate virtual environment

. .venv/bin/activate

Datasets

(This explanation is based on the repository of the kdd2022-fair-ranking-nsw by yuta-saito)

We use "Delicious" and "Wiki10-31K" from The Extreme Classification Repository. Please install the above datasets from the repository and put them under the ./data/ as follows.

root/
├── data
│   ├── delicious.txt
│   └── wiki.txt

Note that we rename the files as follows:

  • Delicious_data.txt in Delicious.zip to delicious.txt
  • train.txt in Wiki10.bow.zip to wiki.txt , respectively.

Basic Usage

from src import create_generator, create_optimizer, evaluate_pi

g = create_generator(
    generator_name="synthetic",
    # specific generator parameters
    n_query=250,
    n_doc=1600,
    lam=1.0,
)
rel_mat_true, rel_mat_obs = g.generate_rel_mat()  # relevance matrix
expo = g.exam_func(K=10, shape="inv")  # exposure

optimizer = create_optimizer(
    optimizer_name="ot_nsw",
    # specific optimizer parameters
    eps=1,
    lr=0.5,
    use_amp=False,
    tol=0.01,
    device="cuda",
)
pi = optimizer.solve(rel_mat_obs, expo)

print(evaluate_pi(pi, rel_mat_true, expo))

List available optimizers and generators

from src import list_optimizers, list_generators

print(list_optimizers())
print(list_generators())

Run Experiment

Before running an experiment, you need register wandb. please see wandb documentation for details.

You can easily perform experiments by changing the parameters because hydra is used. The following commands perform experiments with optimizer=nsw, number of documents of 100, 200, 300, and 400.

rye run python main.py -m optimizer=nsw generator.n_doc=100,200,300,400

Reproduce the results of the paper

make run-all

Export experiment results

The following command exports the experiment results from wandb to the local directory.

rye run python tools/export_results.py --user_name {your wandb user name}

References

  • Saito, Y., & Joachims, T. (2022, August). Fair Ranking as Fair Division: Impact-Based Individual Fairness in Ranking. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (pp. 1514-1524).

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published