ArbiNet

ArbiNet is an MEV detection model that requires no knowledge of DeFi smart contracts.

We trained the model on blocks #15,540,000 to #15,585,000, using our own labeled data.

Everything is open: the pretrained GNN-based models (.pkl files), the training and test datasets, the code for training the model, and the code for labeling MEV.

(Updated)

The labeled dataset is available online: https://drive.google.com/drive/u/0/folders/1M36tcAqObNo1gPzJ5_Z_QtNrqprj6V1s

Citation

To cite ArbiNet, please use:

@article{park2023unraveling,
  title={Unraveling the MEV Enigma: ABI-Free Detection Model using Graph Neural Networks},
  author={Park, Seongwan and Jeong, Woojin and Lee, Yunyoung and Son, Bumho and Jang, Huisu and Lee, Jaewook},
  journal={arXiv preprint arXiv:2305.05952},
  year={2023}
}

Requirements

ArbiNet requires:

1. An Erigon archive node with support for traces and receipts.

2. (Optional) A MySQL database to store preprocessed transactions.

MySQL is needed only if you want to generate the training and test data from scratch.
Ready-made training and test data are provided in mev_detection/pretrained_models/
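One way to confirm the first requirement is to send the node one trace request and one receipts request. The sketch below is not part of the repository: it only builds the JSON-RPC payloads and offers a small helper for posting them. `trace_block` and `eth_getBlockReceipts` are JSON-RPC methods that Erigon serves when the corresponding API namespaces are enabled; the helper names and the example URL are assumptions.

```python
import json
import urllib.request


def rpc_payload(method, params, request_id=1):
    """Serialize a JSON-RPC 2.0 request body."""
    return json.dumps(
        {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    )


def probe_payloads(block_number):
    """Build one trace request and one receipts request for a block."""
    block_hex = hex(block_number)
    return {
        "traces": rpc_payload("trace_block", [block_hex], request_id=1),
        "receipts": rpc_payload("eth_getBlockReceipts", [block_hex], request_id=2),
    }


def post_rpc(url, payload, timeout=10):
    """POST a payload to the node and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


payloads = probe_payloads(17167403)
# Example (needs a running node; the URL is an assumption):
# for name, payload in payloads.items():
#     reply = post_rpc("http://localhost:8545", payload)
#     print(name, "ok" if "result" in reply else reply.get("error"))
```

If either reply carries an `error` field instead of a `result`, the node is probably missing the trace namespace or the requested block data.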

Environment

Ubuntu 20.04.5 LTS
PyTorch 1.13.1 (CPU)
Anaconda 4.10.3

Install torch and torch_geometric with the following commands:

### torch cpu
pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cpu

### torch geometric
pip install pyg-lib torch-scatter torch-sparse torch-cluster torch-spline-conv torch-geometric -f https://data.pyg.org/whl/torch-1.13.0+cpu.html
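After running the two commands, a quick check that both packages are importable can save a failed training run later. This is a minimal sketch using only the standard library; the function name `missing_packages` is an assumption and not part of ArbiNet.

```python
import importlib.util


def missing_packages(names):
    """Return the names that cannot be found as importable top-level modules."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# "torch" and "torch_geometric" are the import names of the packages above.
missing = missing_packages(["torch", "torch_geometric"])
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("torch and torch_geometric are importable")
```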

Setup

Clone the repository

git clone https://github.com/etelpmoc/arbinet.git

Create and activate a conda environment

conda create -n mev python=3.7.11
conda activate mev

Install the required packages

pip install -r requirements.txt

Initialize (configures the Erigon archive node and MySQL DB)

cd mev_detection
python mev_detection/initialize.py

Enter your database credentials and, if necessary, edit the host and port of the Erigon node:

# mev_detection/settings.py
DB_HOST="~~"
DB_USER="~~"
DB_PASSWORD="~~"
DB_NAME="~~"
ERIGON_HOST="~~"
ERIGON_PORT=8545
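Before running anything against the node or database, it can help to confirm that no placeholder was left behind. The sketch below is illustrative only: the helper names, the example values, and the use of plain HTTP for the node URL are assumptions, not something settings.py is known to do.

```python
PLACEHOLDER = "~~"  # the placeholder value shown in settings.py above


def unfilled_settings(settings):
    """Return the names of settings that are empty or still hold the placeholder."""
    return sorted(
        name
        for name, value in settings.items()
        if isinstance(value, str) and value.strip() in ("", PLACEHOLDER)
    )


def erigon_url(host, port):
    """Build the node URL; plain HTTP is an assumption of this sketch."""
    return "http://{}:{}".format(host, port)


# Hypothetical values for illustration only.
example = {
    "DB_HOST": "localhost",
    "DB_USER": "~~",
    "DB_PASSWORD": "~~",
    "DB_NAME": "mev",
    "ERIGON_HOST": "localhost",
    "ERIGON_PORT": 8545,
}
print(unfilled_settings(example))  # ['DB_PASSWORD', 'DB_USER']
print(erigon_url(example["ERIGON_HOST"], example["ERIGON_PORT"]))  # http://localhost:8545
```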

Usage

This repository supports three levels of use:

1. Using the pretrained ArbiNet to detect sandwiches and arbitrages
2. Training ArbiNet with your own GNN parameter settings (given the training and test data)
3. Constructing the training and test datasets from scratch

Option 1 is probably the one you are looking for; options 2 and 3 are time-consuming and resource-intensive.

1. Use Pretrained model

Inspecting block 17167403

./arbinet.sh 17167403

returns

Inspecting Block #17167403..🤔
Sandwiches
-> Frontrun : 0x8c6b406617861ddbff8b09b74af5533502eea8e41d8ff84ee75287c761ebf357,
-> Backrun  : 0xd5e131d3fa2a728c43e1bf45fb787cca75e1c147f53f17f4a5a13b21572d60cf
Arbitrages
-> 0xf3e455bd2fb0bc6da000daf921b99d69e28e049e52dc666ae27718d5eeae3c7f
-> 0x4781dd06f5af3ce6d27c88c6d472662f80bb2bd0dbe7eb1bdb327eeb8a5e24a7
-> 0x46e61d5e7a992b21944c43cb93f28ce9fcfc1f28c889878c1ff2fdff73f47359
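If you want to post-process these results, the text output can be turned into a small data structure. The parser below is a sketch that assumes the output always follows the layout shown above (a "Sandwiches" section with Frontrun/Backrun pairs, then an "Arbitrages" section); the function name is hypothetical and the script may offer no such helper itself.

```python
import re

TX_HASH = re.compile(r"0x[0-9a-fA-F]{64}")

SAMPLE = """Inspecting Block #17167403..🤔
Sandwiches
-> Frontrun : 0x8c6b406617861ddbff8b09b74af5533502eea8e41d8ff84ee75287c761ebf357,
-> Backrun  : 0xd5e131d3fa2a728c43e1bf45fb787cca75e1c147f53f17f4a5a13b21572d60cf
Arbitrages
-> 0xf3e455bd2fb0bc6da000daf921b99d69e28e049e52dc666ae27718d5eeae3c7f
-> 0x4781dd06f5af3ce6d27c88c6d472662f80bb2bd0dbe7eb1bdb327eeb8a5e24a7
-> 0x46e61d5e7a992b21944c43cb93f28ce9fcfc1f28c889878c1ff2fdff73f47359
"""


def parse_arbinet_output(text):
    """Collect sandwich (frontrun, backrun) pairs and arbitrage hashes."""
    result = {"sandwiches": [], "arbitrages": []}
    section = None
    pending_frontrun = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == "Sandwiches":
            section = "sandwiches"
            continue
        if stripped == "Arbitrages":
            section = "arbitrages"
            continue
        match = TX_HASH.search(stripped)
        if match is None or section is None:
            continue  # header lines and anything without a transaction hash
        tx_hash = match.group(0)
        if section == "arbitrages":
            result["arbitrages"].append(tx_hash)
        elif "Frontrun" in stripped:
            pending_frontrun = tx_hash
        elif "Backrun" in stripped and pending_frontrun is not None:
            result["sandwiches"].append(
                {"frontrun": pending_frontrun, "backrun": tx_hash}
            )
            pending_frontrun = None
    return result


parsed = parse_arbinet_output(SAMPLE)
print(len(parsed["sandwiches"]), "sandwich,", len(parsed["arbitrages"]), "arbitrages")
```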

Inspecting blocks from 17000000 to 17000010 (the end block number is exclusive)

./arbinet.sh 17000000 17000011
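Judging from this example, the second argument behaves like the end of a half-open range. The sketch below spells out that reading; it is an inference from the command above, not a documented guarantee, and `blocks_to_inspect` is a hypothetical name.

```python
def blocks_to_inspect(start, end):
    """List the block numbers covered, treating `end` as exclusive (assumed)."""
    if end <= start:
        raise ValueError("end must be greater than start")
    return list(range(start, end))


blocks = blocks_to_inspect(17000000, 17000011)
print(blocks[0], blocks[-1], len(blocks))  # 17000000 17000010 11
```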

2. Train ArbiNet with your own parameter settings

To train ArbiNet on your own, first get the training and test data from our open Dropbox cloud (110 MB each).

Download the data with:

cd mev_detection
python download_dataset.py

You should now see train_dataset.pt and test_dataset.pt in pretrained_models/. Alternatively, download them manually from our Google Drive folder.

Once you have the data, train the model with
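A short check that both files actually arrived avoids a confusing failure later on. This sketch uses only the standard library and assumes it is run from the directory that contains pretrained_models/; the function name is hypothetical.

```python
from pathlib import Path

EXPECTED_FILES = ("train_dataset.pt", "test_dataset.pt")


def missing_dataset_files(directory):
    """Return the expected dataset files that are absent from `directory`."""
    base = Path(directory)
    return [name for name in EXPECTED_FILES if not (base / name).is_file()]


missing = missing_dataset_files("pretrained_models")
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("Both dataset files are present")
```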

python train_arbinet.py GAT

Supported GNN layers are GAT, GCN, and GraphSAGE.

To change the parameter settings, edit train_arbinet.py. For example, it contains:

~~
optimizer = torch.optim.Adam(model.parameters(), lr=0.0001, weight_decay=0)
criterion = torch.nn.CrossEntropyLoss(torch.tensor([1, 1], dtype=torch.float))
~~

Here you can change the learning rate, optimizer, loss function, weight decay, and so on.

To change the layers, edit gnn.py. You can add or remove layers, change the number of hidden channels, change the pooling layer, and so on.
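The tensor passed to `CrossEntropyLoss` is a per-class weight vector: `[1, 1]` treats both classes equally, and raising one entry makes mistakes on that class count more. The pure-Python sketch below mirrors the weighted-mean reduction that PyTorch documents for this loss, so the effect can be seen without installing torch. The toy logits, targets, and the `[1, 5]` weighting are illustrative assumptions.

```python
import math


def weighted_cross_entropy(logits, targets, class_weights):
    """Weighted mean of per-sample cross-entropy over the target classes."""
    total_loss = 0.0
    total_weight = 0.0
    for sample_logits, target in zip(logits, targets):
        # Numerically stable log-softmax for the target class.
        peak = max(sample_logits)
        log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in sample_logits))
        log_prob = sample_logits[target] - log_sum_exp
        weight = class_weights[target]
        total_loss += -weight * log_prob
        total_weight += weight
    # Dividing by the summed weights matches the "mean" reduction with weights.
    return total_loss / total_weight


logits = [[2.0, 0.5], [0.2, 1.5], [1.0, 1.0]]
targets = [0, 1, 1]
equal = weighted_cross_entropy(logits, targets, [1.0, 1.0])
skewed = weighted_cross_entropy(logits, targets, [1.0, 5.0])
print(round(equal, 4), round(skewed, 4))
```

With equal weights the result is the ordinary mean loss; with class 1 weighted five times as heavily, the poorly classified class-1 sample dominates and the loss rises.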

# gnn.py
...
class GAT(torch.nn.Module):
    def __init__(self, input_features,hidden_channels):
        super(GAT, self).__init__()
        torch.manual_seed(12345)
        self.conv1 = GATv2Conv(input_features, hidden_channels)
        self.conv2 = GATv2Conv(hidden_channels, hidden_channels)
...
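The convolution layers produce one embedding per node, so a pooling step is needed to reduce them to one vector per graph before a graph-level prediction. The excerpt above does not show which pooling ArbiNet uses; the pure-Python sketch below illustrates mean pooling purely as an example, and the function name and toy data are assumptions.

```python
def global_mean_pool(node_embeddings, batch):
    """Average node embeddings per graph; batch[i] is the graph index of node i.

    Assumes every graph index from 0 to max(batch) has at least one node.
    """
    num_graphs = max(batch) + 1
    dim = len(node_embeddings[0])
    sums = [[0.0] * dim for _ in range(num_graphs)]
    counts = [0] * num_graphs
    for embedding, graph_id in zip(node_embeddings, batch):
        counts[graph_id] += 1
        for j, value in enumerate(embedding):
            sums[graph_id][j] += value
    return [[s / counts[g] for s in sums[g]] for g in range(num_graphs)]


# Two graphs: nodes 0-1 belong to graph 0, nodes 2-4 to graph 1.
embeddings = [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0], [3.0, 6.0], [6.0, 3.0]]
batch = [0, 0, 1, 1, 1]
pooled = global_mean_pool(embeddings, batch)
print(pooled)  # [[2.0, 3.0], [3.0, 3.0]]
```

Swapping the average for a sum or a maximum changes how graph size influences the pooled vector, which is one of the choices you can experiment with in gnn.py.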

Your models will be saved in custom_models/.

To test model performance, run

python test_arbinet.py

3. Constructing training dataset and test dataset from scratch

To construct the dataset, first create the empty database and tables

cd mev_detection
python create_db.py

Add arbitrages and sandwiches to the database

python label_arbitrages.py 15500000 15900000

This might take from a few hours to a few days, depending on your I/O, node speed, and CPU performance.

python label_sandwich.py 15500000 15900000

Add token transfer data to the database (it will be used for graph construction)

python preprocess_transactions.py 15500000 15590000

Training set: blocks 15540000 to 15585000 (balanced data). Test set: blocks 15585000 to 15590000 (unbalanced data, i.e. the actual block data).
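The split by block number and the idea of a balanced training set can be sketched as follows. Two assumptions are made here that the text does not confirm: block ranges are treated as half-open (so block 15585000 falls into the test set), and balancing is done by random undersampling of the more frequent label. All names are hypothetical.

```python
import random

TRAIN_RANGE = (15540000, 15585000)
TEST_RANGE = (15585000, 15590000)


def split_for_block(block_number):
    """Return "train", "test", or None; ranges are treated as half-open (assumed)."""
    if TRAIN_RANGE[0] <= block_number < TRAIN_RANGE[1]:
        return "train"
    if TEST_RANGE[0] <= block_number < TEST_RANGE[1]:
        return "test"
    return None


def balance_by_undersampling(samples, seed=0):
    """Keep equally many samples of each label by undersampling the larger class.

    `samples` is a list of (item, label) pairs with labels 0 or 1.
    """
    positives = [s for s in samples if s[1] == 1]
    negatives = [s for s in samples if s[1] == 0]
    size = min(len(positives), len(negatives))
    rng = random.Random(seed)
    return rng.sample(positives, size) + rng.sample(negatives, size)


# Toy data: 2 MEV transactions (label 1) among 8 ordinary ones (label 0).
toy = [("tx%d" % i, 1 if i < 2 else 0) for i in range(10)]
balanced = balance_by_undersampling(toy)
print(split_for_block(15584999), split_for_block(15585000), len(balanced))
```

The test set is left unbalanced on purpose, so the reported test scores reflect the class proportions found in real blocks.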

Performance

Model	Train F1	Test F1
GCN	0.9934	0.9659
GAT	0.9974	0.9805
GraphSAGE	0.9956	0.9814
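For reference, F1 is the harmonic mean of precision and recall, which can be computed directly from the confusion counts. The sketch below shows the standard binary formula with made-up counts; it does not reproduce the numbers in the table, and whether the table reports binary or averaged F1 is not stated here.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when all three counts are zero."""
    denominator = 2 * true_positives + false_positives + false_negatives
    if denominator == 0:
        return 0.0
    return 2 * true_positives / denominator


# Illustrative counts only: precision 0.8, recall 0.8.
print(f1_score(8, 2, 2))  # 0.8
```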
