Wasserstein Conditional GAN with Gradient Penalty, or WCGAN-GP for short, is a Generative Adversarial Network model used by Walia, Tierney and McKeever (2020) to create synthetic tabular data.
WCGAN-GP uses the Wasserstein loss to mitigate mode collapse and a gradient penalty, instead of weight clipping, to make training more stable. It is also a conditional GAN, meaning that it can generate data conditioned on an input label.
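For reference, the critic and generator objectives usually associated with this combination can be written as below. This is the standard WGAN-GP formulation with the label y added as a conditioning input. The symbols (D for the critic, G for the generator, z for the noise vector, λ for the penalty weight, ε for the interpolation factor) are notation chosen here for illustration, and the exact form used in this repo may differ.

```latex
\begin{aligned}
L_D &= \mathbb{E}_{z,y}\big[D(G(z,y),\,y)\big]
     - \mathbb{E}_{x,y}\big[D(x,\,y)\big]
     + \lambda\,\mathbb{E}_{\hat{x},y}\Big[\big(\lVert \nabla_{\hat{x}} D(\hat{x},\,y) \rVert_2 - 1\big)^2\Big] \\
L_G &= -\,\mathbb{E}_{z,y}\big[D(G(z,y),\,y)\big] \\
\hat{x} &= \epsilon\,x + (1-\epsilon)\,G(z,y), \qquad \epsilon \sim U[0,1]
\end{aligned}
```

The last term of L_D is the gradient penalty: it pushes the norm of the critic's gradient toward 1 at points interpolated between real and generated samples, which replaces the weight clipping used in the original Wasserstein GAN.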
This repo contains a TensorFlow 2 implementation of WCGAN-GP and demonstrates its use by creating synthetic data from the CIC-IDS-2017 dataset.
git clone https://github.com/marzekan/WCGAN-GP.git
cd WCGAN-GP
pip install -r requirements.txt
or
pipenv shell
Check out the Jupyter notebook
pipenv run jupyter notebook
... and open wcgan-gp.ipynb
or
use the packaged module: wcgan.py
Synthetic data evaluation was done using the TableEvaluator package.
Important!
For demo purposes and to reduce resource usage, the CIC-IDS-2017 dataset is sampled down to 25% of its original size.
This is sure to impact the results, so for the best possible results you should train WCGAN-GP on the entire CIC-IDS-2017 dataset after running it through the cleaning-cic-ids-2017.ipynb and data-preproc.ipynb notebooks.
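As a rough illustration of what sampling down to 25% can look like, here is a minimal sketch using only the Python standard library. It is not the code from the notebooks: the function name `stratified_sample`, the column name `"Label"`, the toy rows, and the choice to sample per class (stratified) are all assumptions made for this example.

```python
import random
from collections import defaultdict


def stratified_sample(rows, label_key, frac=0.25, seed=42):
    """Return roughly `frac` of `rows`, sampled separately within each label.

    Hypothetical helper for illustration; not part of this repo.
    """
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible

    # Group rows by class label so rare classes are not lost by chance.
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)

    sample = []
    for group in by_label.values():
        # Keep at least one row per class, even for very small classes.
        k = max(1, round(len(group) * frac))
        sample.extend(rng.sample(group, k))
    return sample


# Toy stand-in for the dataset: 80 "BENIGN" rows and 20 "DDoS" rows.
rows = [{"Label": "BENIGN", "value": i} for i in range(80)]
rows += [{"Label": "DDoS", "value": i} for i in range(20)]

subset = stratified_sample(rows, label_key="Label", frac=0.25)
print(len(subset))  # 25 rows: 20 BENIGN + 5 DDoS
```

Sampling within each label keeps the class proportions of the subset close to those of the full data, which matters for a conditional GAN that is trained per label. Calling the same helper with `frac=1.0` keeps every row.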