Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Jupyter/Colab notebooks for guides/tutorials about Adv. ML & cleverhans #1097

Open
wants to merge 18 commits into
base: master
Choose a base branch
from
Open
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
Created using Colaboratory
axantillon committed Aug 7, 2019
commit 987b1e00c54972a9681c754c24c9d02363ca399e
311 changes: 311 additions & 0 deletions tutorials/future/tf2/mnist_fgsm_tutorial
Original file line number Diff line number Diff line change
@@ -0,0 +1,311 @@
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "Cleverhans FGSM attack.ipynb",
"version": "0.3.2",
"provenance": [],
"collapsed_sections": [
"q53TkEuD-gR9"
],
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/github/andantillon/cleverhans/blob/master/tutorials/future/tf2/mnist_fgsm_tutorial\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "RHwgLwN4nXPi",
"colab_type": "text"
},
"source": [
"# Fast Gradient Sign Method (FGSM) using Cleverhans\n",
"\n",
"#### Easily implement an FGSM attack on a model using Cleverhans and Tensorflow 2.0\n",
">\n",
"Checkout the Cleverhans on Github [here](https://www.github.org/tensorflow/cleverhans).\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Vs9wD38he0cM",
"colab_type": "text"
},
"source": [
"## Import dependent libraries\n",
"Tensorflow 2.0 required \n"
]
},
{
"cell_type": "code",
"metadata": {
"id": "eBsLelavnv0u",
"colab_type": "code",
"colab": {}
},
"source": [
"!pip install -q tensorflow==2.0.0b1\n",
"# Install bleeding edge version of cleverhans\n",
"!pip install git+https://github.com/tensorflow/cleverhans.git#egg=cleverhans\n",
"\n",
"import cleverhans\n",
"import tensorflow as tf\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"print(\"\\nTensorflow Version: \" + tf.__version__)\n",
"print(\"Cleverhans Version: \" + cleverhans.__version__)\n",
"print(\"GPU Available: \", tf.test.is_gpu_available())"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "b4kgPQqp-hwR",
"colab_type": "text"
},
"source": [
"## Training a simple model on the MNIST dataset\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "_z99mq1bKkjz",
"colab_type": "text"
},
"source": [
"> If you would like to experiment with other datasets feel free to replace this code by whatever you need to do so. Just remember that to make the rest of the notebook work without any other major changes keep the variables train_images, train_labels, test_images, test_labels and assign them to the corresponding new data. I recommend any of the datasets offered by keras as they use the same mechanics to import them.\n",
"\n",
">Keep in mind that the MNIST dataset was used in this guide because of the little amount of time it takes to both train a good model and craft the attacks."
]
},
{
"cell_type": "code",
"metadata": {
"id": "uQVP1re__pbf",
"colab_type": "code",
"colab": {}
},
"source": [
"mnist = tf.keras.datasets.mnist\n",
"(train_images, train_labels), (test_images, test_labels) = mnist.load_data()\n",
"\n",
"train_images = train_images / 255.0\n",
"test_images = test_images / 255.0\n",
"\n",
"num_classes = 10\n",
"\n",
"model = tf.keras.Sequential([\n",
" tf.keras.layers.Flatten(input_shape=(28, 28)),\n",
" tf.keras.layers.Dense(32, activation=tf.nn.relu),\n",
" tf.keras.layers.Dense(16, activation=tf.nn.relu),\n",
" tf.keras.layers.Dense(10),\n",
" tf.keras.layers.Activation(tf.nn.softmax) # We seperate the activation layer to be able to access the logits of the previous layer later\n",
"])\n",
"\n",
"model.compile(optimizer='adam',\n",
" loss= 'sparse_categorical_crossentropy',\n",
" metrics=['accuracy'])\n",
"\n",
"model.fit(train_images, train_labels, epochs=10, validation_split=0.2)\n",
"test_loss, test_acc = model.evaluate(test_images, test_labels)\n",
"\n",
"print('Test accuracy:', test_acc)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "QpC4KgTS_Fj8",
"colab_type": "text"
},
"source": [
"## Implementing the FGSM attack using fast_gradient_method()\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "U68CRVKW5ULk",
"colab_type": "text"
},
"source": [
"Cleverhans implements the fgsm attack with the following method:\n",
"\n",
"\n",
"```\n",
"fast_gradient_method(model_fn, x, eps, norm, clip_min=None, clip_max=None, y=None, targeted=False, sanity_checks=False)\n",
"\n",
"```\n",
"\n",
"> * **model_fn**: a callable that takes an input tensor and returns the model logits\n",
"* **x**: input_tensor\n",
"* **eps**: epsilon (dictates the \"strength\" of the distortion created)\n",
"* **norm**: Order of the norm (mimics NumPy). Possible values: np.inf, 1 or 2.\n",
"* **clip_min**: (optional, default=None) float. Minimum float value for adversarial example components.\n",
"* **clip_max**: (optional, default=None) float. Maximum float value for adversarial example components\n",
"* **y**: (optional, default=None) Tensor with true labels. If targeted is true, then provide the target label. Otherwise, only provide this parameter if you'd like to use true labels when crafting adversarial samples. Otherwise, model predictions are used as labels to avoid the \"label leaking\" effect (explained in this [paper](https://arxiv.org/abs/1611.01236)).\n",
"* **targeted**: (optional, default=False) bool. Is the attack targeted or untargeted? Untargeted, the default, will try to make the label incorrect. Targeted will instead try to move in the direction of being more like y.\n",
"* **param sanity_checks**: (optional, default=False) bool, if True, include asserts (Turn them off to use less runtime / memory or for unit tests that intentionally pass strange input)\n",
">\n",
"\n",
"Find it in the library [here](https://github.com/tensorflow/cleverhans/blob/master/cleverhans/future/tf2/attacks/fast_gradient_method.py)"
]
},
{
"cell_type": "code",
"metadata": {
"id": "LgZhvgzbAuU8",
"colab_type": "code",
"colab": {}
},
"source": [
"# Import the attack\n",
"from cleverhans.future.tf2.attacks import fast_gradient_method\n",
"\n",
"#The attack requires the model to ouput the logits\n",
"logits_model = tf.keras.Model(model.input,model.layers[-1].output)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "q53TkEuD-gR9",
"colab_type": "text"
},
"source": [
"### Choose a random image to attack from the test set\n",
"\n"
]
},
{
"cell_type": "code",
"metadata": {
"colab_type": "code",
"id": "wA7_JKrm1wsm",
"colab": {}
},
"source": [
"random_index = np.random.randint(test_images.shape[0])\n",
"\n",
"original_image = test_images[random_index]\n",
"original_image = tf.convert_to_tensor(original_image.reshape((1,28,28))) #The .reshape just gives it the proper form to input into the model, a batch of 1 a.k.a a tensor\n",
"\n",
"original_label = test_labels[random_index]\n",
"original_label = np.reshape(original_label, (1,)).astype('int64')\n",
"\n",
"#Show the image\n",
"plt.figure()\n",
"plt.grid(False)\n",
"\n",
"plt.imshow(np.reshape(original_image, (28,28)))\n",
"plt.title(\"Label: {}\".format(original_label[0]))\n",
"\n",
"plt.show()"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "rz8QBZJ2-DO5",
"colab_type": "text"
},
"source": [
"### Non-targeted FGSM attack\n"
]
},
{
"cell_type": "code",
"metadata": {
"id": "6LSemuYWrZWz",
"colab_type": "code",
"colab": {}
},
"source": [
"epsilon = 0.1\n",
"\n",
"adv_example_untargeted_label = fast_gradient_method(logits_model, original_image, epsilon, np.inf, targeted=False)\n",
"\n",
"adv_example_untargeted_label_pred = model.predict(adv_example_untargeted_label)\n",
"\n",
"#Show the image\n",
"plt.figure()\n",
"plt.grid(False)\n",
"\n",
"plt.imshow(np.reshape(adv_example_untargeted_label, (28,28)))\n",
"plt.title(\"Model Prediction: {}\".format(np.argmax(adv_example_untargeted_label_pred)))\n",
"plt.xlabel(\"Original Label: {}\".format(original_label[0]))\n",
"\n",
"plt.show()"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "-l-eIvZZvSC6",
"colab_type": "text"
},
"source": [
"### Targeted FGSM Attack"
]
},
{
"cell_type": "code",
"metadata": {
"id": "O4teWeVYvUGQ",
"colab_type": "code",
"colab": {}
},
"source": [
"epsilon = 0.1\n",
"# The target value may have to be changed to work, some images are more easily missclassified as different labels\n",
"target = 2\n",
"\n",
"target_label = np.reshape(target, (1,)).astype('int64') # Give target label proper size and dtype to feed through\n",
"\n",
"adv_example_targeted_label = fast_gradient_method(logits_model, original_image, epsilon, np.inf, y=target_label, targeted=True)\n",
"\n",
"adv_example_targeted_label_pred = model.predict(adv_example_targeted_label)\n",
"\n",
"#Show the image\n",
"plt.figure()\n",
"plt.grid(False)\n",
"\n",
"plt.imshow(np.reshape(adv_example_targeted_label, (28,28)))\n",
"plt.title(\"Model Prediction: {}\".format(np.argmax(adv_example_targeted_label_pred)))\n",
"plt.xlabel(\"Original Label: {}\".format(original_label[0]))\n",
"\n",
"plt.show()"
],
"execution_count": 0,
"outputs": []
}
]
}