Created using Colaboratory

cleverhans-lab · axantillon · Aug 5, 2019 · Aug 5, 2019 · Aug 6, 2019 · Aug 6, 2019
commit 987b1e00c54972a9681c754c24c9d02363ca399e
diff --git a/tutorials/future/tf2/mnist_fgsm_tutorial b/tutorials/future/tf2/mnist_fgsm_tutorial
@@ -0,0 +1,311 @@
+{
+  "nbformat": 4,
+  "nbformat_minor": 0,
+  "metadata": {
+    "colab": {
+      "name": "Cleverhans FGSM attack.ipynb",
+      "version": "0.3.2",
+      "provenance": [],
+      "collapsed_sections": [
+        "q53TkEuD-gR9"
+      ],
+      "include_colab_link": true
+    },
+    "kernelspec": {
+      "name": "python3",
+      "display_name": "Python 3"
+    }
+  },
+  "cells": [
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "view-in-github",
+        "colab_type": "text"
+      },
+      "source": [
+        "<a href=\"https://colab.research.google.com/github/andantillon/cleverhans/blob/master/tutorials/future/tf2/mnist_fgsm_tutorial\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "RHwgLwN4nXPi",
+        "colab_type": "text"
+      },
+      "source": [
+        "# Fast Gradient Sign Method (FGSM) using Cleverhans\n",
+        "\n",
+        "#### Easily implement an FGSM attack on a model using Cleverhans and Tensorflow 2.0\n",
+        ">\n",
+        "Checkout the Cleverhans on Github [here](https://www.github.org/tensorflow/cleverhans).\n"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "Vs9wD38he0cM",
+        "colab_type": "text"
+      },
+      "source": [
+        "## Import dependent libraries\n",
+        "Tensorflow 2.0 required \n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "eBsLelavnv0u",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        "!pip install -q tensorflow==2.0.0b1\n",
+        "# Install bleeding edge version of cleverhans\n",
+        "!pip install git+https://github.com/tensorflow/cleverhans.git#egg=cleverhans\n",
+        "\n",
+        "import cleverhans\n",
+        "import tensorflow as tf\n",
+        "import numpy as np\n",
+        "import matplotlib.pyplot as plt\n",
+        "\n",
+        "print(\"\\nTensorflow Version: \" + tf.__version__)\n",
+        "print(\"Cleverhans Version: \" + cleverhans.__version__)\n",
+        "print(\"GPU Available: \", tf.test.is_gpu_available())"
+      ],
+      "execution_count": 0,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "b4kgPQqp-hwR",
+        "colab_type": "text"
+      },
+      "source": [
+        "## Training a simple model on the MNIST dataset\n"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "_z99mq1bKkjz",
+        "colab_type": "text"
+      },
+      "source": [
+        "> If you would like to experiment with other datasets feel free to replace this code by whatever you need to do so. Just remember that to make the rest of the notebook work without any other major changes keep the variables train_images, train_labels, test_images, test_labels and assign them to the corresponding new data. I recommend any of the datasets offered by keras as they use the same mechanics to import them.\n",
+        "\n",
+        ">Keep in mind that the MNIST dataset was used in this guide because of the little amount of time it takes to both train a good model and craft the attacks."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "uQVP1re__pbf",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        "mnist = tf.keras.datasets.mnist\n",
+        "(train_images, train_labels), (test_images, test_labels) = mnist.load_data()\n",
+        "\n",
+        "train_images = train_images / 255.0\n",
+        "test_images = test_images / 255.0\n",
+        "\n",
+        "num_classes = 10\n",
+        "\n",
+        "model = tf.keras.Sequential([\n",
+        "    tf.keras.layers.Flatten(input_shape=(28, 28)),\n",
+        "    tf.keras.layers.Dense(32, activation=tf.nn.relu),\n",
+        "    tf.keras.layers.Dense(16, activation=tf.nn.relu),\n",
+        "    tf.keras.layers.Dense(10),\n",
+        "    tf.keras.layers.Activation(tf.nn.softmax) # We seperate the activation layer to be able to access the logits of the previous layer later\n",
+        "])\n",
+        "\n",
+        "model.compile(optimizer='adam',\n",
+        "              loss= 'sparse_categorical_crossentropy',\n",
+        "              metrics=['accuracy'])\n",
+        "\n",
+        "model.fit(train_images, train_labels, epochs=10, validation_split=0.2)\n",
+        "test_loss, test_acc = model.evaluate(test_images, test_labels)\n",
+        "\n",
+        "print('Test accuracy:', test_acc)"
+      ],
+      "execution_count": 0,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "QpC4KgTS_Fj8",
+        "colab_type": "text"
+      },
+      "source": [
+        "## Implementing the FGSM attack using fast_gradient_method()\n"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "U68CRVKW5ULk",
+        "colab_type": "text"
+      },
+      "source": [
+        "Cleverhans implements the fgsm attack with the following method:\n",
+        "\n",
+        "\n",
+        "```\n",
+        "fast_gradient_method(model_fn, x, eps, norm, clip_min=None, clip_max=None, y=None, targeted=False, sanity_checks=False)\n",
+        "\n",
+        "```\n",
+        "\n",
+        "> * **model_fn**: a callable that takes an input tensor and returns the model logits\n",
+        "* **x**: input_tensor\n",
+        "* **eps**: epsilon (dictates the \"strength\" of the distortion created)\n",
+        "* **norm**: Order of the norm (mimics NumPy). Possible values: np.inf, 1 or 2.\n",
+        "* **clip_min**: (optional, default=None) float. Minimum float value for adversarial example components.\n",
+        "* **clip_max**: (optional, default=None) float. Maximum float value for adversarial example components\n",
+        "* **y**: (optional, default=None) Tensor with true labels. If targeted is true, then provide the target label. Otherwise, only provide this parameter if you'd like to use true labels when crafting adversarial samples. Otherwise, model predictions are used as labels to avoid the \"label leaking\" effect (explained in this [paper](https://arxiv.org/abs/1611.01236)).\n",
+        "* **targeted**: (optional, default=False) bool. Is the attack targeted or untargeted? Untargeted, the default, will try to make the label incorrect. Targeted will instead try to move in the direction of being more like y.\n",
+        "* **param sanity_checks**: (optional, default=False) bool, if True, include  asserts (Turn them off to use less runtime / memory or for unit tests that intentionally pass strange input)\n",
+        ">\n",
+        "\n",
+        "Find it in the library [here](https://github.com/tensorflow/cleverhans/blob/master/cleverhans/future/tf2/attacks/fast_gradient_method.py)"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "LgZhvgzbAuU8",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        "# Import the attack\n",
+        "from cleverhans.future.tf2.attacks import fast_gradient_method\n",
+        "\n",
+        "#The attack requires the model to ouput the logits\n",
+        "logits_model = tf.keras.Model(model.input,model.layers[-1].output)"
+      ],
+      "execution_count": 0,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "q53TkEuD-gR9",
+        "colab_type": "text"
+      },
+      "source": [
+        "### Choose a random image to attack from the test set\n",
+        "\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "colab_type": "code",
+        "id": "wA7_JKrm1wsm",
+        "colab": {}
+      },
+      "source": [
+        "random_index = np.random.randint(test_images.shape[0])\n",
+        "\n",
+        "original_image = test_images[random_index]\n",
+        "original_image = tf.convert_to_tensor(original_image.reshape((1,28,28))) #The .reshape just gives it the proper form to input into the model, a batch of 1 a.k.a a tensor\n",
+        "\n",
+        "original_label = test_labels[random_index]\n",
+        "original_label = np.reshape(original_label, (1,)).astype('int64')\n",
+        "\n",
+        "#Show the image\n",
+        "plt.figure()\n",
+        "plt.grid(False)\n",
+        "\n",
+        "plt.imshow(np.reshape(original_image, (28,28)))\n",
+        "plt.title(\"Label: {}\".format(original_label[0]))\n",
+        "\n",
+        "plt.show()"
+      ],
+      "execution_count": 0,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "rz8QBZJ2-DO5",
+        "colab_type": "text"
+      },
+      "source": [
+        "### Non-targeted FGSM attack\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "6LSemuYWrZWz",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        "epsilon = 0.1\n",
+        "\n",
+        "adv_example_untargeted_label = fast_gradient_method(logits_model, original_image, epsilon, np.inf, targeted=False)\n",
+        "\n",
+        "adv_example_untargeted_label_pred = model.predict(adv_example_untargeted_label)\n",
+        "\n",
+        "#Show the image\n",
+        "plt.figure()\n",
+        "plt.grid(False)\n",
+        "\n",
+        "plt.imshow(np.reshape(adv_example_untargeted_label, (28,28)))\n",
+        "plt.title(\"Model Prediction: {}\".format(np.argmax(adv_example_untargeted_label_pred)))\n",
+        "plt.xlabel(\"Original Label: {}\".format(original_label[0]))\n",
+        "\n",
+        "plt.show()"
+      ],
+      "execution_count": 0,
+      "outputs": []
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "-l-eIvZZvSC6",
+        "colab_type": "text"
+      },
+      "source": [
+        "### Targeted FGSM Attack"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {
+        "id": "O4teWeVYvUGQ",
+        "colab_type": "code",
+        "colab": {}
+      },
+      "source": [
+        "epsilon = 0.1\n",
+        "# The target value may have to be changed to work, some images are more easily missclassified as different labels\n",
+        "target = 2\n",
+        "\n",
+        "target_label = np.reshape(target, (1,)).astype('int64') # Give target label proper size and dtype to feed through\n",
+        "\n",
+        "adv_example_targeted_label = fast_gradient_method(logits_model, original_image, epsilon, np.inf, y=target_label, targeted=True)\n",
+        "\n",
+        "adv_example_targeted_label_pred = model.predict(adv_example_targeted_label)\n",
+        "\n",
+        "#Show the image\n",
+        "plt.figure()\n",
+        "plt.grid(False)\n",
+        "\n",
+        "plt.imshow(np.reshape(adv_example_targeted_label, (28,28)))\n",
+        "plt.title(\"Model Prediction: {}\".format(np.argmax(adv_example_targeted_label_pred)))\n",
+        "plt.xlabel(\"Original Label: {}\".format(original_label[0]))\n",
+        "\n",
+        "plt.show()"
+      ],
+      "execution_count": 0,
+      "outputs": []
+    }
+  ]
+}