diff --git a/tensorflow_privacy/privacy/privacy_tests/membership_inference_attack/README.md b/tensorflow_privacy/privacy/privacy_tests/membership_inference_attack/README.md index 5ad8982a..541e4256 100644 --- a/tensorflow_privacy/privacy/privacy_tests/membership_inference_attack/README.md +++ b/tensorflow_privacy/privacy/privacy_tests/membership_inference_attack/README.md @@ -10,7 +10,8 @@ memorization is present and thus the less privacy-preserving the model is. The privacy vulnerability (or memorization potential) is measured via the area under the ROC-curve (`auc`) or via max{|fpr - tpr|} (`advantage`) of the attack -classifier. These measures are very closely related. +classifier. These measures are very closely related. We can also obtain a lower +bound for the differential privacy epsilon. The tests provided by the library are "black box". That is, only the outputs of the model are used (e.g., losses, logits, predictions). Neither model internals @@ -38,9 +39,8 @@ from tensorflow_privacy.privacy.privacy_tests.membership_inference_attack.data_s # loss_test shape: (n_test, ) attacks_result = mia.run_attacks( - AttackInputData( - loss_train = loss_train, - loss_test = loss_test)) + AttackInputData(loss_train=loss_train, loss_test=loss_test) +) ``` This example calls `run_attacks` with the default options to run a host of @@ -57,9 +57,11 @@ Then, we can view the attack results by: ```python print(attacks_result.summary()) # Example output: -# -> Best-performing attacks over all slices -# THRESHOLD_ATTACK (with 50000 training and 10000 test examples) achieved an AUC of 0.59 on slice Entire dataset -# THRESHOLD_ATTACK (with 50000 training and 10000 test examples) achieved an advantage of 0.20 on slice Entire dataset +# Best-performing attacks over all slices +# LOGISTIC_REGRESSION (with 7041 training and 3156 test examples) achieved an AUC of 0.72 on slice CORRECTLY_CLASSIFIED=False +# LOGISTIC_REGRESSION (with 7041 training and 3156 test examples) achieved an advantage of 0.34 on slice CORRECTLY_CLASSIFIED=False +# LOGISTIC_REGRESSION (with 5000 training and 1000 test examples) achieved a positive predictive value of 1.00 on slice CLASS=0 +# THRESHOLD_ATTACK (with 50000 training and 10000 test examples) achieved top-5 epsilon lower bounds of 4.6254, 4.6121, 4.5986, 4.5850, 4.5711 on slice Entire dataset ``` ### Other codelabs @@ -100,16 +102,17 @@ First, similar as before, we specify the input for the attack as an # loss_test shape: (n_test, ) attack_input = AttackInputData( - logits_train = logits_train, - logits_test = logits_test, - loss_train = loss_train, - loss_test = loss_test, - labels_train = labels_train, - labels_test = labels_test) + logits_train=logits_train, + logits_test=logits_test, + loss_train=loss_train, + loss_test=loss_test, + labels_train=labels_train, + labels_test=labels_test, +) ``` Instead of `logits`, you can also specify `probs_train` and `probs_test` as the -predicted probabilty vectors of each example. +predicted probability vectors of each example. Then, we specify some details of the attack. The first part includes the specifications of the slicing of the data. For example, we may want to evaluate @@ -118,10 +121,11 @@ the model's classification. These can be specified by a `SlicingSpec` object. 
```python
slicing_spec = SlicingSpec(
-    entire_dataset = True,
-    by_class = True,
-    by_percentiles = False,
-    by_classification_correctness = True)
+    entire_dataset=True,
+    by_class=True,
+    by_percentiles=False,
+    by_classification_correctness=True,
+)
```

The second part specifies the classifiers for the attacker to use. Currently,
@@ -129,56 +133,64 @@ our API supports five classifiers, including `AttackType.THRESHOLD_ATTACK` for
a simple threshold attack, `AttackType.LOGISTIC_REGRESSION`,
`AttackType.MULTI_LAYERED_PERCEPTRON`, `AttackType.RANDOM_FOREST`, and
`AttackType.K_NEAREST_NEIGHBORS`, which use the corresponding machine learning
-models. For some model, different classifiers can yield pertty different
-results. We can put multiple classifers in a list:
+models. For some models, different classifiers can yield pretty different
+results. We can put multiple classifiers in a list:

```python
attack_types = [
    AttackType.THRESHOLD_ATTACK,
-    AttackType.LOGISTIC_REGRESSION
+    AttackType.LOGISTIC_REGRESSION,
]
```

Now, we can call the `run_attacks` method with all specifications:

```python
-attacks_result = mia.run_attacks(attack_input=attack_input,
-                                 slicing_spec=slicing_spec,
-                                 attack_types=attack_types)
+attacks_result = mia.run_attacks(
+    attack_input=attack_input,
+    slicing_spec=slicing_spec,
+    attack_types=attack_types,
+)
```

This returns an object of type `AttackResults`. We can, for example, use the
-following code to see the attack results specificed per-slice, as we have
-request attacks by class and by model's classification correctness.
+following code to see the attack results specified per-slice, as we have
+requested attacks by class and by model's classification correctness.

```python
print(attacks_result.summary(by_slices = True))
# Example output:
-# -> Best-performing attacks over all slices
-# THRESHOLD_ATTACK achieved an AUC of 0.75 on slice CORRECTLY_CLASSIFIED=False
-# THRESHOLD_ATTACK achieved an advantage of 0.38 on slice CORRECTLY_CLASSIFIED=False
-#
-# Best-performing attacks over slice: "Entire dataset"
-# LOGISTIC_REGRESSION achieved an AUC of 0.61
-# THRESHOLD_ATTACK achieved an advantage of 0.22
-#
-# Best-performing attacks over slice: "CLASS=0"
-# LOGISTIC_REGRESSION achieved an AUC of 0.62
-# LOGISTIC_REGRESSION achieved an advantage of 0.24
-#
-# Best-performing attacks over slice: "CLASS=1"
-# LOGISTIC_REGRESSION achieved an AUC of 0.61
-# LOGISTIC_REGRESSION achieved an advantage of 0.19
-#
-# ...
-# -# Best-performing attacks over slice: "CORRECTLY_CLASSIFIED=True" -# LOGISTIC_REGRESSION achieved an AUC of 0.53 -# THRESHOLD_ATTACK achieved an advantage of 0.05 -# -# Best-performing attacks over slice: "CORRECTLY_CLASSIFIED=False" -# THRESHOLD_ATTACK achieved an AUC of 0.75 -# THRESHOLD_ATTACK achieved an advantage of 0.38 +# Best-performing attacks over all slices +# LOGISTIC_REGRESSION (with 7041 training and 3156 test examples) achieved an AUC of 0.72 on slice CORRECTLY_CLASSIFIED=False +# LOGISTIC_REGRESSION (with 7041 training and 3156 test examples) achieved an advantage of 0.34 on slice CORRECTLY_CLASSIFIED=False +# LOGISTIC_REGRESSION (with 5000 training and 1000 test examples) achieved a positive predictive value of 1.00 on slice CLASS=0 +# THRESHOLD_ATTACK (with 50000 training and 10000 test examples) achieved top-5 epsilon lower bounds of 4.6254, 4.6121, 4.5986, 4.5850, 4.5711 on slice Entire dataset + +# Best-performing attacks over slice: "Entire dataset" +# LOGISTIC_REGRESSION (with 50000 training and 10000 test examples) achieved an AUC of 0.58 +# LOGISTIC_REGRESSION (with 50000 training and 10000 test examples) achieved an advantage of 0.17 +# THRESHOLD_ATTACK (with 50000 training and 10000 test examples) achieved a positive predictive value of 0.86 +# THRESHOLD_ATTACK (with 50000 training and 10000 test examples) achieved top-5 epsilon lower bounds of 4.6254, 4.6121, 4.5986, 4.5850, 4.5711 + +# Best-performing attacks over slice: "CLASS=0" +# LOGISTIC_REGRESSION (with 5000 training and 1000 test examples) achieved an AUC of 0.63 +# LOGISTIC_REGRESSION (with 5000 training and 1000 test examples) achieved an advantage of 0.19 +# LOGISTIC_REGRESSION (with 5000 training and 1000 test examples) achieved a positive predictive value of 1.00 +# THRESHOLD_ATTACK (with 5000 training and 1000 test examples) achieved top-5 epsilon lower bounds of 4.1920, 4.1645, 4.1364, 4.1074, 4.0775 + +# ... + +# Best-performing attacks over slice: "CORRECTLY_CLASSIFIED=True" +# LOGISTIC_REGRESSION (with 42959 training and 6844 test examples) achieved an AUC of 0.51 +# LOGISTIC_REGRESSION (with 42959 training and 6844 test examples) achieved an advantage of 0.05 +# LOGISTIC_REGRESSION (with 42959 training and 6844 test examples) achieved a positive predictive value of 0.94 +# THRESHOLD_ATTACK (with 42959 training and 6844 test examples) achieved top-5 epsilon lower bounds of 0.9495, 0.6358, 0.5630, 0.4536, 0.4341 + +# Best-performing attacks over slice: "CORRECTLY_CLASSIFIED=False" +# LOGISTIC_REGRESSION (with 7041 training and 3156 test examples) achieved an AUC of 0.72 +# LOGISTIC_REGRESSION (with 7041 training and 3156 test examples) achieved an advantage of 0.34 +# LOGISTIC_REGRESSION (with 7041 training and 3156 test examples) achieved a positive predictive value of 0.97 +# LOGISTIC_REGRESSION (with 7041 training and 3156 test examples) achieved top-5 epsilon lower bounds of 3.8844, 3.8678, 3.8510, 3.8339, 3.8165 ``` #### Viewing and plotting the attack results @@ -186,23 +198,30 @@ print(attacks_result.summary(by_slices = True)) We have seen an example of using `summary()` to view the attack results as text. We also provide some other ways for inspecting the attack results. 
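+Each attack that was run is also stored as an individual result. As a minimal
+sketch of programmatic access (assuming the `single_attack_results` list and
+the `slice_spec` field of the current data structures), we can iterate over
+all of them and print their metrics:
+
+```python
+# Print the type, slice, and metrics of every attack that was run.
+for result in attacks_result.single_attack_results:
+  print(
+      "%s on slice %s: AUC = %.2f, advantage = %.2f, epsilon lower bounds = %s"
+      % (
+          result.attack_type,
+          result.slice_spec,
+          result.roc_curve.get_auc(),
+          result.roc_curve.get_attacker_advantage(),
+          result.get_epsilon_lower_bound(),
+      )
+  )
+```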
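+The epsilon lower bounds reported above can be read through the lens of
+differential privacy: an (epsilon, 0)-DP training procedure forces every
+membership inference attack to satisfy tpr <= exp(epsilon) * fpr, so each
+point on an attack's ROC curve certifies epsilon >= log(tpr / fpr). The
+following sketch is illustrative only; it assumes the `tpr` and `fpr` arrays
+on `RocCurve`, and the library's own estimates additionally account for
+statistical uncertainty, so the numbers will not match exactly.
+
+```python
+import numpy as np
+
+# Naive pure-DP bound from the best ROC point: epsilon >= log(tpr / fpr).
+roc = attacks_result.get_result_with_max_auc().roc_curve
+with np.errstate(divide="ignore", invalid="ignore"):
+  naive_bounds = np.log(roc.tpr / roc.fpr)
+print("Naive epsilon lower bound: %.4f" %
+      naive_bounds[np.isfinite(naive_bounds)].max())
+```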
-To get the attack that achieves the maximum attacker advantage or AUC, we can do
+To get the attack that achieves the maximum attacker advantage, AUC, or epsilon
+lower bound, we can do

```python
max_auc_attacker = attacks_result.get_result_with_max_auc()
max_advantage_attacker = attacks_result.get_result_with_max_attacker_advantage()
+max_epsilon_attacker = attacks_result.get_result_with_max_epsilon()
```

Then, for an individual attack, such as `max_auc_attacker`, we can check its type,
-attacker advantage and AUC by
+attacker advantage, AUC, and epsilon lower bound by

```python
-print("Attack type with max AUC: %s, AUC of %.2f, Attacker advantage of %.2f" %
-      (max_auc_attacker.attack_type,
-       max_auc_attacker.roc_curve.get_auc(),
-       max_auc_attacker.roc_curve.get_attacker_advantage()))
+print(
+    "Attack type with max AUC: %s, AUC of %.2f, Attacker advantage of %.2f, Epsilon lower bound of %s"
+    % (
+        max_auc_attacker.attack_type,
+        max_auc_attacker.roc_curve.get_auc(),
+        max_auc_attacker.roc_curve.get_attacker_advantage(),
+        max_auc_attacker.get_epsilon_lower_bound()
+    )
+)
# Example output:
-# -> Attack type with max AUC: THRESHOLD_ATTACK, AUC of 0.75, Attacker advantage of 0.38
+# Attack type with max AUC: LOGISTIC_REGRESSION, AUC of 0.72, Attacker advantage of 0.34, Epsilon lower bound of [3.88435257 3.86781797 3.85100545 3.83390548 3.81650809]
```

We can also plot its ROC curve by
@@ -217,7 +236,7 @@ which would give a figure like the one below

![roc_fig](https://github.com/tensorflow/privacy/blob/master/tensorflow_privacy/privacy/privacy_tests/membership_inference_attack/codelab_roc_fig.png?raw=true)

Additionally, we provide functionality to convert the attack results into a Pandas
-data frame:
+dataframe:

```python
import pandas as pd

pd.set_option("display.max_rows", 8, "display.max_columns", None)
print(attacks_result.calculate_pd_dataframe())
# Example output:
-# slice feature slice value attack type Attacker advantage AUC
-# 0 entire_dataset threshold 0.216440 0.600630
-# 1 entire_dataset lr 0.212073 0.612989
-# 2 class 0 threshold 0.226000 0.611669
-# 3 class 0 lr 0.239452 0.624076
-# .. ... ... ... ... ...
-# 22 correctly_classfied True threshold 0.054907 0.471290
-# 23 correctly_classfied True lr 0.046986 0.525194
-# 24 correctly_classfied False threshold 0.379465 0.748138
-# 25 correctly_classfied False lr 0.370713 0.737148
+# slice feature slice value train size test size attack type Attacker advantage Positive predictive value AUC Epsilon lower bound_1 Epsilon lower bound_2 Epsilon lower bound_3 Epsilon lower bound_4 Epsilon lower bound_5
+# 0 Entire dataset 50000 10000 THRESHOLD_ATTACK 0.172520 0.862614 0.581630 4.625393 4.612104 4.598635 4.584982 4.571140
+# 1 Entire dataset 50000 10000 LOGISTIC_REGRESSION 0.173060 0.862081 0.583981 4.531399 4.513775 4.511974 4.498905 4.492165
+# 2 class 0 5000 1000 THRESHOLD_ATTACK 0.162000 0.877551 0.580728 4.191954 4.164547 4.136368 4.107372 4.077511
+# 3 class 0 5000 1000 LOGISTIC_REGRESSION 0.193800 1.000000 0.627758 3.289194 3.220285 3.146292 3.118849 3.066407
+# ...
+# 22 correctly_classified True 42959 6844 THRESHOLD_ATTACK 0.043953 0.862643 0.474713 0.949550 0.635773 0.563032 0.453640 0.434125 +# 23 correctly_classified True 42959 6844 LOGISTIC_REGRESSION 0.048963 0.943218 0.505334 0.597257 0.596095 0.594016 0.592702 0.590765 +# 24 correctly_classified False 7041 3156 THRESHOLD_ATTACK 0.326865 0.941176 0.707597 3.818741 3.805451 3.791982 3.778329 3.764488 +# 25 correctly_classified False 7041 3156 LOGISTIC_REGRESSION 0.336655 0.972222 0.717386 3.884353 3.867818 3.851005 3.833905 3.816508 ``` #### Advanced Membership Inference Attacks diff --git a/tensorflow_privacy/privacy/privacy_tests/membership_inference_attack/codelabs/codelab.ipynb b/tensorflow_privacy/privacy/privacy_tests/membership_inference_attack/codelabs/codelab.ipynb index b0903101..00a3c323 100644 --- a/tensorflow_privacy/privacy/privacy_tests/membership_inference_attack/codelabs/codelab.ipynb +++ b/tensorflow_privacy/privacy/privacy_tests/membership_inference_attack/codelabs/codelab.ipynb @@ -1,393 +1,386 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "1eiwVljWpzM7" - }, - "source": [ - "Copyright 2020 The TensorFlow Authors.\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "cellView": "both", - "colab": {}, - "colab_type": "code", - "id": "4rmwPgXeptiS" - }, - "outputs": [], - "source": [ - "#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n", - "# you may not use this file except in compliance with the License.\n", - "# You may obtain a copy of the License at\n", - "#\n", - "# https://www.apache.org/licenses/LICENSE-2.0\n", - "#\n", - "# Unless required by applicable law or agreed to in writing, software\n", - "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", - "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", - "# See the License for the specific language governing permissions and\n", - "# limitations under the License." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "YM2gRaJMqvMi" - }, - "source": [ - "# Assess privacy risks with TensorFlow Privacy Membership Inference Attacks" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "-B5ZvlSqqLaR" - }, - "source": [ - "\n", - " \n", - " \n", - "
\n", - " Run in Google Colab\n", - " \n", - " View source on GitHub\n", - "
" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "9rMuytY7Nn8P" - }, - "source": [ - "##Overview\n", - "In this codelab we'll train a simple image classification model on the CIFAR10 dataset, and then use the \"membership inference attack\" against this model to assess if the attacker is able to \"guess\" whether a particular sample was present in the training set." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "FUWqArj_q8vs" - }, - "source": [ - "## Setup\n", - "First, set this notebook's runtime to use a GPU, under Runtime > Change runtime type > Hardware accelerator. Then, begin importing the necessary libraries." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "cellView": "form", - "colab": {}, - "colab_type": "code", - "id": "Lr1pwHcbralz" - }, - "outputs": [], - "source": [ - "#@title Import statements.\n", - "import numpy as np\n", - "from typing import Tuple, Text\n", - "from scipy import special\n", - "\n", - "import tensorflow as tf\n", - "import tensorflow_datasets as tfds\n", - "\n", - "# Set verbosity.\n", - "tf.compat.v1.logging.set_verbosity(tf.compat.v1.logging.ERROR)\n", - "from warnings import simplefilter\n", - "from sklearn.exceptions import ConvergenceWarning\n", - "simplefilter(action=\"ignore\", category=ConvergenceWarning)\n", - "simplefilter(action=\"ignore\", category=FutureWarning)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "ucw81ar6ru-6" - }, - "source": [ - "### Install TensorFlow Privacy." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "cellView": "both", - "colab": {}, - "colab_type": "code", - "id": "zcqAmiGH90kl" - }, - "outputs": [], - "source": [ - "!pip3 install git+https://github.com/tensorflow/privacy\n", - "\n", - "from tensorflow_privacy.privacy.privacy_tests.membership_inference_attack import membership_inference_attack as mia" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "pBbcG86th_sW" - }, - "source": [ - "## Train a model" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "cellView": "form", - "colab": {}, - "colab_type": "code", - "id": "vCyOWyyhXLib" - }, - "outputs": [], - "source": [ - "#@markdown Train a simple model on CIFAR10 with Keras.\n", - "\n", - "dataset = 'cifar10'\n", - "num_classes = 10\n", - "num_conv = 3\n", - "activation = 'relu'\n", - "lr = 0.02\n", - "momentum = 0.9\n", - "batch_size = 250\n", - "epochs = 100 # Privacy risks are especially visible with lots of epochs.\n", - "\n", - "\n", - "def small_cnn(input_shape: Tuple[int],\n", - " num_classes: int,\n", - " num_conv: int,\n", - " activation: Text = 'relu') -> tf.keras.models.Sequential:\n", - " \"\"\"Setup a small CNN for image classification.\n", - "\n", - " Args:\n", - " input_shape: Integer tuple for the shape of the images.\n", - " num_classes: Number of prediction classes.\n", - " num_conv: Number of convolutional layers.\n", - " activation: The activation function to use for conv and dense layers.\n", - "\n", - " Returns:\n", - " The Keras model.\n", - " \"\"\"\n", - " model = tf.keras.models.Sequential()\n", - " model.add(tf.keras.layers.Input(shape=input_shape))\n", - "\n", - " # Conv layers\n", - " for _ in range(num_conv):\n", - " model.add(tf.keras.layers.Conv2D(32, (3, 3), activation=activation))\n", - " model.add(tf.keras.layers.MaxPooling2D())\n", - "\n", - " 
model.add(tf.keras.layers.Flatten())\n", - " model.add(tf.keras.layers.Dense(64, activation=activation))\n", - " model.add(tf.keras.layers.Dense(num_classes))\n", - " return model\n", - "\n", - "\n", - "print('Loading the dataset.')\n", - "train_ds = tfds.as_numpy(\n", - " tfds.load(dataset, split=tfds.Split.TRAIN, batch_size=-1))\n", - "test_ds = tfds.as_numpy(\n", - " tfds.load(dataset, split=tfds.Split.TEST, batch_size=-1))\n", - "x_train = train_ds['image'].astype('float32') / 255.\n", - "y_train_indices = train_ds['label'][:, np.newaxis]\n", - "x_test = test_ds['image'].astype('float32') / 255.\n", - "y_test_indices = test_ds['label'][:, np.newaxis]\n", - "\n", - "# Convert class vectors to binary class matrices.\n", - "y_train = tf.keras.utils.to_categorical(y_train_indices, num_classes)\n", - "y_test = tf.keras.utils.to_categorical(y_test_indices, num_classes)\n", - "\n", - "input_shape = x_train.shape[1:]\n", - "\n", - "model = small_cnn(\n", - " input_shape, num_classes, num_conv=num_conv, activation=activation)\n", - "\n", - "print('learning rate %f', lr)\n", - "\n", - "optimizer = tf.keras.optimizers.SGD(lr=lr, momentum=momentum)\n", - "\n", - "loss = tf.keras.losses.CategoricalCrossentropy(from_logits=True)\n", - "model.compile(loss=loss, optimizer=optimizer, metrics=['accuracy'])\n", - "model.summary()\n", - "model.fit(\n", - " x_train,\n", - " y_train,\n", - " batch_size=batch_size,\n", - " epochs=epochs,\n", - " validation_data=(x_test, y_test),\n", - " shuffle=True)\n", - "print('Finished training.')" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "ee-zjGGGV1DC" - }, - "source": [ - "## Calculate logits, probabilities and loss values for training and test sets.\n", - "\n", - "We will use these values later in the membership inference attack to separate training and test samples." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "cellView": "both", - "colab": {}, - "colab_type": "code", - "id": "um9r0tSiPx4u" - }, - "outputs": [], - "source": [ - "print('Predict on train...')\n", - "logits_train = model.predict(x_train, batch_size=batch_size)\n", - "print('Predict on test...')\n", - "logits_test = model.predict(x_test, batch_size=batch_size)\n", - "\n", - "print('Apply softmax to get probabilities from logits...')\n", - "prob_train = special.softmax(logits_train, axis=1)\n", - "prob_test = special.softmax(logits_test, axis=1)\n", - "\n", - "print('Compute losses...')\n", - "cce = tf.keras.backend.categorical_crossentropy\n", - "constant = tf.keras.backend.constant\n", - "\n", - "loss_train = cce(constant(y_train), constant(prob_train), from_logits=False).numpy()\n", - "loss_test = cce(constant(y_test), constant(prob_test), from_logits=False).numpy()" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "QETxVOHLiHP4" - }, - "source": [ - "## Run membership inference attacks.\n", - "\n", - "We will now execute a membership inference attack against the previously trained CIFAR10 model. This will generate a number of scores, most notably, attacker advantage and AUC for the membership inference classifier.\n", - "\n", - "An AUC of close to 0.5 means that the attack wasn't able to identify training samples, which means that the model doesn't have privacy issues according to this test. Higher values, on the contrary, indicate potential privacy issues." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "colab": {}, - "colab_type": "code", - "id": "B8NIwhVwQT7I" - }, - "outputs": [], - "source": [ - "from tensorflow_privacy.privacy.privacy_tests.membership_inference_attack.data_structures import AttackInputData\n", - "from tensorflow_privacy.privacy.privacy_tests.membership_inference_attack.data_structures import SlicingSpec\n", - "from tensorflow_privacy.privacy.privacy_tests.membership_inference_attack.data_structures import AttackType\n", - "\n", - "import tensorflow_privacy.privacy.privacy_tests.membership_inference_attack.plotting as plotting\n", - "\n", - "labels_train = np.argmax(y_train, axis=1)\n", - "labels_test = np.argmax(y_test, axis=1)\n", - "\n", - "input = AttackInputData(\n", - " logits_train = logits_train,\n", - " logits_test = logits_test,\n", - " loss_train = loss_train,\n", - " loss_test = loss_test,\n", - " labels_train = labels_train,\n", - " labels_test = labels_test\n", - ")\n", - "\n", - "# Run several attacks for different data slices\n", - "attacks_result = mia.run_attacks(input,\n", - " SlicingSpec(\n", - " entire_dataset = True,\n", - " by_class = True,\n", - " by_classification_correctness = True\n", - " ),\n", - " attack_types = [\n", - " AttackType.THRESHOLD_ATTACK,\n", - " AttackType.LOGISTIC_REGRESSION])\n", - "\n", - "# Plot the ROC curve of the best classifier\n", - "fig = plotting.plot_roc_curve(\n", - " attacks_result.get_result_with_max_auc().roc_curve)\n", - "\n", - "# Print a user-friendly summary of the attacks\n", - "print(attacks_result.summary(by_slices = True))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "E9zwsPGFujVq" - }, - "source": [ - "This is the end of the codelab!\n", - "Feel free to change the parameters to see how the privacy risks change.\n", - "\n", - "You can try playing with:\n", - "* the number of training epochs\n", - "* different attack_types" - ] - } - ], - "metadata": { - "colab": { - "collapsed_sections": [], - "last_runtime": { - "build_target": "//learning/deepmind/public/tools/ml_python:ml_notebook", - "kind": "private" - }, - "name": "Membership inference codelab", - "provenance": [] - }, - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.6.10" - }, - "pycharm": { - "stem_cell": { - "cell_type": "raw", - "metadata": { - "collapsed": false + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "1eiwVljWpzM7" + }, + "source": [ + "Copyright 2020 The TensorFlow Authors.\n" + ] }, - "source": [] - } - } - }, - "nbformat": 4, - "nbformat_minor": 1 + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "4rmwPgXeptiS" + }, + "outputs": [], + "source": [ + "#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n", + "# you may not use this file except in compliance with the License.\n", + "# You may obtain a copy of the License at\n", + "#\n", + "# https://www.apache.org/licenses/LICENSE-2.0\n", + "#\n", + "# Unless required by applicable law or agreed to in writing, software\n", + "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", + "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", + "# See the License for 
the specific language governing permissions and\n", + "# limitations under the License." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "YM2gRaJMqvMi" + }, + "source": [ + "# Assess privacy risks with TensorFlow Privacy Membership Inference Attacks" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "-B5ZvlSqqLaR" + }, + "source": [ + "\u003ctable class=\"tfo-notebook-buttons\" align=\"left\"\u003e\n", + " \u003ctd\u003e\n", + " \u003ca target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/privacy/blob/master/tensorflow_privacy/privacy/privacy_tests/membership_inference_attack/codelabs/codelab.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" /\u003eRun in Google Colab\u003c/a\u003e\n", + " \u003c/td\u003e\n", + " \u003ctd\u003e\n", + " \u003ca target=\"_blank\" href=\"https://github.com/tensorflow/privacy/blob/master/tensorflow_privacy/privacy/privacy_tests/membership_inference_attack/codelabs/codelab.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\" /\u003eView source on GitHub\u003c/a\u003e\n", + " \u003c/td\u003e\n", + "\u003c/table\u003e" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "9rMuytY7Nn8P" + }, + "source": [ + "##Overview\n", + "In this codelab we'll train a simple image classification model on the CIFAR10 dataset, and then use the \"membership inference attack\" against this model to assess if the attacker is able to \"guess\" whether a particular sample was present in the training set." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "FUWqArj_q8vs" + }, + "source": [ + "## Setup\n", + "First, set this notebook's runtime to use a GPU, under Runtime \u003e Change runtime type \u003e Hardware accelerator. Then, begin importing the necessary libraries." + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": { + "executionInfo": { + "elapsed": 4130, + "status": "ok", + "timestamp": 1729790860657, + "user": { + "displayName": "", + "userId": "" + }, + "user_tz": 420 + }, + "id": "Lr1pwHcbralz" + }, + "outputs": [], + "source": [ + "# @title Import statements.\n", + "from typing import Text, Tuple\n", + "import numpy as np\n", + "from scipy import special\n", + "import tensorflow as tf\n", + "import tensorflow_datasets as tfds\n", + "\n", + "# Set verbosity.\n", + "tf.compat.v1.logging.set_verbosity(tf.compat.v1.logging.ERROR)\n", + "from warnings import simplefilter\n", + "from sklearn.exceptions import ConvergenceWarning\n", + "\n", + "simplefilter(action=\"ignore\", category=ConvergenceWarning)\n", + "simplefilter(action=\"ignore\", category=FutureWarning)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "ucw81ar6ru-6" + }, + "source": [ + "### Install TensorFlow Privacy." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "cellView": "both", + "id": "zcqAmiGH90kl" + }, + "outputs": [], + "source": [ + "!pip3 install git+https://github.com/tensorflow/privacy\n", + "\n", + "from tensorflow_privacy.privacy.privacy_tests.membership_inference_attack import membership_inference_attack as mia" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "pBbcG86th_sW" + }, + "source": [ + "## Train a model" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": true, + "id": "vCyOWyyhXLib" + }, + "outputs": [], + "source": [ + "# @markdown Train a simple model on CIFAR10 with Keras.\n", + "\n", + "dataset = 'cifar10'\n", + "num_classes = 10\n", + "num_conv = 3\n", + "activation = 'relu'\n", + "lr = 0.02\n", + "momentum = 0.9\n", + "batch_size = 250\n", + "epochs = 100 # Privacy risks are especially visible with lots of epochs.\n", + "\n", + "\n", + "def small_cnn(\n", + " input_shape: Tuple[int],\n", + " num_classes: int,\n", + " num_conv: int,\n", + " activation: Text = 'relu',\n", + ") -\u003e tf.keras.models.Sequential:\n", + " \"\"\"Setup a small CNN for image classification.\n", + "\n", + " Args:\n", + " input_shape: Integer tuple for the shape of the images.\n", + " num_classes: Number of prediction classes.\n", + " num_conv: Number of convolutional layers.\n", + " activation: The activation function to use for conv and dense layers.\n", + "\n", + " Returns:\n", + " The Keras model.\n", + " \"\"\"\n", + " model = tf.keras.models.Sequential()\n", + " model.add(tf.keras.layers.Input(shape=input_shape))\n", + "\n", + " # Conv layers\n", + " for _ in range(num_conv):\n", + " model.add(tf.keras.layers.Conv2D(32, (3, 3), activation=activation))\n", + " model.add(tf.keras.layers.MaxPooling2D())\n", + "\n", + " model.add(tf.keras.layers.Flatten())\n", + " model.add(tf.keras.layers.Dense(64, activation=activation))\n", + " model.add(tf.keras.layers.Dense(num_classes))\n", + " return model\n", + "\n", + "\n", + "print('Loading the dataset.')\n", + "train_ds = tfds.as_numpy(\n", + " tfds.load(dataset, split=tfds.Split.TRAIN, batch_size=-1)\n", + ")\n", + "test_ds = tfds.as_numpy(\n", + " tfds.load(dataset, split=tfds.Split.TEST, batch_size=-1)\n", + ")\n", + "x_train = train_ds['image'].astype('float32') / 255.0\n", + "y_train_indices = train_ds['label'][:, np.newaxis]\n", + "x_test = test_ds['image'].astype('float32') / 255.0\n", + "y_test_indices = test_ds['label'][:, np.newaxis]\n", + "\n", + "# Convert class vectors to binary class matrices.\n", + "y_train = tf.keras.utils.to_categorical(y_train_indices, num_classes)\n", + "y_test = tf.keras.utils.to_categorical(y_test_indices, num_classes)\n", + "\n", + "input_shape = x_train.shape[1:]\n", + "\n", + "model = small_cnn(\n", + " input_shape, num_classes, num_conv=num_conv, activation=activation\n", + ")\n", + "\n", + "print('learning rate %f', lr)\n", + "\n", + "optimizer = tf.keras.optimizers.SGD(lr=lr, momentum=momentum)\n", + "\n", + "loss = tf.keras.losses.CategoricalCrossentropy(from_logits=True)\n", + "model.compile(loss=loss, optimizer=optimizer, metrics=['accuracy'])\n", + "model.summary()\n", + "model.fit(\n", + " x_train,\n", + " y_train,\n", + " batch_size=batch_size,\n", + " epochs=epochs,\n", + " validation_data=(x_test, y_test),\n", + " shuffle=True,\n", + ")\n", + "print('Finished training.')" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "ee-zjGGGV1DC" + }, + "source": [ + "## Calculate logits, probabilities and 
loss values for training and test sets.\n", + "\n", + "We will use these values later in the membership inference attack to separate training and test samples." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "cellView": "both", + "id": "um9r0tSiPx4u" + }, + "outputs": [], + "source": [ + "print('Predict on train...')\n", + "logits_train = model.predict(x_train, batch_size=batch_size)\n", + "print('Predict on test...')\n", + "logits_test = model.predict(x_test, batch_size=batch_size)\n", + "\n", + "print('Apply softmax to get probabilities from logits...')\n", + "prob_train = special.softmax(logits_train, axis=1)\n", + "prob_test = special.softmax(logits_test, axis=1)\n", + "\n", + "print('Compute losses...')\n", + "cce = tf.keras.backend.categorical_crossentropy\n", + "constant = tf.keras.backend.constant\n", + "\n", + "loss_train = cce(\n", + " constant(y_train), constant(prob_train), from_logits=False\n", + ").numpy()\n", + "loss_test = cce(\n", + " constant(y_test), constant(prob_test), from_logits=False\n", + ").numpy()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "QETxVOHLiHP4" + }, + "source": [ + "## Run membership inference attacks.\n", + "\n", + "We will now execute a membership inference attack against the previously trained CIFAR10 model. This will generate a number of scores, most notably, attacker advantage and AUC for the membership inference classifier.\n", + "\n", + "An AUC of close to 0.5 means that the attack wasn't able to identify training samples, which means that the model doesn't have privacy issues according to this test. Higher values, on the contrary, indicate potential privacy issues." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "B8NIwhVwQT7I" + }, + "outputs": [], + "source": [ + "from tensorflow_privacy.privacy.privacy_tests.membership_inference_attack.data_structures import AttackInputData\n", + "from tensorflow_privacy.privacy.privacy_tests.membership_inference_attack.data_structures import AttackType\n", + "from tensorflow_privacy.privacy.privacy_tests.membership_inference_attack.data_structures import SlicingSpec\n", + "import tensorflow_privacy.privacy.privacy_tests.membership_inference_attack.plotting as plotting\n", + "\n", + "labels_train = np.argmax(y_train, axis=1)\n", + "labels_test = np.argmax(y_test, axis=1)\n", + "\n", + "attack_input = AttackInputData(\n", + " logits_train=logits_train,\n", + " logits_test=logits_test,\n", + " loss_train=loss_train,\n", + " loss_test=loss_test,\n", + " labels_train=labels_train,\n", + " labels_test=labels_test,\n", + ")\n", + "\n", + "# Run several attacks for different data slices\n", + "attacks_result = mia.run_attacks(\n", + " attack_input=attack_input,\n", + " slicing_spec=SlicingSpec(\n", + " entire_dataset=True, by_class=True, by_classification_correctness=True\n", + " ),\n", + " attack_types=[AttackType.THRESHOLD_ATTACK, AttackType.LOGISTIC_REGRESSION],\n", + ")\n", + "\n", + "# Plot the ROC curve of the best classifier\n", + "fig = plotting.plot_roc_curve(\n", + " attacks_result.get_result_with_max_auc().roc_curve\n", + ")\n", + "\n", + "# Print a user-friendly summary of the attacks\n", + "print(attacks_result.summary(by_slices=True))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "E9zwsPGFujVq" + }, + "source": [ + "This is the end of the codelab!\n", + "Feel free to change the parameters to see how the privacy risks change.\n", + "\n", + "You can try playing with:\n", + "* the number of 
training epochs\n", + "* different attack_types" + ] + } + ], + "metadata": { + "colab": { + "last_runtime": { + "build_target": "//learning/grp/tools/ml_python/gpu:ml_notebook", + "kind": "private" + }, + "name": "Membership inference codelab", + "provenance": [] + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.10" + }, + "pycharm": { + "stem_cell": { + "cell_type": "raw", + "metadata": { + "collapsed": false + }, + "source": [] + } + } + }, + "nbformat": 4, + "nbformat_minor": 0 }