Now that you have background knowledge of how Convolutional Neural Networks (CNNs) work and how to build them using Keras, it's time to practice those skills a little more independently by building a CNN (or ConvNet) on your own to solve an image recognition problem. In this lab, you'll practice building an image classifier from start to finish using a CNN.
In this lab you will:
- Load images from a hierarchical file structure using an image data generator
- Apply data augmentation to image files before training a neural network
- Build a CNN using Keras
- Visualize and evaluate the performance of CNN models
The data for this lab are pictures of cats and dogs, and our task is to correctly classify each picture as one or the other. The original dataset is from Kaggle. We have downsampled this dataset in order to reduce training time when you design and fit your model to the data. ⏰ It is anticipated that this process will take approximately one hour to run on a standard machine, although times will vary depending on your particular computer and setup. At the end of this lab, you are welcome to try training on the complete dataset and observe the impact on the model's overall accuracy.
You can find the initial downsampled dataset in a subdirectory, cats_dogs_downsampled, of this repository.
# Load the images
train_dir = 'cats_dogs_downsampled/train'
validation_dir = 'cats_dogs_downsampled/val/'
test_dir = 'cats_dogs_downsampled/test/'
# Set up datetime to track how long the run takes
import datetime
original_start = datetime.datetime.now()
start = datetime.datetime.now()
# Preprocess the images into tensors
# Rescale the data by 1./255 and use binary_crossentropy loss
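One possible way to fill in this cell is sketched below. It assumes the `tf.keras` `ImageDataGenerator` API and that each split directory contains one subfolder per class. The 150x150 target size, the batch size of 20, and the `make_generator` helper are illustrative assumptions, not values prescribed by the lab.

```python
import os
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Rescale pixel values from [0, 255] to [0, 1]
rescale_datagen = ImageDataGenerator(rescale=1./255)

def make_generator(directory, datagen, target_size=(150, 150), batch_size=20):
    """Hypothetical helper: stream labeled image batches from `directory`."""
    return datagen.flow_from_directory(
        directory,
        target_size=target_size,  # assumed image size; every image is resized to this
        batch_size=batch_size,    # assumed batch size
        class_mode='binary')      # two classes, which pairs with binary_crossentropy

# Only build the generators when the lab data is actually present
if os.path.isdir('cats_dogs_downsampled/train'):
    train_generator = make_generator('cats_dogs_downsampled/train', rescale_datagen)
    val_generator = make_generator('cats_dogs_downsampled/val/', rescale_datagen)
```

Using `class_mode='binary'` yields a single 0/1 label per image, so the network can end in one sigmoid unit.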
Now it's time to design your CNN using Keras. Remember a few things when doing this:
- You should alternate convolutional and pooling layers
- Later layers should have a larger number of parameters (for example, more filters) in order to detect more abstract patterns
- Add some final dense layers to add a classifier to the convolutional base
- Compile this model
# Design the model
# Note: You may get a comment from tf regarding your kernel. This is not a warning per se, but rather informational.
# Compile the model
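The guidelines above could be realized along the following lines. This is only a sketch: the 150x150x3 input shape, the filter counts (32, 64, 128), the dense layer width, and the `rmsprop` optimizer are all assumptions chosen for illustration, and other reasonable architectures would work as well.

```python
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(150, 150, 3)),              # assumed RGB input size
    # Alternate convolutional and pooling layers, widening as we go deeper
    layers.Conv2D(32, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    # Dense classifier on top of the convolutional base
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(1, activation='sigmoid'),          # one unit for cat vs. dog
])

# Binary classification: binary_crossentropy loss, tracking accuracy
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])
model.summary()
```

The single sigmoid output pairs with `binary_crossentropy`; a two-unit softmax with categorical labels would be an alternative design.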
Remember that training deep networks is resource intensive: depending on the size of the data, even a CNN with 3-4 successive convolutional and pooling layers is apt to take hours to train on a high-end laptop. See the code chunk below to see how long it took to run your model.
If you are concerned with runtime, you may want to set your model to run the training epochs overnight.
If you are going to run this process overnight, be sure to also script code for the following questions concerning data augmentation. Check your code twice (or more) and then set the notebook to run all, or something equivalent to have them train overnight.
# Set the model to train
# Note: You may get a comment from tf regarding your GPU or something similar.
# This is not a warning per se, but rather informational.
# ⏰ This cell may take several minutes to run
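A training call might look roughly like the sketch below. It assumes a TensorFlow 2.x version in which `fit` accepts generators directly, and that a compiled model plus training and validation generators already exist. The `train_model` helper and the default of 30 epochs are hypothetical choices for illustration.

```python
def train_model(model, train_data, val_data, epochs=30):
    """Hypothetical helper: fit `model` and return the Keras History object."""
    history = model.fit(
        train_data,                # e.g. a generator yielding (images, labels) batches
        epochs=epochs,             # assumed value; tune to your time budget
        validation_data=val_data)  # evaluated at the end of each epoch
    return history

# Example usage, assuming these objects were created earlier in the notebook:
# history = train_model(model, train_generator, val_generator)
```

The returned object's `history` attribute is a dictionary of per-epoch losses and metrics, which is what the plotting step needs.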
# Plot history
import matplotlib.pyplot as plt
%matplotlib inline
# Type code here for plot history
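One way to plot the training history is sketched below. The `plot_history` helper is hypothetical; it takes the dictionary stored in a Keras `History` object's `history` attribute and assumes the model was compiled with an accuracy metric and trained with validation data. It checks for both `'accuracy'` and the older `'acc'` key, since the name depends on the Keras version.

```python
import matplotlib.pyplot as plt

def plot_history(history_dict):
    """Hypothetical helper: plot training vs. validation accuracy and loss."""
    # Newer Keras versions log 'accuracy'; older ones log 'acc'
    acc_key = 'accuracy' if 'accuracy' in history_dict else 'acc'
    epochs = range(1, len(history_dict['loss']) + 1)

    fig, (ax_acc, ax_loss) = plt.subplots(1, 2, figsize=(12, 4))

    ax_acc.plot(epochs, history_dict[acc_key], label='Training accuracy')
    ax_acc.plot(epochs, history_dict['val_' + acc_key], label='Validation accuracy')
    ax_acc.set_xlabel('Epoch')
    ax_acc.set_title('Accuracy')
    ax_acc.legend()

    ax_loss.plot(epochs, history_dict['loss'], label='Training loss')
    ax_loss.plot(epochs, history_dict['val_loss'], label='Validation loss')
    ax_loss.set_xlabel('Epoch')
    ax_loss.set_title('Loss')
    ax_loss.legend()

    return fig

# Example usage, assuming `history` was returned by model.fit:
# plot_history(history.history)
```

A widening gap between the training and validation curves is the usual sign of overfitting, which is what data augmentation aims to reduce.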
# Check runtime
end = datetime.datetime.now()
elapsed = end - start
print('Training took a total of {}'.format(elapsed))
# Save the model for future reference
Recall that data augmentation is almost always a necessary step when using a small dataset such as the one you have been provided. If you haven't already, implement a data augmentation setup.
Warning: ⏰ This process may take a while depending on your setup. As such, make allowances for this as necessary.
# Set up datetime to track how long the run takes
start = datetime.datetime.now()
# Add data augmentation to the model setup and set the model to train;
# See the warnings above if you intend to run these blocks of code
# ⏰ These cells may take quite some time to run
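An augmentation setup could be sketched as follows, again assuming the `tf.keras` `ImageDataGenerator` API. The specific ranges (40 degrees of rotation, 20% shifts, shear, and zoom) are illustrative assumptions rather than values given by the lab, and the variable names are hypothetical.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Randomly perturb training images so the model rarely sees the exact same picture twice
train_datagen_aug = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,       # rotate by up to 40 degrees
    width_shift_range=0.2,   # shift horizontally by up to 20% of the width
    height_shift_range=0.2,  # shift vertically by up to 20% of the height
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,    # cats and dogs look plausible when mirrored
    fill_mode='nearest')     # fill pixels exposed by a transform

# Validation and test images are only rescaled, never augmented
eval_datagen = ImageDataGenerator(rescale=1./255)
```

Only the training generator is augmented, so validation and test scores keep measuring performance on unmodified images.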
# Check runtime
Save the model for future reference.
# Save the model
Now use the test set to perform a final evaluation on your model of choice.
# Perform a final evaluation using the test set
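The final evaluation might be sketched as below. The `evaluate_model` helper is hypothetical, and it assumes the model was compiled with exactly one metric (accuracy), so that `evaluate` returns a loss followed by an accuracy.

```python
def evaluate_model(model, test_data):
    """Hypothetical helper: return (loss, accuracy) on held-out data."""
    # With a single compiled metric, evaluate returns [loss, metric]
    loss, accuracy = model.evaluate(test_data, verbose=0)
    return loss, accuracy

# Example usage, assuming a rescale-only generator built from test_dir:
# test_loss, test_acc = evaluate_model(model, test_generator)
# print('Test accuracy: {:.3f}'.format(test_acc))
```

The test generator should only rescale the images, without augmentation, so the score reflects performance on unmodified pictures.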
Well done! In this lab, you practiced building your own CNN for image recognition, which drastically outperformed our previous attempts using a standard deep learning model alone. In the upcoming sections, we'll continue to investigate techniques associated with CNNs, including visualizing the representations they learn and ways to further bolster their performance when training data is limited, as it is here.