Build a Traffic Sign Recognition Project
The goals / steps of this project are the following:
- Load the dataset.
- Explore, summarize and visualize the data set
- Design, train and test a model architecture
- Use the model to make predictions on new images
- Analyze the softmax probabilities of the new images
- Summarize the results with a written report
You're reading it! And here is a link to my project code.
I used the numpy library to calculate summary statistics of the traffic signs data set (see the sketch after this list):
- The size of training set is 34799
- The size of the validation set is 4410
- The size of test set is 12630
- The shape of a traffic sign image is 32 (width) x 32 (height) x 3 (RGB color channels)
- The number of unique classes/labels in the data set is 43 (see file signnames.csv)
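As a reference, here is a minimal sketch of how these statistics can be computed with numpy. The file names and dictionary keys follow the usual layout of the pickled dataset and are assumptions; adjust them if your layout differs.

```python
import pickle
import numpy as np

# Load the pickled dataset splits (paths and keys are assumptions).
with open('train.p', 'rb') as f:
    train = pickle.load(f)
with open('valid.p', 'rb') as f:
    valid = pickle.load(f)
with open('test.p', 'rb') as f:
    test = pickle.load(f)

X_train, y_train = train['features'], train['labels']
X_valid, y_valid = valid['features'], valid['labels']
X_test, y_test = test['features'], test['labels']

print("Training set size:   ", X_train.shape[0])        # 34799
print("Validation set size: ", X_valid.shape[0])        # 4410
print("Test set size:       ", X_test.shape[0])         # 12630
print("Image shape:         ", X_train.shape[1:])       # (32, 32, 3)
print("Unique classes:      ", np.unique(y_train).size) # 43
```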
Here is an exploratory visualization of the training data set: a bar chart showing the count of each sign. I use a pandas pivot table and the matplotlib library to plot the distribution of traffic sign classes.
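A sketch of how such a chart can be produced with a pandas pivot table and matplotlib, assuming `y_train` from the loading step above:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Count the number of training images per class with a pivot table,
# then plot the class distribution as a bar chart.
df = pd.DataFrame({'label': y_train, 'count': 1})
counts = df.pivot_table(index='label', values='count', aggfunc='sum')

counts.plot(kind='bar', legend=False, figsize=(12, 4))
plt.xlabel('Traffic sign class id')
plt.ylabel('Number of training images')
plt.title('Traffic Sign distribution in the training set')
plt.show()
```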
You can also see below a sample of images from the dataset.
As a first step, I decided to convert the images to grayscale because grayscale images have only one channel per pixel, which reduces the computational load and storage cost.
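A minimal sketch of the conversion. The exact grayscale formula used is an assumption; a simple channel average is shown, and a luminosity-weighted sum (0.299 R + 0.587 G + 0.114 B) would work equally well:

```python
import numpy as np

def to_grayscale(images):
    """Convert a batch of RGB images (N, 32, 32, 3) to grayscale (N, 32, 32, 1)."""
    # Average the three color channels, keeping a singleton channel axis.
    gray = np.mean(images, axis=3, keepdims=True)
    return gray.astype(np.float32)

X_train_gray = to_grayscale(X_train)
X_valid_gray = to_grayscale(X_valid)
X_test_gray = to_grayscale(X_test)
```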
Here is a sample of traffic sign images after grayscaling.
As a last step, I normalized the image data with feature standardization, which means (independently) setting each dimension of the data to zero mean and unit variance. This leads to better training accuracy and results.
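A sketch of feature standardization, assuming the training-set statistics are reused for the validation and test sets so that all splits share the same scaling:

```python
def standardize(images, mean=None, std=None):
    """Set each pixel dimension to zero mean and unit variance."""
    if mean is None:
        # Per-pixel statistics computed over the training set.
        mean = images.mean(axis=0)
        std = images.std(axis=0) + 1e-8  # avoid division by zero
    return (images - mean) / std, mean, std

X_train_norm, mean, std = standardize(X_train_gray)
X_valid_norm, _, _ = standardize(X_valid_gray, mean, std)
X_test_norm, _, _ = standardize(X_test_gray, mean, std)
```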
Here is a sample of traffic sign images after normalization.
My final model consisted of the following layers:
Layer | Description |
---|---|
Input | 32x32x1 gray image |
Convolution 5x5 | 1x1 stride, valid padding, outputs 28x28x6 |
RELU | |
Max pooling | 2x2 stride, outputs 14x14x6 |
Convolution 5x5 | 1x1 stride, valid padding, outputs 10x10x16 |
RELU | |
Max pooling | 2x2 stride, outputs 5x5x16 |
Dropout | 0.5 keep probability |
Flatten | outputs 400 |
Fully connected | outputs 120 |
RELU | |
Fully connected | outputs 84 |
RELU | |
Fully connected | outputs 43 |
Dropout | 0.5 keep probability |
Softmax | outputs class probabilities |
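The table above translates into TensorFlow 1.x roughly as follows. This is a sketch: variable names and the weight initialization values are assumptions, and the dropout placement follows the table.

```python
import tensorflow as tf

def LeNet(x, keep_prob):
    mu, sigma = 0.0, 0.1  # initialization parameters (assumptions)

    # Convolution 5x5: 32x32x1 -> 28x28x6, ReLU, 2x2 max pool -> 14x14x6.
    w1 = tf.Variable(tf.truncated_normal([5, 5, 1, 6], mu, sigma))
    b1 = tf.Variable(tf.zeros(6))
    conv1 = tf.nn.relu(tf.nn.conv2d(x, w1, [1, 1, 1, 1], 'VALID') + b1)
    pool1 = tf.nn.max_pool(conv1, [1, 2, 2, 1], [1, 2, 2, 1], 'VALID')

    # Convolution 5x5: 14x14x6 -> 10x10x16, ReLU, 2x2 max pool -> 5x5x16.
    w2 = tf.Variable(tf.truncated_normal([5, 5, 6, 16], mu, sigma))
    b2 = tf.Variable(tf.zeros(16))
    conv2 = tf.nn.relu(tf.nn.conv2d(pool1, w2, [1, 1, 1, 1], 'VALID') + b2)
    pool2 = tf.nn.max_pool(conv2, [1, 2, 2, 1], [1, 2, 2, 1], 'VALID')
    pool2 = tf.nn.dropout(pool2, keep_prob)  # p-conv dropout

    # Flatten 5x5x16 -> 400, then fully connected 400 -> 120 -> 84 -> 43.
    flat = tf.reshape(pool2, [-1, 400])
    w3 = tf.Variable(tf.truncated_normal([400, 120], mu, sigma))
    b3 = tf.Variable(tf.zeros(120))
    fc1 = tf.nn.relu(tf.matmul(flat, w3) + b3)

    w4 = tf.Variable(tf.truncated_normal([120, 84], mu, sigma))
    b4 = tf.Variable(tf.zeros(84))
    fc2 = tf.nn.relu(tf.matmul(fc1, w4) + b4)

    w5 = tf.Variable(tf.truncated_normal([84, 43], mu, sigma))
    b5 = tf.Variable(tf.zeros(43))
    logits = tf.nn.dropout(tf.matmul(fc2, w5) + b5, keep_prob)  # p-fc dropout
    return logits
```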
To train the model, I calculate the cross entropy with tf.nn.softmax_cross_entropy_with_logits() and use tf.reduce_mean(cross_entropy) as the loss function. I then use AdamOptimizer to minimize the loss and adjust the weights of the model. Finally, I train with a batch size of 128, a learning rate of 0.001, and 100 epochs.
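A sketch of this training pipeline, reusing the LeNet function above. Placeholder names are assumptions, and a real run would also shuffle the training data each epoch:

```python
x = tf.placeholder(tf.float32, [None, 32, 32, 1])
y = tf.placeholder(tf.int32, [None])
keep_prob = tf.placeholder(tf.float32)

logits = LeNet(x, keep_prob)
one_hot_y = tf.one_hot(y, 43)
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(labels=one_hot_y,
                                                        logits=logits)
loss = tf.reduce_mean(cross_entropy)
train_op = tf.train.AdamOptimizer(learning_rate=0.001).minimize(loss)

EPOCHS, BATCH_SIZE = 100, 128
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for epoch in range(EPOCHS):
        # Shuffling of X_train_norm / y_train omitted for brevity.
        for offset in range(0, len(X_train_norm), BATCH_SIZE):
            batch_x = X_train_norm[offset:offset + BATCH_SIZE]
            batch_y = y_train[offset:offset + BATCH_SIZE]
            sess.run(train_op, feed_dict={x: batch_x, y: batch_y,
                                          keep_prob: 0.5})
```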
To improve the model's reliability, I use dropout to prevent the model from overfitting. Following the paper, I defined two levels of dropout: one (p-conv) for convolutional layers and the other (p-fc) for fully connected layers, with p-conv >= p-fc.
I tried different parameters but ultimately settled on p-conv = 0.5 and p-fc = 0.5, which achieved a test accuracy of 94.7% on normalized grayscale images.
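In code, the two levels can be kept as separate placeholders so they can be tuned independently. A minimal sketch (placeholder names are assumptions):

```python
import tensorflow as tf

# Separate keep probabilities for the two dropout levels.
p_conv = tf.placeholder(tf.float32)  # keep probability for conv layers
p_fc = tf.placeholder(tf.float32)    # keep probability for FC layers

# Inside the model, each dropout call then uses its own level, e.g.:
#   pool2  = tf.nn.dropout(pool2, p_conv)
#   logits = tf.nn.dropout(logits, p_fc)
# Training feeds   p_conv: 0.5, p_fc: 0.5
# Evaluation feeds p_conv: 1.0, p_fc: 1.0 (dropout disabled)
```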
My final model results were:
- training set accuracy of 99.9%
- validation set accuracy of 96.4%
- test set accuracy of 95.2%
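These accuracies can be computed with an evaluation helper like the following sketch, which reuses the placeholders from the training pipeline above and disables dropout (keep_prob = 1.0) at evaluation time:

```python
# Fraction of predictions that match the true labels.
correct = tf.equal(tf.argmax(logits, 1), tf.argmax(one_hot_y, 1))
accuracy_op = tf.reduce_mean(tf.cast(correct, tf.float32))

def evaluate(X_data, y_data, sess, batch_size=128):
    total = 0.0
    for offset in range(0, len(X_data), batch_size):
        batch_x = X_data[offset:offset + batch_size]
        batch_y = y_data[offset:offset + batch_size]
        acc = sess.run(accuracy_op, feed_dict={x: batch_x, y: batch_y,
                                               keep_prob: 1.0})
        total += acc * len(batch_x)
    return total / len(X_data)
```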
I use the classical LeNet-5 model structure with TensorFlow. The network consists of a convolutional layer that takes the 32x32x1 normalized gray image as input and produces a 28x28x6 output, followed by a ReLU activation layer and a max-pooling layer. The same convolution / ReLU / max-pooling pattern follows, producing a 5x5x16 output. A flatten layer then compresses the 5x5x16 volume to 400, followed by three fully connected layers that reduce 400 to 120, then 84, and finally the 43 classes.
I compared grayscale-only and normalized grayscale images, and saw that normalized grayscale images tended to outperform grayscale-only ones. I got Training Accuracy = 1.000, Validation Accuracy = 0.948, and Test Accuracy = 0.934.
That's not bad, but as the graph above shows, the accuracy curve is not smooth, which means the model was overfitting the training set and not generalizing.
So I added two levels of dropout: one (p-conv = 0.5) for convolutional layers and the other (p-fc = 0.5) for fully connected layers. I got Training Accuracy = 0.999, Validation Accuracy = 0.964, and Test Accuracy = 0.952.
Here are seven German traffic signs that I found on the web:
They represent different traffic signs among the classes the model can classify; the "unknown" image falls outside those classes, so it certainly cannot be classified correctly.
The "Priority Road" image might be difficult to classify because it is incomplete.
The model predicted 5 out of 7 signs correctly, an accuracy of 71.4% on these new images. Excluding the "unknown" sign, it predicted 5 out of 6 correctly, an accuracy of 83.3%. This is lower than the 95.2% accuracy on the test set, which is expected given the very small sample and the out-of-class image.
Here are the results of the prediction:
Image | Prediction |
---|---|
Keep Right | Keep Right |
Stop | Stop |
Yield | Yield |
Speed Limit 30 km/h | Speed Limit 30 km/h |
Priority Road | Traffic Signals |
No Entry | No Entry |
Unknown | Turn Left Ahead |
The top 5 softmax probabilities output by tf.nn.top_k are listed below:
```
TopKV2(values=array([[ 1.00000000e+00,  9.40376828e-25,  3.56742623e-27,
         1.61195630e-28,  1.68107812e-31],
       [ 1.00000000e+00,  7.55503151e-21,  1.21045371e-28,
         3.35769100e-31,  2.79243914e-32],
       [ 9.99995947e-01,  3.91783442e-06,  8.00379283e-08,
         2.16840110e-08,  9.48551904e-09],
       [ 9.67969537e-01,  1.39265228e-02,  1.02096582e-02,
         7.19060469e-03,  3.54055665e-04],
       [ 9.99946713e-01,  3.88235858e-05,  1.43164370e-05,
         1.01156807e-07,  5.58384201e-08],
       [ 2.87382632e-01,  2.29388788e-01,  1.44465432e-01,
         1.24148376e-01,  5.41160889e-02],
       [ 1.00000000e+00,  1.51023500e-16,  2.25728009e-20,
         7.47513083e-22,  1.03131863e-22]], dtype=float32),
       indices=array([[13, 12,  1, 35, 18],
       [38, 34, 40, 25, 12],
       [ 1,  2,  0,  4,  5],
       [34, 17, 22,  9, 26],
       [17, 10,  9, 34, 23],
       [26, 40, 35,  8, 12],
       [14, 13,  3, 38, 33]], dtype=int32))
```
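For reference, a minimal sketch of how this output can be produced. X_new_norm stands for the preprocessed (grayscale, standardized) web images, and the checkpoint path is an assumption:

```python
# Top-5 softmax probabilities for the new web images.
softmax = tf.nn.softmax(logits)
top_5 = tf.nn.top_k(softmax, k=5)

with tf.Session() as sess:
    saver = tf.train.Saver()
    # Restore the trained weights from the most recent checkpoint.
    saver.restore(sess, tf.train.latest_checkpoint('.'))
    print(sess.run(top_5, feed_dict={x: X_new_norm, keep_prob: 1.0}))
```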
This is the distribution histogram of softmax probabilities:
We can clearly see that the model is quite confident in its predictions, except for the "Priority Road" and "unknown" images.