
LeNetHWAccelerator - Goal

Design and implement a hardware accelerator for the LeNet inference deep CNN (DNN) structure in SystemVerilog (SV), following the data-flow architecture. The implementation is parameterized so it can accommodate other DNN architectures (AlexNet, VGG, etc.) as well.

Approach

  1. Implement the LeNet training DNN using the PyTorch library and store the weight and bias matrices after 5 epochs.
  2. Implement a parameterized LeNet inference DNN from scratch in basic Python, to check that my understanding of the implementation details is correct (a minimal Python sketch of these operations is shown after this list).
  3. Verify the convolution and maxpool operations and the overall DNN flow in the above code, using multiple image and kernel (weight) matrices.
  4. Implement the LeNet DNN in SV using steps similar to (2), while following the data-flow architecture for reduced area and power consumption.
     i. Implement a parameterized convolution structure that can read an image matrix (random image) and kernel matrices (random: Laplacian, SobelX, SobelY and their variations, etc.) and generate the convolved output of those two matrices.
     ii. Verify that the SV output is able to reconstruct the convolved matrix by following the data-flow approach.
     iii. Thoroughly verify the convolved SV output for different sizes and values of the image and kernel matrices, as well as different stride values.
     iv. Follow a similar approach to implement SV code that performs the maxpool, ReLU and FC operations separately.
     v. Integrate all the different blocks (Conv, Maxpool, FC and ReLU) according to the LeNet DNN architecture.
  5. Use the actual weight (kernel) matrices generated by the trained DNN in step (1) to verify and calculate the accuracy of the SV-based LeNet DNN HW accelerator.
  6. Integrate the individual blocks (Conv, Maxpool, FC and ReLU) to build other DNN architectures and verify their accuracy.
  7. Apply other hardware-reduction techniques (modified Booth algorithm, etc.) to further optimize the hardware.
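
As a reference for steps 2–4, the sketch below shows the convolution, ReLU and maxpool operations in plain Python. It is only an illustrative sketch; the function names and the small test matrices are assumptions, not the actual repository code.

```python
def conv2d(image, kernel, stride=1, padding=0):
    """Naive single-channel 2D convolution with zero padding."""
    if padding:
        n = len(image[0]) + 2 * padding
        padded = [[0] * n for _ in range(padding)]
        for row in image:
            padded.append([0] * padding + row + [0] * padding)
        padded += [[0] * n for _ in range(padding)]
        image = padded
    h, w, k = len(image), len(image[0]), len(kernel)
    out_h = (h - k) // stride + 1
    out_w = (w - k) // stride + 1
    out = [[0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            acc = 0
            for u in range(k):
                for v in range(k):
                    acc += image[i * stride + u][j * stride + v] * kernel[u][v]
            out[i][j] = acc
    return out

def relu(mat):
    # Element-wise max(0, x)
    return [[max(0, x) for x in row] for row in mat]

def maxpool(mat, size=2, stride=2):
    # Sliding-window maximum over size x size windows
    out_h = (len(mat) - size) // stride + 1
    out_w = (len(mat[0]) - size) // stride + 1
    return [[max(mat[i * stride + u][j * stride + v]
                 for u in range(size) for v in range(size))
             for j in range(out_w)] for i in range(out_h)]

# Example: 5x5 image, 3x3 Laplacian kernel, stride 1, no padding -> 3x3 conv output
image = [[1, 2, 3, 0, 1],
         [0, 1, 2, 3, 1],
         [3, 1, 0, 2, 2],
         [2, 3, 1, 0, 1],
         [1, 0, 2, 1, 3]]
laplacian = [[0, 1, 0],
             [1, -4, 1],
             [0, 1, 0]]
print(maxpool(relu(conv2d(image, laplacian))))
```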

Status (Aug 4, 2020)

Completed up to step 4.iii.

Constraints

-> Images need to be gray-scale (i.e., only 1 colour channel).
-> Real values of the weight and bias matrices should be converted to whole numbers (0, 1, 2, etc.).
-> Allowed parameterization: size (l, b) of the image and kernel matrices, stride, and bit width of each pixel in the image and kernel matrices. As of now, padding is kept at 0.
   E.g.: parameter IMAGE_PIXEL_WIDTH = 8, parameter KERNEL_PIXEL_WIDTH = 5, parameter WC1_IMAGE_WIDTH = 30, parameter WC1_KERNEL = 3, parameter WC1_STRIDE = 3, parameter WC1_PADDING = 0
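
As a sanity check, the output size implied by the example parameters above follows the usual convolution output-size relation; the snippet below simply evaluates it (the parameter names mirror the example, they are not the actual RTL parameters):

```python
# Convolved output width = (image - kernel + 2*padding) / stride + 1
WC1_IMAGE_WIDTH = 30
WC1_KERNEL = 3
WC1_STRIDE = 3
WC1_PADDING = 0

out_width = (WC1_IMAGE_WIDTH - WC1_KERNEL + 2 * WC1_PADDING) // WC1_STRIDE + 1
print(out_width)  # 10 -> the convolved output is a 10x10 matrix
```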

Introduction

Yann LeCun, Leon Bottou, Yoshua Bengio and Patrick Haffner proposed a neural network architecture for handwritten and machine-printed character recognition in the 1990s, which they called LeNet-5.

(Figure: the LeNet-5 architecture.)

The LeNet-5 architecture consists of two sets of convolutional and average pooling layers, followed by a flattening convolutional layer, then two fully-connected layers and finally a softmax classifier.

First Layer:

The input for LeNet-5 is a 32×32 grayscale image which passes through the first convolutional layer with 6 feature maps or filters having size 5×5 and a stride of one. The image dimensions change from 32x32x1 to 28x28x6.
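The 28×28 output size follows from the same output-size relation used in the snippet above: (32 − 5 + 2·0)/1 + 1 = 28, and each of the 6 filters produces one such feature map, hence 28x28x6.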


Second Layer:

Then LeNet-5 applies an average pooling (sub-sampling) layer with a filter size of 2×2 and a stride of two. The resulting image dimensions are reduced to 14x14x6.
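With a 2×2 window and a stride of 2, each feature map shrinks from 28×28 to (28 − 2)/2 + 1 = 14, while the number of feature maps stays at 6, hence 14x14x6.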


Third Layer:

Next, there is a second convolutional layer with 16 feature maps having size 5×5 and a stride of 1. In this layer, only 10 out of 16 feature maps are connected to 6 feature maps of the previous layer as shown below.

(Figure: connectivity between the S2 feature maps and the C3 feature maps.)

The main reason is to break the symmetry in the network and keep the number of connections within reasonable bounds. That’s why the number of training parameters in this layer is 1516 instead of 2400 and, similarly, the number of connections is 151600 instead of 240000.
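One way to arrive at these numbers, following the connectivity table in the original LeNet-5 paper (6 of the C3 maps draw from 3 of the S2 maps, 6 from 4 contiguous maps, 3 from 4 non-contiguous maps, and 1 from all 6): parameters = 6·(3·5·5 + 1) + 6·(4·5·5 + 1) + 3·(4·5·5 + 1) + 1·(6·5·5 + 1) = 456 + 606 + 303 + 151 = 1516, and since every parameter is applied at each position of the 10×10 output, connections = 1516 × 10 × 10 = 151600.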


Fourth Layer:

The fourth layer (S4) is again an average pooling layer with filter size 2×2 and a stride of 2. This layer is the same as the second layer (S2) except that it has 16 feature maps, so the output will be reduced to 5x5x16.


Fifth Layer:

The fifth layer (C5) is a fully connected convolutional layer with 120 feature maps each of size 1×1. Each of the 120 units in C5 is connected to all the 400 nodes (5x5x16) in the fourth layer S4.
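Counting one bias per unit, this layer has 120 × (400 + 1) = 48120 trainable parameters.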


Sixth Layer:

The sixth layer is a fully connected layer (F6) with 84 units.
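Each of the 84 units is connected to all 120 C5 outputs, giving 84 × (120 + 1) = 10164 trainable parameters (again counting one bias per unit).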


Output Layer:

Finally, there is a fully connected softmax output layer ŷ with 10 possible values corresponding to the digits from 0 to 9.
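Softmax turns the 10 output scores z_i into probabilities, ŷ_i = exp(z_i) / Σ_j exp(z_j), and the predicted digit is the index of the largest ŷ_i.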

