This project focuses on classifying various audio recordings using CNNs

soham30rane/Deep-Multiclass-Audio-Classification

Deep Multiclass Audio Classification

Project structure

├── Coursera/
│   ├── soham/
│   │   ├── Coursera Assignments/
│   │   └── Coursera Notes/
│   └── Aanchal/
│       ├── Course1/
│       ├── Course2/
│       └── Course4/
├── EDA/
│   ├── esc-50-explore.ipynb
│   └── esc-preprocess-and-eda.ipynb
├── UI/
│   ├── test/
│   ├── audio_ui.py
│   ├── audio_ui2.py
│   ├── labels.py
│   ├── model.py
│   ├── yamnet.onnx
│   └── yamnet_inference.py
├── mini-projects/
│   ├── Aanchal/
│   │   ├── Audio Classification UrbanSound8k.ipynb
│   │   ├── NN_from_scratch.ipynb
│   │   └── Transfer learning with ResNet-50 cifar10.ipynb
│   └── Soham/
│       ├── Audio Classification UrbanSound8k/
│       ├── Neural-Network-from-scratch/
│       └── Transfer-learning-cifar10/
├── resnets_and_efficientnets/
│   ├── esc-dataset.ipynb
│   ├── esc-model1_2024-08-20_18-11-09.pth
│   ├── esc-transfer-learn.ipynb
│   ├── esc-transfer-learning2.ipynb
│   └── esc-utils.ipynb
├── yamnet/
│   ├── esc-dataset.ipynb
│   ├── esc-dataset2.xpynb
│   ├── esc-model1_20/
│   ├── esc-utils.ipynb
│   ├── esc-utils3.xpynb
│   ├── esc-yamnet.ipynb
│   ├── escyamnetdataset.xpynb
│   ├── getyamnet.xpynb
│   ├── yamnet-load.xpynb
│   └── yamnet.ipynb
├── LICENSE
└── README.md

Introduction

This project develops an audio classifier that takes a user-provided audio file and predicts the class the audio belongs to.

Description

The goal is an audio classification system that can sort diverse audio inputs, including speech, music, and environmental sounds, into their classes.
We used two approaches for this project:

  • Convolutional Neural Networks (CNNs)
  • Transfer learning (YAMNet, ResNet50, EfficientNet)
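
Whichever approach produces the model, the final step of multiclass classification is the same: turn one raw score per class into probabilities and report the most likely classes. The snippet below is a minimal, dependency-free sketch of that step, assuming the model emits one logit per class. The labels, the scores, and the helper names `softmax` and `top_k` are illustrative assumptions, not code taken from this repository.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, labels, k=3):
    """Return the k most likely (label, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical labels and scores, for illustration only.
labels = ["dog", "rain", "siren", "clock_tick"]
logits = [2.1, 0.3, -1.0, 0.8]
predictions = top_k(logits, labels, k=2)

for label, prob in predictions:
    print(f"{label}: {prob:.2f}")
```

Reporting the top few classes rather than a single label can be useful for audio, where one clip may plausibly match several similar categories.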

Tech Stack

Contributors

Future Prospects

  • Hate Speech Detection in Low-Resource Languages
  • Audio-Based Security Systems
  • Environmental Monitoring

Resources

Audio processing by Valerio Velardo

Coursera course on Deep Learning by Andrew Ng and Younes Bensouda Mourri

PyTorch playlist by Patrick Loeber

Datasets used:

  1. ESC-50 dataset
  2. CIFAR-10 dataset
  3. UrbanSound8K
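
ESC-50 ships with five predefined folds, and each clip's file name encodes its fold and class as `{fold}-{clip_id}-{take}-{target}.wav`. The sketch below shows one way to parse those names and hold out a fold for validation. The helper names `parse_esc50_name` and `split_by_fold` and the example file list are assumptions made for illustration; the notebooks in this repository may load the data differently.

```python
from pathlib import Path
from typing import NamedTuple


class EscClip(NamedTuple):
    fold: int     # cross-validation fold, 1 to 5
    clip_id: str  # ID of the source recording
    take: str     # letter distinguishing fragments of the same recording
    target: int   # class index, 0 to 49


def parse_esc50_name(filename):
    """Split an ESC-50 file name into its four dash-separated fields."""
    fold, clip_id, take, target = Path(filename).stem.split("-")
    return EscClip(int(fold), clip_id, take, int(target))


def split_by_fold(filenames, val_fold):
    """Hold out one fold for validation and train on the rest."""
    train, val = [], []
    for name in filenames:
        bucket = val if parse_esc50_name(name).fold == val_fold else train
        bucket.append(name)
    return train, val


# Example names that follow the ESC-50 convention (illustrative list).
files = ["1-100032-A-0.wav", "2-100648-A-43.wav", "5-9032-A-0.wav"]
train_files, val_files = split_by_fold(files, val_fold=1)
```

Splitting on the predefined folds, rather than shuffling clips at random, keeps fragments of the same source recording from appearing in both the training and validation sets.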

Acknowledgement

Special thanks to COC VJTI for ProjectX 2024

Special thanks to our mentors, Kshitij Shah and Param Thakkar, who guided us throughout our project journey.
