Frequency & Channel Attention for Computationally Efficient Sound Event Detection

Official implementation of

Frequency & Channel Attention for Computationally Efficient Sound Event Detection (Submitted to DCASE 2023 workshop)
by Hyeonuk Nam, Seong-Hu Kim, Deokki Min, Yong-Hwa Park

Requirements

Python 3.7.10 was used with the following libraries:

pytorch==1.8.0
pytorch-lightning==1.2.4
torchaudio==0.8.0
scipy==1.4.1
pandas==1.1.3
numpy==1.19.2
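
To confirm that an environment matches the versions above, a small check like the following may help. This is only a sketch, not part of the repository: the distribution names `torch` and `torchaudio` are assumptions about how the pytorch and audio entries are named on pip, and `check_versions` is a hypothetical helper.

```python
try:
    from importlib.metadata import version, PackageNotFoundError  # Python 3.8+
except ImportError:
    # On Python 3.7 this needs the importlib_metadata backport to be installed.
    from importlib_metadata import version, PackageNotFoundError

# Versions listed in this README; "torch" and "torchaudio" are assumed pip names.
PINNED = {
    "torch": "1.8.0",
    "pytorch-lightning": "1.2.4",
    "torchaudio": "0.8.0",
    "scipy": "1.4.1",
    "pandas": "1.1.3",
    "numpy": "1.19.2",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def check_versions(pinned, lookup=installed_version):
    """Return a list of human-readable mismatches (empty if everything matches)."""
    problems = []
    for name, wanted in pinned.items():
        found = lookup(name)
        if found is None:
            problems.append(f"{name}: not installed (expected {wanted})")
        elif found.split("+")[0] != wanted:  # ignore local build tags such as +cu111
            problems.append(f"{name}: found {found}, expected {wanted}")
    return problems


if __name__ == "__main__":
    for line in check_versions(PINNED) or ["all pinned versions match"]:
        print(line)
```

Newer library versions may also work; the check only reports differences from the versions listed here.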

Datasets

You can download the datasets by referring to the DCASE 2021 Task 4 description page or the DCASE 2021 Task 4 baseline. Then set the dataset directories in the config YAML files accordingly. You need the DESED real datasets (weak, unlabeled in-domain, validation, public eval) and the DESED synthetic datasets (train, validation).
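
Before training, it can be useful to check that every dataset directory set in the config actually exists. The sketch below is an illustration only: it takes an already-loaded config as a nested dict (for example the result of loading config.yaml with a YAML parser), and the key names in the example are hypothetical, not the ones used in this repository.

```python
import os


def find_missing_paths(config, prefix=""):
    """Return (dotted_key, path) pairs for path-like string values that do not exist."""
    missing = []
    for key, value in config.items():
        dotted = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested sections of the config.
            missing.extend(find_missing_paths(value, dotted + "."))
        elif isinstance(value, str) and ("/" in value or os.sep in value):
            # Treat any string containing a path separator as a path.
            if not os.path.exists(value):
                missing.append((dotted, value))
    return missing


if __name__ == "__main__":
    # Hypothetical keys and paths, for illustration only.
    example = {
        "dataset": {
            "weak_folder": "/data/DESED/real/weak/",
            "synth_folder": "/data/DESED/synthetic/train/",
        },
        "training": {"batch_size": 24},
    }
    for key, path in find_missing_paths(example):
        print(f"{key}: {path} does not exist")
```

Values without a path separator (numbers, model names) are skipped, so the check does not depend on knowing the exact config layout.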

Training

You can train a model and save it in the exps folder by running:

python main.py

The default model in config.yaml is SE+tfwSE.

Test with saved models

You can test saved models by running:

python main.py -s saved_models/SE+tfwSE/best

This example tests the best saved SE+tfwSE model.
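
If you want to launch training or testing from another Python script, the two commands above can be wrapped as in this sketch. `build_command` and `run` are hypothetical helpers, not part of the repository; the only option used is the `-s` flag shown above, and the script is assumed to be run from the repository root.

```python
import subprocess
import sys


def build_command(saved_model=None, script="main.py"):
    """Build the argument list: training when saved_model is None, testing otherwise."""
    cmd = [sys.executable, script]
    if saved_model is not None:
        cmd += ["-s", saved_model]
    return cmd


def run(saved_model=None):
    # check=True raises CalledProcessError if main.py exits with a non-zero status.
    return subprocess.run(build_command(saved_model), check=True)


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch has no side effects.
    print(" ".join(build_command()))
    print(" ".join(build_command("saved_models/SE+tfwSE/best")))
```

Calling `run()` starts training, and `run("saved_models/SE+tfwSE/best")` tests the saved model from the example above.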

Reference

Citation & Contact

If this repository helped your work, please cite the papers below. The third paper describes FilterAugment, the data augmentation method applied in this work.

@article{nam2023frequency,
      title={Frequency & Channel Attention for Computationally Efficient Sound Event Detection}, 
      author={Hyeonuk Nam and Seong-Hu Kim and Deokki Min and Yong-Hwa Park},
      journal={arXiv preprint arXiv:2306.11277},
      year={2023},
}

@inproceedings{nam22_interspeech,
      author={Hyeonuk Nam and Seong-Hu Kim and Byeong-Yun Ko and Yong-Hwa Park},
      title={{Frequency Dynamic Convolution: Frequency-Adaptive Pattern Recognition for Sound Event Detection}},
      year=2022,
      booktitle={Proc. Interspeech 2022},
      pages={2763--2767},
      doi={10.21437/Interspeech.2022-10127}
}

@INPROCEEDINGS{nam2021filteraugment,
    author={Nam, Hyeonuk and Kim, Seong-Hu and Park, Yong-Hwa},
    booktitle={ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)}, 
    title={Filteraugment: An Acoustic Environmental Data Augmentation Method}, 
    year={2022},
    pages={4308--4312},
    doi={10.1109/ICASSP43922.2022.9747680}
}

Please contact Hyeonuk Nam at [email protected] for any query.
