Acoustic Scene Classification Using Spatial Pyramid Pooling With Convolutional Neural Networks
| dc.contributor.author | Basbug, Ahmet Melih | |
| dc.contributor.author | Sert, Mustafa | |
| dc.contributor.orcID | 0000-0002-7056-4245 | en_US |
| dc.contributor.researcherID | AAB-8673-2019 | en_US |
| dc.date.accessioned | 2020-12-28T13:28:32Z | |
| dc.date.available | 2020-12-28T13:28:32Z | |
| dc.date.issued | 2019 | |
| dc.description.abstract | Automatic understanding of audio events and acoustic scenes has been an active research topic in the signal processing and machine learning communities. Recognition of acoustic scenes in real-life scenarios is a challenging task due to the diversity of environmental sounds and uncontrolled environments. Efficient methods and feature representations are needed to cope with these challenges. In this study, we address acoustic scene classification of raw audio signals and propose a cascaded CNN architecture that uses the spatial pyramid pooling (SPP, also referred to as spatial pyramid matching) method to aggregate local features coming from the convolutional layers of the CNN. We use three well-known audio features, namely MFCC, Mel Energy, and spectrogram, to represent audio content and evaluate the effectiveness of our proposed CNN-SPP architecture on the DCASE 2018 acoustic scene performance dataset. Our results show that the proposed CNN-SPP architecture with the spectrogram feature improves the classification accuracy. | en_US |
| dc.identifier.endpage | 131 | en_US |
| dc.identifier.isbn | 978-1-5386-6783-5 | en_US |
| dc.identifier.issn | 2325-6516 | en_US |
| dc.identifier.scopus | 2-s2.0-85064133230 | en_US |
| dc.identifier.startpage | 128 | en_US |
| dc.identifier.uri | http://hdl.handle.net/11727/5285 | |
| dc.identifier.wos | 000467270600020 | en_US |
| dc.language.iso | eng | en_US |
| dc.relation.isversionof | 10.1109/ICSC.2019.00029 | en_US |
| dc.relation.journal | 2019 13TH IEEE INTERNATIONAL CONFERENCE ON SEMANTIC COMPUTING (ICSC) | en_US |
| dc.rights | info:eu-repo/semantics/closedAccess | en_US |
| dc.subject | convolutional neural network | en_US |
| dc.subject | spatial pyramid pooling | en_US |
| dc.subject | acoustic scene classification | en_US |
| dc.subject | spectrograms | en_US |
| dc.title | Acoustic Scene Classification Using Spatial Pyramid Pooling With Convolutional Neural Networks | en_US |
| dc.type | Proceedings Paper | en_US |
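
The abstract describes spatial pyramid pooling as a way to aggregate local features from the convolutional layers. The following is a minimal sketch of that general idea only, not the authors' implementation: the record does not state the pyramid levels or the pooling operator, so max pooling over 1x1, 2x2, and 4x4 grids is assumed here, and the function name `spatial_pyramid_pool` is hypothetical.

```python
import numpy as np


def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a (channels, height, width) feature map over a pyramid of grids.

    Assumption: max pooling and the default levels are illustrative choices,
    not taken from the paper. The output length is channels * sum(n * n for
    n in levels), independent of the input height and width.
    """
    channels, height, width = feature_map.shape
    pooled = []
    for n in levels:
        for i in range(n):
            # Row range of bin i: floor for the start, ceil for the end,
            # so every bin is non-empty even when height < n.
            r0 = (i * height) // n
            r1 = ((i + 1) * height + n - 1) // n
            for j in range(n):
                c0 = (j * width) // n
                c1 = ((j + 1) * width + n - 1) // n
                # One maximum per channel for this spatial bin.
                pooled.append(feature_map[:, r0:r1, c0:c1].max(axis=(1, 2)))
    return np.concatenate(pooled)


if __name__ == "__main__":
    # Two feature maps of different spatial sizes yield vectors of equal length.
    short_clip = np.random.rand(8, 13, 40)
    long_clip = np.random.rand(8, 13, 95)
    print(spatial_pyramid_pool(short_clip).shape)
    print(spatial_pyramid_pool(long_clip).shape)
```

Because the output length depends only on the channel count and the pyramid levels, a layer of this kind can feed a fixed-size classifier from time-frequency inputs of varying duration.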