Acoustic Scene Classification Using Spatial Pyramid Pooling With Convolutional Neural Networks

dc.contributor.author: Basbug, Ahmet Melih
dc.contributor.author: Sert, Mustafa
dc.contributor.orcID: 0000-0002-7056-4245 [en_US]
dc.contributor.researcherID: AAB-8673-2019 [en_US]
dc.date.accessioned: 2020-12-28T13:28:32Z
dc.date.available: 2020-12-28T13:28:32Z
dc.date.issued: 2019
dc.description.abstract: Automatic understanding of audio events and acoustic scenes has been an active research topic in the signal processing and machine learning communities. Recognition of acoustic scenes in real-life scenarios is a challenging task due to the diversity of environmental sounds and uncontrolled environments. Efficient methods and feature representations are needed to cope with these challenges. In this study, we address the acoustic scene classification of raw audio signals and propose a cascaded CNN architecture that uses the spatial pyramid pooling (SPP, also referred to as spatial pyramid matching) method to aggregate local features coming from the convolutional layers of the CNN. We use three well-known audio features, namely MFCC, Mel Energy, and spectrogram, to represent audio content and evaluate the effectiveness of our proposed CNN-SPP architecture on the DCASE 2018 acoustic scene performance dataset. Our results show that the proposed CNN-SPP architecture with the spectrogram feature improves the classification accuracy. [en_US]
dc.identifier.endpage: 131 [en_US]
dc.identifier.isbn: 978-1-5386-6783-5 [en_US]
dc.identifier.issn: 2325-6516 [en_US]
dc.identifier.scopus: 2-s2.0-85064133230 [en_US]
dc.identifier.startpage: 128 [en_US]
dc.identifier.uri: http://hdl.handle.net/11727/5285
dc.identifier.wos: 000467270600020 [en_US]
dc.language.iso: eng [en_US]
dc.relation.isversionof: 10.1109/ICSC.2019.00029 [en_US]
dc.relation.journal: 2019 13TH IEEE INTERNATIONAL CONFERENCE ON SEMANTIC COMPUTING (ICSC) [en_US]
dc.rights: info:eu-repo/semantics/closedAccess [en_US]
dc.subject: convolutional neural network [en_US]
dc.subject: spatial pyramid pooling [en_US]
dc.subject: acoustic scene classification [en_US]
dc.subject: spectrograms [en_US]
dc.title: Acoustic Scene Classification Using Spatial Pyramid Pooling With Convolutional Neural Networks [en_US]
dc.type: Proceedings Paper [en_US]
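
The abstract describes spatial pyramid pooling as a way to aggregate local features from the convolutional layers. The record gives no implementation details, so the following is only a minimal NumPy sketch of the general technique, not the authors' code. The function name `spatial_pyramid_pool`, the pyramid levels (1, 2, 4), the use of max pooling, and the feature-map shapes are all assumptions chosen for illustration. It shows the property that makes the method useful for audio: feature maps of different widths (for example, clips of different duration) are reduced to a vector of the same length.

```python
import numpy as np


def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a (channels, height, width) array over a pyramid of grids.

    For each level n, the map is split into an n x n grid of bins and the
    maximum of every channel is taken in each bin. The output length is
    channels * sum(n * n for n in levels), independent of height and width.
    """
    channels, height, width = feature_map.shape
    pooled = []
    for n in levels:
        for i in range(n):
            # Bin edges use floor for the start and ceil for the end, so
            # every bin is non-empty even when the map is smaller than n.
            r0 = (i * height) // n
            r1 = ((i + 1) * height + n - 1) // n
            for j in range(n):
                c0 = (j * width) // n
                c1 = ((j + 1) * width + n - 1) // n
                pooled.append(feature_map[:, r0:r1, c0:c1].max(axis=(1, 2)))
    return np.concatenate(pooled)


# Hypothetical convolutional outputs: 8 channels, two different widths.
rng = np.random.default_rng(0)
short_clip = rng.standard_normal((8, 13, 40))
long_clip = rng.standard_normal((8, 13, 97))

vec_short = spatial_pyramid_pool(short_clip)
vec_long = spatial_pyramid_pool(long_clip)
print(vec_short.shape, vec_long.shape)  # (168,) (168,)
```

With 8 channels and levels (1, 2, 4), each input yields 8 × (1 + 4 + 16) = 168 values, so a fully connected classifier can follow the pooling step regardless of the input size.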
