Multimodal Vehicle Type Classification Using Convolutional Neural Network and Statistical Representations of MFCC

dc.contributor.author: Selbes, Berkay
dc.contributor.author: Sert, Mustafa
dc.contributor.orcID: 0000-0002-7056-4245 [en_US]
dc.contributor.researcherID: AAB-8673-2019 [en_US]
dc.date.accessioned: 2023-05-30T06:57:36Z
dc.date.available: 2023-05-30T06:57:36Z
dc.date.issued: 2017
dc.description.abstract: Recognition of vehicle types in real-life traffic scenarios is a challenging task due to the diversity of vehicles and uncontrolled environments. Efficient methods and feature representations are needed to cope with these challenges. In this paper, we address the vehicle type classification problem in real-life traffic scenarios and propose a multimodal method that uses efficient representations of audio-visual modalities in the fusion context. We first separate the audio and visual modalities of the video data by extracting the keyframes and the corresponding audio fragments. We then extract deep convolutional neural network (CNN) features from the visual modality and Mel Frequency Cepstral Coefficient (MFCC) features from the audio modality. The Principal Component Analysis (PCA) algorithm is applied to the visual features, and various types of statistical representations of the MFCC feature vectors are calculated to select representative features. These representations are then fused to form a robust multimodal feature. Finally, we train Support Vector Machine (SVM) classifiers on the obtained multimodal features for the final classification of vehicle types. We evaluate the effectiveness of our proposed method on the TRECVID 2012 SIN video performance dataset for both single- and multi-modal cases. Our results show that fusing the proposed MFCC representations with the GoogLeNet CNN features improves the classification accuracy. [en_US]
dc.identifier.isbn: 978-1-5386-2939-0 [en_US]
dc.identifier.scopus: 2-s2.0-85039898870 [en_US]
dc.identifier.uri: http://hdl.handle.net/11727/9249
dc.identifier.wos: 000426203700056 [en_US]
dc.language.iso: eng [en_US]
dc.relation.journal: 14th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS) [en_US]
dc.rights: info:eu-repo/semantics/closedAccess [en_US]
dc.title: Multimodal Vehicle Type Classification Using Convolutional Neural Network and Statistical Representations of MFCC [en_US]
dc.type: Conference Object [en_US]
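
The feature pipeline described in the abstract (PCA on CNN features, statistical pooling of MFCC frames, then fusion) might be sketched as follows. This is a minimal illustration, not the authors' implementation: the choice of statistics (mean, standard deviation, minimum, maximum), the fusion by simple concatenation, the feature dimensions, and all function names are assumptions, and random arrays stand in for real CNN and MFCC features.

```python
import numpy as np


def mfcc_statistics(mfcc_frames):
    """Pool a (n_frames, n_coeffs) MFCC matrix into one fixed-length vector.

    Assumption: mean, std, min and max per coefficient; the record only
    says "various types of statistical representations".
    """
    frames = np.asarray(mfcc_frames, dtype=float)
    return np.concatenate([
        frames.mean(axis=0),
        frames.std(axis=0),
        frames.min(axis=0),
        frames.max(axis=0),
    ])


def pca_fit(features, n_components):
    """Return (mean, components) from an SVD of the centered feature matrix."""
    X = np.asarray(features, dtype=float)
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def pca_transform(features, mean, components):
    """Project features onto the retained principal components."""
    return (np.asarray(features, dtype=float) - mean) @ components.T


def fuse(visual, audio):
    """Assumed early fusion: concatenate per-clip visual and audio vectors."""
    return np.hstack([visual, audio])


# Synthetic stand-ins (hypothetical sizes): 20 clips, 1024-d CNN descriptors,
# and a variable number of 13-coefficient MFCC frames per clip.
rng = np.random.default_rng(0)
n_clips = 20
cnn_features = rng.normal(size=(n_clips, 1024))
mfcc_clips = [rng.normal(size=(rng.integers(50, 100), 13)) for _ in range(n_clips)]

mean, components = pca_fit(cnn_features, n_components=8)
visual = pca_transform(cnn_features, mean, components)
audio = np.vstack([mfcc_statistics(clip) for clip in mfcc_clips])
multimodal = fuse(visual, audio)
print(multimodal.shape)
```

Each row of `multimodal` is one clip's fused feature vector. Following the abstract, an SVM classifier would then be trained on these rows together with the vehicle type labels; that step is left out here to keep the sketch dependency-light.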
