Multimodal Vehicle Type Classification Using Convolutional Neural Network and Statistical Representations of MFCC

Selbes, Berkay; Sert, Mustafa

dc.contributor.author	Selbes, Berkay
dc.contributor.author	Sert, Mustafa
dc.contributor.orcID	0000-0002-7056-4245	en_US
dc.contributor.researcherID	AAB-8673-2019	en_US
dc.date.accessioned	2023-05-30T06:57:36Z
dc.date.available	2023-05-30T06:57:36Z
dc.date.issued	2017
dc.description.abstract	Recognition of vehicle types in real-life traffic scenarios is a challenging task due to the diversity of vehicles and uncontrolled environments. Efficient methods and feature representations are needed to cope with these challenges. In this paper, we address the vehicle type classification problem in real-life traffic scenarios and propose a multimodal method that uses efficient representations of audio-visual modalities in the fusion context. We first separate the audio and visual modalities of the video data by extracting the keyframes and the corresponding audio fragments. Then we extract deep convolutional neural network (CNN) features from the visual modality and Mel Frequency Cepstral Coefficient (MFCC) features from the audio modality. The Principal Component Analysis (PCA) algorithm is applied to the visual features, and various types of statistical representations of the MFCC feature vectors are calculated to select representative features. These representations are then fused to form a robust multimodal feature. Finally, we train Support Vector Machine (SVM) classifiers on the resulting multimodal features for the final classification of vehicle types. We evaluate the effectiveness of our proposed method on the TRECVID 2012 SIN video performance dataset for both single- and multi-modal cases. Our results show that fusing the proposed MFCC representations with the GoogLeNet CNN features improves the classification accuracy.	en_US
dc.identifier.isbn	978-1-5386-2939-0	en_US
dc.identifier.scopus	2-s2.0-85039898870	en_US
dc.identifier.uri	http://hdl.handle.net/11727/9249
dc.identifier.wos	000426203700056	en_US
dc.language.iso	eng	en_US
dc.relation.journal	14th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)	en_US
dc.rights	info:eu-repo/semantics/closedAccess	en_US
dc.title	Multimodal Vehicle Type Classification Using Convolutional Neural Network and Statistical Representations of MFCC	en_US
dc.type	Conference Object	en_US
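
The audio side of the pipeline described in the abstract, together with the fusion step, can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not say which statistics are computed over the MFCC vectors or how the modalities are fused, so the choice of mean, standard deviation, minimum and maximum, the use of plain concatenation, and the names mfcc_statistics and fuse are all assumptions. CNN feature extraction, PCA, and SVM training are omitted; the visual vector is a made-up stand-in for PCA-reduced CNN features.

```python
from statistics import fmean, pstdev
from typing import List, Sequence


def mfcc_statistics(frames: Sequence[Sequence[float]]) -> List[float]:
    """Pool a (num_frames x num_coeffs) MFCC matrix into one fixed-length vector.

    Assumption: the statistics used here (mean, population standard
    deviation, min, max per coefficient) are illustrative; the record
    does not list the ones used in the paper.
    """
    if not frames:
        raise ValueError("need at least one MFCC frame")
    columns = list(zip(*frames))  # one tuple of values per MFCC coefficient
    pooled: List[float] = []
    for stat in (fmean, pstdev, min, max):
        pooled.extend(float(stat(col)) for col in columns)
    return pooled


def fuse(visual: Sequence[float], audio: Sequence[float]) -> List[float]:
    """Fuse two modality vectors by concatenation (assumed fusion scheme)."""
    return [float(v) for v in visual] + [float(a) for a in audio]


# Toy data: two audio frames with two MFCC coefficients each, and a
# hypothetical 3-dimensional visual vector standing in for reduced CNN features.
toy_mfcc = [[1.0, 4.0], [3.0, 8.0]]
toy_visual = [0.5, -0.5, 0.25]

audio_vec = mfcc_statistics(toy_mfcc)
fused = fuse(toy_visual, audio_vec)
print(len(audio_vec), len(fused))
```

Pooling over frames gives every audio fragment a vector of the same length regardless of its duration, which is what allows it to be combined with a fixed-length visual vector and passed to a classifier such as an SVM.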

Collections

Mühendislik Fakültesi / Faculty of Engineering