Browsing by Author "Tombaloglu, Burak"

Development of a MFCC-SVM Based Turkish Speech Recognition System
(2016) Tombaloglu, Burak; Erdem, Hamit
In this study, an SVM-MFCC based Turkish speech recognition system is developed. In this structure, Mel Frequency Cepstral Coefficients (MFCC) are used for feature extraction and Support Vector Machines (SVM) are used to classify the phonemes. Three more phoneme recognition methods are applied to the same dataset and their performance is compared. The applied methods are combinations of Linear Prediction Cepstral Coefficients (LPCC), a commonly used feature extraction method, and the Hidden Markov Model (HMM), a well-known classification method. The applied feature extraction and classification methods have been selected because of the phoneme-based nature of the Turkish language.
A SVM Based Speech to Text Converter for Turkish Language
(2017) Tombaloglu, Burak; Erdem, Hamit
In this study, a Support Vector Machine (SVM) based Turkish speech-to-text converter system has been developed. In the recognition system, Mel Frequency Cepstral Coefficients (MFCC) have been applied to extract features of Turkish speech, and an SVM based classifier has been used to classify the phonemes. The morphological structure of Turkish, a language based on phonemes, has been taken into consideration in the developed person-dependent voice recognition system. Unlike the multiclass classifiers used in SVM-MFCC based voice recognition systems, a new SVM classifier system has been developed that uses fewer classes per layer and a larger number of multiclass layers. A new Text Comparison Algorithm is proposed, which also uses the phoneme sequence when measuring word similarity. With these enhancements, voice recognition and word recognition performance both improve as the training period becomes longer. The performance of the proposed structure is compared with similar systems.
Turkish Speech Recognition Techniques and Applications of Recurrent Units (LSTM and GRU)
(2021) Tombaloglu, Burak; Erdem, Hamit
A typical solution to Automatic Speech Recognition (ASR) problems consists of feature extraction, feature classification, acoustic modeling and language modeling steps. In the classification and modeling steps, Deep Learning methods have become popular and give more successful recognition results than conventional methods. In this study, an application for solving the ASR problem in the Turkish language has been developed. The data sets and studies related to the Turkish ASR problem are examined. Language models for the ASR problems of agglutinative languages such as Turkish, Finnish and Hungarian are examined. A subword based model is chosen in order to avoid a large vocabulary without decreasing recognition performance. The recognition performance is increased by the Deep Learning methods Long Short Term Memory (LSTM) neural networks and Gated Recurrent Units (GRU) in the classification and acoustic modeling steps. The recognition performances of systems including LSTM and GRU are compared with previous studies using traditional methods and Deep Neural Networks. When the results are evaluated, it is seen that LSTM and GRU based speech recognizers perform better than recognizers using previous methods. Final Word Error Rate (WER) values of 10.65% and 11.25% were obtained for LSTM and GRU, respectively. GRU based systems have similar performance when compared to LSTM based systems. However, it has been observed that their training periods are shorter. Computation times are 73.518 and 61.020 seconds for LSTM and GRU, respectively. The study gives detailed information about the applicability of the latest methods to Turkish ASR research and applications.
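The Word Error Rate reported above is conventionally defined as the word-level edit distance (substitutions, deletions and insertions) between a reference transcript and the recognizer output, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper: tokenization by whitespace is an assumption,
    not something specified in the abstract.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against a four-word reference.
    print(word_error_rate("a b c d", "a x c"))
```

Under this definition, a WER of 10.65% corresponds to roughly one word error for every nine to ten reference words.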