Mühendislik Fakültesi / Faculty of Engineering

Permanent URI for this collection: https://hdl.handle.net/11727/1401

Lip Reading Using Various Deep Learning Models with Visual Turkish Data
(GAZI UNIVERSITY JOURNAL OF SCIENCE, 2024-09-23) Tumer Sivri, Talya; Berkol, Ali; Erdem, Hamit
In Human-Computer Interaction, lip reading is essential and remains an open research problem. In recent decades, there have been many studies on Automatic Lip-Reading (ALR) in different languages, which matters for the societies in which such applications are developed. As in other machine learning and artificial intelligence applications, Deep Learning (DL) based classification algorithms have been applied to improve ALR performance. Few ALR studies have addressed the Turkish language. In this study, we took a multifaceted approach to the challenges of Turkish lip reading research. First, we created an original dataset for this investigation. To improve data quality and diversity, we applied three image data augmentation techniques: sigmoidal transform, horizontal flip, and inverse transform. These methods introduced a wide range of variations and thereby increased the dataset's utility. On this augmented dataset, we applied state-of-the-art DL models: Convolutional Neural Networks (CNN), which extract intricate visual features; Long Short-Term Memory (LSTM), which captures sequential dependencies; and Bidirectional Gated Recurrent Unit (BGRU), which handles complex temporal data effectively. These models were selected to exploit the potential of the visual Turkish lip reading dataset.
The dataset was gathered primarily to add to the existing body of Turkish language datasets, enriching Turkish language research and serving as a benchmark reference. The applied methods were compared in terms of precision, recall, and F1 metrics. According to the experimental results, the BGRU and LSTM models gave the same results up to the fifth decimal place, and BGRU had the shortest training time.
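The three augmentation techniques named in the abstract above can be sketched as follows. This is a minimal illustration on a grayscale frame stored as nested lists, not the authors' implementation; the function names, the `gain` and `cutoff` parameters, and the sample frame are assumptions made for the example.

```python
import math

def horizontal_flip(img):
    """Mirror each row left to right."""
    return [row[::-1] for row in img]

def inverse_transform(img, max_val=255):
    """Invert intensities (photographic negative)."""
    return [[max_val - p for p in row] for row in img]

def sigmoid_transform(img, gain=10.0, cutoff=0.5, max_val=255):
    """Sigmoid contrast adjustment; gain and cutoff are assumed values."""
    out = []
    for row in img:
        new_row = []
        for p in row:
            x = p / max_val  # normalize to [0, 1]
            y = 1.0 / (1.0 + math.exp(gain * (cutoff - x)))
            new_row.append(round(y * max_val))
        out.append(new_row)
    return out

# Hypothetical 2x3 grayscale frame standing in for a cropped lip image.
frame = [[0, 64, 128], [192, 255, 32]]
augmented = [
    horizontal_flip(frame),
    inverse_transform(frame),
    sigmoid_transform(frame),
]
```

Each original frame yields three additional variants here, so the augmented set is four times the size of the original under these assumptions.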
A Systematic Review of Transfer Learning-Based Approaches for Diabetic Retinopathy Detection
(2023) Oltu, Burcu; Karaca, Busra Kubra; Erdem, Hamit; Ozgur, Atilla; 0000-0002-9237-8347; 0000-0003-1704-1581; AAD-6546-2019
Diabetic retinopathy (DR), a diabetes complication that can cause severe vision loss and blindness, has become an alarming issue worldwide. Early and accurate detection of DR is necessary to prevent its progression and reduce the risk of blindness. Recently, many approaches for DR detection have been proposed in the literature. Among them, deep neural networks (DNNs), especially Convolutional Neural Network (CNN) models, have become the most frequently proposed approach. However, designing and training new CNN architectures from scratch is a troublesome and labor-intensive task, particularly for medical images. Moreover, it requires training a tremendous number of parameters. Therefore, transfer learning approaches based on pre-trained models have become more prevalent in the last few years. Accordingly, this study reviews 43 publications, published between 2016 and 2021, that use DNN and transfer learning approaches for DR detection. The reviewed papers are summarized in 4 figures and 10 tables that present detailed information about 29 pre-trained CNN models, 13 DR data sets, and standard performance metrics.
Time Harmonic Analysis in Electric Power Systems
(2015) Germec, Kadir Egemen; Erdem, Hamit
In this study, a multifunctional system structure has been developed for time-varying signals in electric power systems. It performs fundamental frequency detection and estimates the phase angle and amplitude of harmonic and interharmonic components. Owing to its simple and open structure, which allows interventions that improve performance, the system provides the harmonic component values as well as information about the intervals at which, and the extent to which, these components are effective. The results of experimental studies performed in the MATLAB simulation environment show that this system is convenient and effective for the harmonic analysis of current and voltage waveforms. In addition, the individual effects of these time-variant harmonic and interharmonic components can be detected instantly in 3D time-harmonic space.
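The abstract above does not state which estimator the system uses. As a generic illustration of amplitude and phase estimation for one harmonic, the sketch below evaluates a single DFT bin at a known frequency. It assumes the analysis window holds an integer number of cycles of every component; the function name, sampling rate, and test signal are hypothetical.

```python
import cmath
import math

def harmonic_component(samples, fs, freq):
    """Estimate (amplitude, phase) of the cosine component at `freq` Hz.

    Uses a single-bin DFT; exact only when the window contains an
    integer number of cycles of each component (assumed here).
    """
    n = len(samples)
    acc = sum(
        x * cmath.exp(-2j * math.pi * freq * k / fs)
        for k, x in enumerate(samples)
    )
    coeff = 2 * acc / n  # complex amplitude A * exp(j * phase)
    return abs(coeff), cmath.phase(coeff)

# Hypothetical test signal: 50 Hz fundamental plus a 3rd harmonic.
fs = 1000.0   # sampling rate in Hz (assumed)
f0 = 50.0     # fundamental frequency in Hz (assumed)
n = 200       # 0.2 s window = 10 cycles of the fundamental
signal = [
    10.0 * math.cos(2 * math.pi * f0 * k / fs + 0.3)
    + 2.0 * math.cos(2 * math.pi * 3 * f0 * k / fs)
    for k in range(n)
]
amp1, phase1 = harmonic_component(signal, fs, f0)
amp3, _ = harmonic_component(signal, fs, 3 * f0)
```

Repeating the estimate over a sliding window would give amplitude as a function of both time and harmonic order, which is one way to picture a time-harmonic representation.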
Journal Finder for TRDIZIN: Baseline Study
(2021) Demirkan, Mert; Ozgur, Atilla; Erdem, Hamit; https://orcid.org/0000-0002-1396-2060; https://orcid.org/0000-0002-9237-8347; https://orcid.org/0000-0003-1704-1581; AAD-6546-2019
One of the main steps in publishing a paper is finding a suitable journal for the researchers' work. In recent years, the number of published scientific papers has increased. This has led major academic publishers to introduce journal recommender systems; without one, finding a journal is a very time-consuming task. This study reviews similar studies in the literature and presents the first version of a journal recommender system for the TRDIZIN index, which contains a growing number of articles. A dataset was created by collecting the titles, keywords, and abstracts of papers from the DergiPark web page. Using this dataset, a target journal from TRDIZIN is suggested according to the title, abstract, and keywords of a given article. This first version of the journal recommender system uses cosine similarity. The results of the proposed algorithm are evaluated using the accuracy of the nearest 5 and 10 journals as performance criteria.
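A cosine-similarity recommender of the kind described above can be sketched in a few lines. This is a minimal illustration using raw word counts, not the authors' system; the journal names and texts are hypothetical, and the abstract does not say which text representation was used.

```python
import math
from collections import Counter

def vectorize(text):
    """Bag-of-words term counts (an assumed representation)."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def recommend(query, journals, k=5):
    """Return the names of the k journals most similar to the query."""
    q = vectorize(query)
    scored = [
        (cosine_similarity(q, vectorize(text)), name)
        for name, text in journals.items()
    ]
    scored.sort(key=lambda s: s[0], reverse=True)
    return [name for _, name in scored[:k]]

# Hypothetical journal profiles built from titles, keywords, abstracts.
journals = {
    "Journal A (hypothetical)": "power systems harmonic analysis converters control",
    "Journal B (hypothetical)": "speech recognition deep learning turkish language",
    "Journal C (hypothetical)": "gait analysis wearable sensors parkinson disease",
}
top = recommend("deep learning for turkish speech recognition", journals, k=2)
```

Under the evaluation described in the abstract, a recommendation would count as correct when the article's actual journal appears among the nearest 5 (or 10) returned names.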
Parkinson's Disease Monitoring from Gait Analysis via Foot-Worn Sensors
(2018) Asuroglu, Tunc; Acici, Koray; Erdas, Cagatay Berke; Toprak, Munire Kilinc; Erdem, Hamit; Ogul, Hasan; https://orcid.org/0000-0002-3821-6419; https://orcid.org/0000-0001-7979-0276; AAC-7834-2020; HDM-9910-2022; AAJ-8674-2021
Background: In Parkinson's disease (PD), neuronal loss in the substantia nigra, culminating in dopaminergic denervation of the striatum, is followed by a loss of precision, automatism, and agility in movement. Hence, the hallmark sign of PD is a change in the motor performance of affected individuals. As PD is a neurodegenerative disease, progressive loss of mobility is an inevitable consequence. Indeed, the major cause of morbidity and mortality among patients with PD is the motor changes that restrict their functional independence. Therefore, monitoring the manifestations of the disease is crucial for detecting any worsening of symptoms in a timely manner, in order to maintain and improve the quality of life of these patients. Aim: Changes in the motion of patients with PD can be captured with wearable sensors attached to the subjects' limbs. Analysing the recorded data for signal variation then makes it possible to build an individualized profile of the disease. Advancing such tools would improve understanding of disease evolution in the long term and simplify the detection of abrupt changes in gait on a daily basis in the short term. In both cases, recognizing such events would improve clinical decision making by supplying reliable data. To this end, we offer a computational solution for effective monitoring of PD patients from gait analysis via multiple foot-worn sensors. Methods: We introduce a supervised model that is fed by ground reaction force (GRF) signals acquired from these gait sensors. We offer a hybrid model, called Locally Weighted Random Forest (LWRF), for regression analysis over the numerical features extracted from the input signals to predict the severity of PD symptoms in terms of the Unified Parkinson's Disease Rating Scale (UPDRS) and the Hoehn and Yahr (H&Y) scale. Sixteen time-domain features and seven frequency-domain features were extracted from the GRF signals and used.
Results and conclusion: An experimental analysis conducted on real data acquired from PD patients and healthy controls has shown that the predictions are highly correlated with the clinical annotations. The proposed approach for severity detection achieves the best correlation coefficient (CC), mean absolute error (MAE), and root mean squared error (RMSE) values, with 0.895, 4.462, and 7.382 respectively, in terms of UPDRS. The regression results for the H&Y scale show that the proposed model outperforms other models, with CC, MAE, and RMSE values of 0.960, 0.168, and 0.306 respectively. In the classification setup, the proposed approach achieves higher accuracy than other studies, with accuracy and specificity of 99.0% and 99.5% respectively. The main novelty of this approach is that an exact value of the symptom level can be inferred, rather than a categorical result that defines the severity of motor disorders. © 2018 Nalecz Institute of Biocybernetics and Biomedical Engineering of the Polish Academy of Sciences. Published by Elsevier B.V. All rights reserved.
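The three regression metrics reported above (CC, MAE, RMSE) have standard definitions, sketched below in plain Python. The function names and the sample scores are made up for illustration and are not data from the study; CC is taken here to be the Pearson correlation coefficient, which is an assumption.

```python
import math

def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def root_mean_squared_error(y_true, y_pred):
    """Square root of the average squared difference."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)

def correlation_coefficient(y_true, y_pred):
    """Pearson correlation between targets and predictions."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    if var_t == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_t * var_p)

# Made-up clinical scores and model predictions, for illustration only.
clinical = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]
mae = mean_absolute_error(clinical, predicted)
rmse = root_mean_squared_error(clinical, predicted)
cc = correlation_coefficient(clinical, predicted)
```

A higher CC and lower MAE and RMSE indicate predictions that track the clinical annotations more closely.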
A Random Forest Method to Detect Parkinson's Disease via Gait Analysis
(2017) Acici, Koray; Erdas, Cagatay Berke; Asuroglu, Tunc; Toprak, Munire Kilinc; Erdem, Hamit; Ogul, Hasan; 0000-0001-7979-0276; 0000-0003-4153-0764; 0000-0002-3821-6419; 0000-0003-3467-9923; AAJ-8674-2021; AAC-7834-2020; ITV-2441-2023; HDM-9910-2022
Remote care and telemonitoring have become essential components of current geriatric medicine. Intelligent use of wireless sensors is a major issue in the computational studies that aim to realize these concepts in practice. While there has been growing interest in recognizing the daily activities of patients through wearable sensors, efforts to use the streaming data from these sensors in clinical practice are limited. Here, we present a practical application of clinical data mining from wearable sensors, with the particular objective of diagnosing Parkinson's Disease from gait analysis through a set of ground reaction force (GRF) sensors worn under the feet. We introduce a supervised learning method based on Random Forests that analyzes the multi-sensor data to classify the person wearing these sensors. We propose extracting a set of time-domain and frequency-domain features that are effective in distinguishing healthy and diseased people from their gait signals. The experimental results on a benchmark dataset show that the proposed method can significantly outperform the previous methods reported in the literature.
Tuning of Output Scaling Factor in PI-Like Fuzzy Controllers for Power Converters Using PSO
(2016) Erdem, Hamit; Altinoz, Okkes Tolga
Proportional-Integral (PI) like Fuzzy Logic Controllers (FLC) have been widely used for the control of static power converters (SPC). The performance of these controllers is sensitive to the controller rules, the parameters of the membership functions, and the input-output scaling factors. Among these parameters, the scaling factor (SF) directly affects controller performance in terms of transient response, steady-state error, and stability. Therefore, using an optimum SF value improves FLC performance compared with using a constant value. Hence, this paper proposes optimizing the output scaling factor (OSF) of the PI-like fuzzy logic controller (PIFLC) using the Particle Swarm Optimization (PSO) algorithm. To optimize this parameter and analyze its effect on controller performance, the output scaling factor of the FLC is first optimized with various PSO algorithms, and one of these algorithms is selected for experimental testing. The optimized FLC is then applied to a DC-DC Buck converter, and the performance of the controller is evaluated under nominal load and load disturbance. The controller design, the OSF optimization, and the controller performance analysis approaches are presented in detail.
Development of a MFCC-SVM Based Turkish Speech Recognition System
(2016) Tombaloglu, Burak; Erdem, Hamit
In this study, an SVM-MFCC based Turkish speech recognition system is developed. In this structure, Mel Frequency Cepstral Coefficients (MFCC) are used for feature extraction and Support Vector Machines (SVM) are used to classify the phonemes. Three more phoneme recognition methods are applied to the same dataset and their performance is compared. The applied methods are combinations of Linear Prediction Cepstral Coefficients (LPCC), a commonly used feature extraction method, and the Hidden Markov Model (HMM), a well-known classification method. The applied feature extraction and classification methods were selected because of the phoneme-based nature of the Turkish language.
A SVM Based Speech to Text Converter for Turkish Language
(2017) Tombaloglu, Burak; Erdem, Hamit
In this work, a Support Vector Machines (SVM) based Turkish speech to text converter system has been developed. In the recognition system, Mel Frequency Cepstral Coefficients (MFCC) are used to extract features of Turkish speech, and an SVM based classifier is used to classify the phonemes. The morphological structure of Turkish, a language based on phonemes, has been taken into consideration in the developed person-dependent voice recognition system. Unlike the multiclass classifiers used in SVM-MFCC based voice recognition systems, the new SVM classifier system uses fewer classes per layer and a larger number of multiclass layers. A new Text Comparison Algorithm is also proposed, which uses the phoneme sequence to measure word similarity. With these enhancements, voice recognition and word recognition performance improve as the training period increases. The performance of the proposed structure is compared with that of similar systems.
Turkish Speech Recognition Techniques and Applications of Recurrent Units (LSTM and GRU)
(2021) Tombaloglu, Burak; Erdem, Hamit
A typical solution to Automatic Speech Recognition (ASR) problems consists of feature extraction, feature classification, acoustic modeling, and language modeling steps. In the classification and modeling steps, Deep Learning methods have become popular and give more successful recognition results than conventional methods. In this study, an application for solving the ASR problem in the Turkish language has been developed. The data sets and studies related to the Turkish ASR problem are examined, as are the language models used in ASR for agglutinative languages such as Turkish, Finnish, and Hungarian. A subword-based model is chosen to avoid a large vocabulary without decreasing recognition performance. Recognition performance is increased by using the Deep Learning methods Long Short-Term Memory (LSTM) neural networks and Gated Recurrent Units (GRU) in the classification and acoustic modeling steps. The recognition performance of systems using LSTM and GRU is compared with that of previous studies using traditional methods and Deep Neural Networks. The results show that LSTM and GRU based speech recognizers perform better than recognizers based on previous methods. The final Word Error Rate (WER) values obtained for LSTM and GRU were 10.65% and 11.25%, respectively. GRU based systems have similar performance to LSTM based systems, but their training periods are shorter. Computation times are 73.518 and 61.020 seconds, respectively. The study gives detailed information about the applicability of the latest methods to Turkish ASR research and applications.
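Word Error Rate, the metric reported above, is conventionally the word-level edit distance between the reference transcript and the recognizer output, divided by the number of reference words. The sketch below follows that standard definition; the function name and the example sentences are illustrative assumptions, not material from the study.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)

# Illustrative Turkish sentence with one word dropped by the recognizer.
wer = word_error_rate("bugün hava çok güzel", "bugün hava güzel")
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.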
