Ranking machine learning algorithms used to detect defective software modules with multi-criteria decision-making methods

Alper Işık, Merve

dc.contributor.advisor	Mamak Ekinci, Elmas Burcu
dc.contributor.author	Alper Işık, Merve
dc.date.accessioned	2024-10-01T08:30:05Z
dc.date.available	2024-10-01T08:30:05Z
dc.date.issued	2024
dc.description.abstract	Defective software modules are among the most important problems in software projects. Such modules contain code that can cause errors, degrade performance, or crash the program. Detecting and fixing them early in the software life cycle is therefore critical to project success. Several approaches exist for early detection, including statistical analysis methods and machine learning algorithms. Previous studies indicate that detecting and correcting defective modules early with these methods increases a project's chance of success, lowers its cost, and allows the software life cycle to be managed more efficiently. Applying these methods from the very start of the life cycle can raise project success and reduce costs. This study addresses the problem of selecting among machine learning algorithms used to detect defective software modules. Using NASA's public-domain KC1 defect dataset (from software coded in C++), the performance of the Naive Bayes, Bagging, Stacking, IBk (kNN), Logistic Regression, Random Tree, Random Forest, SMO, and Neural Networks algorithms was examined with the WEKA application. The analysis yielded values for the Kappa Statistic, Correctly Classified Instances, Mean Absolute Error, Root Mean Squared Error, Relative Absolute Error, and Root Relative Squared Error. These metrics served as the criteria for the multi-criteria decision-making (MCDM) methods used to rank the algorithms. The MCDM methods used in the study were CRITIC, ARAS, and TOPSIS. The resulting ranking identified k-Nearest Neighbor (kNN) as the best machine learning algorithm, the one that minimizes the error metrics.	en_US
dc.identifier.uri	http://hdl.handle.net/11727/12233
dc.language.iso	tur	en_US
dc.publisher	Başkent Üniversitesi Fen Bilimler Enstitüsü
dc.rights	info:eu-repo/semantics/openAccess	en_US
dc.subject	Software Defect Prediction	en_US
dc.subject	Machine Learning	en_US
dc.subject	Performance Criteria	en_US
dc.subject	Multi-Criteria Decision Making	en_US
dc.subject	CRITIC	en_US
dc.subject	ARAS	en_US
dc.subject	TOPSIS	en_US
dc.title	Ranking machine learning algorithms used to detect defective software modules with multi-criteria decision-making methods	en_US
dc.type	masterThesis	en_US
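
The ranking step described in the abstract (performance metrics used as criteria, CRITIC for weighting, TOPSIS for ordering) can be illustrated with a minimal sketch. It follows the textbook formulations of CRITIC and TOPSIS and is not taken from the thesis; ARAS is omitted for brevity. The alternative names (`alt_A` to `alt_D`) and every number in `matrix` are invented for illustration and are not results from the study. The sketch also assumes that no criterion column is constant, since a constant column would make the min-max normalisation and the correlation undefined.

```python
import math


def critic_weights(matrix, benefit):
    """Textbook CRITIC weights for an alternatives-by-criteria matrix."""
    n, m = len(matrix), len(matrix[0])
    norm = []  # one normalised list per criterion; 1.0 is always the best value
    for j in range(m):
        col = [row[j] for row in matrix]
        lo, hi = min(col), max(col)
        if benefit[j]:
            norm.append([(x - lo) / (hi - lo) for x in col])
        else:
            norm.append([(hi - x) / (hi - lo) for x in col])
    means = [sum(c) / n for c in norm]
    stds = [math.sqrt(sum((x - mu) ** 2 for x in c) / (n - 1))
            for c, mu in zip(norm, means)]

    def corr(j, k):
        # Pearson correlation between normalised criteria j and k.
        cov = sum((a - means[j]) * (b - means[k])
                  for a, b in zip(norm[j], norm[k])) / (n - 1)
        return cov / (stds[j] * stds[k])

    # Information content: contrast (std) times conflict (1 - correlation).
    info = [stds[j] * sum(1.0 - corr(j, k) for k in range(m)) for j in range(m)]
    total = sum(info)
    return [c / total for c in info]


def topsis_scores(matrix, weights, benefit):
    """Textbook TOPSIS closeness coefficients (higher is better)."""
    m = len(matrix[0])
    lengths = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(m)]
    # Vector-normalise each column, then apply the criterion weights.
    v = [[weights[j] * row[j] / lengths[j] for j in range(m)] for row in matrix]
    best = [max(r[j] for r in v) if benefit[j] else min(r[j] for r in v)
            for j in range(m)]
    worst = [min(r[j] for r in v) if benefit[j] else max(r[j] for r in v)
             for j in range(m)]
    scores = []
    for r in v:
        d_best = math.sqrt(sum((x - b) ** 2 for x, b in zip(r, best)))
        d_worst = math.sqrt(sum((x - w) ** 2 for x, w in zip(r, worst)))
        scores.append(d_worst / (d_best + d_worst))
    return scores


# Hypothetical data only: accuracy, kappa, MAE, RMSE for four made-up classifiers.
names = ["alt_A", "alt_B", "alt_C", "alt_D"]
matrix = [
    [0.88, 0.45, 0.15, 0.30],
    [0.85, 0.30, 0.20, 0.36],
    [0.84, 0.38, 0.18, 0.39],
    [0.80, 0.20, 0.25, 0.42],
]
benefit = [True, True, False, False]  # accuracy and kappa up, error metrics down

weights = critic_weights(matrix, benefit)
scores = topsis_scores(matrix, weights, benefit)
ranking = [name for _, name in sorted(zip(scores, names), reverse=True)]

for name in ranking:
    print(name, round(scores[names.index(name)], 3))
```

Treating accuracy and kappa as benefit criteria and the error metrics as cost criteria is an assumption about how the six WEKA metrics would be oriented; the record itself states only that the chosen algorithm minimizes the error metrics.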

Files

Original bundle

Name: 10612487.pdf
Size: 793.84 KB
Format: Adobe Portable Document Format

Collections

Fen Bilimleri Enstitüsü / Science Institute