A Method of Classification Performance Improvement Via a Strategy of Clustering-Based Data Elimination Integrated with k-Fold Cross-Validation

Arabian Journal for Science and Engineering · February 2021

YÖKSİS Records

Arabian Journal for Science and Engineering · 2021 SCI-Expanded

ASSOCIATE PROFESSOR MUSTAFA SERTER UZER

ASSISTANT PROFESSOR ONUR İNAN

Article Information

Journal: Arabian Journal for Science and Engineering

Publication Date: February 2021

Volume / Pages: 46 · 1199-1212

DOI: 10.1007/s13369-020-04972-y

Scopus ID: 2-s2.0-85091735926

Abstract: Non-system errors that occur during data entry or data collection create noisy data that reduce the success of classification systems. To eliminate such data, a classification system was developed with a new data reduction method, named MKMA-RAC, which consists of a modified k-means algorithm using Relief algorithm coefficients. The main theme of this article is the elimination of noisy data and its consistent application to the classification system using k-fold cross-validation. In the developed system, support vector machine (SVM), linear discriminant analysis (LDA) and decision tree classifiers are integrated with MKMA-RAC-based data reduction in every fold, so that the training data are freed from noisy data. The data reduction process was not applied to the test data. The datasets used were Hepatitis, Liver Disorders, SPECT images and Statlog (Heart), taken from the UCI database. Classification performance values obtained with and without the proposed method under tenfold cross-validation (CV) are reported for these datasets. For the Hepatitis, Liver Disorders, SPECT images and Statlog (Heart) datasets, respectively, the classification success rates of the proposed system were 96.88%, 74.56%, 87.24% and 90.00% with the SVM classifier; 94.91%, 69.05%, 82.38% and 88.52% with the LDA classifier; and 96.25%, 77.73%, 88.77% and 89.63% with the decision tree classifier. The test results show that the proposed system generally achieved higher classification performance than other results in the literature. The performance is therefore very encouraging for pattern recognition applications.
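The fold-wise protocol described in the abstract (reduce the training data inside every fold, leave the test data untouched) can be illustrated with a minimal sketch. This is not the authors' MKMA-RAC method, whose details are not given in this record: the elimination rule below (drop training samples that lie closer to another class's centroid) and the nearest-centroid classifier are simple stand-ins, and all function names are hypothetical.

```python
import random
from statistics import mean


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def class_centroids(X, y):
    """Map each class label to the mean vector of its samples."""
    return {
        c: [mean(col) for col in zip(*[x for x, lab in zip(X, y) if lab == c])]
        for c in set(y)
    }


def nearest_class(cents, x):
    """Label of the centroid closest to x."""
    return min(cents, key=lambda c: sq_dist(x, cents[c]))


def eliminate_suspect_samples(X, y):
    """Stand-in for MKMA-RAC (assumption, not the published algorithm):
    keep only samples whose nearest class centroid matches their label."""
    cents = class_centroids(X, y)
    keep = [i for i in range(len(X)) if nearest_class(cents, X[i]) == y[i]]
    return [X[i] for i in keep], [y[i] for i in keep]


def cross_validate(X, y, k=10, reduce_train=True, seed=0):
    """k-fold CV in which data reduction touches the training split only."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    correct = 0
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        if reduce_train:
            # Reduction is re-fitted per fold and never sees the test samples.
            X_tr, y_tr = eliminate_suspect_samples(X_tr, y_tr)
        cents = class_centroids(X_tr, y_tr)  # stand-in classifier
        correct += sum(nearest_class(cents, X[i]) == y[i] for i in test_idx)
    return correct / len(X)
```

Calling `cross_validate(X, y, k=10, reduce_train=False)` on the same data gives the baseline without reduction, which mirrors the with/without comparison reported in the abstract. Keeping the reduction step inside the loop is the point of the design: fitting it once on the full dataset would let test samples influence which training samples are removed.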

Authors (2)

Onur Inan

Mustafa Serter Uzer

ORCID: 0000-0002-8829-5987

Keywords

Clustering-based data elimination · Medical dataset classification · Relief

Institutions

Necmettin Erbakan Üniversitesi

Meram, Turkey

Selçuk Üniversitesi

Selçuklu, Turkey

Authors in Our System

MUSTAFA SERTER UZER

ASSOCIATE PROFESSOR
