CENTRE-BASED HARD CLUSTERING ALGORITHMS FOR Y-STR DATA

Ali Seman, Zainab Abu Bakar, Azizian Mohd. Sapawi

Department of Computer Sciences,
Faculty of Computer and Mathematical Sciences,
Universiti Teknologi MARA (UiTM)
40450 Shah Alam, Selangor

Abstract

This paper presents centre-based hard clustering approaches for clustering Y-STR data. Two classical partitioning techniques are evaluated: the centroid-based partitioning technique and the representative object-based partitioning technique. The k-Means and k-Modes algorithms are the fundamental algorithms of the centroid-based partitioning technique, whereas k-Medoids is a representative object-based partitioning algorithm. The three algorithms are applied to, and evaluated on, the partitioning of Y-STR haplogroup and Y-STR surname data. Overall, the results show that the centroid-based partitioning technique clusters Y-STR data better than the representative object-based partitioning technique.

Keywords: Centre-based clustering, k-Means, k-Modes, k-Medoids, Y-STR data
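
The difference between the two techniques named in the abstract lies in how a cluster centre is chosen. The sketch below is illustrative only and is not the authors' implementation. It assumes that a Y-STR sample is a tuple of allele repeat counts, one per marker, and that dissimilarity is the simple matching count of differing markers; the abstract does not state the measure used. The sample values and the function names (mismatch, mean_centre, mode_centre, medoid_centre) are hypothetical.

```python
from collections import Counter

# Hypothetical cluster: three Y-STR samples, five markers each (made-up values).
cluster = [
    (14, 12, 28, 17, 24),
    (13, 13, 28, 17, 23),
    (14, 12, 29, 16, 23),
]

def mismatch(a, b):
    """Simple matching dissimilarity (assumed): number of markers that differ."""
    return sum(x != y for x, y in zip(a, b))

def mean_centre(members):
    """k-Means style centroid: per-marker arithmetic mean (usually fractional)."""
    return tuple(sum(col) / len(col) for col in zip(*members))

def mode_centre(members):
    """k-Modes style centroid: most frequent value per marker.
    The result need not be an actual member of the cluster."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*members))

def medoid_centre(members):
    """k-Medoids style representative object: the member whose total
    dissimilarity to all members of the cluster is smallest."""
    return min(members, key=lambda m: sum(mismatch(m, o) for o in members))

if __name__ == "__main__":
    print("mean  :", mean_centre(cluster))
    print("mode  :", mode_centre(cluster))    # (14, 12, 28, 17, 23)
    print("medoid:", medoid_centre(cluster))  # (14, 12, 28, 17, 24)
```

For this toy cluster the mode (14, 12, 28, 17, 23) is a synthetic profile that matches no sample exactly, while the medoid (14, 12, 28, 17, 24) is always one of the observed samples. That is the distinction between a centroid and a representative object.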
