PREDICTION OF EMPLOYEE PROMOTION USING HYBRID SAMPLING METHOD WITH MACHINE LEARNING ARCHITECTURE
Shahidan bin Shafie1, Soek Peng Ooi2, Khai Wah Khaw3*
1,2,3*School of Management
Universiti Sains Malaysia, 11800 Minden, Pulau Pinang
ABSTRACT
Employee promotion plays an important role in an organization. It motivates employees to grow and develop their skills, which increases employee loyalty and reduces the turnover rate. This study predicts employee job promotion from employee promotion data by using a hybrid sampling method with machine learning. The purpose of this study is to accelerate the promotion process and to identify the features that matter most when deciding whether to promote an employee. Eight machine learning algorithms are used: Logistic Regression, Decision Tree, Random Forest, K-Nearest Neighbors, Support Vector Machine, Naïve Bayes, Adaptive Boosting Classifier, and Extreme Gradient Boost. Eight algorithms are compared in order to find the most suitable model for predicting employee promotion. Additionally, two hybrid sampling methods were adopted to address the imbalanced dataset: Synthetic Minority Oversampling Technique combined with Edited Nearest Neighbor (SMOTE+ENN) and Synthetic Minority Oversampling Technique combined with Tomek Link (SMOTE+Tomek). For feature selection, the Recursive Feature Elimination method with Random Forest Classifier model (RFE-RFC), the Explained Variance Ratio method with Principal Component Analysis (EVR-PCA), and the Rank Feature Importance method with Extra Classifier Tree model (RFI-ECT) are applied. The top 5, 8, and 12 features ranked by RFI-ECT are used to train the machine learning algorithms. The models are evaluated by precision, recall, and F1-score. The top five features ranked by RFI-ECT are region, department, previous year rating, KPIs met above 80%, and awards won. The results suggest that SMOTE+ENN combined with Extreme Gradient Boost trained on eight features is the highest-performing model in this study.
Keywords: Employee Promotion Prediction, Hybrid Sampling, Imbalanced Data, Machine Learning
Published On: 10 April 2023
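
The abstract evaluates models by precision, recall, and F1-score rather than accuracy alone. The following is a minimal sketch of why that choice matters on imbalanced data; it is not taken from the paper, and the function name and toy labels (1 = promoted, 0 = not promoted) are hypothetical illustrations.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical toy data: 18 employees not promoted, 2 promoted (minority class).
y_true = [0] * 18 + [1] * 2

# A model that always predicts "not promoted" looks accurate but finds no promotions.
y_always_negative = [0] * 20
accuracy = sum(t == p for t, p in zip(y_true, y_always_negative)) / len(y_true)
print("accuracy:", accuracy)
print("precision, recall, F1:", precision_recall_f1(y_true, y_always_negative))
```

On this toy data the always-negative model reaches high accuracy while its recall and F1 for the promoted class are zero, which is the kind of failure that resampling methods such as SMOTE+ENN and class-sensitive metrics are intended to expose.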