ADDRESSING CLASS AND DEMOGRAPHIC IMBALANCE IN
E-COMMERCE BEHAVIOR PREDICTION: A CASE STUDY USING RESAMPLING TECHNIQUES
Nurul Ain Mustakim1*, Maslina Abdul Aziz2, Shuzlina Abdul Rahman3 and Rahmiati Rahmiati4
1* Faculty of Business & Management, Universiti Teknologi MARA,
Melaka 75350 Malaysia
2,3 Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA,
40450 Shah Alam, Selangor Malaysia
4 Management Department, Universitas Negeri Padang,
Sumatera Barat 25171, Indonesia
ABSTRACT
In e-commerce predictive modeling, imbalanced data remains a critical challenge, particularly when both class labels and demographic attributes are unequally distributed. This study investigates a combined approach of Synthetic Minority Oversampling Technique (SMOTE) and demographic resampling to improve the performance of models predicting online purchasing behavior in Malaysia. Using a dataset of 1,126 survey responses, six classifiers (J48, Random Tree, REPTree, JRip, PART, and OneR) were evaluated under three conditions: unbalanced, after SMOTE, and after SMOTE with demographic balancing. The results showed clear improvements in model performance. For example, J48’s accuracy increased from 62.85% (unbalanced) to 98.69% (fully balanced), while Random Tree achieved 99.29% under the fully balanced condition. These results highlight the effectiveness of integrating class and demographic balancing, an approach seldom explored in e-commerce analytics. This study contributes by demonstrating how addressing both types of imbalance yields more reliable predictive models, offering practical insights for consumer segmentation, targeting, and personalization. Future work could extend this approach by balancing additional attributes and applying it to ensemble or deep learning models for improved robustness and interpretability.
Keywords: Classification, Consumer Behavior, Data Imbalance, Demographic Resampling, E-Commerce, SMOTE
Published On: 1 April 2026
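
The two balancing steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, which the abstract does not specify. It assumes numeric features, a SMOTE-style interpolation between a minority row and one of its nearest minority neighbours, and simple random duplication as a stand-in for demographic resampling. The function names `smote` and `balance_groups` and the toy data are hypothetical.

```python
import random
from collections import Counter


def smote(minority, n_new, k=5, rng=None):
    """Create n_new synthetic rows by interpolating between a minority row
    and one of its k nearest minority neighbours (SMOTE-style sketch)."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority rows by squared Euclidean distance to base.
        others = [row for row in minority if row is not base]
        others.sort(key=lambda row: sum((a - b) ** 2 for a, b in zip(base, row)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def balance_groups(rows, group_of, rng=None):
    """Randomly duplicate rows so that every demographic group reaches the
    size of the largest group (assumed stand-in for demographic resampling)."""
    rng = rng or random.Random(0)
    groups = {}
    for row in rows:
        groups.setdefault(group_of(row), []).append(row)
    target = max(len(members) for members in groups.values())
    balanced = []
    for members in groups.values():
        balanced.extend(members)
        balanced.extend(rng.choice(members) for _ in range(target - len(members)))
    return balanced


# Toy data (hypothetical): four minority-class rows with two numeric features.
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_rows = smote(minority, n_new=6, k=2)

# Toy survey rows (hypothetical): (gender, class label), with one group under-represented.
survey = [("F", 1)] * 6 + [("M", 0)] * 2
balanced = balance_groups(survey, group_of=lambda row: row[0])
group_counts = Counter(row[0] for row in balanced)
```

Because each synthetic row lies on a line segment between two real minority rows, it stays within the range of the observed minority data. The duplication step only equalises group sizes and adds no new information, so evaluation on data held out before resampling remains advisable.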
