A Comparative Evaluation of Machine Learning and Ensemble Models for Predicting Digital Forensic Investigation Outcomes Using Network Intrusion Data

Abstract:

The increasing complexity of cybercrime and the growing volume of digital evidence have intensified the need for accurate and reliable predictive models in forensic science. Traditional statistical approaches often struggle to handle high-dimensional and heterogeneous network data, which limits their effectiveness in modern digital forensic investigations. This study evaluates the performance of machine learning and ensemble techniques in predicting digital forensic investigation outcomes from intrusion detection data. A quantitative experimental research design was adopted using the CIC-IDS2017, UNSW-NB15, and NSL-KDD datasets. Logistic Regression was employed as the baseline model, while Random Forest, Support Vector Machine, and Gradient Boosting were implemented as advanced predictive models. Data preprocessing included normalisation, feature selection, and the removal of sources of data leakage to ensure validity. Model performance was evaluated using accuracy, precision, recall, F1 score, Area Under the Receiver Operating Characteristic Curve (AUC-ROC), and error-based metrics. The results indicate that the machine learning models substantially outperform the baseline statistical approach. Logistic Regression achieved an accuracy of 84.2 percent, while Random Forest and Support Vector Machine each exceeded 92 percent. Gradient Boosting achieved the highest accuracy, 95.2 percent, with the lowest false positive and false negative rates. The findings further show that data preprocessing plays a critical role in ensuring reliable results, as initially inflated performance was traced to data leakage. The study concludes that ensemble learning techniques provide superior predictive performance for digital forensic investigations. However, effective deployment requires a careful balance between predictive accuracy and interpretability.
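
For readers less familiar with the evaluation metrics named above, the following is a minimal sketch of how accuracy, precision, recall, F1 score, and the false positive and false negative rates follow from a binary confusion matrix. The function name `binary_metrics`, the `BinaryMetrics` container, and the toy labels are assumptions made for illustration only; they are not taken from the study's code or data.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float
    false_positive_rate: float
    false_negative_rate: float


def _ratio(numerator, denominator):
    # Return 0.0 instead of raising when a class is absent.
    return numerator / denominator if denominator else 0.0


def binary_metrics(y_true, y_pred, positive=1):
    """Compute confusion-matrix metrics for a binary task (hypothetical helper)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1

    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)
    return BinaryMetrics(
        accuracy=_ratio(tp + tn, tp + tn + fp + fn),
        precision=precision,
        recall=recall,
        f1=_ratio(2 * precision * recall, precision + recall),
        false_positive_rate=_ratio(fp, fp + tn),
        false_negative_rate=_ratio(fn, fn + tp),
    )


# Toy labels (illustrative only): 1 = attack, 0 = benign.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In this toy case one attack is missed and one benign flow is flagged, so accuracy alone (0.8) hides the two distinct error types that the false positive and false negative rates separate. In a forensic setting the two errors are not interchangeable, which is one reason to report both rates alongside accuracy.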