Abstract:
In recent years, machine learning (ML)-based approaches have gained increasing attention in occupational accident research. However, four challenges remain prevalent in ML research, particularly in occupational risk prediction: data uncertainty, the handling of unstructured information, the low predictive power of algorithms, and the generation of crisp rules together with their linguistic description. The present study develops a new methodology to address these challenges through two tasks, prediction and rule generation. A predictive model, namely rough set (RS) sample-based PSO-ANFIS, is developed for prediction, and decision rules are then generated using the lower approximation of RS and Z-number. The methodology contributes by: (i) handling data uncertainty using the lower approximation of RS, (ii) handling unstructured information using Latent Dirichlet Allocation (LDA)-based topic modeling, (iii) predicting risk using an optimized ML model, (iv) extracting crisp decision rules using the lower approximation of RS, and (v) determining the reliability of the crisp decision rules using Z-number. The efficacy of the proposed method over several state-of-the-art algorithms (namely, ANN, KNN, SVM, C4.5, C5.0, CART, RF, and RS-PSO-ANFIS) is demonstrated using benchmark datasets acquired from the UCI ML repository and one real-life occupational safety dataset acquired from an integrated steel plant in India. A total of 22 implementable crisp safety decision rules have been extracted from the predictive results based on the lower approximation of RS. Experimental results also reveal that the RS sample-based PSO-ANFIS yields the lowest mean absolute error (MAE) in risk prediction and is found to be the most robust of the algorithms compared.
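
To illustrate how the lower approximation of RS can yield crisp decision rules, the following is a minimal sketch of the general rough-set idea and not the study's implementation. Objects are grouped into equivalence classes by their condition attributes, and only classes whose members all share one decision value (i.e., classes inside the lower approximation of that decision concept) are turned into rules. The function name `lower_approximation_rules`, the attribute names, and the toy decision table are hypothetical and are not taken from the study's data.

```python
from collections import defaultdict


def lower_approximation_rules(rows, condition_attrs, decision_attr):
    """Return (conditions, decision, support) for consistent equivalence classes."""
    # Group objects into equivalence classes by their condition-attribute values.
    classes = defaultdict(list)
    for row in rows:
        key = tuple(row[a] for a in condition_attrs)
        classes[key].append(row[decision_attr])

    rules = []
    for key, decisions in classes.items():
        # A class lies in the lower approximation of a decision concept
        # only if every member carries the same decision value.
        if len(set(decisions)) == 1:
            conditions = dict(zip(condition_attrs, key))
            rules.append((conditions, decisions[0], len(decisions)))
    return rules


# Hypothetical toy decision table (illustrative only, not the steel plant data).
toy_table = [
    {"shift": "night", "activity": "maintenance", "risk": "high"},
    {"shift": "night", "activity": "maintenance", "risk": "high"},
    {"shift": "day", "activity": "maintenance", "risk": "low"},
    {"shift": "day", "activity": "operation", "risk": "low"},
    {"shift": "day", "activity": "operation", "risk": "high"},  # conflicts with the row above
]

rules = lower_approximation_rules(toy_table, ["shift", "activity"], "risk")
for conditions, decision, support in rules:
    print(f"IF {conditions} THEN risk = {decision} (support = {support})")
```

In this toy table, the class (day, operation) carries conflicting decisions, so it falls in the boundary region and produces no crisp rule. The reliability assessment with Z-number described above is not covered by this sketch.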