Health insurance pricing using CART decision trees algorithm


  • Fatima EL KASSIMI Université Hassan premier
  • Jamal ZAHI University Hassan 1st, Faculty of Economics and Management, LM2CE


Pricing, health insurance, machine learning, decision tree, CART.


Compulsory health insurance is intended to cover the medical care and expenditures of a population of insureds that is heterogeneous in health status: insureds present different levels of risk and a wide range of health conditions. In Morocco, however, compulsory levies are collected independently of health status, so that low-risk people bear the cost of care in place of high-risk ones. These levies should instead be based on the risk presented by the insured, so that the contribution rate is proportional to the risk that the insurance company bears. The purpose of this paper is to propose a different approach to health insurance pricing, based on machine learning methods, namely the CART algorithm.


How to Cite

EL KASSIMI, F., & ZAHI, J. (2022). Health insurance pricing using CART decision trees algorithm. International Journal of Computer Engineering and Data Science (IJCEDS), 2(3). Retrieved from