Healthcare analytics by engaging machine learning

(1) * Pragathi Penikalapati (School of Computer Science and Engineering, VIT University, India)
(2) A Nagaraja Rao (School of Computer Science and Engineering, VIT University, India)
*corresponding author

Abstract


Precise prediction of chronic diseases is fundamental to healthcare informatics, and early diagnosis is crucial to delivering any healthcare service. Modern life leaves people vulnerable to a range of health disorders: stressful lifestyles cause anxiety and depression, and many people are susceptible to hypertension and diabetes or to major diseases such as cancer and cardiovascular ailments. Periodic screening and diagnostic tests for such disorders are therefore needed to lead healthy lives. In this context, Machine Learning can play a pivotal role in developing Electronic Health Records (EHR) that support quick, comprehensively automated procedures for detecting disease among at-risk individuals at an early stage, so that referral, counseling, and treatment can be initiated sooner. This paper surveys the use of feature selection and Machine Learning techniques, such as Classification and Clustering, in disease diagnosis and early prediction. It aims to identify the best Machine Learning models, supported by their performance indices, utility aspects, constraints, and critical issues, in the context of their effective application in healthcare analytics, for the benefit of practitioners and researchers.

Keywords


Electronic Health Records; Feature Selection; Machine Learning; Classification Models; Clustering Models

   

DOI

https://doi.org/10.31763/sitech.v1i1.32
      





Copyright (c) 2020 Pragathi Penikalapati, A Nagaraja Rao

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

___________________________________________________________
Science in Information Technology Letters
ISSN 2722-4139
Published by Association for Scientific Computing Electrical and Engineering (ASCEE)
W : http://pubs2.ascee.org/index.php/sitech
E : sitech@ascee.org, andri@ascee.org, andri.pranolo.id@ieee.org
