Healthcare analytics by engaging machine learning

Pragathi Penikalapati; A Nagaraja Rao

doi:10.31763/sitech.v1i1.32



(1)* Pragathi Penikalapati
(School of Computer Science and Engineering, VIT University, India)

(2) A Nagaraja Rao
(School of Computer Science and Engineering, VIT University, India)

* Corresponding author

Abstract

Precise prediction of chronic diseases is the foundation of healthcare informatics, and early diagnosis is crucial to delivering any healthcare service. Modern life leaves people generally vulnerable to health disorders: stressful lifestyles cause anxiety and depression, and many are susceptible to hypertension, diabetes, or major diseases such as cancer and cardiovascular ailments. Periodic screening and diagnostic tests for such disorders are therefore needed to lead healthy lives. In this context, Machine Learning can play a pivotal role in developing Electronic Health Records (EHR) that support fast, comprehensively automated detection of disease among at-risk individuals at an early stage, so that referral, counseling, and treatment can begin sooner. This paper surveys the use of feature selection and Machine Learning techniques, such as classification and clustering, for disease diagnosis and early prediction. It aims to identify the best-performing Machine Learning models, supported by their performance indices, utility, constraints, and critical issues in applying them effectively to healthcare analytics, for the benefit of practitioners and researchers.

Keywords

Electronic Health Records; Feature Selection; Machine Learning; Classification Models; Clustering Models





This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

___________________________________________________________
Science in Information Technology Letters
ISSN 2722-4139
Published by Association for Scientific Computing Electrical and Engineering (ASCEE)
W : http://pubs2.ascee.org/index.php/sitech
E : sitech@ascee.org, andri@ascee.org, andri.pranolo.id@ieee.org
