A Fundamental Overview of SOTA-Ensemble Learning Methods for Deep Learning: A Systematic Literature Review

Marco Klaiber* (Aalen University of Applied Sciences, 73430 Aalen, Germany)
* Corresponding author


The rapid growth in popularity of Deep Learning (DL) continues to open up new use cases and opportunities, with methods evolving rapidly and new fields emerging from the convergence of different algorithms. For this systematic literature review, we considered the most relevant peer-reviewed journal and conference papers on state-of-the-art Ensemble Learning (EL) methods for application in DL, whose combination is also expected to give rise to new methods. The EL methods relevant to this work are described in detail, and the popular combination strategies as well as the individual tuning and averaging procedures are presented. A comprehensive overview of the various limitations of EL is then provided, culminating in the formulation of research gaps for future scholarly work, which is the goal of this review. This work fills a research gap for upcoming work in EL by presenting in detail, and making accessible, the fundamental properties of the chosen methods. This should further deepen the understanding of this complex topic and, following the maxim of ensemble learning, enable better results through an ensemble of knowledge.


Ensemble Learning; Bagging; Boosting; Deep Learning; Machine Learning; Predictive Performance; CNN
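Among the keywords above, bagging (bootstrap aggregating) is the combination strategy the review covers in most detail. As a minimal, hypothetical sketch (not taken from the paper; the toy 1-D dataset and decision-stump learner are illustrative assumptions), bagging trains each weak learner on a bootstrap resample and combines them by majority vote:

```python
import random

random.seed(42)

# Toy 1-D dataset: the true label is 1 when x > 0.5; 15% of labels are flipped.
data = []
for _ in range(300):
    x = random.random()
    y = 1 if x > 0.5 else 0
    if random.random() < 0.15:
        y = 1 - y
    data.append((x, y))

train, test = data[:200], data[200:]

def fit_stump(sample):
    """Pick the threshold (from a coarse grid) that best splits the sample."""
    best_t, best_acc = 0.5, -1.0
    for t in [i / 20 for i in range(1, 20)]:
        acc = sum((x > t) == (y == 1) for x, y in sample) / len(sample)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

def predict(stumps, x):
    """Majority vote over the individual stump predictions."""
    votes = sum(1 for t in stumps if x > t)
    return 1 if votes * 2 >= len(stumps) else 0

# Bagging: train each stump on a bootstrap resample of the training set.
ensemble = [fit_stump(random.choices(train, k=len(train))) for _ in range(25)]

single_acc = sum(predict(ensemble[:1], x) == y for x, y in test) / len(test)
bagged_acc = sum(predict(ensemble, x) == y for x, y in test) / len(test)
print(f"single stump: {single_acc:.2f}  bagged ensemble: {bagged_acc:.2f}")
```

Because each resample draws with replacement, the stumps see slightly different data and their voted prediction tends to be more stable than any single stump, which is the variance-reduction effect bagging is used for.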




DOI: 10.31763/sitech.v2i2.549





Y. Chen, Y. Wang, Y. Gu, X. He, P. Ghamisi, and X. Jia, “Deep learning ensemble for hyperspectral image classification,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 12, no. 6, pp. 1882–1897, 2019.

H. Greenspan, B. van Ginneken, and R. M. Summers, “Guest editorial deep learning in medical imaging: Overview and future promise of an exciting new technique,” IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1153–1159, 2016.

W. Liu, M. Zhang, Z. Luo, and Y. Cai, “An ensemble deep learning method for vehicle type classification on visual traffic surveillance sensors,” IEEE Access, vol. 5, pp. 24417–24425, 2017.

I. Kononenko, “Machine learning for medical diagnosis: history, state of the art and perspective,” Artificial Intelligence in Medicine, vol. 23, no. 1, pp. 89–109, 2001.

J. Latif, C. Xiao, A. Imran, and S. Tu, “Medical imaging using machine learning and deep learning algorithms: A review,” in 2019 2nd International Conference on Computing, Mathematics and Engineering Technologies (iCoMET), pp. 1-5, 2019.

I. E. Naqa and M. J. Murphy, “What is machine learning?” in Machine Learning in Radiation Oncology. Springer International Publishing, pp. 3–11, 2015.

C.-B. Ha and H.-K. Song, “Signal detection scheme based on adaptive ensemble deep learning model,” IEEE Access, vol. 6, pp. 21342–21349, 2018.

Y. Ren, L. Zhang, and P. Suganthan, “Ensemble classification and regression-recent developments, applications and future directions,” IEEE Computational Intelligence Magazine, vol. 11, no. 1, pp. 41–53, 2016.

S.-J. Lee, T. Chen, L. Yu, and C.-H. Lai, “Image classification based on the boost convolutional neural network,” IEEE Access, vol. 6, pp. 12755–12768, 2018.

Z. Zhou, “Ensemble learning,” Encyclopedia of Biometrics, pp. 270–273, 2009.

L. Wen, L. Gao, and X. Li, “A new snapshot ensemble convolutional neural network for fault diagnosis,” IEEE Access, vol. 7, pp. 32037–32047, 2019.

Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, pp. 436–444, 2015.

X. Dong, Z. Yu, W. Cao, Y. Shi, and Q. Ma, “A survey on ensemble learning,” Frontiers of Computer Science, vol. 14, no. 2, pp. 241–258, 2019.

Y. Wang, Y. Yang, Y.-X. Liu, and A. A. Bharath, “A recursive ensemble learning approach with noisy labels or unlabeled data,” IEEE Access, vol. 7, pp. 36459–36470, 2019.

F. Huang, J. Lu, J. Tao, L. Li, X. Tan, and P. Liu, “Research on optimization methods of ELM classification algorithm for hyperspectral remote sensing images,” IEEE Access, vol. 7, pp. 108070–108089, 2019.

O. Sagi and L. Rokach, “Ensemble learning: A survey,” Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, vol. 8, no. 4, pp. 1-18, 2018.

T. G. Dietterich, “Machine-learning research,” AI Magazine, vol. 18, no. 4, pp. 1-40, 1997.

S. Wan and H. Yang, “Comparison among methods of ensemble learning,” in 2013 International Symposium on Biometrics and Security Technologies, IEEE, pp. 286-290, 2013.

F. Huang, G. Xie, and R. Xiao, “Research on ensemble learning,” in 2009 International Conference on Artificial Intelligence and Computational Intelligence, IEEE, pp. 249-252, 2009.

L. Breiman, “Bagging predictors,” Machine Learning, vol. 24, no. 2, pp. 123–140, 1996.

Y.-H. Na, H. Jo, and J.-B. Song, “Learning to grasp objects based on ensemble learning combining simulation data and real data,” in 2017 17th International Conference on Control, Automation and Systems (ICCAS), pp. 1030-1034, 2017.

M. A. Dede, E. Aptoula, and Y. Genc, “Deep network ensembles for aerial scene classification,” IEEE Geoscience and Remote Sensing Letters, vol. 16, no. 5, pp. 732–735, 2019.

G. Webb and Z. Zheng, “Multistrategy ensemble learning: reducing error by combining ensemble learning techniques,” IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 8, pp. 980–991, 2004.

H. Guan and X. Xue, “Robust online visual tracking via a temporal ensemble framework,” in 2016 IEEE International Conference on Multimedia and Expo (ICME), IEEE, pp. 1-6, 2016.

N. Yu, L. Qian, Y. Huang, and Y. Wu, “Ensemble learning for facial age estimation within non-ideal facial imagery,” IEEE Access, vol. 7, pp. 97938–97948, 2019.

J. Zilly, J. M. Buhmann, and D. Mahapatra, “Glaucoma detection using entropy sampling and ensemble learning for automatic optic cup and disc segmentation,” Computerized Medical Imaging and Graphics, vol. 55, pp. 28–41, 2017.

S. A. Gyamerah, P. Ngare, and D. Ikpe, “On stock market movement prediction via stacking ensemble learning method,” in 2019 IEEE Conference on Computational Intelligence for Financial Engineering & Economics (CIFEr), pp. 1-8, 2019.

S. Džeroski and B. Ženko, “Is combining classifiers with stacking better than selecting the best one?” Machine Learning, vol. 54, no. 3, pp. 255–273, 2004.

L. Breiman, “Random forests,” Machine Learning, vol. 45, no. 1, pp. 5–32, 2001.

R. Buettner, S. Sauer, C. Maier, and A. Eckhardt, “Towards ex ante prediction of user performance: A novel NeuroIS methodology based on real-time measurement of mental effort,” in 2015 48th Hawaii International Conference on System Sciences, pp. 532-544, 2015.

H. Liang, L. Song, and X. Li, “The rotate stress of steam turbine prediction method based on stacking ensemble learning,” in 2019 IEEE 19th International Symposium on High Assurance Systems Engineering (HASE), pp. 146-149, 2019.

A. P. Piotrowski and J. J. Napiorkowski, “A comparison of methods to avoid overfitting in neural networks training in the case of catchment runoff modelling,” Journal of Hydrology, vol. 476, pp. 97–111, 2013.

T. G. Dietterich, “Ensemble methods in machine learning,” in Multiple Classifier Systems. Springer Berlin Heidelberg, pp. 1–15, 2000.

Y. Freund and R. E. Schapire, “A decision-theoretic generalization of on-line learning and an application to boosting,” Journal of Computer and System Sciences, vol. 55, no. 1, pp. 119–139, 1997.

B. Zhang, Y. Yang, C. Chen, L. Yang, J. Han, and L. Shao, “Action recognition using 3d histograms of texture and a multi-class boosting classifier,” IEEE Transactions on Image Processing, vol. 26, no. 10, pp. 4648–4660, 2017.

Z.-H. Zhou, J. Wu, and W. Tang, “Ensembling neural networks: Many could be better than all,” Artificial Intelligence, vol. 137, no. 1-2, pp. 239–263, 2002.

Y. Zhao, J. Li, and L. Yu, “A deep learning ensemble approach for crude oil price forecasting,” Energy Economics, vol. 66, pp. 9–16, 2017.

G. Huang, Y. Li, G. Pleiss, Z. Liu, J. E. Hopcroft, and K. Q. Weinberger, “Snapshot ensembles: Train 1, get m for free,” arXiv preprint arXiv:1704.00109, pp. 1–14, 2017.

Y. Freund, “Boosting a weak learning algorithm by majority,” Information and Computation, vol. 121, no. 2, pp. 256–285, 1995.

E. Bauer and R. Kohavi, “An empirical comparison of voting classification algorithms: Bagging, boosting, and variants,” Machine Learning, vol. 36, no. 1, pp. 105–139, 1999.

J. R. Quinlan, “Bagging, boosting, and c4.5,” in Proceedings of the Thirteenth National Conference on Artificial Intelligence. AAAI Press, pp. 725–730, 1996.

R. J. Tibshirani and B. Efron, “An introduction to the bootstrap,” Monographs on statistics and applied probability, vol. 57, pp. 1–436, 1993.

A. Kabir, C. Ruiz, and S. A. Alvarez, “Mixed bagging: A novel ensemble learning framework for supervised classification based on instance hardness,” in 2018 IEEE International Conference on Data Mining (ICDM), pp. 1073-1078, 2018.

B. Krawczyk and M. Woźniak, “Wagging for combining weighted one-class support vector machines,” Procedia Computer Science, vol. 51, pp. 1565–1573, 2015.

G. I. Webb, “Multiboosting: A technique for combining boosting and wagging,” Machine Learning, vol. 40, no. 2, pp. 159–196, 2000.

D. H. Wolpert, “Stacked generalization,” Neural Networks, vol. 5, no. 2, pp. 241–259, 1992.

T. K. Ho, “The random subspace method for constructing decision forests,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, no. 8, pp. 832–844, 1998.

Y. Lin and Y. Jeon, “Random forests and adaptive nearest neighbors,” Journal of the American Statistical Association, vol. 101, no. 474, pp. 578–590, 2006.

I. Loshchilov and F. Hutter, “SGDR: Stochastic gradient descent with warm restarts,” arXiv preprint arXiv:1608.03983, pp. 1–16, 2016.

P. K. Chan and S. J. Stolfo, “A comparative evaluation of voting and meta-learning on partitioned data,” in Machine Learning Proceedings 1995. Elsevier, pp. 90–98, 1995.

J. Zheng, X. Cao, B. Zhang, X. Zhen, and X. Su, “Deep ensemble machine for video classification,” IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 2, pp. 553–565, 2019.



Copyright (c) 2021 Marco Klaiber

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Science in Information Technology Letters
ISSN 2722-4139
Published by the Association for Scientific Computing Electronics and Engineering (ASCEE)
W : http://pubs2.ascee.org/index.php/sitech
E : andri@ascee.org, andri.pranolo.id@ieee.org

