(2) * Cao Lijia (Sichuan University of Science & Engineering, China)
(3) Fu Changyou (Sichuan University of Science & Engineering, China)
*corresponding author
Abstract: Traditional visual simultaneous localization and mapping (SLAM) systems assume a static environment and therefore struggle to achieve real-time indoor 3D reconstruction in complex dynamic scenes. To address this, this paper proposes a real-time indoor 3D reconstruction algorithm based on semantic visual SLAM. Object detection supplies 2D semantic information, which serves as prior information for geometric methods; fusing the two effectively suppresses dynamic features, reduces reliance on deep learning methods, and preserves the algorithm's real-time performance. Experimental results on dynamic scenes show that, compared to the ORB-SLAM2 system, our algorithm keeps real-time performance nearly unchanged while achieving an average performance improvement of approximately 97.56% on the TUM RGB-D dataset and 97.31% on the Bonn dataset. Moreover, our algorithm reconstructs a global indoor OctoMap and a semantic metric map, which are more intuitive than sparse point cloud maps; this enhances the scene perception capability of mobile robots and lays the foundation for performing advanced tasks. Furthermore, our algorithm demonstrates a 3.5 to 10.5 times improvement in real-time performance compared to other mainstream semantic SLAM systems. Experimental results on the NVIDIA Jetson AGX Xavier confirm that our algorithm can run in real time on low-power platforms such as mobile robots or drones. The algorithm has two drawbacks: lower reconstruction accuracy in low-texture and large-scale scenes, and ineffective suppression of dynamic features in low-dynamic scenes. Future work will consider replacing and improving the deep learning methods and integrating an IMU and other sensors to enhance system usability.
Keywords: Visual SLAM; Semantic SLAM; 3D Reconstruction; Object Detection; Dynamic Features
DOI: https://doi.org/10.31763/ijrcs.v4i1.1266
Copyright (c) 2023 Yu Liang, Fu Changyou, Cao Lijia
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
International Journal of Robotics and Control Systems
e-ISSN: 2775-2658
Website: https://pubs2.ascee.org/index.php/IJRCS
Email: ijrcs@ascee.org
Organized by: Association for Scientific Computing Electronics and Engineering (ASCEE), Peneliti Teknologi Teknik Indonesia, Department of Electrical Engineering, Universitas Ahmad Dahlan and Kuliah Teknik Elektro
Published by: Association for Scientific Computing Electronics and Engineering (ASCEE)
Office: Jalan Janti, Karangjambe 130B, Banguntapan, Bantul, Daerah Istimewa Yogyakarta, Indonesia