An Optimally Configured HP-GRU Model Using Hyperband for The Control of Wall Following Robot

ABSTRACT


Introduction
Autonomous robots play a vital role in several commercial and domestic applications. These robots are widely used in military factories, secular power plants, chemical industries, and locomotive industries. One such robot is the wall-following robot, which plays a vital role in detecting faults in machinery and cracks in infrastructure, performing rescue activities, and serving medical and rehabilitation centers [1][2][3][4]. The accurate control of these autonomous robots is necessary for their robust performance and the safety of their surroundings. In this paper, we present an autonomous control framework for the wall-following robot using an optimally configured Gated Recurrent Unit (GRU) model with the hyperband algorithm. GRU is widely used for time-series or sequence data, and it overcomes the vanishing gradient problem of RNNs. GRU also consumes less memory and is computationally more efficient than LSTM. The selection of hyper-parameters of the GRU model is a complex optimization problem with local minima. Usually, hyper-parameters are selected through trial and error, which does not guarantee an optimal solution. To work around this problem, we used the hyperband algorithm for the selection of optimal parameters. It is an iterative method, which searches for the optimal configuration by discarding the worst-performing configurations on each iteration. The proposed HP-GRU model is applied to a dataset from a SCITOS G5 robot with 24 sensors mounted. The results show that HP-GRU has a mean accuracy of 0.9857 and a mean loss of 0.0810, and it is comparable with other deep learning algorithms.
Many methods, from classical to modern state-of-the-art, have been proposed for the controlled motion of wall-following robots [5][6][7][8][9][10][11][12][13]. Juang et al. [5] proposed a fuzzy control for the hexapod robot trained through differential evolution. Dash et al. [7] used a dataset of 24 ultrasonic sensors mounted on a SCITOS G5 robot and trained a neural network (NN) to design a controller with an accuracy of 92.67%. Later, Dash et al. [9] proposed a hybrid model composed of a gradational search approach with a feedforward neural network and achieved an accuracy of 86.38% within 0.28 sec. Likewise, Dash et al. [10] proposed another approach known as Adaptive Resonance Theory-1 for the control of a wall-following robot with an average accuracy of 91.78%. Some heuristic techniques have also been employed for the control of the robot [14,15]. For instance, Chen et al. [16] used an optimization-based meta-heuristic approach known as PSO and achieved a maximum accuracy of 98.8%. Likewise, Isaac et al. [17] compared the performance of Bayesian and k-NN networks in a dynamic environment, with accuracies of 93.3% and 73.3%, respectively.
Recurrent Neural Networks (RNNs) are widely used for time-series data since they can capture the contextual information stored in sequential data. RNNs are used in several real-world applications [18][19][20][21][22][23][24][25]. They have also been used in the control of the wall-following robot. Hammad et al. [26] employed LSTM and GRU models (two variants of RNN) for the control of the robot and obtained accuracies of 96.15% and 96.52%, respectively.
In this paper, we present an optimally configured HP-GRU (Hyperband Gated Recurrent Unit) model for the robust control of the wall-following robot. In all the models discussed above, the hyper-parameters are chosen manually through trial and error because the selection of hyper-parameters is an optimization problem. We employed the Hyperband algorithm to select optimal parameters from the search space for GRU and achieved a robust control framework for the wall-following robot. In the simulation, the dataset is taken from a SCITOS G5 robot with 24 ultrasonic sensors mounted on it [27]. The dataset contains four moves of the robot based on the sensory information. We applied our proposed HP-GRU model to the dataset and achieved a mean accuracy of 98.58%. We compared the results with other methods to show the superiority of our optimally configured GRU model. The rest of the paper is organized as follows. Section 2 discusses the architecture of RNN and, more particularly, GRU. Section 3 discusses in detail the Successive-Halving and Hyperband algorithms, our proposed optimally configured GRU model, and the nature of the dataset collected from the SCITOS G5. Section 4 discusses the numerical results and the comparison with other algorithms. Section 5 concludes the paper with final remarks.

Gated Recurrent Unit (GRU-RNN)
RNNs are widely used for sequential or time-series data. They are an extension of neural networks (NNs) in which several NNs are stacked together and share common weights. RNNs overcome the limited input length of NNs but still face the vanishing gradient problem. There are two ways to work around this problem, i.e., LSTM (Long Short-Term Memory) and GRU (Gated Recurrent Unit). Both have their pros and cons, but GRU consumes less memory and is faster than LSTM. The comparison between LSTM and GRU is given below:
• LSTM processes longer input sequences than GRU.
• GRU has two memory gates, whereas LSTM has three memory gates.
• The GRU model includes fewer trainable variables than LSTM.
• GRUs are computationally cheaper and faster than LSTMs.
At the input, a GRU unit takes an input $x_t$ and a cell state $h_{t-1}$, and at the output, it produces the cell state $h_t$ for the next GRU unit. The formulation of GRU relies on two intermediary matrices, one for the reset gate and one for the update gate. The architecture of GRU is shown in Fig. 1. It shows the reset and update gates. These gates control the flow of information through the GRU cell so that relevant information passes on to the next cell and useless information is discarded. This technique helps the GRU-RNN model to overcome the vanishing gradient problem.

Fig. 1. Architecture of GRU; (c) shows the update gate of GRU, which passes the information to the next GRU unit.
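For reference, one widely used formulation of the GRU cell can be written as follows. The weight and bias symbols $W$, $U$, and $b$, the logistic sigmoid $\sigma$, the element-wise product $\odot$, and the gate names $r_t$ (reset) and $z_t$ (update) are notation assumed for this sketch; some references swap the roles of $z_t$ and $1 - z_t$ in the last line.

```latex
\begin{aligned}
r_t &= \sigma\left(W_r x_t + U_r h_{t-1} + b_r\right) \\
z_t &= \sigma\left(W_z x_t + U_z h_{t-1} + b_z\right) \\
\tilde{h}_t &= \tanh\left(W_h x_t + U_h \left(r_t \odot h_{t-1}\right) + b_h\right) \\
h_t &= \left(1 - z_t\right) \odot h_{t-1} + z_t \odot \tilde{h}_t
\end{aligned}
```

In this form, $h_t$ is a gated interpolation between the previous state $h_{t-1}$ and the candidate state $\tilde{h}_t$, so the previous state can be carried forward almost unchanged when $z_t$ is close to zero.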

Successive-Halving
There are numerous classical approaches for the selection of hyper-parameters of learning models. Bayesian optimization methods are at the top of the list, with their probabilistic approach to finding the optimal configuration. For highly complex non-linear problems, they fail to optimize the selection of hyper-parameters, so they are integrated with heuristic approaches to work around this problem.
Successive-halving is a more recent approach. As its name suggests, it allocates resources to all candidate configurations and then evaluates their performance along with their time consumption. The better-performing half of the configurations moves to the next iteration, while the remaining ones are dropped. This iterative process continues until the best hyper-parameter configuration is found. In the beginning, successive-halving allocates uniform resources to all the configurations, say $B/n$ each, where $B$ is the total budget and $n \in \mathbb{Z}^{+}$ is the number of configurations. Over the iterations, it allocates exponentially more resources to the best configurations. The selection of $n$ is itself an optimization problem. For instance, if $n$ is small, more resources (time) will be allocated to each configuration, which may still not achieve the optimal design. Whereas, if $n$ is large, fewer resources will be assigned to each configuration, resulting in premature convergence.
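The halving loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `successive_halving`, the `evaluate(config, resource)` callback, and the toy loss used in the usage example are all assumptions introduced here.

```python
def successive_halving(configs, evaluate, min_resource=1, eta=2):
    """Return the configuration that survives repeated halving.

    configs      -- list of candidate configurations
    evaluate     -- hypothetical callback: evaluate(config, resource) -> loss
    min_resource -- resource given to every configuration in the first round
    eta          -- keep the best 1/eta of the configurations each round
    """
    survivors = list(configs)
    resource = min_resource
    while len(survivors) > 1:
        # Rank the current candidates by loss (lower is better).
        ranked = sorted(survivors, key=lambda c: evaluate(c, resource))
        # Keep the better fraction and drop the rest.
        survivors = ranked[: max(1, len(ranked) // eta)]
        # Survivors receive exponentially more resource in the next round.
        resource *= eta
    return survivors[0]


if __name__ == "__main__":
    # Toy example: the "configuration" is a number of hidden units and the
    # made-up loss is smallest at 448 units.
    candidates = list(range(64, 513, 64))
    toy_loss = lambda units, resource: abs(units - 448) + 1.0 / resource
    print(successive_halving(candidates, toy_loss))
```

With `eta=2` this is the plain halving scheme; a larger `eta` discards candidates more aggressively.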

Hyperband
Hyperband is an extension of Successive-Halving that addresses the "$n$ vs. $B/n$" trade-off. It computes the performance of the model for different values of $n$. The algorithm includes two nested loops: the inner loop performs successive halving for the given value of $n$, whereas the outer loop tries different values of $n$ to find the optimal value. Each pass of the outer loop is known as a "bracket," and each bracket utilizes $B$ resources.
The algorithm is shown in Fig. 2. It takes two inputs: $R$, the maximum resource allocated to one configuration, and $\eta$, which sets the proportion of configurations to discard on each iteration. Likewise, it includes $s_{\max} + 1$ different values of $n$.

Fig. 2. Hyperband Algorithm
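To make the two nested loops concrete, the sketch below enumerates the brackets and successive-halving rounds for given inputs. It follows the commonly published Hyperband schedule, with $R$ the maximum resource per configuration, $\eta$ the reduction factor, and $s_{\max} = \lfloor \log_\eta R \rfloor$; the function name `hyperband_schedule` and the example values $R = 81$, $\eta = 3$ are assumptions chosen for illustration, not settings reported in this paper.

```python
def hyperband_schedule(max_resource, eta=3):
    """List the (n_i, r_i) rounds of every Hyperband bracket.

    max_resource -- R, the maximum resource given to a single configuration
    eta          -- reduction factor; 1/eta of the configurations survive a round
    Returns one list of (number of configurations, resource each) per bracket.
    """
    # s_max = floor(log_eta(R)), computed with integers to avoid float round-off.
    s_max = 0
    while eta ** (s_max + 1) <= max_resource:
        s_max += 1

    brackets = []
    for s in range(s_max, -1, -1):          # outer loop: one bracket per s
        # Initial number of configurations: ceil((s_max + 1) * eta^s / (s + 1)).
        a = (s_max + 1) * eta ** s
        n = -(-a // (s + 1))
        rounds = []
        for i in range(s + 1):              # inner loop: successive halving
            n_i = n // eta ** i             # configurations still alive
            r_i = max_resource / eta ** (s - i)  # resource per configuration
            rounds.append((n_i, r_i))
        brackets.append(rounds)
    return brackets


if __name__ == "__main__":
    for rounds in hyperband_schedule(81, eta=3):
        print(rounds)
```

The first bracket starts many configurations on a small resource, while the last bracket trains a few configurations on the full resource, which is how the schedule hedges over different values of $n$.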
For the implementation of Hyperband, we employed keras-tuner. It takes different configurations of the training model, e.g., number of layers, number of hidden units in each layer, activation function, dropout percentage, and learning rate. The user also specifies the number of trials and the number of executions in each trial. Table 1 and Table 2 show the hyper-parameter configuration space and the optimal configuration of hyper-parameters for the GRU model. We tuned the HP-GRU model based on the validation loss; the model with the smallest loss is the best one to use. Fig. 3 shows that the worst model has a loss of 0.5275, whereas the best model has a validation loss of 0.1104, so the optimally configured model is almost five times better than the worst configured model.
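A tuning setup of this kind might look like the following keras-tuner sketch. The search ranges in `SEARCH_SPACE`, the helper names `build_model` and `make_tuner`, the Adam optimizer, the input shape of one time step with four features, and the values of `max_epochs`, `factor`, and `executions_per_trial` are all assumptions for illustration; the ranges actually searched are those of Table 1.

```python
# Illustrative search space (assumed; not the ranges of Table 1).
SEARCH_SPACE = {
    "units": {"min_value": 32, "max_value": 512, "step": 32},
    "dropout": {"min_value": 0.0, "max_value": 0.5, "step": 0.1},
    "activation": ["relu", "tanh", "sigmoid"],
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "max_gru_layers": 3,
}


def build_model(hp, timesteps=1, n_features=4, n_classes=4):
    """Build one candidate GRU classifier from a keras-tuner HyperParameters object."""
    # Imported lazily so this file can be loaded without Keras installed.
    import keras
    from keras import layers

    model = keras.Sequential()
    model.add(keras.Input(shape=(timesteps, n_features)))
    for i in range(hp.Int("gru_layers", 1, SEARCH_SPACE["max_gru_layers"])):
        model.add(layers.GRU(
            hp.Int(f"units_{i}", **SEARCH_SPACE["units"]),
            activation=hp.Choice(f"activation_{i}", SEARCH_SPACE["activation"]),
            return_sequences=True,
        ))
        model.add(layers.Dropout(hp.Float(f"dropout_{i}", **SEARCH_SPACE["dropout"])))
    model.add(layers.Flatten())
    model.add(layers.Dense(n_classes, activation="softmax"))
    model.compile(
        optimizer=keras.optimizers.Adam(
            hp.Choice("learning_rate", SEARCH_SPACE["learning_rate"])),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


def make_tuner(max_epochs=30, factor=3, executions_per_trial=1):
    """Create a Hyperband tuner that ranks trials by validation loss."""
    import keras_tuner as kt

    return kt.Hyperband(
        build_model,
        objective="val_loss",        # the smallest validation loss wins
        max_epochs=max_epochs,       # plays the role of the maximum resource
        factor=factor,               # plays the role of the reduction factor
        executions_per_trial=executions_per_trial,
        directory="hp_gru_tuning",   # hypothetical output folder
        project_name="wall_following",
    )

# Typical use, given arrays x_train, y_train, x_val, y_val:
#   tuner = make_tuner()
#   tuner.search(x_train, y_train, validation_data=(x_val, y_val))
#   best_model = tuner.get_best_models(num_models=1)[0]
```

Using `val_loss` as the objective mirrors the selection criterion stated above.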

Proposed Method
As mentioned above, the optimal configuration of the GRU model is selected through the hyperband algorithm. The details of hyper-parameters of each layer are given as follows.

Input Layer
The input layer consists of four input features extracted from the ultrasound sensors mounted on the wall-following robot, i.e., the front, left, right, and back sensor readings. The format of the input is (batch size, features) = (5455, 4).

Layer 1
The first layer includes 448 hidden units with a "relu" activation function followed by a 10% dropout layer, which means that the network will randomly discard 10% of hidden units before passing the cell state to the second layer.

Layer 2
The second layer includes 384 hidden units with a "relu" activation function followed by a 30% dropout layer, which means that the network will randomly discard 30% of hidden units before passing the cell state to the flatten layer. The flatten layer flattens the hidden units and connects them to the fully connected layer that follows.

Layer 3
The third layer is a fully connected layer with 96 hidden units and a "sigmoid" activation function, and it passes its output to the final output layer.

Output Layer
The final output layer consists of 4 possible outputs for the wall-following robot, i.e., Slight-Right-Turn, Move-Forward, Sharp-Right-Turn, and Slight-Left-Turn. To keep the outputs between 0 and 1, we used the "softmax" activation function.
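The layer-by-layer description above can be assembled into a Keras model roughly as follows. This is a sketch under stated assumptions: the function name `build_hp_gru`, the input shape of one time step with four features, the use of `return_sequences=True` on both GRU layers, and the Adam optimizer are not specified in the text and are chosen here only to make the example complete.

```python
# (units, activation, dropout rate applied after the layer) for the two GRU layers.
RECURRENT_LAYERS = [(448, "relu", 0.10), (384, "relu", 0.30)]
DENSE_UNITS, DENSE_ACTIVATION = 96, "sigmoid"
CLASSES = ["Slight-Right-Turn", "Move-Forward", "Sharp-Right-Turn", "Slight-Left-Turn"]


def build_hp_gru(timesteps=1, n_features=4):
    """Assemble the described stack: GRU, GRU, flatten, dense, softmax output."""
    # Imported lazily so this file can be loaded without Keras installed.
    import keras
    from keras import layers

    model = keras.Sequential()
    # Assumed input shape: each sample is a sequence of `timesteps` readings
    # of the four sensor features.
    model.add(keras.Input(shape=(timesteps, n_features)))
    for units, activation, rate in RECURRENT_LAYERS:
        model.add(layers.GRU(units, activation=activation, return_sequences=True))
        model.add(layers.Dropout(rate))
    model.add(layers.Flatten())
    model.add(layers.Dense(DENSE_UNITS, activation=DENSE_ACTIVATION))
    # One softmax probability per decision class.
    model.add(layers.Dense(len(CLASSES), activation="softmax"))
    model.compile(
        optimizer="adam",  # assumed; the optimizer is not stated in this section
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Calling `build_hp_gru().summary()` prints the resulting layer stack and parameter counts.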

The Dataset
The dataset was recorded with a SCITOS G5 robot navigating a room in the clockwise direction. The 24 sensors are attached to the robot's waist to measure its distance from the walls. The sampling rate is nine samples per second during the robot's four rounds of the room.
The readings of the 24 sensors are then compressed into four features, i.e., Front Sensor, Right Sensor, Left Sensor, and Back Sensor. Based on these features, the robot can take four decisions, known as classes, i.e., Slight-Right-Turn, Move-Forward, Sharp-Right-Turn, and Slight-Left-Turn. Two main data preprocessing techniques are employed, i.e., min-max normalization and conversion of the classes into numerical labels. All the data is normalized between 0 and 1 using min-max normalization, and since we used "Sparse Categorical Cross-entropy," we assigned a number to each class, i.e., Slight-Right-Turn = 0, Move-Forward = 1, Sharp-Right-Turn = 2, and Slight-Left-Turn = 3. A sample of the dataset is shown in Table 3. The dataset is divided into three portions, i.e., training, validation, and test. As mentioned earlier, the dataset contains 5455 samples, and it is divided as follows: the training data is 65% of the total data, the validation data is 10%, and the test data is 25% of the total dataset.
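The preprocessing steps described above can be sketched in plain Python as follows. The helper names `min_max_normalize` and `split_indices`, the random shuffle with a fixed seed, and the truncation used to turn the percentages into sample counts are assumptions of this sketch; the text does not state how the split was drawn.

```python
import random

# Integer labels stated in the text, as required by sparse categorical cross-entropy.
CLASS_TO_ID = {
    "Slight-Right-Turn": 0,
    "Move-Forward": 1,
    "Sharp-Right-Turn": 2,
    "Slight-Left-Turn": 3,
}


def min_max_normalize(rows):
    """Scale every feature column of `rows` into [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        # A constant column has no range, so it is mapped to 0.0.
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def split_indices(n_samples, train_frac=0.65, val_frac=0.10, seed=0):
    """Return shuffled index lists for the training, validation, and test sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)   # assumed: samples are shuffled first
    n_train = int(n_samples * train_frac)
    n_val = int(n_samples * val_frac)
    # Whatever remains (about 25%) becomes the test set.
    return (indices[:n_train],
            indices[n_train:n_train + n_val],
            indices[n_train + n_val:])


if __name__ == "__main__":
    train, val, test = split_indices(5455)
    print(len(train), len(val), len(test))
```

For 5455 samples, this truncation yields 3545 training, 545 validation, and 1365 test samples.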

Results and Discussion
In this section, we explore the accuracy of our HP-GRU model. Some additional parameters for the training phase are listed in Table 4. First, we discuss the accuracy of all the models tested during the selection of the best model through hyperband. The results are shown in Table 5, which lists the models from the best HP-GRU model to the worst. There are four performance metrics, i.e., validation loss, validation accuracy, test loss, and test accuracy. It can be seen that the best model, the first model, outperforms the rest in all metrics because hyperband manages to converge the hyper-parameters of the GRU model to the optimal configuration. It can also be observed that the validation accuracy of the HP-GRU models over the ten trials increased from 0.7707 to 0.9659, and likewise, the test accuracy increased from 0.7780 to 0.9857, an improvement of roughly 1.27 times. Likewise, the test loss decreased from 0.5169 to 0.0810, a reduction of roughly 6.4 times. We used the optimal HP-GRU model, the first model, for comparison with other state-of-the-art methods. For comparison, we used DFNN with Weight Sharing [26], DFNN (3 Hidden Layers) [26], FNN (1 Hidden Layer) [26], Gated Recurrent Unit (GRU) [26], and Long Short Term Memory (LSTM) [26]. The comparison is shown in Table 6. It shows that HP-GRU has a higher accuracy of 0.9857 and a lower loss of 0.0810 as compared to other methods.
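The improvement factors between the worst and best trials can be rechecked directly from the values reported above. The snippet below is only an arithmetic check; the dictionary keys are names chosen here for readability.

```python
# Metrics of the worst and best HP-GRU trials, as reported in the text.
worst = {"val_acc": 0.7707, "test_acc": 0.7780, "test_loss": 0.5169}
best = {"val_acc": 0.9659, "test_acc": 0.9857, "test_loss": 0.0810}

# Ratio of best to worst test accuracy (higher is better).
accuracy_gain = best["test_acc"] / worst["test_acc"]
# Ratio of worst to best test loss (how many times the loss shrank).
loss_reduction = worst["test_loss"] / best["test_loss"]

print(f"test accuracy gain:  {accuracy_gain:.3f}x")
print(f"test loss reduction: {loss_reduction:.2f}x")
```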

Conclusion
In this paper, we presented an optimally configured Gated Recurrent Unit (GRU) model with the hyperband algorithm to design the control framework of the wall-following robot. GRU is a variant of RNN used for time-series or sequence data. With the help of hyperband, we selected the optimal configuration of hyper-parameters of our GRU model. The proposed HP-GRU model is applied to a dataset from a SCITOS G5 robot with 24 sensors mounted. There are four decision classes, i.e., Slight-Right-Turn = 0, Move-Forward = 1, Sharp-Right-Turn = 2, and Slight-Left-Turn = 3, and the dataset is normalized before the training. The results show that HP-GRU has a mean accuracy of 0.9857 and a mean loss of 0.0810, and it is comparable with other deep learning algorithms.