Palm oil classification using deep learning

Science in Information Technology Letters
ISSN 2722-4139 Vol. 1, No. 1, May 2020, pp. 1-8

Different studies have applied several techniques, and most of them used machine vision. Currently, the image classification approach uses machine learning methods.
To improve their performance, we can collect larger datasets, learn more powerful models, and use better techniques for preventing overfitting. Therefore, this study aims to develop a new technique that can assist the classification of palm oil fruit and act like a human who is able to recognize the color and make a decision based on the selected category. The technique developed uses a convolutional neural network (CNN), as it is a good classifier in image classification and recognition tasks, as well as for the problem of fruit recognition [8].

Method
In addition, a CNN can learn automatically from the input and extract global features and contextual details, which slightly reduces error in image recognition.

Methodology Flow
This study uses samples from two categories, namely ripe and unripe (immature). This research identifies the color features of oil palm fruit, so an analysis of the texture and shape of oil palm bunches is not needed. The flow chart for identifying the maturity of oil palm fruit is shown in Fig. 1, which provides the detailed identification steps.

Data Collection and Preparation
The datasets for this study were collected from the Tomanggong Palm Oil Mill, Lahad Datu, Sabah. In this study, the standard dataset is used as a sample to test the effectiveness of the Convolutional Neural Network as a classifier. A sample of the original data for each bunch is presented in Fig. 2 and Fig. 3. The holdout method is used to validate the datasets. The data are divided into two sets, d0 and d1, which serve as the training set and the test set, respectively. The size of each set is arbitrary, although the test set is smaller than the training set. Training is performed on d0, and testing is performed on d1. The holdout set is also used to assess the algorithm's predictive ability and provides a final estimate of model performance after the model has been trained and validated [12].
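The holdout split described above can be illustrated with a minimal sketch. The function name `holdout_split`, the 20% test fraction, the fixed seed, and the file names are assumptions made for illustration only; the study does not report the split ratio it used.

```python
import random


def holdout_split(samples, test_fraction=0.2, seed=0):
    """Shuffle samples and split them into a training set d0 and a smaller test set d1."""
    if not 0.0 < test_fraction < 0.5:
        raise ValueError("test_fraction must be in (0, 0.5) so that d1 is smaller than d0")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded shuffle keeps the split reproducible
    n_test = max(1, int(len(shuffled) * test_fraction))
    d1 = shuffled[:n_test]  # test set
    d0 = shuffled[n_test:]  # training set
    return d0, d1


# Hypothetical file names and labels, used only to demonstrate the split.
samples = [(f"bunch_{i:03d}.jpg", "ripe" if i % 2 == 0 else "unripe") for i in range(10)]
d0, d1 = holdout_split(samples, test_fraction=0.2)
print(len(d0), len(d1))
```

Because no sample appears in both d0 and d1, the error measured on d1 is an estimate of performance on unseen images.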

Research Design and Implementation
CNN is used as a classifier of the ripeness of oil palm fruit. The essential intuition behind these frameworks is that a processing architecture based on a huge number of layered and massively interconnected simple units may be better suited than sophisticated algorithms to handle complex problems. The fundamental processing unit, the neuron, is exceptionally simple. It calculates its output activation by comparing the weighted sum of its inputs with a threshold and applying a suitable nonlinearity, as given in (1).
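As a rough illustration of the neuron described above, the following sketch computes a weighted sum of the inputs, subtracts a threshold, and applies a nonlinearity. The sigmoid is an assumed choice of nonlinearity, and the function name and example values are hypothetical; the text does not state which activation the study used.

```python
import math


def neuron_output(inputs, weights, threshold):
    """Output activation of a single neuron: nonlinearity(weighted sum - threshold)."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    # Sigmoid nonlinearity (an assumption made for this sketch)
    return 1.0 / (1.0 + math.exp(-(weighted_sum - threshold)))


# Illustrative values only: the weighted sum equals the threshold here,
# so the sigmoid sits at its midpoint.
activation = neuron_output(inputs=[1.0, 2.0], weights=[0.5, 0.25], threshold=1.0)
print(activation)
```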
CNN training requires input images and their labels, and it extracts image features automatically. The purpose of the training algorithm is to train the network layer by layer from the input so that the error between the network output and the desired output is minimized, that is, to improve output performance. Neural network architectures include an additional layer called the loss layer. This layer gives the network feedback on whether it identified the inputs correctly and, if not, how far off its guesses were. In this way it reinforces the right concepts during training. The CNN performance is validated on validation data that is not used for training, and training stops when the validation performance no longer improves. Fig. 4 is an example of a training set. In the training process, the mean squared error (MSE) of the fit on the orange line is 4, while the MSE of the fit on the green line is 9. The orange curve fits the training data too closely (it overfits), because its MSE increases by almost a factor of four between training and testing. The green curve overfits the training data much less, because its MSE increases by less than a factor of 2.
The technique used to test the data set is called the data augmentation technique. The testing process is shown in Fig. 5, which is a validation test of the model being built. Data points in the training set are excluded from the test (validation) set [13]. For each iteration, the data is separated into training sets and validation (test) sets. The calculation on the training set determines the various coefficients. The fitted model can then be tested on data outside the training set, which provides information about its prediction error. On this data, the MSE of the fit on the orange line is 15, and the MSE of the fit on the green line is 13.
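The comparison between the two curves can be checked with a short calculation. The sketch below uses the MSE values quoted in the text (4 and 9 on the training set, 15 and 13 on the test set); the helper names are hypothetical, and `mean_squared_error` is included only to show how an MSE value is obtained from predictions.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mse_increase(train_mse, test_mse):
    """Factor by which the error grows when moving from training data to test data."""
    return test_mse / train_mse


# MSE values quoted in the text for the two fitted curves.
orange = mse_increase(train_mse=4, test_mse=15)
green = mse_increase(train_mse=9, test_mse=13)
print(f"orange: x{orange:.2f}, green: x{green:.2f}")
```

A large increase from training to test error, as for the orange curve, is the usual sign of overfitting, even though that curve has the lower training error.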

Research Evaluation
Accuracy is used to evaluate the effectiveness of the algorithm in classifying images of ripe and unripe palm oil fruits, and therefore the effectiveness of CNN in categorizing palm oil ripeness. The accuracy is calculated by (2):

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (2)$$

where TP is the number of true positives and TN is the number of true negatives; TP and TN count the correct classifications. A false positive (FP) occurs when the algorithm incorrectly predicts the positive class, whereas a false negative (FN) occurs when it incorrectly predicts the negative class. As illustrated in Fig. 6, accuracy reflects the closeness of the measurement results to the true value, while precision reflects the repeatability or reproducibility of the measurement. In addition, classification accuracy depends on the size of the convolutional kernel and max-pooling kernel, the number of kernels in each convolutional layer, and the number of hidden units in the fully connected layer [2], [14].
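The accuracy measure can be sketched as a small function, assuming the standard definition in which the correct classifications (TP and TN) are divided by all classifications, with FN (false negatives) as the fourth count. The counts in the example are made up for illustration and are not results from this study.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct: (TP + TN) / (TP + TN + FP + FN)."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one prediction is required")
    return (tp + tn) / total


# Made-up confusion counts, for illustration only.
example = accuracy(tp=45, tn=40, fp=5, fn=10)
print(f"accuracy = {example:.2f}")
```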

Results
The Keras deep learning library was used to train the model on the images stored in the folder. When a test image is analyzed, the result is equal to 1, and in the code 1 corresponds to unripe. In this study, 5 epochs were used to determine the best accuracy for the training and testing datasets. Each epoch reports the loss, the accuracy, and the validation accuracy. After 5 epochs, the final training accuracy is 96% and the training loss is 0.5295. The validation accuracy is 97%, with a validation loss of 0.0476. If result[0][0] = 1, the prediction is unripe; otherwise it is ripe. In this study, the result is 1, which means the image belongs to the unripe class. Table 1 shows the training result at each epoch. This study successfully classified images of palm oil by detecting and differentiating the ripeness of oil palm fruit.

Fig. 8 presents the accuracy on the training and the validation (test) datasets. The accuracy curves suggest that the model has not over-learned the training dataset, as shown by its comparable skill on the two datasets. The model could be trained further to increase accuracy on both datasets.
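The decision rule on the prediction result can be sketched as follows. The function name `decode_prediction` is hypothetical, and the nested-list inputs stand in for the array a trained model would return; only the mapping of 1 to unripe and any other value to ripe comes from the text.

```python
def decode_prediction(result):
    """Map a model output of the form [[value]] to a ripeness label."""
    # Per the text: result[0][0] equal to 1 means unripe, anything else means ripe.
    return "unripe" if result[0][0] == 1 else "ripe"


# Stand-in outputs shaped like a single-image prediction.
print(decode_prediction([[1]]))
print(decode_prediction([[0]]))
```

Note that an exact comparison with 1 assumes the output has already been rounded or thresholded; a raw sigmoid output would usually be compared with 0.5 instead.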

Discussion
Convolutional neural networks are designed to learn from large amounts of data [15]. In this work, a classification algorithm for the maturity of palm oil fresh fruit bunches (FFB) has been implemented. This article proposes a machine learning framework for assessing oil palm fruit bunches using the Convolutional Neural Network algorithm. Maturity detection is still performed by human visual inspection, which suffers from low productivity and efficiency problems [6]. This project aims to determine and differentiate the color characteristics of oil palm fruit bunches.
This study successfully solved an image classification problem by detecting and differentiating the ripeness of oil palm fruit. The three objectives were achieved. The first objective was to pre-process the data set of palm oil fruit. Using this algorithm, the datasets were preprocessed to be suitable for the classification of palm oil fruit.
The second objective was to apply deep learning in classifying palm oil fruit. The CNN applied in this study can learn from the datasets and indicate the class to which an input image belongs, either ripe or unripe.
The third objective was to analyze the effectiveness of applying deep learning in classifying palm oil fruit. It was achieved in that the CNN can learn from the input and can effectively predict whether palm oil fruit is ripe or unripe, with an accuracy of 98%. These objectives were achieved by training a CNN on a small number of images of ripe and unripe palm oil fruit so that it learns to predict which class an image belongs to. The software used to analyze the image data is Python 3.2.5. This research provides benefits to industry, engineers, harvesters, appraisers, mill operators, plantation managers, small farmers, and the research community related to oil palm.
The result of this study can be compared with other research that used CNN for image classification. For example, Cheng et al. [16] achieved 96% accuracy in correctly classified samples and noted that 10 epochs are enough for successful training of the model. This study also compares the CNN algorithm with other methods previously used to classify palm oil ripeness. In [6], a study on oil palm fruit grading using a hyperspectral device and a machine learning algorithm reported a classification accuracy of more than 95% for all three types of oil palm fruit, whereas the CNN-based approach in this study achieved 96% accuracy, which is higher than that of the ANN model. The researchers conclude that the CNN algorithm is more effective in classifying images than the ANN algorithm.

Conclusion
This study has shown that image classification using the CNN algorithm achieves high accuracy. The technique was used to train and test on the images stored in a folder. However, given the limitations of the dataset and time, future research can add more data to improve the results. Furthermore, the prediction was tested on only a single image; confirming similar results on a larger number of test images would strengthen the findings. Overall, the result of this study is good, and the model can predict the ripeness class of a palm oil image. To improve the performance of this method, future work should obtain more data, because deep learning algorithms often perform better with more data and the quality of a model is generally constrained by the quality of its training data [17], [18]. In future research, the assessment system could be connected to the internet so that users can test the maturity of the fruit online.