Convolutional neural network (CNN) to determine the character of wayang kulit


Introduction
Wayang Kulit is a traditional Indonesian art form that UNESCO has designated a "Masterpiece of the Oral and Intangible Heritage of Humanity" [1], [2]. Wayang Kulit comprises many named figures. Each figure conveys a personality, self-identity, image, and set of qualities [3], [4], and in general it can serve as a symbol or example of an individual's character. Wayang Kulit characters can be divided into two categories: good (protagonist) and bad (antagonist). Most teenagers nowadays are not familiar with the Wayang Kulit characters. This is due to declining interest among teenagers in traditional Wayang performances, limited exposure to literature about Wayang Kulit, and limited coverage of Wayang-related subjects in schools. To spark the interest of the younger generation in Wayang, a range of solutions has been tried, including establishing a Wayang puppeteer studio [5], establishing a craftsman center [6], [7], and performing Wayang on stage in various locations. Another worthwhile effort is producing Wayang in more modern forms such as wayang hip hop [8], comics [9], games [10], [11], and animations [12]-[14].
Alongside these existing solutions, a technological approach is needed to make Wayang Kulit characters more recognizable. Computer vision, a subfield of artificial intelligence, is one potential approach: it investigates how a machine can identify an object that it observes. The Convolutional Neural Network (CNN) is among the methods used for detecting objects. Various studies on CNN, such as one conducted by [15] using CNN to detect faces using

Research Method
Black and white images of Wayang Kulit are used as data for this analysis. The images were acquired by downloading each shadow puppet one at a time from the Google image search engine in JPG (Joint Photographic Experts Group) format. A total of 100 images were downloaded and divided into two categories: 50 images were classified as protagonists, and the remaining 50 as antagonists. No duplicate images were included in this data collection phase.

In the data preprocessing stage, the images were manually categorized by a dalang (puppeteer), on the assumption that a protagonist faces left while an antagonist faces right. The images were also divided into training data and test data during this phase. Figure 1 shows examples of protagonist Wayang Kulit, labeled with numbers 1 to 50. Figure 2 shows examples of antagonist Wayang Kulit, labeled with numbers 51 to 100.

After the Wayang Kulit images had been categorized by character, they were distributed into scenarios. The purpose of the scenario distribution was to assess evenly how well the CNN method classifies an image object, in particular a Wayang Kulit image. The 50 shadow puppet images in each category were divided into two groups of 25 images each. The protagonist groups are referred to as A1 and A2, and the antagonist groups as B1 and B2. Several training and testing scenarios were carried out, as detailed in Table 1. The first and third scenarios use exactly the same data but differ in which part of the object is scanned: the first scenario scans the whole image, while the third scans only the head of the Wayang Kulit image.
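The grouping described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the image numbering (1 to 50 protagonist, 51 to 100 antagonist) and the group names come from the text, but the contiguous 25-image cut, the helper `split_groups`, and the train/test pairing shown at the end are assumptions made for the example.

```python
# Sketch only. The numbering (1-50 protagonist, 51-100 antagonist) and the
# group names A1, A2, B1, B2 come from the text. Cutting each category into
# contiguous blocks of 25 is an ASSUMPTION; the paper does not say how the
# images were assigned to groups.

def split_groups(first, last, names):
    """Split image numbers first..last into equal contiguous groups."""
    ids = list(range(first, last + 1))
    size = len(ids) // len(names)
    return {name: ids[i * size:(i + 1) * size] for i, name in enumerate(names)}

groups = {}
groups.update(split_groups(1, 50, ["A1", "A2"]))    # protagonist images
groups.update(split_groups(51, 100, ["B1", "B2"]))  # antagonist images

# HYPOTHETICAL pairing for illustration: train on A1 + B1, test on A2 + B2.
# The actual pairings per scenario are those listed in Table 1.
train_ids = groups["A1"] + groups["B1"]
test_ids = groups["A2"] + groups["B2"]

for name, ids in groups.items():
    print(name, ids[0], "to", ids[-1], "count:", len(ids))
```

Keeping the training and test sets disjoint, as in this sketch, ensures that a scenario measures how the model handles images it has not seen.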

Table 1 (columns: Scenario, Testing, Training)

Figure 3 shows the system architecture, which consists of several stages: image for detection, neuron input, convolution + ReLU + pooling, fully connected layer, and classification. In the first stage, image for detection, every image is resized to 640x640 pixels in RGB (Red, Green, Blue), that is, with three color channels. As a result, the first layer has 1,228,800 input neurons (640x640x3). Each input neuron holds a value between 0 and 1.

The next stage is the convolution process, whose purpose is to extract features from the input image. The convolution layer comprises several neurons that form a filter, a matrix with a length and a height in pixels. Two matrices take part in the filtering process: the input matrix and the kernel matrix. The input matrix values are obtained from the color level of each pixel, whereas the kernel matrix values are set according to the researcher's needs. ReLU activation is applied together with the convolution: it determines the output of the convolution products by setting negative values to zero.

The stage after convolution and ReLU is pooling, which aims to produce a new, smaller output matrix. The pooling carried out in this research applies a 3x3 max-pooling filter with stride 1 to a 5x5 matrix, which produces a 3x3 output matrix. These processes are often repeated to obtain the desired output matrix before proceeding to the fully connected layer. The fully connected layer connects all the neuron outputs of the previous process to the next layer of neurons so that images can be classified.

The confusion matrix approach was used to test the classification results [20]. The confusion matrix is a table that shows how well an algorithm's outputs match the actual classes.
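The convolution, ReLU, and pooling stages of the architecture described above can be illustrated with a small pure-Python sketch. It is not the authors' implementation: the toy 7x7 input, the edge-detecting kernel values, and the function names `conv2d_valid`, `relu`, and `max_pool` are assumptions chosen for the example. Only the 640x640x3 input size and the 3x3 pooling window with stride 1 on a 5x5 matrix come from the text.

```python
# Sketch only: toy data and hypothetical helper names, not the paper's code.

def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1).

    As in most CNN libraries, this is technically a cross-correlation:
    the kernel is not flipped before the element-wise products are summed.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def relu(fmap):
    """Replace negative activations with zero."""
    return [[max(0.0, v) for v in row] for row in fmap]

def max_pool(fmap, size=3, stride=1):
    """Keep the maximum of each size x size window."""
    out_h = (len(fmap) - size) // stride + 1
    out_w = (len(fmap[0]) - size) // stride + 1
    return [
        [
            max(fmap[r * stride + i][c * stride + j]
                for i in range(size) for j in range(size))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

# Input layer size stated in the text: 640 x 640 pixels, 3 color channels.
input_neurons = 640 * 640 * 3

# ASSUMED toy input: one 7x7 channel with values scaled to [0, 1].
image = [[((r * 7 + c) % 5) / 4 for c in range(7)] for r in range(7)]
# ASSUMED kernel: a simple vertical-edge filter.
kernel = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]

feature_map = relu(conv2d_valid(image, kernel))    # 7x7 input gives 5x5
pooled = max_pool(feature_map, size=3, stride=1)   # 5x5 input gives 3x3

print(input_neurons)
print(len(feature_map), "x", len(feature_map[0]))
print(len(pooled), "x", len(pooled[0]))
```

The output size of each stage follows from (input size - window size) / stride + 1, which is why a 3x3 window with stride 1 reduces the 5x5 matrix to 3x3.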
In this study, the character of Wayang Kulit is determined by image classification, which divides Wayang Kulit into two categories: protagonist and antagonist. Table 2 shows the confusion matrix, which is used for performance analysis. A True Positive (TP) is a positive sample correctly classified as positive. A True Negative (TN) is a negative sample correctly classified as negative. A False Positive (FP) is a negative sample incorrectly classified as positive. A False Negative (FN) is a positive sample incorrectly classified as negative. These values are used to calculate the accuracy, precision, recall, and F-measure as in the following equations.

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

$$\text{Precision} = \frac{TP}{TP + FP}$$

$$\text{Recall} = \frac{TP}{TP + FN}$$

$$\text{F-measure} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$

Table 3 shows the image classification results for the four scenarios. The maximum number of iterations was set to 25000 in all scenarios. The average loss value obtained across all cases was 0.523. A time of 700 sec was recorded for Scenario 1, and 800 sec for each of Scenario 2, Scenario 3, and Scenario 4. The overall rating across all cases was 99.5 percent, which means that the CNN can differentiate protagonist (Wayang Baik) from antagonist (Wayang Buruk). Figure 4 shows an example of the image detection result. Both scenarios show that the CNN recognizes the picture as a 100% good character, even though the scanned part is different.

The CNN performance is measured using accuracy, precision, recall, and F-measure. Figure 5 shows the CNN accuracy. Red, yellow, green, and blue indicate the scenarios in order: S1, S2, S3, and S4. Figure 5 shows that the scenarios earned an average accuracy of 92 percent. The accuracy value indicates how closely the classification results match the real values. However, there is no difference in accuracy between full-image detection and head-only detection, so other indicators are needed for better justification.
Figure 7 shows that the maximum recall value, 97 percent, is obtained in Scenarios 1 and 3, followed by Scenario 2 with 88 percent and Scenario 4 with 87 percent. The overall recall value was 92.25 percent. Recall measures the proportion of actual positive data that the system correctly predicts as positive.
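The four measures used in this section can be computed directly from confusion-matrix counts. The sketch below is illustrative only: the function name `classification_metrics` and the TP, TN, FP, and FN counts are hypothetical and are not taken from Table 2 or Table 3. The F-measure is assumed to be the balanced F1 form. The per-scenario recall values at the end are the ones reported above, and averaging them reproduces the overall recall of 92.25 percent.

```python
# Sketch only: hypothetical function name and made-up counts.

def classification_metrics(tp, tn, fp, fn):
    """Return accuracy, precision, recall and F-measure from the four counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # ASSUMPTION: F-measure taken as the balanced F1 score.
    f_measure = (
        2 * precision * recall / (precision + recall)
        if (precision + recall) else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }

# HYPOTHETICAL counts for a 50-image test set (not from the paper).
m = classification_metrics(tp=22, tn=24, fp=1, fn=3)
for name, value in m.items():
    print(f"{name}: {value:.4f}")

# Recall per scenario as reported in the text (percent).
recall_by_scenario = {"S1": 97, "S2": 88, "S3": 97, "S4": 87}
mean_recall = sum(recall_by_scenario.values()) / len(recall_by_scenario)
print(mean_recall)  # 92.25
```

Guarding each division against a zero denominator keeps the function usable when a scenario produces no positive predictions.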

Conclusion
The system succeeded in classifying Wayang Kulit characters as protagonist or antagonist using the Convolutional Neural Network method. Across all scenarios, the classification obtained an average accuracy of 92 percent, an average precision of 92.5 percent, an average recall of 92.25 percent, and an average F-measure of 91.75 percent. In future work, the model could be trained on a larger and more diverse set of higher-resolution images to achieve better classification results. Another direction is to classify each Wayang by name rather than by character. This development may help teenagers to know their culture better.