Factors influencing open unemployment rates: a spatial regression analysis

ABSTRACT


Introduction
Indonesia, as a developing country, is actively engaged in comprehensive development initiatives, aiming to enhance its competitiveness and cultivate a prosperous, just, and sustainable society [1]. Central to this vision of an industrial society is the need to align economic growth with maximum labor absorption capacity [2]. However, unemployment, a multifaceted and global concern, often demonstrates a positive correlation with population growth, and its roots can be traced back to inadequate employment opportunities and individuals' reluctance to explore self-employment [3], which becomes more pronounced in regions with limited or non-existent job prospects [4].

Nonetheless, it is worth noting that even small benefits or income derived from newly forged professions can be redirected to support local communities [5], [6], underscoring the potential for community-driven solutions [7]. Indonesia's journey toward realizing a just, prosperous, and sustainable society is intricately tied to the challenge of addressing unemployment [8], [9]. This journey necessitates not only fostering economic growth but also promoting entrepreneurship and innovative approaches to job creation [10], [11]. Importantly, this challenge resonates with a broader global endeavor: the quest to harmonize economic development with equitable employment opportunities [12].
Academics employ a wide range of analytical techniques to comprehend and conceptualize these phenomena [13], and spatial regression, as advocated by scholars [14]-[17], has an established place in the academic literature. This technique finds applications in the fields of regional sciences, economics, real estate, and image processing [18]-[20]. To illustrate its adaptability, one study employed geographic regression analysis to estimate the pricing of flats in Surabaya [21]. Similarly, another study used spatial regression analysis to investigate the prevalence of diarrhea in the Tuban District of East Java [22].
Nevertheless, the effectiveness of linear regression in addressing unemployment in different regions of Indonesia in 2016 was limited by spatial constraints [23]. The conventional linear model exhibited inaccuracies because its errors were neither independent nor homogeneous [24], [25]. The use of spatial regression in geographical data analysis is increasingly gaining recognition due to its potential for enhanced precision in addressing such discrepancies [26]-[28]. This research employs spatial regression analysis to examine the spatial dimensions of unemployment.
The present study uses spatial regression analysis to address the existing gap in knowledge about unemployment in Indonesia during the year 2016. By incorporating geographical and spatial parameters, this approach enhances the precision of the model used to comprehend and forecast provincial unemployment patterns. This methodology extends the existing body of work on unemployment modeling and provides policymakers with an instrument for formulating effective policies to mitigate unemployment. The findings and methodologies of this study have the potential to provide valuable insights for other countries grappling with spatial unemployment challenges.

Linear Regression
Regression analysis is the study of correlations and mathematical models between dependent (Y) and independent (X) variables [29]. A linear regression model expresses the relationship between one dependent variable and one or more independent variables [30]. In general, this relationship can be expressed as (1).
In the context of regression analysis, the dependent variable is denoted as Y, and it is influenced by one or more independent variables [31], represented as Xi. Each Xi is associated with a regression coefficient, symbolized as βi, which quantifies the impact of the independent variable on the dependent variable. Additionally, the model incorporates an error term, represented as µ, which captures the variability not explained by the independent variables in the model.
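Equation (1) itself is not reproduced in this text. A standard form that is consistent with the symbols defined above would be the following, where the intercept β0 and the number of predictors k are assumptions made for illustration:

```latex
Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k + \mu \tag{1}
```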

Spatial Matrix Linkages
Spatial matrix linkages are a fundamental component of spatial analysis [32], typically expressed through a spatial weight matrix referred to as "matrix W." Various techniques for weighting this matrix are available. Researchers have proposed three distinct approaches for defining matrix W, encompassing contiguity, distance, and generality [33].
Contiguity-Based Matrix W: The matrix W derived from contiguity is rooted in the concept that spatial interactions predominantly occur between neighboring regions [34], [35]. This means that interactions primarily involve regions that share territorial boundaries, indicating a spatial connection due to proximity. Notably, the resulting matrix W is symmetrical, with its main diagonal consistently holding zero values. In this matrix, if W_mn is assigned a value of 1 to denote interaction between regions m and n, then W_nm also assumes a value of 1, reflecting the mutual nature of spatial connections in this context.
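The contiguity rule can be sketched in a few lines of Python with NumPy. This is an illustration only: the study itself used GeoDa, the four-region layout is hypothetical, and the function names and the row-standardization step are assumptions (a common convention, not something stated in the text).

```python
import numpy as np

def contiguity_matrix(regions, neighbors):
    """Build a binary contiguity matrix W from a neighbor list."""
    index = {name: i for i, name in enumerate(regions)}
    W = np.zeros((len(regions), len(regions)))
    for m, adjacent in neighbors.items():
        for n in adjacent:
            if m != n:  # the main diagonal stays at zero
                W[index[m], index[n]] = 1.0
                W[index[n], index[m]] = 1.0  # contiguity is mutual, so W is symmetric
    return W

def row_standardize(W):
    """Scale each row to sum to 1; rows without neighbors stay all zero."""
    sums = W.sum(axis=1, keepdims=True)
    return np.divide(W, sums, out=np.zeros_like(W), where=sums > 0)

# Hypothetical layout: four regions in a chain A - B - C - D
regions = ["A", "B", "C", "D"]
neighbors = {"A": ["B"], "B": ["C"], "C": ["D"]}
W = contiguity_matrix(regions, neighbors)
W_std = row_standardize(W)
```

Listing each shared border once is enough, because the function fills both W_mn and W_nm.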

Spatial Regression Model
Spatial regression, also known as the Spatial Autoregressive Moving Average (SARMA) model in matrix form [36], can be represented as (2)-(4).
In the context of spatial regression analysis, y represents a vector of the dependent variable with dimensions n×1. The matrix X contains the independent variables and has a size of n×(k+1). The vector β consists of the regression parameter coefficients, having a dimension of (k+1)×1. The coefficient parameter ρ indicates spatial lag dependency. The coefficient parameter λ indicates spatial lag on errors. Both u and ε are error vectors of size n×1. The weight matrix, represented by W, has a dimension of n×n. Here n stands for the number of observations or locations, while k denotes the count of independent variables, which can vary from 1 to l. I is the identity matrix with a size of n×n.
In this model, the term λWu introduces spatial structure into the error term u, while ε is the remaining error component. The SEM model can be considered as a linear regression model with spatially correlated errors.
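Equations (2)-(4) are not reproduced in this text. Given the symbols defined above, a common way of writing the general model with a single weight matrix W is sketched below; the assignment of equation numbers and the normality assumption on ε are assumptions made for illustration.

```latex
\begin{align}
y &= \rho W y + X\beta + u \tag{2} \\
u &= \lambda W u + \varepsilon \tag{3} \\
\varepsilon &\sim N(0, \sigma^2 I) \tag{4}
\end{align}
```

Under this form, setting λ = 0 gives the Spatial Autoregressive Model (SAR), and setting ρ = 0 gives the Spatial Error Model (SEM).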

Significance of Spatial Regression Parameters
Researchers highlight the principle that the base estimator, Maximum Likelihood, is asymptotically normal: as the sample size n increases, the sampling distribution of the estimator approaches a normal distribution [39], [40]. The significance testing of both regression (β) and autoregressive (ρ and λ) parameters is based on the variance of the error (σ²).
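Asymptotic normality is what allows a z-test on each estimated parameter. The sketch below assumes that the estimate and its standard error are already available from the fitted model; the function name and the input values are hypothetical and are not taken from the study.

```python
import math

def wald_z_test(estimate, std_error):
    """Return the z-statistic and two-sided p-value under asymptotic normality."""
    z = estimate / std_error
    # P(|Z| > |z|) for a standard normal Z, via the complementary error function
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Hypothetical coefficient and standard error
z, p = wald_z_test(estimate=0.5, std_error=0.25)
```

A parameter is judged significant at the 5% level when the returned p-value is below 0.05.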

Spatial Effects
This section addresses two kinds of spatial effects:
• Spatial Heterogeneity (Heteroscedasticity): Spatial heterogeneity refers to the diversity in effects between locations, indicating that each location possesses unique relationship structures and parameters [41]. Testing for spatial heterogeneity is conducted using the Breusch-Pagan test (BP test), and the modeling approach employed is Geographically Weighted Regression (GWR).
• Spatial Dependence: Spatial dependence arises from the presence of interdependencies in regional data [42]. This dependence aligns with Tobler's first law, which asserts that all things are related to one another, with closer things having a greater influence. Detection of spatial dependence within a model is carried out using Moran's I and Lagrange Multiplier (LM) statistics.
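As an illustration of how spatial dependence is measured, the following is a minimal sketch of the global Moran's I statistic in Python with NumPy. The study computed its statistics in GeoDa; the function name, the weight matrix, and the data values here are hypothetical.

```python
import numpy as np

def morans_i(values, W):
    """Global Moran's I for a variable observed on the regions indexed by W."""
    x = np.asarray(values, dtype=float)
    z = x - x.mean()      # deviations from the mean
    n = x.size
    s0 = W.sum()          # sum of all spatial weights
    return (n / s0) * (z @ W @ z) / (z @ z)

# Hypothetical chain of four regions, each touching only its immediate neighbors
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

clustered = morans_i([1, 2, 3, 4], W)    # similar values sit side by side
alternating = morans_i([1, 4, 1, 4], W)  # neighbors always differ
```

In this toy layout the clustered series gives a positive value and the alternating series a negative one. Under the null hypothesis of no spatial autocorrelation, the expected value of Moran's I is -1/(n-1) rather than zero.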

Data and Analysis Procedures
Data Collection: For this study, data for the year 2016 were collected from the official website of BPS (Statistics Indonesia). The dataset comprised the following metrics:
• Total Open Unemployment Rate: This key metric served as the indicator of unemployment at the provincial level. It provided insight into the labor market conditions specific to each region.
• Economic Growth Rate: Data on the economic growth rates of individual provinces were gathered and analyzed. These figures were used to evaluate economic performance and discern economic trends across regions.
• Human Development Index (HDI): The study incorporated HDI figures for each province, offering a holistic perspective on human development. HDI takes into account multiple factors, including education, health, and income.
• Severity of Poverty Index: The dataset included the Severity of Poverty Index for each province. This metric was used to assess the extent and intensity of poverty in different geographical areas.
• School Participation Rates: The study examined school participation rates across provinces, facilitating a deeper understanding of accessibility and participation levels within the education system. These data helped reveal the state of education in different regions.
Analysis Procedures: The analysis procedures adopted in this study were structured and systematic, drawing on the methodology outlined by [43] and supported by the GeoDa software. The analysis process consisted of the following steps:
• Thematic Map Exploration: The initial stage of analysis involved the exploration of thematic maps. These maps were used to identify patterns of distribution and dependencies among the variables. Additionally, scatter plots were generated to provide visual insight into the relationships between the independent variables (X) and the dependent variable (Y).
• Regression Modeling using Ordinary Least Squares (OLS): The study then estimated the parameters and assessed model significance. This phase established the baseline relationships between the variables under examination.
• Dependency/Correlation Analysis: The analysis then examined dependencies and correlations among the variables. This step aimed to gain a more comprehensive understanding of their interplay and the potential causal relationships that might exist.
• Spatial Effect Identification: To identify the presence of spatial effects within the data, the study employed statistical tests, including the Lagrange Multiplier (LM) test and Moran's I statistic. These tests, as detailed by Anselin (1988), were used to assess the spatial dynamics within the dataset.
• Spatial Modeling Process: The dataset was modeled using several techniques, including the Spatial Autoregressive Model (SAR), Spatial Error Model (SEM), and Spatial Autoregressive Moving Average (SARMA). These modeling techniques allowed for an exploration of spatial relationships and dependencies within the data.
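The OLS step in the procedure above can be sketched as follows. This is a minimal illustration in Python with NumPy, not the GeoDa workflow used in the study; the function name and the data are hypothetical.

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns coefficients, residuals, R-squared."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])  # prepend the intercept column
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    ss_res = residuals @ residuals
    ss_tot = ((y - y.mean()) ** 2).sum()
    return beta, residuals, 1.0 - ss_res / ss_tot

# Hypothetical data: one predictor observed in five regions
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.0])
beta, residuals, r_squared = fit_ols(X, y)
```

The residuals returned here are the quantity that spatial diagnostics such as Moran's I and the LM tests are typically applied to in the subsequent steps.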

Mapping Unemployment Rates
The visual representation of unemployment rates across Indonesian provinces reveals striking regional disparities, shedding light on the complex economic landscape of the nation. The map presented in Fig. 1 classifies Indonesian provinces into three distinct groups, each bearing its own set of socioeconomic implications. The spatial distribution of unemployment rates reflects the labor market dynamics in the country.
Group One (Light Yellow): Provinces in this group, represented by the light yellow color on the map, exhibit the lowest unemployment rates, with values ranging from 1.89 to 3.33. This cluster includes 11 provinces, among them Bali and West Papua. The prevalence of low unemployment rates in this group suggests relative economic stability and robust labor markets, offering a promising landscape for employment and economic growth.
Group Two (Yellow): The provinces in the second group, characterized by the yellow color on the map, encompass regions with moderately higher unemployment rates, ranging from 3.35 to 5.23. This group comprises 12 provinces, including West Kalimantan, Jambi, Maluku, and more. The presence of higher unemployment rates in this cluster indicates the existence of economic challenges that may require targeted interventions and labor market initiatives.
Group Three (Dark Brown): Represented by the dark brown color on the map, this group consists of provinces with the highest unemployment rates, spanning from 5.45 to 8.92. It comprises 11 provinces, including Aceh, Banten, and DKI Jakarta. The elevated unemployment rates in this group underscore the urgency for policymakers to address the underlying economic and labor market issues in these regions, aiming to stimulate economic growth and reduce unemployment.
The mapping of unemployment rates, as shown in Fig. 1, provides a valuable tool for policymakers and researchers alike. These spatial disparities offer crucial insights for designing region-specific economic interventions, targeted labor market strategies, and policy initiatives that aim to foster equitable economic development across the diverse landscape of Indonesia.

Spatial Patterns and Moran's Index: Identifying Spatial Autocorrelation
Our examination of spatial patterns in unemployment rates uses the Moran Index, a standard measure of spatial autocorrelation. The results offer insight into the dynamics of regional labor markets and carry implications for policymakers.
The results of the Moran Index analysis, depicted in Fig. 2, reveal a pattern of spatial autocorrelation. The observations are located primarily within quadrants I and III, which is evidence of positive spatial autocorrelation in unemployment rates across Indonesian provinces. The key takeaway is that provinces with elevated unemployment rates tend to cluster with neighboring provinces that likewise have high unemployment. In contrast, provinces with lower unemployment rates tend to cluster with neighbors that also have low unemployment. Fig. 2 reports the Moran Index results for each variable:
• Moran Index for the Dependent Variable, Y: 0.107149. This index describes the spatial autocorrelation pattern within the dependent variable, indicating the degree of clustering among provinces with similar unemployment rates. The positive value of 0.107149 points to positive spatial dependence.
• Moran Index for the Variable "Rate of Population Growth" (X1): -0.0904176. The negative value indicates negative spatial autocorrelation, meaning that neighboring provinces tend to have dissimilar population growth rates. This finding hints at the influential role of population dynamics in shaping regional unemployment disparities.
• Moran Index for the Human Development Index (HDI) (X2): -0.13779. The negative Moran Index implies that neighboring provinces tend to have dissimilar HDI values. This negative autocorrelation calls for a deeper exploration of the interplay between human development and unemployment.
• Moran Index for the Severity of Poverty Index (X3): 0.0190332. The Moran Index for the Severity of Poverty Index suggests a weak positive spatial autocorrelation, highlighting a slight tendency of provinces with similar poverty severity indices to cluster spatially.
• Moran Index for the School Participation Rate (X4): -0.0618879. The negative value indicates negative spatial autocorrelation, meaning that neighboring provinces tend to have dissimilar school participation rates.
These Moran Index results, presented in Fig. 2, describe the spatial pattern of each variable and the interplay between geographical proximity and regional factors in shaping unemployment rates. Policymakers are encouraged to draw on this understanding when crafting geographically targeted employment policies and interventions, ultimately addressing regional unemployment disparities and fostering equitable economic development across Indonesia.
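The quadrant reading of a Moran scatterplot can be made concrete with a short sketch. It assumes the usual convention in which the horizontal axis is the standardized value of a region and the vertical axis is the average standardized value of its neighbors; the function name and data are hypothetical, and Fig. 2 itself was produced in GeoDa.

```python
import numpy as np

def moran_quadrants(values, W):
    """Assign each region to a Moran scatterplot quadrant."""
    x = np.asarray(values, dtype=float)
    W = np.asarray(W, dtype=float)
    z = (x - x.mean()) / x.std()                  # standardized values
    row_sums = W.sum(axis=1, keepdims=True)
    W_std = np.divide(W, row_sums, out=np.zeros_like(W), where=row_sums > 0)
    lag = W_std @ z                               # average of the neighbors' values
    labels = []
    for zi, li in zip(z, lag):
        if zi >= 0 and li >= 0:
            labels.append("I (high-high)")
        elif zi < 0 and li >= 0:
            labels.append("II (low-high)")
        elif zi < 0 and li < 0:
            labels.append("III (low-low)")
        else:
            labels.append("IV (high-low)")
    return labels

# Hypothetical chain of four regions with steadily increasing values
W = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
labels = moran_quadrants([1, 2, 3, 4], W)
```

Points in quadrants I and III correspond to regions whose neighbors resemble them, which is the pattern described for the unemployment rate above.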

Regression Analysis: Unveiling Influential Factors
Our regression analysis examined the relationship between the independent variables and unemployment rates. The classic regression model revealed a significant finding: the variable "rate of population increase" (X1) demonstrated a statistically significant influence at the 5% level. This suggests that the rate of population growth has a noteworthy impact on unemployment rates in Indonesian provinces. The descriptive statistics are shown in Fig. 3, which gives an overview of the dataset and the fitted model. Notably, the R-squared value of 0.308859 indicates that approximately 31% of the variation in open unemployment in Indonesia is explained by the model, in which the rate of population increase (X1) is the only significant predictor. This finding is underscored by the significant coefficient of 1.86656 in Table 1, which further supports the relationship between population growth and unemployment rates.
Table 1 provides a detailed breakdown of the regression results for each variable:
• The "CONSTANT" term, with a coefficient of -29.6222, is the intercept of the regression model. While it does not reach statistical significance, it suggests the presence of unmodeled factors contributing to unemployment.
• "X1," representing the rate of population increase, exhibits a statistically significant coefficient of 1.86656 with a low p-value (0.00521), highlighting its crucial role in explaining variations in unemployment rates.
• "X2" (HDI), "X3" (Severity of Poverty Index), and "X4" (school attendance rate) do not show statistically significant relationships with unemployment rates, as their p-values exceed the 5% alpha level.
This analysis emphasizes the pivotal role of population growth in shaping unemployment dynamics. The regression model, Ŷ = -29.62 + 1.87X1, retains only the significant predictor, and the fitted model accounts for around 31% of the variability in open unemployment in Indonesia. The remaining variance is influenced by unmodeled factors, highlighting the complexity of the unemployment issue.
Incorporating spatial dependencies through the Spatial Autoregressive (SAR) method provides further insight into the factors influencing unemployment rates. The SAR analysis reinforced our earlier findings, identifying the variable "Rate of population growth" (X1) as statistically significant at the 5% level, thereby reaffirming the influential role of population dynamics on unemployment across provinces. However, consistent with the classic regression model, the "HDI" (X2), "Severity of Poverty Index" (X3), and "school participation rate" (X4) remained non-significant in the SAR model. Their respective p-values exceeded the alpha level, implying that these variables do not exert a statistically significant influence on unemployment rates when spatial dependencies are considered. The results of the spatial autoregressive model are shown in Table 2:
• The variable "W_Y," which represents spatial dependence, exhibited a coefficient of 0.0956295 with a p-value of 0.39671. While this variable did not reach statistical significance, its positive sign is consistent with spatial interaction in unemployment between neighboring provinces.
• The "CONSTANT" term had a coefficient of -27.3568, which is the model's intercept. It did not reach statistical significance but suggested the presence of unmodeled factors contributing to unemployment.
• "X1," the rate of population growth, displayed a statistically significant coefficient of 1.91945 with a low p-value (0.00069), reinforcing its crucial role in shaping unemployment dynamics in a spatial context.
The SAR model, expressed as Ŷ = -27.36 + 0.096Wy + 1.92X1, extends our understanding of unemployment rates by introducing spatial dependencies into the analysis. This model highlights that geographical proximity and spatial interactions play a role in shaping unemployment dynamics across Indonesian provinces, emphasizing the need for geographically targeted employment policies and interventions to tackle regional unemployment disparities effectively.
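Because Wy appears on the right-hand side of the SAR equation, fitted values cannot be read off term by term; they follow from solving (I - ρW)y = β0 + β1X1. The sketch below uses the coefficients reported in the text, while the three-region weight matrix, the X1 values, and the function name are hypothetical and serve only to show the mechanics.

```python
import numpy as np

# Coefficients as reported in the text for the SAR model
RHO = 0.0956295       # coefficient on W_Y
INTERCEPT = -27.3568  # CONSTANT
BETA_X1 = 1.91945     # coefficient on X1

def sar_fitted(x1, W_std):
    """Solve (I - rho * W) y = intercept + beta1 * x1 for y."""
    x1 = np.asarray(x1, dtype=float)
    trend = INTERCEPT + BETA_X1 * x1
    return np.linalg.solve(np.eye(x1.size) - RHO * W_std, trend)

# Hypothetical row-standardized weights for three regions in a chain
W_std = np.array([[0.0, 1.0, 0.0],
                  [0.5, 0.0, 0.5],
                  [0.0, 1.0, 0.0]])
x1 = [16.0, 17.0, 18.0]  # hypothetical X1 values
y_hat = sar_fitted(x1, W_std)
```

With a positive ρ, each fitted value is pulled toward the average of its neighbors, so a change in X1 in one region also shifts the fitted values of adjacent regions.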

Conclusion
Our recent study provides critical insights into regional variations in unemployment rates within Indonesia, unveiling three distinct clusters of provinces, each signifying varying degrees of economic stability and labor market dynamics. As our regression models demonstrate, the "rate of population increase" emerges as a significant factor influencing unemployment, underlining the need for policymakers to consider population dynamics in crafting targeted employment policies. The study also highlights the importance of spatial dependencies, emphasizing the role of geographical proximity and regional interactions in shaping unemployment dynamics. To effectively combat unemployment and foster economic stability and prosperity, policymakers must develop nuanced, region-specific strategies that consider these multifaceted aspects of unemployment.
In future research, it would be valuable to delve deeper into the underlying factors contributing to the regional disparities in unemployment rates, particularly within the high-unemployment cluster. Exploring the specific challenges faced by these regions, such as economic decline, skills gaps, and job market disruptions, could provide further insights for tailored policy interventions. Additionally, investigating the impact of various policy measures on reducing unemployment within different clusters of provinces would be instrumental in guiding evidence-based policymaking for Indonesia's diverse regions.

Fig. 1. Results of mapping the variable number of unemployed people in each province of Indonesia

Fig. 3. Descriptive statistics for 34 observations and 5 variables

Table 1. Results of Regression

Table 2. Results of Spatial Autoregressive

Table 2 offers an overview of the results derived from our Spatial Autoregressive analysis, reporting the coefficients, standard errors, z-values, and probabilities associated with each variable in the model.