Mapping crime determinants in Central Java: an in-depth exploration through local spatial association and regression analysis


Introduction
Crime is a multifaceted issue that affects societies worldwide, impacting not only individuals' wellbeing but also the overall quality of life within a region [1], [2]. The quest for economic development, while an essential goal for many communities, often accompanies the challenge of addressing socioeconomic disparities [3], [4], which, if left unaddressed, can contribute to an increase in crime rates. Central Java, an Indonesian province, has grappled with a persistently high crime rate [5]. To effectively combat this issue, it is paramount to comprehend the intricate web of factors influencing criminal activities [6]. This research endeavors to delve into the dynamics of crime within Central Java, focusing on the influences of population [7], unemployment [8], [9], poverty [8], [10], Age-Dependency Ratio (APS) [11], [12], and the Relative Location Quotient (RLS) [13], [14].
In recent years, there have been fluctuations in the number of reported crimes in Central Java [15], but it consistently ranks among the top ten Indonesian provinces with the highest crime rates [16]. As crime is defined as any conduct that breaches both the law and social norms [17], [18], the government and law enforcement agencies must pay close attention to this issue [19]. To address crime effectively in Central Java, it is crucial to understand the elements that contribute to its prevalence [20]. Spatial regression analysis [21], particularly the Spatial Autoregressive Model (SAR) [22], is employed in this research to discern the relationships between these factors and their spatial implications [23]-[25]. Although much research has been conducted on the factors influencing crime in Indonesia, the application of SAR analysis in the context of Central Java remains relatively rare [26]. Thus, this study fills a crucial gap in understanding the local dynamics of crime within the province.
The primary objective of this research is to provide a comprehensive picture of the crime situation in Central Java and to uncover the factors and spatial effects that influence crime within the region. By conducting an in-depth spatial analysis, we aim to shed light on the unique characteristics of Central Java's crime landscape [27], [28]. The findings of this study are expected to serve as a valuable resource for policymakers, law enforcement agencies, and researchers seeking to develop informed strategies for crime reduction and the enhancement of public safety in Central Java. This investigation represents a significant step in comprehending the root causes of crime within the region, aiming to inform evidence-based strategies for reducing criminal activities and fostering a safer, more prosperous Central Java.
The structure of this article is as follows: Section 2 elaborates on the research methodology, detailing data sources, research phases, and the application of statistical techniques, including the Local Indicator of Spatial Association (LISA) and the Spatial Autoregressive Model (SAR). Section 3 presents and discusses the research findings, encompassing descriptive statistics, spatial autocorrelation, the LISA index, and the advantages and limitations of SAR models. Lastly, Section 4 summarizes the research findings, reiterating the influential factors in Central Java's crime landscape and emphasizing the spatial dependencies and predictive capacity of our model.

Data and Data Sources
The foundation of this research lies in the comprehensive dataset obtained from the Central Java Central Statistics Agency (BPS). This dataset serves as the backbone of our analysis, providing the necessary variables to understand the intricate web of factors that contribute to crime in Central Java. The variables within the dataset include:
• Population (X1): Population is a fundamental component of the analysis [29]. High population areas might experience different crime dynamics compared to areas with lower populations. Understanding how population relates to crime rates is crucial.
• Unemployment (X2): The unemployment rate is another critical factor [30]. High unemployment rates can lead to economic instability, potentially driving individuals towards criminal activities.
• Poverty (X3): Poverty often correlates with higher crime rates [31]. The economic struggles faced by impoverished communities can create an environment conducive to criminal behavior.
• Age-Dependency Ratio (APS) (X4): The age-dependency ratio is an indicator of the proportion of dependent individuals (such as children and the elderly) to the working-age population [32]. Variations in this ratio can shed light on societal vulnerabilities that may influence crime.
• Relative Location Quotient (RLS) (X5): The RLS is a measure of the concentration of a particular industry in Central Java compared to the national average [33], [34]. Variations in the RLS can reveal economic disparities that might affect crime.
The data sources, primarily provided by the BPS, guarantee a robust and reliable foundation for this research. The data collection process adheres to rigorous standards, ensuring accuracy and representativeness. This rich dataset allows for a multi-dimensional analysis, uncovering the intricate relationships between these variables and crime rates within Central Java.

Analytical Techniques
This section outlines the analytical strategies utilized to explore the intricate dynamics of crime in Central Java, emphasizing the adoption of rigorous and advanced methodologies.
• Descriptive Analysis: At the outset, the research conducts a comprehensive descriptive analysis, aiming to illuminate the essential features and attributes of the dataset [35]. This foundational exploration involves scrutinizing the central tendencies, dispersion, and distributions of critical variables such as population, unemployment, poverty, age-dependency ratio, and relative location quotient. By investigating these parameters, the research establishes an initial understanding of the socio-economic landscape of Central Java, facilitating the formulation of preliminary insights and hypotheses concerning the relationship between these factors and crime rates.
• Spatial Analysis: An integral component of the research, spatial analysis serves as a lens through which the relationships between variables and crime rates in Central Java are examined [36], [37]. Global spatial autocorrelation is initially evaluated using the Moran Index, enabling the identification of broad spatial patterns within the dataset [38]. This assessment aims to determine whether specific regions in Central Java exhibit significant spatial clustering of high or low crime rates. Subsequently, local spatial analysis, conducted through the Local Indicator of Spatial Association (LISA) Index, uncovers localized crime hotspots and cold spots. This examination provides valuable insights into specific districts or cities that display distinctive crime dynamics and their spatial correlations with neighboring areas.
• Spatial Regression Analysis: Central to the research's analytical framework is the use of the Spatial Autoregressive Model (SAR) for spatial regression analysis [39]-[41]. The SAR model facilitates the exploration of spatial dependencies, acknowledging the influence of both intrinsic characteristics and those of neighboring regions on crime rates. Particularly suited for investigating complex spatial patterns in crime, the SAR model enables an in-depth examination of how the socio-economic variables of one area impact crime rates not only within that region but also in adjacent areas. By leveraging the SAR model, the research aims to unravel the interplay between socio-economic factors and the spatial distribution of crime in Central Java.
The chosen analytical techniques, ranging from comprehensive descriptive analysis to advanced spatial regression modeling, form a robust foundation for the subsequent presentation and discussion of research findings in Section 3.

Analytical Techniques
This section elaborates on the statistical tools used to examine the underlying dynamics of crime in Central Java. The research employs a suite of rigorous techniques to uncover the relationships between socio-economic factors and crime rates in the region [42].
• Descriptive Statistics: The research begins by employing descriptive statistics to gain insights into the fundamental characteristics of the dataset [43]. Through measures of central tendency, variability, and distribution, this initial analysis examines the properties of critical variables, including population, unemployment, poverty, age-dependency ratio, and relative location quotient. Descriptive statistics not only provide an overview of the dataset but also lay the groundwork for formulating hypotheses regarding the potential influences of these variables on crime rates.
• Global Spatial Autocorrelation: The assessment of global spatial autocorrelation [44] is pivotal in understanding the broader spatial patterns of crime in Central Java. The Moran Index, a widely recognized metric, is used to gauge the extent of spatial clustering within the dataset. By identifying areas with similar crime rates and those exhibiting high or low clustering, this analysis reveals the overarching spatial dynamics of crime. This initial investigation is critical in recognizing regions with distinct crime characteristics, which sets the stage for further exploration.
• Local Spatial Autocorrelation: Delving deeper into spatial relationships, the research leverages the Local Indicator of Spatial Association (LISA) Index [45]. This tool uncovers local crime hotspots and cold spots, pinpointing specific districts or cities with remarkable spatial dependencies. By identifying regions where crime rates are significantly correlated with those of neighboring areas, the LISA Index (1) provides essential insights into localized crime dynamics. This understanding of spatial correlations lays the foundation for unraveling the factors influencing crime at the local level.
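$$I_i = \frac{x_i - \bar{x}}{\sigma_x^2} \sum_{j=1}^{n} w_{ij}\,(x_j - \bar{x}) \qquad (1)$$
where $x_i$ is the observation value at the i-th location, $x_j$ is the observation value at the j-th location, $\bar{x}$ is the average value of the observation variable, $w_{ij}$ is the spatial weighting between locations i and j, and $\sigma_x$ is the standard deviation of the variable x.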
• Spatial Regression Modeling: At the core of the research's analytical approach lies the Spatial Autoregressive Model (SAR) [46], [47]. This model is designed for investigating complex spatial patterns and dependencies. By considering the influence of not only local socio-economic variables but also those of neighboring regions, the SAR model uncovers the interplay between these factors and crime rates. It offers a comprehensive framework for exploring how the characteristics of one area impact crime rates within that region and in adjacent areas. The SAR model, renowned for its capacity to disentangle complex spatial relationships (2), is a powerful tool for revealing the nuanced dynamics of crime in Central Java.
$$y = \rho W y + X\beta + \varepsilon \qquad (2)$$
with the assumption $\varepsilon \sim N(0, \sigma^2 I)$. From (2) we get:
$$y = (I - \rho W)^{-1}(X\beta + \varepsilon)$$
In this context, $y$ represents the response variable, $X$ stands for the matrix of explanatory variables, $\beta$ is the vector of regression coefficients, $W$ denotes the matrix of spatial weights, and $\rho$ signifies the spatial lag coefficient. It is important to note that this model assumes an autoregressive process exclusively on the response variable.
These statistical tools collectively allow the research to explore the dynamics of crime at both a global and a local scale. The subsequent section, Section 3, will present the research findings, offering a comprehensive analysis of the influences of population, unemployment, poverty, age-dependency ratio, and relative location quotient on crime rates in Central Java.
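To make the structure of the SAR model concrete, the following minimal Python sketch solves the noise-free part of the model, y = ρWy + Xβ, by fixed-point iteration. It is an illustration only, not the estimation procedure used in this study: the three-region weight matrix, the values standing in for Xβ, and the function name `solve_sar` are hypothetical, and W is assumed to be row-standardized so that the iteration converges whenever |ρ| < 1.

```python
# Hypothetical row-standardized weights for three regions on a line:
# region 0 borders region 1, region 1 borders 0 and 2, region 2 borders 1.
W = [
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 1.0, 0.0],
]


def solve_sar(xb, weights, rho, tol=1e-12, max_iter=10_000):
    """Solve y = rho * W y + xb by repeated substitution.

    xb      : list with the value of X @ beta for each region (assumed given)
    weights : row-standardized spatial weight matrix (list of lists)
    rho     : spatial lag coefficient, assumed to satisfy |rho| < 1
    """
    n = len(xb)
    y = list(xb)  # start from the non-spatial prediction
    for _ in range(max_iter):
        # Spatial lag: weighted average of the neighbors' current values.
        wy = [sum(weights[i][j] * y[j] for j in range(n)) for i in range(n)]
        y_new = [rho * wy[i] + xb[i] for i in range(n)]
        if max(abs(a - b) for a, b in zip(y_new, y)) < tol:
            return y_new
        y = y_new
    raise RuntimeError("no convergence; check |rho| < 1 and row-standardized W")


# Illustrative values only; rho is the estimate reported later in the article.
xb_example = [100.0, 50.0, 80.0]
y_example = solve_sar(xb_example, W, rho=-0.58578)
print(y_example)
```

When every region has the same Xβ = c, the solution is c / (1 - ρ) in every region, which shows how the spatial lag term rescales the non-spatial prediction. With a negative ρ, as estimated in this study, a region's value is pulled downward when its neighbors' values are high.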

Descriptive Statistics
In this section, we analyze the descriptive statistics for key variables to deepen our understanding of the socio-economic landscape and its relationship with crime rates in Central Java (Fig. 1).
Population: The population data reveals significant variations across districts and cities within Central Java (Table 1). Semarang City, the province's capital, has a high population of 1,653,524, which is considerably larger than many other areas. Understanding the distribution of population is crucial as it can be closely associated with crime rates. High population density can lead to increased opportunities for both criminal activity and victimization.
Unemployment Rates: Examining the unemployment rates across the region, we find that Banyumas Regency has the lowest rate at 6%, while Semarang City has an unemployment rate of 9.57%. Unemployment can be a contributing factor to crime as individuals without employment opportunities may resort to illegal activities.
Poverty Levels: Poverty is another significant variable. In Banyumas Regency, there are 211,650 individuals living below the poverty line, whereas in Surakarta City, this number is significantly lower at 47,030. High poverty levels often correlate with an increased propensity for crime, as individuals facing economic hardships may resort to criminal activities as a means of survival.
Age-Dependency Ratio (APS): APS, representing the proportion of dependent individuals, varies across districts. Surakarta City has the highest APS at 76.25, indicating a higher dependency ratio. A high APS can exert pressure on the working-age population to support dependent individuals, which can potentially contribute to socio-economic stress and, subsequently, crime rates.
Relative Location Quotient (RLS): RLS is another critical variable, and it ranges from 6.97 to 10.69 across districts. RLS measures the concentration of employment in a specific industry relative to the national average. A higher RLS can signify economic specialization. This could mean an area's economy is largely reliant on a specific industry, which, if disrupted, may lead to economic hardships and, indirectly, higher crime rates.
Crime Rates: The number of crimes reported varies across districts, with Semarang City consistently reporting the highest crime rate. This distribution raises important questions about the socio-economic factors and spatial patterns influencing crime. The high crime rates in these districts demand in-depth analysis to determine whether there are spatial dependencies and what factors contribute to these patterns.
This analysis of descriptive statistics forms the foundation for our spatial regression analysis, enabling us to understand the interplay between population, unemployment, poverty, dependency ratios, location quotients, and crime rates in Central Java. Further exploration and modeling are needed to provide actionable insights for policymakers and law enforcement agencies to develop strategies aimed at reducing crime and enhancing the well-being of Central Java's residents.

Testing Spatial Autocorrelation
In this section, we test for spatial autocorrelation, which is a critical step in understanding the underlying spatial patterns of crime in Central Java. We employ two key tests, Moran's Index and the Local Indicator of Spatial Association (LISA) Index, to explore spatial dependencies in the crime data.
Moran's Index: Moran's Index is employed to determine the presence of spatial autocorrelation, which indicates whether crime rates in one region are related to those in neighboring regions. This test allows us to identify the degree of spatial clustering or dispersion in Central Java. Moran's Index is calculated as -0.07008243. The null hypothesis (H0: no spatial autocorrelation) is rejected, which establishes that spatial autocorrelation exists in the crime data. This result implies that crime rates in one district are related to the rates in neighboring districts, suggesting the presence of spatial patterns in criminal activities.
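As a point of reference for how a global Moran's Index is obtained, the sketch below computes it in plain Python from a list of values and a spatial weight matrix, using the standard definition I = (n / S0) · Σ_i Σ_j w_ij (x_i − x̄)(x_j − x̄) / Σ_i (x_i − x̄)². The four-region layout and the crime counts are hypothetical toy data, not the study's dataset, and the function names are illustrative. The value n = 35 in the last line assumes the analysis covers Central Java's 35 regencies and cities, which the text does not state explicitly.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and an n x n weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]               # deviations from the mean
    s0 = sum(sum(row) for row in weights)          # sum of all weights
    cross = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    ss = sum(d * d for d in dev)                   # total squared deviation
    return (n / s0) * (cross / ss)


def expected_i(n):
    """Expected value of Moran's I under the null of no autocorrelation."""
    return -1.0 / (n - 1)


# Hypothetical toy layout: four regions in a row, binary contiguity weights.
W = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

clustered = [10, 12, 30, 32]      # similar values sit next to each other
alternating = [10, 30, 10, 30]    # neighbors always differ

print(morans_i(clustered, W))     # positive: similar neighbors
print(morans_i(alternating, W))   # negative: dissimilar neighbors
print(expected_i(35))             # reference value under H0, assuming n = 35
```

The index is judged against its expected value under the null hypothesis, −1/(n − 1), rather than against exactly zero. A negative index, such as the one reported above, points toward dissimilar values in neighboring districts rather than toward clusters of similar values.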
LISA Index: To gain a deeper understanding of the spatial patterns revealed by Moran's Index, we employ the Local Indicator of Spatial Association (LISA) Index. The LISA Index is applied to specific districts and cities (Table 2), providing us with insights into the local spatial clusters of high or low crime rates. The LISA Index results highlight that Banyumas Regency, Wonosobo, and Tegal City exhibit spatial dependencies in their crime rates. These areas are characterized by a strong local association with neighboring districts, indicating that high or low crime rates in these regions are not isolated incidents but rather part of larger spatial patterns.
Cluster Map: The Cluster Map (Fig. 2) visually represents the findings from the LISA Index, shedding light on the spatial relationships among districts. It identifies three significant areas: one cluster with high crime rates (Banyumas Regency and Tegal City) and one with low crime rates (Wonosobo). The presence of such clusters implies that there are localized hotspots of criminal activities within Central Java, which may be influenced by shared socio-economic factors or community dynamics.
By conducting these spatial autocorrelation tests, we have uncovered the existence of spatial patterns in crime across Central Java. This knowledge allows us to move forward with a spatial regression analysis using the Spatial Autoregressive Model (SAR). The identified spatial dependencies are essential for constructing a robust model that can capture the influences of neighboring districts on crime rates. This, in turn, contributes to a more accurate understanding of the socio-economic factors affecting crime in the region.

LISA Index
In this section, we provide a more detailed analysis of the Local Indicator of Spatial Association (LISA) Index results, which offer insights into the local spatial patterns of high or low crime rates in Central Java. These findings are crucial for identifying specific regions with significant spatial dependencies and understanding the factors that may contribute to these localized patterns.
The LISA Index analysis covers all districts and cities in Central Java, and it reveals the presence of spatial autocorrelation and the degree of clustering in crime rates. This analysis helps us identify regions whose crime patterns are related to those of neighboring areas. The index values are calculated for each district or city and indicate whether it is part of a spatial cluster of high or low crime rates.
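The per-district calculation described above can be sketched in plain Python. The example follows the common local Moran form, I_i = (x_i − x̄) / σ_x² · Σ_j w_ij (x_j − x̄), and also labels each region by the sign of its own deviation and of its neighbors' weighted deviation. The four regions, their values, the row-standardized weights, and the function names are hypothetical; significance testing (typically done by random permutation) is omitted.

```python
def local_morans_i(values, weights):
    """Local Moran's I for every region, using the population variance."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    var = sum(d * d for d in dev) / n              # sigma_x squared
    result = []
    for i in range(n):
        # Weighted sum of the neighbors' deviations (the spatial lag).
        lag = sum(weights[i][j] * dev[j] for j in range(n))
        result.append(dev[i] / var * lag)
    return result


def quadrants(values, weights):
    """Label each region as own level, then neighbors' level (e.g. High-Low)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    labels = []
    for i in range(n):
        lag = sum(weights[i][j] * dev[j] for j in range(n))
        own = "High" if dev[i] >= 0 else "Low"
        nbr = "High" if lag >= 0 else "Low"
        labels.append(f"{own}-{nbr}")
    return labels


# Hypothetical toy data: four regions in a row, row-standardized weights.
W = [
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
]
values = [50, 10, 12, 11]   # region 0 is a high value next to low values

for value, label in zip(local_morans_i(values, W), quadrants(values, W)):
    print(round(value, 4), label)
```

In this form, a positive local value means a region resembles its neighbors (High-High or Low-Low), while a negative value marks a region that contrasts with its neighbors (High-Low or Low-High). The sign alone does not indicate whether the region's own level is high or low, which is why the quadrant label is reported alongside the index.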
The LISA Index results show that several districts in Central Java exhibit local spatial associations:
• Banyumas Regency: The LISA Index value for Banyumas Regency is -0.1927 with a significant p-value of 0.002. A negative local value indicates that the district's crime level differs from that of its neighbors, that is, it behaves as a spatial outlier (a high value surrounded by low values, or the reverse) rather than as part of a group of similar values. The sign alone does not say which; the cluster map (Fig. 2) places Banyumas Regency among the high-crime areas.
• Wonosobo: Wonosobo also demonstrates a significant LISA Index value of 0.2827 with a p-value of 0.038. A positive local value signifies that Wonosobo's crime level is similar to that of its neighboring districts, so it belongs to a cluster of like values (high surrounded by high, or low surrounded by low). The cluster map (Fig. 2) places Wonosobo in the low-crime group.
• Tegal City: Tegal City has a substantial LISA Index value of -0.7560 with a highly significant p-value of 0.001. As with Banyumas Regency, the negative value marks Tegal City as dissimilar to its surrounding districts, and its larger magnitude indicates a stronger contrast. The cluster map (Fig. 2) places Tegal City among the high-crime areas.
These LISA Index findings are instrumental in understanding the local spatial dependencies of crime rates within Central Java. Specific districts exhibit strong associations with their neighboring regions, in terms of either high or low crime rates. These spatial dependencies can be influenced by a variety of factors, such as shared socio-economic conditions, local law enforcement efforts, or community dynamics.
The knowledge of these spatial clusters provides a foundation for the subsequent spatial regression analysis using the Spatial Autoregressive Model (SAR). The SAR model will enable us to explore the factors contributing to these localized patterns of crime and gain a deeper understanding of the socioeconomic dynamics affecting different districts and cities within Central Java.

Spatial Regression Modeling
This section presents the spatial regression modeling, focusing on the Spatial Autoregressive Model (SAR) to estimate the effects of various factors on crime rates in Central Java. The SAR model takes into account the spatial dependencies identified in the previous sections and provides insights into the relationships between crime rates and the independent variables: population, unemployment, poverty, Age-Dependency Ratio (APS), and Relative Location Quotient (RLS).
LM Test for Spatial Lag Dependence: Before estimating the SAR model, a Lagrange Multiplier (LM) test was conducted to ascertain the presence of spatial lag dependence. This test is vital in establishing the necessity of using a spatial regression model. The hypotheses for this test were set as follows:
• H0 (Null Hypothesis): There is no spatial lag dependence (ρ = 0).
• H1 (Alternative Hypothesis): There is spatial lag dependence (ρ ≠ 0).
The significance level (α) was set at 0.05. The LM test results revealed a p-value of 0.03647, which is below α, leading to the rejection of the null hypothesis. This suggests the presence of spatial lag dependence in the data. Consequently, the SAR model is a suitable choice for modeling crime rates in Central Java, as it takes spatial dependencies into account.
Spatial Autoregressive (SAR) Model Estimation: The SAR model estimation is a critical step in understanding the relationships between crime rates and independent variables. The results from the SAR model estimation are presented in Table 3.
• ρ (Spatial Lag Coefficient): The coefficient for spatial lag (ρ) is estimated as -0.58578 with a p-value of 0.009827. This coefficient signifies the extent to which the crime rate in a given location is related to the crime rates in neighboring areas. The negative value indicates that a higher crime rate in the surrounding districts is associated with a lower crime rate in the focal district.
• Intercept (β0): The intercept term is estimated as -3.2995 × 10^2 with a p-value of 0.10221, which is not significant at α = 0.05. It represents the crime rate when all independent variables are set to zero.
• β1, β2, β3, β4, and β5: These coefficients represent the impact of the individual independent variables (population, unemployment, poverty, APS, and RLS) on the crime rate. Each of these coefficients has its own p-value, indicating the significance of its effect.
Model Interpretation: With the SAR model estimated, we can interpret the relationships between crime rates and the independent variables, in each case holding the other variables constant:
• A one-unit increase in the population (X1) is associated with an increase of 0.00038689 in the crime rate.
• A one-unit increase in the unemployment rate (X2) is associated with an increase of 22.655 in the crime rate.
• A one-unit increase in poverty (X3) is associated with an increase of 0.0010578 in the crime rate.
• A one-unit increase in APS (X4) is associated with an increase of 5.3914 in the crime rate.
• A one-unit increase in RLS (X5) is associated with a substantial increase of 127.37 in the crime rate.
Additionally, the SAR model reveals that spatial lag dependence is indeed present, indicating that neighboring regions have an influence on the crime rate. The spatial lag coefficient (ρ) of -0.58578 provides insights into this influence, where a higher crime rate in nearby districts is associated with a lower crime rate in the focal district.
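Because the SAR model feeds each district's outcome back through its neighbors, the estimated coefficients are not the whole effect of a change in a variable. A commonly used summary is the average total effect, which for a row-standardized weight matrix equals β / (1 − ρ). The sketch below applies that formula to the coefficients reported above. It is a hedged illustration rather than part of the original analysis: it assumes the study's W is row-standardized, which the text does not state, and the function name `average_total_effect` is hypothetical.

```python
# Estimates as reported in the text above.
RHO = -0.58578
COEFFICIENTS = {
    "population (X1)": 0.00038689,
    "unemployment (X2)": 22.655,
    "poverty (X3)": 0.0010578,
    "APS (X4)": 5.3914,
    "RLS (X5)": 127.37,
}


def average_total_effect(beta, rho):
    """Average total effect beta / (1 - rho) in a SAR model.

    Valid under the assumption that the spatial weight matrix is
    row-standardized and |rho| < 1.
    """
    if abs(rho) >= 1:
        raise ValueError("rho must satisfy |rho| < 1")
    return beta / (1.0 - rho)


for name, beta in COEFFICIENTS.items():
    total = average_total_effect(beta, RHO)
    print(f"{name}: coefficient = {beta}, average total effect = {total:.6g}")
```

With a negative ρ the multiplier 1 / (1 − ρ) is smaller than one, so under this assumption the average total effect of each variable is smaller in magnitude than its coefficient: part of a local increase is offset through the negative relationship with neighboring districts.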
Overall Test of the SAR Model: An overall test was conducted to evaluate the simultaneous influence of the independent variables on the crime rate. The hypotheses for this test were:
• H0 (Null Hypothesis): The independent variables simultaneously have no effect on the dependent variable.
• H1 (Alternative Hypothesis): At least one independent variable influences the dependent variable.
The test results yielded a p-value of 0.0093744, leading to the rejection of the null hypothesis. This indicates that the independent variables (population, unemployment, poverty, APS, and RLS) collectively influence the crime rate in Central Java.
The SAR model has an R-squared value of 0.7548, meaning that it explains 75.48% of the variation in the crime rate. This implies that the selected independent variables account for a significant portion of the observed variations in crime rates across Central Java. The remaining 24.52% of the variance may be attributed to factors not included in the model.
In summary, the SAR model provides a comprehensive understanding of the relationships between crime rates and various factors while considering spatial dependencies. The model's ability to explain a substantial portion of the variance in crime rates in Central Java makes it a valuable tool for policymakers and law enforcement agencies to develop informed strategies for crime reduction and public safety enhancement in the region.

Conclusion
In this study, we examined the dynamics of crime in Central Java, Indonesia, focusing on socioeconomic factors and employing spatial regression techniques. Crime, a persistent issue in Central Java, calls for a deeper understanding of its driving forces. Our analysis began with a detailed examination of crime patterns across the region, revealing areas with higher crime rates. We identified five key variables, namely population, unemployment, poverty, Age-Dependency Ratio (APS), and Relative Location Quotient (RLS), that significantly influence crime rates in Central Java. Rigorous spatial tests confirmed the presence of spatial dependencies, leading to the adoption of a Spatial Autoregressive Model (SAR). This model illuminated complex relationships between variables and crime rates, indicating the pivotal role of spatial lag dependence. The model's strong explanatory power, with 75.48% of the variance explained, offers valuable insights for policymakers and law enforcement in addressing crime in Central Java. This research contributes to informed strategies for crime reduction and public safety enhancement in the region, ultimately fostering a safer and more prosperous Central Java.
Future research should explore the region's unique cultural and sociological factors impacting crime. Examining historical trends and temporal changes in crime rates is essential for effective policy development. Integrating advanced technologies like predictive analytics, machine learning, and deep learning can enhance predictive models and law enforcement strategies. Collaboration between researchers, policymakers, and communities is key to addressing crime's underlying causes, aiming for sustainable crime reduction and improved safety in Central Java.

Fig. 1. A tree map illustrating the distribution of crime in Central Java, providing descriptive statistics for the variables used in this research to assess the overall crime data and its independence.

Fig. 2. Cluster Map showing crime relevance levels in Central Java.

Table 1. Frequency of crimes in Central Java's cities/regencies, with a focus on areas with the highest number of crimes.

Table 2. LISA index results for crime in Central Java.

Table 3. LISA index results for crime in Central Java.