Optimal Planning of EV Charging Stations Using POI Big Data and CNN-LSTM-Attention Model

With the rapid development of the new energy vehicle industry, the number of electric vehicles has surged globally, leading to an imbalance between the supply and demand of EV charging stations and their uneven distribution. This study addresses the critical need for scientific planning of EV charging station layouts by leveraging big data and advanced deep learning techniques. We focus on integrating Point of Interest (POI) data and population demographics to predict optimal locations for EV charging stations, ensuring efficient resource allocation and enhanced service coverage. Our approach combines spatial analysis with a hybrid CNN-LSTM-Attention model, which captures both local spatial patterns and longer-range sequential dependencies in the data, while emphasizing key features through attention mechanisms. This methodology not only improves prediction accuracy but also provides a scalable framework for urban planning authorities to deploy EV charging infrastructure in a data-driven manner.

The proliferation of electric vehicles has underscored the importance of developing a robust charging network. Traditional methods for EV charging station placement often rely on heuristic approaches or simplistic spatial analyses, which may overlook complex interactions between urban infrastructure and population dynamics. In contrast, our study utilizes high-resolution POI data, encompassing categories such as commercial services, residential areas, public facilities, and administrative offices, to characterize the urban environment. By dividing the study area into a grid system and transforming the data into binary classifications, we enable precise modeling of suitability for EV charging station deployment. Furthermore, we incorporate population data to refine our predictions, ensuring that areas with high demographic demand are prioritized. The integration of convolutional neural networks (CNN) for spatial feature extraction, long short-term memory (LSTM) networks for sequence modeling, and attention mechanisms for feature weighting allows our model to outperform conventional machine learning techniques in terms of loss reduction, accuracy, and F1-score.

Our research demonstrates that the CNN-LSTM-Attention model achieves a loss value of 0.1776, an accuracy of 0.9324, and an F1-score of 0.9254, significantly surpassing benchmarks such as random forests and standalone CNN or LSTM models. Spatial autocorrelation analysis reveals a positive clustering pattern of existing EV charging stations, indicating that hotspots are concentrated in urban cores and densely populated townships. Kernel density estimation further identifies regions with high potential demand, aligning with commercial hubs and residential zones. By applying our model to predict suitable grids and cross-referencing with population thresholds, we identify 312 priority grids for EV charging station installation. These findings advocate for a zoned construction strategy, where urban areas receive high-density fast-charging stations, while suburban and rural regions are equipped with slower chargers tailored to local needs. This comprehensive approach not only addresses current infrastructure gaps but also supports sustainable urban mobility and the transition to clean energy.

The data foundation of this study comprises POI data and population statistics. POI data, representing specific locations like shopping malls, hospitals, and schools, were collected via API calls to mapping platforms, resulting in a dataset of 324,227 entries across Fuzhou City. These points were categorized into four primary scenes: commercial services, living and residence, public facilities, and administrative offices, each with subcategories to avoid functional overlap. For instance, commercial services include shopping consumption, catering services, financial services, and daily life services, while public facilities encompass educational institutions, scenic spots, sports venues, and transportation hubs. This classification ensures a comprehensive representation of urban infrastructure influencing EV charging station demand. Population data, sourced from the national census, provide demographic details for 190 townships, with a focus on the 15-59 age group as a proxy for potential EV users due to their higher mobility and economic activity. The integration of these datasets enables a multi-faceted analysis of spatial and demographic factors driving EV charging station suitability.

To standardize the analysis, the study area was divided into 13,270 grids of 1 km × 1 km, each assigned a unique identifier. The dependent variable was defined as a binary indicator: grids with existing EV charging stations were labeled 1, and those without were labeled 0. Independent variables consisted of counts of POI facilities within each grid, such as the number of restaurants, hospitals, and offices. This transformation converts spatial data into a structured format to which machine learning models can be applied. Partial grid information illustrates the variability in facility distributions, with some grids containing hundreds of POIs while others have minimal infrastructure. Population data further refine this: townships where the 15-59 age group exceeds 28,839 people and makes up more than 60% of the total population are deemed high-demand areas. This step aligns predictions with demographic realities, enhancing the practical relevance of our EV charging station planning model.
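The gridding step can be illustrated with a minimal sketch. It assumes POI and station coordinates have already been projected to metres, and the function names, the origin, and the toy points are hypothetical rather than taken from the study.

```python
from collections import Counter, defaultdict

CELL_SIZE_M = 1000.0  # 1 km x 1 km grid cells, as described in the text


def cell_of(x_m, y_m, origin=(0.0, 0.0)):
    """Return the (column, row) index of the cell containing a projected point."""
    col = int((x_m - origin[0]) // CELL_SIZE_M)
    row = int((y_m - origin[1]) // CELL_SIZE_M)
    return col, row


def build_grid_features(pois, stations):
    """Count POIs per category in each cell and attach the binary station label.

    Only cells that contain at least one POI are returned in this sketch.
    """
    counts = defaultdict(Counter)
    for x_m, y_m, category in pois:
        counts[cell_of(x_m, y_m)][category] += 1
    station_cells = {cell_of(x_m, y_m) for x_m, y_m in stations}
    return {
        cell: {"counts": dict(c), "label": int(cell in station_cells)}
        for cell, c in counts.items()
    }


# Hypothetical toy data: (x, y, category) in metres, not from the study.
pois = [
    (120.0, 340.0, "Catering Services"),
    (880.0, 910.0, "Catering Services"),
    (450.0, 50.0, "Medical Services"),
    (1500.0, 200.0, "Corporate Offices"),
]
stations = [(300.0, 300.0)]
grid = build_grid_features(pois, stations)
print(grid)
```

Each dictionary entry corresponds to one row of the structured dataset: per-category POI counts as features and the station indicator as the label.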

Table 1: Classification of POI Data

| Primary Category | Secondary Category | Facility Examples | POI Count |
| --- | --- | --- | --- |
| Commercial Services | Shopping Consumption | Supermarkets, commercial streets | 92,263 |
| Commercial Services | Catering Services | Restaurants, eateries | 47,333 |
| Commercial Services | Financial Services | Banks, insurance companies | 4,452 |
| Commercial Services | Daily Life Services | Hair salons, courier services | 42,696 |
| Living and Residence | Residential Areas | Residential communities, apartments | 9,447 |
| Living and Residence | Medical Services | Hospitals, clinics | 13,046 |
| Living and Residence | Hotel Accommodation | Hotels, homestays | 10,903 |
| Public Facilities | Education and Culture | Schools, libraries | 12,189 |
| Public Facilities | Scenic Spots | Parks, tourist attractions | 6,589 |
| Public Facilities | Sports and Leisure | Gyms, stadiums | 8,070 |
| Public Facilities | Transportation Hubs | Train stations, bus terminals | 20,668 |
| Administrative Offices | Government Agencies | Administrative units | 19,643 |
| Administrative Offices | Corporate Offices | Companies, office buildings | 36,928 |

The binary classification of grid data is essential for training predictive models. Each grid is characterized by the presence or absence of an EV charging station, along with counts of various POI types. This approach allows us to model the relationship between urban infrastructure and the likelihood of EV charging station deployment. For example, a grid with high numbers of commercial and transportation POIs may exhibit stronger correlations with EV charging station presence. The dataset, after removing grids with missing data, consists of 12,371 valid samples, including 544 grids with EV charging stations and 11,827 without. To address class imbalance, we applied undersampling to the training set, balancing positive and negative samples at a 1:1 ratio. This preprocessing step enhances model performance by preventing bias toward the majority class. The transformed data is then structured into a format suitable for deep learning, with features representing POI counts and the target variable indicating EV charging station suitability.
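Random undersampling to a 1:1 ratio can be sketched as follows. The source does not say how the retained negative grids were chosen, so uniform random sampling with a fixed seed is an assumption, as are the function name and the toy data.

```python
import random


def undersample(samples, labels, seed=0):
    """Randomly drop majority-class rows until both classes have equal size."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    # Keep every minority row and an equally sized random subset of the majority.
    kept = minority + rng.sample(majority, len(minority))
    rng.shuffle(kept)
    return [samples[i] for i in kept], [labels[i] for i in kept]


# Hypothetical toy training set: 5 grids with stations, 95 without.
labels = [1] * 5 + [0] * 95
samples = [[i] for i in range(100)]
X_bal, y_bal = undersample(samples, labels)
print(len(y_bal), sum(y_bal))
```

Applying this only to the training split, as the text describes, leaves the test set at its natural class ratio.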

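Table 2: Example Grid Information

| Grid ID | Sports and Leisure | Medical Services | Corporate Offices | Residential Areas | Government Agencies | Daily Life Services | Education and Culture | Transportation Hubs | Shopping Consumption | Hotel Accommodation | Financial Services | Scenic Spots | Catering Services | EV Charging Station |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 12 | 27 | 7 | 305 | 31 | 25 | 57 | 43 | 51 | 121 | 8 | 2 | 3 | 80 | 1 |
| 416 | 5 | 3 | 34 | 6 | 3 | 27 | 2 | 14 | 74 | 0 | 1 | 0 | 20 | 1 |
| 1810 | 24 | 86 | 136 | 145 | 155 | 273 | 69 | 152 | 448 | 29 | 60 | 32 | 276 | 1 |
| 4629 | 1 | 12 | 8 | 4 | 13 | 29 | 6 | 3 | 126 | 5 | 3 | 5 | 18 | 1 |
| 6059 | 8 | 4 | 148 | 10 | 13 | 22 | 7 | 31 | 46 | 2 | 0 | 39 | 28 | 1 |
| 9336 | 2 | 12 | 25 | 0 | 9 | 12 | 3 | 1 | 40 | 1 | 0 | 1 | 15 | 0 |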

The CNN-LSTM-Attention model is designed to leverage the strengths of convolutional, recurrent, and attention-based neural networks for predicting EV charging station locations. The convolutional neural network (CNN) component processes the spatial matrix of POI data, using convolutional kernels to extract local features from each grid and its neighbors. The convolution operation is defined as:

$$ y_{i,j} = \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} x_{i+m,j+n} w_{m,n} + b $$

where \( x \) is the input data, \( w \) is the convolutional kernel, \( b \) is the bias term, and \( y \) is the output feature map. Pooling layers, typically max pooling, reduce dimensionality while retaining essential features:

$$ y_{i,j} = \max_{(m,n) \in R_{i,j}} x_{m,n} $$

Here, \( R_{i,j} \) represents the pooling window for position \( (i,j) \). The CNN output is then flattened and passed through fully connected layers, followed by dropout regularization to prevent overfitting. The subsequent LSTM layers model the sequential dependencies in the data, treating the spatial features as a sequence based on geographic coordinates. The LSTM mechanism involves input, forget, and output gates, which regulate information flow:

$$ i_t = \sigma(W_{ii} x_t + W_{hi} h_{t-1} + b_i) $$

$$ f_t = \sigma(W_{if} x_t + W_{hf} h_{t-1} + b_f) $$

$$ C_t = f_t \odot C_{t-1} + i_t \odot \tanh(W_{ic} x_t + W_{hc} h_{t-1} + b_c) $$

$$ o_t = \sigma(W_{io} x_t + W_{ho} h_{t-1} + b_o) $$

$$ h_t = o_t \odot \tanh(C_t) $$

In these equations, \( x_t \) is the input at time step \( t \), \( h_{t-1} \) is the previous hidden state, \( C_{t-1} \) is the previous cell state, \( \sigma \) is the sigmoid activation function, and \( \odot \) denotes element-wise multiplication. The attention layer then assigns dynamic weights to the LSTM outputs, emphasizing features most relevant to EV charging station prediction. The attention mechanism computes a weighted sum:

$$ h^* = \sum_{i=1}^{k} \alpha_i h_i $$

where \( h_i \) are the input features, and \( \alpha_i \) are weights derived using additive attention:

$$ a(s_{t-1}, h_j) = v_a^T \tanh(U_a h_j + W_a s_{t-1}) $$

$$ \alpha(t, j) = \frac{\exp(a(s_{t-1}, h_j))}{\sum_{l=1}^{T} \exp(a(s_{t-1}, h_l))} $$

Here, \( W_a \) and \( U_a \) are weight matrices, \( v_a \) is a weight vector, and \( s_{t-1} \) is the hidden state at the previous time step. The normalized weights \( \alpha(t, j) \) play the role of the \( \alpha_i \) in the weighted sum above. This combination allows the model to focus on critical grids, improving prediction accuracy for EV charging station placement.
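The gate equations and the additive attention step can be traced in a small pure-Python sketch. It is illustrative only: the layer sizes, the parameter values, and the choice of the last hidden state as the query \( s_{t-1} \) are assumptions, not details reported for the trained model.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W_x, x, W_h, h, b):
    """Compute W_x x + W_h h + b for small matrices stored as nested lists."""
    return [
        sum(wx * xi for wx, xi in zip(W_x[r], x))
        + sum(wh * hi for wh, hi in zip(W_h[r], h))
        + b[r]
        for r in range(len(b))
    ]


def lstm_step(x, h_prev, c_prev, p):
    """One LSTM update following the input, forget, and output gate equations."""
    i = [sigmoid(z) for z in affine(p["W_ii"], x, p["W_hi"], h_prev, p["b_i"])]
    f = [sigmoid(z) for z in affine(p["W_if"], x, p["W_hf"], h_prev, p["b_f"])]
    g = [math.tanh(z) for z in affine(p["W_ic"], x, p["W_hc"], h_prev, p["b_c"])]
    o = [sigmoid(z) for z in affine(p["W_io"], x, p["W_ho"], h_prev, p["b_o"])]
    # New cell state: keep part of the old state, add the gated candidate.
    c = [ft * cp + it * gt for ft, cp, it, gt in zip(f, c_prev, i, g)]
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]
    return h, c


def additive_attention(hs, s, U_a, W_a, v_a):
    """Score each h_j with v_a^T tanh(U_a h_j + W_a s), softmax, then sum."""
    zero_bias = [0.0] * len(v_a)
    scores = [
        sum(v * math.tanh(z) for v, z in zip(v_a, affine(U_a, h, W_a, s, zero_bias)))
        for h in hs
    ]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(sc - top) for sc in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    context = [sum(a * h[d] for a, h in zip(alphas, hs)) for d in range(len(hs[0]))]
    return alphas, context


# Hypothetical toy parameters: input size 2, hidden size 2.
mat = [[0.1, -0.2], [0.3, 0.05]]
params = {name: mat for name in
          ("W_ii", "W_hi", "W_if", "W_hf", "W_ic", "W_hc", "W_io", "W_ho")}
params.update({name: [0.0, 0.0] for name in ("b_i", "b_f", "b_c", "b_o")})

h, c = [0.0, 0.0], [0.0, 0.0]
hidden_states = []
for x in ([1.0, 0.0], [0.0, 2.0], [3.0, 1.0]):  # made-up feature sequence
    h, c = lstm_step(x, h, c, params)
    hidden_states.append(h)

# Assumption: the last hidden state is used as the query vector.
alphas, context = additive_attention(hidden_states, h, mat, mat, [1.0, -1.0])
print(alphas, context)
```

The attention weights are non-negative and sum to one, so the context vector is a convex combination of the LSTM hidden states.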

Model training involved splitting the data into 80% training and 20% testing sets, with stratification to maintain distribution consistency. The Adam optimizer was used for parameter updates, and early stopping was implemented to halt training if validation loss did not improve for 10 epochs, preventing overfitting. The model architecture includes two CNN modules, each with convolutional and max-pooling layers, followed by fully connected and dropout layers, two LSTM layers, an attention layer, and a sigmoid output layer for binary classification. This structure enables the model to learn hierarchical spatial patterns and long-range dependencies, making it particularly effective for EV charging station planning in complex urban environments.
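The stratified 80/20 split and the patience-based stopping rule can be sketched without any deep learning framework. The helper names and the loss history are hypothetical, and how the validation set was carved out is not stated in the source, so the loss list simply stands in for per-epoch validation losses.

```python
import random


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) with each class split in the same proportion."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return train_idx, test_idx


def early_stopping_epoch(val_losses, patience=10):
    """Return the 0-based epoch at which training would stop."""
    best, waited = float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, waited = loss, 0  # improvement: reset the patience counter
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return len(val_losses) - 1  # never triggered: ran through every epoch


# Hypothetical labels: 50 positive and 450 negative grids.
labels = [1] * 50 + [0] * 450
train_idx, test_idx = stratified_split(labels)

# Made-up validation losses: improve for three epochs, then plateau.
val_losses = [0.9, 0.7, 0.6] + [0.65] * 12
stop_epoch = early_stopping_epoch(val_losses, patience=10)
print(len(train_idx), len(test_idx), stop_epoch)
```

With these toy inputs, the best loss occurs at epoch 2 and training halts ten non-improving epochs later.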

To evaluate the model’s performance, we compared it against several benchmarks, including decision trees, random forests, CNN-GRU, TCA-CNN-LSTM, LSTM-FC, standalone CNN, and standalone LSTM models. Metrics such as loss, accuracy, and F1-score were used, with loss calculated using binary cross-entropy:

$$ L(D) = \frac{1}{n} \sum_{i=1}^{n} \left[ -y_i \ln(p_i) - (1 - y_i) \ln(1 - p_i) \right] $$

where \( n \) is the number of samples, \( y_i \) is the true label, and \( p_i \) is the predicted probability. Accuracy measures the proportion of correct predictions:

$$ \text{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN} $$

and the F1-score balances precision and recall:

$$ \text{Precision} = \frac{TP}{TP + FP} $$

$$ \text{Recall} = \frac{TP}{TP + FN} $$

$$ \text{F1} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} = \frac{2TP}{2TP + FP + FN} $$
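As a worked check of these formulas, the sketch below computes binary cross-entropy, accuracy, precision, recall, and F1 for a handful of made-up predictions. The 0.5 decision threshold and the probability clipping constant are assumptions, and the numbers are not results from the study.

```python
import math


def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Mean of -y ln(p) - (1 - y) ln(1 - p), with p clipped away from 0 and 1."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)
        total += -y * math.log(p) - (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


def classification_metrics(y_true, p_pred, threshold=0.5):
    """Derive accuracy, precision, recall, and F1 from the confusion counts."""
    y_hat = [int(p >= threshold) for p in p_pred]
    tp = sum(1 for y, h in zip(y_true, y_hat) if y == 1 and h == 1)
    tn = sum(1 for y, h in zip(y_true, y_hat) if y == 0 and h == 0)
    fp = sum(1 for y, h in zip(y_true, y_hat) if y == 0 and h == 1)
    fn = sum(1 for y, h in zip(y_true, y_hat) if y == 1 and h == 0)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "f1": 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0,
    }


# Made-up labels and predicted probabilities for eight grids.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
p_pred = [0.9, 0.8, 0.3, 0.6, 0.2, 0.1, 0.4, 0.05]
loss = binary_cross_entropy(y_true, p_pred)
metrics = classification_metrics(y_true, p_pred)
print(loss, metrics)
```

Here TP = 2, FN = 1, FP = 1, and TN = 4, so accuracy is 0.75 while precision, recall, and F1 all equal 2/3.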

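Our CNN-LSTM-Attention model achieved the best results on all three metrics, with a loss of 0.1776, accuracy of 0.9324, and F1-score of 0.9254. Its F1-score is far above those of the decision tree (0.3727), random forest (0.4150), standalone CNN (0.3575), and standalone LSTM (0.3860), whose accuracies nevertheless exceed 0.85. This gap suggests that the hybrid model copes better with the minority class of grids that contain stations and captures the complex patterns needed for EV charging station location prediction.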

Table 3: Comparison of EV Charging Station Prediction Models

| Model | Loss Value | Accuracy | F1-Score | Source |
| --- | --- | --- | --- | --- |
| Decision Tree | 1.4658 | 0.8626 | 0.3727 | Experiment |
| Random Forest | 0.3230 | 0.8804 | 0.4150 | Experiment |
| CNN-GRU | 0.3416 | 0.8704 | 0.7807 | Reference |
| TCA-CNN-LSTM | 0.2358 | 0.8715 | 0.8913 | Reference |
| LSTM-FC | 0.2961 | 0.9093 | 0.8370 | Reference |
| CNN | 0.5345 | 0.8533 | 0.3575 | Experiment |
| LSTM | 0.3199 | 0.8727 | 0.3860 | Experiment |
| CNN-LSTM | 0.2526 | 0.9056 | 0.8730 | Experiment |
| CNN-LSTM-Attention | 0.1776 | 0.9324 | 0.9254 | Our Model |

Spatial autocorrelation analysis using Moran’s I was conducted to examine the distribution patterns of existing EV charging stations. The formula for Moran’s I is:

$$ I = \frac{n \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} (x_i - \bar{x})(x_j - \bar{x})}{\left( \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} \right) \sum_{i=1}^{n} (x_i - \bar{x})^2} $$

where \( n \) is the number of grids, \( x_i \) and \( x_j \) are observations at locations \( i \) and \( j \), \( \bar{x} \) is the mean, and \( w_{ij} \) is the entry of the spatial weight matrix for locations \( i \) and \( j \). The calculated Moran’s I of 0.3649 with a p-value less than 0.001 indicates significant positive spatial autocorrelation, confirming that EV charging stations are clustered rather than randomly distributed. This clustering aligns with urban centers and areas of high POI density, underscoring the importance of considering spatial dependencies in planning.
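The Moran's I formula can be evaluated directly on a small grid. The source does not state which spatial weight scheme was used, so the binary rook-contiguity weights below are an assumption, and the two 4 × 4 patterns are toy inputs chosen to show the sign of the statistic.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and an n x n weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    num = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    w_sum = sum(sum(row) for row in weights)
    denom = sum(d * d for d in dev)
    return n * num / (w_sum * denom)


def rook_weights(n_rows, n_cols):
    """Binary contiguity: w_ij = 1 when cells i and j share an edge."""
    n = n_rows * n_cols
    w = [[0] * n for _ in range(n)]
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    w[i][rr * n_cols + cc] = 1
    return w


w = rook_weights(4, 4)
# Toy pattern 1: stations fill the left half of the grid (clustered).
clustered = [1 if c < 2 else 0 for r in range(4) for c in range(4)]
# Toy pattern 2: stations alternate like a checkerboard (dispersed).
checkerboard = [(r + c) % 2 for r in range(4) for c in range(4)]

i_clustered = morans_i(clustered, w)
i_checker = morans_i(checkerboard, w)
print(i_clustered, i_checker)
```

A clustered layout yields a positive value, matching the interpretation given for the reported 0.3649, while the checkerboard layout yields a negative one.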

Kernel density estimation (KDE) was employed to identify high-demand regions for EV charging stations. The KDE function is defined as:

$$ f(s) = \sum_{i=1}^{n} \frac{1}{h^2} k \left( \frac{s - c_i}{h} \right) $$

where \( f(s) \) is the density at location \( s \), \( c_i \) is the location of the \( i \)-th point, \( h \) is the bandwidth, \( n \) is the number of points within distance \( h \) of \( s \), and \( k \) is the kernel function. Results show that high-density areas are concentrated in central urban zones, which correspond to commercial and residential hubs with substantial EV charging station demand. Medium-density areas include peripheral towns, while low-density regions are predominantly rural. This analysis, combined with population data, guides the selection of priority grids for EV charging station deployment.
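The density formula can be evaluated point by point, as in this sketch. The kernel used in the study is not specified, so the quartic kernel here is an assumption, as are the 1,000 m bandwidth and the toy coordinates.

```python
import math


def quartic_kernel(u):
    """Quartic kernel on the unit disc, normalized to integrate to 1 in 2D."""
    return 3.0 / math.pi * (1.0 - u * u) ** 2 if u < 1.0 else 0.0


def kde(s, points, h):
    """Density at location s from the points lying within bandwidth h."""
    sx, sy = s
    total = 0.0
    for cx, cy in points:
        u = math.hypot(sx - cx, sy - cy) / h  # scaled distance |s - c_i| / h
        total += quartic_kernel(u) / (h * h)
    return total


# Hypothetical station coordinates in metres: three close together, one isolated.
points = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0), (5000.0, 5000.0)]
h = 1000.0  # assumed bandwidth

dense = kde((0.0, 0.0), points, h)
sparse = kde((5000.0, 5000.0), points, h)
empty = kde((3000.0, 3000.0), points, h)
print(dense, sparse, empty)
```

Points farther than \( h \) from \( s \) contribute nothing, which is why the sum effectively runs over the points within the bandwidth.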

For decision-making, we first excluded grids with existing EV charging stations, resulting in 595 grids predicted as suitable for new installations. To further refine these, we applied demographic criteria, selecting townships where the 15-59 age group exceeds 28,839 individuals and constitutes over 60% of the population. This yielded 62 high-demand townships, predominantly in urban districts such as Gulou, Taijiang, and Cangshan, as well as suburban areas like Changle and Minhou. The final set of 312 priority grids for EV charging station construction is primarily located in these regions, ensuring that infrastructure development aligns with population density and mobility patterns. This targeted approach maximizes the efficiency of EV charging station networks, promoting equitable access and supporting the growth of electric vehicle adoption.

Table 4: Distribution of High-Demand Townships

| Region | Number of Townships | Region | Number of Townships |
| --- | --- | --- | --- |
| Gulou District | 5 | Changle District | 9 |
| Taijiang District | 2 | Minhou County | 5 |
| Cangshan District | 8 | Lianjiang County | 4 |
| Mawei District | 4 | Luoyuan County | 2 |
| Jin’an District | 6 | Pingtan District | 7 |
| Yongtai County | 2 | Fuqing City | 8 |
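The selection of priority grids combines the model's predictions with the two demographic thresholds. A minimal sketch of that filter is shown below; the record layout, field names, and toy townships are hypothetical, and only the thresholds (28,839 people aged 15-59 and a 60% share) come from the text.

```python
MIN_WORKING_AGE = 28_839  # 15-59 population threshold stated in the text
MIN_SHARE = 0.60          # minimum 15-59 share of total population


def is_high_demand(township):
    """Apply both demographic thresholds to one township record."""
    share = township["pop_15_59"] / township["pop_total"]
    return township["pop_15_59"] > MIN_WORKING_AGE and share > MIN_SHARE


def priority_grids(grids, townships):
    """Keep predicted-suitable grids without a station in high-demand townships."""
    high = {name for name, t in townships.items() if is_high_demand(t)}
    return [
        g["grid_id"]
        for g in grids
        if g["predicted_suitable"] and not g["has_station"] and g["township"] in high
    ]


# Hypothetical records, not data from the study.
townships = {
    "A": {"pop_15_59": 40_000, "pop_total": 60_000},  # passes both thresholds
    "B": {"pop_15_59": 30_000, "pop_total": 55_000},  # share below 60%
    "C": {"pop_15_59": 20_000, "pop_total": 25_000},  # count below threshold
}
grids = [
    {"grid_id": 1, "township": "A", "predicted_suitable": True, "has_station": False},
    {"grid_id": 2, "township": "A", "predicted_suitable": True, "has_station": True},
    {"grid_id": 3, "township": "B", "predicted_suitable": True, "has_station": False},
    {"grid_id": 4, "township": "C", "predicted_suitable": True, "has_station": False},
    {"grid_id": 5, "township": "A", "predicted_suitable": False, "has_station": False},
]
selected = priority_grids(grids, townships)
print(selected)
```

Only grid 1 survives all three conditions in this toy example, mirroring how the 595 predicted grids narrow to 312 priority grids in the study.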

In conclusion, our study presents a comprehensive framework for EV charging station planning using POI big data and a hybrid deep learning model. The CNN-LSTM-Attention model effectively integrates spatial and sequential features, achieving high predictive accuracy and robust performance metrics. Spatial analyses reveal clustered patterns of existing EV charging stations and identify high-demand areas through kernel density estimation. By incorporating demographic data, we prioritize grids for new EV charging station installations, ensuring that infrastructure development meets actual needs. We recommend a zoned strategy: urban centers should focus on high-capacity fast-charging stations to serve dense populations and commercial activities, while suburban and rural areas should be served by slower chargers that cater to local usage patterns. This approach not only optimizes resource allocation but also fosters sustainable urban mobility, contributing to the broader goals of reducing carbon emissions and promoting electric vehicle adoption. Future work could explore real-time data integration and dynamic modeling to further enhance EV charging station planning in evolving urban landscapes.
