With the rapid adoption of electric vehicles (EVs) globally, accurately predicting their charging load has become crucial for grid stability and energy management. In China, the electric vehicle market is expanding rapidly, leading to increased demand for efficient charging infrastructure. This paper addresses the challenges in short-term charging load prediction for electric vehicles by proposing a novel hybrid model that combines improved clustering techniques with advanced deep learning architectures. The integration of electric vehicles into the power grid introduces significant variability due to user behavior, making traditional prediction methods less effective. Our approach uses the GLDSC-ConvAutoformer model to improve prediction accuracy, which is essential for optimizing grid operations and supporting the growth of China's EV infrastructure.
The proliferation of electric vehicles worldwide, particularly in China, has highlighted the need for reliable load forecasting to prevent grid instability. Electric vehicle charging patterns are influenced by various factors, including time of day, user habits, and environmental conditions. Existing methods, such as support vector machines and recurrent neural networks, often struggle with the non-linear and non-stationary nature of EV load data. In this study, we introduce a framework that first clusters EV charging profiles using a grey limited dynamic spectrum clustering (GLDSC) algorithm to identify patterns, then applies a ConvAutoformer model for prediction. This combination allows for better handling of temporal dependencies and feature extraction, resulting in superior performance compared to conventional models.
To provide context, the basic algorithms underlying our approach include spectral clustering and Autoformer. Spectral clustering is a graph-based method that partitions data into clusters by analyzing similarity matrices. Traditionally, it uses Euclidean distance, but we enhance it by incorporating a grey relational degree model based on limited dynamic time warping (LDTW) distance. This improvement accounts for the temporal misalignments in EV charging data, leading to more meaningful clusters. The similarity matrix \( A \) in spectral clustering is defined as:
$$ A_{ij} = \begin{cases}
\exp\left(-\frac{\| s_i - s_j \|^2}{2\sigma^2}\right), & i \neq j \\
0, & i = j
\end{cases} $$
where \( s_i \) and \( s_j \) represent data points, and \( \sigma^2 \) controls the decay rate. The Laplacian matrix \( L \) is derived as \( L = D^{-1/2} A D^{-1/2} \), where \( D \) is the diagonal degree matrix. This forms the basis for clustering, which groups EV charging curves with similar characteristics.
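The two matrices above can be sketched in a few lines of NumPy. This is only an illustration of the standard (Euclidean) construction, not the authors' implementation; the function names `gaussian_affinity` and `normalized_matrix` and the toy data are assumptions introduced here.

```python
import numpy as np

def gaussian_affinity(S, sigma):
    """A_ij = exp(-||s_i - s_j||^2 / (2 sigma^2)) for i != j, and 0 on the diagonal."""
    sq_norms = np.sum(S ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * S @ S.T
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    A = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(A, 0.0)
    return A

def normalized_matrix(A):
    """L = D^{-1/2} A D^{-1/2}, where D is the diagonal degree matrix of A."""
    degrees = A.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(degrees)
    return A * inv_sqrt[:, None] * inv_sqrt[None, :]

# Toy data (hypothetical): two well-separated pairs of points.
S = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
A = gaussian_affinity(S, sigma=1.0)
L = normalized_matrix(A)
```

In the usual spectral clustering recipe, the leading eigenvectors of \( L \) are then clustered (for example with k-means) to obtain the final groups.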
The Autoformer model, an evolution of the Transformer architecture, is designed for time series forecasting by decomposing sequences into trend and seasonal components. For an input sequence \( \chi \in \mathbb{R}^{L \times d} \), the decomposition is performed as:
$$ \chi_t = \text{AvgPool}(\text{Padding}(\chi)) $$
$$ \chi_s = \chi - \chi_t $$
where \( \chi_t \) and \( \chi_s \) represent the trend and seasonal parts, respectively. The encoder and decoder in Autoformer utilize auto-correlation mechanisms to capture long-term dependencies, which is beneficial for EV load data that exhibits periodic behavior. However, standard Autoformer can suffer from prediction oscillations, so we enhance it with convolutional layers for better feature extraction.
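As a rough illustration of the decomposition step, the sketch below computes a moving-average trend and the seasonal remainder with NumPy. The edge-replication padding and the name `series_decomp` are assumptions made for this example; the source only states that padding is applied before average pooling.

```python
import numpy as np

def series_decomp(x, kernel_size):
    """Split x (shape L x d) into a moving-average trend and a seasonal remainder.

    Assumption: padding repeats the first and last rows so the trend keeps length L.
    """
    front = np.repeat(x[:1], (kernel_size - 1) // 2, axis=0)
    back = np.repeat(x[-1:], kernel_size // 2, axis=0)
    padded = np.concatenate([front, x, back], axis=0)
    # Moving average via cumulative sums (equivalent to average pooling with stride 1).
    csum = np.cumsum(np.vstack([np.zeros((1, x.shape[1])), padded]), axis=0)
    trend = (csum[kernel_size:] - csum[:-kernel_size]) / kernel_size
    seasonal = x - trend
    return trend, seasonal

# Hypothetical input: a slow ramp plus a period-8 oscillation, as one feature column.
t = np.arange(48, dtype=float)
x = (0.1 * t + np.sin(2 * np.pi * t / 8)).reshape(-1, 1)
trend, seasonal = series_decomp(x, kernel_size=9)
```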
Our proposed GLDSC-ConvAutoformer model integrates these elements to address the shortcomings of existing methods. The process begins with data preprocessing, where EV charging load data is normalized using Z-score standardization to reduce scale effects. For a load curve \( P_i = [p_{i,1}, p_{i,2}, \dots, p_{i,T}] \), where \( p_{i,j} \) is the load of the i-th EV at time step j, the normalized curve is computed as:
$$ P'_i = \frac{P_i - \mu_i}{\sigma_i} $$
with \( \mu_i = \frac{1}{n} \sum_{j=1}^{n} p_{i,j} \) and \( \sigma_i = \sqrt{\frac{1}{n} \sum_{j=1}^{n} (p_{i,j} - \mu_i)^2} \), where \( n = T \) is the number of sampling points per curve. This ensures that the data is centered and scaled, facilitating similarity analysis.
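A minimal sketch of this per-curve standardization is given below. The function name `zscore_rows` is hypothetical, and the handling of perfectly flat curves (left at zero instead of dividing by zero) is an assumption not discussed in the text.

```python
import numpy as np

def zscore_rows(P):
    """Standardize each row (one EV load curve) to zero mean and unit variance."""
    mu = P.mean(axis=1, keepdims=True)
    sigma = P.std(axis=1, keepdims=True)       # population std (ddof=0), matching the 1/n formula
    sigma = np.where(sigma == 0, 1.0, sigma)   # assumption: flat curves map to all zeros
    return (P - mu) / sigma

# Hypothetical load curves: one varying, one constant.
P = np.array([[1.0, 2.0, 3.0],
              [10.0, 10.0, 10.0]])
Z = zscore_rows(P)
```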
Next, the GLDSC algorithm is applied to cluster the normalized EV charging curves. The steps involve computing the LDTW distance between sequences, which measures similarity while allowing for temporal warping within limits. The grey relational degree \( \gamma(X_0, X_i) \) between sequences \( X_0 \) and \( X_i \) is given by:
$$ \gamma(X_0, X_i) = \frac{\min_m \min_n \| x_0(t_0) - x_i(t_i) \| + \xi \max_m \max_n \| x_0(t_0) - x_i(t_i) \|}{\text{LDTW}(X_0, X_i)/\lambda + \xi \max_m \max_n \| x_0(t_0) - x_i(t_i) \|} $$
where \( \xi = 0.5 \) is the distinguishing coefficient and \( \lambda \) is the length of the LDTW warping path, so that \( \text{LDTW}(X_0, X_i)/\lambda \) is the average distance per aligned pair of points. This grey relational degree matrix replaces the traditional similarity matrix in spectral clustering, improving cluster quality by capturing temporal dynamics. The optimal number of clusters is determined by comparing the silhouette score across candidate cluster counts k.
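To make the similarity measure concrete, here is a small sketch of a path-length-limited DTW and the grey relational degree built on it. It rests on several assumptions that the text does not spell out: the limit is expressed as a maximum number of cells on the warping path (`max_len`), the pointwise distance is the absolute difference, and the min and max in the formula are taken over all pairs of points. The function names are hypothetical, and the triple loop is meant for short illustrative sequences only.

```python
import numpy as np

def ldtw(x, y, max_len):
    """Return (cost, path_len) of the cheapest warping path with at most max_len cells."""
    m, n = len(x), len(y)
    if max_len < max(m, n):
        raise ValueError("max_len must be at least max(len(x), len(y))")
    max_len = min(max_len, m + n - 1)          # no warping path can be longer than this
    cost = np.abs(np.subtract.outer(x, y))     # pointwise distances |x_a - y_b|
    # D[i, j, s]: cheapest cost of a path ending at cell (i, j) that uses s + 1 cells
    D = np.full((m, n, max_len), np.inf)
    D[0, 0, 0] = cost[0, 0]
    for i in range(m):
        for j in range(n):
            for s in range(1, max_len):
                best = np.inf
                if i > 0:
                    best = min(best, D[i - 1, j, s - 1])
                if j > 0:
                    best = min(best, D[i, j - 1, s - 1])
                if i > 0 and j > 0:
                    best = min(best, D[i - 1, j - 1, s - 1])
                if best < np.inf:
                    D[i, j, s] = best + cost[i, j]
    s_best = int(np.argmin(D[m - 1, n - 1]))
    return float(D[m - 1, n - 1, s_best]), s_best + 1

def grey_relational_degree(x0, xi_seq, max_len, xi=0.5):
    """Grey relational degree with the mean LDTW step cost in the denominator."""
    diffs = np.abs(np.subtract.outer(x0, xi_seq))
    d_min, d_max = diffs.min(), diffs.max()
    if d_max == 0:                             # identical constant sequences
        return 1.0
    dist, lam = ldtw(x0, xi_seq, max_len)
    return float((d_min + xi * d_max) / (dist / lam + xi * d_max))

# Hypothetical curves: b is a time-shifted copy of a, c is flat.
a = np.array([0.0, 1.0, 2.0, 1.0, 0.0])
b = np.array([0.0, 0.0, 1.0, 2.0, 1.0, 0.0])
c = np.array([2.0, 2.0, 2.0, 2.0, 2.0])
shifted_similarity = grey_relational_degree(a, b, max_len=8)
flat_similarity = grey_relational_degree(a, c, max_len=8)
```

Evaluating the degree for every pair of normalized curves yields a symmetric matrix that can stand in for the Gaussian similarity matrix in spectral clustering.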
After clustering, the ConvAutoformer model is used for prediction on each cluster separately. This model incorporates dual convolutional layers with varying kernel sizes to extract features from the input data. The convolutional operations enhance the model’s ability to capture local patterns, which are then fed into the Autoformer architecture. The Autoformer component includes series decomposition blocks and auto-correlation mechanisms, as described earlier. To prevent overfitting and improve gradient flow, residual connections are added. The final prediction is obtained by summing the outputs from all clusters, reconstructing the overall EV charging load.
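The final aggregation step can be sketched as follows. This example assumes that the curves in each cluster are summed into one cluster-level series before forecasting, which the text does not state explicitly, and it uses a naive persistence forecaster purely as a stand-in for the ConvAutoformer. All names here are hypothetical.

```python
import numpy as np

def persistence(series):
    """Placeholder one-step forecaster: repeat the last observed value."""
    return float(series[-1])

def forecast_total_load(loads, labels, forecaster):
    """Forecast each cluster's aggregate load separately, then sum the forecasts."""
    total = 0.0
    for c in np.unique(labels):
        cluster_series = loads[labels == c].sum(axis=0)  # aggregate load of cluster c
        total += forecaster(cluster_series)
    return total

# Hypothetical data: three EV curves assigned to two clusters.
loads = np.array([[1.0, 2.0, 3.0],
                  [2.0, 3.0, 4.0],
                  [0.0, 0.0, 1.0]])
labels = np.array([0, 0, 1])
total_forecast = forecast_total_load(loads, labels, persistence)
```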

For experimental validation, we used real-world data from electric vehicle charging stations in an urban area, focusing on EV datasets from China. The data comprised charging transactions from December 1 to 31, 2019, with fields such as start and end times, energy consumed, and station IDs. We selected 103 distinct EV profiles, each with 1488 sampling points at 30-minute intervals (48 points per day over 31 days). The dataset was split into training (70%), validation (10%), and testing (20%) sets. Preprocessing involved normalization and clustering via GLDSC, which identified five distinct clusters based on silhouette scores. The clustering results showed that most groups exhibited clear periodic trends, while one cluster had minimal fluctuations, representing residual behavior.
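The sampling and split figures can be checked with a short sketch. It assumes the split is chronological along the time axis, which is common for forecasting but is not stated in the text; the function name `chronological_split` and the placeholder array are hypothetical.

```python
import numpy as np

POINTS_PER_DAY = 24 * 60 // 30          # 30-minute sampling
N_DAYS = 31                             # December 1 to 31
n_points = POINTS_PER_DAY * N_DAYS

def chronological_split(series, train=0.7, val=0.1):
    """Split along the last (time) axis into train, validation, and test parts."""
    n = series.shape[-1]
    i_train = int(n * train)
    i_val = int(n * (train + val))
    return series[..., :i_train], series[..., i_train:i_val], series[..., i_val:]

# Placeholder array standing in for the 103 EV profiles.
profiles = np.zeros((103, n_points))
train_set, val_set, test_set = chronological_split(profiles)
```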
The evaluation metrics used were mean absolute error (MAE) and mean squared error (MSE), defined as:
$$ E_{\text{MSE}} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 $$
$$ E_{\text{MAE}} = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i| $$
where \( y_i \) is the actual value, \( \hat{y}_i \) is the predicted value, and n is the number of samples. We compared our GLDSC-ConvAutoformer model against several benchmarks, including traditional neural networks (TCN, LSTM) and Transformer-based models (Transformer, Informer, Reformer, Autoformer), as well as variants like ConvAutoformer and GLDSC-Autoformer. The results demonstrated that our model achieved the lowest errors, with significant improvements in both MAE and MSE.
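For reference, the two metrics translate directly into NumPy. The function names and sample values are illustrative only and are not taken from the experiments.

```python
import numpy as np

def mse(y, y_hat):
    """Mean squared error: average of squared residuals."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return float(np.mean((y - y_hat) ** 2))

def mae(y, y_hat):
    """Mean absolute error: average of absolute residuals."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return float(np.mean(np.abs(y - y_hat)))

# Hypothetical values: only the last prediction is off, by 2.
y_true = [1.0, 2.0, 3.0]
y_pred = [1.0, 2.0, 5.0]
example_mse = mse(y_true, y_pred)
example_mae = mae(y_true, y_pred)
```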
| Model | MAE | MSE |
|---|---|---|
| LSTM | 0.895 | 0.392 |
| TCN | 0.812 | 0.370 |
| Transformer | 0.586 | 0.255 |
| Informer | 0.517 | 0.207 |
| Reformer | 0.402 | 0.165 |
| Autoformer | 0.385 | 0.142 |
| ConvAutoformer | 0.277 | 0.098 |
| GLDSC-Autoformer | 0.259 | 0.095 |
| GLDSC-ConvAutoformer | 0.129 | 0.026 |
The table above summarizes the performance metrics. Relative to the standard Autoformer, our model reduces MAE by approximately 66.5% (from 0.385 to 0.129) and MSE by approximately 81.7% (from 0.142 to 0.026). This underscores the effectiveness of combining improved clustering with enhanced feature extraction for electric vehicle load prediction. Additionally, the clustering analysis revealed that groups with strong periodicity had lower prediction errors, while the residual cluster contributed minimally to overall error due to its small size.
In terms of model configuration, we used a single-step prediction setup with parameters such as prediction length pred_len=1, label length label_len=2, sequence length seq_len=4, encoder layers e_layers=4, decoder layers d_layers=1, attention heads n_heads=8, model dimension d_model=48, and batch size Batch_size=3. For the convolutional layers, we experimented with kernel sizes (2×1, 3×1, 4×1, 5×1) and numbers of kernels (3, 5, 7, 9), settling on a first layer with kernel size 3×1 and 7 kernels, and a second layer with kernel size 2×1 and 5 kernels for the five clusters. This optimized feature extraction without overfitting.
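The reported settings could be collected in a small configuration object, as in the sketch below. The class names and the `validate` checks are assumptions added for illustration; only the parameter names and values come from the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ConvLayerConfig:
    kernel_size: int   # kernel spans kernel_size x 1 along the time axis
    n_kernels: int     # number of kernels (output channels)

@dataclass(frozen=True)
class ModelConfig:
    pred_len: int = 1      # single-step prediction
    label_len: int = 2
    seq_len: int = 4
    e_layers: int = 4
    d_layers: int = 1
    n_heads: int = 8
    d_model: int = 48
    batch_size: int = 3
    conv_layers: tuple = (ConvLayerConfig(3, 7), ConvLayerConfig(2, 5))

    def validate(self):
        """Sanity checks that are assumptions of this sketch, not stated in the text."""
        if self.label_len > self.seq_len:
            raise ValueError("label_len should not exceed seq_len")
        if self.d_model % self.n_heads != 0:
            raise ValueError("d_model should be divisible by n_heads")

config = ModelConfig()
config.validate()
```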
The implications of this work are significant for the integration of electric vehicles into power systems, especially in China where EV adoption is accelerating. Accurate load forecasting enables better grid management, reduces peak demand stresses, and supports the deployment of renewable energy sources. For instance, by predicting charging loads, utilities can schedule charging during off-peak hours, minimizing costs and enhancing reliability. The use of advanced models like GLDSC-ConvAutoformer can also inform infrastructure planning, such as the placement of new charging stations based on predicted demand patterns.
In conclusion, the GLDSC-ConvAutoformer model presents a robust solution for short-term charging load prediction of electric vehicles. By integrating grey relational degree-based clustering with convolutional-enhanced Autoformer, we achieve higher accuracy and stability in forecasts. This approach not only addresses the variability in EV charging behavior but also provides a scalable framework for future applications in smart grids. As the electric vehicle market grows, particularly in China, such predictive models will play a vital role in ensuring sustainable energy management and supporting the transition to electric mobility.
Future research could explore the integration of additional external factors, such as weather conditions and user demographics, to further refine predictions. Moreover, adapting the model for real-time forecasting and larger datasets could enhance its practicality for grid operators. The continuous evolution of electric vehicle technologies and charging infrastructure in China's EV ecosystem will necessitate ongoing improvements in predictive analytics, and our work lays a foundation for these advancements.
