Electric Vehicle Load Prediction Combining Attention and Multi-Scale Features

Abstract

Aiming to address the randomness and low prediction accuracy of electric vehicle (EV) charging loads, I propose a novel prediction model, VMD-AM-MSF-TCNnet, which integrates variational mode decomposition (VMD), attention mechanisms, and multi-scale feature fusion within a temporal convolutional network (TCN). The model decomposes EV load sequences using VMD optimized by the whale optimization algorithm (WOA), enhances TCN residual blocks with gating mechanisms and dual (time and channel) attention, and fuses multi-scale features extracted by residual blocks with different receptive fields. Experimental results demonstrate significant improvements over state-of-the-art models: lower mean squared error (MSE) and mean absolute error (MAE), and a higher coefficient of determination (R²).

1. Introduction

The proliferation of electric vehicles has placed critical demands on power grid management, making accurate load prediction essential for optimal grid scheduling, charging station planning, and user service optimization. Traditional statistical methods (e.g., time series analysis, regression) struggle with the non-linearity and complexity of EV load data, while deep learning models such as recurrent neural networks (RNNs) face challenges including vanishing gradients and limited parallel computation.

Temporal convolutional networks (TCNs) have emerged as a promising alternative, leveraging dilated convolutions to capture long-term dependencies and enabling parallel processing. However, the inherent randomness of EV load data necessitates advanced feature extraction and noise reduction. I introduce VMD to decompose load sequences into manageable components, combined with attention mechanisms to prioritize critical temporal and channel-wise features.

2. Methodology

2.1 Model Architecture: VMD-AM-MSF-TCNnet

The proposed model follows a multi-stage framework: data preprocessing, VMD decomposition, feature screening, multi-scale feature extraction, and load reconstruction.

  1. Data Preprocessing: EV load and meteorological data are downsampled to hourly granularity. Missing values are imputed linearly, and meteorological features are min-max normalized using: \(w_j' = \frac{w_j - w_{\text{min}}}{w_{\text{max}} - w_{\text{min}}}\), where \(w_j\) is the original feature value and \(w_{\text{min}}, w_{\text{max}}\) denote the feature's range.
  2. VMD with WOA Optimization: VMD decomposes the load sequence into intrinsic mode functions (IMFs), with the number of components K and the penalty factor \(\alpha\) optimized by WOA. The WOA simulates humpback whale predation, updating positions via:
    • For \(R \geq 0.5\): \(L^{t+1}(n) = L_{\text{best}}^t + |L_{\text{best}}^t - L^t(n)| e^{\pi r_1} \cos(2\pi r_1)\)
    • For \(R < 0.5\) and \(|\beta| < 1\): \(\Delta L(n) = |\gamma L_{\text{best}}^t - L^t(n)|, \quad L^{t+1}(n) = L_{\text{best}}^t - \beta \Delta L(n)\), where \(R, r_1, r_2, r_3\) are random numbers, and \(\beta, \gamma\) are control parameters.
  3. Improved Residual Blocks: TCN residual blocks are enhanced with gating mechanisms and dual attention:
    • Gating Unit: Controls information flow via: \(z_1 = \text{sigmoid}(W_{z1} * H_1 + b_{z1}), \quad r_1 = \text{sigmoid}(W_{r1} * H_1 + b_{r1})\), \(\tilde{H}_1 = r_1 \odot \text{state}, \quad H_1' = (1 - z_1) \odot \tilde{H}_1 + z_1 \odot H_1\)
    • Time Attention: Generates temporal weights:\(A_t = \text{softmax}\left(\text{Flatten}\left(\tanh(W_t * X + c)\right)\right)\)
    • Channel Attention: Emphasizes informative channels: \(A_c = \text{Dense}\left(\text{Dense}\left(\text{Reshape}\left(\text{GlobalAveragePooling}(Y)\right)\right)\right)\), where Y is the time-attended feature.
  4. Multi-Scale Feature Fusion: Three residual blocks with kernel sizes [3, 5, 7] and dilation rates [1, 2, 4] extract features at different time scales. Fused features are: \(\text{Res}_{\text{mul}} = \text{Res}_1 + \text{Res}_2 + \text{Res}_3\), \(\text{Res}_{\text{fusion}} = \text{Flatten}\left(\text{Dropout}\left(\text{Res}_{\text{mul}} * A_d\right)\right)\), where \(A_d\) is the dynamic attention weight.
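The multi-scale extraction and fusion in step 4 can be sketched in NumPy. This is an illustrative toy, not the paper's implementation: random kernels stand in for learned convolution weights, and a plain softmax stands in for the dynamic attention weight \(A_d\).

```python
import numpy as np

def causal_dilated_conv(x, kernel, dilation):
    # 1-D causal dilated convolution: output[t] depends only on
    # x[t], x[t - dilation], x[t - 2*dilation], ...
    k = len(kernel)
    pad = (k - 1) * dilation
    xp = np.concatenate([np.zeros(pad), x])
    return np.array([sum(kernel[j] * xp[t + pad - j * dilation] for j in range(k))
                     for t in range(len(x))])

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

rng = np.random.default_rng(1)
x = rng.standard_normal(32)                 # a toy hourly load sequence

# Three scales, as in the paper: kernel sizes [3, 5, 7], dilations [1, 2, 4].
scales = [(3, 1), (5, 2), (7, 4)]
branches = [causal_dilated_conv(x, rng.standard_normal(k), d) for k, d in scales]

res_mul = sum(branches)                     # Res_mul = Res_1 + Res_2 + Res_3
a_d = softmax(res_mul)                      # toy stand-in for the dynamic attention A_d
res_fusion = (res_mul * a_d).ravel()        # weighted fusion, then flatten
print(res_fusion.shape)                     # (32,)
```

Each branch sees a different receptive field (the kernel-size/dilation pair), so the elementwise sum mixes short- and long-range temporal patterns before the attention weighting.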

2.2 Meteorological Feature Screening

A multivariate linear model screens meteorological features:\(Y = a_1 W_1 + \dots + a_i W_i + \dots + a_s W_s + b\) Features with \(|a_i| > 0.5\) are retained. Table 1 shows significant features affecting EV loads.
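A minimal sketch of this screening step, assuming standardized features and synthetic data invented for illustration (the coefficients 0.8, -0.9, 0.1 are not from the paper): fit the multivariate linear model by least squares and keep features whose coefficient magnitude exceeds 0.5.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
W = rng.standard_normal((n, 3))   # three candidate meteorological features
# Synthetic target: features 0 and 1 matter, feature 2 barely does.
Y = 0.8 * W[:, 0] - 0.9 * W[:, 1] + 0.1 * W[:, 2] + 0.05 * rng.standard_normal(n)

# Fit Y = a1*W1 + ... + as*Ws + b by least squares (intercept as an extra column).
A = np.column_stack([W, np.ones(n)])
coef, *_ = np.linalg.lstsq(A, Y, rcond=None)
a, b = coef[:-1], coef[-1]

# Screening rule from the paper: retain features with |a_i| > 0.5.
keep = [i for i, ai in enumerate(a) if abs(ai) > 0.5]
print(keep)  # [0, 1]
```

With real data the same rule would retain columns such as T, Po, P, and Pa from Table 1 while dropping U and Ff.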

3. Experiments

3.1 Data Sources

  • EV Load Data: ElaadNL project (2019.01.01–2019.06.30), 163,255 charging records, downsampled to hourly data.
  • Meteorological Data: Dutch weather data (same period), 4,344 records with features like temperature and wind speed.

3.2 Evaluation Metrics

  • Mean Squared Error (MSE): \(\text{MSE} = \frac{1}{\text{samp}} \sum_{i=1}^{\text{samp}} (y_i^{\text{pre}} - y_i^{\text{ori}})^2\)
  • Mean Absolute Error (MAE): \(\text{MAE} = \frac{1}{\text{samp}} \sum_{i=1}^{\text{samp}} |y_i^{\text{pre}} - y_i^{\text{ori}}|\)
  • Coefficient of Determination (R²): \(R^2 = 1 - \frac{\sum_{i=1}^{\text{samp}} (y_i^{\text{pre}} - y_i^{\text{ori}})^2}{\sum_{i=1}^{\text{samp}} (y_i^{\text{ori}} - \bar{y})^2}\), where \(y_i^{\text{pre}}\), \(y_i^{\text{ori}}\), and \(\bar{y}\) are predictions, ground truths, and the mean load, and samp is the number of samples.
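The three metrics translate directly into NumPy; the short arrays below are toy values chosen only to show the computation.

```python
import numpy as np

def mse(y_pre, y_ori):
    # Mean squared error over all samples.
    return np.mean((y_pre - y_ori) ** 2)

def mae(y_pre, y_ori):
    # Mean absolute error over all samples.
    return np.mean(np.abs(y_pre - y_ori))

def r2(y_pre, y_ori):
    # Coefficient of determination: 1 - residual SS / total SS.
    ss_res = np.sum((y_pre - y_ori) ** 2)
    ss_tot = np.sum((y_ori - np.mean(y_ori)) ** 2)
    return 1.0 - ss_res / ss_tot

y_ori = np.array([3.0, 5.0, 4.0, 6.0])   # toy ground-truth loads (kW)
y_pre = np.array([2.5, 5.5, 4.0, 5.5])   # toy predictions (kW)
print(mse(y_pre, y_ori))  # 0.1875
print(mae(y_pre, y_ori))  # 0.375
print(r2(y_pre, y_ori))   # 0.85
```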

3.3 Experimental Setup

  • Hardware: AMD Ryzen 7 CPU, NVIDIA 4060 GPU, 16 GB RAM.
  • Software: PyCharm 2022, Adam optimizer, 50 training epochs.
  • Model Config: 3 residual blocks (kernel sizes 3, 5, 7; dilation rates 1, 2, 4).

4. Results and Discussion

4.1 Meteorological Feature Impact

Table 1 lists significant meteorological features. Atmospheric temperature (T), horizontal atmospheric pressure (Po), sea-level pressure (P), and 3-hour pressure change (Pa) show strong coefficients, while relative humidity (U) and wind speed (Ff) have weaker impacts.

Feature    Coefficient
T           0.81
Po         -0.86
P           0.85
Pa          1.33
U           0.05
Ff          0.28

Table 1. Impact of meteorological features on EV load

4.2 Comparison with State-of-the-Art Models

Table 2 shows the proposed model outperforms competitors:

  • MSE of 1.47 kW², 96.82% lower than GRU.
  • MAE of 0.91 kW, 81.87% lower than GRU.
  • R² of 0.987, 59.71% higher than GRU.
Model            MSE (kW²)   MAE (kW)   R²
GRU              46.28       5.02       0.618
LSTM             43.26       4.89       0.643
CNN-AT-LSTM      41.15       4.67       0.653
EMD-LSTM         25.84       3.80       0.787
VMD-CNN           7.31       2.05       0.937
VMD-FCA-TCN       6.93       1.97       0.940
VMD-LSTM          4.53       1.63       0.960
ITCN-AT-BiGRU     3.43       1.42       0.970
Proposed Model    1.47       0.91       0.987

Table 2. Prediction errors of compared models

4.3 Ablation Studies

Table 3 validates the contribution of each model component:

  • Adding VMD to TCN (VT) reduces MSE by 84.04%.
  • Improved residual blocks (VT-IR) further reduce MSE by 68.51%.
  • Multi-scale feature fusion (VT-MSR) reduces MSE by 78.00%.
Model           MSE (kW²)   MAE (kW)   R²
TCN             47.55       4.97       0.607
VT (TCN+VMD)     7.59       2.09       0.934
VT-IR            2.39       1.21       0.980
VT-MSR           1.67       1.01       0.985
Proposed         1.47       0.91       0.987

Table 3. Ablation experiment results

5. Conclusion

I have developed VMD-AM-MSF-TCNnet, an EV load prediction model combining VMD, attention mechanisms, and multi-scale TCN features. Key findings include:

  1. VMD decomposes complex EV load sequences, reducing randomness and improving prediction accuracy.
  2. Gating and dual attention enhance feature representation, prioritizing critical temporal and channel information.
  3. Multi-scale feature fusion captures dependencies at different time scales, improving model robustness.

Future work will incorporate EV charging prices and additional contextual features to further refine predictions.
