In recent years, the rapid adoption of battery electric vehicles (EVs) has significantly transformed the transportation and energy sectors. With the global push toward carbon neutrality, the number of battery EVs on the road has grown rapidly, presenting both challenges and opportunities for power-grid management. The spatiotemporally random charging of large EV fleets can cause grid instability, including harmonic pollution, voltage violations, and phase imbalances. Conversely, when coordinated through advanced control strategies, aggregated EV fleets can serve as valuable flexible resources for grid ancillary services such as frequency regulation, peak shaving, and economic dispatch. Harnessing this potential, however, requires accurate and fine-grained predictions of the schedulable capacity of battery EVs across multiple timescales: real-time, ultra-short-term, and day-ahead. Existing prediction methods often struggle with long-sequence forecasting, where accuracy and efficiency degrade markedly. To address this, I propose an enhanced Informer-based model that leverages a convolutional sparse attention mechanism to improve the prediction of aggregated EV schedulable capacity over multiple timescales and spatial dimensions.

The schedulable capacity of a battery EV is the adjustable range of power or energy that can be exchanged with the grid without disrupting the user's normal usage. This capacity is quantified through four key metrics: schedulable charging capacity (SCC), schedulable discharging capacity (SDC), schedulable charging power (SCP), and schedulable discharging power (SDP). For an individual EV whose charging record is indexed by d, the schedulable capacity at time t can be modeled from its charging start time \(t_{d,0}\), end time \(t_{d,end}\), initial energy \(C_{d,0}\), target energy \(C_{d,end}\), and maximum charging/discharging power. Assuming a response step length \(t_s\), the schedulable capacities are computed as follows:
$$A^{SCC}_{d,t} = \min\left(C_{d,end},\ C_{d,0} + \int_{t_{d,0}}^{t+t_s} \eta_c P_{d,c}(t)\, dt\right) - C_{d,t}$$
$$A^{SCP}_{d,t} = \min\left(\frac{A^{SCC}_{d,t}}{\eta_c t_s}, P_{d,c}(t)\right)$$
$$A^{SDC}_{d,t} = C_{d,t} - \max\left(C_{d,0},\ C_{d,end} - \int_{t+t_s}^{t_{d,end}} \eta_c P_{d,c}(t)\, dt\right)$$
$$A^{SDP}_{d,t} = -\min\left(\frac{A^{SDC}_{d,t} \eta_d}{t_s}, P_{d,d}\right)$$
Here, \(\eta_c\) and \(\eta_d\) are the charging and discharging efficiencies, while \(P_{d,c}(t)\) and \(P_{d,d}\) denote the maximum charging and discharging powers, respectively. For an aggregator l managing \(N_l\) battery EVs, the aggregated schedulable capacity is obtained via Minkowski summation:
$$A^{SCC}_{0,l,t} = \sum_{d=1}^{N_l} A^{SCC}_{d,t}, \quad A^{SDC}_{0,l,t} = \sum_{d=1}^{N_l} A^{SDC}_{d,t}, \quad A^{SCP}_{0,l,t} = \sum_{d=1}^{N_l} A^{SCP}_{d,t}, \quad A^{SDP}_{0,l,t} = \sum_{d=1}^{N_l} A^{SDP}_{d,t}$$
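To make the four metrics concrete, here is a minimal numerical sketch of the equations above. It assumes each EV charges at its constant maximum power, so the integrals reduce to products; the function names, units (hours, kWh, kW), and default efficiencies are illustrative rather than taken from the study:

```python
def schedulable_capacity(t, t0, t_end, C0, C_end, C_t,
                         P_c, P_d, eta_c=0.95, eta_d=0.95, t_s=1.0):
    """Schedulable capacities of one EV at time t (times in hours, energies
    in kWh, powers in kW). Assumes constant max charging power, so the
    integrals in the definitions reduce to eta_c * P_c * dt."""
    # Highest energy reachable by t + t_s if charging at full power since t0
    reachable = C0 + eta_c * P_c * (t + t_s - t0)
    A_scc = min(C_end, reachable) - C_t              # schedulable charging capacity
    A_scp = min(A_scc / (eta_c * t_s), P_c)          # schedulable charging power
    # Lowest energy from which the target C_end is still reachable by t_end
    floor_energy = max(C0, C_end - eta_c * P_c * (t_end - (t + t_s)))
    A_sdc = C_t - floor_energy                       # schedulable discharging capacity
    A_sdp = -min(A_sdc * eta_d / t_s, P_d)           # schedulable discharging power
    return A_scc, A_scp, A_sdc, A_sdp

def aggregate(per_ev):
    """Minkowski sum: element-wise total over all EVs of one aggregator."""
    return [sum(vals) for vals in zip(*per_ev)]
```

With a fleet of such per-EV tuples, `aggregate` yields the four aggregated series that serve as the prediction targets below.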
These aggregated values form the historical time-series data used for prediction. The need for multi-timescale prediction arises from diverse grid ancillary services. For instance, frequency regulation requires real-time predictions with minute-level granularity, peak shaving may involve hourly forecasts, and economic dispatch relies on day-ahead planning. The table below outlines the key parameters for different timescales in this study:
| Timescale | Input Sequence Length | Output Sequence Length | Time Granularity | Execution Frequency |
|---|---|---|---|---|
| Real-time | 60 | 1 | 1 minute | Every minute |
| Ultra-short-term | 180 | 60 | 1 minute | Every hour |
| Day-ahead | 1440 | 1440 | 1 minute | Daily |
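For reference, the settings in the table can be kept in a small configuration mapping; the key and field names here are my own, not identifiers from the study:

```python
# Prediction-horizon settings from the table above.
# Sequence lengths are counted in 1-minute steps.
TIMESCALES = {
    "real_time":        {"input_len": 60,   "output_len": 1,    "granularity_min": 1, "run_every": "minute"},
    "ultra_short_term": {"input_len": 180,  "output_len": 60,   "granularity_min": 1, "run_every": "hour"},
    "day_ahead":        {"input_len": 1440, "output_len": 1440, "granularity_min": 1, "run_every": "day"},
}
```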
To handle these long sequences effectively, I developed an improved Informer model. The original Informer uses a probabilistic sparse (ProbSparse) self-attention mechanism to reduce computational complexity from \(O(L^2)\) to \(O(L \log L)\) for sequence length L, but it relies on dot-product attention and can overlook local trend information. My enhancement replaces this with a convolutional sparse attention mechanism, which better captures local variations in time-series data. The overall architecture consists of an encoder and a decoder, with embedding layers for positional and temporal encoding.
In the encoder, the input sequences—aggregated schedulable capacity and corresponding timestamps—are first embedded with positional encoding using sine and cosine functions:
$$F_{e,2j} = \sin\left(\frac{e}{(2L)^{2j/d_{model}}}\right), \quad F_{e,2j+1} = \cos\left(\frac{e}{(2L)^{2j/d_{model}}}\right)$$
where e is the position in the sequence, j is the dimension index, and \(d_{model}\) is the feature dimension. Temporal encoding converts timestamps into a format like [minute, hour, day of week, day of year, month] to encapsulate periodic patterns. The core innovation lies in the multi-head convolutional sparse self-attention layer. Instead of applying linear transformations directly to the input sequence x to obtain query (Q), key (K), and value (V) matrices as in traditional attention, I first perform a convolution operation with kernel h:
$$x_{\text{conv}} = \text{conv}(x, h)$$
Then, the matrices are derived as:
$$[Q, K, V]^T = W \cdot [x_{\text{conv}}, x_{\text{conv}}, x_{\text{conv}}]^T + b$$
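As a sketch, this projection might look as follows in PyTorch; the kernel size, padding, and shared-weight layout are assumptions of mine, not details taken from the implementation:

```python
import torch
import torch.nn as nn

class ConvQKV(nn.Module):
    """Sketch of the convolutional Q/K/V projection: a 1-D convolution first
    mixes each time step with its neighbours (local trend), then one linear
    map produces query, key and value from the same convolved sequence."""
    def __init__(self, d_model, kernel_size=3):
        super().__init__()
        self.conv = nn.Conv1d(d_model, d_model, kernel_size,
                              padding=kernel_size // 2)   # x_conv = conv(x, h)
        self.proj = nn.Linear(d_model, 3 * d_model)       # W·x_conv + b, split 3 ways

    def forward(self, x):                   # x: (batch, seq_len, d_model)
        x_conv = self.conv(x.transpose(1, 2)).transpose(1, 2)
        q, k, v = self.proj(x_conv).chunk(3, dim=-1)
        return q, k, v
```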
This convolution step integrates local trend information, making the attention mechanism more sensitive to sequence changes. The sparse attention process selects only the most influential queries based on a scoring function. For a query \(q_i\) and a randomly sampled set of M keys \(k_j\), the score is computed as:
$$\text{score}(q_i, k_j) = \frac{q_i k_j^T}{\sqrt{d_k}}$$
The measure \(\bar{M}(q_i, k)\) is defined as:
$$\bar{M}(q_i, k) = \max_j\left(\text{score}(q_i, k_j)\right) - \frac{1}{M} \sum_{j=1}^M \text{score}(q_i, k_j)$$
Queries with the top N values of \(\bar{M}(q_i, k)\) are retained as \(\hat{Q}\), and attention is computed as:
$$\text{Attention}(\hat{Q}, K, V) = \text{Softmax}\left(\frac{\hat{Q} K^T}{\sqrt{d_k}}\right) V$$
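The selection and attention steps above can be sketched in NumPy as follows; the key-sampling strategy (uniform sampling with a fixed default seed) is simplified relative to the full algorithm:

```python
import numpy as np

def sparse_attention(Q, K, V, n_top, m_sample, rng=None):
    """Sketch of sparse attention: score each query against a random sample
    of m_sample keys, keep the n_top queries with the largest max-minus-mean
    measure, and run softmax attention only for those queries."""
    rng = np.random.default_rng(0) if rng is None else rng
    L, d_k = K.shape
    idx = rng.choice(L, size=m_sample, replace=False)     # sampled keys
    scores = Q @ K[idx].T / np.sqrt(d_k)                  # score(q_i, k_j)
    M_bar = scores.max(axis=1) - scores.mean(axis=1)      # max − mean measure
    top = np.sort(np.argsort(M_bar)[-n_top:])             # queries kept as Q_hat
    full = Q[top] @ K.T / np.sqrt(d_k)
    weights = np.exp(full - full.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)         # row-wise softmax
    return top, weights @ V
```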
This reduces computational load while focusing on critical points. The encoder also includes distillation layers that use 1D convolution and max-pooling to halve the sequence length, thereby extracting essential features and improving robustness. The decoder follows a generative inference approach, taking a concatenated input of a historical segment \(x_{\text{token}}\) and a placeholder sequence of zeros. It employs masked convolutional sparse attention to prevent information leakage and utilizes the encoded features for prediction.
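The distillation step described above can be sketched as a small PyTorch module; the kernel size, activation, and pooling parameters are my assumptions, chosen so the output length is exactly half the input length:

```python
import torch
import torch.nn as nn

class Distill(nn.Module):
    """Sketch of an encoder distillation layer: 1-D convolution followed by
    stride-2 max-pooling, halving the sequence length while keeping the
    dominant features."""
    def __init__(self, d_model):
        super().__init__()
        self.conv = nn.Conv1d(d_model, d_model, kernel_size=3, padding=1)
        self.act = nn.ELU()
        self.pool = nn.MaxPool1d(kernel_size=3, stride=2, padding=1)

    def forward(self, x):                  # x: (batch, seq_len, d_model)
        y = self.pool(self.act(self.conv(x.transpose(1, 2))))
        return y.transpose(1, 2)           # (batch, seq_len // 2, d_model)
```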
For experimental validation, I used over 500,000 real EV charging records collected in one city throughout 2021. The data included timestamps, charging/discharging power, and geographical coordinates. After computing the aggregated schedulable capacity for eight distinct EV aggregators (EVAs) at 1-minute granularity, the dataset was split into training (80%) and testing (20%) sets. The fleets exhibited diverse spatiotemporal patterns, as summarized below:
| EVA ID | Number of Charging Records | SCC Range (MWh) | SCP Range (MW) |
|---|---|---|---|
| 1 | 78,155 | [-2.203, 1.362] | [-15.141, 5.960] |
| 2 | 72,236 | [-4.253, 2.457] | [-14.881, 4.587] |
| 3 | 58,540 | [-1.988, 1.367] | [-10.358, 5.773] |
| 4 | 108,081 | [-2.199, 1.531] | [-4.880, 4.423] |
| 5 | 53,823 | [-14.095, 7.376] | [-25.103, 6.435] |
| 6 | 54,023 | [-7.791, 5.378] | [-21.961, 6.823] |
| 7 | 81,980 | [-7.471, 3.781] | [-12.279, 4.183] |
| 8 | 39,144 | [-0.467, 0.197] | [-1.224, 1.224] |
These EVAs show varied charging behaviors, with demand peaking at different hours, which shapes their schedulable capacity. Prediction performance was evaluated using mean absolute error (MAE) and mean absolute percentage error (MAPE), defined as:
$$\text{MAE} = \frac{1}{S} \sum_{i=1}^S |y_i - \hat{y}_i|, \quad \text{MAPE} = \frac{1}{S} \sum_{i=1}^S \frac{|y_i - \hat{y}_i|}{|y_i|} \times 100\%$$
where S is the number of samples, \(y_i\) is the actual value, and \(\hat{y}_i\) is the predicted value. I compared the improved Informer model against several benchmarks: Transformer, original Informer, Long Short-Term Memory (LSTM), and Temporal Convolutional Network (TCN). The training was conducted on a system with an Intel i7 processor, GeForce GTX 1070 GPU, and 16 GB RAM, using PyTorch.
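Both metrics are straightforward to compute; in this sketch, following the standard definition, the percentage error is taken relative to the actual values \(y_i\):

```python
import numpy as np

def mae(y, y_hat):
    """Mean absolute error over S samples."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return np.mean(np.abs(y - y_hat))

def mape(y, y_hat):
    """Mean absolute percentage error, relative to the actual values y."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return np.mean(np.abs(y - y_hat) / np.abs(y)) * 100.0
```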
In real-time scale prediction (1 minute ahead), the improved Informer demonstrated superior accuracy. For EVA 4, which has complex demand patterns, the results are highlighted below:
| Algorithm | SCC MAE (kWh) | SCC MAPE (%) | Training Time (s) |
|---|---|---|---|
| Improved Informer | 0.420 | 0.445 | 265 |
| LSTM | 2.057 | 1.001 | 124 |
| TCN | 5.749 | 0.925 | 212 |
| Transformer | 1.863 | 0.627 | 354 |
| Informer | 0.648 | 0.484 | 256 |
My improved Informer reduced MAPE by 10.59% relative to the original Informer and by 38.61% relative to the Transformer. Against non-attention methods, the improvements were larger still: 54.71% over LSTM and 64.01% over TCN. The convolutional sparse attention slightly increased training time relative to Informer but remained efficient, especially for longer sequences.
For the ultra-short-term scale (60 minutes ahead), the improved Informer consistently outperformed the other algorithms across most EVAs. The average MAPE reduction was 18.94% versus Informer, 23.05% versus Transformer, 24.93% versus LSTM, and 30.71% versus TCN. Training times were comparable: my method was only 2.27% slower than Informer and 5.53% faster than Transformer, demonstrating that the approach scales to medium-length sequences.
In day-ahead scale prediction (24 hours ahead), where sequences are longest (1440 points), the improved Informer showed the largest gains. For EVA 4, the results are summarized as follows:
| Algorithm | SCC MAE (kWh) | SCC MAPE (%) | Training Time (s) |
|---|---|---|---|
| Improved Informer | 10.076 | 8.872 | 794 |
| LSTM | 11.618 | 13.387 | 941 |
| TCN | 11.816 | 11.332 | 803 |
| Transformer | 15.693 | 11.189 | 895 |
| Informer | 12.094 | 11.603 | 769 |
The improved Informer achieved a 21.75% lower MAPE than Informer, 29.21% lower than Transformer, 45.84% lower than LSTM, and 39.4% lower than TCN. Training time was reduced by up to 22.01% compared to LSTM and 14.91% compared to TCN, highlighting the efficiency of the sparse attention mechanism for long sequences. The convolutional component enabled better capture of local trends, as evidenced in prediction curves where my model closely tracked actual fluctuations, while others lagged or missed subtle variations.
Spatial distribution predictions further validated the model's utility. By clustering EV charging records into aggregators based on geographical and temporal features, I could forecast schedulable capacity across different city regions. For example, day-ahead predictions for December 27 showed capacity shifting from central-southern to northeastern areas, consistent with the known spatiotemporal patterns of EV usage. The error distribution also exhibited spatial correlation, with underestimation in growing regions due to prediction lag. Compared with a non-clustered approach that treated all EVs as a single aggregator, clustered prediction reduced MAPE by 25.13%, underscoring the importance of accounting for spatiotemporal heterogeneity in EV fleets.
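As an illustrative sketch of such a clustering step (the exact procedure and feature set used in the study are not detailed here): a plain k-means on record coordinates, to which time-of-day features could be appended as extra columns:

```python
import numpy as np

def kmeans(points, k, n_iter=50, rng=None):
    """Tiny k-means sketch for grouping charging records into aggregators
    by location. Illustrative only; not the study's exact method."""
    rng = np.random.default_rng(0) if rng is None else rng
    # Initialise centers from k distinct records
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each record to its nearest center (squared Euclidean)
        dists = ((points[:, None] - centers[None]) ** 2).sum(-1)
        labels = np.argmin(dists, axis=1)
        # Move each center to the mean of its assigned records
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels, centers
```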
The integration of EVs into grid services requires precise capacity forecasts. My improved Informer model addresses key limitations of long-sequence prediction by incorporating convolutional sparse attention, which sharpens sensitivity to local trends while maintaining computational efficiency. The multi-timescale framework supports a range of ancillary services, from real-time frequency regulation to day-ahead economic dispatch. Experimental results confirm significant accuracy improvements across all timescales, with MAPE reductions of 10.59–64.01% depending on the benchmark. Moreover, the spatial prediction capability lets grid operators deploy EV resources according to regional demand.
Future work could involve real-time simulation systems that update EV states dynamically during grid interactions, enabling rolling predictions that adapt to dispatch decisions. Additionally, incorporating distribution-network constraints, such as voltage limits and power flow, could enhance the practical applicability of the predictions. As EV adoption continues to grow, such advanced forecasting models will be crucial for maintaining grid stability and maximizing the value of these mobile energy resources. The proposed method lays a foundation for scalable, accurate, and efficient schedulable capacity prediction, paving the way for smarter integration of EVs into the power ecosystem.
