In recent years, the rapid growth of electric vehicle (EV) adoption has placed significant emphasis on the reliability and accuracy of EV charging station infrastructure. As a critical component of the smart grid, EV charging stations must not only supply power efficiently but also ensure precise metering for fair transaction settlements. However, metering faults in EV charging stations, caused by factors such as device aging or human interference, can lead to substantial economic losses and grid instability. Traditional manual inspection methods are labor-intensive, costly, and inefficient, especially given the diversity in EV charging station designs and operational conditions. Therefore, developing automated fault prediction methods for EV charging stations is essential to enhance maintenance efficiency and reduce operational costs.
In this study, we address the challenge of fault prediction in EV charging stations by leveraging deep learning techniques. Our approach begins with an analysis of user charging behavior, which reveals temporal correlations in charging patterns. Based on this, we model charging data in non-Euclidean domains and propose a novel graph convolutional network (GCN) combined with a convolutional neural network (CNN) to capture complex nonlinear relationships between data features and fault types. Through extensive experiments on real-world datasets, we demonstrate the superiority of our model over existing methods, achieving high performance in fault classification for EV charging stations.

The analysis of user charging behavior is fundamental to understanding the data characteristics of EV charging stations. We focus on three key parameters: charging start time \( T_S \), charging duration \( T_C \), and charging end time \( T_E \), which are related by the equation \( T_E = T_S + T_C \). To model the distribution of these parameters, we employ kernel density estimation (KDE), a non-parametric method that smooths data points to estimate probability densities. The KDE function for a sample \( X \) is defined as:
$$ f_K(x) = \frac{1}{nh} \sum_{i=1}^{n} K\left( \frac{X_i - x}{h} \right) $$
where \( n \) is the number of samples, \( K(\cdot) \) is the kernel function, and \( h \) is the bandwidth. We use the Gaussian kernel for its smooth properties:
$$ K(u) = \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{u^2}{2} \right) $$
The adaptive bandwidth \( h \) is selected via least squares cross-validation to minimize the integrated square error (ISE). For charging start time, the density estimate is:
$$ f_{K_S}(T_S) = \frac{1}{\sqrt{2\pi}\, n h_S} \sum_{i=1}^{n} \exp\left[ -\frac{1}{2}\left( \frac{T_{S_i} - T_S}{h_S} \right)^2 \right] $$
Similarly, for charging end time and duration, we have:
$$ f_{K_E}(T_E) = \frac{1}{\sqrt{2\pi}\, n h_E} \sum_{i=1}^{n} \exp\left[ -\frac{1}{2}\left( \frac{T_{E_i} - T_E}{h_E} \right)^2 \right] $$
$$ f_{K_C}(T_C) = \frac{1}{\sqrt{2\pi}\, n h_C} \sum_{i=1}^{n} \exp\left[ -\frac{1}{2}\left( \frac{T_{C_i} - T_C}{h_C} \right)^2 \right] $$
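The following minimal sketch evaluates the Gaussian-kernel density estimate above in Python; the start-time samples, evaluation grid, and bandwidth `h` are illustrative placeholders rather than values from the dataset.

```python
# Minimal sketch (not the paper's code): Gaussian KDE of charging start times,
# following the formula above. Samples and bandwidth are illustrative.
import numpy as np

def gaussian_kde(samples, grid, h):
    """Evaluate f_K(x) = 1/(n*h) * sum_i K((X_i - x)/h) with a Gaussian kernel."""
    samples = np.asarray(samples, dtype=float)
    n = samples.size
    u = (samples[None, :] - grid[:, None]) / h          # shape: (len(grid), n)
    k = np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)    # Gaussian kernel values
    return k.sum(axis=1) / (n * h)

# Example: charging start times in hours of the day (hypothetical data).
t_start = np.array([8.2, 8.5, 9.0, 18.1, 18.4, 19.0, 22.5])
grid = np.linspace(0, 24, 241)
f_ks = gaussian_kde(t_start, grid, h=0.8)                # bandwidth chosen for illustration
print(grid[np.argmax(f_ks)])                             # mode of the start-time density
```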
Under the assumption of independence, the relationship between these densities is given by the convolution \( f_{K_E} = f_{K_S} * f_{K_C} \). However, in practice, charging behavior in EV charging stations often shows correlations, leading to a relative error \( \varepsilon_r \) when independence is assumed. This error is computed as:
$$ \varepsilon_r = \int_{-\infty}^{+\infty} \frac{ (f_{K_S} * f_{K_C})(T_E) - f_{K_E}(T_E) }{ f_{K_E}(T_E) } \, dT_E $$
This analysis highlights the temporal dependencies in EV charging station data, which we exploit to construct graph-based models for fault prediction.
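As a rough numerical illustration of this error, the convolved start-time and duration densities can be compared against the directly estimated end-time density. The sketch below approximates the integral by a Riemann sum and assumes all three densities are evaluated on uniform grids with the same origin and spacing, which is a simplification.

```python
# Illustrative sketch (assumption: densities share a uniform grid with spacing dt):
# relative error between the convolved start/duration densities and the directly
# estimated end-time density, with the integral approximated by a Riemann sum.
import numpy as np

def relative_error(f_ks, f_kc, f_ke, dt):
    conv = np.convolve(f_ks, f_kc) * dt        # discrete (f_KS * f_KC)
    conv = conv[: len(f_ke)]                   # align with the end-time grid
    mask = f_ke > 0                            # avoid division by zero
    return np.sum((conv[mask] - f_ke[mask]) / f_ke[mask]) * dt
```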
Data preprocessing is crucial for handling raw charging data from EV charging stations, which often contains missing values and outliers. We apply linear interpolation for missing data, the 3-σ rule for outlier detection, and min-max normalization for scaling. For a sample \( X \), the preprocessing steps are defined as follows. Linear interpolation \( F_1(X_i) \) handles missing values:
$$ F_1(X_i) =
\begin{cases}
X_i & \text{if } X_i \neq \text{NaN} \\
0 & \text{if } X_i = \text{NaN and } (X_{i-1} = \text{NaN} \text{ or } X_{i+1} = \text{NaN}) \\
\frac{X_{i-1} + X_{i+1}}{2} & \text{if } X_i = \text{NaN and } X_{i-1} \neq \text{NaN}, X_{i+1} \neq \text{NaN}
\end{cases} $$
Outlier removal \( F_2(X_i) \) uses the 3-σ rule:
$$ F_2(X_i) =
\begin{cases}
X_{\text{Up}} & \text{if } X_i \geq X_{\text{Up}}, \quad X_{\text{Up}} = X_{\text{mean}} + 3X_{\sigma} \\
X_i & \text{otherwise}
\end{cases} $$
Min-max normalization \( F_3(X_i) \) scales data to [0,1]:
$$ F_3(X_i) = \frac{X_i - X_{\text{min}}}{X_{\text{max}} - X_{\text{min}}} $$
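A compact sketch of the \( F_1 \), \( F_2 \), \( F_3 \) pipeline (neighbor interpolation, 3-σ clipping of the upper tail, min-max scaling) is shown below; it uses NaN as the missing-value marker and is an illustrative reading of the definitions above, not the exact implementation used in our experiments.

```python
# Sketch of the preprocessing pipeline F1 -> F2 -> F3 described above.
import numpy as np

def preprocess(x):
    x = np.asarray(x, dtype=float).copy()

    # F1: fill isolated NaNs with the mean of both neighbors, otherwise with 0.
    for i in range(len(x)):
        if np.isnan(x[i]):
            left = x[i - 1] if i > 0 else np.nan
            right = x[i + 1] if i < len(x) - 1 else np.nan
            x[i] = 0.5 * (left + right) if not (np.isnan(left) or np.isnan(right)) else 0.0

    # F2: clip values above the mean + 3*sigma upper bound.
    upper = x.mean() + 3.0 * x.std()
    x = np.minimum(x, upper)

    # F3: min-max normalization to [0, 1].
    return (x - x.min()) / (x.max() - x.min())

print(preprocess([3.0, np.nan, 5.0, 4.0, 120.0, 6.0]))
```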
After preprocessing, we aggregate charging data over a fixed period (e.g., 15 days) to capture behavioral patterns. The graph construction for EV charging station data involves representing time nodes as graph vertices, with edges connecting nodes based on charging events. Specifically, each node corresponds to a time step, and edges link consecutive nodes as well as start and end nodes of charging sessions, reflecting the temporal correlations identified in the charging behavior analysis.
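One possible realization of this graph construction is sketched below: one node per time step, chain edges between consecutive steps, and an extra edge per charging session linking its start and end nodes. The number of time steps and the session indices are hypothetical examples.

```python
# Sketch of the adjacency-matrix construction described above (our reading).
import numpy as np

def build_adjacency(num_steps, sessions):
    """sessions: list of (start_step, end_step) pairs for charging sessions."""
    A = np.zeros((num_steps, num_steps), dtype=float)
    for t in range(num_steps - 1):               # temporal chain edges
        A[t, t + 1] = A[t + 1, t] = 1.0
    for s, e in sessions:                        # charging-session start/end edges
        A[s, e] = A[e, s] = 1.0
    return A

# Example: 96 nodes for one day at 15-minute resolution, two hypothetical sessions.
A = build_adjacency(96, [(33, 42), (73, 90)])
```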
Our proposed model for EV charging station fault prediction combines GCN and CNN architectures to process non-Euclidean and Euclidean data, respectively. The GCN component handles graph-structured data, capturing spatial-temporal dependencies, while the CNN extracts local features from time-series data. The joint model, termed GCN-CNN, integrates these features for accurate fault classification.
The GCN operates on graph data defined by an adjacency matrix \( A \) and node feature matrix \( X \). Based on spectral graph theory, the graph convolution is performed in the Fourier domain. The normalized graph Laplacian \( L \) is computed as:
$$ L = I_n – D^{-\frac{1}{2}} A D^{-\frac{1}{2}} = U \Lambda U^T $$
where \( I_n \) is the identity matrix, \( D \) is the degree matrix, and \( U \) and \( \Lambda \) are the matrix of eigenvectors and the diagonal matrix of eigenvalues of \( L \), respectively. The convolution operation with a filter \( G \) is approximated using Chebyshev polynomials:
$$ G * X \approx \sum_{i=0}^{k} W'_i\, T_i \left( \frac{2L}{\lambda_{\text{max}}} - I_n \right) X $$
where \( T_i(\cdot) \) are Chebyshev polynomials defined recursively:
$$ T_0(X) = 1, \quad T_1(X) = X, \quad T_i(X) = 2X T_{i-1}(X) – T_{i-2}(X) $$
For simplicity, we set \( k=1 \) and \( \lambda_{\text{max}} \approx 2 \), leading to:
$$ G * X \approx W'_0 X - W'_1 D^{-\frac{1}{2}} A D^{-\frac{1}{2}} X $$
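The following numerical check is a sketch with scalar stand-ins for the learnable weight matrices; it verifies that the \( k=1 \), \( \lambda_{\text{max}} \approx 2 \) Chebyshev expansion reduces to the simplified form above.

```python
# Sanity check: with k = 1 and lambda_max ~ 2, the Chebyshev expansion equals
# w0*X - w1 * D^{-1/2} A D^{-1/2} X. Scalar weights stand in for W'_0, W'_1.
import numpy as np

A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
D_inv_sqrt = np.diag(1.0 / np.sqrt(A.sum(axis=1)))
A_norm = D_inv_sqrt @ A @ D_inv_sqrt
L = np.eye(3) - A_norm                  # normalized graph Laplacian
L_tilde = L - np.eye(3)                 # 2L/lambda_max - I with lambda_max = 2

X = np.random.randn(3, 4)
w0, w1 = 0.7, 0.3
cheb = w0 * X + w1 * (L_tilde @ X)      # T_0 = I, T_1 = L_tilde
direct = w0 * X - w1 * (A_norm @ X)     # the simplified first-order form
assert np.allclose(cheb, direct)
```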
The GCN architecture consists of two graph convolution layers, a max-pooling layer, and a fully connected layer. The first graph convolution layer outputs:
$$ H^{(1)}_g = \text{ReLU}(W^{(1)}_g A X + B^{(1)}_g) $$
where \( W^{(1)}_g \) and \( B^{(1)}_g \) are weight and bias matrices. The second layer processes this output:
$$ H^{(2)}_g = \text{ReLU}(W^{(2)}_g A H^{(1)}_g + B^{(2)}_g) $$
Max-pooling reduces the feature dimensions:
$$ H^{(3)}_g = \max_{i,j \in R_1} (H^{(2)}_{g,i,j}) $$
where \( R_1 \) is the pooling region. Finally, a fully connected layer produces the GCN output:
$$ Y_g = \text{ReLU}(W^{(4)}_g H^{(3)}_g + B^{(4)}_g) $$
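A possible PyTorch realization of this GCN branch is sketched below; the hidden widths, pooling size, and output dimension are our assumptions, and \( A \) is taken to be the pre-normalized adjacency matrix.

```python
# Sketch of the GCN branch (two graph convolutions, max-pooling, fully connected).
import torch
import torch.nn as nn

class GCNBranch(nn.Module):
    def __init__(self, num_nodes, in_feats, hidden=32, out_dim=64):
        super().__init__()
        self.w1 = nn.Linear(in_feats, hidden)     # first graph convolution
        self.w2 = nn.Linear(hidden, hidden)       # second graph convolution
        self.pool = nn.MaxPool1d(kernel_size=2)   # max-pooling over nodes
        self.fc = nn.Linear((num_nodes // 2) * hidden, out_dim)

    def forward(self, A, X):
        # A: (N, N) normalized adjacency; X: (batch, N, F) node features.
        h = torch.relu(self.w1(A @ X))                      # H_g^(1)
        h = torch.relu(self.w2(A @ h))                      # H_g^(2)
        h = self.pool(h.transpose(1, 2)).transpose(1, 2)    # H_g^(3)
        return torch.relu(self.fc(h.flatten(1)))            # Y_g
```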
The CNN component processes the time-series data of EV charging stations, formatted as a 2D matrix. It includes two convolutional layers, two max-pooling layers, and a fully connected layer. The first convolutional layer operates as:
$$ H^{(1)}_c = \text{ReLU}(W^{(1)}_c * X_c + B^{(1)}_c) $$
where \( X_c \) is the input matrix, and \( * \) denotes Euclidean convolution. Max-pooling follows:
$$ H^{(2)}_c = \max_{i,j \in R_2} (H^{(1)}_{c,i,j}) $$
Similarly, the second convolutional and pooling layers yield \( H^{(4)}_c \), and the fully connected layer outputs:
$$ Y_c = \text{ReLU}(W^{(5)}_c H^{(4)}_c + B^{(5)}_c) $$
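A matching sketch of the CNN branch follows. Channel widths, kernel sizes, and the adaptive pooling used to fix the fully connected input size are assumptions; the text above only fixes the conv-pool-conv-pool-FC layout.

```python
# Sketch of the CNN branch operating on the 2D time-series matrix.
import torch
import torch.nn as nn

class CNNBranch(nn.Module):
    def __init__(self, in_channels=1, out_dim=64):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=3, padding=1), nn.ReLU(),  # H_c^(1)
            nn.MaxPool2d(2),                                                  # H_c^(2)
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),           # H_c^(3)
            nn.MaxPool2d(2),                                                  # H_c^(4)
            nn.AdaptiveMaxPool2d((4, 4)),      # fixes the FC input size (our addition)
        )
        self.fc = nn.Linear(32 * 4 * 4, out_dim)                              # Y_c

    def forward(self, x_c):            # x_c: (batch, 1, H, W) time-series matrix
        return torch.relu(self.fc(self.features(x_c).flatten(1)))
```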
The joint model concatenates \( Y_g \) and \( Y_c \), passing them through a sigmoid-activated fully connected layer for fault classification:
$$ Y_h = \text{Sigmoid}(W_h [Y_g, Y_c] + B_h) $$
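The two branches can then be combined as sketched below; the feature dimension is a placeholder, and `num_classes = 6` corresponds to the five fault types plus normal metering data in our dataset.

```python
# Sketch of the joint head: concatenate branch outputs, sigmoid-activated FC layer.
import torch
import torch.nn as nn

class GCNCNN(nn.Module):
    def __init__(self, gcn_branch, cnn_branch, feat_dim=64, num_classes=6):
        super().__init__()
        self.gcn, self.cnn = gcn_branch, cnn_branch
        self.head = nn.Linear(2 * feat_dim, num_classes)

    def forward(self, A, X, x_c):
        y_g = self.gcn(A, X)                         # graph-branch features
        y_c = self.cnn(x_c)                          # time-series features
        return torch.sigmoid(self.head(torch.cat([y_g, y_c], dim=1)))   # Y_h
```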
To address class imbalance in EV charging station fault data, we use a focal loss function that emphasizes hard-to-classify samples. For a predicted probability \( y_h \), the loss is:
$$ L(y_h) = – (1 – y_h)^\gamma \log(y_h) $$
where \( \gamma \) is a focusing parameter.
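A direct PyTorch sketch of this loss is given below; the default \( \gamma = 2 \) is a common choice and an assumption here, and no class-weighting factor \( \alpha \) is included, matching the formula above.

```python
# Focal loss sketch matching the formula above.
import torch

def focal_loss(y_h, target, gamma=2.0, eps=1e-7):
    """y_h: predicted probabilities (batch, classes); target: one-hot, same shape."""
    p_t = (y_h * target).sum(dim=1).clamp(eps, 1.0)      # probability of the true class
    return (-(1.0 - p_t) ** gamma * torch.log(p_t)).mean()
```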
We evaluate our model on a real-world dataset from multiple EV charging stations, covering charging records from 2018 to 2022. The dataset includes various fault types, such as data acquisition anomalies, meter runaway, reverse meter operation, meter creep, and time abnormalities, along with normal metering data. Preprocessing ensures data quality, and the dataset is split into training, validation, and test sets in a 6:3:1 ratio. Ten-fold cross-validation is used for robust evaluation.
The training configuration for the GCN-CNN model includes a batch size of 64, 500 epochs, a learning rate of \( 1 \times 10^{-4} \), and a learning rate decay of 0.5 every 100 epochs. Experiments are conducted using PyTorch on a system with an Intel i7-13620H CPU and RTX 4050 GPU. Performance metrics include precision, recall, specificity, F1-score, and G-mean. For a class \( i \), precision and recall are defined as:
$$ \text{Precision}_i = \frac{TP_i}{TP_i + FP_i}, \quad \text{Recall}_i = \frac{TP_i}{TP_i + FN_i} $$
Specificity is:
$$ \text{Specificity}_i = \frac{TN_i}{TN_i + FP_i} $$
The F1-score for class \( i \) is:
$$ F1_i = \frac{2 \cdot \text{Precision}_i \cdot \text{Recall}_i}{\text{Precision}_i + \text{Recall}_i} $$
Macro-averaged F1-score and G-mean are computed as:
$$ F1_{\text{macro}} = \frac{\sum_{i=1}^{n} F1_i}{n}, \quad G\text{-mean} = \frac{\sum_{i=1}^{n} \sqrt{\text{Recall}_i \cdot \text{Specificity}_i}}{n} $$
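These metrics can be computed from a confusion matrix, as in the following plain-NumPy sketch; labels are integer class indices.

```python
# Per-class precision, recall, specificity, F1, and their macro averages.
import numpy as np

def macro_f1_gmean(y_true, y_pred, num_classes):
    cm = np.zeros((num_classes, num_classes), dtype=float)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    f1s, gms = [], []
    for i in range(num_classes):
        tp = cm[i, i]
        fp = cm[:, i].sum() - tp
        fn = cm[i, :].sum() - tp
        tn = cm.sum() - tp - fp - fn
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        f1s.append(2 * precision * recall / (precision + recall) if precision + recall else 0.0)
        gms.append(np.sqrt(recall * specificity))
    return np.mean(f1s), np.mean(gms)   # F1_macro, G-mean
```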
We conduct ablation studies to assess the impact of graph construction methods and feature extraction modules. Three graph construction approaches are compared: our method (connecting start and end nodes based on charging behavior), method 1 (no graph features), and method 2 (connecting only adjacent nodes). The results are summarized in the following table:
| Graph Construction Method | \( F1_{\text{macro}} \) | \( G\text{-mean} \) |
|---|---|---|
| Our Method | 0.844 | 0.844 |
| Method 1 | 0.767 | 0.771 |
| Method 2 | 0.817 | 0.818 |
Our graph construction method improves \( F1_{\text{macro}} \) by 10.04% and \( G\text{-mean} \) by 9.47% compared to method 1, and by 3.20% and 3.08% compared to method 2, demonstrating the importance of modeling temporal correlations in EV charging station data.
Next, we compare different feature extraction modules in the joint model. The CNN-based feature extractor is evaluated against no feature extractor, MLP, and Attention mechanisms. The performance is shown below:
| Feature Extractor Type | \( F1_{\text{macro}} \) | \( G\text{-mean} \) |
|---|---|---|
| CNN (Our) | 0.844 | 0.844 |
| No Extractor | 0.778 | 0.781 |
| MLP | 0.817 | 0.818 |
| Attention | 0.822 | 0.823 |
The CNN extractor outperforms others, with improvements of 8.48% in \( F1_{\text{macro}} \) and 7.46% in \( G\text{-mean} \) over no extractor, and smaller gains over MLP and Attention. This validates CNN’s efficacy in capturing local patterns in EV charging station time-series data.
We further compare our GCN-CNN model with baseline models, including standalone CNN, LSTM, and Transformer. The results on the validation set are as follows:
| Model | \( F1_{\text{macro}} \) | \( G\text{-mean} \) |
|---|---|---|
| GCN-CNN (Our) | 0.844 | 0.844 |
| CNN | 0.773 | 0.776 |
| LSTM | 0.788 | 0.791 |
| Transformer | 0.812 | 0.813 |
Our model achieves the highest scores, with \( F1_{\text{macro}} \) and \( G\text{-mean} \) both at 0.844, representing average improvements of 6.28% and 6.04% over the other models. The Transformer model performs well due to its attention mechanism, but our GCN-based approach better captures the non-Euclidean relationships in EV charging station data.
In conclusion, we propose a novel fault prediction method for EV charging stations based on graph convolutional networks. By analyzing user charging behavior and constructing temporal graphs, our GCN-CNN model effectively identifies metering faults with high accuracy. Experimental results demonstrate significant performance gains over existing methods, highlighting the potential of graph-based deep learning for EV charging station maintenance. Future work will focus on developing lightweight models for reduced training costs and adapting to evolving charging technologies in EV charging stations.
