Federated Learning-Based Recommendation System for Electric Vehicle Charging and Refueling Stations in China

In recent years, the rapid adoption of electric vehicles (EVs) in China has transformed the transportation landscape, driven by government policies and growing environmental awareness. Among these, hybrid electric vehicles (HEVs) have gained prominence due to their ability to combine the benefits of traditional internal combustion engines with electric propulsion, offering reduced emissions and enhanced fuel efficiency. However, HEV owners face challenges such as limited charging infrastructure, uneven utilization of stations, and long waiting times, which hinder the overall user experience. To address these issues, we propose a novel recommendation algorithm that leverages vertical federated learning (VFL) to provide personalized station suggestions while ensuring user privacy. Our approach integrates blockchain technology with cloud computing to create a secure, decentralized network for model training and data aggregation, specifically tailored for the China EV market. By focusing on privacy-preserving techniques, we aim to overcome the reluctance of data owners to share sensitive information, such as vehicle locations and charging histories, which is critical in the context of China’s evolving data protection regulations.

The proliferation of electric vehicles in China has led to a surge in demand for efficient charging and refueling solutions. However, existing recommendation systems often rely on centralized data processing, which poses significant privacy risks. In contrast, our VFL-based model enables collaborative learning across multiple parties—including HEVs, charging stations (CSs), and gas stations (GSs)—without exposing raw data. This is particularly relevant in China, where the electric vehicle ecosystem is expanding rapidly, and users are increasingly concerned about data security. Our system employs encrypted entity alignment and local model training to compute gradients and losses securely, while a blockchain-based CloudletChain ensures the integrity of aggregated parameters. Through extensive simulations using real-world data from a Chinese city, we demonstrate that our algorithm not only improves recommendation accuracy but also reduces waiting times and costs, thereby enhancing the overall efficiency of station utilization. This work underscores the potential of federated learning in advancing the China EV industry while adhering to strict privacy standards.

In the following sections, we delve into the methodology, experimental setup, and results of our proposed system. We begin by reviewing related work in EV recommendation algorithms and privacy-preserving techniques, highlighting the limitations of existing approaches. Next, we detail our VFL framework, including model parameters, feature selection, and the integration with CloudletChain. We then present a comprehensive evaluation of our algorithm’s performance, comparing it with baseline methods and analyzing key metrics such as execution time, waiting time, and communication latency. Finally, we discuss the implications of our findings and outline future research directions, emphasizing the role of federated learning in shaping the next generation of electric vehicle services in China.

Related Work

Previous studies on electric vehicle station recommendation have primarily focused on optimizing routes and reducing costs, but often neglect privacy concerns. For instance, collaborative filtering algorithms based on historical charging behavior have been proposed to suggest charging piles, but they do not account for vehicle-specific characteristics or real-time station availability. Other approaches integrate EVs with power grids and road networks to alleviate congestion and grid stress, yet they fail to provide personalized recommendations that protect user data. In China, where the electric vehicle market is booming, there is a growing need for solutions that balance efficiency with privacy. Some researchers have explored edge computing and blockchain for secure data exchange, but these methods are not widely adopted in the Chinese context due to infrastructure limitations. Our work builds on these efforts by incorporating VFL, which allows data to remain localized while enabling model training across distributed entities. This aligns with the unique demands of the China EV ecosystem, where data sovereignty and user trust are paramount.

Methodology

Our recommendation system is designed to serve HEVs by suggesting optimal CSs or GSs based on factors such as distance, cost, and waiting time. The core of our approach lies in VFL, which facilitates model training without data sharing. We define the participants as a set $k = \{1, 2, \dots, K\}$, where each party holds a local dataset $D_k$. For HEVs, the data samples are represented as $\{x_{\text{hev}}^i\}_{i \in D_{\text{hev}}}$, while for CSs and GSs, they include labels: $\{x_{\text{cs}}^i, y_i\}_{i \in D_{\text{cs}}}$ and $\{x_{\text{gs}}^i, z_i\}_{i \in D_{\text{gs}}}$. The goal is to learn model parameters $\Theta \in \mathbb{R}^d$, where $d$ denotes the feature dimension, and the performance of our VFL model $M_{\text{vfl}}$ should closely approximate that of a centralized model $M_{\text{cent}}$, as expressed by:

$$|\sigma_{\text{vfl}} - \sigma_{\text{cent}}| < \sigma$$

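Here, $\sigma_{\text{vfl}}$ and $\sigma_{\text{cent}}$ denote the performance of $M_{\text{vfl}}$ and $M_{\text{cent}}$, respectively, and $\sigma$ is the error tolerance, which bounds how much accuracy the decentralized approach may give up relative to centralized training.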

Feature Selection

To enhance the relevance of our recommendations, we identify key features for each participant, as summarized in Table 1. These features capture essential aspects of the electric vehicle ecosystem in China, such as traffic conditions, station capacity, and user preferences. For example, HEV features include battery level and average speed, while CS and GS features encompass location-based attributes and service fees. By selecting these attributes, we reduce computational overhead and improve model performance, which is crucial for real-time applications in the dynamic China EV environment.

Table 1: Feature Vectors for Participants
Participant	Features
HEV	Vehicle location, average speed, current weather, surrounding infrastructure, traffic congestion, battery capacity, start/stop charging times, vehicle state (charging/refueling/driving)
CS	Latitude and longitude, number of charging piles, average charging capacity, average charging cost, additional parking fees, service fees, charging start time, duration, end time
GS	Latitude and longitude, number of fuel pumps, average refueling cost, average refueling capacity, waiting time, refueling duration
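The split in Table 1 is what makes the setting vertical: each participant holds different feature columns for the same entities. The following minimal sketch illustrates that layout under stated assumptions. The store names, sample IDs, chosen columns, and all numeric values are hypothetical, and the shared IDs are intersected in plaintext here purely for illustration, whereas the system described in this paper performs that step under encryption.

```python
from typing import Dict, List

FeatureStore = Dict[str, List[float]]

# Hypothetical, made-up values. Each party keeps only its own columns.
HEV_STORE: FeatureStore = {
    "e1": [42.0, 0.80],  # [average speed (km/h), battery level]
    "e2": [35.5, 0.25],
    "e3": [50.0, 0.10],
}
CS_STORE: FeatureStore = {
    "e2": [12.0, 1.20],  # [number of charging piles, average charging cost]
    "e3": [6.0, 0.95],
    "e4": [20.0, 1.10],
}
GS_STORE: FeatureStore = {
    "e1": [8.0, 7.90],   # [number of fuel pumps, average refueling cost]
    "e2": [4.0, 8.10],
    "e3": [6.0, 7.75],
}


def common_ids(*stores: FeatureStore) -> List[str]:
    """Return the sorted sample IDs that every party knows about."""
    if not stores:
        return []
    shared = set(stores[0])
    for store in stores[1:]:
        shared &= set(store)
    return sorted(shared)


def party_view(store: FeatureStore, ids: List[str]) -> List[List[float]]:
    """Rows of one party's features, ordered by the shared ID list."""
    return [store[sample_id] for sample_id in ids]


shared = common_ids(HEV_STORE, CS_STORE, GS_STORE)
hev_rows = party_view(HEV_STORE, shared)
cs_rows = party_view(CS_STORE, shared)
gs_rows = party_view(GS_STORE, shared)
```

Each party only ever reads its own store. The three row lists are aligned by position, so row i on every side refers to the same entity without any feature values being exchanged.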

Vertical Federated Learning Training

The VFL training process consists of two main phases: encrypted entity alignment and local model training. In the first phase, we use cryptographic techniques to match sample IDs across HEVs, CSs, and GSs without revealing sensitive information. This results in an intersected set $I$ of common entities, which form the basis for collaborative learning. The second phase involves iterative local computations and secure aggregation, as outlined below:

Initialization: A cloud aggregator $c_p \in C$ sends initial parameters $\Theta_{\text{hev}}$, $\Theta_{\text{cs}}$, and $\Theta_{\text{gs}}$ to all participants.
Local Computation: Each party computes intermediate values, such as $u_{\text{hev}}^i = \Theta_{\text{hev}} x_{\text{hev}}^i$, and encrypts them using homomorphic encryption.
Gradient and Loss Calculation: Participants compute encrypted gradients and losses, applying masks for added security. For example, the overall loss function is defined as:

$$L = \sum_i \left( u_{\text{hev}}^i + u_{\text{cs}}^i - y_i + u_{\text{gs}}^i - z_i \right)^2 + \lambda \left( g(\Theta_{\text{hev}}) + g(\Theta_{\text{cs}}) + g(\Theta_{\text{gs}}) \right)$$

where $\lambda$ is a regularization parameter and $g(\cdot)$ is a regularization function. The loss is decomposed into components like $L_{\text{hev-cs}} = 2\sum_i (u_{\text{hev}}^i (u_{\text{cs}}^i - y_i))$ to facilitate distributed computation.

Aggregation and Update: Encrypted values are sent to the cloud aggregator for aggregation, and updated parameters are returned to participants for model refinement.
Recommendation Generation: After training, a sorted list $L_f$ of top-$N$ CSs and GSs is generated based on availability scores, which range from 0 (idle) to 2 (in use).

This process ensures that raw data never leaves the local devices, aligning with privacy requirements for electric vehicle users in China.
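To make the loss decomposition concrete, the sketch below expands the squared term of $L$ into three single-party terms and three cross terms, of which $L_{\text{hev-cs}}$ is one, and checks that they sum to the joint value. It is an illustration under stated assumptions, not the implementation used in this work: the function names and the sample values are hypothetical, the regularization term is omitted because $g(\cdot)$ is not specified, and everything is computed in plaintext, whereas the cross terms would be evaluated on homomorphically encrypted values in the actual protocol.

```python
from typing import Dict, Sequence


def joint_loss(u_hev: Sequence[float], u_cs: Sequence[float],
               u_gs: Sequence[float], y: Sequence[float],
               z: Sequence[float]) -> float:
    """Data-fitting part of L: sum_i (u_hev + u_cs - y + u_gs - z)^2."""
    return sum((a + b - yi + c - zi) ** 2
               for a, b, c, yi, zi in zip(u_hev, u_cs, u_gs, y, z))


def decomposed_loss(u_hev: Sequence[float], u_cs: Sequence[float],
                    u_gs: Sequence[float], y: Sequence[float],
                    z: Sequence[float]) -> Dict[str, float]:
    """Split the same quantity into per-party and pairwise components."""
    r_cs = [b - yi for b, yi in zip(u_cs, y)]  # residual known to the CS side
    r_gs = [c - zi for c, zi in zip(u_gs, z)]  # residual known to the GS side
    return {
        "hev": sum(a * a for a in u_hev),
        "cs": sum(r * r for r in r_cs),
        "gs": sum(r * r for r in r_gs),
        "hev_cs": 2 * sum(a * r for a, r in zip(u_hev, r_cs)),
        "hev_gs": 2 * sum(a * r for a, r in zip(u_hev, r_gs)),
        "cs_gs": 2 * sum(p * q for p, q in zip(r_cs, r_gs)),
    }


# Made-up intermediate values for two aligned samples.
U_HEV, U_CS, U_GS = [1.0, 2.0], [0.5, 1.5], [2.0, 0.0]
Y, Z = [1.0, 1.0], [1.0, 1.0]

parts = decomposed_loss(U_HEV, U_CS, U_GS, Y, Z)
total = sum(parts.values())
```

The three single-party terms need only local data. Only the cross terms require one party's intermediate values to be combined with another's, which is where encryption and masking come in.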

Integration with CloudletChain

To support real-time recommendations, we leverage a decentralized cloud network called CloudletChain, which combines cloudlets with blockchain technology. Cloudlets are edge computing nodes that process parameters locally, reducing latency for electric vehicle applications. The blockchain component ensures security by maintaining a tamper-proof ledger of authorized nodes. The CloudletChain operation involves four steps:

Node Registration: A new cloud node $c_{\text{new}}$ registers by generating a transaction $TX$ with its public and private keys.
Verification: A randomly selected miner node $c_m$ validates $TX$ and includes it in a block $b_r$.
Block Mining: The miner computes a hash for $b_r$ using a Merkle tree root, as shown in:

$$H(TX_n + TX_{n-1}) = H(\text{hash}(TX_n)) + H(\text{hash}(TX_{n-1}))$$

This iterative process continues until the hash meets the target difficulty.

Block Propagation: The new block $b_{\text{new}}$ is broadcast to the network, updating all ledgers and ensuring consensus.

By integrating CloudletChain, our system achieves low communication delays and high reliability, which is essential for the scalable growth of the China EV market.
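The block mining step can be illustrated with a conventional Merkle root followed by a nonce search. This is a minimal sketch under stated assumptions rather than the CloudletChain implementation: it uses SHA-256, reads the "+" in the hash equation as byte concatenation, duplicates the last node on odd-sized levels, and models the target difficulty as a number of leading zero bits. None of these choices are specified in this paper, and the function names are hypothetical.

```python
import hashlib
from typing import List


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(transactions: List[bytes]) -> bytes:
    """Hash the transactions pairwise, level by level, down to one root."""
    if not transactions:
        raise ValueError("a block needs at least one transaction")
    level = [sha256(tx) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumption: duplicate the odd node
        level = [sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


def meets_target(root: bytes, nonce: int, difficulty_bits: int) -> bool:
    """True if hash(root || nonce) has at least difficulty_bits leading zero bits."""
    digest = sha256(root + nonce.to_bytes(8, "big"))
    return int.from_bytes(digest, "big") < (1 << (256 - difficulty_bits))


def mine(root: bytes, difficulty_bits: int) -> int:
    """Try nonces in order until the hash meets the target difficulty."""
    nonce = 0
    while not meets_target(root, nonce, difficulty_bits):
        nonce += 1
    return nonce


# Hypothetical registration transactions for three cloud nodes.
TXS = [b"register:c1", b"register:c2", b"register:c3"]
ROOT = merkle_root(TXS)
NONCE = mine(ROOT, difficulty_bits=8)  # low difficulty keeps the demo fast
```

Raising `difficulty_bits` by one doubles the expected number of attempts. This offers one way to reason about why mining time becomes more variable as the network configuration changes.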

Experimental Setup

We evaluated our algorithm using data collected from a Chinese city between September and December 2023. The dataset included information from 20 CSs, 20 GSs, and 50 HEVs, with features such as location, cost, and historical usage. We deployed 10 cloud nodes across different geographic areas to simulate a realistic environment. Our experiments compared our VFL-based approach with two baseline methods: Real-Time Recommendation (RT) and Earliest Finish Time (EFT). Key performance metrics included execution time, waiting time, communication latency, and blockchain mining time. All simulations were conducted in a controlled setting to ensure reproducibility, with multiple iterations to account for variability.

Results and Discussion

Our results demonstrate the effectiveness of the proposed algorithm in enhancing recommendation quality while preserving privacy. Table 2 compares the performance of different algorithms in terms of waiting probability, cost, time utilization, and hourly revenue. As shown, our method matches EFT's zero waiting probability, whereas RT leaves a waiting probability of 0.122, and it achieves the lowest total and parking costs together with the highest time utilization and hourly revenue of the three, highlighting its advantage in real-world electric vehicle scenarios in China.

Table 2: Performance Comparison of Recommendation Algorithms
Algorithm	Waiting Probability	Total Cost (CNY)	Parking Cost (CNY)	Time Utilization	Hourly Revenue (CNY)
RT	0.122	25.231	3.579	0.148	7916
EFT	0	20.948	1.754	0.193	9021
Our Algorithm	0	19.426	0.126	0.202	9062
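The relative gaps implied by Table 2 can be derived directly from its entries. The short sketch below does that arithmetic. The dictionary layout and helper names are illustrative assumptions, while the numbers are copied from the table.

```python
from typing import Dict

# Values copied from Table 2 (costs and revenue in CNY).
TABLE_2: Dict[str, Dict[str, float]] = {
    "RT": {"waiting_probability": 0.122, "total_cost": 25.231,
           "parking_cost": 3.579, "time_utilization": 0.148,
           "hourly_revenue": 7916},
    "EFT": {"waiting_probability": 0.0, "total_cost": 20.948,
            "parking_cost": 1.754, "time_utilization": 0.193,
            "hourly_revenue": 9021},
    "Ours": {"waiting_probability": 0.0, "total_cost": 19.426,
             "parking_cost": 0.126, "time_utilization": 0.202,
             "hourly_revenue": 9062},
}


def relative_reduction(baseline: float, ours: float) -> float:
    """Fraction by which `ours` is lower than `baseline`."""
    return (baseline - ours) / baseline


def relative_gain(baseline: float, ours: float) -> float:
    """Fraction by which `ours` is higher than `baseline`."""
    return (ours - baseline) / baseline


ours = TABLE_2["Ours"]
cost_cut_vs_rt = relative_reduction(TABLE_2["RT"]["total_cost"], ours["total_cost"])
cost_cut_vs_eft = relative_reduction(TABLE_2["EFT"]["total_cost"], ours["total_cost"])
revenue_gain_vs_rt = relative_gain(TABLE_2["RT"]["hourly_revenue"], ours["hourly_revenue"])

for name, value in [("total cost vs RT", cost_cut_vs_rt),
                    ("total cost vs EFT", cost_cut_vs_eft),
                    ("hourly revenue vs RT", revenue_gain_vs_rt)]:
    print(f"{name}: {value:.1%}")
```

On these figures the total cost is roughly 23% lower than RT and roughly 7% lower than EFT, so the margin over EFT is considerably narrower than the margin over RT.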

Figure 1 illustrates the execution time as the number of participants increases. Initially, with fewer CSs, GSs, and HEVs, the system processes data quickly, but time grows linearly with scale. This trend underscores the scalability of our VFL approach, which efficiently handles larger datasets common in the expanding China EV sector.

Waiting time analysis reveals that our algorithm minimizes delays by avoiding congested stations. For instance, the difference $\rho$ between previous and current HEV waiting times remains stable, with lower values for vehicles like HEV8 and HEV16, which were directed to less busy stations. This optimization contributes to higher station profitability and reduced resource idle time, addressing key challenges in electric vehicle infrastructure management.

Communication latency is another critical factor, especially for real-time recommendations. As depicted in Figure 2, a centralized network requires approximately 9 seconds to generate recommendations, whereas our decentralized system with 10 cloud nodes reduces this to around 3 seconds. This improvement is attributed to the distributed nature of CloudletChain, which minimizes delays by processing data closer to the source—a significant advantage for dynamic electric vehicle applications in China.

Finally, we assessed the time required to mine new blocks in the blockchain, as shown in Figure 3. As the number of cloud nodes increases from 6 to 10, the mining time variance grows, reflecting the computational overhead of consensus algorithms. However, this trade-off is acceptable given the enhanced security and transparency, which are crucial for building trust in electric vehicle networks.

Conclusion

In this paper, we presented a privacy-preserving recommendation system for electric vehicle charging and refueling stations using vertical federated learning. Our approach addresses the unique needs of the China EV market by combining VFL with blockchain-based cloud networks, ensuring data privacy while improving recommendation accuracy and efficiency. Experimental results confirm that our algorithm reduces waiting times, costs, and communication delays compared to existing methods. Future work will extend this framework to public transportation systems, such as electric buses, and explore the integration of additional data sources to further optimize station utilization. By advancing federated learning techniques, we aim to support the sustainable growth of electric mobility in China and beyond.