Stackelberg Game-based Pricing and Dispatch Strategy for Battery Electric Vehicle Aggregators in Multi-Level Electricity Markets

The rapid acceleration of “new electrification” on the demand side, coupled with the increasing frequency of extreme weather events, has precipitated dual challenges for power systems: a heightened risk of short-term supply-demand imbalances and insufficient flexibility resource margins. Effectively harnessing the dual value of demand-side flexibility—for both market operations and grid stability—has become a critical research frontier. Among these resources, the burgeoning fleet of battery electric vehicles presents a significant, yet largely untapped, potential. Acting as distributed mobile energy storage units, battery electric vehicles can provide vital grid services when aggregated and coordinated. This paper investigates the optimal operational strategy for a Battery Electric Vehicle Aggregator (BEVA) participating in sequential multi-stage markets, considering the diverse consumption preferences of EV users.

The core of the strategy lies in a hierarchical decision-making framework modeled as a Stackelberg game. The BEVA acts as the leader, determining charging/discharging subsidy prices for battery electric vehicles to maximize its profit from multiple markets. The battery electric vehicle users act as followers, adjusting their charging profiles in response to these prices to minimize their personal cost, which includes electricity charges, received subsidies, and a quantified discomfort from deviating from their preferred charging schedule. This bi-level interaction captures the inherent conflict and interdependence between the aggregator’s profit motive and the users’ economic and comfort objectives.

1. Framework for BEVA Operations in Multi-Stage Markets

The BEVA operates as an intermediary between the wholesale electricity market and a large fleet of individual battery electric vehicles. Its operational framework spans three sequential markets: the Day-Ahead (DA) energy market, the Frequency Regulation (FR) market, and a Demand Response (DR) market designed for peak shaving during critical periods (e.g., extreme weather).

The BEVA’s decision process unfolds in stages. First, it must procure enough energy in the DA market to satisfy the forecasted aggregate charging needs of its customer battery electric vehicles. Subsequently, based on received signals for frequency regulation and peak demand response, the BEVA adjusts the real-time charging/discharging power of the battery electric vehicle fleet. Deviations from the DA schedule are settled in the real-time balancing market. The BEVA’s revenue streams thus include payments from the FR and DR markets, while its costs include DA energy purchases, real-time settlement costs, and the subsidies paid to battery electric vehicle users for providing flexibility services.

The interaction is fundamentally a pricing problem. The BEVA announces time-varying subsidy prices for upward regulation (discharging), downward regulation (increased charging), and demand response (discharging during peaks). A higher subsidy increases user participation and available flexible capacity for the BEVA but also reduces its net profit. Conversely, a lower subsidy preserves profit margin but may yield insufficient responsive capacity. The battery electric vehicle users, considering these subsidies and their own preferences, decide on their final charging plan. This leader-follower structure is well suited to analysis as a Stackelberg game.

2. Quantifying the Dispatchable Potential of a Battery Electric Vehicle Fleet

A crucial first step for the BEVA is to assess the aggregate flexibility—the dispatchable potential—of its managed battery electric vehicle fleet. This potential is constrained by each vehicle’s driving patterns, battery specifications, and charging infrastructure.

The operational state of a single battery electric vehicle $i$ at time $t$ can be described by its State of Charge (SOC):
$$S_i(t+1) = S_i(t) + \frac{\eta_{c} P_{i,c}^{max}(t) \Delta t}{E_i} \max\{x_i^{EV}(t), 0\} - \frac{P_{i,d}^{max}(t) \Delta t}{\eta_{d} E_i} \max\{-x_i^{EV}(t), 0\}$$
where $S_i(t)$ is the SOC, $E_i$ is the battery capacity, $P_{i,c}^{max}(t)$ and $P_{i,d}^{max}(t)$ are the maximum available charging and discharging power, $\eta_c$ and $\eta_d$ are charging and discharging efficiencies, $\Delta t$ is the time interval, and $x_i^{EV}(t) \in \{-1, 0, 1\}$ indicates discharging, idle, or charging mode. The $\max$ terms ensure that only the charging term is active when $x_i^{EV}(t) = 1$ and only the discharging term when $x_i^{EV}(t) = -1$.
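As a minimal sketch of this SOC update, assuming that the vehicle charges or discharges at its maximum available power whenever it is not idle, one step can be written as follows. The function name `soc_step` and the default efficiency values are illustrative assumptions, not taken from the paper.

```python
def soc_step(soc, mode, p_c_max, p_d_max, capacity, eta_c=0.95, eta_d=0.95, dt=1.0):
    """Advance the state of charge by one interval.

    mode: +1 charging, 0 idle, -1 discharging (the x_i^EV(t) indicator).
    p_c_max, p_d_max: maximum charging/discharging power in kW.
    capacity: battery capacity E_i in kWh; dt: interval length in hours.
    Efficiency defaults are illustrative assumptions.
    """
    if mode not in (-1, 0, 1):
        raise ValueError("mode must be -1, 0, or 1")
    if mode == 1:
        # Charging: only a fraction eta_c of the drawn energy is stored.
        delta = eta_c * p_c_max * dt / capacity
    elif mode == -1:
        # Discharging: the battery loses more than it delivers (divide by eta_d).
        delta = -p_d_max * dt / (eta_d * capacity)
    else:
        delta = 0.0
    return soc + delta
```

The sketch does not clip the result to $[S_{min}, 1]$; enforcing those limits is left to the boundary construction.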

Given a battery electric vehicle's plug-in time $t_{start}$, plug-out time $t_{end}$, initial SOC $S_0$, minimum allowable SOC $S_{min}$, and target departure SOC $S_{expect}$, its feasible energy and power boundaries can be derived. The upper energy boundary $S_i^+(t)$ assumes immediate charging upon arrival until full, then idling. The lower energy boundary $S_i^-(t)$ assumes immediate discharging to $S_{min}$, then idling until necessary charging begins to reach $S_{expect}$ by departure.
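The two boundaries can be sketched for a single vehicle on a discrete time grid. This is an illustration under stated assumptions: constant charger ratings while plugged in, "full" meaning an SOC of 1.0, and a target $S_{expect}$ that is reachable within the plug-in window. The function name `energy_bounds` and its parameters are hypothetical.

```python
def energy_bounds(s0, s_min, s_expect, n_steps, p_c, p_d, capacity,
                  eta_c=0.95, eta_d=0.95, dt=1.0, s_max=1.0):
    """Return (upper, lower) SOC boundaries at steps 0..n_steps after plug-in."""
    up_rate = eta_c * p_c * dt / capacity        # SOC gained per step when charging
    down_rate = p_d * dt / (eta_d * capacity)    # SOC lost per step when discharging
    upper, lower = [], []
    for k in range(n_steps + 1):
        # Upper boundary: charge immediately at full power, then idle once full.
        upper.append(min(s_max, s0 + k * up_rate))
        # Lower boundary, part 1: discharge immediately down to s_min, then idle.
        discharge_path = max(s_min, s0 - k * down_rate)
        # Lower boundary, part 2: the latest charging path that still reaches
        # s_expect at departure (step n_steps).
        latest_charge_path = s_expect - (n_steps - k) * up_rate
        lower.append(max(discharge_path, latest_charge_path))
    return upper, lower
```

Any feasible SOC trajectory for the vehicle then lies between `lower[k]` and `upper[k]` at every step `k`.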

For a large fleet of $N$ battery electric vehicles, the aggregate dispatchable power and energy boundaries are obtained by summing individual boundaries:
$$P^+(t) = \sum_{i=1}^{N} P_{i,c}^{max}(t), \quad P^-(t) = \sum_{i=1}^{N} P_{i,d}^{max}(t)$$
$$E^+(t) = \sum_{i=1}^{N} S_i^+(t) E_i, \quad E^-(t) = \sum_{i=1}^{N} S_i^-(t) E_i$$
Monte Carlo simulation is typically employed to account for the stochastic nature of battery electric vehicle usage patterns (arrival/departure times, initial SOC). Distributions for these parameters are used to generate a large number of scenarios, and the aggregated boundaries are calculated for each scenario to build a probabilistic profile of the fleet’s flexibility.

Table 1: Assumed Parameters for Battery Electric Vehicle Driving Patterns
User Type	Plug-in Time	Plug-out Time	Initial SOC
Daytime Commuter	N(08:30, 1.5²)	N(17:30, 1.5²)	N(0.4, 0.15²)
Nighttime Commuter	N(18:30, 1.5²)	N(07:30, 1.5²)	N(0.4, 0.15²)
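A minimal Monte Carlo sketch based on the distributions in Table 1 is shown here. It assumes that times are expressed in decimal hours, that the standard deviation of 1.5 is in hours, that the initial SOC is clipped to $[0, 1]$, and that a vehicle contributes its charger rating to $P^+(t)$ only while plugged in. All function names are illustrative.

```python
import random

# (mean, standard deviation) pairs from Table 1; times in decimal hours.
PATTERNS = {
    "daytime": {"plug_in": (8.5, 1.5), "plug_out": (17.5, 1.5), "soc0": (0.4, 0.15)},
    "nighttime": {"plug_in": (18.5, 1.5), "plug_out": (7.5, 1.5), "soc0": (0.4, 0.15)},
}

def sample_vehicle(kind, rng):
    """Draw (plug-in hour, plug-out hour, initial SOC) for one vehicle."""
    p = PATTERNS[kind]
    t_in = rng.gauss(*p["plug_in"]) % 24.0
    t_out = rng.gauss(*p["plug_out"]) % 24.0
    soc0 = min(1.0, max(0.0, rng.gauss(*p["soc0"])))  # clip to a valid SOC
    return t_in, t_out, soc0

def is_plugged(t_in, t_out, hour):
    """True if the vehicle is connected at `hour`, allowing stays across midnight."""
    if t_in <= t_out:
        return t_in <= hour < t_out
    return hour >= t_in or hour < t_out

def aggregate_charging_power(n_per_type, p_c_max, seed=0):
    """One scenario of the hourly aggregate upper power boundary P^+(t)."""
    rng = random.Random(seed)
    fleet = [sample_vehicle(kind, rng) for kind in PATTERNS for _ in range(n_per_type)]
    return [
        sum(p_c_max for t_in, t_out, _ in fleet if is_plugged(t_in, t_out, hour))
        for hour in range(24)
    ]
```

Repeating `aggregate_charging_power` over many seeds gives an empirical distribution of the boundary at each hour.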

3. Stackelberg Game Model Formulation

3.1 Upper-Level: BEVA Profit Maximization

The BEVA’s objective is to maximize its total profit over the scheduling horizon $T$ (e.g., 24 hours). The profit $F^{BEVA}$ is composed of revenues minus costs:
$$\max F^{BEVA} = F^{FR} + F^{DR} + F^{RT} + F^{Chr} - F^{DA} - F^{Sub}$$
where:

$F^{DA} = \sum_{t=1}^{T} \lambda^{DA}(t) P^{chr}(t) \Delta t$: Cost of purchasing energy $P^{chr}(t)$ in the DA market at price $\lambda^{DA}(t)$.
$F^{RT} = \sum_{t=1}^{T} \lambda^{RT}(t) (P^{up}(t) + P^{dr}(t) - P^{dn}(t)) \Delta t$: Revenue/cost from settling deviations in the Real-Time (RT) market at price $\lambda^{RT}(t)$. $P^{up}(t)$, $P^{dn}(t)$, and $P^{dr}(t)$ are power committed to up-regulation, down-regulation, and demand response, respectively.
$F^{FR} = \sum_{t=1}^{T} [\lambda^{rc}(t) P^{rc}(t) + \lambda^{rm}(t) P^{rm}(t)] \Delta t$: Revenue from the Frequency Regulation market, including capacity payment ($\lambda^{rc}(t)$) and mileage payment ($\lambda^{rm}(t)$).
$F^{DR} = \sum_{t=1}^{T} \lambda^{dr}(t) P^{dr}(t) \Delta t$: Revenue from the Demand Response market at price $\lambda^{dr}(t)$.
$F^{Chr} = \sum_{t=1}^{T} \lambda^{cs}(t) P(t) \Delta t$: Revenue from selling electricity to battery electric vehicle users at retail price $\lambda^{cs}(t)$. $P(t)$ is the actual net charging power of the fleet.
$F^{Sub} = \sum_{t=1}^{T} [\lambda^{dc\_dr}(t)P^{dr}(t) + \lambda^{dc\_up}(t)P^{up}(t) - \lambda^{cc}(t)P^{dn}(t)] \Delta t$: Total subsidy cost paid to users. $\lambda^{dc\_dr}(t)$, $\lambda^{dc\_up}(t)$, and $\lambda^{cc}(t)$ are the subsidy prices for DR, up-regulation, and down-regulation, respectively. These are the BEVA’s key decision variables.
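The profit components listed above can be evaluated for given price and power series with a short routine. The sketch treats $F^{Chr}$ as revenue, following its description as income from selling electricity to users, and takes $P^{rc}(t)$ and $P^{rm}(t)$ as given inputs because their relation to the other power variables is not specified here. The function name and dictionary keys are illustrative assumptions.

```python
def beva_profit(price, power, dt=1.0):
    """Return (profit, components) for one scheduling horizon.

    price and power are dicts of equal-length lists indexed by time step.
    Keys are illustrative: price uses DA, RT, rc, rm, dr, cs, dc_dr, dc_up, cc;
    power uses chr, up, dn, dr, rc, rm, P.
    """
    def weighted(price_key, series):
        # Sum of price(t) * power(t) * dt over the horizon.
        return sum(lam * p for lam, p in zip(price[price_key], series)) * dt

    # Net deviation settled in the real-time market: P_up + P_dr - P_dn.
    net_rt = [u + d - n for u, d, n in zip(power["up"], power["dr"], power["dn"])]
    parts = {
        "F_DA": weighted("DA", power["chr"]),
        "F_RT": weighted("RT", net_rt),
        "F_FR": weighted("rc", power["rc"]) + weighted("rm", power["rm"]),
        "F_DR": weighted("dr", power["dr"]),
        "F_Chr": weighted("cs", power["P"]),
        "F_Sub": (weighted("dc_dr", power["dr"]) + weighted("dc_up", power["up"])
                  - weighted("cc", power["dn"])),
    }
    # Assumption: F_Chr enters as revenue; F_DA and F_Sub enter as costs.
    profit = (parts["F_FR"] + parts["F_DR"] + parts["F_RT"] + parts["F_Chr"]
              - parts["F_DA"] - parts["F_Sub"])
    return profit, parts
```

Returning the components alongside the total makes it easy to see which market drives the result in a given scenario.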

The BEVA’s constraints include respecting the aggregate fleet power and energy boundaries derived in Section 2, as well as bounds on the subsidy prices it can set:
$$\lambda_{cc}^{min} \le \lambda^{cc}(t) \le \lambda_{cc}^{max}, \quad \frac{1}{T}\sum_{t=1}^{T} \lambda^{cc}(t) = \lambda_{cc}^{avg}$$
and similar constraints for $\lambda^{dc\_up}(t)$ and $\lambda^{dc\_dr}(t)$. The average price constraints prevent the BEVA from setting extreme prices and ensure a degree of fairness.
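A candidate subsidy profile can be checked against the bound and average constraints with a few lines. This is a sketch; the function name and tolerance are illustrative assumptions.

```python
def price_profile_ok(prices, lo, hi, avg, tol=1e-9):
    """Check lo <= price(t) <= hi for all t and mean(price) == avg (within tol)."""
    within_bounds = all(lo - tol <= p <= hi + tol for p in prices)
    mean_matches = abs(sum(prices) / len(prices) - avg) <= tol
    return within_bounds and mean_matches
```

Because the average is fixed, raising the subsidy in one period forces a lower subsidy in another, which is what keeps the BEVA from pushing every period to the same bound.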

3.2 Lower-Level: Battery Electric Vehicle User Cost Minimization

Each battery electric vehicle user (or the aggregate fleet behaving as a single agent) aims to minimize its total cost, which is not purely monetary. It incorporates a comprehensive utility function $\Phi_t^{EV}$ that captures both economic cost and “discomfort” from deviating from an uncontrolled charging baseline $P^B(t)$. The objective is:
$$\min F^{EV} = \sum_{t=1}^{T} [ \underbrace{\lambda^{cs}(t) P(t) \Delta t}_{\text{Electricity Cost}} - \underbrace{(\lambda^{dc\_dr}(t)P^{dr}(t) + \lambda^{dc\_up}(t)P^{up}(t) - \lambda^{cc}(t)P^{dn}(t)) \Delta t}_{\text{Subsidy Revenue}} + \underbrace{\Phi_t^{EV}(P(t))}_{\text{User Utility}} ]$$

The utility function $\Phi_t^{EV}(P(t))$ is defined as:
$$\Phi_t^{EV}(P(t)) = \alpha \cdot \underbrace{[-a_1 P(t)^2 + b_1 P(t)]}_{\text{Charging Consumption Satisfaction}} + \beta \cdot \underbrace{[a_2 (P(t)-P^B(t))^2 + b_2 (P(t)-P^B(t))]}_{\text{Comfort Loss}}$$
where $\alpha + \beta = 1$. The first component models diminishing satisfaction with higher electricity expenditure (a concave function when $a_1 > 0$). The second component quantifies the discomfort when the actual charging profile $P(t)$ deviates from the preferred baseline $P^B(t)$. Coefficients $a_1, b_1, a_2, b_2$ and weights $\alpha, \beta$ allow modeling different user types:

Price-Sensitive (Low-Income) Users: High $\alpha$, low $\beta$. They prioritize cost savings over charging convenience.
Comfort-Priority (High-Income) Users: Low $\alpha$, high $\beta$. They are less sensitive to price and prefer a charging schedule close to their baseline.

This function is piecewise linearized for computational tractability.
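The utility function and one possible piecewise linearization can be sketched as follows. The paper does not state which linearization scheme is used, so the equally spaced breakpoints with linear interpolation shown here are an assumption, as are the function names.

```python
def phi_ev(p, p_base, alpha, a1, b1, a2, b2):
    """Weighted utility: satisfaction term plus comfort-loss term (beta = 1 - alpha)."""
    beta = 1.0 - alpha
    satisfaction = -a1 * p ** 2 + b1 * p
    comfort_loss = a2 * (p - p_base) ** 2 + b2 * (p - p_base)
    return alpha * satisfaction + beta * comfort_loss

def piecewise_linear(f, lo, hi, n_segments):
    """Return a function that linearly interpolates f on equally spaced breakpoints."""
    xs = [lo + (hi - lo) * k / n_segments for k in range(n_segments + 1)]
    ys = [f(x) for x in xs]

    def approx(x):
        if not lo <= x <= hi:
            raise ValueError("x is outside the linearization range")
        for k in range(n_segments):
            if x <= xs[k + 1]:
                # Interpolate between breakpoints k and k + 1.
                w = (x - xs[k]) / (xs[k + 1] - xs[k])
                return (1.0 - w) * ys[k] + w * ys[k + 1]
        return ys[-1]

    return approx
```

The approximation is exact at the breakpoints, and for a quadratic its error between breakpoints shrinks with the square of the segment width, so the number of segments trades accuracy against model size.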

The lower-level problem is subject to the power and energy balance constraints for the aggregate fleet, linking the decision variables $P(t)$, $P^{up}(t)$, $P^{dn}(t)$, and $P^{dr}(t)$:
$$
\begin{aligned}
P(t) &= P^{chr}(t) + P^{dn}(t) - P^{up}(t) - P^{dr}(t) \\
0 &\le P^{dn}(t) \le P^+(t) \\
0 &\le P^{up}(t) + P^{dr}(t) \le P^-(t) \\
E^-(t) &\le E(t) = E(t-1) + (P(t)\eta_c - \frac{P^{up}(t)+P^{dr}(t)}{\eta_d}) \Delta t \le E^+(t)
\end{aligned}
$$

4. Model Solution: Single-Level Reformulation

The bilevel Stackelberg game is solved by reformulating it into a single-level Mathematical Program with Equilibrium Constraints (MPEC). This is achieved by replacing the lower-level optimization problem with its necessary and sufficient Karush-Kuhn-Tucker (KKT) conditions, which are then appended as constraints to the upper-level problem.

The Lagrangian $\mathcal{L}$ of the lower-level problem is constructed, and the stationarity, primal/dual feasibility, and complementary slackness conditions are derived. The nonlinear complementary slackness conditions are linearized using the big-M method, introducing binary variables. For example, for a constraint $g(x) \le 0$ with dual variable $\mu \ge 0$, the condition $\mu g(x) = 0$ is linearized as:
$$0 \le \mu \le M \cdot \zeta, \quad 0 \le -g(x) \le M \cdot (1-\zeta)$$
where $\zeta \in \{0,1\}$ and $M$ is a sufficiently large constant.
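The effect of the big-M constraints can be checked numerically with a small sketch. The function names are illustrative; the point is that a pair $(\mu, g)$ is admissible only if one of the two is zero, and that a poorly chosen $M$ can exclude valid solutions.

```python
def big_m_holds(mu, g, zeta, big_m):
    """Evaluate 0 <= mu <= M*zeta and 0 <= -g <= M*(1 - zeta) for binary zeta."""
    if zeta not in (0, 1):
        raise ValueError("zeta must be binary")
    return 0 <= mu <= big_m * zeta and 0 <= -g <= big_m * (1 - zeta)

def complementary(mu, g, big_m):
    """True if some binary zeta makes the big-M constraints feasible."""
    return any(big_m_holds(mu, g, zeta, big_m) for zeta in (0, 1))
```

For example, `complementary(3.0, 0.0, 100)` and `complementary(0.0, -5.0, 100)` hold, while `complementary(3.0, -5.0, 100)` does not because both the multiplier and the slack are nonzero. A multiplier larger than $M$, as in `complementary(300.0, 0.0, 100)`, is rejected even though it satisfies $\mu g(x) = 0$, which is why $M$ must bound the true dual values.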

Furthermore, strong duality holds for the linearized lower-level problem. The equality of the primal and dual objective values at optimality provides an additional linear equation that helps simplify the model by eliminating the bilinear terms (subsidy price $\times$ response power) that appear in both the upper and lower-level objectives.
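The role of strong duality can be illustrated on a generic lower-level linear program. The notation used here ($y$, $c$, $A$, $b$, $\mu$) is introduced only for this sketch and is not the paper's. Suppose the follower solves
$$\min_{y} \; (c + \lambda)^{\top} y \quad \text{s.t.} \quad A y \le b \quad (\mu \ge 0)$$
where $\lambda$ is the leader's price vector and $\mu$ is the dual variable of the constraint. The dual problem is $\max_{\mu \ge 0} \; -b^{\top}\mu$ subject to $A^{\top}\mu + c + \lambda = 0$, so at a lower-level optimum strong duality gives
$$(c + \lambda)^{\top} y = -b^{\top}\mu \quad \Longrightarrow \quad \lambda^{\top} y = -b^{\top}\mu - c^{\top} y$$
The bilinear product $\lambda^{\top} y$ can therefore be replaced by an expression that is linear in $y$ and $\mu$, provided $b$ does not depend on $\lambda$.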

The final reformulated model is a Mixed-Integer Linear Program (MILP) that can be efficiently solved using commercial solvers like Gurobi within a modeling environment such as YALMIP in MATLAB.

5. Case Study and Analysis

A case study is designed with a BEVA managing 2000 battery electric vehicles, equally split between daytime and nighttime commuting patterns (parameters in Table 1). Market price data, including DA energy prices, real-time prices, and frequency regulation capacity/mileage prices, are based on historical patterns. The DR market is assumed to be active during two critical peak periods (13:00-14:00, 21:00-23:00) with a fixed DR market price $\lambda^{dr}$.

Table 2: Time-of-Use Tariff for Battery Electric Vehicle Charging
Period	Time	Retail Price ($\lambda^{cs}$) (元/kWh)
Super-Peak	13:00-14:00, 21:00-23:00	1.3281
Peak	11:00-13:00, 14:00-18:00, 20:00-21:00	1.1567
Off-Peak	07:00-11:00, 18:00-20:00, 23:00-01:00	0.8464
Valley	01:00-07:00	0.5361
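For use in a simulation, Table 2 can be encoded as a lookup keyed by hour. The sketch assumes half-open hourly windows (start inclusive, end exclusive) and splits the 23:00-01:00 window at midnight; the names `TOU_TARIFF` and `retail_price` are illustrative.

```python
# (period name, [(start hour, end hour), ...], retail price in yuan/kWh) from Table 2.
TOU_TARIFF = [
    ("super_peak", [(13, 14), (21, 23)], 1.3281),
    ("peak", [(11, 13), (14, 18), (20, 21)], 1.1567),
    ("off_peak", [(7, 11), (18, 20), (23, 24), (0, 1)], 0.8464),
    ("valley", [(1, 7)], 0.5361),
]

def retail_price(hour):
    """Return the retail price for an hour of day in [0, 24)."""
    h = hour % 24
    for _name, windows, price in TOU_TARIFF:
        if any(start <= h < end for start, end in windows):
            return price
    raise ValueError("hour is not covered by the tariff table")
```

Every hour of the day maps to exactly one period, so `[retail_price(h) for h in range(24)]` yields the full daily price vector $\lambda^{cs}(t)$ at hourly resolution.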

Four scenarios are compared:

Scenario 1 (Baseline): Smart charging without BEVA control, where battery electric vehicles simply charge at times of low retail tariff.
Scenario 2: BEVA participates in the DA and Frequency Regulation markets only.
Scenario 3: BEVA participates in the DA and Demand Response markets only.
Scenario 4: BEVA participates in the DA, Frequency Regulation, and Demand Response markets (full model).

The optimization results for Scenario 4 (full model) show that the BEVA optimally sets subsidy prices. Down-regulation subsidies $\lambda^{cc}(t)$ are generally higher than up-regulation subsidies $\lambda^{dc\_up}(t)$ to incentivize charging during regulation events, which also helps meet the fleet’s energy needs. High subsidy prices are offered during periods of high FR or DR demand to ensure sufficient user response.

The aggregate charging power $P(t)$ shows significant shifting compared to the baseline. During DR events (e.g., at 13:00), the fleet discharges power ($P(t) < 0$), providing peak shaving. The fleet’s energy trajectory $E(t)$ remains comfortably within its aggregated upper and lower boundaries throughout the day.

Table 3: Economic Comparison Across Different Market Participation Scenarios
Cost/Revenue (元)	Scenario 1 (Baseline)	Scenario 2 (DA+FR)	Scenario 3 (DA+DR)	Scenario 4 (DA+FR+DR)
BEVA Net Profit	5,476	60,352	16,550	56,696
BEVA FR Revenue	0	60,467	0	54,931
BEVA DR Revenue	0	0	14,782	7,571
Total User Cost	37,898	31,420	31,411	25,656
User Electricity Bill	37,898	38,802	38,802	39,920
User Subsidy Income	0	7,381	7,391	14,263

Key Findings from the Case Study:

Mutual Benefits: All BEVA-controlled scenarios (2, 3, 4) significantly increase BEVA profit and reduce the total cost for battery electric vehicle users compared to the smart charging baseline (Scenario 1). In Scenario 4, user cost is reduced by 32.3%, while BEVA profit increases by over 935%.
Market Interaction: The BEVA’s profit does not monotonically increase with the number of markets participated in. Scenario 2 (DA+FR) yields higher profit than Scenario 4 (DA+FR+DR) under the given DR market price. This is because the DR subsidy paid to users can be higher than the FR subsidy, and the DR market activation periods may conflict with more lucrative FR opportunities. The BEVA’s participation in DR is highly sensitive to the exogenous DR market price $\lambda^{dr}$. A breakeven analysis shows that the BEVA only has a strong incentive to dispatch battery electric vehicles for DR when $\lambda^{dr}$ exceeds a certain threshold (e.g., 350 元/MWh in this case).
Impact of User Preferences: Modeling user utility is crucial. For price-sensitive users (high $\alpha$), the BEVA sets more volatile subsidy prices, offering high premiums during critical grid events to elicit a strong response. Their resulting charging profile is more dynamic. For comfort-priority users (high $\beta$), the BEVA sets more stable subsidies, and the charging profile remains closer to the baseline. Catering to price-sensitive users provides greater cost savings for them (11.4% lower cost than comfort-priority users) and slightly higher profit for the BEVA.
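The headline percentages for Scenario 4 follow directly from Table 3, as this short check illustrates. The dictionary layout and the helper `pct_change` are illustrative; the numbers are copied from the table.

```python
# Values from Table 3, in yuan.
TABLE_3 = {
    "beva_net_profit": {"S1": 5476, "S2": 60352, "S3": 16550, "S4": 56696},
    "total_user_cost": {"S1": 37898, "S2": 31420, "S3": 31411, "S4": 25656},
    "user_bill": {"S1": 37898, "S2": 38802, "S3": 38802, "S4": 39920},
    "user_subsidy": {"S1": 0, "S2": 7381, "S3": 7391, "S4": 14263},
}

def pct_change(new, old):
    """Percentage change from old to new."""
    return 100.0 * (new - old) / old

user_cost_reduction = -pct_change(TABLE_3["total_user_cost"]["S4"],
                                  TABLE_3["total_user_cost"]["S1"])
profit_increase = pct_change(TABLE_3["beva_net_profit"]["S4"],
                             TABLE_3["beva_net_profit"]["S1"])
print(f"User cost reduction: {user_cost_reduction:.1f}%")
print(f"BEVA profit increase: {profit_increase:.1f}%")
```

The same table also shows that total user cost equals the electricity bill minus subsidy income in each scenario, up to rounding of one yuan.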

6. Conclusion

This paper presents a comprehensive Stackelberg game-based framework for a Battery Electric Vehicle Aggregator to determine optimal pricing and dispatch strategies in multi-stage electricity markets. The model successfully integrates the technical constraints of a battery electric vehicle fleet, the sequential nature of energy and ancillary service markets, and the diverse economic/comfort preferences of end-users.

The results demonstrate that coordinated participation of battery electric vehicles through an aggregator can create significant value for both the grid (through FR and DR services) and the participants (lower user costs, higher aggregator profits). The analysis highlights critical practical insights: the profitability of participating in demand response programs is highly contingent on the compensation level offered by the grid operator, and the aggregator’s strategy must adapt to the predominant consumption preferences within its customer portfolio.

Future work will focus on incorporating uncertainties in market prices, battery electric vehicle availability, and user response into a robust or stochastic optimization framework. Additionally, fair and transparent profit-sharing mechanisms among battery electric vehicle users within the aggregator’s pool warrant further investigation.