The new energy vehicle industry, and electric vehicles (EVs) in particular, has become a pivotal sector in the global transition toward sustainable transportation. Because it is a strategic emerging industry, its development trajectory matters for policy-making, corporate strategy, and consumer behavior. Traditional statistical models, such as linear regression or classical time series methods, often fail to capture the nonlinear dynamics and long-term dependencies in EV sales data, which are shaped by technological advances, policy incentives, and market fluctuations. The resulting forecasts can be inaccurate and hinder decision-making. Long Short-Term Memory (LSTM) networks, a type of recurrent neural network, are better suited to sequential data because their gating mechanisms retain important information over long periods. This study uses an LSTM model to predict annual EV sales in China from 2025 to 2035, based on historical data from 2010 to 2024. The data are normalized, the hyperparameters are tuned, and the model is evaluated with several metrics in order to provide a robust forecasting framework. The findings are intended to support strategic planning for businesses and inform purchasing decisions for consumers, thereby supporting the sustainable growth of the EV industry.

The dataset comprises annual EV sales figures for China from 2010 to 2024, sourced from publicly available official records. During data collection, monthly sales data for certain years were found to be incomplete. To preserve continuity, the missing values were filled by linear interpolation, which yields a consistent historical dataset and maintains the temporal structure that LSTM modeling requires. Historical EV sales rise steadily, with especially strong growth in recent years, reflecting the rising adoption of EVs globally. The complete dataset after interpolation is visualized to illustrate this progression.
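
The interpolation step can be illustrated with a minimal sketch. The function name `interpolate_linear` and the sample series are hypothetical and are not taken from the study; the sketch only shows how gaps between two observed values are filled along a straight line, using the Python standard library.

```python
def interpolate_linear(values):
    """Fill None gaps by straight-line interpolation between known neighbours.

    Gaps before the first or after the last observed value are left as None,
    because interpolation (unlike extrapolation) needs a value on both sides.
    """
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    if not known:
        raise ValueError("at least one observed value is required")
    for left, right in zip(known, known[1:]):
        span = right - left
        for i in range(left + 1, right):
            weight = (i - left) / span  # fraction of the way from left to right
            filled[i] = filled[left] + weight * (filled[right] - filled[left])
    return filled


# Hypothetical monthly series with two missing months (not the study's data).
monthly = [10.0, None, None, 16.0, 18.0]
print(interpolate_linear(monthly))
```

Each missing month receives a value proportional to its position between the two nearest observed months, which keeps the series evenly spaced in time.
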
To prepare the data for LSTM modeling, normalization was performed to eliminate scale differences and enhance training efficiency. The Min-Max normalization method was applied, transforming the original sales data into a [0, 1] range. The formula used is as follows:
$$ X_{\text{norm}} = \frac{X - X_{\text{min}}}{X_{\text{max}} - X_{\text{min}}} $$
Here, \( X \) represents the original sales data, \( X_{\text{min}} \) and \( X_{\text{max}} \) are the minimum and maximum values in the dataset, and \( X_{\text{norm}} \) denotes the normalized data. Both extremes are computed from the 2010–2024 sales data, and each year’s sales figure \( X_i \) is transformed accordingly. Placing all inputs on a comparable scale accelerates convergence during training and can improve prediction accuracy. The table below lists the normalized values that serve as inputs to the LSTM model.
| Year | Normalized Sales |
|---|---|
| 2010 | 0.0000 |
| 2011 | 0.0001 |
| 2012 | 0.0006 |
| 2013 | 0.0010 |
| 2014 | 0.0067 |
| 2015 | 0.0320 |
| 2016 | 0.0494 |
| 2017 | 0.0761 |
| 2018 | 0.1235 |
| 2019 | 0.1185 |
| 2020 | 0.1344 |
| 2021 | 0.3474 |
| 2022 | 0.6802 |
| 2023 | 0.9381 |
| 2024 | 1.0000 |
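
As a minimal sketch of the Min-Max formula, the snippet below scales a short series to the [0, 1] range. The function name `min_max_normalize` and the sample figures are illustrative assumptions, since the raw sales values are not listed here.

```python
def min_max_normalize(values):
    """Scale values to [0, 1] and return the extremes needed to undo the scaling."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        raise ValueError("a constant series cannot be min-max scaled")
    scaled = [(x - x_min) / (x_max - x_min) for x in values]
    return scaled, x_min, x_max


# Hypothetical sales figures (not the study's raw data).
sales = [2.0, 5.0, 11.0, 20.0]
normalized, x_min, x_max = min_max_normalize(sales)
print(normalized)
```

The smallest value always maps to 0 and the largest to 1, which matches the table, where 2010 appears as 0.0000 and 2024 as 1.0000.
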

The LSTM architecture captures temporal dependencies in EV sales data through three gating mechanisms: the forget gate, the input gate, and the output gate. These gates regulate the flow of information, allowing the model to learn the long-term patterns needed for forecasting. The mathematical formulation of each gate is as follows:
Forget Gate: This gate determines which information to discard from the cell state. It is computed as:
$$ f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f) $$
Here, \( f_t \) is the forget gate’s output, \( \sigma \) denotes the sigmoid activation function, \( W_f \) is the weight matrix, \( [h_{t-1}, x_t] \) is the concatenation of the previous hidden state \( h_{t-1} \) and the current input \( x_t \), and \( b_f \) is the bias vector. This mechanism filters out historical information in the EV sales sequence that is no longer relevant.
Input Gate: This gate controls the extent to which new information is stored in the cell state. It involves two steps: first, a sigmoid function decides which values to update, and second, a tanh function generates candidate values. The equations are:
$$ i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i) $$
$$ \tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C) $$
where \( i_t \) is the input gate output, \( \tilde{C}_t \) is the candidate cell state, \( W_i \) and \( W_C \) are weight matrices, and \( b_i \) and \( b_C \) are bias terms. This allows the model to incorporate new trends in EV sales.
Cell State Update: The cell state is updated by combining the forget gate and input gate outputs:
$$ C_t = f_t \cdot C_{t-1} + i_t \cdot \tilde{C}_t $$
Both products in this update are element-wise. The forget gate scales the previous cell state \( C_{t-1} \) and the input gate scales the candidate \( \tilde{C}_t \), so the model retains relevant past information while integrating new inputs.
Output Gate: This gate determines the hidden state output, which influences the final prediction. It is calculated as:
$$ o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o) $$
$$ h_t = o_t \cdot \tanh(C_t) $$
Here, \( o_t \) is the output gate result, \( h_t \) is the current hidden state, and the product is element-wise. The hidden state is what the LSTM passes on to produce predictions of EV sales from the learned temporal patterns.
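
The gate equations above can be traced in a single forward step. The following is a plain-Python sketch, not the implementation used in this study: the helper names (`affine`, `lstm_step`, `init_params`), the hidden size of 4, and the random weights are assumptions chosen only to keep the example small.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, b, v):
    """Return W.v + b for a weight matrix W (list of rows) and bias vector b."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step following the forget, input, and output gate equations."""
    concat = h_prev + x_t  # list concatenation, i.e. [h_{t-1}, x_t]
    f_t = [sigmoid(z) for z in affine(params["W_f"], params["b_f"], concat)]
    i_t = [sigmoid(z) for z in affine(params["W_i"], params["b_i"], concat)]
    c_tilde = [math.tanh(z) for z in affine(params["W_C"], params["b_C"], concat)]
    o_t = [sigmoid(z) for z in affine(params["W_o"], params["b_o"], concat)]
    # Cell state update and hidden state use element-wise products.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_tilde)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


def init_params(hidden_size, input_size, seed=0):
    """Small random weights and zero biases (illustrative, untrained)."""
    rng = random.Random(seed)
    params = {}
    for gate in ("f", "i", "C", "o"):
        params[f"W_{gate}"] = [
            [rng.uniform(-0.5, 0.5) for _ in range(hidden_size + input_size)]
            for _ in range(hidden_size)
        ]
        params[f"b_{gate}"] = [0.0] * hidden_size
    return params


hidden_size, input_size = 4, 1  # assumed sizes for the sketch
params = init_params(hidden_size, input_size)
h, c = [0.0] * hidden_size, [0.0] * hidden_size
for x in [0.1235, 0.1185, 0.1344]:  # normalized 2018-2020 values from the table
    h, c = lstm_step([x], h, c, params)
print(h)
```

Because \( o_t \) lies in (0, 1) and \( \tanh(C_t) \) lies in (-1, 1), every component of the hidden state stays strictly between -1 and 1 regardless of the weights.
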
To optimize the LSTM model for EV sales forecasting, several hyperparameters were configured and tuned. The model was trained with the Adam optimizer, using mean squared error (MSE) as the loss function. The dataset was split into training and test sets in a 9:1 ratio. Key parameters, such as the number of hidden units, learning rate, batch size, and dropout rate, were selected to limit overfitting and improve generalization. The table below lists the hyperparameter values used in training.
| Parameter | Value |
|---|---|
| Input Features | 1 |
| Output Features | 1 |
| Optimizer | Adam |
| Loss Function | MSE |
| Training-Test Split | 9:1 |
| Max Epochs | 1500 |
| Hidden Units | 100 |
| Epochs | 125 |
| Learning Rate | 0.051 |
| Batch Size | 85 |
| Dropout Rate | 0.2 |
| Learning Rate Decay Factor | 0.33 |
| Gradient Threshold | 0.085 |
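
One way to obtain a 9:1 split from a single series is to cut it into sliding windows and keep the most recent windows for testing. The sketch below assumes a look-back window of three years and a chronological (unshuffled) split; neither detail is stated in the text, so both are assumptions, as are the function names.

```python
def make_windows(series, window):
    """Turn a 1-D series into (input window, next value) pairs."""
    samples = []
    for start in range(len(series) - window):
        samples.append((series[start:start + window], series[start + window]))
    return samples


def chronological_split(samples, train_fraction=0.9):
    """Split without shuffling so the test samples are the most recent ones."""
    cut = int(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]


# Normalized annual values for 2010-2024 from the normalization table.
normalized = [0.0000, 0.0001, 0.0006, 0.0010, 0.0067, 0.0320, 0.0494, 0.0761,
              0.1235, 0.1185, 0.1344, 0.3474, 0.6802, 0.9381, 1.0000]
WINDOW = 3  # assumed look-back length; the source does not state one
samples = make_windows(normalized, WINDOW)
train, test = chronological_split(samples)
print(len(train), len(test))
```

Keeping the split chronological means the model is always evaluated on years that come after everything it was trained on, which mirrors how a forecast is used in practice.
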

The model’s predictive accuracy was evaluated with five metrics: the coefficient of determination (R²), mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE). Together these indicators describe both the model’s fit and the magnitude of its errors. The formulas are as follows:
$$ R^2 = 1 - \frac{\sum_{i=1}^{N} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2} $$
where \( y_i \) is the actual sales value, \( \hat{y}_i \) is the predicted value, \( \bar{y} \) is the mean of actual values, and \( N \) is the number of samples. A higher R² value indicates better explanatory power for EV car sales trends.
$$ \text{MSE} = \frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2 $$
$$ \text{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2} $$
MSE is the mean of the squared errors, and RMSE is its square root, which expresses the error in the same units as the sales data. Lower values of either indicate more precise forecasts of EV sales.
$$ \text{MAE} = \frac{1}{N} \sum_{i=1}^{N} |y_i - \hat{y}_i| $$
$$ \text{MAPE} = \frac{100\%}{N} \sum_{i=1}^{N} \left| \frac{y_i - \hat{y}_i}{y_i} \right| $$
MAE measures the average absolute error, while MAPE expresses error as a percentage, making it useful for comparing across different scales in EV car sales data. The evaluation results demonstrate the model’s robustness, as detailed in the subsequent section.
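
The five formulas translate directly into code. The sketch below is a plain-Python illustration; the function name `regression_metrics` and the sample values are hypothetical and are not the study's test data.

```python
import math


def regression_metrics(actual, predicted):
    """Compute R2, MSE, RMSE, MAE, and MAPE (in percent) for paired values.

    R2 is undefined when all actual values are equal, and MAPE is undefined
    when any actual value is zero; both cases raise ZeroDivisionError here.
    """
    n = len(actual)
    if n == 0 or n != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal in length")
    errors = [y - y_hat for y, y_hat in zip(actual, predicted)]
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)                   # residual sum of squares
    ss_tot = sum((y - mean_actual) ** 2 for y in actual)  # total sum of squares
    mse = ss_res / n
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": sum(abs(e) for e in errors) / n,
        "MAPE": 100.0 / n * sum(abs(e / y) for e, y in zip(errors, actual)),
    }


# Hypothetical actual and predicted values (not the study's test set).
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

Since MAPE divides by each actual value, it is best computed on the original scale rather than on normalized data, where the smallest observation is exactly zero.
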
After training, the LSTM model converged stably: both the training and validation loss curves declined steadily and plateaued at low values. This suggests that the model learned the temporal patterns in the EV sales data without overfitting and can generalize to unseen data. Such stability matters for reliable predictions, because it indicates that the model captures underlying trends rather than noise.
Evaluation on the test set yielded strong performance metrics, summarized in the table below. The high R² value and low error values support the LSTM model’s suitability for forecasting EV sales and provide a foundation for the projections that follow.
| Metric | Value |
|---|---|
| R² | 0.97447 |
| MSE | 25.011 |
| RMSE | 5.0011 |
| MAE | 3.6013 |
| MAPE | 0.37885% |

With the trained model, EV sales from 2025 to 2035 were forecast from the historical data. The normalized predictions were converted back to the original scale using the inverse normalization formula:
$$ X = X_{\text{norm}} \cdot (X_{\text{max}} - X_{\text{min}}) + X_{\text{min}} $$
This process yielded the projected EV sales figures presented in the table below. The results indicate substantial growth, with sales rising in every year of the forecast period. This trend aligns with global shifts toward sustainable transportation and points to an expanding EV market.
| Year | Sales (10,000 units) |
|---|---|
| 2025 | 791.3 |
| 2026 | 853.6 |
| 2027 | 906.4 |
| 2028 | 1001.1 |
| 2029 | 1123.8 |
| 2030 | 1240.5 |
| 2031 | 1395.7 |
| 2032 | 1546.2 |
| 2033 | 1714.1 |
| 2034 | 1996.4 |
| 2035 | 2538.6 |
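
The inverse normalization applied before the table can be sketched as follows. The values of `X_MIN` and `X_MAX` are hypothetical placeholders, because the historical minimum and maximum are not listed here; the function name is likewise illustrative.

```python
def inverse_min_max(x_norm, x_min, x_max):
    """Map a normalized value back to the original scale."""
    return x_norm * (x_max - x_min) + x_min


X_MIN, X_MAX = 2.0, 20.0  # hypothetical extremes, not the study's values

# Round trip: normalize a value, then restore it.
original = 11.0
scaled = (original - X_MIN) / (X_MAX - X_MIN)
restored = inverse_min_max(scaled, X_MIN, X_MAX)
print(scaled, restored)

# A normalized prediction above 1 maps to a value above the historical maximum.
print(inverse_min_max(1.5, X_MIN, X_MAX))
```

Because every forecast in the table exceeds the 2024 figure, which normalizes to 1.0000, the corresponding normalized predictions must all be greater than 1.
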

The forecast EV sales trend follows a consistent upward curve, reflecting accelerating adoption and market expansion. This growth is driven by factors such as technological advances, supportive policies, and increasing consumer awareness. The rising sales underscore the importance of proactive strategies for industry stakeholders.
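
To make the shape of the forecast curve concrete, the year-over-year growth rates and the compound annual growth rate (CAGR) can be derived from the forecast table. The sketch below uses the tabulated values as given; the variable names are illustrative.

```python
# Forecast sales from the table above, in units of 10,000 vehicles.
forecast = {
    2025: 791.3, 2026: 853.6, 2027: 906.4, 2028: 1001.1, 2029: 1123.8,
    2030: 1240.5, 2031: 1395.7, 2032: 1546.2, 2033: 1714.1, 2034: 1996.4,
    2035: 2538.6,
}

years = sorted(forecast)

# Year-over-year growth: this year's sales relative to the previous year's.
yoy = {year: forecast[year] / forecast[year - 1] - 1.0 for year in years[1:]}

# Compound annual growth rate over the whole forecast horizon.
periods = years[-1] - years[0]
cagr = (forecast[years[-1]] / forecast[years[0]]) ** (1.0 / periods) - 1.0

for year, rate in yoy.items():
    print(f"{year}: {rate:.1%}")
print(f"CAGR {years[0]}-{years[-1]}: {cagr:.1%}")
```

Every year-over-year rate is positive, and the largest increase falls in the final forecast year, so the projected growth speeds up toward 2035 rather than remaining constant.
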
Based on the forecasting results, several recommendations follow for participants in the growing EV market. Businesses should intensify research and development in core technologies. As EV sales continue to climb, competition will intensify and will demand innovation in battery efficiency, charging infrastructure, and autonomous driving features. Companies should collaborate with academic institutions to accelerate technology transfer and maintain a competitive edge. Diversifying product lines across consumer segments, such as economy, premium, and commercial EVs, can capture a broader market share. Emphasizing sustainability through green manufacturing processes and battery recycling will enhance brand image and appeal to environmentally conscious consumers.
For consumers, early planning for an EV purchase is advisable, given the projected increase in sales and the potential phase-out of conventional vehicles. Understanding how EVs are used, including charging patterns and range management, can ease the transition. Consumers should also monitor government incentives, such as subsidies and tax benefits, to reduce the total cost of ownership. Staying informed about technological advances can help in selecting models with long-term value and in avoiding obsolescence caused by rapid innovation. By aligning purchasing decisions with market trends, consumers can benefit most from the evolving EV landscape.
In conclusion, the LSTM model is an effective tool for forecasting EV sales, reaching an R² of 0.97447 and a MAPE of 0.37885% on the test set. The predicted growth in EV sales from 2025 to 2035 highlights the sector’s potential and the need for strategic adaptation by both enterprises and consumers. This research contributes to the broader understanding of the EV industry’s dynamics and supports informed decision-making for sustainable development. Future work could incorporate additional variables, such as economic indicators or policy changes, to further refine the predictions.
