Electric Car User Demand Analysis with BERTopic

In recent years, the electric car industry in China has grown at an unprecedented pace, making the country a global leader in the EV market. This rapid expansion requires companies to capture user demands accurately in order to drive product innovation and service optimization. Social media platforms such as Weibo have become vital channels for users to express their opinions and needs regarding electric cars. However, traditional methods for analyzing these demands often fall short because social media moves quickly: topics evolve fast and hotspots shift frequently. Their limitations include incomplete topic coverage, weak exploration of hierarchical relationships among demands, and insufficient tracking of how demands evolve over time. To address these challenges, we employ BERTopic, a state-of-the-art topic modeling technique, to analyze user interactions related to the “BYD Han” electric car on Weibo. This approach allows us to identify hotspot topics, examine their correlations through Maslow’s Hierarchy of Needs, and track their evolution over time with dynamic topic modeling. By focusing on the China EV sector, this study provides insights into user demand characteristics and their progression, offering guidance for electric car manufacturers and policymakers in adapting to market changes.

The importance of understanding user demands in the electric car industry cannot be overstated. As the China EV market expands, users are increasingly vocal about their preferences, ranging from technical specifications to emotional and cultural aspects. Traditional survey-based methods often struggle to capture these dynamic and multifaceted demands in real-time, leading to gaps in product development and marketing strategies. In contrast, topic modeling of social media data enables a more agile and data-driven approach. The BERTopic model, in particular, leverages pre-trained language models to generate semantic embeddings, making it highly effective for handling short, informal texts like Weibo comments. This study not only applies BERTopic to the electric car domain but also integrates psychological theories to interpret the hierarchical nature of user demands. Through this comprehensive analysis, we aim to uncover the underlying patterns in electric car user behavior, which can inform strategic decisions in the rapidly evolving China EV landscape.

Previous research on electric cars has primarily focused on technological advancements, policy impacts, and market trends from production and governmental perspectives. For instance, studies analyzing patents and policy documents have highlighted innovations in battery technology and regulatory frameworks. However, these approaches often overlook the user-centric view, which is crucial for aligning products with market expectations. In the context of the China EV market, user feedback on social media reveals a broader spectrum of demands, including design aesthetics, cultural identity, and post-purchase experiences. Early studies using models like LDA and word2vec have identified topics such as battery performance and range anxiety, but they frequently miss emerging themes like “Chinese-style design” due to limitations in handling semantic nuances and dynamic text data. The BERTopic model overcomes these issues by incorporating contextual embeddings, allowing for more accurate topic extraction and evolution tracking. This study builds on existing literature by emphasizing user-generated content and employing advanced topic modeling to provide a holistic view of electric car demands in China.

Our methodology involves several key steps to ensure a robust analysis of electric car user demands. First, we collected data from Weibo using the keyword “BYD Han,” covering the period from January 2020 to July 2023. This timeframe captures critical phases in the China EV industry, including technological breakthroughs and market validation. After initial data retrieval, we performed preprocessing to clean the text, removing duplicates, HTML tags, URLs, and emoticons. We then used the Jieba library for word segmentation, enhanced with custom dictionaries containing electric car-specific terms and synonyms. This preprocessing resulted in 3,057 valid posts, which were vectorized using the MiniLM-L12-v2 sentence transformer model to generate semantic embeddings. The BERTopic model was applied with parameters optimized for clustering, including a minimum cluster size of 10 and automatic topic number determination. The process involved dimensionality reduction with UMAP, clustering with HDBSCAN, and topic representation using c-TF-IDF, as defined by the formula:

$$ W_{x,c} = \text{tf}_{x,c} \times \log \left(1 + \frac{A}{f_x}\right) $$

where \( W_{x,c} \) is the c-TF-IDF weight for word \( x \) in cluster \( c \), \( \text{tf}_{x,c} \) is the term frequency, \( A \) is the average number of words per cluster, and \( f_x \) is the frequency of \( x \) across all clusters. For dynamic analysis, we employed the DTM component of BERTopic, which tracks topic evolution over quarterly intervals using cosine similarity to measure topic relatedness:

$$ \text{cosine\_similarity}(A,B) = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^2} \sqrt{\sum_{i=1}^{n} B_i^2}} $$

This formula calculates the similarity between topic vectors \( A \) and \( B \), enabling us to observe how electric car demands shift over time in the China EV context.
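To make the c-TF-IDF weighting defined earlier concrete, the following is a minimal sketch that applies the formula literally to raw term counts. The function name `c_tf_idf` and the toy clusters are hypothetical and are not taken from the study; production implementations may additionally normalize the term frequencies.

```python
import math
from collections import Counter


def c_tf_idf(cluster_tokens):
    """Return {cluster: {word: weight}} using W = tf * log(1 + A / f_x)."""
    # tf_{x,c}: raw count of each word inside each cluster
    tf = {c: Counter(tokens) for c, tokens in cluster_tokens.items()}
    # A: average number of words per cluster
    avg_words = sum(len(t) for t in cluster_tokens.values()) / len(cluster_tokens)
    # f_x: count of each word summed over all clusters
    f = Counter()
    for counts in tf.values():
        f.update(counts)
    return {
        c: {x: n * math.log(1 + avg_words / f[x]) for x, n in counts.items()}
        for c, counts in tf.items()
    }


# Hypothetical toy clusters (not taken from the study's data)
clusters = {
    "design": ["design", "taillight", "design", "chinese_knot"],
    "battery": ["battery", "blade_battery", "battery", "design"],
}
weights = c_tf_idf(clusters)
top_battery_word = max(weights["battery"], key=weights["battery"].get)
print(top_battery_word, round(weights["battery"][top_battery_word], 3))
```

In this toy example, a word that is frequent inside one cluster but rare elsewhere receives the highest weight for that cluster, while a word shared across clusters is down-weighted.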

The data collection and preprocessing phase yielded a rich dataset for analyzing electric car discussions. We focused on user interactions related to the BYD Han, a prominent model in the China EV market, to ensure relevance and specificity. The preprocessing steps included tokenization and stop-word removal, which are critical for handling the informal and noisy nature of social media text. By incorporating domain-specific dictionaries, we improved the accuracy of term recognition, capturing key electric car concepts such as “blade battery” and “range.” The use of BERTopic allowed us to generate high-quality topics without predefined numbers, adapting to the data’s inherent structure. This methodological rigor ensures that our findings on electric car user demands are both reliable and actionable for stakeholders in the China EV industry.
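The cleaning steps described in the methodology (removing HTML tags, URLs, emoticons, and duplicate posts) could look roughly like the sketch below. The regular expressions and the function names `clean_post` and `clean_corpus` are illustrative assumptions; in particular, the sketch assumes emoticons appear as short bracketed codes such as `[doge]`, which may not match how the original data encoded them. Word segmentation with Jieba and the custom dictionary would follow this step and is not shown.

```python
import re

# Assumed patterns; the study does not publish its exact cleaning rules.
TAG_RE = re.compile(r"<[^>]+>")                           # HTML tags
URL_RE = re.compile(r"https?://[A-Za-z0-9./?=&%_#:-]+")   # ASCII URLs only
EMOTICON_RE = re.compile(r"\[[^\[\]]{1,8}\]")             # bracketed codes like [doge]
WS_RE = re.compile(r"\s+")


def clean_post(text):
    """Strip tags, URLs, and bracketed emoticons, then collapse whitespace."""
    text = TAG_RE.sub(" ", text)
    text = URL_RE.sub(" ", text)
    text = EMOTICON_RE.sub(" ", text)
    return WS_RE.sub(" ", text).strip()


def clean_corpus(posts):
    """Clean every post, dropping empty results and exact duplicates."""
    seen = set()
    cleaned = []
    for post in posts:
        text = clean_post(post)
        if text and text not in seen:
            seen.add(text)
            cleaned.append(text)
    return cleaned


# Hypothetical raw posts for illustration
raw_posts = [
    '<a href="x">BYD Han</a> blade battery [doge] https://t.cn/abc',
    "BYD Han blade battery",
    "   ",
]
cleaned_posts = clean_corpus(raw_posts)
print(cleaned_posts)
```

Deduplicating after cleaning, rather than before, means that reposts differing only in links or emoticons are also collapsed into one entry.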

In the results section, we begin by identifying hotspot topics from the initial 37 clusters generated by BERTopic. These topics represent the most frequently discussed aspects of electric cars among users, with the top 14 topics selected for detailed analysis based on document frequency. The table below summarizes these hotspot topics, their key feature words, and their prevalence in the dataset, highlighting the core areas of user interest in the China EV market.

| Topic ID | Topic Name | Top Feature Words | Document Frequency |
| --- | --- | --- | --- |
| Topic0 | New Car Release | Han EV, model, blade battery | High |
| Topic1 | Chinese-style Design | Chinese knot, design, tech sense | High |
| Topic2 | Product Sales | sales, Han family, year-on-year growth | High |
| Topic3 | Brand Comparison | Tesla, Xiaopeng P7, Model 3 | Medium |
| Topic4 | Huawei Collaboration | Huawei, phone, Hicar | Medium |
| Topic5 | New Car Delivery | delivery, Song Pro, owner | Medium |
| Topic6 | Taillight Design | taillight, design, national pride | Medium |
| Topic7 | Model Testing | Han EV, chassis, limited edition | Medium |
| Topic8 | OTA and User Feedback | OTA mania, virtual battery, energy control | Low |
| Topic9 | Brand and Market | brand, Chinese brand, global | Low |
| Topic10 | Collision and Accidents | collision, accident, fire | Low |
| Topic11 | Version and Pricing | champion edition, Han DMI, price | Low |
| Topic12 | Brand Competition | Model 3, comparison, Wuling Mini | Low |
| Topic13 | Battery Technology | blade battery, battery, ternary lithium | Low |

Analysis of these hotspot topics reveals that users in the China EV space are particularly engaged with new product launches, aesthetic elements like Chinese-style design, and sales performance. For instance, Topic0 (New Car Release) dominates discussions, with feature words such as “Han EV” and “blade battery” indicating strong interest in model-specific innovations and safety features. Topic1 (Chinese-style Design) emphasizes cultural identity, with terms like “Chinese knot” reflecting a desire for electric cars that incorporate traditional elements, which is a unique aspect of the China EV market. Topic2 (Product Sales) highlights user attention to market dynamics, where sales figures and growth rates serve as indicators of brand success. The distribution of feature word weights within topics shows that performance and design topics have balanced discussions, whereas brand and battery topics are more focused, suggesting that marketing and technological branding strategies effectively capture user interest in electric cars.

To delve deeper into the relationships between topics, we conducted a correlation analysis using cosine similarity. This involved computing the similarity scores between all pairs of initial topics, resulting in a heatmap that visualizes topic relatedness. The analysis identified several highly correlated topic pairs, which we then examined through the lens of Maslow’s Hierarchy of Needs Theory. This psychological framework categorizes human demands into five levels: physiological, safety, social, esteem, and self-actualization. In the context of electric cars, we mapped these levels to user discussions, as shown in the table below, which outlines topic pairs, their correlation scores, and corresponding demand levels.

| Topic Pair | Correlation Score | Maslow Level | Interpretation |
| --- | --- | --- | --- |
| Topic8 & Topic20 | 0.881 | Self-actualization & Esteem | OTA features and design comparisons reflect advanced technological and aesthetic pursuits. |
| Topic7 & Topic25 | 0.878 | Safety & Social | Model testing and comparisons indicate concerns for safety and community engagement. |
| Topic11 & Topic22 | 0.868 | Safety & Social | Pricing and configuration discussions relate to financial security and social identity. |
| Topic31 & Topic35 | 0.860 | Social & Safety | Travel and insurance topics combine leisure needs with economic safety. |

The correlation results demonstrate that topics with high similarity often reside in adjacent or identical demand hierarchies, validating the hierarchical nature of user demands in the electric car domain. For example, the strong correlation between Topic8 (OTA and User Feedback) and Topic20 (Appearance Comparison) aligns with self-actualization and esteem needs, as users seek intelligent features and distinctive designs to express individuality and achieve personal fulfillment. Similarly, Topic7 (Model Testing) and Topic25 (Model Comparison) correlate highly, covering safety and social needs, where users evaluate vehicle reliability and engage in community discussions. This hierarchical progression implies that in the China EV market, users prioritize basic safety and performance before advancing to higher-level demands like social belonging and self-expression. Such insights can guide electric car manufacturers in developing targeted strategies that address the full spectrum of user needs, from foundational features to aspirational attributes.
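The pairwise comparison behind this correlation analysis can be sketched as follows: compute cosine similarity for every pair of topic vectors and sort the pairs from most to least similar. The vectors and topic labels here are hypothetical placeholders, not the study's actual topic embeddings, and `rank_topic_pairs` is an illustrative name.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def rank_topic_pairs(topic_vectors):
    """Return [((topic_i, topic_j), score), ...] sorted by descending score."""
    pairs = [
        ((i, j), cosine_similarity(topic_vectors[i], topic_vectors[j]))
        for i, j in combinations(sorted(topic_vectors), 2)
    ]
    return sorted(pairs, key=lambda p: p[1], reverse=True)


# Hypothetical 3-dimensional topic vectors for illustration
topic_vectors = {
    "A": [1.0, 0.0, 1.0],
    "B": [1.0, 0.1, 0.9],
    "C": [0.0, 1.0, 0.0],
}
ranked = rank_topic_pairs(topic_vectors)
for (i, j), score in ranked:
    print(i, j, round(score, 3))
```

The top-ranked pairs from such a list are the candidates one would then inspect and map onto demand levels, as done in the table of highly correlated topic pairs.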

For the dynamic analysis, we merged the initial 37 topics into 9 broader themes based on cosine similarity clustering to reduce fragmentation and enhance interpretability. The merged themes include Color and Design, Huawei Intelligence, Domestic Car Services, Purchase and Usage Experience, Model and Honors, Brand Competition, Battery Technology and Safety, New Car Launch and Promotion, and Model and Pricing. We then applied the DTM model to track the evolution of these themes over quarterly intervals from 2020 to 2023. The trends reveal significant shifts in user attention, driven by external events and market dynamics in the China EV industry. The plot of topic frequency over time shows that themes like Huawei Intelligence and Color and Design peak during specific periods, coinciding with industry collaborations and product launches. For instance, Huawei Intelligence spiked in Q2 2020 when BYD partnered with Huawei on autonomous driving solutions, while Color and Design saw renewed interest in early 2023 with new aesthetic releases. Additionally, themes such as Model and Pricing and Purchase and Usage Experience exhibit upward trends, indicating growing user focus on cost-effectiveness and holistic experiences as the electric car market becomes more competitive.
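As a simplified illustration of the quarterly tracking, the sketch below bins posts by calendar quarter and counts how often each merged theme appears. It only counts theme assignments per quarter; it does not recompute topic representations per time bin as a full dynamic topic model would. The records and the function names are hypothetical.

```python
from collections import Counter, defaultdict
from datetime import date


def quarter_label(d):
    """Map a date to a label such as '2020-Q2'."""
    return f"{d.year}-Q{(d.month - 1) // 3 + 1}"


def theme_frequency_by_quarter(records):
    """records: iterable of (posting date, theme name) pairs."""
    counts = defaultdict(Counter)
    for posted_on, theme in records:
        counts[quarter_label(posted_on)][theme] += 1
    # Labels with four-digit years sort chronologically as plain strings
    return {q: dict(c) for q, c in sorted(counts.items())}


# Hypothetical (date, theme) assignments for illustration
records = [
    (date(2020, 4, 15), "Huawei Intelligence"),
    (date(2020, 6, 30), "Huawei Intelligence"),
    (date(2020, 7, 1), "Model and Pricing"),
    (date(2023, 1, 5), "Color and Design"),
]
freq = theme_frequency_by_quarter(records)
for quarter, themes in freq.items():
    print(quarter, themes)
```

Plotting these per-quarter counts for each theme gives the kind of frequency-over-time curve in which peaks can be lined up against external events.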

The evolution analysis underscores three key patterns in the China EV market. First, industry hotspots, such as technological partnerships and design innovations, directly influence public discourse, highlighting the importance of timely and relevant product announcements. Second, intensified competition fosters rational consumption behaviors, with users increasingly comparing prices and features across electric car brands. Third, brand marketing strategies profoundly shape user interactions, as seen in the sustained high popularity of the Brand Competition theme, where users actively discuss and compare advancements like blade batteries and autonomous driving. These patterns suggest that electric car companies should monitor social media trends to anticipate demand shifts and tailor their innovations accordingly. The DTM model’s ability to capture these dynamics demonstrates its superiority over traditional methods like LDA, providing a more accurate and nuanced understanding of user demand evolution in the fast-paced electric car sector.

In conclusion, this study leverages the BERTopic model to analyze user demands in the China EV market, specifically focusing on the BYD Han electric car. Our findings reveal that user discussions center on new releases, cultural designs, and sales metrics, with feature word analysis indicating both dispersed and concentrated interests. The correlation analysis, integrated with Maslow’s theory, confirms the hierarchical structure of user demands, where lower-level needs like safety must be satisfied before higher-level social and self-actualization needs emerge. This hierarchy emphasizes the importance of a phased approach to product development and marketing in the electric car industry. The dynamic evolution analysis further shows that user attention is driven by industry events, market competition, and branding efforts, enabling companies to adapt strategies in real-time. The BERTopic model’s high consistency and contextual awareness make it a powerful tool for tracking these changes, outperforming traditional models like LDA. For electric car manufacturers, these insights recommend prioritizing core performance and safety features, then progressively incorporating social and aesthetic elements to meet evolving user expectations. This approach not only enhances product satisfaction but also fosters brand loyalty in the competitive China EV landscape. Future research could expand to multiple platforms and incorporate multimodal data to further refine demand analysis for electric cars.

The implications of this study extend beyond the electric car industry to other high-investment, technology-driven sectors such as smartphones and smart home devices. The methodology presented here—combining BERTopic with psychological theories—offers a scalable framework for understanding user demands in dynamic markets. As the China EV market continues to evolve, continuous monitoring of user-generated content will be essential for staying ahead of trends. By adopting advanced topic modeling techniques, companies can transform raw social media data into actionable insights, driving innovation and customer-centricity in the era of electric mobility.
