Multi-Format Data Conversion for EV Battery Pack Test System Scheme Editors

The quality and reliability of an electric vehicle’s (EV) core energy source—the EV battery pack—are paramount. Comprehensive testing is therefore a critical phase in the manufacturing lifecycle. Modern EV battery pack test systems rely on precise, step-by-step procedures defined in test schemes. These schemes must be authored, edited, and exchanged between different system components, such as web-based graphical editors and the physical test system’s control software. This necessity gives rise to a fundamental technical challenge: the efficient and accurate conversion of data between the formats native to these different environments. Typically, web applications favor JSON (JavaScript Object Notation) for its lightweight and human-readable structure, while many industrial control systems, including those for EV battery pack testing, utilize XML (eXtensible Markup Language) for its robustness and schema validation capabilities.

This article explores methods for implementing bidirectional data conversion between JSON and XML, specifically tailored for the context of a EV battery pack test system. We will investigate, implement, and benchmark several algorithmic approaches, focusing on their accuracy and performance when handling both concise, manually crafted test sequences and large, complex datasets generated by actual EV battery pack test cycles.

1. Data Models and the Conversion Challenge

The core task is to transform a hierarchical data structure representing a test scheme from one serialization format to another without losing semantic information. We define the target XML structure using a formal Document Type Definition (DTD) and model the corresponding JSON structure.

1.1 XML and DTD Model

An XML document for a test scheme can be modeled as a labeled, ordered tree. Let $T = (N, <_{child}, <_{sib}, root, \lambda, \rho)$ be such a tree, where:

$N$ is a finite set of nodes (elements and attributes).
$<_{child}$ and $<_{sib}$ define the child and sibling relations, respectively.
$root \in N$ is the root element.
$\lambda: N \rightarrow E$ is a function assigning an element type from set $E$ to each element node.
$\rho_{@a}: N \rightarrow Str$ is a partial function assigning a string value from set $Str$ to an attribute named $a$.

The structure of valid documents is constrained by a DTD $D = (P, R, r)$, where:
$$ P: E \rightarrow Reg(E) $$
maps each element type to a regular expression over $E$ defining its permissible content model. The syntax for these expressions is:
$$ \mu ::= \epsilon \mid e \mid (\mu | \mu) \mid (\mu , \mu) \mid \mu^* $$
where $e \in E$, $\epsilon$ denotes the empty sequence, $|$ denotes choice, $,$ denotes sequence, and $*$ denotes repetition.

For an EV battery pack test scheme, $E$ might include elements like <TestStep>, <Action> (e.g., BTS, HIPOT, Relay), and <Parameter>. A simplified DTD fragment could be:
$$ P(\text{TestSequence}) = \text{TestStep}^* $$
$$ P(\text{TestStep}) = (\text{Action}, \text{Parameter}^*) $$

1.2 JSON Data Model

The corresponding JSON structure is inherently based on key-value pairs and ordered lists. It can be modeled as a set of objects $O$. Each object $o \in O$ is a finite set of members $m$. A member is a pair $(k, v)$ where $k$ is a string key and $v$ is a value. A value $v$ can be:
$$ v ::= s \in Str \mid n \in \mathbb{R} \mid \text{true} \mid \text{false} \mid \text{null} \mid o \in O \mid [v_1, …, v_n] $$
where $[v_1, …, v_n]$ denotes an array of values.

The hierarchical test scheme data from the web editor for the EV battery pack is serialized into such a JSON structure, where objects represent complex entities (like a step), and arrays represent lists (like a sequence of parameters).

1.3 The Conversion Mapping

The conversion problem is to define a bi-directional mapping $f: T \leftrightarrow O$ between the XML tree $T$ valid under DTD $D$ and the JSON object structure $O$. Key challenges include:

Structural Homogenization: XML distinguishes between elements and attributes; JSON does not. A strategy must be chosen (e.g., prefixing attribute keys with ‘@’).
Order Preservation: While XML element order is significant, JSON object member order is typically not guaranteed, though arrays preserve order.
Data Type Fidelity: Ensuring numbers, booleans, and null values are correctly translated.
Schema Compliance: The generated XML must conform to the strict DTD $D$ expected by the legacy EV battery pack test system.

2. Investigated Conversion Methods

We implemented and evaluated three distinct methods for converting JSON to the target XML format required by the EV battery pack test system.

2.1 Library Function: `df.to_xml()`

This method leverages the pandas library in Python. The principle is a two-step transformation: first, flatten the JSON into a tabular DataFrame representation, then serialize the DataFrame to XML.

JSON to DataFrame: The nested JSON object is normalized into a flat pandas DataFrame, potentially creating multiple rows or columns for nested structures.
DataFrame to XML: The df.to_xml() method is invoked, which outputs a generic XML structure based on the DataFrame’s rows and columns.

The mapping is automated by the library but is generic and not aware of the specific DTD $D$ for the EV battery pack test system. The conversion can be expressed as:
$$ O_{json} \xrightarrow{\text{normalize}} \text{DataFrame} \xrightarrow{\text{to\_xml()}} T_{generic} $$

2.2 Recursive Dictionary Method: `dict_to_xml()`

This is a classic recursive algorithm that traverses the Python dictionary (parsed from JSON) and builds an XML tree using the ElementTree API.

function dict_to_xml(tag, d):
    elem = Element(tag)
    for key, val in d.items():
        if isinstance(val, dict):
            elem.append(dict_to_xml(key, val))
        elif isinstance(val, list):
            for sub_val in val:
                elem.append(dict_to_xml(key, sub_val))
        else:
            child = SubElement(elem, key)
            child.text = str(val)
    return elem

The algorithm’s logic is:
$$ \text{Process}(key, value) =
\begin{cases}
\text{CreateElement}(key), \text{then } \forall v_i \in value: \text{Process}(key, v_i) & \text{if } value \text{ is a list} \\
\text{CreateElement}(key), \text{then } \forall (k_i, v_i) \in value: \text{Process}(k_i, v_i) & \text{if } value \text{ is a dict} \\
\text{CreateElement}(key, text=value) & \text{otherwise}
\end{cases}
$$

While flexible, it requires post-processing to match attribute conventions and the exact structure $T$ required by the EV battery pack system DTD.

2.3 Custom Converter: `JSON2XMLConverter`

We developed a custom converter function specifically designed to output XML that strictly adheres to the DTD $D$ of the target EV battery pack test system. This algorithm incorporates explicit knowledge of the schema.

Schema-Aware Traversal: It recursively traverses the JSON object $O$ but uses a rule set derived from $D$ to decide the XML tag names and whether a JSON key corresponds to an XML element or an attribute.
Conditional Logic: Based on the key name and its position in the hierarchy, the algorithm applies specific formatting rules (e.g., adding namespace prefixes, handling special step types like “BTS” or “HIPOT”).
Direct Tree Construction: It builds the ElementTree directly in the format required by the final system, minimizing the need for post-processing.

This process can be seen as a guided mapping:
$$ O_{json} \xrightarrow{f_D} T_{valid} $$
where $f_D$ is a function that incorporates the rules of DTD $D$. The core decision logic for a key $k$ with value $v$ is:
$$ f_D(k, v, \text{context}) =
\begin{cases}
\text{createAttribute}(k, v) & \text{if } k \text{ matches attr pattern in context} \\
\text{createElement}( \text{mapTag}(k) ), \text{ then } f_D(k’, v’, \text{newContext}) & \text{if } v \text{ is dict/list} \\
\text{createElement}( \text{mapTag}(k), text=v ) & \text{otherwise}
\end{cases}
$$

3. Experimental Design and Datasets

To evaluate these methods, we designed experiments focusing on accuracy and performance using two distinct datasets relevant to EV battery pack testing.

3.1 Datasets

Dataset Name	Description	Characteristic	Purpose
Short_File	Manually curated, containing a small sequence of common test steps (e.g., BTS, HIPOT, Relay operations).	Low complexity, limited nesting, small size (~1-10 KB).	Evaluate accuracy in typical, simple editing scenarios.
Long_File	Generated automatically by the EV battery pack test system, representing a full, complex test cycle.	High complexity, deep nesting, large size (~100 KB – 1 MB).	Evaluate performance and accuracy under real-world, strenuous conditions.

3.2 Evaluation Metrics

We defined the following quantitative metrics for comparison:

Conversion Accuracy (A): The primary metric. Measured by parsing the generated XML and the original JSON back into canonical tree structures and calculating the normalized tree edit distance or, more simply, the exact match ratio of key-value paths.
$$ A = 1 – \frac{\text{Number of Mismatched Paths}}{\text{Total Number of Paths}} $$
A perfect score is 1.0.
Conversion Time (T_c): The CPU time (in seconds) taken to execute the core conversion algorithm from JSON string to a basic XML tree object.
Formatting Time (T_f): The time taken to prettify or serialize the XML tree into a final, well-formatted string. (Note: JSON2XMLConverter merges this with T_c).
Total Running Time (T_t): The end-to-end time: $T_t = T_c + T_f$.

3.3 Experimental Procedure

For each dataset size $n$ (scaled from 1 to 100 files), we performed the following steps:

Read the set of $n$ JSON files (from either Short_File or Long_File pool).
For each of the three methods $M \in \{ \text{df.to_xml}, \text{dict_to_xml}, \text{JSON2XMLConverter} \}$:

Start timer.
Execute $M$(JSON) to get XML tree.
Stop timer for $T_c(M)$.
Format the XML tree to string.
Stop timer for $T_t(M)$. Calculate $T_f(M) = T_t(M) – T_c(M)$.
Validate the output XML against the reference (canonicalized original) to compute $A(M)$.

Aggregate results (average time, average accuracy) over the $n$ files for each method.

4. Results and Performance Analysis

4.1 Accuracy Assessment

The accuracy $A$ was the most critical metric for ensuring the EV battery pack test system would execute the correct scheme.

Method	Accuracy on Short_File (A_s)	Accuracy on Long_File (A_l)	Analysis
`df.to_xml()`	0.434 – 0.444	0.285 – 0.294	Poor accuracy. The generic tabular conversion fails to preserve the complex nested semantics of the EV battery pack test scheme, leading to significant data loss or misrepresentation.
`dict_to_xml()`	0.903 – 0.921	0.800 – 0.818	Good accuracy on short files, but degrades on long files. The generic recursive approach handles nesting well but may misinterpret keys that should be attributes or fail on schema-specific constraints, causing increasing errors with complexity.
`JSON2XMLConverter`	0.983 – 0.984	0.632 – 0.636	Excellent accuracy on short files, near-perfect. However, accuracy is lower on long files. This suggests the schema-aware rules are precise for standard sequences but may encounter edge cases or unhandled data patterns in large, auto-generated EV battery pack test cycles.

The trend shows a clear trade-off. The custom converter is superior for the intended use case of editor-generated schemes (Short_File). For processing large, system-generated dumps (Long_File), dict_to_xml() provides more robust, albeit not perfect, accuracy.

4.2 Runtime Performance Assessment

Performance is crucial for user experience in the web editor and for backend processing of large EV battery pack test logs.

Method	Trend of T_c (Conversion Time)	Trend of T_f (Formatting Time)	Trend of T_t (Total Time)	Analysis
`df.to_xml()`	Steep increase with file size/quantity. Long_File >> Short_File.	Steep increase. Long_File >> Short_File. Becomes dominant cost.	Very high growth rate. Impractical for large-scale EV battery pack data processing.	The two-step process via DataFrame is computationally expensive, especially the serialization to formatted XML.
`dict_to_xml()`	Moderate increase. Long_File > Short_File.	Moderate increase. Long_File > Short_File.	Linear growth with data size. Viable but not optimal for very large N.	The recursive algorithm is efficient. The separate formatting step adds a consistent overhead.
`JSON2XMLConverter`	(Combined in T_t) Gentle increase. Long_File ≈ Short_File.		Lowest absolute time and most stable growth. Scales efficiently.	By directly generating the correctly formatted XML string during traversal, it eliminates the formatting overhead and benefits from optimized, schema-specific code paths. Ideal for the EV battery pack test system environment.

The performance results clearly favor the JSON2XMLConverter. Its design eliminates intermediate representations and extra serialization steps, yielding superior speed and scalability, which is essential for handling the data flows in an EV battery pack production testing environment.

5. Discussion and Conclusion

This investigation into multi-format data conversion for EV battery pack test systems reveals that the choice of algorithm has a profound impact on both the correctness and the efficiency of the integration between web-based editors and legacy control software.

5.1 Summary of Findings

The benchmark leads to the following conclusions:

df.to_xml() is unsuitable for this domain-specific task. Its low accuracy and poor performance disqualify it for use in a precision-critical EV battery pack testing context.
dict_to_xml() offers a robust baseline. It provides good accuracy, especially on simpler data, and reasonable performance. It is a viable fallback or generic solution when schema compliance is slightly less rigid.
JSON2XMLConverter is the optimal choice for the primary use case. It delivers exceptional accuracy on editor-created test schemes and outstanding runtime performance. Its schema-aware design ensures the generated XML seamlessly integrates with the existing EV battery pack test system.

The apparent accuracy drop for the custom converter on the Long_File dataset is not a failure but an indicator of its specificity. It is tuned for the “write” path (editor to system), perfectly translating human-designed schemes. The Long_File dataset tests the “read” path (system dump to JSON), which may contain internal states or formats outside the editor’s purview. A complementary, specially tuned converter would be needed for that inverse operation.

5.2 Practical Implications and Future Work

For implementers of EV battery pack test systems, we recommend a hybrid strategy:

Use the custom JSON2XMLConverter as the default for saving and exporting schemes from the web editor to the test system, guaranteeing correctness and speed.
Implement a separate, robust import routine (potentially based on an enhanced dict_to_xml() logic) for loading and analyzing system-generated log files.

Future work will focus on enhancing the custom converter’s resilience to a wider variety of input patterns found in extensive EV battery pack test logs and exploring the formal verification of the conversion mapping against the XML Schema (XSD) to ensure guaranteed compliance. Furthermore, the integration of these converters into a unified data exchange framework will further streamline the workflow for EV battery pack validation engineers, contributing to faster development cycles and higher product quality.