In the context of autonomous electric vehicles, precise localization of EV charging stations is a fundamental prerequisite for enabling reliable and efficient autonomous charging operations. Current approaches to EV charging station localization predominantly rely on template matching and deep learning techniques. However, template matching methods often exhibit poor robustness under perspective changes, while deep learning-based solutions are frequently constrained by real-time performance limitations, making them unsuitable for practical deployment in dynamic environments. Additionally, the diverse appearances of EV charging stations due to manufacturer variations and complex environmental interference further complicate the localization process. To address these challenges, we propose an improved ORB feature matching algorithm that incorporates deblurring techniques and color-invariant processing while introducing scale invariance for enhanced EV charging station localization.

The proposed methodology begins with a comprehensive preprocessing stage designed to handle motion-induced blur and noise. We implement a multi-scale pyramid combined with fuzzy layer segmentation to effectively address non-uniform blurring caused by vehicle motion. The image is first converted to grayscale, and a multi-scale pyramid is constructed based on image dimensions and feature point layers. Adaptive mask segmentation is then applied to estimate non-binary masks, motion-blurred layers, and non-motion-blurred layers. The deblurring process constructs a blur kernel $k_{fi}$ for each layer and solves through alternating optimization, where the objective functions are defined as:
$$ \arg \min_{L_i} \|k_{fi} \otimes L_i - B_{fi}\|^2 + \lambda \| \Delta L_i \|_0 $$
and
$$ \arg \min_{k_{fi}} \|k_{fi} \otimes L_i - B_{fi}\|^2 + \gamma \|k_{fi}\|^2 + f(k_{fi}) $$
where $L_i$ represents the latent clear image of layer $i$, $B_{fi}$ is the corresponding blurred layer, $f(k_{fi})$ denotes linear constraints on the blur kernel $k_{fi}$, $\|k_{fi}\|^2$ is the kernel regularization term, and $\gamma$ is the constraint coefficient. Multi-scale images undergo deconvolution deblurring within the pyramid structure, and a weighted-average fusion strategy generates the final clear image, effectively eliminating non-uniform motion blur.
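For illustration, the following minimal Python sketch captures the pyramid deblur-and-fuse structure under two simplifying assumptions not made by our method: the blur kernel is known and spatially uniform, and a single-step Wiener deconvolution with equal fusion weights stands in for the alternating $L_i$/$k_{fi}$ optimization and layer-wise weighting described above.

```python
import cv2
import numpy as np

def wiener_deconv(blurred, kernel, nsr=0.01):
    """Frequency-domain Wiener deconvolution of one grayscale layer."""
    h, w = blurred.shape
    kh, kw = kernel.shape
    k_pad = np.zeros((h, w), np.float64)
    k_pad[:kh, :kw] = kernel / kernel.sum()
    # Centre the kernel so the restored image is not spatially shifted.
    k_pad = np.roll(k_pad, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    K = np.fft.fft2(k_pad)
    B = np.fft.fft2(blurred)
    # Wiener filter: conj(K) / (|K|^2 + noise-to-signal ratio)
    L = B * np.conj(K) / (np.abs(K) ** 2 + nsr)
    return np.real(np.fft.ifft2(L))

def pyramid_deblur(gray, kernel, levels=3):
    """Deblur every pyramid level, upsample back, and fuse the results."""
    g = gray.astype(np.float64) / 255.0
    h, w = g.shape
    restored = []
    for lvl in range(levels):
        s = 2 ** lvl
        small = cv2.resize(g, (w // s, h // s), interpolation=cv2.INTER_AREA)
        deb = wiener_deconv(small, kernel)
        restored.append(cv2.resize(deb, (w, h), interpolation=cv2.INTER_LINEAR))
    # Equal-weight fusion; the paper's method uses weighted-average fusion.
    return np.clip(np.mean(restored, axis=0), 0.0, 1.0)
```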
Following deblurring, we introduce a color invariant model based on the Kubelka-Munk theory to enhance feature discrimination in complex environments. The spectral radiance model for object reflection can be expressed as:
$$ E(\lambda,x) = i(x)[1 - \rho_f(x)]^2 R_\infty(\lambda,x) + i(x)\rho_f(x) $$
where $\rho_f(x)$ represents the Fresnel reflection coefficient at position $x$, $\lambda$ denotes wavelength, $R_\infty(\lambda,x)$ is the reflectance, and $E(\lambda,x)$ represents the imaging result of spectral reflection. By performing first and second-order differentiation with respect to $\lambda$ and taking their ratio, we obtain a color invariant representation:
$$ H = \frac{E_\lambda}{E_{\lambda\lambda}} = \frac{\partial E / \partial \lambda}{\partial^2 E / \partial \lambda^2} = \frac{\partial R_\infty(\lambda,x) / \partial \lambda}{\partial^2 R_\infty(\lambda,x) / \partial \lambda^2} $$
To compute color invariants in RGB space, we apply a linear transformation to obtain spectral derivatives $(E, E_\lambda, E_{\lambda\lambda})$. The relationship between RGB components and these derivatives is given by:
$$ \begin{bmatrix} E \\ E_\lambda \\ E_{\lambda\lambda} \end{bmatrix} = \begin{bmatrix} 0.06 & 0.63 & 0.27 \\ 0.30 & 0.04 & -0.35 \\ 0.34 & -0.06 & 0.17 \end{bmatrix} \times \begin{bmatrix} R \\ G \\ B \end{bmatrix} $$
This color invariant processing significantly enhances the algorithm’s ability to distinguish features in regions where different colors share identical grayscale values, particularly important for EV charging station recognition where color cues provide critical discriminative information.
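As a concrete illustration, the linear transform and the ratio $H = E_\lambda / E_{\lambda\lambda}$ can be computed per pixel as follows; the `eps` guard against vanishing $E_{\lambda\lambda}$ is our own addition, and the BGR channel ordering assumes an OpenCV-loaded image.

```python
import cv2
import numpy as np

# Linear transform from the equation above: rows map (R, G, B) to
# (E, E_lambda, E_lambdalambda).
M = np.array([[0.06,  0.63,  0.27],
              [0.30,  0.04, -0.35],
              [0.34, -0.06,  0.17]])

def color_invariant_H(bgr, eps=1e-6):
    """Per-pixel color invariant H = E_lambda / E_lambdalambda."""
    rgb = cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB).astype(np.float64) / 255.0
    E = rgb @ M.T                 # shape (H, W, 3): E, E_lambda, E_lambdalambda
    # eps avoids division by zero where E_lambdalambda vanishes; taking
    # arctan of the ratio is a common bounded alternative.
    return E[..., 1] / (E[..., 2] + eps)
```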
After obtaining color invariants $H(x,y)$ for both template and query images, we construct integral images to facilitate efficient computation. For an image $I(x,y)$, the integral image $I_\sum(x,y)$ at position $(x,y)$ represents the sum of all pixel values in the rectangular region from the origin to $(x,y)$:
$$ I_\sum(x,y) = \sum_{0 \le i < x} \sum_{0 \le j < y} I(i,j) $$
This representation enables rapid calculation of regional pixel sums through simple arithmetic on four corner lookups. For a rectangular region with top-left corner $A$, top-right corner $B$, bottom-left corner $C$, and bottom-right corner $D$:
$$ \sum = I_\sum(D) - I_\sum(B) + I_\sum(A) - I_\sum(C) $$
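A minimal sketch of both operations is given below using NumPy cumulative sums; `cv2.integral` provides an equivalent built-in.

```python
import numpy as np

def integral_image(img):
    """I_sum: cumulative sum over rows, then over columns."""
    return img.astype(np.float64).cumsum(axis=0).cumsum(axis=1)

def region_sum(I, x0, y0, x1, y1):
    """Sum of img[y0:y1+1, x0:x1+1] via four corner lookups:
    D - B + A - C, matching the formula above."""
    D = I[y1, x1]
    B = I[y0 - 1, x1] if y0 > 0 else 0.0
    C = I[y1, x0 - 1] if x0 > 0 else 0.0
    A = I[y0 - 1, x0 - 1] if (x0 > 0 and y0 > 0) else 0.0
    return D - B + A - C
```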
To address the scale invariance limitation of traditional ORB algorithms, we establish a scale space using box filters. For an image $I(x,y)$, we first apply Gaussian filtering:
$$ L(x,y,\sigma) = G(x,y,\sigma) \ast I(x,y) $$
where $\sigma$ represents scale information, $G(x,y,\sigma)$ is the Gaussian kernel function, and $\ast$ denotes convolution operation. The Laplacian of Gaussian is approximated using the Hessian matrix:
$$ H = \begin{bmatrix} L_{xx}(x,y,\sigma) & L_{xy}(x,y,\sigma) \\ L_{yx}(x,y,\sigma) & L_{yy}(x,y,\sigma) \end{bmatrix} $$
To improve computational efficiency, we replace the Gaussian second-order partial derivatives with box filters, obtaining $D_{xx}$, $D_{xy}$, $D_{yx}$, and $D_{yy}$ to approximate $L_{xx}$, $L_{xy}$, $L_{yx}$, and $L_{yy}$ respectively, forming the Fast-Hessian matrix:
$$ H_F = \begin{bmatrix} D_{xx}(x,y,\sigma) & D_{xy}(x,y,\sigma) \\ D_{yx}(x,y,\sigma) & D_{yy}(x,y,\sigma) \end{bmatrix} $$
Extreme points are then detected from the determinant of the Fast-Hessian matrix:
$$ \text{Det}(H_F) = D_{xx} \cdot D_{yy} - (\omega D_{xy})^2 $$
where $\omega \approx 0.9$ is a compensation coefficient accounting for approximation errors introduced by box filters.
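The sketch below illustrates the single-scale determinant response. Box smoothing followed by discrete second derivatives stands in for the exact SURF-style box kernels, so this is a simplified approximation of our filters rather than the exact implementation.

```python
import cv2
import numpy as np

def fast_hessian_response(gray, ksize, omega=0.9):
    """Det(H_F) = Dxx*Dyy - (omega*Dxy)^2 at one scale; ksize sets the
    box-filter support and hence the scale it emulates."""
    g = cv2.boxFilter(gray.astype(np.float64), -1, (ksize, ksize))
    Dxx = cv2.Sobel(g, cv2.CV_64F, 2, 0, ksize=3)   # approximates L_xx
    Dyy = cv2.Sobel(g, cv2.CV_64F, 0, 2, ksize=3)   # approximates L_yy
    Dxy = cv2.Sobel(g, cv2.CV_64F, 1, 1, ksize=3)   # approximates L_xy
    return Dxx * Dyy - (omega * Dxy) ** 2

# Scale space: evaluate the response at growing supports (e.g. 9, 15, 21)
# and keep points that are extrema over both position and scale:
# responses = [fast_hessian_response(gray, k) for k in (9, 15, 21)]
```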
Following feature point detection, we compute feature descriptors using the rotation-aware BRIEF (rBRIEF) algorithm. The binary test operator $\tau$ is defined as:
$$ \tau[p;x,y] = \begin{cases} 1, & p(x) > p(y) \\ 0, & p(x) \leq p(y) \end{cases} $$
where $p(x)$ and $p(y)$ represent pixel values at random test points $x$ and $y$. To incorporate rotation invariance, for a keypoint with orientation $\theta$ and corresponding rotation matrix $R_\theta$, the test point set $S$ is rotated to $S_\theta = R_\theta S$, and the oriented (steered) feature descriptor becomes:
$$ g_n(p,\theta) = f_n(p) \mid (x_i,y_i) \in S_\theta $$
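A simplified sketch of the steered test follows, where the sampling `pattern` (the point set $S$) and a sufficiently large `patch` around the keypoint are assumed inputs.

```python
import numpy as np

def rbrief_descriptor(patch, pattern, theta):
    """Steered BRIEF: rotate the test pattern S by theta (S_theta = R_theta S)
    and apply the binary test tau to each rotated point pair.
    patch   : 2D grayscale array centred on the keypoint, large enough to
              contain the rotated pattern
    pattern : (n, 4) array of test offsets (x1, y1, x2, y2) from the centre
    """
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    cy, cx = patch.shape[0] // 2, patch.shape[1] // 2
    bits = []
    for x1, y1, x2, y2 in pattern:
        rx1, ry1 = R @ np.array([x1, y1], dtype=np.float64)
        rx2, ry2 = R @ np.array([x2, y2], dtype=np.float64)
        p1 = patch[int(round(cy + ry1)), int(round(cx + rx1))]
        p2 = patch[int(round(cy + ry2)), int(round(cx + rx2))]
        bits.append(1 if p1 > p2 else 0)   # tau[p; x, y]
    return np.packbits(bits)               # 256 tests -> 32-byte descriptor
```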
With numerous feature points, mismatches frequently occur. We first employ Hamming-distance coarse matching to eliminate obvious outliers, then apply an accelerated RANSAC algorithm for fine matching. For nearest-neighbor matching pairs $(A_i, A_i')$ and $(B_j, B_j')$ with matching distance $l \in [L_{\text{min}}, \lambda \cdot L_{\text{max}}]$, where $L_{\text{min}}$ and $L_{\text{max}}$ are the minimum and maximum matching distances and $\lambda \in [0,1]$ is typically set to 0.7, we define an evaluation function:
$$ F(i) = \sum_{j=1}^{c} \frac{R(i,j)}{1 + Y(i,j)} $$
where
$$ R(i,j) = \exp\left(-\frac{l(A_i,A_i') - l(B_j,B_j')}{Y(i,j)}\right) $$
and
$$ Y(i,j) = \frac{l(A_i,A_i') + l(B_j,B_j')}{2} $$
Here, $c$ represents the number of inliers, $R(i,j)$ measures the relative difference between the matching distances of pairs $i$ and $j$, and $Y(i,j)$ is their average matching distance. We compute the average evaluation value $\bar{F}$ and retain matching points with $F(i) < \bar{F}$, forming a new sample set $C$. From this set, we randomly select four matching pairs as the inlier set $C_i$ to compute the mapping matrix.
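For illustration, the coarse-to-fine stage can be sketched with OpenCV primitives. Here the stock `cv2.RANSAC` estimator stands in for our accelerated, $F(i)$-screened variant, and the $[L_{\text{min}}, \lambda L_{\text{max}}]$ gate is applied directly to the Hamming distances.

```python
import cv2
import numpy as np

def match_and_filter(desc_t, desc_q, kp_t, kp_q, lam=0.7):
    """Hamming coarse matching with the [L_min, lam * L_max] distance gate,
    followed by RANSAC homography estimation as the fine stage."""
    bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = bf.match(desc_t, desc_q)
    d = np.array([m.distance for m in matches])
    keep = [m for m in matches if d.min() <= m.distance <= lam * d.max()]
    if len(keep) < 4:                       # a homography needs >= 4 pairs
        return None, None
    src = np.float32([kp_t[m.queryIdx].pt for m in keep]).reshape(-1, 1, 2)
    dst = np.float32([kp_q[m.trainIdx].pt for m in keep]).reshape(-1, 1, 2)
    H, inlier_mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return H, inlier_mask
```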
For EV charging station pose estimation, we employ the Perspective-n-Point (PnP) algorithm, which requires at least four coplanar feature points for a unique solution. We select four vertices of the EV charging station as feature points, labeled $Q_0$, $Q_1$, $Q_2$, and $Q_3$, with corresponding world coordinates $(-135, 80, 0)$, $(135, 80, 0)$, $(135, -80, 0)$, and $(-135, -80, 0)$ in millimeters. Their projections in the image plane are denoted as $q_0$, $q_1$, $q_2$, and $q_3$. The coordinate transformation from template image to query image follows:
$$ \begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{bmatrix} \cdot \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = H \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} $$
where $(x,y)$ and $(x',y')$ represent pixel coordinates in the template and query images respectively, and $H$ is the $3 \times 3$ homography matrix. After solving for keypoint pixel coordinates in the query image, we utilize the predefined world coordinate system and EV charging station dimension information to compute object coordinates in the world frame, ultimately determining the relative pose between camera and target using PnP.
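Given calibrated camera intrinsics $K$ and distortion coefficients (assumed known), the pose recovery step reduces to a standard planar PnP call. `SOLVEPNP_IPPE` is one suitable planar solver (OpenCV 4.1+), not necessarily the exact solver used in our experiments.

```python
import cv2
import numpy as np

# Four coplanar vertices of the charging-station face (mm), as defined above.
object_pts = np.float32([[-135.0,  80.0, 0.0],
                         [ 135.0,  80.0, 0.0],
                         [ 135.0, -80.0, 0.0],
                         [-135.0, -80.0, 0.0]])

def estimate_pose(image_pts, K, dist_coeffs):
    """Recover the relative camera-target pose from projections q0..q3.
    image_pts : (4, 2) pixel coordinates; K, dist_coeffs from calibration."""
    ok, rvec, tvec = cv2.solvePnP(object_pts, np.float32(image_pts),
                                  K, dist_coeffs, flags=cv2.SOLVEPNP_IPPE)
    R, _ = cv2.Rodrigues(rvec)   # rotation vector -> 3x3 rotation matrix
    return R, tvec               # target pose expressed in the camera frame
```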
We conducted extensive experiments to evaluate the performance of our proposed algorithm for EV charging station localization. The experimental platform consisted of a ROS-controlled unmanned vehicle with drive wheels, an EV charging station, a depth camera with 1920×1080 resolution, and a Jetson Nano controller. The depth camera was mounted on the front of the vehicle to capture images of the EV charging station.
Our deblurring approach demonstrated significant improvements over traditional methods. The multi-scale pyramid with fuzzy segmentation effectively handled non-uniform motion blur and noise. Comparative analysis with DeblurGAN-v2 showed our method achieving superior performance with PSNR of 36.15 dB compared to 32.52 dB, and SSIM of 0.95 compared to 0.92, as summarized in Table 1.
| Metric | DeblurGAN-v2 | Our Method |
|---|---|---|
| PSNR (dB) | 32.52 | 36.15 |
| SSIM | 0.92 | 0.95 |
We further evaluated the robustness of our deblurring algorithm under different blur types and intensities, as shown in Table 2. For Gaussian blur with $\sigma = 3.0$ simulating uniform blur, we achieved a PSNR of 36.21 dB. For motion blur with length $l = 15.0$ pixels simulating non-uniform blur from camera displacement, SSIM reached 0.96. For mixed blur combining both types, performance decreased slightly but remained effective, demonstrating robustness across the blur conditions encountered in EV charging station images.
| Blur Type | Simulated Scenario | Simulated Intensity | PSNR (dB) | SSIM |
|---|---|---|---|---|
| Gaussian Blur | Uniform blur | $\sigma = 3.0$ | 36.21 | 0.94 |
| Motion Blur | Non-uniform blur | $l = 15.0$ | 34.57 | 0.96 |
| Mixed Blur | Complex environment | $l + \sigma$ | 30.43 | 0.89 |
The feature extraction and matching capabilities of our algorithm were rigorously tested against traditional ORB and SIFT algorithms using the HPatches dataset and real-world EV charging station images. Our method demonstrated substantial improvements in feature point distribution uniformity and matching accuracy. Under scale variations, our algorithm improved matching accuracy by 12.4% compared to traditional ORB while maintaining comparable performance to SIFT. Under viewpoint changes, accuracy improved by 14.5% with only 0.15 seconds additional computation time. Under illumination variations, accuracy improved by 8.9%, as detailed in Table 3.
| Scenario | Algorithm | Feature Points | Time (s) | Accuracy (%) |
|---|---|---|---|---|
| Scale Change | ORB | 152 | 0.26 | 82.70 |
| | SIFT | 247 | 0.45 | 96.40 |
| | Our Method | 398 | 0.57 | 95.10 |
| Scale + Rotation | ORB | 223 | 0.39 | 85.30 |
| | SIFT | 280 | 0.53 | 93.70 |
| | Our Method | 386 | 0.48 | 90.80 |
| Viewpoint Change | ORB | 361 | 0.46 | 80.20 |
| | SIFT | 570 | 0.55 | 98.60 |
| | Our Method | 916 | 0.61 | 94.70 |
| Illumination Change | ORB | 282 | 0.24 | 84.80 |
| | SIFT | 494 | 0.49 | 92.50 |
| | Our Method | 844 | 0.40 | 93.70 |
For EV charging station localization accuracy evaluation, we conducted tests at various positions relative to the charging station. The experimental results demonstrated that our method achieved positioning errors of roughly 30 mm or less across all test scenarios, as shown in Table 4. The largest errors occurred at greater distances and significant orientation deviations, but overall performance remained stable and accurate for practical EV charging station localization applications.
| Test | Actual Position (mm) | Calculated Position (mm) | Error (mm) |
|---|---|---|---|
| 1 | (0, 50, 700) | (4.31, 49.4, 687.6) | (-4.31, 0.6, 12.4) |
| 2 | (-250, 50, 700) | (-228.5, 46.04, 685.84) | (-21.45, 3.96, 14.16) |
| 3 | (300, 50, 1000) | (277.95, 14.07, 974.77) | (22.05, 8.93, 25.23) |
| 4 | (0, 50, 400) | (-2.25, 37.31, 377.32) | (2.25, 12.69, 22.68) |
| 5 | (-350, 50, 1300) | (-335.8, 43.39, 1270.72) | (-14.2, 6.61, 29.28) |
| 6 | (250, 50, 400) | (370.71, 19.19, 378.65) | (29.29, 30.81, 21.65) |
In comparison with other recent improvements to ORB algorithms, our approach demonstrates distinct advantages for EV charging station localization. While some methods focus solely on color invariance or feature distribution optimization, our comprehensive integration of deblurring, color invariants, and scale space construction provides a more robust solution for the challenging conditions encountered in real-world EV charging station environments. The method effectively handles the texture simplicity of typical EV charging stations while maintaining computational efficiency suitable for real-time applications.
In conclusion, our research presents a significantly enhanced ORB algorithm that integrates color invariants and multi-scale features specifically optimized for EV charging station localization. The key contributions include: (1) implementation of multi-scale pyramid fusion with fuzzy segmentation for effective deblurring and noise reduction, improving PSNR by 3.63 dB over conventional methods; (2) introduction of color invariant models to leverage color information while preserving ORB’s efficiency, increasing feature point detection by approximately 3 times and improving accuracy by 8.9% to 93.7%; (3) construction of scale space using integral images and box filters with the Fast-Hessian matrix for scale-invariant feature extraction, improving matching accuracy by 12.4% with only 0.21 seconds additional computation time; and (4) incorporation of an accelerated RANSAC algorithm with inlier screening to eliminate mismatches and reduce computation time. The proposed method demonstrates robust performance across various environmental conditions and represents a significant advancement in reliable EV charging station localization for autonomous electric vehicles.
