The image fusion rule determines how much of each source image is retained in the fused image. In this paper, a separate fusion rule is designed for the coefficients of each frequency component after compression.

The low-frequency coefficients (LL components) represent the approximation of the original image, and the brightness and contrast of the fused image are mainly determined by the fusion rule applied to them. Infrared and visible cameras produce quite different images because their imaging principles differ. In environments that favor visible-light imaging, the visible image carries a large amount of information and rich texture, whereas in dark or foggy environments the infrared image has the advantage of being clear and stable and of indicating thermal information. A weighted fusion rule based on local spatial frequency is therefore suitable for the low-frequency coefficients: the image with richer local spatial information receives the larger weight in the fusion process. Formula (1) defines the local spatial frequency.

$$ RF\left( x, y\right)=\sqrt{\frac{1}{w\times w}\sum_{m\in W,\, n\in W}{\left( f\left( x+m, y+n\right)- f\left( x+m, y+n-1\right)\right)}^2} $$

$$ CF\left( x, y\right)=\sqrt{\frac{1}{w\times w}\sum_{m\in W,\, n\in W}{\left( f\left( x+m, y+n\right)- f\left( x+m-1, y+n\right)\right)}^2} $$

$$ SF\left( x, y\right)=\sqrt{RF{\left( x, y\right)}^2+ CF{\left( x, y\right)}^2} $$

(1)

In the above formula, RF (*x*, *y*) represents the local spatial frequency of the pixel (*x*, *y*) in the row direction, computed within the window *W* of dimension *w* × *w*, and CF (*x*, *y*) represents the local spatial frequency in the column direction; SF (*x*, *y*) is the local spatial frequency of the point (*x*, *y*).
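The definition in Formula (1) can be sketched in NumPy as follows. This is a minimal illustration, not the implementation used in this paper: the function names, the edge-replicated border handling, the odd window size, and in particular the normalised weights SF_I / (SF_I + SF_V) are assumptions, since the text only states that the image with richer local spatial information receives the larger weight.

```python
import numpy as np

def window_sum(a, w):
    """Sum of `a` over a w x w window centred on each pixel.
    Assumes odd w; borders are edge-replicated (an assumption)."""
    r = w // 2
    p = np.pad(a, r, mode="edge")
    h, wd = a.shape
    out = np.zeros((h, wd), dtype=np.float64)
    for dm in range(w):
        for dn in range(w):
            out += p[dm:dm + h, dn:dn + wd]
    return out

def local_spatial_frequency(f, w=3):
    """SF(x, y) from Formula (1); x indexes rows, y indexes columns."""
    f = np.asarray(f, dtype=np.float64)
    # f(x, y) - f(x, y-1); the first column has no predecessor, so 0 is used there
    d_row = np.zeros_like(f)
    d_row[:, 1:] = f[:, 1:] - f[:, :-1]
    # f(x, y) - f(x-1, y); the first row has no predecessor, so 0 is used there
    d_col = np.zeros_like(f)
    d_col[1:, :] = f[1:, :] - f[:-1, :]
    rf2 = window_sum(d_row ** 2, w) / (w * w)  # RF squared
    cf2 = window_sum(d_col ** 2, w) / (w * w)  # CF squared
    return np.sqrt(rf2 + cf2)

def fuse_ll(ll_ir, ll_vis, w=3, eps=1e-12):
    """Weighted fusion of LL coefficients with SF-proportional weights
    (hypothetical weighting; the exact form is not given in the text)."""
    ll_ir = np.asarray(ll_ir, dtype=np.float64)
    ll_vis = np.asarray(ll_vis, dtype=np.float64)
    sf_i = local_spatial_frequency(ll_ir, w)
    sf_v = local_spatial_frequency(ll_vis, w)
    total = sf_i + sf_v
    # Where both windows are flat, fall back to equal weights
    w_i = np.where(total > eps, sf_i / np.maximum(total, eps), 0.5)
    return w_i * ll_ir + (1.0 - w_i) * ll_vis
```

With these assumed weights, a locally flat infrared window contributes nothing where the visible window has texture, and the two images are averaged where both are flat.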

The LH and HL components correspond to the edges that bound the smooth regions of the image. To highlight the features of the target region and ensure that local regions with large energy are clearly reflected in the fused image, the fusion rule for these components is based on the regional energy feature, defined as follows:

$$ E\left( x, y\right)={\displaystyle \sum_{m\in M, n\in N}{\left( f\left( x+ m, y+ n\right)\right)}^2} $$

(2)

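The fusion rule for the LH and HL components retains the coefficient with the larger regional energy, where *I* and *V* denote the infrared and visible coefficients and *E*<sub>I</sub> and *E*<sub>V</sub> their regional energies. It is defined as follows: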

$$ F\left( x, y\right)=\begin{cases} I\left( x, y\right), & \text{if } E_I\left( x, y\right)>E_V\left( x, y\right) \\ V\left( x, y\right), & \text{if } E_V\left( x, y\right)>E_I\left( x, y\right) \end{cases} $$

(3)
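Formulas (2) and (3) can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names, the default 3 × 3 neighbourhood, and the zero padding at the borders are hypothetical choices, and because Formula (3) does not specify the case of equal energies, the sketch arbitrarily keeps the visible coefficient on a tie.

```python
import numpy as np

def regional_energy(c, m=3, n=3):
    """E(x, y) from Formula (2): sum of squared coefficients over an
    m x n neighbourhood (odd sizes and zero padding are assumptions)."""
    c = np.asarray(c, dtype=np.float64)
    rm, rn = m // 2, n // 2
    p = np.pad(c ** 2, ((rm, rm), (rn, rn)), mode="constant")
    h, w = c.shape
    e = np.zeros((h, w), dtype=np.float64)
    for dm in range(m):
        for dn in range(n):
            e += p[dm:dm + h, dn:dn + w]
    return e

def fuse_by_energy(c_ir, c_vis, m=3, n=3):
    """Formula (3): keep the coefficient whose regional energy is larger.
    Ties go to the visible coefficient (assumption; not defined in the text)."""
    c_ir = np.asarray(c_ir, dtype=np.float64)
    c_vis = np.asarray(c_vis, dtype=np.float64)
    e_i = regional_energy(c_ir, m, n)
    e_v = regional_energy(c_vis, m, n)
    return np.where(e_i > e_v, c_ir, c_vis)
```

The same function would be applied separately to the LH pair and to the HL pair of sub-bands.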

The HH components are the highest-frequency components of the decomposition and correspond to the detail textures of the image. They are therefore fused with the absolute-maximum rule, so that the fused image keeps rich texture information.
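The absolute-maximum rule can be illustrated with a short sketch. The function name is hypothetical, and the tie-breaking choice (keeping the infrared coefficient when the magnitudes are equal) is an assumption, since the text does not address that case.

```python
import numpy as np

def fuse_abs_max(hh_ir, hh_vis):
    """Keep, at each position, the HH coefficient with the larger magnitude.
    On equal magnitudes the infrared coefficient is kept (assumption)."""
    hh_ir = np.asarray(hh_ir, dtype=np.float64)
    hh_vis = np.asarray(hh_vis, dtype=np.float64)
    return np.where(np.abs(hh_ir) >= np.abs(hh_vis), hh_ir, hh_vis)
```

Note that the selected coefficient keeps its original sign; only the comparison uses absolute values.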