SAR image target detection in complex environments based on improved visual attention algorithm
© Liu and Cao; licensee Springer. 2014
Received: 2 January 2014
Accepted: 26 March 2014
Published: 5 April 2014
A novel target detection algorithm for synthetic aperture radar (SAR) images based on an improved visual attention method is proposed in this paper. With the development of SAR technology, target detection algorithms are confronted with many difficulties such as a complicated environment and scarcity of target information. Visual attention of the human visual system can make humans easily focus on key points in a complex picture, and the visual attention algorithm has been used in many fields. However, existing algorithms based on visual attention models cannot obtain satisfactory results for SAR image target detection under complex environmental conditions. After analysing the existing visual attention models, we combine the pyramid model of visual attention with singular value decomposition to simulate the human retina, which can make the visual attention model more suitable to the characteristics of SAR images. We introduce variance weighted information entropy into the model to optimize the detection results. The results obtained by the existing visual attention algorithm for target detection in SAR images yield a large number of false alarms and misses. However, the proposed algorithm can improve both the efficiency and accuracy of target detection in a complicated environment and under weak-target conditions. The experimental results validate the performance of our method.
1 Introduction
The visual attention model plays a positive role in synthetic aperture radar (SAR) image target detection because the human visual system focuses on areas of interest and rapidly makes decisions about them. Visual attention greatly improves the ability of the human visual system to process images of complex environments. Therefore, introducing visual attention into target detection in SAR images can be beneficial.
With the continuous development of military technology, target detection in SAR images has become more difficult. The environment around targets has become more complicated, and the available information on targets has decreased. Traditional detection methods for SAR images, such as constant false alarm rate (CFAR), have been improved to adapt to different conditions, but they still struggle to perform well. The reason is that under complicated-environment and weak-target conditions, key pixels are scarce and interference pixels are abundant. Limited improvements cannot compensate for these defects. Existing visual attention models generally use the Gaussian pyramid model proposed by Burt and Crouely to simulate the features of the human retina. These features can be described as follows: the centre of the retina has a smaller receptive field, whereas the receptive field at the periphery of the retina is much bigger. Therefore, as the feature sampling density and visual resolution decrease with increasing distance from the centre of the retina, the peripheral information is compressed. After building a Gaussian pyramid model, the visual attention system guides attention to an area of interest according to features of a target such as shape, colour and intensity. Current visual attention methods are not suitable for target detection in a complicated environment or under weak-target conditions.
The Gaussian pyramid model of visual attention has difficulty in effectively compressing a SAR image with weak targets. However, compressing an image at different rates while keeping the important information is essential for the visual attention algorithm. Therefore, this paper proposes a new visual attention algorithm for target detection. Since the singular value decomposition (SVD) method can keep the important information of a SAR image when the image is compressed, combining it with the Gaussian pyramid model can produce images with different compression ratios that retain the target information and obscure the environment information. After combining SVD with the Gaussian pyramid model, the variance weighted information entropy (WIE) method is used to distinguish the different types of areas and filter out the regions of interest (ROIs) without targets. As a result, efficient target detection under complicated-environment and weak-target conditions can be achieved.
The remainder of this paper is structured as follows: Section 2 introduces in detail the classic visual attention model and the pyramid model. Section 3 presents the steps of each detection stage and focuses on the SVD method and the variance WIE. Section 4 presents the experimental results of the SAR target extraction using the proposed techniques, and Section 5 draws the conclusion.
2 Classic visual attention model
The algorithm proposed in this paper is an improvement of the Itti model, a classic visual attention model. The Itti model determines the ROI as perceived by the human eye, which includes the target to be detected as a set of significant pixels in the image, and then extracts the ROI by finding a significant pixel in the image [8–10]. The model can adaptively detect the ROIs in the image. Compared with most traditional algorithms, which need the ROI to be specified manually, the Itti model has great advantages in target detection and recognition in image processing. Here, we first introduce the calculation process of the Itti model; the details can be found in the literature. Then, we describe the details of building the pyramid model.
Then, we build a Gaussian pyramid for every feature. In the second part, a centre-surround difference module is used to extract the feature map. In the third part, a plurality of different feature maps is merged to form the conspicuity map, which is a saliency map, through an effective feature consolidation strategy. In the fourth part, the focus of the attention area is located based on the saliency map. In the final part of the visual attention model, the winner-take-all competition net is used to find the most significant point from the saliency map, and the inhibition-of-return is used to ensure that the area would not be focused again [12–14].
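The last stage described above can be illustrated with a short sketch. It is a deliberate simplification and not the authors' implementation: the winner-take-all network is replaced by a plain maximum search, and inhibition-of-return is modelled by suppressing a disc around each attended point. The function name `attend` and the parameters `n_fixations` and `radius` are hypothetical.

```python
import numpy as np


def attend(saliency, n_fixations=3, radius=2):
    """Return successive foci of attention as (row, col) tuples.

    Simplified stand-in for winner-take-all plus inhibition-of-return:
    pick the most salient point, then suppress a disc around it so that
    the same area is not attended again.
    """
    s = np.array(saliency, dtype=float)  # work on a copy
    rows, cols = np.indices(s.shape)
    fixations = []
    for _ in range(n_fixations):
        # Winner-take-all, reduced here to a global maximum search.
        r, c = np.unravel_index(np.argmax(s), s.shape)
        fixations.append((int(r), int(c)))
        # Inhibition-of-return: exclude the attended neighbourhood.
        s[(rows - r) ** 2 + (cols - c) ** 2 <= radius ** 2] = -np.inf
    return fixations
```

In this sketch a strong response that lies within `radius` pixels of an earlier focus is skipped, so attention moves on to the next distinct area.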
The Itti model adopts a linear discrete Gaussian filter to perform smoothing and downsampling in the horizontal and vertical directions of the input image, respectively, and forms eight sub-images of different resolutions. Including the original image, nine images are required to build up the Gaussian pyramid structure. The smoothing filter is [1 4 6 4 1], and downsampling is achieved by a convolution with a [1 1]/2 filter. Together, the two filters take the average value of every two pixels in the previous image as one pixel value in the next image. The two steps can be combined into a single convolution with the filter K = [1 4 6 4 1]*[1 1]/2, where * denotes convolution, applied in the horizontal and vertical directions.
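As an illustration, the combined filter and the resulting pyramid can be sketched with NumPy as follows. This is a minimal sketch under stated assumptions, not the reference implementation: K is normalised by its sum so that image brightness is preserved, borders are handled by edge replication, and the function names are hypothetical.

```python
import numpy as np

# Smoothing filter and two-pixel averaging filter quoted in the text.
SMOOTH = np.array([1.0, 4.0, 6.0, 4.0, 1.0])
AVERAGE = np.array([1.0, 1.0]) / 2.0

# Combined 1-D filter K = [1 4 6 4 1] * [1 1]/2 (convolution).
# Assumption: K is normalised by its sum, giving [1 5 10 10 5 1]/32.
K = np.convolve(SMOOTH, AVERAGE)
K = K / K.sum()


def _filter_row(row):
    # Assumption: replicate border pixels so the output keeps its length.
    padded = np.pad(row, (2, 3), mode="edge")
    return np.convolve(padded, K, mode="valid")


def _reduce_columns(image):
    # Filter every row horizontally, then keep every second column.
    filtered = np.apply_along_axis(_filter_row, 1, image)
    return filtered[:, ::2]


def reduce(image):
    # Horizontal pass, then the same pass on the transposed image.
    return _reduce_columns(_reduce_columns(image).T).T


def gaussian_pyramid(image, levels=9):
    """Original image plus up to levels - 1 reduced sub-images."""
    pyramid = [np.asarray(image, dtype=float)]
    while len(pyramid) < levels and min(pyramid[-1].shape) >= 2:
        pyramid.append(reduce(pyramid[-1]))
    return pyramid
```

With `levels=9` the sketch yields the original image and eight sub-images, provided the input is large enough to be halved eight times.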
3 Improved visual attention model
In recent years, the SVD algorithm has been widely studied. This algorithm extracts the algebraic features of an image. SVD has the characteristic of energy aggregation for an image, which makes it a popular technique in the area of image compression. The features represent the essential characteristics of an image; therefore, the SVD algorithm has the advantage of being insensitive to the noise and complexity of an image. The variance WIE is a statistical form of the characteristics, which reflects the average information of an image. It was first used in infrared image processing.
3.1 SVD integrated into visual attention
To adapt the visual attention model to target detection in SAR images of complicated environments, we combine SVD with the pyramid model. This combination compresses the original image into a number of images with different compression rates. In the resulting images, the information about the targets is retained and that about the environment is obscured. This allows the centre-surround module to find the saliency point more effectively and optimises the detection results under a complicated environment.
The major steps of the proposed SVD-pyramid model method are as follows:
1. The original image A is decomposed using SVD into a diagonal matrix B and two orthogonal matrices.
2. The number R of nonzero diagonal elements of B is computed, and these elements are arranged in descending order to form vector P.
3. The m largest elements of vector P are retained and form vector Q, where m is equal to 60% of R.
4. A new diagonal matrix B1 is constructed using the elements of vector Q.
5. A new image A1 is generated using the diagonal matrix B1 and the two orthogonal matrices.
6. The above steps are repeated until the number of nonzero elements is less than 1 or R × 0.02.
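The steps above can be sketched with NumPy as follows. This is a minimal sketch under stated assumptions rather than the authors' code: the stopping rule is read as "stop once the number of retained singular values falls below 2% of the original R, or no value can be retained", values below a small relative tolerance count as zero, and the names `svd_pyramid`, `keep_ratio`, `stop_ratio` and `tol` are hypothetical. Because truncating a reconstruction again keeps the same leading singular vectors, a single decomposition is reused for all levels.

```python
import numpy as np


def svd_pyramid(image, keep_ratio=0.6, stop_ratio=0.02, tol=1e-10):
    """Return [A, A1, A2, ...] with progressively fewer singular values."""
    a = np.asarray(image, dtype=float)
    # A = U * diag(s) * Vt, with s already sorted in descending order.
    u, s, vt = np.linalg.svd(a, full_matrices=False)
    # R: number of nonzero singular values (assumed relative tolerance).
    r0 = int(np.sum(s > tol * s.max()))
    levels = [a]
    r = r0
    while True:
        m = int(keep_ratio * r)  # keep the m largest values (60% of R)
        if m < 1:
            break
        # Rebuild the image from the m largest singular values.
        levels.append((u[:, :m] * s[:m]) @ vt[:m, :])
        r = m
        if r < stop_ratio * r0:  # assumed reading of the stopping rule
            break
    return levels
```

Under this reading, each level keeps 60% of the singular values of the previous one, so roughly eight compressed images are produced before the 2% threshold is reached when R is large.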
3.2 Post-processing based on the WIE algorithm
The above results show that the WIE value of a real SAR image in different areas has a large difference. Based on the difference, we can determine the areas without targets. Therefore, the WIE method can be used to achieve the post-processing of the target detection.
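For illustration, the variance WIE of an image patch can be computed as in the sketch below. It assumes the commonly cited form H = -Σ (s - s̄)² p_s log p_s, where p_s is the probability of grey level s and s̄ is the mean grey level of the patch; the exact formula used in this work is not reproduced in this section, so the definition, the function name `variance_wie` and the 256-level default should be treated as assumptions.

```python
import numpy as np


def variance_wie(patch, levels=256):
    """Variance weighted information entropy of a non-empty integer patch.

    Assumed form: H = -sum_s (s - mean)^2 * p_s * log(p_s),
    with terms for p_s = 0 taken as zero.
    """
    values = np.asarray(patch).astype(np.int64).ravel()
    counts = np.bincount(values, minlength=levels)
    p = counts / counts.sum()          # grey-level probabilities
    grey = np.arange(counts.size)
    mean = np.sum(grey * p)            # mean grey level of the patch
    nz = p > 0                         # skip empty histogram bins
    return float(-np.sum((grey[nz] - mean) ** 2 * p[nz] * np.log(p[nz])))
```

A uniform patch gives a value of zero, whereas a patch that mixes dark and bright pixels gives a large value, which is the kind of difference that the post-processing step relies on.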
3.3 Processing steps
Because the algorithm employed in this study is based on the Itti model, the detailed steps of the algorithm are not presented here. In the feature extraction module, we use the intensity, colour, orientation and consistency features. In the centre-surround difference module, images no. 2, 3 and 4 are selected as the centre images, and the numbers of the surround images are 2, 3 and 4.
4 Simulation and results
To verify the feasibility of the proposed algorithm, several simulations are performed. The simulation data are divided into two types, type 1 and type 2. Type 1 consists of 20 images, each of size 384 × 256, showing conspicuous targets in a simple environment; the targets are tanks in a grassland environment. Type 2 consists of images showing weak targets in a complicated environment, each of size 2,406 × 512; the targets are tanks in grassland and jungle environments.
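Table: Whole performance comparisons among the three algorithms in a simple environment
Algorithm | Number of all targets | Number of detected targets | Number of undetected targets | Size of FOA (number of pixels)
(row label not recovered) | (not recovered) | (not recovered) | (not recovered) | 2,000 to 7,000 (large)
The proposed algorithm | (not recovered) | (not recovered) | (not recovered) | 1,000 to 1,500 (small)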
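Table: Whole performance comparisons among the three algorithms in a complicated environment
Algorithm | Number of all targets | Number of detected targets | Number of undetected targets | Size of FOA (number of pixels)
(row label not recovered) | (not recovered) | (not recovered) | (not recovered) | 2,000 to 9,000 (large)
The proposed algorithm | (not recovered) | (not recovered) | (not recovered) | 1,000 to 1,500 (small)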
By comparing the detection rates, we find that the performance of the classic algorithm is significantly lower than that of the other two algorithms and that the proposed algorithm performs best. The comparison of the false alarm rates shows that the false alarm rate and the number of misses of the classic algorithm are high. Although the CFAR method successfully detects all the targets, its false alarm rate is higher than that of the proposed method. The detection rate is defined as the ratio of the number of correctly detected targets to the number of real targets. The false alarm rate is defined as the ratio of the number of falsely detected targets to the number of real targets. Comparing the three methods, the classic method produces too many false alarms, which affects the detection accuracy. Itti model methods miss many targets, whereas the proposed method compensates for this shortcoming. In summary, the proposed algorithm obtains not only a high detection rate but also a low false alarm rate. The improved visual attention algorithm can adapt to complicated-environment and weak-target conditions.
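The two rates defined above can be written down directly. The sketch below only restates those definitions; the function name `detection_metrics` is hypothetical, and the counts in the usage note are made-up illustrative numbers, not results from this paper.

```python
def detection_metrics(real_targets, correct_detections, false_detections):
    """Return (detection rate, false alarm rate) as defined in the text.

    Both rates are ratios to the number of real targets.
    """
    if real_targets <= 0:
        raise ValueError("real_targets must be positive")
    detection_rate = correct_detections / real_targets
    false_alarm_rate = false_detections / real_targets
    return detection_rate, false_alarm_rate
```

For example, with 20 real targets, 18 correct detections and 3 false detections (hypothetical counts), `detection_metrics(20, 18, 3)` gives a detection rate of 0.9 and a false alarm rate of 0.15. Note that, with this definition, the false alarm rate can exceed 1 when false detections outnumber real targets.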
Figure 7 shows that the detection rate of the classic method rapidly decreases: as the complexity of the image increases, the background becomes complicated, and the classic method detects more complicated regions than real targets. The detection rate of the CFAR method decreases slowly, but this comes at the cost of a large number of false alarms. With the increase in complexity, the detection efficiency rapidly decreases. The proposed method yields better results in both detection and false alarm rates.
5 Conclusions
In this study, we have developed an improved visual attention algorithm adapted to SAR image target detection under complicated-environment and weak-target conditions. The method of combining SVD with the pyramid model and using WIE to filter out the false alarms has been introduced in detail. To validate the performance of the method, several simulations were performed. The results show the feasibility of applying the improved visual attention algorithm to SAR image target detection under complicated environment conditions.
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China under Projects 60802065, 61271287 and 61371048.
References
- Walther DB, Koch C: Attention in hierarchical models of object recognition. Prog. Brain Res. 2007, 165: 57-78.
- Itti L, Koch C: Computational modeling of visual attention. Nat. Neurosci. 2001, 2: 194-203. 10.1038/35058500
- Walther D: Interactions of visual attention and object recognition: computational modeling, algorithms, and psychophysics. PhD thesis. Pasadena, CA: California Institute of Technology; 2006.
- Tsotsos JK, Liu Y, Martinez-Trujillo J, Pomplun M, Simine E, Zhou K: Attending to visual motion. Comput. Vision Image Understanding (Spec. Issue Attention Perform. Comput. Vision) 2005, 100: 3-40.
- Navalpakkam V, Itti L: An integrated model of top-down and bottom-up attention for optimal object detection. In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR). New York; June 2006:2049-2056.
- Feng J, Cao Z, Pi Y: Multiphase SAR image segmentation with G0 statistical model based active contours. IEEE Trans. GRS 2013, 51(7):4190-4199.
- Rybak IA, Gusakova VI, Golovan AV, Podladchikova LN, Shevtsova NA: A model of attention-guided visual perception and recognition. Vis. Res. 1998, 38: 2387-2400. 10.1016/S0042-6989(98)00020-0
- Shah S, Levine MD: Visual information processing in primate cone pathways, part I: a model. IEEE Trans. Syst. Man Cybern. 1996, 2: 259-274.
- Li Z, Itti L: Saliency and gist features for target detection in satellite images. IEEE Trans. Image Process. 2011, 20(7):2017-2029.
- Desimone R, Duncan J: Neural mechanism of selective visual attention. Annu. Rev. Neurosci. 1995, 18: 193-194. 10.1146/annurev.ne.18.030195.001205
- Lee S, Kim K, Kim JY, Kim M, Yoo HJ: Familiarity based unified visual attention model for fast and robust object recognition. Pattern Recogn. 2010, 43(3):1116-1128. 10.1016/j.patcog.2009.07.014
- Treisman AM, Gelade G: A feature-integration theory of attention. Cogn. Psychol. 1980, 12: 97-136. 10.1016/0010-0285(80)90005-5
- Cater K, Chalmers A, Ward G: Detail to attention: exploiting visual tasks for selective rendering. In Proc. 14th Eurographics Workshop Rendering. Aire-la-Ville: Eurographics Association; 2003:270-280.
- Itti L, Koch C, Niebur E: A model of saliency-based visual attention for rapid scene analysis. IEEE Trans. Pattern Anal. Mach. Intell. 1998, 20: 1254-1259.
- Rigas I, Economou G, Fotopoulos S: Low-level visual saliency with application on aerial imagery. Geosci. Remote Sens. Lett. IEEE 2013, 10: 1389-1393.
- Borji A, Itti L: Exploiting local and global patch rarities for saliency detection. In Proc. IEEE Conference on Computer Vision and Pattern Recognition. Providence; 2012:478-485.
- Walther D, Rutishauser U, Koch C, Perona P: Selective visual attention enables learning and recognition of multiple objects in cluttered scenes. Comput Vision Image Understanding 2005, 100(1–2):41-63.
- Andrews H, Patterson C: Singular value decomposition (SVD) image coding. IEEE Trans. Commun. 1976, 24(4):425-432. 10.1109/TCOM.1976.1093309
- Arnold B: An investigation into using singular value decomposition as a method of image compression. Department of Mathematics and Statistics, University of Canterbury; 2000.
- Yang L, Zhou Y, Yang J: Variance WIE based infrared images processing. Electron. Lett. 2006, 42(15):857-859. 10.1049/el:20060827
- Ruotolo R, Surace C: Using SVD to detect damage in structures with different operational conditions. J. Sound Vib. 1999, 226(3):425-439. 10.1006/jsvi.1999.2305
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.