A hierarchical propelled fusion strategy for SAR automatic target recognition

Abstract

Synthetic aperture radar (SAR) automatic target recognition (ATR) plays a very important role in military and civil fields. Much work has been done to improve the performance of SAR ATR systems. It is well known that ensemble methods can improve prediction performance, so recognition using multiple classifiers fusion (MCF) has become a research hotspot in SAR ATR. Most current research focuses on fusion methods with a parallel structure. However, the parallel structure has some disadvantages, such as large time consumption, feature attribution conflicts and a low capability for confuser recognition. A hierarchical propelled strategy for multi-classifier fusion (HPSMCF) is proposed in this paper. The proposed HPSMCF has the characteristics of both series and parallel structures. Features can be used more effectively and the recognition efficiency can be improved by extracting features and fusing the probabilistic outputs in a hierarchical propelled way. Meanwhile, confuser recognition can be achieved by setting thresholds for the confidence in each level. Experiments on MSTAR public data demonstrate that the proposed HPSMCF is robust under varying recognition conditions. Compared with the parallel structure, HPSMCF performs better in both time consumption and recognition rate.

1. Introduction

SAR plays an important role in both national defense and civil applications, because it can work in all weather and in day/night conditions. To take better advantage of SAR data, the problem of target recognition must be solved. In 1997–2000, L.M. Novak published a number of articles on SAR ATR [1–3]. Since then, many researchers have worked to improve the performance of SAR ATR systems.

It is well known that ensemble methods can be used to improve prediction performance [4]. The main idea behind the ensemble methodology is to combine several individual classifiers in order to obtain a classifier that outperforms every one of them. In 1998, J. Kittler et al. developed a common theoretical framework for combining classifiers [5]. In 2010, Lior Rokach proposed an ensemble system composed of several independent base-level models [4]. These base-level classifiers are constructed using different techniques and methods. With the development of MCF technology, MCF has been extensively applied in many areas, such as character recognition [6], multi-sensor data classification [7] and SAR ATR [8, 9].

It can be seen that MCF has become a research hotspot, and many new theories in pattern recognition have gradually been introduced to MCF. For example, Jorge Sánchez and Javier Redolfi propose a novel approach for the combination of classifiers based on a graph defined in the space of concepts and a Markov chain defined on that graph [10]. Some researchers consider the contributions of various types of feature sets and classifiers, so weights are given to features or classifiers [11–13]. It is believed that confidence-weighted learning is more consistent with the human prediction model. In [14], the authors use a clustering ensemble for classifier combination, because cluster analysis techniques have been shown to enhance the quality of the results in supervised classification tasks.

All the methods above belong to the parallel structure of MCF (PSMCF) [5]. The flow of PSMCF is shown in Figure 1. PSMCF can indeed improve the decision accuracy, because different classifiers compensate for each other's weaknesses. However, the parallel structure has some disadvantages. Firstly, if the decision-making strategy is not suitable, the accuracy will be reduced. Secondly, all the features must be extracted in a parallel structure, which is time-consuming. Thirdly, the parallel structure can only make a decision on the final fusion result, so it has a low capability for confuser recognition.

Figure 1. The common flow of multiple classifiers fusion.

To overcome these disadvantages of PSMCF, a hierarchical propelled strategy for multi-classifier fusion (HPSMCF) is proposed in this paper. HPSMCF has the characteristics of both series and parallel structures. Features can be used more effectively and the recognition efficiency can be improved by extracting features and fusing the probabilistic outputs in a hierarchical propelled way. Whether to go to the next level is determined by comparing the confidence with an empirical threshold, so HPSMCF can reduce the time consumption without lowering the recognition rate. Confuser recognition can also be achieved by setting thresholds for making decisions in each level. Experiments on MSTAR public data demonstrate that the proposed HPSMCF performs better than PSMCF when applied to SAR ATR.

The rest of the paper is organized as follows. Section 2 introduces the flow of the proposed fusion strategy and the definitions of classification confidence and classification weight in each level. Section 3 gives a detailed description of the three-level HPSMCF. Section 4 presents experimental results and analysis on the MSTAR database. Finally, Section 5 states the conclusion and future work.

2. Hierarchical propelled fusion strategy

If there are L levels, the basic framework of HPSMCF is shown in Figure 2.

Figure 2. The basic flow chart of hierarchical propelled fusion strategy.

If there are c classes of samples, the classifier in level l ($1 \le l \le L$) produces the posterior probability of each class, denoted $P_l = \{p_1, p_2, \ldots, p_c\}$.

1) Classification confidence

The classification confidence can be defined as:

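$$\mathrm{conf}(P_l) = \frac{\max_c(p_i) - \max\left(P_l \setminus \max_c(p_i)\right)}{\max_c(p_i)}, \quad \text{s.t.} \ \sum_{i=1}^{c} p_i = 1 \tag{1}$$

$P_l \setminus \max_c(p_i)$ means the set $P_l$ except $\max_c(p_i)$, so the second term of the numerator is the second-largest posterior. If $\mathrm{conf}(P_l)$ is greater than a predetermined value, the class represented by $\max_c(p_i)$ is acceptable.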
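As a quick numerical illustration of Eq. (1), the following minimal sketch computes the confidence of a posterior vector. The function name `confidence` and the example posteriors are illustrative assumptions, not taken from the paper; ties for the maximum are treated as zero confidence.

```python
def confidence(probs):
    """Relative gap between the largest and second-largest posterior, as in Eq. (1)."""
    if len(probs) < 2:
        raise ValueError("at least two classes are required")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("posteriors must sum to 1")
    ordered = sorted(probs, reverse=True)
    top, runner_up = ordered[0], ordered[1]
    # If two classes tie for the maximum, the gap (and the confidence) is 0.
    return (top - runner_up) / top


# Hypothetical 3-class posteriors: one clear winner, one ambiguous case.
clear = confidence([0.7, 0.2, 0.1])       # about 0.714
ambiguous = confidence([0.4, 0.35, 0.25])  # about 0.125
print(clear, ambiguous)
```

A clear winner yields a confidence close to 1, while nearly equal top posteriors yield a confidence close to 0, which is what makes the measure usable as a per-level acceptance test.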

In HPSMCF, if the confidence in a level is greater than the empirical threshold T for that level, the process ends and the system outputs the predicted class of the unknown image. If not, the process goes to the next level, where a new feature is extracted and fed to another classifier. Note that the confidences in different levels differ because different features are used. All the thresholds can be written as:

$$T = \{T_1, T_2, T_2^{*}, \ldots, T_L, T_L^{*}\}$$

In level l (l > 1), there are two thresholds ($T_l$, $T_l^{*}$): $T_l$ is the threshold applied in level l before fusion, and $T_l^{*}$ is applied to the fusion result.

2) Weights of Pl

If the confidence of the probability output in level l is less than the empirical threshold, the classification result is probably wrong. So no classification is given in this level and the process goes to the next level. However, the probability output $P_l$ of level l will be fused in the following levels with a weight. The weight of $P_l$ can be expressed as:

$$w_{lj}, \quad 1 \le l \le L,\ 2 \le j \le L,\ l \le j$$

where j indicates the level in which $P_l$ is fused. The weights used for fusion in each level are shown in Table 1.

Table 1 The confused matrix of weights

The weight $w_{lj}$ has the following property:

$$w_{lj} = \begin{cases} 1, & l = j \\ \mu\, w_{l(j-1)}, & l < j \end{cases} \tag{2}$$

where 0 < μ < 1. This means that if the probability output $P_l$ cannot give enough evidence for a classification result in level j−1, the weight of $P_l$ is reduced by the coefficient μ when $P_l$ is used in level j.
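Unrolling the recursion in Eq. (2) gives $w_{lj} = \mu^{\,j-l}$, so an output loses a factor μ for every level it is carried forward. The sketch below tabulates these weights; the helper names `weight` and `weight_table` and the value μ = 0.5 are illustrative assumptions, since the paper does not state the value used.

```python
def weight(l, j, mu):
    """w_lj from Eq. (2): 1 when l == j, otherwise mu times w_l(j-1)."""
    if not 0.0 < mu < 1.0:
        raise ValueError("mu must lie strictly between 0 and 1")
    if l < 1 or l > j:
        raise ValueError("requires 1 <= l <= j")
    w = 1.0
    for _ in range(j - l):  # apply the reduction once per level carried forward
        w *= mu
    return w


def weight_table(levels, mu):
    """Weights used when fusing in levels j = 2..L, keyed by (l, j)."""
    return {(l, j): weight(l, j, mu)
            for j in range(2, levels + 1)
            for l in range(1, j + 1)}


# Hypothetical three-level system with mu = 0.5.
for (l, j), w in sorted(weight_table(3, 0.5).items(), key=lambda kv: (kv[0][1], kv[0][0])):
    print(f"w_{l}{j} = {w}")
```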

3) Basic flow

The detailed process at each level is described as follows:

In level 1, the first feature is extracted and fed to the classifier, and the confidence of $P_1$ is computed. If $\mathrm{conf}(P_1) > T_1$, the classification result is output; otherwise the flow goes to level 2.

In level 2, the second feature is extracted and fed to the classifier, and the confidence of $P_2$ is computed. If $\mathrm{conf}(P_2) > T_2$, the classification result is output; otherwise $w_{12}P_1$ and $w_{22}P_2$ are fused and the confidence of the fusion result is computed. If $\mathrm{conf}(\mathrm{fusion}(w_{12}P_1, w_{22}P_2)) > T_2^{*}$, the classification result is output; otherwise the flow goes to the next level.

The processes in the following levels are the same as in level 2, except for level L, which makes the final decision. If no classification result can be obtained in level L, the whole process ends and outputs ‘cannot recognition’.

It can be seen that if the threshold $T_L^{*}$ is set to 0 and all of the other thresholds are set to 1, HPSMCF reduces to PSMCF; if the fusion processes are removed, HPSMCF reduces to a series structure.
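The level-by-level flow described above can be sketched as a short cascade. This is a minimal sketch under stated assumptions, not the authors' implementation: the names `hpsmcf_predict` and `weighted_mean_fuse` are hypothetical, each level is modelled as a callable that returns a posterior list, the threshold values and μ are made up, and the fusion step is a plain weighted mean standing in for the evidence-theory fusion used in the paper.

```python
def confidence(probs):
    """Eq. (1): relative gap between the two largest posteriors."""
    ordered = sorted(probs, reverse=True)
    return (ordered[0] - ordered[1]) / ordered[0]


def weighted_mean_fuse(weighted_outputs):
    """Placeholder fusion rule (an assumption): normalised weighted mean."""
    total = sum(w for w, _ in weighted_outputs)
    n_classes = len(weighted_outputs[0][1])
    return [sum(w * p[i] for w, p in weighted_outputs) / total
            for i in range(n_classes)]


def hpsmcf_predict(sample, classifiers, t, t_star, mu, fuse=weighted_mean_fuse):
    """Return (class index or None, hierarchy depth reached).

    t[l-1] is T_l for levels 1..L; t_star[l-2] is T_l* for levels 2..L.
    """
    outputs = []
    for level, classify in enumerate(classifiers, start=1):
        probs = classify(sample)  # feature extraction + classifier of this level
        outputs.append(probs)
        if confidence(probs) > t[level - 1]:
            return probs.index(max(probs)), level
        if level > 1:
            # Earlier outputs are down-weighted by mu per level, as in Eq. (2).
            weighted = [(mu ** (level - l), p)
                        for l, p in enumerate(outputs, start=1)]
            fused = fuse(weighted)
            if confidence(fused) > t_star[level - 2]:
                return fused.index(max(fused)), level
    return None, len(classifiers)  # no level was confident: reject the sample


# Toy levels that ignore the sample and return fixed posteriors.
toy_levels = [
    lambda s: [0.40, 0.35, 0.25],
    lambda s: [0.45, 0.30, 0.25],
    lambda s: [0.80, 0.10, 0.10],
]
label, depth = hpsmcf_predict(None, toy_levels, t=[0.5, 0.5, 0.5],
                              t_star=[0.5, 0.5], mu=0.5)
print(label, depth)
```

In this toy run the first two levels and their fusion stay below the thresholds, so the decision is only taken at level 3. Returning `None` corresponds to the rejection outcome of the last level.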

3. Application of the proposed three-tier HPSMCF to SAR ATR

To verify the feasibility of the proposed strategy, a three-tier HPSMCF is applied to automatic target recognition in SAR images. This work includes three important parts: feature extraction, classification, and decision fusion.

1) Feature extracting

In our current research, three projection features are used: Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), and Non-negative Matrix Factorization (NMF).

A. PCA

PCA is a common feature extraction method in pattern recognition, which has been widely used for target classification in SAR images. PCA is based on the assumption that high information corresponds to high variance.

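Let $X = \{X_1, X_2, \ldots, X_n\}$, $X \in \mathbb{R}^{m \times n}$, be the original data. The mean of X is:

$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i \tag{3}$$

The covariance matrix of X is:

$$Q = \sum_{i=1}^{n} (X_i - \bar{X})^{T} (X_i - \bar{X}) \tag{4}$$

The eigenvalues and eigenvectors of Q are:

$$[V, D] = \mathrm{EIG}(Q) \tag{5}$$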

The function EIG returns all the eigenvalues and eigenvectors of Q. The first few principal components typically account for most of the variation, so they can be used to describe the data in a reduced-dimension representation. The eigenvectors corresponding to the k largest positive eigenvalues form the transformation matrix, and the PCA feature of X is extracted by multiplying X by this matrix. Detailed information about PCA is given in [15, 16].
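The PCA steps in Eqs. (3) to (5) can be sketched with NumPy as follows. This is an illustrative sketch, assuming each sample is stored as a row vector (so the scatter matrix of Eq. (4) is a feature-by-feature matrix); the function names, the random data and k = 3 are assumptions made for the example.

```python
import numpy as np


def pca_fit(X, k):
    """X has shape (n_samples, n_features). Returns the mean and a (n_features, k) matrix."""
    mean = X.mean(axis=0)                       # Eq. (3)
    centered = X - mean
    Q = centered.T @ centered                   # Eq. (4), samples as rows
    eigvals, eigvecs = np.linalg.eigh(Q)        # Eq. (5); eigh suits symmetric Q
    order = np.argsort(eigvals)[::-1][:k]       # indices of the k largest eigenvalues
    return mean, eigvecs[:, order]


def pca_transform(X, mean, W):
    """Project centred samples onto the selected eigenvectors."""
    return (X - mean) @ W


rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))                    # hypothetical data: 50 samples, 8 features
mean, W = pca_fit(X, k=3)
Z = pca_transform(X, mean, W)
print(Z.shape)
```

`np.linalg.eigh` returns eigenvalues in ascending order, which is why the indices are reversed before the first k are kept.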

B. LDA

LDA uses the discriminative information between different classes, so the class labels of the samples must be known [15, 17]. Denote the N sample images as $X = \{x_1, x_2, \ldots, x_N\}$. The within-class scatter matrix is defined as:

$$S_w = \sum_{j=1}^{c} \sum_{i=1}^{N_j} (x_i^j - \mu_j)(x_i^j - \mu_j)^{T} \tag{6}$$

where $x_i^j$ is the i-th sample of class j, $\mu_j$ is the mean of class j, c is the number of classes, and $N_j$ is the number of samples in class j.

The between-class scatter matrix is defined as:

$$S_b = \sum_{j=1}^{c} (\mu_j - \mu)(\mu_j - \mu)^{T} \tag{7}$$

where μ represents the mean of all classes.

The goal of LDA is to maximize the between-class measure while minimizing the within-class measure. One way to do this is to maximize the ratio $\det|S_b| / \det|S_w|$. If $S_w$ is a nonsingular matrix, the ratio is maximized when the column vectors of the projection matrix W are the eigenvectors of $S_w^{-1} S_b$, written here as $W = \mathrm{PCA}(S_w^{-1} S_b)$. It should be noted that W contains at most c−1 eigenvectors.
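The scatter matrices in Eqs. (6) and (7) and the resulting projection can be sketched with NumPy as below. This is a sketch under assumptions: samples are stored as rows, μ is read as the unweighted mean of the class means, $S_w$ is assumed nonsingular, and the function name `lda_fit` and the synthetic data are made up for the example.

```python
import numpy as np


def lda_fit(X, y):
    """Return a projection matrix with at most c-1 columns (eigenvectors of S_w^-1 S_b)."""
    classes = np.unique(y)
    d = X.shape[1]
    class_means = np.array([X[y == c].mean(axis=0) for c in classes])
    mu = class_means.mean(axis=0)               # assumed reading of "mean of all classes"

    S_w = np.zeros((d, d))
    for c, m in zip(classes, class_means):
        diff = X[y == c] - m
        S_w += diff.T @ diff                    # Eq. (6), samples as rows

    diff_b = class_means - mu
    S_b = diff_b.T @ diff_b                     # Eq. (7)

    # solve(S_w, S_b) evaluates S_w^-1 S_b without forming the inverse explicitly.
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(S_w, S_b))
    order = np.argsort(eigvals.real)[::-1][: len(classes) - 1]
    return eigvecs[:, order].real


rng = np.random.default_rng(0)
# Hypothetical data: 3 classes, 20 samples each, 4 features, shifted class means.
X = np.vstack([rng.normal(loc=m, size=(20, 4)) for m in (0.0, 3.0, 6.0)])
y = np.repeat([0, 1, 2], 20)
W = lda_fit(X, y)
print(W.shape)
```

For image data with far more pixels than training samples, $S_w$ is singular, which is why a dimensionality reduction step is commonly applied first [17].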

C. NMF

NMF has been successfully used for matrix decomposition and dimensionality reduction. The non-negativity constraint leads to a part-based representation because it allows only additive, not subtractive, combinations of the original data [18, 19].

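Assume the matrix $D \in \mathbb{R}^{n \times m}$ is decomposed into two matrices $W \in \mathbb{R}^{n \times r}$ and $H \in \mathbb{R}^{r \times m}$, so that:

$$D \approx WH, \quad D_{i,j},\ W_{i,\mu},\ H_{\mu,j} \ge 0 \tag{8}$$

with $0 \le i \le n-1$, $0 \le j \le m-1$ and $0 \le \mu \le r-1$.

Define the cost function based on the squared Euclidean distance:

$$\arg\min_{W,H} \|D - WH\|^2 = \sum_{0 \le i < n,\ 0 \le j < m} \left(D_{i,j} - (WH)_{i,j}\right)^2 \tag{9}$$

The squared Euclidean distance in (9) is non-increasing under the following iterative update rules:

$$H_{a\mu} \leftarrow H_{a\mu} \frac{(W^{T} D)_{a\mu}}{(W^{T} W H)_{a\mu}}, \qquad W_{ia} \leftarrow W_{ia} \frac{(D H^{T})_{ia}}{(W H H^{T})_{ia}} \tag{10}$$

for $0 \le a < r$, $0 \le \mu < m$ and $0 \le i < n$. Appropriate W and H can be found by iteration.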
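The multiplicative updates in Eq. (10) translate almost directly into NumPy. The following is a minimal sketch: the function name `nmf`, the random initialisation, the fixed iteration count and the small constant `eps` added to the denominators to avoid division by zero are all assumptions for illustration and are not specified in the paper.

```python
import numpy as np


def nmf(D, r, iterations=200, seed=0, eps=1e-12):
    """Factor a non-negative D (n x m) into W (n x r) and H (r x m) with D ~ W @ H."""
    rng = np.random.default_rng(seed)
    n, m = D.shape
    W = rng.random((n, r)) + eps                # strictly positive starting point
    H = rng.random((r, m)) + eps
    for _ in range(iterations):
        H *= (W.T @ D) / (W.T @ W @ H + eps)    # update of H in Eq. (10)
        W *= (D @ H.T) / (W @ H @ H.T + eps)    # update of W in Eq. (10)
    return W, H


rng = np.random.default_rng(1)
D = rng.random((6, 5))                          # hypothetical non-negative data
W0, H0 = nmf(D, r=3, iterations=0)              # the random starting point
W, H = nmf(D, r=3)                              # same start, after 200 updates
print(np.linalg.norm(D - W0 @ H0), np.linalg.norm(D - W @ H))
```

Because the updates only multiply by non-negative ratios, W and H stay non-negative throughout, which is the property that produces the part-based representation.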

D. Feature ordering

In our current research, the principle of feature ordering is the computational complexity for extracting each feature. The computational complexity of PCA is the smallest, and the computational complexity of NMF is the largest. So in the three-tier HPSMCF, PCA is used in level 1, LDA is used in level 2 and NMF is used in level 3.

2) Classifier

To keep the probability outputs on the same scale in every level, a Support Vector Machine (SVM) is used in all levels. With a kernel function, SVM can handle non-linear classification problems well [6, 20].

SVM discriminates two classes by fitting an optimal linear separating hyperplane (OSH). The optimization principle is based on structural risk minimization (SRM). SRM aims to maximize the margins between the OSH and the closest training samples. These closest training samples are called support vectors. The details of solving the optimization problem in SVM are introduced in [21].

For a test sample, SVM can produce the probability output in the following form:

$$P = \{p_1, p_2, \ldots, p_j, \ldots, p_c\}$$

where $0 < j \le c$, c is the number of classes, and $p_j$ is the probability that the test sample belongs to class j.

The kernel of the SVM used in this paper is the radial basis function (RBF):

$$k(x, y) = \exp\left(-\gamma \|x - y\|^2\right)$$

and the parameters C and γ are set to C = 32, γ = 1/32.
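For reference, the RBF kernel with the stated γ can be evaluated directly, as in the small sketch below. The function name `rbf_kernel` and the sample vectors are assumptions for illustration. In scikit-learn, a comparable configuration would be `SVC(kernel="rbf", C=32, gamma=1/32, probability=True)`, although the paper does not say which SVM library was used.

```python
import numpy as np


def rbf_kernel(x, y, gamma=1.0 / 32.0):
    """k(x, y) = exp(-gamma * ||x - y||^2), with gamma = 1/32 as stated in the text."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.exp(-gamma * np.dot(diff, diff)))


same = rbf_kernel([0.0, 0.0], [0.0, 0.0])   # identical points give 1
apart = rbf_kernel([0.0, 0.0], [4.0, 4.0])  # squared distance 32, so exp(-1)
print(same, apart)
```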

3) Fusion theory

Dempster-Shafer Evidence Theory is close to human decision principle and has been widely used in information fusion and classifier fusion [22, 23].

Assume Θ is a finite set of mutually exclusive and exhaustive hypotheses, called the frame of discernment. The mapping $m: 2^{\Theta} \to [0, 1]$, called the Basic Probability Assignment Function (BPAF), is defined by:

$$\begin{cases} m(\emptyset) = 0 \\ \sum_{A \subseteq \Theta} m(A) = 1 \end{cases} \tag{11}$$

If $A \subseteq \Theta$ and m(A) > 0, A is called a Focal Element.

If the n pieces of evidence $m_1, \ldots, m_n$ have Focal Elements $A_1, A_2, \ldots, A_i, \ldots, A_n$, Dempster’s rule of combination is described as follows:

$$\begin{cases} m(\emptyset) = 0 \\ m(A) = \dfrac{\sum_{\cap A_i = A} \prod_{i=1}^{n} m_i(A_i)}{1 - \sum_{\cap A_i = \emptyset} \prod_{i=1}^{n} m_i(A_i)} = \dfrac{\sum_{\cap A_i = A} \prod_{i=1}^{n} m_i(A_i)}{1 - k} \end{cases} \tag{12}$$

where k is a measure of the conflict among the n pieces of evidence, given by

$$k = \sum_{A_1 \cap \cdots \cap A_n = \emptyset} m_1(A_1) \cdot m_2(A_2) \cdots m_n(A_n) = 1 - \sum_{A_1 \cap \cdots \cap A_n \neq \emptyset} m_1(A_1) \cdot m_2(A_2) \cdots m_n(A_n) \tag{13}$$

In level l, l pieces of evidence need to be fused. So in the second and third levels of our three-tier HPSMCF, the value of n in (13) is 2 and 3, respectively.
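Dempster's rule in Eqs. (12) and (13) can be sketched in a few lines of Python. This is an illustrative sketch: mass functions are represented as dictionaries from `frozenset` to mass, the helper names are hypothetical, and the two posterior vectors are made-up numbers. How the level weights $w_{lj}$ enter the evidence before combination is not spelled out in the text, so the sketch combines unweighted posteriors only.

```python
from itertools import product


def posterior_to_bpa(probs, labels):
    """Treat each class posterior as mass on a singleton focal element."""
    return {frozenset([lab]): p for lab, p in zip(labels, probs) if p > 0}


def dempster_combine(*masses):
    """Combine n mass functions; returns (combined masses, conflict k)."""
    combined = {}
    conflict = 0.0
    for choice in product(*(m.items() for m in masses)):
        intersection = choice[0][0]
        joint = 1.0
        for subset, mass in choice:
            intersection = intersection & subset
            joint *= mass
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + joint
        else:
            conflict += joint                   # contributes to k in Eq. (13)
    if 1.0 - conflict < 1e-12:
        raise ValueError("evidence is in total conflict; the rule is undefined")
    return {a: v / (1.0 - conflict) for a, v in combined.items()}, conflict


labels = ["BMP2", "BTR70", "T72"]
m1 = posterior_to_bpa([0.5, 0.3, 0.2], labels)  # hypothetical level-1 output
m2 = posterior_to_bpa([0.6, 0.3, 0.1], labels)  # hypothetical level-2 output
fused, k = dempster_combine(m1, m2)
print(fused[frozenset(["BMP2"])], k)
```

With singleton focal elements only, the rule reduces to a normalised element-wise product of the posteriors, so agreement between levels sharpens the fused output.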

4. Experiment analysis

The Moving and Stationary Target Acquisition and Recognition (MSTAR) program was initiated by the U.S. Defense Advanced Research Projects Agency (DARPA) and the U.S. Air Force Research Laboratory (AFRL) in the summer of 1995. The SAR images used in our experiments are taken from the MSTAR public release database. The database consists of X-band SAR images with 1 ft × 1 ft resolution for multiple targets. The SAR target images were captured at two depression angles, 15° and 17°, over 360° of aspect angles [24]. Figure 3 shows some sample targets as both optical images and SAR images. The statistics of the MSTAR public database are summarized in Table 2.

Figure 3. Some samples with both optical images and SAR images in MSTAR. (a)-(b) Optical images for BMP2, BTR70, T72, 2S1, D7, ZIL131 and ZSU23-4. (c)-(d) Corresponding SAR images for the 7 targets.

Table 2 Part of MSTAR public database

The format of the data is RAW including amplitude and phase information. In the following experiments, only amplitude information is used. All the images are cropped by extracting 64×64 patches from the center of the image but without any other preprocessing.

(1) 3-Class recognition and confuser recognition

1) 3-Class recognition

In this experiment, three classes of targets (BMP2, BTR70, T72) are used. BMP2 and T72 each have three series, as shown in Table 2. Only the images of BMP2 sn-c9563, BTR70 and T72 sn-132 at a depression angle of 17° are used as training data. All of the images of these three classes at a depression angle of 15° are used as testing data.

Table 3 is the confusion matrix of the classification results by HPSMCF. The overall recognition rate (RR), especially for BMP2, is not as high as the state-of-the-art recognition rate, because the images are used without any preprocessing. Figure 4 compares PCA+SVM, LDA+SVM, NMF+SVM, PSMCF and HPSMCF. It can be clearly seen that HPSMCF has the best performance on this 3-class classification problem.

Table 3 The confusion matrix of HPSMCF performance

Figure 4. Recognition performance of different methods: PCA+SVM, LDA+SVM, NMF+SVM, PSMCF and HPSMCF.

2) Confuser recognition

In this experiment, four non-target vehicles (2S1, D7, ZIL131, and ZSU23-4) are added to the testing set in Table 3 as confusers.

The rejection rates are listed in Table 4. HPSMCF achieves an average rejection rate of 70.2%, while PSMCF rejects 52.4% of confusers. It can be seen that HPSMCF performs better in both 3-class recognition and confuser recognition.

Table 4 Confuser rejection rate

(2) Depression angle and configuration variance

To test the robustness of the designed method to data acquired under different conditions, the data in the 3-class problem are divided into one training set and three testing sets. The images of BMP2-c21, BTR70-c71, and T72-132 at a depression angle of 17° are used as the training set. The other images are used for the testing sets. The details of the data sets are shown in Table 5. This new partition of the data sets introduces variance in depression angle and configuration.

Table 5 New division of 3-class data

The recognition results of the five methods on the three testing sets are presented in Table 6 and Figure 5. They clearly show that set 2 has the highest recognition rate and set 4 the lowest. The lowest rate on set 4 means that both target type and depression angle can affect the recognition performance. However, the recognition rate of set 2 is higher than that of set 3, which means the variance of target configuration in set 2 has more effect on recognition than the variance of depression angle in set 3.

Table 6 Recognition results on new division

Figure 5. The recognition results on different test sets.

Meanwhile, Table 6 and Figure 5 also show that PSMCF indeed outperforms the methods that use a single feature, but the proposed hierarchical framework HPSMCF gives the best results.

The number of levels that a recognition process reaches is called the hierarchy depth. The hierarchy depths for the three testing sets are presented in Table 7. For most samples of set 2, only the first level is used for recognition, and just a few recognition processes reach the third level. However, the number of recognition processes that use all three levels grows markedly for the samples of set 3 and set 4, which means that set 3 and set 4 present more difficult recognition conditions than set 2. Therefore the system can choose a different hierarchy depth according to the difficulty of each recognition process.

Table 7 The hierarchy depth of three testing sets

The comparison with PSMCF on average recognition rate and time consumption when testing on sets 2 to 4 is shown in Table 8. It shows that the proposed method outperforms PSMCF in both recognition rate and time consumption.

Table 8 Comparison with PSMCF on average RR and used time

5. Conclusion and future work

To overcome the disadvantages of the common fusion method with a parallel structure, a hierarchical propelled strategy for multiple classifiers fusion (HPSMCF) is proposed in this paper. The recognition efficiency can be improved by extracting features and fusing the probabilistic outputs in a hierarchical propelled way. Confuser recognition can also be achieved by computing the confidence and making decisions in each level. Experiments on the MSTAR public data set demonstrate the effectiveness of the proposed hierarchical propelled fusion strategy. Compared with recognition based on a single classifier, HPSMCF has a higher recognition rate. Meanwhile, the proposed method outperforms the traditional parallel structure in both time consumption and recognition rate.

The next step in our research will consist of selecting the threshold T adaptively and using more features and more classifiers to evaluate the feasibility of this system. On this basis, our goal is to build a recognition framework based on human cognition theory. The proposed strategy can also be considered for multiple-sensor fusion and similar tasks.

References

1. Novak LM, Halversen SD, Owirka GJ, Hiett M: Effects of polarization and resolution on SAR ATR. IEEE Trans. Aerosp. Electron. Syst. 1997, 33(1):102-116.
2. Novak LM, Owirka GJ, Weaver AL: Automatic target recognition using enhanced resolution SAR data. IEEE Trans. Aerosp. Electron. Syst. 1999, 35(1):157-175. 10.1109/7.745689
3. Novak LM, Owirka GJ, Brower WS: Performance of 10- and 20-target MSE classifiers. IEEE Trans. Aerosp. Electron. Syst. 2000, 36(4):1279-1289. 10.1109/7.892675
4. Rokach L: Ensemble-based classifiers. Artif. Intell. Rev. 2010, 33(1–2):1-39.
5. Kittler J, Hatef M, Duin RPW, Matas J: On combining classifiers. IEEE Trans. Pattern Anal. Mach. Intell. 1998, 20(3):226-239. 10.1109/34.667881
6. Rahman AFR, Fairhurst MC: Multiple classifier decision combination strategies for character recognition: a review. Doc. Anal. Recognit. 2003, 5(4):166-194. 10.1007/s10032-002-0090-8
7. Waske B, Benediktsson JA: Fusion of support vector machines for classification of multisensor data. IEEE Trans. Geosci. Remote Sens. 2007, 45(12):3858-3866.
8. Xin Y, Yukuan L, Jiao LC: SAR automatic target recognition based on classifiers fusion. In Proceeding of 2011 International Workshop on Multi-Platform/Multi-Sensor Remote Sensing and Mapping (M2RSM). Xiamen; 2011:1-5.
9. Huan R, Pan Y: Decision fusion strategies for SAR image target recognition. IET Radar Sonar Navigat. 2011, 5(7):747-755. 10.1049/iet-rsn.2010.0319
10. Sánchez J, Redolfi J: Classifier combination using random walks on the space of concepts. Prog. Pattern Recognit. Image Anal. Comput. Vis. Appl. 2012, 7441: 789-796. 10.1007/978-3-642-33275-3_97
11. Busagala LSP, Ohyama W, Wakabayashi T, Kimura F: Multiple feature-classifier combination in automated text classification. In Proceeding of 2012 10th IAPR International Workshop on Document Analysis Systems (DAS). Gold Coast, QLD, Australia; March 2012:43-47.
12. Crammer K, Dredze M, Pereira F: Confidence-weighted linear classification for text categorization. J. Mach. Learn. Res. 2012, 13: 1891-1926.
13. Jian H, Zhan-Shen F, Bo-Ping Z: A graph-theoretic approach to classifier combination. In Proceeding of 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Kyoto; March 2012:1017-1020.
14. Duval-Poo M, Sosa-García J, Guerra-Gandón A, Vega-Pons S, Ruiz-Shulcloper J: A new classifier combination scheme using clustering ensemble. Prog. Pattern Recognit. Image Anal. Comput. Vis. Appl. 2012, 7441: 154-161. 10.1007/978-3-642-33275-3_19
15. Martinez AM, Kak AC: PCA versus LDA. IEEE Trans. Pattern Anal. Mach. Intell. 2001, 23(2):228-233. 10.1109/34.908974
16. Changzhen Q, Hao R, Huanxin Z, Shilin Z: Performance comparison of target classification in SAR images based on PCA and 2D-PCA features. In Proceeding of 2009 2nd Asian-Pacific Conference on Synthetic Aperture Radar (APSAR). Xian, Shanxi; October 2009:868-871.
17. Belhumeur PN, Hespanha JP, Kriegman DJ: Eigenfaces vs. Fisherfaces: recognition using class specific linear projection. IEEE Trans. Pattern Anal. Mach. Intell. 1997, 19(7):711-720. 10.1109/34.598228
18. Lee DD, Seung HS: Learning the parts of objects by non-negative matrix factorization. Nature 1999, 401: 788-791. 10.1038/44565
19. Nikolaus R: Learning the parts of objects using non-negative matrix factorization. Term Paper, MMER Team; 2007.
20. Ying W, Ping H, Xiaoguang L, Renbiao W, Jingxiong H: The performance comparison of Adaboost and SVM applied to SAR ATR. In Proceeding of 2006 International Conference on Radar. Shanghai; 2006:1-4.
21. Vapnik VN: Statistical Learning Theory. Wiley, New York; 1998.
22. Kuncheva LI: Combining Pattern Classifiers: Methods and Algorithms. Wiley, New York; 2004.
23. Van-Nam H, Tri Thanh N, Cuong Anh L: Adaptively entropy-based weighting classifiers in combination using Dempster-Shafer theory for word sense disambiguation. Comput. Speech Lang. 2009, 24(3):461-473.
24. Haichao Z, Nasser MN, Yanning Z, Thomas SH: Multi-view automatic target recognition using joint sparse representation. IEEE Trans. Aerosp. Electron. Syst. 2012, 48(3):2481-2497.

Acknowledgement

This work is supported by the National Natural Science Foundation of China under Projects 60802065 and Projects 61271287.

Author information

Correspondence to Zongjie Cao.

Additional information

Competing interests

The authors declare that they have no competing interests.

Keywords

  • SAR target recognition
  • Multiple classifiers fusion
  • Hierarchical propelled strategy