Blind detection of digital image splicing is becoming a new and important subject in information security. Among the various approaches to extracting splicing clues, Markov state transition probability features in a transform domain (discrete cosine transform or discrete wavelet transform) appear to be the most promising in the state of the art. However, the current method of extracting Markov features does not exploit the information in the transformed coefficients thoroughly. In this paper, an enhanced approach to Markov state selection is proposed, which matches coefficients to Markov states based on a well-performing function model. Experiments and analysis show that the improved Markov model can employ more of the useful underlying information in the transformed coefficients and achieves a higher recognition rate as a result.
With the proliferation of digital imaging equipment and processing software, tampering with digital images has become easy and convenient. A frequent and fundamental type of image tampering is splicing, which pastes image fragments from the same or different images into a host image by a crop-and-paste operation. Although more professional manipulations such as scaling, rotation, brightening, blurring, and smoothing may follow splicing, a careful and skillful splice can avoid any obvious trace of manipulation even without those post-operations. As the asymmetry of information security suggests, tampering with a digital image is simple with modern techniques, whereas detecting it is a hard task. Consequently, when photos can no longer be trusted as a record of what has happened, the security of society is threatened, especially in areas such as news media, military, and legal arguments.
To certify the authenticity of photos, one traditional and useful detection method is watermarking, which is widely used in copyright protection and digital image authentication. The watermark method detects tampered images by checking embedded information that is inserted at the time of imaging. However, most digital cameras do not have that function owing to cost or imaging quality considerations, so the approach is not universally applicable. What is more, all active detection methods need prior information that is not available to a third party, so they are not suited to many practical applications. In contrast, passive (also called blind) approaches to image splicing detection require no prior information and exploit only the knowledge of the image itself, which has made them more popular and gained them more attention. In this paper, we focus on the passive image splicing detection approach.
Many passive image splicing detection methods have been proposed in recent years. In contrast to some earlier research that emphasized the splicing trace or duplication property in the spatial domain [4, 5], it is now widely believed that splicing disturbs statistical features in the transform domain that reflect the regularities of coefficient correlation. Based on this assumption, many promising models and algorithms have been proposed. Chen et al. put forward a blind image splicing detection method based on two-dimensional (2-D) phase congruency and statistical moments of characteristic functions of wavelet sub-bands, which achieved a detection rate of 82.32%. Dong et al. proposed a simple but efficient approach that performs well in both detection accuracy and computational complexity; it extracts statistical features from the image run-length representation and image edge statistics. Zhao et al. came up with a method exploiting features of gray-level run-length run-number vectors from de-correlated chroma channels. In the work of Shi et al., a natural image model was proposed that combines the characteristic function moments of wavelet sub-bands with Markov transition probabilities on block discrete cosine transform (DCT) coefficients as splicing features and achieves an average accuracy as high as 91.87%. It seems natural that combining more splicing features is likely to yield higher detection rates, at the cost of greater algorithmic complexity and time consumption. The main merit of the work of Shi et al. is that it revealed the promise of Markov features in the transform domain: the 96-D Markov features alone achieved 88.31% accuracy and contributed most to the combined feature model. In another work, He et al. extended the Markov features of Shi et al. to discrete wavelet transform (DWT) coefficients and combined them with features on block DCT coefficients. Although the feature dimension rose as high as 7,290 before recursive feature elimination (RFE), this approach achieved the best detection rate so far, 93.55%, on the Columbia Image Splicing Detection Evaluation Dataset.
As previous research shows, Markov features in the transform domain seem to perform better in image splicing detection than other statistical features. We therefore concentrate on the Markov feature extraction model, which has two main steps: state selection and state transition probability calculation. The state selection methods of Shi et al. and He et al. both consist of rounding and thresholding, which is too simple to reflect all the useful information embodied in the transform domain coefficients. In this paper, instead of resorting to higher dimensions and more complicated combinations of features, we thoroughly analyze the distribution characteristics of the transform domain coefficients and put forward our enhanced state selection method.
The rest of this paper is organized as follows. The splicing feature extraction process is explained in Section 2. In Section 3, we analyze the regularity of the coefficient distribution, point out the disadvantages of the previous state selection method, and propose our improved algorithm. In Section 4, we give the experimental results, which confirm the analysis in Section 3. Conclusions and discussion are given in Section 5.
2. Feature extraction
To demonstrate the benefit of our proposed method over the previous one specifically and clearly, this section focuses on the Markov feature model and ignores all other parts of the combined features of Shi et al. and He et al. Transform domain-based Markov feature extraction proceeds as follows: first, the block discrete cosine transform or discrete wavelet transform is performed on the image; then, a difference operation is conducted; finally, a state is selected for each coefficient and the Markov state transition probabilities are calculated. A brief flow diagram is given in Figure 1.
2.1 Block discrete cosine transform
Owing to its capability for de-correlation and energy compaction, the block discrete cosine transform (BDCT) is widely used in image and video processing tasks such as compression and de-noising. The BDCT in this paper uses a block size of 8 × 8, for the reasons analyzed in , and the transform is given by Equation 1.
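Equation 1 is not reproduced in this text. For orientation only, the standard 8 × 8 two-dimensional DCT (type II, orthonormal scaling) is sketched below; it is presumably the form intended, but the symbols f (spatial block), F (coefficient block), and C (normalization factor) are assumptions rather than the authors' notation.

```latex
F(u,v) = \frac{1}{4}\, C(u)\, C(v) \sum_{x=0}^{7} \sum_{y=0}^{7} f(x,y)
         \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16},
\qquad u, v = 0, 1, \ldots, 7,
\qquad
C(k) = \begin{cases} 1/\sqrt{2}, & k = 0 \\ 1, & k > 0 \end{cases}
```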
2.2 Discrete wavelet transform
Wavelet analysis is good at capturing short-time transient or local changes in signals. Since splicing borders are sharp transitions by nature, the DWT is well suited to image splicing detection. He et al. used a Markov random process (transition probability matrix) to capture dependencies among wavelet coefficients across positions, scales, and orientations and achieved good detection results. The wavelet transform is given by Equations 2, 3, and 4.
2.3 Difference 2-D array
As many studies indicate, one of the main obstacles to splicing detection is interference from the image content. A difference 2-D array is introduced to eliminate this interference. The difference 2-D arrays are denoted Fθ(u, v) (Equations 5, 6, 7, and 8):
where (u, v) represents the difference array coordinates, i and j stand for the original array coordinates, and θ represents the direction (horizontal (h), vertical (v), diagonal (d), or minor diagonal (m)). This paper uses the difference arrays in the horizontal and vertical directions to represent splicing features.
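Equations 5 to 8 are not reproduced in this text, so the following is only a minimal sketch of difference 2-D arrays in the four directions named above. It assumes that each element is a coefficient minus its neighbor in the given direction and that "horizontal" means the neighbor in the same row; the exact index convention of the original equations may differ. The function name `difference_arrays` is hypothetical, and plain lists are used to keep the sketch dependency-free.

```python
def difference_arrays(F):
    """Return difference 2-D arrays of F for the directions h, v, d, and m.

    F is a 2-D list of transform coefficients, F[i][j] with row i and column j.
    Assumed convention (not confirmed by the source): each entry is the
    coefficient minus its neighbor in the named direction.
    """
    rows, cols = len(F), len(F[0])
    # Horizontal: current element minus its right-hand neighbor.
    h = [[F[i][j] - F[i][j + 1] for j in range(cols - 1)] for i in range(rows)]
    # Vertical: current element minus the neighbor below.
    v = [[F[i][j] - F[i + 1][j] for j in range(cols)] for i in range(rows - 1)]
    # Diagonal: current element minus the lower-right neighbor.
    d = [[F[i][j] - F[i + 1][j + 1] for j in range(cols - 1)]
         for i in range(rows - 1)]
    # Minor diagonal: lower-left element minus upper-right element.
    m = [[F[i + 1][j] - F[i][j + 1] for j in range(cols - 1)]
         for i in range(rows - 1)]
    return {"h": h, "v": v, "d": d, "m": m}


if __name__ == "__main__":
    example = [[1, 2, 4], [7, 11, 16]]
    for name, array in difference_arrays(example).items():
        print(name, array)
```

Only the "h" and "v" arrays are needed for the features used in this paper; the other two are included because the text defines all four directions.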
2.4 Markov transition probability matrix
Under the assumption that the pasted parts are additive to the host image and that the additive noise is independent of the host image, the distribution of the spliced image is the convolution of the distribution of the host image and that of the additive noise. When the additive splicing noise obeys a Gaussian distribution, the splicing operation disturbs the concentration along the main diagonal of the Markov transition probability matrix of the difference array, and this statistical artifact can be employed to detect splicing.
Because the coefficients of the difference 2-D array are real-valued and have a vast range, techniques such as rounding and thresholding are needed. If the value of an element of a difference array after rounding is larger than T or smaller than −T, it is represented by T or −T, respectively. After the rounding and threshold operations, the coefficients of the difference 2-D array become Markov states, from which the state transition probability matrix is computed. The elements of the matrix in the horizontal and vertical directions are given by Equations 9 and 10, respectively.
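The rounding and threshold step and the resulting transition probability matrix can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the function names are hypothetical, rounding is assumed to be round-half-up, and each matrix row is normalized by the number of transitions leaving that state, which is the usual definition of a one-step transition probability.

```python
import math


def round_and_threshold(diff, T):
    """Map each difference coefficient to an integer state in [-T, T]."""
    # floor(x + 0.5) rounds halves upward (assumed rounding rule).
    return [[max(-T, min(T, math.floor(x + 0.5))) for x in row] for row in diff]


def transition_matrix(states, T, direction="h"):
    """Estimate P(next state | current state) along one direction.

    direction "h" pairs each element with its right-hand neighbor,
    direction "v" pairs it with the neighbor below.
    The result is a (2T+1) x (2T+1) list; index 0 corresponds to state -T.
    """
    size = 2 * T + 1
    counts = [[0] * size for _ in range(size)]
    di, dj = (0, 1) if direction == "h" else (1, 0)
    rows, cols = len(states), len(states[0])
    for i in range(rows - di):
        for j in range(cols - dj):
            current = states[i][j] + T
            following = states[i + di][j + dj] + T
            counts[current][following] += 1
    probabilities = []
    for row in counts:
        total = sum(row)
        # Rows of states that never occur are left as all zeros.
        probabilities.append([c / total if total else 0.0 for c in row])
    return probabilities


if __name__ == "__main__":
    diff = [[0.2, 1.4, 5.0], [-0.6, -3.7, 0.4]]
    states = round_and_threshold(diff, T=1)
    print(states)
    print(transition_matrix(states, T=1, direction="h"))
```

With threshold T, the matrix has (2T + 1) × (2T + 1) entries per direction, which is why a small T keeps the feature dimension low.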
3. The enhanced Markov state selection method
The process of Markov feature extraction based on the DCT or DWT is the same, as shown in Figure 1. The parts other than state selection were introduced in Section 2; here we describe state selection in detail, i.e., why it is needed and how it is done, after which our enhanced method is given.
As shown in Section 2.4, Markov features are in fact transition probabilities calculated between states. However, the coefficients after the difference operation on the DCT or DWT are continuous real-valued numbers, which cannot conveniently be regarded as states directly. We therefore need to select a limited number of states to represent all coefficients. By state selection, we mean the method of determining which state a specific coefficient is assigned to. Supposing there are N states, the dimension of the Markov transition probabilities is N × N, which should be as small as possible in view of the computational complexity and the time consumed in classification, provided that the recognition results are not affected too much.
3.1 The original Markov state selecting method and its disadvantages
The Markov state selection method of Shi et al. and He et al. rounds the coefficients and then sets those above the threshold T to T. Both papers considered different values of T and selected a suitable one as a compromise between recognition rate and time consumption. Analyzing the coefficient histogram after BDCT and the difference operation, we find that most coefficients are small and centered on zero, and a similar situation is found in the DWT domain. On the other hand, a considerable share of coefficients lies beyond the frequently adopted thresholds. In both respects, part of the coefficient information is lost during the rounding and threshold operations, as described in the following:
The rounding operation maps fractional coefficients to the nearer of two neighboring states in a simple way and loses the information carried by the differences among them.
The threshold operation sets the coefficients above T to T and loses their difference information, too.
As mentioned above, the number of Markov states should not be too large in view of computational complexity and time consumption, so some loss of information is unavoidable. However, the rounding and threshold method is too rigid and cannot adapt flexibly to the regularity of the coefficient distribution. From Table 1, we can see that the ratio of coefficients between −0.5 and 0.5 after BDCT and the difference operation is over 30%. Also, from Table 2, in the D1 sub-band, that ratio after DWT and difference is as high as 46.5%, which means that the number of coefficients assigned to state S = 0 is almost the same as that of all other states together. Markov transition probabilities cannot describe the differences among coefficients that are assigned to the same state. Consequently, the Markov states should represent these coefficients more fully rather than map them to a single state. For the second issue, the information loss is even more obvious. As Table 1 shows, the ratios of BDCT coefficients beyond the threshold T = 3 after horizontal and vertical differences are 42.7% and 42.8%, respectively, and Table 2 shows that the ratio of DWT coefficients beyond T = 3 at level 2 in the horizontal direction after vertical difference reaches 57.5%. For the ratio of zero, the highest value is 46.5%, as illustrated in Table 3. Mapping so many large coefficients to only one state is not appropriate because their difference information is lost. Generally speaking, Markov state selection has been treated merely as a preliminary to calculating state transition probabilities and has not received the attention it deserves. When the information lost during this process is valuable and should be preserved, the rounding and threshold method lowers the detection rate. In the following, we give our enhanced method for Markov state selection.
3.2 The proposed Markov state selection method
The former Markov state selection method, rounding and thresholding, is not desirable because it does not take the coefficient distribution into account. Thus, the first step is to study the regularity of the coefficient distribution. We apply BDCT and DWT followed by the difference operation to the Columbia gray images, pool all the coefficients, and then analyze their histograms. Figure 2 shows the histograms of coefficients after BDCT and the horizontal and vertical difference operations, in the range [−30, 30] with a bin width of 0.1. We find that the majority of coefficients lie in [−10, 10] and that the histogram changes most abruptly near zero. Also, the histograms for the two directions resemble each other, so when we analyze BDCT coefficients hereafter, only the horizontal direction is considered. The upper parts of Figure 3 are histograms of coefficients after level 1 DWT (in the horizontal, vertical, and diagonal sub-bands, respectively) and the horizontal difference operation, whereas the lower parts of Figure 3 are the level 2 results. The figures show that the level 1 horizontal and vertical sub-band coefficients after horizontal difference behave like the BDCT coefficients, lying mostly in [−10, 10] and centering sharply on zero, while at level 2 the coefficients have a relatively larger range, and the energy outside [−10, 10] is also strong.
The number of Markov states must be finite and not too large, so how the vast number of coefficients is mapped to a limited number of states matters a great deal. Our method maps coefficients to states according to various presupposed function models. These functions are the envelopes of the discrete matching probabilities of the states. The matching probability is defined as follows: if the number of all coefficients is M and the number of coefficients assigned to a specific state is K, then the matching probability of this state is K/M. The simplest model is the average function, i.e., assigning coefficients to the states evenly, in a fixed ratio determined by the number of states N. Although the average function model avoids matching too many out-of-range coefficients to the same state and lets each state represent an equal share of coefficients, it fails to exploit thoroughly the regularity of the coefficient distribution analyzed above. Therefore, besides the average function, we considered other models: the absolute linear function, quadratic function, Gaussian function, and exponential function. Experimental results are given in the next section, and the comparison demonstrates which function model best meets our target.
Whatever function model is adopted, the mapping process is similar to that of the average function model. When the number of Markov states is set to N (odd), the coefficients are divided into N parts by N − 1 border values, and there are N corresponding percentages, or matching probabilities, one for each part.
From Figures 2 and 3, we know that the coefficients are distributed approximately symmetrically about X = 0. Thus, when calculating matching probabilities, we count the negative and positive parts together, which halves the number of border values. To calculate the (N − 1)/2 border values, the (N + 1)/2 probabilities for the nonnegative states are needed, denoted p1, p2, …, p(N + 1)/2. We employ the function models with estimated parameters to obtain these probabilities.
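As a rough illustration of how a function model could yield the probabilities p1, …, p(N + 1)/2, the sketch below evaluates a weight function at each nonnegative state index and normalizes the result. Everything here is an assumption made for illustration: the source does not give the parameterization of its function models, so the names `matching_probabilities`, `average`, `exponential`, and `gaussian` and their parameters `rate` and `sigma` are hypothetical. The sketch also assumes that state 0 counts once while each nonzero index k counts twice, because states +k and −k are taken together.

```python
import math


def matching_probabilities(n_states, weight):
    """Return combined matching probabilities for states 0, ±1, ..., ±(N-1)/2.

    weight(k) is an unnormalized per-state weight for state index k >= 0.
    The first entry belongs to state 0; entry k covers states +k and -k jointly.
    """
    if n_states % 2 == 0:
        raise ValueError("the number of states N must be odd")
    half = (n_states - 1) // 2
    raw = [weight(0)] + [2 * weight(k) for k in range(1, half + 1)]
    total = sum(raw)
    return [r / total for r in raw]


def average(k):
    """Average model: every state receives the same share."""
    return 1.0


def exponential(rate):
    """Exponential model with a hypothetical decay parameter `rate`."""
    return lambda k: math.exp(-rate * k)


def gaussian(sigma):
    """Gaussian model with a hypothetical width parameter `sigma`."""
    return lambda k: math.exp(-(k * k) / (2.0 * sigma * sigma))


if __name__ == "__main__":
    print(matching_probabilities(7, average))
    print(matching_probabilities(7, exponential(0.5)))
    print(matching_probabilities(7, gaussian(1.5)))
```

Because the weights are normalized, the probabilities sum to 1 and each lies in [0, 1] by construction; a grid search would then vary `rate` or `sigma` and keep the setting with the best detection rate.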
For each assumed function model, we use a grid search to estimate the most suitable parameters. Since all coefficients need to be mapped, the percentages of all states must sum to 1. Also, each percentage, as a probability, must lie in the range [0, 1]. These two conditions are the constraints (11) on the estimation.
After each matching probability is determined, we calculate the border values, denoted T1, T2, …, T(N − 1)/2. Let ω be a step parameter with reasonable precision relative to the coefficients; it is set to 0.01 in our experiments. The algorithm is as follows:
Step 1. Set the initial values Ts = 0, Te = ω, and i = 1.
Step 2. Calculate the percentage of coefficients in [−Te, −Ts) and (Ts, Te]; if it is below pi, set Te = Te + ω.
Step 3. Repeat step 2. Once the percentage is above or equal to pi, end the inner loop, return Te as the border value Ti, and set Ts = Te, Te = Te + ω, and i = i + 1.
Step 4. Repeat steps 2 and 3. When the outer loop count exceeds (N − 1)/2 or Te is above or equal to the maximum coefficient, end the outer loop and return Te.
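The border value search described above can be sketched as follows. This is an illustrative reading of the steps, not the authors' code: the names `border_values` and `to_state` are hypothetical, coefficients equal to zero are assumed to belong to the first band (state 0), and Te is kept as an integer multiple of the step to avoid floating-point drift. The input `probs` holds the combined matching probabilities for states 0, ±1, …, with positive and negative parts counted together.

```python
from bisect import bisect_left, bisect_right


def border_values(coeffs, probs, step=0.01):
    """Return border values T_1..T_(N-1)/2 for the given matching probabilities.

    coeffs: iterable of difference coefficients (any sign, non-empty).
    probs:  combined probabilities for states 0, ±1, ...; the last entry
            needs no border because it takes all remaining coefficients.
    """
    mags = sorted(abs(c) for c in coeffs)
    total = len(mags)
    max_mag = mags[-1]
    borders = []
    assigned = 0  # coefficients already placed in earlier bands
    k = 1         # Te = k * step
    for p in probs[:-1]:
        while True:
            te = k * step
            # Share of coefficients with Ts < |c| <= Te (zeros fall in band 1).
            share = (bisect_right(mags, te) - assigned) / total
            if share >= p or te >= max_mag:
                break
            k += 1
        borders.append(te)
        if te >= max_mag:
            break
        assigned = bisect_right(mags, te)
        k += 1
    return borders


def to_state(c, borders):
    """Map a coefficient to a signed state index using the border values."""
    index = bisect_left(borders, abs(c))  # number of borders below |c|
    return index if c >= 0 else -index


if __name__ == "__main__":
    coeffs = [0.0, 0.1, -0.2, 0.4, 0.7, -0.9, 1.2, -3.0]
    borders = border_values(coeffs, [0.5, 0.25, 0.25], step=0.5)
    print(borders)
    print([to_state(c, borders) for c in coeffs])
```

Once every coefficient has been mapped to a state in this way, the transition probability matrix is computed exactly as for the rounding and threshold method, so the feature dimension stays N × N.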
For each function model and combination of parameters in grid search process, after border values are determined by the above algorithm, we can easily calculate the Markov transition matrix as described in Section 2 and at last find the most suitable parameters which lead to the highest detection rate for a specific function model.
4. Experiment results and analysis
4.1 The image dataset
So far, the widely used evaluation dataset is the Columbia Image Splicing Detection Evaluation Dataset, which consists of 933 real images and 912 spliced images, as shown in Figure 4. The real images are taken by one camera, and each spliced image is produced from two real images by a splicing operation only, with no post-processing, which makes the dataset particularly suitable for splicing detection.
We label real images +1 and spliced images −1, so the problem becomes a two-class classification, which can be solved by a support vector machine. In the experiments, the classifier is LIBSVM by Lin with an RBF kernel. Before each classification, a grid search is used to find the best C and G, and half of the real and spliced images are selected randomly as the training set, with the rest as the test set. Since the number of experimental samples is small, we conducted 50 independent experiments and averaged the results to reduce the stochastic impact.
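The evaluation protocol (random half split per class, repeated 50 times, results averaged) can be sketched independently of the classifier. In the sketch below, `train_and_predict` is a hypothetical callable standing in for the SVM training and prediction step, including any parameter grid search; it is not a LIBSVM function. The sketch reports overall accuracy together with the per-class rates on real (+1) and spliced (−1) images, which is one plausible reading of the reported rates.

```python
import random


def repeated_holdout(features, labels, train_and_predict, runs=50, seed=0):
    """Average accuracy, real-image rate, and spliced-image rate over random splits.

    train_and_predict(train_x, train_y, test_x) must return a list of
    predicted labels (+1 or -1), one per test sample.
    """
    rng = random.Random(seed)
    real = [i for i, y in enumerate(labels) if y == +1]
    spliced = [i for i, y in enumerate(labels) if y == -1]
    acc_sum = real_sum = spliced_sum = 0.0
    for _ in range(runs):
        train, test = [], []
        # Half of each class goes to the training set, the rest to the test set.
        for group in (real, spliced):
            shuffled = group[:]
            rng.shuffle(shuffled)
            half = len(shuffled) // 2
            train += shuffled[:half]
            test += shuffled[half:]
        predicted = train_and_predict(
            [features[i] for i in train],
            [labels[i] for i in train],
            [features[i] for i in test],
        )
        truth = [labels[i] for i in test]
        n_real = sum(1 for y in truth if y == +1)
        n_spliced = len(truth) - n_real
        hit_real = sum(1 for y, p in zip(truth, predicted) if y == +1 and p == +1)
        hit_spliced = sum(1 for y, p in zip(truth, predicted) if y == -1 and p == -1)
        acc_sum += (hit_real + hit_spliced) / len(truth)
        real_sum += hit_real / n_real
        spliced_sum += hit_spliced / n_spliced
    return acc_sum / runs, real_sum / runs, spliced_sum / runs


if __name__ == "__main__":
    # Toy data and a trivial stand-in classifier, for demonstration only.
    xs = [[1.0]] * 10 + [[-1.0]] * 10
    ys = [+1] * 10 + [-1] * 10

    def sign_classifier(train_x, train_y, test_x):
        return [+1 if x[0] > 0 else -1 for x in test_x]

    print(repeated_holdout(xs, ys, sign_classifier))
```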
4.3 Analysis of experiment results
4.3.1 Comparison of different function models for Markov state selection
As discussed in the previous section, we use several function models to map coefficients to Markov states. Under the two constraints, many groups of function parameters are tested by grid search, and finally a well-performing parameter group is selected for each function. Tables 4 and 5 list all of the mentioned function models with their best-performing parameters and their detection rates for the BDCT and DWT domains.
The results show that the exponential function model performs best in the BDCT domain. As shown in Figure 2, the shape of the positive part of the BDCT histogram is close to an exponential distribution, which suggests that the best state matching function model is of the same type as the envelope of the coefficient histogram. This assumption does not hold exactly in the DWT domain: the histogram in Figure 3 is close to a Gaussian distribution, yet the best state matching function model is the average function, as shown in Table 5. However, the Gaussian model is the second best choice there, and its result is very close to that of the average model. Since the samples in the experimental dataset are limited, it may be difficult to uncover all the underlying regularities from such limited test data. In any case, our experiments suggest that the best choice of Markov state matching function model is related to the regularity of the coefficient distribution.
4.3.2 Results of the enhanced state selection method on BDCT domain
Figures 5 and 6 show the detection results of Markov models with different state selection methods on coefficients in the BDCT domain (horizontal and vertical differences, respectively). AC stands for the average detection rate, TP for the detection rate on real images, and TN for the detection rate on spliced images. The exponential function model is used for state matching in our enhanced method. The data show that our method improves the detection rate by 1.0% to 2.2% without increasing the feature dimension, which is significant because the detection rate has not risen for a long time. For comparison, He et al. increased the detection rate from the 91.87% of Shi et al. to 93.55% while enlarging the feature dimension from 266 to 7,290.
What is more, comparing Figures 5 and 6, we find that the AC increment is larger when N = 7 than when N = 9, which supports our assumption about the disadvantage of the original state selection method: thresholding too many coefficients above T to T loses information that is very useful for splicing detection. When N = 9, the rate increment is about 1%. If T is enlarged further, the improved state selection method will lose its advantage because the ratio of coefficients above T decreases.
4.3.3 Results of the new state selection method on DWT domain
He et al. proposed a feature extraction method that employs 2-D difference arrays to represent the correlations of position, scale, and direction, respectively, in the DWT domain. Among these three features, position contributes the most, and the detection rate of the direction or scale feature alone is less than 75%. We therefore compare only the detection rate of the position feature in this paper. Figures 7 and 8 give the detection rates of Markov features from the two state selection methods after DWT and horizontal difference, with N standing for the number of Markov states. The horizontal, vertical, and diagonal sub-bands are taken together to extract the Markov features, and the average function model is used for state matching in our enhanced method. The results show that our method improves the detection rate by 1.2% to 1.7%.
5. Conclusion
This paper improves the Markov state selection method in the works of Shi et al. and He et al., whose detection rates were the highest in the state of the art on the Columbia Image Splicing Detection Evaluation Dataset. Our proposed method maps coefficients to Markov states according to various function models. Experiments reveal that different coefficient distributions may have different best function models for state selection. Our enhanced method increases the detection ability of Markov features and raises the AC rate by up to 2.2% without increasing the feature dimension.
References
Nikolaidis N, Pitas I: Copyright protection of images using robust digital signatures. In Proceeding on International Conference on Acoustics, Speech, and Signal Processing, Atlanta. Edited by: Reeves Stanley J. Piscataway: IEEE Press; 1996:2168-2171.
Chen W, Shi Y: Image splicing detection using 2-D phase congruency and statistical moments of characteristic function. In SPIE, Electronic Imaging, Security, Steganography, Watermarking of Multimedia Contents IX, San Jose. Bellingham: SPIE; 2007:1-8. vol. 6505
Dong J, Wang W, Tan T, Shi Y: Run-length and edge statistics based approach for image splicing detection. In 7th International Workshop in Digital Watermarking (IWDW 2008), Busan. Heidelberg: Springer; 2008:76-87.
Acknowledgements
This work is funded by the National Natural Science Foundation of China (61071152, 61271316), 973 Program of China (2010CB731403, 2010CB731406), and National ‘Twelfth Five-Year’ Plan for Science and Technology Support (2012BAH38 B04).
Authors and Affiliations
School of Information Security Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China
Bo Su, Quanqiao Yuan & Shilin Wang
Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing, 100876, China
Department of Electronic Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China
This article is distributed under the terms of the Creative Commons Attribution 2.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.