To extend the reference IBDH algorithm [3], we use the DCT as a decomposition transform that changes the histogram of the error image relative to the basic algorithm. Specifically, we want to create a quasi-sparse frame [27], that is, a frame with fewer zero (fully black) pixels and many more non-zero pixels whose gray levels are very close to zero. The most popular ways to improve interpolation-based data hiding are to use a better interpolator or to modify the histogram through histogram shifting and histogram adjustment. Because the IBDH method in [3] is among the most recent IBDH techniques and already combines a novel interpolator with a histogram modification process, we combine it with a further process based on the discrete cosine transform (DCT) to improve its aggregate performance. To this end, we apply the DCT with different patch sizes and call the combined approach interpolation-based data hiding using discrete cosine transform (IBDH-DCT). Our experiments show that medium-sized patches are more effective. If a transform can create a quasi-sparse image with fewer zero pixels, it can probably improve IBDH in ViSAR frames. The DCT itself is invertible, but to turn its output into a transformed frame we have to scale and quantize the coefficient matrix, so a loss caused by quantization may remain after rescaling. This loss does not affect the watermark (the embedded data), but the final data hiding approach might not be reversible with respect to the host frame. The following subsections first review the basic concepts of the DCT and then present the proposed method.
2.1 2D DCT for frame transformation
The DCT is one of the most important decomposition transforms in signal and image processing. For example, JPEG compression is built around a core DCT. The transform uses cosine basis functions, which can be made orthonormal. An important property of the DCT is that its coefficients are real, unlike those of the discrete Fourier transform (DFT), usually computed with the fast Fourier transform (FFT), which are complex in general. Another property is its low computational complexity, which makes it suitable for real-time multimedia coding. Furthermore, with respect to energy compaction for high-performance image coding (i.e., maximum information at the lowest file size), the DCT is a powerful transform comparable to the Karhunen-Loeve transform (KLT), but with lower complexity. Equation (1) gives the 2D DCT for two-dimensional data such as gray-scale frames, and Eq. (2) gives the inverse DCT (IDCT). Here, x(m, n) denotes the pixels of the source image or patch, whose size is N × N, i.e., 0 ≤ m, n ≤ N − 1. The DCT coefficients X(k, l) are real, and the transformed version of an N-by-N image or patch is again N-by-N. The basis functions are shown in Fig. 2 for N = 8. N is the patch size; for an N-by-N patch, there are \( N^2 \) basis functions.
$$ X\left(k,l\right)=\alpha (k)\,\alpha (l)\sum \limits_{m=0}^{N-1}\sum \limits_{n=0}^{N-1}x\left(m,n\right)\,\cos \left(\frac{k}{N}\left(m+\frac{1}{2}\right)\pi \right)\cos \left(\frac{l}{N}\left(n+\frac{1}{2}\right)\pi \right),\quad \text{where}\quad \alpha (s)=\begin{cases}\sqrt{\frac{1}{N}} & \text{for } s=0\\ \sqrt{\frac{2}{N}} & \text{otherwise}\end{cases} $$
(1)
$$ x\left(m,n\right)=\sum \limits_{k=0}^{N-1}\sum \limits_{l=0}^{N-1}\alpha (k)\;\alpha (l)\kern0.24em X\left(k,l\right)\kern0.24em \cos\;\left(\frac{k}{N}\left(m+\frac{1}{2}\right)\pi \right)\;\cos\;\left(\frac{l}{N}\left(n+\frac{1}{2}\right)\pi \right) $$
(2)
The DCT can also be computed in matrix form, as in Eq. (3), where \( \underset{\_}{x} \) is the image matrix, \( \underset{\_}{C} \) is the DCT matrix, and \( \underset{\_}{X} \) is the transformed image matrix. The DCT basis functions are defined in general form in Eq. (4). Figure 3 shows the virtually texturized results for different patch sizes in a sample ViSAR frame. Clearly, the patch sizes differ in their ability to produce a quasi-sparse representation.
$$ \underset{\_}{X}=\underset{\_}{C}\,\underset{\_}{x}\,{\underset{\_}{C}}^t,\quad \text{where}\quad C\left(i,j\right)=\begin{cases}\frac{1}{\sqrt{N}} & i=0\\ \sqrt{\frac{2}{N}}\cos \left(\frac{i}{N}\left(j+\frac{1}{2}\right)\pi \right) & i>0\end{cases}\quad \text{and similarly}\quad \underset{\_}{x}={\underset{\_}{C}}^t\,\underset{\_}{X}\,\underset{\_}{C} $$
(3)
$$ F\;\left(k,l,m,n,M,N\right)=\alpha (k)\;\alpha (l)\kern0.24em \cos\;\left(\frac{k}{N}\left(m+\frac{1}{2}\right)\pi \right)\cos\;\left(\frac{l}{M}\left(n+\frac{1}{2}\right)\pi \right);\kern0.36em {\displaystyle \begin{array}{c}0\le k,m\le N-1\\ {}0\le l,n\le M-1\end{array}} $$
(4)
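As an illustration of Eqs. (1) to (3), the following NumPy sketch builds the orthonormal DCT matrix and checks that the matrix form agrees with the explicit double sum and that the inverse recovers the patch. The function names (`dct_matrix`, `dct2`, `idct2`, `dct2_direct`) and the random 8-by-8 test patch are illustrative assumptions, not part of the reference implementation.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """DCT matrix C of Eq. (3): C[i, j] = alpha(i) * cos(i * (j + 1/2) * pi / n)."""
    i = np.arange(n).reshape(-1, 1)
    j = np.arange(n).reshape(1, -1)
    c = np.sqrt(2.0 / n) * np.cos(i * (j + 0.5) * np.pi / n)
    c[0, :] = 1.0 / np.sqrt(n)  # first row uses alpha(0) = sqrt(1/n)
    return c


def dct2(x: np.ndarray) -> np.ndarray:
    """Forward 2D DCT of a square patch in matrix form: X = C x C^t."""
    c = dct_matrix(x.shape[0])
    return c @ x @ c.T


def idct2(coeffs: np.ndarray) -> np.ndarray:
    """Inverse 2D DCT in matrix form: x = C^t X C."""
    c = dct_matrix(coeffs.shape[0])
    return c.T @ coeffs @ c


def dct2_direct(x: np.ndarray) -> np.ndarray:
    """Forward 2D DCT by the explicit double sum of Eq. (1) (slow, for checking only)."""
    n = x.shape[0]
    alpha = np.full(n, np.sqrt(2.0 / n))
    alpha[0] = np.sqrt(1.0 / n)
    pos = np.arange(n) + 0.5
    out = np.empty((n, n))
    for k in range(n):
        for l in range(n):
            basis = np.outer(np.cos(k * pos * np.pi / n), np.cos(l * pos * np.pi / n))
            out[k, l] = alpha[k] * alpha[l] * np.sum(x * basis)
    return out


# Illustrative 8-by-8 patch with gray levels in [0, 255].
rng = np.random.default_rng(0)
patch = rng.integers(0, 256, size=(8, 8)).astype(np.float64)
coeffs = dct2(patch)
print("matrix form matches Eq. (1):", np.allclose(coeffs, dct2_direct(patch)))
print("IDCT recovers the patch:", np.allclose(idct2(coeffs), patch))
```

Because C is orthonormal, the inverse is simply the transpose, which is why Eq. (3) needs no separate inverse matrix.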
2.2 Quasi-sparse bit injection using IBDH and DCT
The reference IBDH method is discussed in [3]. It is applied to ordinary ViSAR frames, and its only histogram processing consists of modification or shifting techniques such as [5], which help IBDH only slightly in finding more suitable places for injecting payload bits. Our experiments show that a transform that fundamentally changes the histogram of the ViSAR frames towards a quasi-sparse condition is more effective than the usual histogram processing techniques, which do not make the frames quasi-sparse. However, histogram modification and histogram transformation can be used concurrently. To do so, we combine the basic IBDH theory of [3], a histogram modification technique as per [5], and a DCT-based decomposition process for histogram transformation. The proposed method is given in Algorithms 1 and 2 for the sender side and the receiver side, respectively. All DCT patches are treated together as a single image, because the original frame and its transformed (quasi-sparse) version must have the same size. Therefore, a plotted histogram corresponds to a whole transformed image, not to a specific patch.
Algorithm 1: The embedding process in IBDH-DCT at the sender side.
Input: An original host frame and hidden data.
Procedure:
1) Compute the DCT coefficients of the original host frame.
2) Scale the DCT coefficient matrix into the interval [0, 255].
3) Quantize the scaled DCT coefficient matrix to digital-image gray levels and treat the result as a new host frame with a quasi-sparse spatial distribution.
4) Down-sample the quasi-sparse host frame (standard down-sampling is used).
5) Calculate a reconstructed version (an up-scaled, interpolated frame) of the quasi-sparse host frame using the interpolation technique.
6) Calculate an error image by subtracting the interpolated version from the original quasi-sparse host frame, taking histogram modification into account.
7) Calculate the four key parameters of the reference IBDH technique from the histogram of the error image.
8) Inject the bits of the hidden data into the quasi-sparse host frame according to the key parameters of the prior step to create a watermarked frame.
9) Transfer the watermarked frame to the receiver along with all key parameters computed at the sender side.
Output: The watermarked frame and the key parameters related to the error image.
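Steps 1 to 6 of Algorithm 1 can be sketched roughly as follows. This is a minimal illustration under several assumptions that the text does not fix: scaling is taken to be linear min-max scaling, down-sampling keeps every second pixel, and a nearest-neighbour up-scaler stands in for the interpolator of [3]. Histogram modification, the four key parameters, and the bit injection itself (steps 6 to 8 in full) are omitted. All function names and the synthetic 64-by-64 frame are hypothetical.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT matrix of Eq. (3)."""
    i = np.arange(n).reshape(-1, 1)
    j = np.arange(n).reshape(1, -1)
    c = np.sqrt(2.0 / n) * np.cos(i * (j + 0.5) * np.pi / n)
    c[0, :] = 1.0 / np.sqrt(n)
    return c


def blockwise_dct(frame: np.ndarray, patch: int) -> np.ndarray:
    """Step 1: apply the 2D DCT to every patch-by-patch tile; output has the frame's size."""
    h, w = frame.shape
    assert h % patch == 0 and w % patch == 0, "frame must tile evenly (assumption)"
    c = dct_matrix(patch)
    tiles = np.asarray(frame, dtype=np.float64).reshape(h // patch, patch, w // patch, patch)
    # For each tile (a, b): out = C @ tile @ C^t
    return np.einsum("km,ambn,ln->akbl", c, tiles, c).reshape(h, w)


def scale_and_quantize(coeffs: np.ndarray):
    """Steps 2-3: min-max scale to [0, 255] (assumed scaling rule) and round to 8 bits."""
    lo, hi = float(coeffs.min()), float(coeffs.max())
    span = hi - lo if hi > lo else 1.0  # guard against a constant coefficient matrix
    scaled = (coeffs - lo) / span * 255.0
    return np.round(scaled).astype(np.uint8), (lo, hi)


def interpolation_error(host: np.ndarray, factor: int = 2) -> np.ndarray:
    """Steps 4-6: down-sample, up-scale, and subtract.

    Nearest-neighbour up-scaling is only a stand-in for the interpolator of [3].
    """
    low = host[::factor, ::factor]
    up = np.repeat(np.repeat(low, factor, axis=0), factor, axis=1)
    up = up[: host.shape[0], : host.shape[1]]
    return host.astype(np.int16) - up.astype(np.int16)


# Synthetic stand-in for a ViSAR frame (illustrative data only).
rng = np.random.default_rng(0)
yy, xx = np.mgrid[0:64, 0:64]
frame = np.clip(128 + 60 * np.sin(xx / 9.0) + 40 * np.cos(yy / 7.0)
                + rng.normal(0.0, 5.0, (64, 64)), 0, 255)

quasi_sparse, (lo, hi) = scale_and_quantize(blockwise_dct(frame, patch=8))
error = interpolation_error(quasi_sparse)
hist = np.bincount(quasi_sparse.ravel(), minlength=256)
print("most populated gray level:", int(hist.argmax()))
print("pixels at gray level 0:", int(hist[0]))
```

The pair `(lo, hi)` is returned because the receiver needs the scaling range to rescale the coefficients; how that range is actually conveyed is not specified in the text.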
Algorithm 2: The extraction process in IBDH-DCT at the receiver side.
Input: The received watermarked frame and the key parameters from Algorithm 1.
Procedure:
1) Extract the hidden bits and the error image through the inverse function of IBDH theory (see the main source [3] for IBDH details).
2) Down-sample the watermarked frame (standard down-sampling is used, so that the down-sampled version is exactly equal to the down-sampled version of the quasi-sparse host frame in Algorithm 1).
3) Reconstruct the down-sampled frame of the prior step with the interpolator to generate the interpolated frame.
4) Restore the quasi-sparse host frame by adding the error image to the interpolated frame.
5) Rescale the quasi-sparse host frame to generate approximate DCT coefficients.
6) Apply the inverse DCT to the rescaled coefficients to obtain an approximate version of the original host frame.
Output: An approximation of the original host frame and the injected bits.
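The loss introduced by scaling and quantization (steps 2 and 3 of Algorithm 1, undone in steps 5 and 6 of Algorithm 2) can be measured with a short round-trip experiment. The sketch below again assumes linear min-max scaling to 8 bits and an 8-by-8 patch, and it uses a random test frame rather than a ViSAR frame; the helper name `blockwise_transform` is hypothetical. Because the DCT matrix is orthonormal, the root-mean-square error of the restored frame cannot exceed half the quantization step.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT matrix of Eq. (3)."""
    i = np.arange(n).reshape(-1, 1)
    j = np.arange(n).reshape(1, -1)
    c = np.sqrt(2.0 / n) * np.cos(i * (j + 0.5) * np.pi / n)
    c[0, :] = 1.0 / np.sqrt(n)
    return c


def blockwise_transform(frame: np.ndarray, patch: int, inverse: bool = False) -> np.ndarray:
    """Patch-wise 2D DCT (C x C^t) or, with inverse=True, patch-wise IDCT (C^t X C)."""
    h, w = frame.shape
    c = dct_matrix(patch)
    if inverse:
        c = c.T
    tiles = np.asarray(frame, dtype=np.float64).reshape(h // patch, patch, w // patch, patch)
    return np.einsum("km,ambn,ln->akbl", c, tiles, c).reshape(h, w)


# Illustrative host frame (random gray levels, not ViSAR data).
rng = np.random.default_rng(0)
frame = rng.integers(0, 256, size=(64, 64)).astype(np.float64)

# Sender side: DCT, then min-max scaling to [0, 255] and rounding (assumed scaling rule).
coeffs = blockwise_transform(frame, 8)
lo, hi = float(coeffs.min()), float(coeffs.max())
step = (hi - lo) / 255.0  # width of one quantization bin in coefficient units
quantized = np.round((coeffs - lo) / step).astype(np.uint8)

# Receiver side: rescale to approximate coefficients, then inverse DCT.
approx_coeffs = quantized.astype(np.float64) * step + lo
restored = blockwise_transform(approx_coeffs, 8, inverse=True)

rmse = float(np.sqrt(np.mean((restored - frame) ** 2)))
print("quantization step:", step)
print("RMSE of restored frame:", rmse)
```

The measured error depends on the dynamic range of the coefficients, and therefore on the patch size and the frame content, so the experiment should be repeated on real frames before drawing conclusions about how small the loss is.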
Algorithm 1 covers all steps of the data embedding process at the sender, and Algorithm 2 covers the steps of the reverse process at the receiver, which is called extraction. The proposed method is not, however, fully reversible with respect to the host image, because the frame transformation process is lossy; this transformation is nevertheless near-lossless, with a loss that can be ignored. The near-lossless behavior follows from the use of a real-valued decomposition transform (with the FFT and its complex basis, for example, the loss would be large). Therefore, the whole process can be near-lossless. On the other hand, because the hidden data are fully reversible, we can compute quality metrics on the transformed samples.