
Accelerating the image processing by the optimization strategy for deep learning algorithm DBN

Abstract

In recent years, image processing, especially for remote sensing, has developed rapidly, and the efficiency of processing remote sensing images has become a research hotspot in the field. However, remote sensing data poses problems when processed by a distributed framework such as Spark, and the key obstacles to execution efficiency are data skew and data reuse. Therefore, in this paper, a parallel acceleration strategy based on a typical deep learning algorithm, the deep belief network (DBN), is proposed to improve the execution efficiency of the DBN algorithm in Spark. First, a re-partition algorithm based on a tag set is proposed to relieve the data skew problem. Second, a cache replacement algorithm based on RDD characteristics is proposed to automatically cache frequently used resilient distributed datasets (RDDs). By caching RDDs, the re-computation time of frequently reused RDDs is reduced, which decreases the total computation time of the job. Numerical experiments and analysis verify the effectiveness of the strategy.

1 Introduction

With the improvement of observation ability in the field of big data [1,2,3] and the coexistence of data from different imaging methods, wave bands, and resolution levels, remote sensing (RS) data is characterized by short acquisition cycles, wide coverage, and exponential growth. In sharp contrast with this acquisition ability, the ability to process remote sensing information is relatively low, and this "big data, small knowledge" status has caused a certain degree of data disaster.

At present, as the integration of remote sensing technology with various fields becomes closer and closer, the demand for accurately extracting geographical information from remote sensing images stands out. Many deep learning models have been applied to remote sensing image processing: Zhang et al. [4] gave a technical tutorial on the state of the art of deep learning for remote sensing data, and Das and Ghosh [5] presented a deep learning approach for spatiotemporal prediction of remote sensing data. Hinton et al. [6] applied deep neural networks to acoustic modeling in speech recognition. Badrinarayanan et al. [7] constructed a deep convolutional encoder-decoder architecture for image segmentation; compared with traditional neural networks and support vector machines, recognition accuracy using convolutional neural networks (CNNs) was improved. Mou et al. [8] presented deep recurrent neural networks for hyperspectral image classification. From the above, it can be seen that deep learning models show a strong ability to learn and generalize from remote sensing data and can facilitate remote sensing image classification and feature identification. The deep belief network (DBN) algorithm has high learning and generalization ability and has achieved many results in image processing and network applications. For example, Dahl et al. [9] used context-dependent pre-trained deep neural networks for large-vocabulary speech recognition. Chen et al. [10] performed spectral-spatial classification of hyperspectral data based on a deep belief network. Fischer and Igel [11] studied the training of restricted Boltzmann machines. Tran et al. [12] proposed an approach to fault diagnosis of reciprocating compressor valves using deep belief networks.

However, the deep learning algorithm has high computational complexity and requires multiple iterations for the parameters to converge to an optimal value, which results in low efficiency and high time overhead when classifying data. Existing research pays little attention to data skew and cache replacement during task assignment on an in-memory framework, which leads to increased computation time.

In order to improve the efficiency of DBN, a parallel acceleration strategy for DBN (PA_DBN) is proposed, which includes a re-partition algorithm and a resilient distributed dataset (RDD) cache algorithm based on reuse frequency and RDD size. The re-partition algorithm solves the data skew problem, and the RDD cache algorithm solves the data reuse problem. The effectiveness of the strategy is then verified on remote sensing data processing.

2 Model definition and analysis

2.1 Basic model for DBN

The core architecture of the DBN algorithm is the restricted Boltzmann machine (RBM). The RBM simplifies the Boltzmann machine by keeping only the links between the visible layer and the hidden layer, with no connections within either layer.

Definition 1 Joint distribution function of RBM. Suppose the visible layer has m nodes, where the ith node is denoted vi, and the hidden layer has n nodes, where the jth node is denoted hj. Write the visible layer as v = (v1, v2, …, vm) and the hidden layer as h = (h1, h2, …, hn), with 1 ≤ i ≤ m, 1 ≤ j ≤ n, vi ∈ {0, 1}, and hj ∈ {0, 1}. The energy function of the RBM is defined as:

$$ E\left(v,h\left|\theta \right.\right)=-\sum \limits_{j=1}^n\sum \limits_{i=1}^m{w}_{ij}{h}_j{v}_i-\sum \limits_{i=1}^m{a}_i{v}_i-\sum \limits_{j=1}^n{b}_j{h}_j $$
(1)

Here θ = {wij, ai, bj; 1 ≤ i ≤ m, 1 ≤ j ≤ n}, where wij represents the weight between hidden node j and visible node i, ai denotes the bias of visible node i, and bj denotes the bias of hidden node j. Once the parameters are determined, the joint distribution function of the RBM is defined as:

$$ p\left(v,h|\theta \right)=\frac{1}{Z\left(\theta \right)}\exp \left(-E\left(v,h|\theta \right)\right) $$
(2)
$$ Z\left(\theta \right)=\sum \limits_{v,h}\exp \left(-E\left(v,h|\theta \right)\right) $$
(3)
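To make Definition 1 concrete, here is a minimal NumPy sketch (an illustration, not the paper's implementation) of the energy function in Eq. (1) and the joint distribution in Eqs. (2) and (3); the partition function Z(θ) is enumerated by brute force, so this is only feasible for very small m and n.

```python
import numpy as np
from itertools import product

def rbm_energy(v, h, W, a, b):
    """E(v, h | theta) from Eq. (1): -sum_ij w_ij h_j v_i - a.v - b.h."""
    return -(h @ W @ v) - (a @ v) - (b @ h)

def rbm_joint_prob(v, h, W, a, b):
    """p(v, h | theta) from Eqs. (2)-(3), with Z(theta) summed over all
    2^m * 2^n binary configurations (tiny models only)."""
    m, n = len(v), len(h)
    Z = sum(np.exp(-rbm_energy(np.array(vv), np.array(hh), W, a, b))
            for vv in product((0, 1), repeat=m)
            for hh in product((0, 1), repeat=n))
    return float(np.exp(-rbm_energy(v, h, W, a, b)) / Z)

rng = np.random.default_rng(0)
m, n = 3, 2                               # visible and hidden node counts
W = rng.normal(size=(n, m))               # W[j, i] links hidden j to visible i
a, b = rng.normal(size=m), rng.normal(size=n)
print(rbm_joint_prob(np.array([1, 0, 1]), np.array([0, 1]), W, a, b))
```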

Definition 2 Softmax function. This function is a classifier that generalizes logistic regression to multi-class problems. For a given sample set {(x(1), y(1)), …, (x(m), y(m))}, the hypothesis function hθ(x(i)) and the cost function J(θ) are represented as:

$$ {h}_{\theta}\left({x}^{(i)}\right)=\left[\begin{array}{c}p\left({y}^{(i)}=1\left|{x}^{(i)};\theta \right.\right)\\ {}p\left({y}^{(i)}=2\left|{x}^{(i)};\theta \right.\right)\\ {}\vdots \\ {}p\left({y}^{(i)}=k\left|{x}^{(i)};\theta \right.\right)\end{array}\right] $$
(4)
$$ J\left(\theta \right)=-\frac{1}{m}\left[\sum \limits_{i=1}^m\sum \limits_{j=1}^k1\left\{{y}^{(i)}=j\right\}\log \frac{e^{\theta_j^T{x}^{(i)}}}{\sum_{l=1}^k{e}^{\theta_l^T{x}^{(i)}}}\right] $$
(5)

where θ is the parameter matrix of the model and k is the number of classes.
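As a hedged illustration (not the authors' code), Eqs. (4) and (5) can be written in a few lines of NumPy; classes are indexed 0..k−1 here rather than 1..k.

```python
import numpy as np

def softmax_hypothesis(theta, x):
    """h_theta(x) from Eq. (4): a vector of k class probabilities.
    theta has shape (k, d), one parameter row per class; x has shape (d,)."""
    logits = theta @ x
    logits -= logits.max()          # subtract the max for numerical stability
    e = np.exp(logits)
    return e / e.sum()

def softmax_cost(theta, X, y):
    """J(theta) from Eq. (5): mean negative log-probability of the true class.
    X has shape (m, d); y holds integer labels in 0..k-1."""
    probs = np.array([softmax_hypothesis(theta, x) for x in X])
    return float(-np.log(probs[np.arange(len(y)), y]).mean())
```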

A DBN can be regarded as a stack of multiple RBMs; the sample labels are combined with the softmax classification function for supervised training, and backpropagation is used to fine-tune the whole model.

2.2 Job execution model

Definition 3 RDD execution time. Assuming that each RDD has n partitions, RDDi is denoted as RDDi = {Pi1, Pi2, …, Pin}, where Pij represents the jth partition of RDDi. The execution time of RDDi is then the maximum computation time over its n partitions, denoted \( {T}_{{\mathrm{RDD}}_i} \):

$$ {T}_{{\mathrm{RDD}}_i}=\max \left({T}_{P_{i1}},{T}_{P_{i2}},\cdots, {T}_{P_{in}}\right) $$
(6)

The computation time of a partition is composed of read cost and process cost, where the read cost is the time to fetch the parent partitions and the process cost is the processing time, which depends on the complexity of the computation, the closure, and the size of the parent partitions. Denote parentsij as the parent partitions of Pij; the computation time of Pij can then be represented as:

$$ {T}_{P_{ij}}=\mathrm{read}\left({\mathrm{parents}}_{ij}\right)+\mathrm{process}\left({\mathrm{parents}}_{ij}\right) $$
(7)

Lemma 1 The consistency principle of partition computation time. A partition Pij with larger computation time will increase the execution time of RDDi.

Proof Denote Pmax as the partition with the largest computation time in RDDi and Tmean as the mean computation time of all partitions of RDDi; clearly \( {T}_{P_{\mathrm{max}}}>{T}_{\mathrm{mean}} \). By Definition 3, the execution time of RDDi equals the computation time of its slowest partition, so any growth in \( {T}_{P_{\mathrm{max}}} \) directly increases \( {T}_{{\mathrm{RDD}}_i} \), while speeding up any other partition changes nothing.
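A toy numeric sketch of Eqs. (6) and (7) and Lemma 1, under assumed read and process costs:

```python
def partition_time(read_cost, process_cost):
    """Eq. (7): read the parent partitions, then process them."""
    return read_cost + process_cost

def rdd_execution_time(partition_times):
    """Eq. (6): the slowest partition determines the RDD's execution time."""
    return max(partition_times)

# Three partitions with assumed costs; the skewed third one dominates (Lemma 1).
times = [partition_time(1.0, 2.0), partition_time(1.0, 2.5), partition_time(1.0, 9.0)]
print(rdd_execution_time(times))  # 10.0, even though the mean is only 5.5
```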

Lemma 2 The task skewing principle. Efficient task allocation can effectively reduce the execution time of local tasks and accelerate the execution of the job.

Proof Assume the current set of workers {w1, w2, …, wm} is ordered by computing ability, and the input partitions of all workers are {Pw1, Pw2, …, Pwm}. According to the division, w1 has completed multiple rounds of pull tasks, and wm is the last worker to join and has not performed any pull tasks. For worker w1, the record format in partState is 1, 2, …, m − 1, which indicates that w1 has completed the local pull of the first m − 1 partitions. For worker wm, the record format in partState is m, which indicates that wm has not performed any local pull tasks.

Before the task is switched, the local pull task that the worker w1 is going to perform can be defined as:

$$ {\mathrm{Task}}_{p1}=\mathrm{compute}\left({P}_{w_1},{P}_{w_m}\right) $$
(8)

The local pull tasks that are performed by the worker wm can be defined as:

$$ {\mathrm{Task}}_{pm}=\mathrm{compute}\left({P}_{w_1},{P}_{w_2},\dots, {P}_{w_m}\right) $$
(9)

From the point of view of task workload, Taskpm > Taskp1. Since w1 is the fastest worker and wm is the slowest, w1 and wm swap tasks. In essence, the swap increases the computation cost of w1 and reduces the workload of wm, so the task allocation is deliberately skewed.

Because the computing ability of w1 is higher than that of wm and the workload of Taskpm is larger than that of Taskp1, two properties follow: w1.runTimes(Taskpm) < wm.runTimes(Taskpm) and wm.runTimes(Taskpm) > wm.runTimes(Taskp1).

That is, Taskpm finishes sooner on w1 than on wm, and wm finishes the lighter Taskp1 sooner than it would Taskpm. Together these properties show that the task allocation effectively reduces the execution time of local tasks and accelerates job execution.

Lemma 3 The principle of saving time. Assume RDDi is computed ri times during a job; then the time saved by caching RDDi is \( \left({r}_i-1\right)\ast {T}_{{\mathrm{RDD}}_i} \).

Proof Denote the computation counts of the RDDs in a job as R = {r1, r2, …, rn}, where ri is the number of times RDDi is computed, and \( {T}_{{\mathrm{RDD}}_i} \) is the time to compute RDDi once. Based on Definition 3, if RDDi is cached, it is computed only once instead of ri times, so the time \( \left({r}_i-1\right)\ast {T}_{{\mathrm{RDD}}_i} \) is saved. When all the RDDs in the job are cached, the total saved time Tjob is:

$$ {T}_{\mathrm{job}}=\left({r}_1-1\right){T}_{{\mathrm{RDD}}_1}+\left({r}_2-1\right){T}_{{\mathrm{RDD}}_2}+\cdots +\left({r}_n-1\right){T}_{{\mathrm{RDD}}_n}=\sum \limits_{i=1}^n\left({r}_i-1\right){T}_{{\mathrm{RDD}}_i}>0 $$
(10)
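A small sketch of Eq. (10), assuming the per-RDD computation counts and single-computation times are known:

```python
def caching_saved_time(counts, times):
    """Eq. (10): caching saves (r_i - 1) * T_RDDi for each RDD."""
    return sum((r - 1) * t for r, t in zip(counts, times))

# An RDD computed 3 times at 4 s saves (3 - 1) * 4 = 8 s; one computed once saves nothing.
print(caching_saved_time([3, 1, 2], [4.0, 2.0, 5.0]))  # 8 + 0 + 5 = 13.0
```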

3 Methods

In this section, a parallel acceleration DBN strategy (PA_DBN) is proposed to improve the execution efficiency of the DBN algorithm in Spark; the detailed process of the PA_DBN strategy is shown in Fig. 1.

Fig. 1 Flow chart of the PA_DBN strategy

The detailed process of the PA_DBN strategy is as follows (a PySpark skeleton of these steps is sketched after this list):

  1. Initialize the read data path and the number of data partitions. Spark uses the RDD textFile operator to read the data from HDFS into the memory of the Spark cluster.

  2. Create an RBM training method that contains backpropagation; the result of the backward calculation is used as the next RBM's input data, and the weights of the DBN algorithm are updated in the forward pass to reduce the error.

  3. If data skew occurs, perform the re-partition (RP) algorithm, which re-partitions the RDD so that no partition is much larger than the others and drives up the computation time.

  4. If any RDD has a reuse frequency greater than 2, perform the RDD cache (RC) algorithm, which caches frequently reused RDDs with higher weight on the basis of RDD frequency and RDD size. When memory space is insufficient, the RDDs with smaller weight are replaced first.

  5. Initialize the weight parameters, calculate the weights of the first layer, calculate the weights of the hidden layers with the DBN training function from step 2, and then merge the weight values of each node.

  6. Save the trained weight parameters to HDFS.
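The following hedged PySpark skeleton illustrates how steps 1–6 could fit together; `is_skewed`, `range_partition` (sketched in Section 3.1), and `train_rbm` are hypothetical placeholders, not the authors' implementation, and the HDFS paths are illustrative.

```python
from pyspark import SparkContext, StorageLevel

sc = SparkContext(appName="PA_DBN")

# Step 1: read samples from HDFS into an RDD with a chosen number of partitions.
data = (sc.textFile("hdfs:///rs/samples.txt", minPartitions=8)
          .map(lambda line: [float(x) for x in line.split(",")]))

# Step 3: re-partition only if a small sample reveals skew (RP algorithm).
if is_skewed(data.sample(False, 0.01).collect()):   # hypothetical skew test
    data = range_partition(data, num_partitions=8)  # hypothetical, see Section 3.1

# Step 4: keep frequently reused RDDs in memory (RC algorithm).
data.persist(StorageLevel.MEMORY_ONLY)

# Steps 2 and 5: layer-wise RBM training; each layer's output feeds the next.
weights, layer_input = [], data
for layer in range(3):                        # e.g., three hidden layers
    w, layer_input = train_rbm(layer_input)   # hypothetical RBM trainer
    weights.append(w)

# Step 6: save the trained weight parameters back to HDFS.
sc.parallelize(weights).saveAsTextFile("hdfs:///rs/dbn_weights")
```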

3.1 Re-partition algorithm

In Spark, RDDs are partitioned by the default hash partitioner, which can produce RDD partitions of different sizes in DBN. Based on Definition 3, the maximum partition execution time determines the execution time of the whole RDD, so uneven partition sizes slow down DBN. The RP algorithm is proposed to solve the data skew caused by skewed partitioning of the data.

The details of the RP algorithm are as follows (a PySpark sketch follows the list):

  1. The data set is sampled on a small scale, and the sample is examined to judge whether the data is skewed.

  2. If the data is skewed, repartition the data using a set of segmentation tags: if the data should have n partitions based on the degree of parallelism, then n − 1 segmentation tags (s1, s2, ..., sn−1) are needed.

  3. When the data is partitioned under Spark, records are distributed to partitions according to the tag set: keys with key < s1 go to the first partition, s1 ≤ key < s2 to the second, ..., and sn−1 ≤ key to the nth.
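A hedged PySpark sketch of this idea (illustrative names, not the paper's code): the segmentation tags are taken from a small sorted sample of the keys, and each record of an assumed (key, value) RDD is routed to the partition whose range contains its key.

```python
from bisect import bisect_right

def range_partition(pairs, num_partitions):
    """Repartition a (key, value) RDD by sampled range tags instead of hashing."""
    # Step 1: small-scale sampling of the keys.
    sample = sorted(pairs.keys().sample(False, 0.01).collect())
    # Step 2: choose n - 1 roughly evenly spaced segmentation tags s1..s(n-1).
    step = max(1, len(sample) // num_partitions)
    tags = sample[step::step][:num_partitions - 1]
    # Step 3: key < s1 -> partition 0, s1 <= key < s2 -> partition 1, ...,
    # key >= s(n-1) -> partition n - 1.
    return pairs.partitionBy(num_partitions, lambda k: bisect_right(tags, k))

balanced = range_partition(pairs, 8)  # `pairs` is an assumed (key, value) RDD
```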

3.2 RDD cache algorithm

In the process of PA_DBN execution, minimizing the space occupied by the storage area while reserving memory for the execution area can effectively improve task execution efficiency. The memory area minimization algorithm is shown in Algorithm 1 (a sketch follows it), and the specific steps are as follows:

  1. Information about the RDDs is obtained from the DAG of the DBN job. Taking the RDD with the action operation as the root node, pruning analysis and depth-first traversal are applied: RDDs with reuse frequency f = 0 are pruned (they are no longer used), while RDDs with f > 1 are reserved as caching candidates; the higher the frequency, the greater the caching weight. The key-value pair set R<RDDi, f> is traversed, RDDs with f > 1 are put into the candidate cache list, and the dynamic frequency f is decreased by 1 each time an RDD is visited.

  2. During actual execution, once a generated RDD appears in the candidate cache list, it is compared with the other candidates. If there is more than one candidate, they are sorted by relative weight (frequency over size) and placed in the cache, under the restriction that the size of the RDD plus the currently used storage area cannot exceed the allocated storage memory.

  3. When memory space is insufficient, the least-weighted RDDs are evicted in turn, following the order of the weight list, until the space required by the new RDD is available.

Algorithm 1 The memory area minimization (RC) algorithm
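Below is a hedged, Spark-independent sketch of the weight-based replacement that Algorithm 1 describes; the names and the frequency-over-size weight are assumptions based on the steps above, not the authors' exact implementation.

```python
def cache_rdd(cache, capacity, rdd_id, size, freq):
    """Try to cache one RDD. cache maps rdd_id -> (size, weight);
    returns True if rdd_id ends up cached."""
    weight = freq / size                      # high reuse, small size => keep
    used = sum(s for s, _ in cache.values())
    # Step 3: evict the least-weighted entries until the new RDD fits.
    while used + size > capacity and cache:
        victim = min(cache, key=lambda k: cache[k][1])
        if cache[victim][1] >= weight:        # nothing cached is worth less
            return False
        used -= cache.pop(victim)[0]
    if used + size <= capacity:
        cache[rdd_id] = (size, weight)
        return True
    return False

cache = {}
cache_rdd(cache, capacity=100, rdd_id="rdd_3", size=40, freq=4)   # weight 0.10
cache_rdd(cache, capacity=100, rdd_id="rdd_5", size=70, freq=14)  # 0.20, evicts rdd_3
```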

4 Result and discussion

4.1 Experimental environment

The experimental environment uses one master and four workers to establish a Spark cluster. Considering the large amount of computation and rich information of remote sensing images, remote sensing image data is used as the data source in this experiment; the data is derived from the Landsat 8 satellite in 2013. The study area is Manas County, Xinjiang, located at 85.7–86.7° E and 43.5–45.6° N (Fig. 2). The pre-processing of the remote sensing data is performed with the ENVI 5.2 software to improve the authenticity of the data. Then common feature indexes are extracted, such as the normalized difference vegetation index (NDVI), difference vegetation index (DVI), ratio vegetation index (RVI), and enhanced vegetation index (EVI). Based on these four characteristic index parameters, five different sample sets are extracted, as shown in Table 1. Besides grassland, the sample sets include desert, river, and other classes; the training and testing samples number 5000 and 3000 items, respectively, as shown in Table 2.

Fig. 2 Remote sensing image of Manas County, Xinjiang

Table 1 Different combination of characteristics
Table 2 Training samples and testing samples

4.2 The execution time of PA_DBN

In this experiment, sample set 1 is selected as the training set of the PA_DBN model. With data sizes of 5000, 10,000, and 20,000 items, the training time of PA_DBN and traditional DBN is shown in Table 3 and Fig. 3, where 1, 2, and 3 represent 20,000, 10,000, and 5000 items, respectively.

Table 3 The execution time of PA_DBN and DBN
Fig. 3 The comparison of the two algorithms

Table 3 and Fig. 3 show that, with 10,000 items, the PA_DBN algorithm executes about 12.7% faster than the DBN algorithm under the same data volume, and with 20,000 items, about 16.4% faster.

4.3 The execution time by using RP algorithm

With three hidden layers in the DBN, different sample sets are used as input data, that is, different feature selections. The comparison of execution time between the original hash partitioning and the re-partition algorithm on the same sample sets is shown in Fig. 4, where the sample category number corresponds to samples 1–5.

Fig. 4 The comparison when taking the re-partition algorithm

Figure 4 shows that the execution speed of PA_DBN differs across sample sets and that using the re-partition algorithm improves it in every case.

Different feature combinations of NDVI, RVI, DVI, and EVI are used as DBN input data to test the accuracy. The number of hidden layers of the DBN is 3, the number of iterations is 1000, and the other parameters are fixed. The accuracy of DBN for grassland discrimination under different input feature combinations is shown in Table 4. In the model structure n-h-o, n represents the number of characteristics, h the number of hidden layers, and o the number of outputs.

Table 4 Different characteristics combination

From Table 4, we can see that the accuracy of DBN differs with the input feature combination; when the input combination is NDVI + RVI + DVI + EVI, the highest accuracy, 96.19%, is achieved.

4.4 The execution time of RC algorithm

In this experiment, the fifth sample set is selected as the training set of the PA_DBN algorithm. By adjusting the number of hidden layers to increase the number of cache RDD, the execution time of RDD before and after using the RC algorithm is tested, as shown in Fig. 5.

Fig. 5 The comparison when taking the RDD cache algorithm

As shown in Fig. 5, the RC algorithm shortens the execution time of the PA_DBN algorithm for data training and thus speeds up its execution. Meanwhile, the execution time of both PA_DBN with RC and DBN grows with the number of hidden layers, and the improvement in execution efficiency becomes more and more significant. To further examine the accuracy of DBN, we test the accuracy of PA_DBN under different numbers of hidden layers, with the feature combination of NDVI, RVI, DVI, and EVI as input data and 1000 iterations. The accuracy of PA_DBN is shown in Table 5.

Table 5 Different number of hidden layer

From Table 5, it can be seen that the topological structures differ in the number of hidden layers. As the number of hidden layers increases, the accuracy of PA_DBN for grassland discrimination shows an upward trend; the highest accuracy is 97.41%, and the accuracy decreases once the number of hidden layers exceeds four. Therefore, the accuracy of DBN does not increase indefinitely with the number of hidden layers.

From Tables 3, 4, and 5 and Figs. 3, 4, and 5, there are three groups of experiments in this section. Experiment 1 shows that the training speed of the PA_DBN algorithm is better than that of the DBN algorithm at the same order of magnitude. Experiment 2 verifies that the RP algorithm solves the data skew problem and improves the execution speed of PA_DBN. Experiment 3 verifies that the RC algorithm automatically caches highly reused RDDs with fine-grained replacement, further reducing execution time.

5 Conclusions

In this paper, we proposed a PA_DBN strategy under Spark to solve, on the basis of theoretical analysis, problems existing in the implementation of the DBN algorithm, such as data skew and the lack of fine-grained, automatic caching of highly reused data. These problems give DBN high complexity and long execution time. The re-partition algorithm solves the skew of the training sample set, making the amount of data in each RDD partition more uniform and improving the speed of DBN training. The RC algorithm caches the RDDs with high reuse frequency in the DBN algorithm. Experiments are conducted to verify the effectiveness of the presented strategy.

Our future work is mainly concentrated on the following aspects: analyzing different types of remote sensing resources, designing optimization strategies adapted to the load and type of jobs, and taking advantage of other convolutional algorithms to further improve execution efficiency.

Abbreviations

CNN: Convolutional neural network

DAG: Directed acyclic graph

DBN: Deep belief network

RBM: Restricted Boltzmann machine

RDD: Resilient distributed dataset

RS: Remote sensing

References

  1. W.S. John, Big data: a revolution that will transform how we live, work, and think. International Journal of Advertising 33(1), 181–183 (2014)

  2. K. Kambatla, G. Kollias, V. Kumar, et al., Trends in big data analytics. Journal of Parallel and Distributed Computing 74(7), 2561–2573 (2014)

  3. C.L.P. Chen, C.Y. Zhang, Data-intensive applications, challenges, techniques and technologies: a survey on big data. Information Sciences 275(11), 314–347 (2014)

  4. L. Zhang, L. Zhang, B. Du, Deep learning for remote sensing data: a technical tutorial on the state of the art. IEEE Geoscience & Remote Sensing Magazine 4(2), 22–40 (2016)

  5. M. Das, S.K. Ghosh, Deep-STEP: a deep learning approach for spatiotemporal prediction of remote sensing data. IEEE Geoscience & Remote Sensing Letters 13(12), 1984–1988 (2016)

  6. G. Hinton, L. Deng, D. Yu, et al., Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups. IEEE Signal Processing Magazine 29(6), 82–97 (2012)

  7. V. Badrinarayanan, A. Kendall, R. Cipolla, et al., SegNet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence 39(12), 2481–2495 (2017)

  8. L. Mou, P. Ghamisi, X. Zhu, et al., Deep recurrent neural networks for hyperspectral image classification. IEEE Transactions on Geoscience and Remote Sensing 55(7), 3639–3655 (2017)

  9. G.E. Dahl, D. Yu, L. Deng, et al., Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition. IEEE Transactions on Audio, Speech, and Language Processing 20(1), 30–42 (2012)

  10. Y. Chen, X. Zhao, X. Jia, et al., Spectral–spatial classification of hyperspectral data based on deep belief network. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 8(6), 2381–2392 (2015)

  11. A. Fischer, C. Igel, Training restricted Boltzmann machines. Pattern Recognition 47(1), 25–39 (2014)

  12. V.T. Tran, F. Althobiani, A. Ball, et al., An approach to fault diagnosis of reciprocating compressor valves using Teager-Kaiser energy operator and deep belief networks. Expert Systems With Applications 41(9), 4113–4122 (2014)


Acknowledgements

The authors would like to thank the reviewers for their thorough reviews and helpful suggestions.

Funding

This paper was supported by the National Natural Science Foundation of China under Grant Nos. 61262088, 61462079, and 61562086.

Availability of data and materials

All data are fully available without restriction.

Author information


Contributions

CTY is the main writer of this paper. She proposed the main idea, completed the experiment, and analyzed the result. CYY and ZH gave some important suggestions for this paper. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Changyan Ying.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.


About this article


Cite this article

Ying, C., Huang, Z. & Ying, C. Accelerating the image processing by the optimization strategy for deep learning algorithm DBN. J Wireless Com Network 2018, 232 (2018). https://doi.org/10.1186/s13638-018-1255-6

