- Open Access
Trend prediction of the 3D thermocline’s lateral boundary based on the SVR method
© The Author(s). 2018
- Received: 22 January 2018
- Accepted: 11 October 2018
- Published: 24 October 2018
In recent years, the float-based wireless sensor network of the Argo project has produced a large amount of ocean data. These data can be used to analyze the oceanic thermocline, and this paper presents a method for forecasting the thermocline trend with support vector regression (SVR), a machine learning method. First, the paper refines the spatial resolution of the data with the SVR method and determines the lateral boundary of the three-dimensional thermocline with the information entropy method. Using BOA Argo data from 2004 to 2015, it then predicts the thermocline trend in the study area (10°–25° S, 55°–80° E) over the next 4 years. The results show that the SVR method can effectively predict the trend of the three-dimensional thermocline’s lateral boundary.
The ocean, with its abundant resources and broad development prospects, is of vital significance to humankind. With the increasing frequency of maritime and military activities, the rapid development of the marine economy, and the worsening marine environment, marine science is attracting growing public attention. The objects of marine scientific research are the composition, structure, properties, distribution, genesis, and evolution of the natural phenomena relevant to the oceans, as well as the exploitation and utilization of marine resources. However, owing to the lack of oceanographic data, analysis of the internal characteristics of the marine environment has mainly focused on large-scale regional and seasonal changes. To some extent, the implementation of the Argo project has provided data at finer and more precise spatial and temporal intervals for marine environmental research. Many overseas and domestic scholars have used Argo data to study the oceanic thermocline, circulation, water masses, and so forth [1–4]. Owing to the limitations of data acquisition equipment and methods, however, more accurate data cannot be obtained at present. Over the past 15 years, the international Argo program has built a global ocean observing network of 3000 satellite-tracked automatic detection floats through the cooperation of more than 30 member countries. It has extensively collected global temperature and salinity data from the surface to 2000-m depth in the ice-free deep ocean. As an atypical wireless sensor network, the Argo ocean observing network is mainly used to sense attributes of the ocean such as temperature, salinity, depth, and other environmental information.
Studying the Argo data helps to further explore the internal state of the ocean, to analyze in depth how the oceans affect the global climate, and to obtain innovative results in fundamental research and operational applications in the marine, meteorological, fishery, and transportation fields.
The temperature and salinity of seawater are basic elements of the oceans, and the variation of their spatial and temporal distribution is closely related to almost all oceanic phenomena. The distribution and variation of temperature and salinity affect the movement of seawater and form water masses with different characteristics. In different sea areas, the spatial and temporal distribution of marine environmental elements is extremely complicated, and the seawater cline (a layer of sharp vertical change in temperature, salinity, or density) is an important phenomenon in the ocean. As a vertical boundary between water masses, the thermocline directly affects mariculture and fishing; its existence and variation also change the sound channel characteristics through the acoustic cline, thereby affecting submarine sonar communication systems. A strong cline can also hinder the transport of nutrients between the upper and lower water layers, as well as vortex and convective heat exchange, and thus acts as a natural “barrier.” At the same time, the formation mechanism of the thermocline is closely related to the circulation, water masses, and internal waves of the ocean. Hence, research on the thermocline is crucial to national defense, underwater communication, fishing, material diffusion, turbulent thermal diffusion, and other marine theoretical studies. In recent years, with the rapid development of computer technology, a variety of data processing methods have emerged. For marine data, which are diverse in type, large in volume, and complicated in correlation, the traditional interpolation method can be replaced by the SVR method. Traditional interpolation techniques, as statistical methods, require the model to be re-run on all the data, whereas dynamic SVR methods need to run only on the new data.
The predictive and generalization capacity of SVR depends on the choice of kernel function, and the traditional SVR method mainly chooses the kernel function by experience, which carries certain risks in specific applications. The radial basis function (RBF) is a widely used SVR kernel function with high prediction accuracy [6, 7].
In this paper, a method for predicting the trend of the three-dimensional thermocline’s lateral boundary is presented on the basis of the SVR method. Specifically, the BOA Argo temperature data, with a spatial resolution of 1° × 1°, are first refined. Then, the lateral boundary of the three-dimensional thermocline is determined from the high-resolution data with the information entropy method, and its future variation trend is predicted.
The remainder of this paper is organized as follows. Section 2 introduces the source of the Argo project data and the present research status of the thermocline. Section 3 describes the design and implementation of the algorithm in detail. Section 4 compares and analyzes the numerical and predicted results of the experiment. Section 5 gives the summary and prospects of the paper.
This section first introduces the Argo project and points out the shortcomings of ocean observations for the accurate determination of the thermocline. It then describes the current status and significance of thermocline research.
2.1 Argo project
Currently, the data provided by the Argo floats have been upgraded to version 3.1 and are used in further marine research. Dong et al. used the temperature, salinity, and pressure profiles of the Argo floats to deduce the mixed-layer depth (MLD) of the Southern Ocean. Hadfield et al. studied the accuracy of the Argo profiling float dataset for estimating the temperature and heat storage in the upper North Atlantic Ocean. Guinehut et al. analyzed how combining high-resolution sea level and sea surface temperature satellite data with accurate but sparse in situ temperature profiles from Argo contributes to the reconstruction of large-scale, monthly mean, 200-m depth temperature fields. Resnyanskii et al. used the Argo profiling float dataset to estimate the means, variances, and three-dimensional spatial covariances of temperature and salinity anomalies in the upper 1400 m of the ocean. Maze et al. described how to conduct unsupervised classification of Argo temperature profiles. In recent years, the growing volume of marine data has become difficult to handle with traditional mathematical statistics, so artificial intelligence methods can be applied to process and analyze the massive data. However, the application range of ocean observational data acquired through conventional means or Argo floats is limited by problems such as inconsistent observation depth, discontinuous observation time, and spatial discrepancy. The member countries of the Argo project have analyzed the Argo data objectively and developed gridded products [15–18]. As a supplement to the basic information on global ocean phenomena, these products greatly facilitate further research.
The Second Institute of Oceanography, SOA, has also constructed a gridded dataset of Argo temperature and salinity for the global ocean through a simpler and more effective objective analysis method, referred to as BOA Argo (http://www.argo.org.cn/).
The spatial range of the BOA Argo data covers the global ocean (180°W–180°E, 79.5°S–79.5°N) with a spatial resolution of 1° × 1°. The seawater between 10 and 1950 m in depth is divided into 58 vertical standard layers, and the minimum distance between two layers is 10 m. The gridded dataset can be used for studying the basic phenomena of the physical ocean, but its precision does not meet the requirements for judging the thermocline. In this paper, the SVR method is therefore applied to refine the data.
2.2 Research status and significance of thermocline
The ocean contains a thermocline, halocline, pycnocline, and sound-velocity cline; the thermocline refers to a layer with a sharp vertical change in seawater temperature. Since the characteristic values of the thermocline mainly comprise its strength, its thickness, and the depth of its upper boundary [19, 20], determining the three-dimensional boundary of the thermocline and predicting its variation trend play a key role in the analysis.
At present, a series of studies on the ocean temperature structure have been carried out. Research on the thermocline is meaningful not only for theoretical study but also for national defense, underwater communications, and fisheries. In terms of fishing, yellowfin tuna (Thunnus albacares) is one of the major targets of the oceanic tuna fishery worldwide; it moves mostly inside the mixed layer and occasionally below the upper boundary of the thermocline, and is greatly influenced by the temperature gradient. Many environmental factors also affect the catch rate of Thunnus albacares. Romena pointed out that the distribution of adult Thunnus albacares is affected by the 20 °C isotherm, and Song found that the vertical distribution of Thunnus albacares is related to the thermocline. Concerning underwater communication and military detection, underwater acoustic communication is currently the only means of communication in the ocean, but acoustic propagation in water is influenced by changes in temperature, salinity, and density. A sudden change in seawater structure in the thermocline area therefore directly affects sound transmission and can lead to sonar failure. Research on the thermocline is thus significant for studying the distribution of marine fishing grounds and for underwater communication and detection.
3.1 Thermocline determination
In the analysis of the thermocline, the thermocline must first be determined. Thermocline phenomena are usually described by three characteristic quantities: depth, intensity, and thickness. Hence, it is vital to determine the boundary of the thermocline and obtain these characteristic quantities. The traditional methods for determining the upper and lower boundaries of the thermocline are the vertical gradient method, the curvature extremum method, and the S-T method.
The traditional determination methods have some limitations. The vertical gradient method causes a discontinuity at the critical point between shallow water (less than 200 m in depth) and deep water (over 200 m in depth). With the maximum curvature point method, the temperature-depth curve is plotted from the standard layer data, and the depths of the upper and lower bounds of the thermocline are read off intuitively; however, when the curvature is insignificant or there are multiple thermoclines, this method makes data analysis difficult. The S-T method is mainly applicable to deep-water oceanic areas and is not suitable for areas strongly affected by solar radiation, precipitation, and diluted water; moreover, it can determine only the upper boundary of the thermocline. This paper therefore combines the “information entropy method” in machine learning with the traditional method for more precise determination. The relevant principles and computational process of the information entropy method are presented below.
The entropy values increase as the uncertainty of the variables increases.
Information gain: the higher the information gain, the higher the purity acquired by performing the division through the attribute “a.”
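For reference, the two notions above are commonly written as follows. This is the standard textbook form, given here as an assumption because the equations themselves do not appear in this text; the symbols $X$, $p_k$, $D$, $D^{v}$, and $V$ are introduced only for illustration.

```latex
H(X) = -\sum_{k} p_k \log_2 p_k ,
\qquad
\mathrm{Gain}(D, a) = H(D) - \sum_{v=1}^{V} \frac{|D^{v}|}{|D|}\, H(D^{v})
```

Here $p_k$ is the probability of the $k$th outcome of $X$, and $D^{v}$ is the subset of the sample set $D$ that takes the $v$th value of attribute $a$. A uniform distribution maximizes $H$, which matches the statement that entropy grows with uncertainty.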
Select n samples and m attributes; x_ij is the value of the jth attribute of the ith sample (i = 1, 2, …, n; j = 1, 2, …, m).
Normalize the indices so that heterogeneous indices become homogeneous and comparable.
Calculate the proportion of the ith sample in the jth attribute.
Calculate the entropy value of the jth attribute.
Calculate the information gain.
Calculate the weights of each index.
Calculate the comprehensive score of each sample.
Here, s is the comprehensive score of each sample with respect to forming a thermocline, and w is the degree of importance of each attribute for the formation of the thermocline.
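The steps above can be sketched in a few lines of Python. This is a minimal illustration that assumes the common entropy weight formulation (min-max normalization, p_ij = x'_ij / Σ_i x'_ij, e_j = −(1/ln n) Σ_i p_ij ln p_ij, g_j = 1 − e_j, w_j = g_j / Σ_j g_j, s_i = Σ_j w_j p_ij); the paper's exact formulas are not reproduced in this text, and the function name `entropy_weight_scores` is hypothetical.

```python
import math


def entropy_weight_scores(samples):
    """Entropy weight method (assumed standard form, not the paper's code).

    samples: list of n rows, each holding m numeric attribute values.
    Returns (weights, scores): one weight per attribute, one score per sample.
    """
    n, m = len(samples), len(samples[0])
    proportions = []  # proportions[j][i] = p_ij
    gains = []        # gains[j] = 1 - e_j
    for col in zip(*samples):
        lo, hi = min(col), max(col)
        span = hi - lo
        # Min-max normalization so attributes with different units are comparable.
        norm = [(v - lo) / span if span > 0 else 0.0 for v in col]
        total = sum(norm)
        # A constant attribute carries no information: treat it as uniform.
        p = [v / total for v in norm] if total > 0 else [1.0 / n] * n
        # Entropy of the attribute, with 0 * ln(0) taken as 0.
        e = -sum(v * math.log(v) for v in p if v > 0) / math.log(n)
        proportions.append(p)
        gains.append(max(0.0, 1.0 - e))
    gain_sum = sum(gains)
    weights = [g / gain_sum for g in gains] if gain_sum > 0 else [1.0 / m] * m
    scores = [sum(weights[j] * proportions[j][i] for j in range(m)) for i in range(n)]
    return weights, scores
```

An attribute whose values barely differ between samples receives a weight close to zero, so the score is driven by the attributes that actually discriminate between thermocline and non-thermocline samples.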
Because the “information entropy method” combined with the traditional method compensates for the shortcoming of considering only the strength, the thermocline can be determined more comprehensively and accurately, and the lateral boundary of the three-dimensional thermocline can then be determined.
3.2 SVR algorithm and prediction evaluation
3.2.1 Principles of support vector regression
Here, Q_ij = K(x_i, x_j) ≡ ϕ(x_i)^T ϕ(x_j) is the kernel function.
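To make the role of the kernel concrete, the following sketch evaluates an RBF kernel and the usual kernel-expansion form of an SVR prediction, f(x) = Σ_i β_i K(x_i, x) + b. It assumes the common definition K(x, z) = exp(−γ‖x − z‖²); the names `rbf_kernel` and `svr_predict`, and the parameters `gamma`, `dual_coefs`, and `bias`, are illustrative and would come from a trained model rather than from this paper.

```python
import math


def rbf_kernel(x, z, gamma):
    """RBF kernel K(x, z) = exp(-gamma * ||x - z||^2) (assumed standard form)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def svr_predict(x, support_vectors, dual_coefs, bias, gamma):
    """Evaluate f(x) = sum_i beta_i * K(x_i, x) + b for a trained SVR.

    support_vectors, dual_coefs and bias are hypothetical values that a
    solver would produce; they are not taken from the paper.
    """
    return bias + sum(
        beta * rbf_kernel(sv, x, gamma)
        for sv, beta in zip(support_vectors, dual_coefs)
    )
```

Only the support vectors contribute to the sum, and each contribution decays with distance from the query point at a rate set by `gamma`.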
3.2.2 Model evaluation
The precision, recall, and F1-measure are adopted to evaluate the prediction results of the algorithm, and TP, FP, TN, and FN in the formulas are defined as follows.
TP, positive samples predicted as positive; FP, negative samples predicted as positive; TN, negative samples predicted as negative; FN, positive samples predicted as negative.
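The three measures can be computed from these counts as in the sketch below. It assumes the usual definitions, precision = TP / (TP + FP), recall = TP / (TP + FN), and F1 = 2 · precision · recall / (precision + recall), with labels encoded as 1 (thermocline) and 0 (no thermocline); the function names and the label encoding are illustrative assumptions.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (1 = positive, 0 = negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1); a ratio with a zero denominator is 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

TN does not enter any of the three measures, which is why they remain informative when non-thermocline samples greatly outnumber thermocline samples.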
In this section, the experimental method is validated, and the lateral boundary of the thermocline is determined by combining this method with the information entropy method. The variation trend of the lateral boundary of the thermocline is predicted as well. The experiments use Ubuntu Linux 16.04, Python 3.6, and scikit-learn 0.19.2 as the test platform.
4.1 Data selection
4.2 Method validation
The evaluation for prediction results
As seen from the above table, the accuracy rate exceeds 0.5, especially in winter and summer when the thermocline is pronounced, and both the recall rate and the F1 value are higher as well. It can therefore be concluded that the SVR method used in this paper can accurately predict the variation trend of the thermocline’s lateral boundary.
4.3 Data preprocessing
4.3.1 Data refinement
High-resolution marine temperature and depth data are needed for more accurate thermocline determination and trend prediction, but the resolution of the gridded data from BOA Argo is currently far below our requirements. This paper uses the SVR method to refine the BOA Argo data and eventually obtain high-resolution data of 0.01° × 0.01° × 5 m.
Figure 4 shows the temperature distribution of the real data on the left and that of the high-resolution data on the right. The overall trend of the temperature distribution in the real data is consistent with that in the high-resolution data, but the latter is more refined. Specifically, the regional boundaries of temperature change are more distinct, the area with the temperature jump in the upper right part of the figure shows more prominent temperature gradients, and the temperature transition across boundaries is resolved more precisely. The refined high-resolution data are therefore easier to observe and analyze and have considerable research value.
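To give a sense of the target grid, the sketch below builds the coordinate axes at the stated resolution for the study area (10°–25° S, 55°–80° E) and the BOA Argo depth range (10–1950 m). It only constructs the query coordinates on which a fitted SVR model would be evaluated and does not perform the regression itself; the helper name `refined_axis` is hypothetical.

```python
def refined_axis(start, stop, step):
    """Evenly spaced coordinates from start to stop inclusive.

    Positions are computed from an integer index to avoid the drift that
    repeated floating-point addition would cause.
    """
    count = round((stop - start) / step)
    return [round(start + i * step, 6) for i in range(count + 1)]


# Latitude is given as degrees south, longitude as degrees east, depth in metres.
lat_axis = refined_axis(10.0, 25.0, 0.01)
lon_axis = refined_axis(55.0, 80.0, 0.01)
depth_axis = refined_axis(10.0, 1950.0, 5.0)
grid_points = len(lat_axis) * len(lon_axis) * len(depth_axis)
```

The resulting point count illustrates why refinement at this resolution is a data-volume problem as much as a modelling one.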
4.3.2 Determination of thermocline
Where l is the temperature strength, t is the temperature, d is the depth, and n is the layer number.
It is assumed that the thermocline does not exist at the surface, so the first layer can be ignored without affecting the results.
According to the judgment criterion for temperature strength, samples with l > 0.2 are selected and written to the data.txt file, which is then used for merging and selecting thermoclines.
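The filtering step can be illustrated as follows. Because the strength formula is not reproduced in this text, the sketch assumes the common vertical-gradient form l_n = |t_n − t_(n−1)| / (d_n − d_(n−1)) in °C per metre, evaluated from the second layer downward; the function names and the sample profile are hypothetical.

```python
def layer_strengths(temps, depths):
    """Temperature strength between consecutive layers (assumed gradient form).

    temps and depths are listed from the shallowest layer downward; the
    first layer has no layer above it and therefore gets no strength value.
    """
    strengths = []
    for n in range(1, len(temps)):
        thickness = depths[n] - depths[n - 1]
        if thickness <= 0:
            raise ValueError("depths must be strictly increasing")
        strengths.append(abs(temps[n] - temps[n - 1]) / thickness)
    return strengths


def candidate_layers(temps, depths, threshold=0.2):
    """Return (depth, strength) pairs whose strength exceeds the threshold."""
    return [
        (depths[n + 1], strength)
        for n, strength in enumerate(layer_strengths(temps, depths))
        if strength > threshold
    ]
```

For a profile such as temperatures 28.0, 27.5, 25.0, 24.5 °C at depths 5, 10, 15, 20 m, only the 15 m layer passes the 0.2 criterion; such pairs are the kind of records that would be written out for the later merging step.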
Based on these factors combined with the high-resolution data, the existence of a thermocline in the area can be accurately judged. Then, the position of the thermocline’s lateral boundary can be accurately determined according to the determination of the critical position of thermocline and non-thermocline area.
4.4 Trend prediction for the lateral boundary of thermocline
Figure 5 shows that the temperature distribution predicted by the SVR method is very similar to that of the high-resolution data and is consistent with the overall temperature distribution trend of the real data. The temperature gradient boundary and distribution can be obtained accurately from the prediction results, which shows that the SVR method is effective for forecasting the temperature boundary in the horizontal direction. In Fig. 6, the temperature curves of the real data, the high-resolution data, and the prediction results follow basically the same trend; the predicted temperature is slightly lower than the true value only in shallow water. The SVR method can therefore also yield high-precision temperature predictions in the depth direction. Comparing the temperature distribution in the horizontal direction and the temperature variation curve in the depth direction shows that the SVR method can be used to predict ocean temperature variation and, in turn, the variation trend of the three-dimensional thermocline’s lateral boundary.
Figure 7 shows that the lateral boundary of the thermocline varies little from 2016 to 2019 and that, on the whole, the temperature near the equator is clearly higher than that farther from it. The highest temperature appears at approximately 10°–15° S, 70°–80° E, and the lowest at approximately 20°–22.5° S, 75°–80° E. In this sea area, the maximum temperature reaches 29 °C in 2018, and the maximum temperature in 2019 is 28.7 °C, basically the same as in 2016 and 2017. From 2016 to 2019, the extent of the high-temperature area shows a decreasing trend, and the temperature gradient in the upper right part of the figure also decreases. The prediction results thus show that the temperature distribution changes from year to year, which can provide a reference for further research on the factors affecting temperature change.
In this paper, the SVR method in machine learning is applied to predict the variation trend of the three-dimensional thermocline’s lateral boundary in the study area (10°–25° S, 55°–80° E). The paper first uses the original ocean temperature and depth data to make a prediction for 2016 with the SVR method and compares it with the real data of 2016 to verify the feasibility of the method. The temperature and depth data of the sea area are then refined with the SVR method into high-resolution data (horizontal resolution of 0.01° and vertical resolution of 5 m); changes in the refined high-resolution temperature distribution are easier to observe and analyze. Finally, the “information entropy method” in machine learning is combined with the traditional judgment method to determine the lateral boundary of the thermocline, and the SVR model is used to analyze the variation trend of the three-dimensional thermocline’s lateral boundary from 2017 to 2019. The results show that the SVR method can predict the variation trend of the three-dimensional thermocline.
For future ocean data refinement, the SVR method will be adopted to refine the temperature, salinity, and depth data at a higher resolution, and the amount of refined data will grow exponentially. In thermocline determination, we will study the three-dimensional boundary of the thermocline in order to judge it more accurately, and we plan to propose the concept of a three-dimensional “temperature jump body.” The study of the three-dimensional “temperature jump body” will provide a more precise thermocline location for research on marine fishing ground distribution and acoustic communication. In thermocline prediction, we will further consider external environmental factors, climate change, anthropogenic effects, and other conditions simultaneously to obtain more accurate prediction results.
The Argo data were collected and made freely available by the International Argo Program and the national programs that contribute to it (http://www.argo.ucsd.edu/, http://argo.jcommops.org/). The Argo Program is part of the Global Ocean Observing System. The BOA Argo data were obtained from the China Argo real-time data center (http://www.argo.org.cn/).
The work was supported by the Zhejiang Provincial Natural Science Foundation of China (No.LY14F020044), the National Natural Science Foundation of China (U1713205, 51409117, 51809112), and the Stable Supporting Fund of Science and Technology on Underwater Vehicle Laboratory (SXJQR2018WDKT04).
Availability of data and materials
Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.
HQ and ZD conceived and designed the experiments; YJ and ZD performed the experiments; YJ and CW analyzed the data; HQ and CW wrote the paper. All authors read and approved the final manuscript.
The authors declare that they have no competing interest.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
- D. Yuan, Z. Zhang, P.C. Chu, et al., Geostrophic circulation in the tropical North Pacific Ocean based on Argo profiles. J. Phys. Oceanogr. 44(2), 558–575 (2014)
- X.F. Wu, Q.L. Zhang, Z.H. Liu, Annual and interannual variations of the Western Pacific Warm Pool volume and sources of warm water revealed by Argo data. Sci. China (Series D) 57(9), 2269–2280 (2014)
- L. Cheng, J. Zhu, Uncertainties of the ocean heat content estimation induced by insufficient vertical resolution of historical ocean subsurface observations. J. Atmos. Ocean. Technol. 31(6), 1383–1396 (2014)
- C.H. Sun, Z.H. Liu, M.R. Tong, et al., The application of Argo data to water masses analysis in the Northwest Pacific Ocean. J. Mar. Sci. 10(2), 1–13 (2008)
- P.K. Bhaskaran, R.R. Kumar, R. Barman, et al., A new approach for deriving temperature and salinity fields in the Indian Ocean using artificial neural networks. J. Mar. Sci. Technol. 15(2), 160–175 (2010)
- B. Gu, V.S. Sheng, A robust regularization path algorithm for v-support vector classification. IEEE Trans. Neural. Netw. Learn. Syst. 28(5), 1241–1248 (2017)
- B. Gu, V.S. Sheng, K.Y. Tay, et al., Incremental support vector learning for ordinal regression. IEEE Trans. Neural. Netw. Learn. Syst. 26(7), 1403–1416 (2015)
- Argo Science Team, On the design and implementation of Argo: an initial plan for a global array of profiling floats. ICPO Report No. 21. GODAE International Project Office, Bureau of Meteorology (1998)
- X. Jianping, An exploration of global ocean Argo observing (Ocean Press, Beijing, 2002)
- S. Dong, J. Sprintall, S.T. Gille, et al., Southern Ocean mixed layer depth from ARGO float profiles. J. Geophys. Res. Oceans 113, C06013 (2008). https://doi.org/10.1029/2006JC004051
- R.E. Hadfield, N.C. Wells, S.A. Josey, et al., On the accuracy of North Atlantic temperature and heat storage fields from Argo. J. Geophys. Res. Oceans 112(C1), C01009 (2007)
- S. Guinehut, P.Y.L. Traon, G. Larnicol, et al., Combining Argo and remote-sensing data to estimate the ocean three-dimensional temperature fields: a first approach based on simulated observations. J. Mar. Syst. 46(1–4), 85–98 (2004)
- Y.D. Resnyanskii, M.D. Tsyrulnikov, B.S. Strukov, et al., Statistical structure of spatial variability of the ocean thermohaline fields from Argo profiling data, 2005–2007. Oceanology 50(2), 149–165 (2010)
- G. Maze, H. Mercier, R. Fablet, et al., Coherent heat patterns revealed by unsupervised classification of Argo temperature profiles in the North Atlantic Ocean. Prog. Oceanogr. 151, 275–292 (2017)
- T.U. Bhaskar, M. Ravichandran, R. Devender, An operational objective analysis system at INCOIS for generation of Argo value added products. Indian National Center for Ocean Information Services (INCOIS) (2007). http://www.incois.gov.in/documents/TechnicalReports/INCOIS-MOG-ARGO-TR-04-2007.pdf
- H. Shigeki, O. Tsuyoshi, N. Tomoaki, A monthly mean dataset of global oceanic temperature and salinity derived from Argo float observations. Jamstec. Rep. Res. Dev. 8, 47–59 (2008)
- D. Roemmich, J. Gilson, The 2004–2008 mean and annual cycle of temperature, salinity, and steric height in the global ocean from the Argo program. Prog. Oceanogr. 82(2), 81–100 (2009)
- F. Gaillard, E. Autret, V. Thierry, et al., Quality control of large Argo datasets. J. Atmos. Ocean. Technol. 26(2), 337–351 (2010)
- P. Joseph, A history of thermocline theory, in Physical Oceanography (Springer, New York, 2006), pp. 139–152
- X.H. Zhang, X.D. Zhang, L.I. Yan, Characteristic analysis and calculation of the ocean thermocline. Marine Forecasts 28(5), 69–76 (2011)
- D. Laurent, K. Holland, et al., Deep diving behavior observed in yellowfin tuna (Thunnus albacares). 19(1), 85–88 (2006)
- N.A. Romena, Factors affecting distribution of adult yellowfin tuna (Thunnus albacares) and its reproductive ecology in the Indian Ocean based on Japanese tuna longline fisheries and survey information. Minerva. Pediatr. 29(34), 2027–2030 (2000)
- L.M. Song, Y. Zhang, L.X. Xu, et al., Environmental preferences of longlining for yellowfin tuna (Thunnus albacares) in the tropical high seas of the Indian Ocean. Fish. Oceanogr. 17(4), 239–253 (2008)
- Z. Xu, Y.G. Zhang, Simulation for acoustic channel influenced by shallow water thermocline. J. Syst. Simul. 24(10), 2167–2166 (2012)
- B. Jiang, X.R. Wu, J. Ding, Comparison of the calculation methods of the thermocline depth of the South China Sea. Mar. Sci. Bull. 35(1), 64–73 (2016)
- Y. Jiang, Y. Gou, T. Zhang, et al., A machine learning approach to Argo data analysis in a thermocline. Sensors 17(10), 2225 (2017)
- S. Yang, Y. Zhang, H. Zhang, et al., The relationship between the temporal-spatial distribution of fishing ground of yellowfin tuna (Thunnus albacares) and thermocline characteristics in the tropic Indian Ocean. Acta. Ecol. Sin. 32(3), 671–679 (2012)
- M.C. Gregg, Diapycnal mixing in the thermocline: a review. J. Geophys. Res. Oceans 92(C5), 5249–5286 (1987)