### Progress in research

Image recognition is being applied ever more widely, and the underlying recognition technology has developed rapidly in recent years. Driven by embedded applications and combined with internet communication technology, image recognition has demonstrated clear practical advantages and has changed everyday life by making it more convenient. Recognition technology has progressed from simple to complex: it began with text recognition, later developed into image processing, and is ultimately moving toward object recognition [12]. Digitizing image information facilitates image compression and transmission, so that images are less prone to distortion during transmission and remain stable. Object recognition marks the transition to the perception of 3D objects by computers or artificially intelligent robots. In recent years, some basic problems have gradually been exposed, one of which is the poor ability to adapt to image transformations [13]. The objects to be recognized may be affected by strong environmental noise, so that part of the image information is obscured and relatively few features can be extracted. The basic research direction of this paper is the processing of images with incomplete 3D information. Optimizing and improving the algorithm reduces the running time and enhances the reliability and usability of the program [14]. Traditional approaches include pattern recognition methods, neural network methods, and so on.

### Image preprocessing technology

People can observe the movement and characteristics of objects through a camera. However, interference from human and environmental factors may degrade image quality, so that the acquired image is unclear; this is the basic characteristic of incomplete three-dimensional information. In this paper, the discriminant features of incomplete 3D information are studied, and image preprocessing is carried out at the same time. Vision and hearing are the most common human sensory modalities, and vision is the one most used for recognizing and applying image information [15]. Image quality may change during acquisition and transmission, owing either to the imaging equipment or to the acquisition method. From the point of view of the target image, this change appears as unpredictable image noise. Image noise is generally divided into several kinds, including electromagnetic interference, sensor noise, filtering noise, and so on [16].

In image preprocessing, the basic image information is acquired by a sensor. Because of the imaging equipment and environmental factors, this information is always affected by the imaging sensor, and the original image to be recognized is not ideal. This discrepancy introduces errors into the recognition process. Image preprocessing therefore occupies an important position in image processing, and the purpose of image enhancement is to obtain images with better definition and better visual effect that are convenient for computer processing. To eliminate the noise that interferes with the image, the image should be enhanced; a common technique is based on the convolution property in the frequency domain. The basic transformation expression of image enhancement is [17]

$$ g\left(x,y\right)=h\left(x,y\right)\ast f\left(x,y\right) $$

(1)

where \( \ast \) denotes the convolution of the two images. Because an image is made up of individual pixels, operating directly on the pixels is the main means of image enhancement in the spatial domain. Image enhancement methods are mainly divided into global operations, neighborhood operations, and point operations. In addition, enhancement techniques include histogram modification, gray-level transformation, and color processing [18]. In histogram modification, the histogram of the original image is computed first. A uniform histogram is then obtained by a function transformation. Finally, the equalized image, which is clearer than the original, is obtained. The histogram can further be specified and matched, so as to form a histogram of a predetermined shape and highlight the gray levels of interest in the image [19].
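As a rough illustration of the histogram equalization step described above, the following NumPy sketch maps gray levels through the normalized cumulative histogram. The function name `equalize_histogram` and the restriction to 8-bit grayscale input are illustrative assumptions, not details taken from the method in this paper.

```python
import numpy as np

def equalize_histogram(image: np.ndarray) -> np.ndarray:
    """Equalize a grayscale image (assumed to be a uint8 array)."""
    # Histogram of the original image over the 256 gray levels.
    hist = np.bincount(image.ravel(), minlength=256)
    # Cumulative histogram: the function transformation that flattens it.
    cdf = np.cumsum(hist).astype(np.float64)
    cdf_min = cdf[cdf > 0][0]   # count of the darkest gray level present
    total = cdf[-1]             # total number of pixels
    if total == cdf_min:
        return image.copy()     # constant image: nothing to equalize
    # Lookup table stretching the cumulative histogram to [0, 255].
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[image]
```

After equalization the darkest gray level present maps to 0 and the brightest to 255, which is why low-contrast images tend to look clearer.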

For noisy images in which the gray level of the noise differs from that of the surrounding pixels, the image needs to be smoothed. The most commonly used method is mean filtering, as shown in Fig. 1.

Assuming that the noisy image is *f*(*x*, *y*), smoothing averages the image uniformly over a neighborhood, and the result is calculated as follows [20]:

$$ g\left(x,y\right)=\frac{1}{M}\sum \limits_{\left(i,j\right)\in S}f\left(i,j\right)=\frac{1}{M}\sum \limits_{\left(i,j\right)\in S}{f}^{\prime}\left(i,j\right)+\frac{1}{M}\sum \limits_{\left(i,j\right)\in S}n\left(i,j\right) $$

(2)

where *f*′(*x*, *y*) is the noise-free image, *n*(*x*, *y*) is the noise, *S* is the set of points in the neighborhood of (*x*, *y*), and *M* is the total number of points in *S*. Examples of smoothing filtering are shown in Fig. 2.
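The mean filter of Eq. (2) can be sketched in a few lines of NumPy. This is a minimal illustration, assuming a square neighborhood *S* of odd side length and replicated borders; the function name `mean_filter` and the border handling are assumptions, not choices stated in the paper.

```python
import numpy as np

def mean_filter(image: np.ndarray, size: int = 3) -> np.ndarray:
    """Replace each pixel of a 2D image by the mean of its size x size neighborhood."""
    if size % 2 == 0:
        raise ValueError("size must be odd so the window has a center pixel")
    pad = size // 2
    # Replicate border pixels so that every pixel has a full neighborhood.
    padded = np.pad(image.astype(np.float64), pad, mode="edge")
    rows, cols = image.shape
    out = np.zeros((rows, cols), dtype=np.float64)
    # Sum the shifted copies of the image, one per offset in the window S.
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + rows, dx:dx + cols]
    return out / (size * size)  # divide by M, the number of points in S
```

For example, a single bright pixel of value 9 on a dark background is spread into a 3 × 3 patch of value 1, which shows both the noise suppression and the blurring that mean filtering introduces.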

Assuming that the noise is additive, has zero mean and variance *σ*² at each point, and is uncorrelated with the signal and between points, the variance of the noise after smoothing is [21]

$$ D\left\{\frac{1}{M}\sum \limits_{\left(i,j\right)\in S}n\left(i,j\right)\right\}=\frac{1}{M^2}\sum \limits_{\left(i,j\right)\in S}D\left\{n\left(i,j\right)\right\}=\frac{1}{M}{\sigma}^2 $$

(3)
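The result of Eq. (3), that averaging *M* noise samples reduces the noise variance from *σ*² to *σ*²/*M*, can be checked numerically. The sketch below is only an illustration: it assumes independent Gaussian noise, and the function name, trial count, and seed are arbitrary choices.

```python
import numpy as np

def averaged_noise_variance(sigma: float, m: int,
                            trials: int = 200_000, seed: int = 0) -> float:
    """Estimate the variance of the mean of m independent zero-mean noise samples."""
    rng = np.random.default_rng(seed)
    # Each row holds the m noise values falling inside one neighborhood S.
    noise = rng.normal(0.0, sigma, size=(trials, m))
    # Average over each neighborhood, then measure the spread of those averages.
    return float(noise.mean(axis=1).var())
```

With `sigma = 2.0` and `m = 9` (a 3 × 3 window), the estimate should be close to 4/9, in line with Eq. (3).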

### Image interpolation and geometric change model

In visual information research, computers cannot recognize images as intelligently as the human eye, so an algorithm is needed when a computer is used to identify two or more images of the same scene obtained under different shooting conditions. This algorithm is image matching. Image matching establishes a relationship in geometric space and gray intensity between the reference image and the image to be matched. Solving the spatial geometric transformation between the images is an important step in image matching.

In image matching, there are two main kinds of spatial geometric transformation model: global and local. A global model applies the same transformation to the whole image, that is, a single transformation function relates the two images. The global model is currently the one most often used in image matching. A local model decomposes the image into several small blocks, each with its own transformation function, so that the overall transformation is represented by multiple transformation functions. Because the local model is more complicated, it is currently used less widely.

Image interpolation produces fuller results and higher image resolution, and the feature information obtained by the computer is more accurate. Therefore, interpolation is often used. Image interpolation refers to computing unknown data points from known data points. The commonly used methods are nearest-neighbor interpolation, bilinear interpolation, and trilinear interpolation. Nearest-neighbor interpolation is the simplest of the three: the gray value of the output pixel is taken from the nearest input pixel, and its expression is [22]:

$$ f\left(x,y\right)=g\left(x,y\right) $$

(4)

$$ y=\left[v+0.5\right] $$

(5)

$$ x=\left[u+0.5\right] $$

(6)
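A minimal sketch of Eqs. (4) to (6) follows. It assumes that the square brackets denote the integer part (floor), that (*u*, *v*) are continuous column and row coordinates, and that out-of-range coordinates are clamped to the image border; the function name `nearest_neighbor_sample` and the clamping are illustrative assumptions.

```python
import numpy as np

def nearest_neighbor_sample(image: np.ndarray, u: float, v: float):
    """Return the gray value of the pixel nearest to the continuous point (u, v)."""
    # Eqs. (5) and (6): round to the nearest integer pixel coordinates.
    x = int(np.floor(u + 0.5))
    y = int(np.floor(v + 0.5))
    # Clamp to the image so that border queries stay valid (an assumption).
    x = min(max(x, 0), image.shape[1] - 1)
    y = min(max(y, 0), image.shape[0] - 1)
    # Eq. (4): copy the gray value of that pixel (row y, column x).
    return image[y, x]
```

Because the value is copied rather than blended, the method is fast but produces blocky edges when an image is enlarged.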

Bilinear interpolation performs linear interpolation of a pixel in the two directions *x* and *y*. The output is a weighted average of the gray values of the pixels in its nearest 2 × 2 neighborhood, as shown in Fig. 3. The formula for bilinear interpolation is [23]

$$ {\displaystyle \begin{array}{l}f\left(x,y\right)=\left[f\left(1,0\right)-f\left(0,0\right)\right]x+\left[f\left(0,1\right)-f\left(0,0\right)\right]y\\ {}+\left[f\left(1,1\right)+f\left(0,0\right)-f\left(0,1\right)-f\left(1,0\right)\right] xy+f\left(0,0\right)\end{array}} $$

(7)
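Equation (7) translates almost directly into code. In the sketch below, `bilinear_unit_square` evaluates the formula on the unit square, and `bilinear_sample` applies it to the 2 × 2 neighborhood of a continuous point in an image. Both names, the (column, row) coordinate convention, and the border clamping are assumptions made for illustration; the image is assumed to be at least 2 × 2.

```python
import numpy as np

def bilinear_unit_square(f00, f10, f01, f11, x, y):
    """Evaluate Eq. (7) for 0 <= x, y <= 1, where fxy is the value at corner (x, y)."""
    return ((f10 - f00) * x + (f01 - f00) * y
            + (f11 + f00 - f01 - f10) * x * y + f00)

def bilinear_sample(image: np.ndarray, u: float, v: float) -> float:
    """Interpolate a 2D image at column u and row v."""
    rows, cols = image.shape
    # Top-left corner of the 2 x 2 neighborhood, kept inside the image.
    x0 = min(max(int(np.floor(u)), 0), cols - 2)
    y0 = min(max(int(np.floor(v)), 0), rows - 2)
    # Fractional offsets inside the neighborhood, limited to [0, 1].
    dx = min(max(u - x0, 0.0), 1.0)
    dy = min(max(v - y0, 0.0), 1.0)
    return float(bilinear_unit_square(
        image[y0, x0], image[y0, x0 + 1],
        image[y0 + 1, x0], image[y0 + 1, x0 + 1], dx, dy))
```

At the four corners the formula returns the corner values themselves, and at the center of the square it returns their average.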

To obtain the transformation between the coordinates of a point in one coordinate system and its coordinates in another, rigid, affine, and projective transformations are often used. A rigid transformation moves the image as a rigid body, so that the relative position of any two points is unchanged. A rigid transformation is composed of a rotation and a translation, and the transformed coordinates of a point in space are given by [24]

$$ \left[\begin{array}{c}{x}^{\prime}\\ {y}^{\prime}\end{array}\right]=\left[\begin{array}{cc}\cos \theta & \pm \sin \theta \\ \sin \theta & \mp \cos \theta \end{array}\right]\left[\begin{array}{c}x\\ y\end{array}\right]+\left[\begin{array}{c}{t}_x\\ {t}_y\end{array}\right] $$

(8)

where *θ* is the rotation angle, and \( \left[\begin{array}{l}{t}_x\\ {}{t}_y\end{array}\right] \) is the translation vector of the rigid transformation.
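The rigid transformation of Eq. (8) can be sketched as follows. The code takes the sign choice that gives a pure counterclockwise rotation; the function name `rigid_transform` and the (N, 2) point layout are assumptions for illustration.

```python
import numpy as np

def rigid_transform(points, theta: float, t) -> np.ndarray:
    """Rotate (N, 2) points by theta radians, then translate them by t = (tx, ty)."""
    c, s = np.cos(theta), np.sin(theta)
    # Rotation matrix: the sign choice in Eq. (8) without reflection.
    rotation = np.array([[c, -s],
                         [s,  c]])
    pts = np.asarray(points, dtype=np.float64)
    # Row-vector form of x' = R x + t.
    return pts @ rotation.T + np.asarray(t, dtype=np.float64)
```

Because the rotation matrix is orthogonal, the distance between any two points is the same before and after the transformation, which is the defining property of a rigid motion.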

If parallel lines in the original image remain parallel after the transformation, the transformation is called an affine transformation. Its simplified matrix expression in two dimensions is as follows:

$$ \left[\begin{array}{c}{x}^{\prime}\\ {y}^{\prime}\end{array}\right]=\left[\begin{array}{cc}{m}_0& {m}_1\\ {m}_3& {m}_4\end{array}\right]\left[\begin{array}{c}x\\ y\end{array}\right]+\left[\begin{array}{c}{m}_2\\ {m}_5\end{array}\right] $$

(9)
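A minimal sketch of Eq. (9) follows, using the six parameters *m*_0 to *m*_5 in the same positions as in the equation. The function name `affine_transform` and the parameter packing as a 6-tuple are illustrative assumptions.

```python
import numpy as np

def affine_transform(points, m) -> np.ndarray:
    """Apply Eq. (9) to (N, 2) points; m = (m0, m1, m2, m3, m4, m5)."""
    m0, m1, m2, m3, m4, m5 = m
    linear = np.array([[m0, m1],
                       [m3, m4]], dtype=np.float64)  # 2 x 2 linear part
    offset = np.array([m2, m5], dtype=np.float64)    # translation part
    pts = np.asarray(points, dtype=np.float64)
    return pts @ linear.T + offset
```

Two parallel segments have direction vectors that differ only by a scalar factor; the linear part maps both directions with the same matrix, so the transformed segments stay parallel.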

A projective transformation of an image is expressed as a matrix multiplication, and the relationship between the two corresponding points is

$$ q={H}_pP=\left(\begin{array}{cc}A& t\\ {v}^T& u\end{array}\right)P $$

(10)

where *A* is a 2 × 2 matrix, *t* is a 2 × 1 vector, and *v* = [*v*_{1}, *v*_{2}]^{T}; together with the scalar *u*, they make up the 9 elements of the matrix *H*_{p}.
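The block structure of Eq. (10) can be illustrated with a short sketch. It assumes that *P* is a point in homogeneous coordinates (*x*, *y*, 1)^T and that the result is converted back to image coordinates by dividing by its third component; the function name `projective_transform` and its argument order are assumptions for illustration.

```python
import numpy as np

def projective_transform(points, a, t, v, u: float = 1.0) -> np.ndarray:
    """Apply a 3 x 3 projective matrix built from blocks A, t, v, u to (N, 2) points."""
    h = np.eye(3)
    h[:2, :2] = a   # 2 x 2 block A
    h[:2, 2] = t    # 2 x 1 block t
    h[2, :2] = v    # 1 x 2 block v^T
    h[2, 2] = u     # scalar u
    pts = np.asarray(points, dtype=np.float64)
    # Append 1 to each point to form homogeneous coordinates P.
    homogeneous = np.hstack([pts, np.ones((pts.shape[0], 1))])
    q = homogeneous @ h.T
    # Divide by the third component (assumed nonzero) to return to 2D.
    return q[:, :2] / q[:, 2:3]
```

When *v* is the zero vector and *u* = 1, the division has no effect and the mapping reduces to an affine transformation; a nonzero *v* is what produces perspective effects.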

The matrix expression of the similarity transformation is

$$ q={H}_sP=\left(\begin{array}{cc} sR& t\\ {0}^T& 1\end{array}\right)P $$

(11)

where *s* is the scale factor, and *R* is a 2 × 2 orthogonal matrix.
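As a sketch of Eq. (11), the code below builds the 3 × 3 similarity matrix from a scale, a rotation angle, and a translation. It assumes that *R* is a pure rotation by an angle `theta` and that *P* is in homogeneous coordinates; the function name `similarity_transform` and its parameters are illustrative.

```python
import numpy as np

def similarity_transform(points, scale: float, theta: float, t) -> np.ndarray:
    """Apply the similarity matrix [[sR, t], [0^T, 1]] to (N, 2) points."""
    c, sn = np.cos(theta), np.sin(theta)
    rotation = np.array([[c, -sn],
                         [sn,  c]])      # orthogonal 2 x 2 matrix R
    h = np.eye(3)
    h[:2, :2] = scale * rotation         # upper-left block sR
    h[:2, 2] = t                         # translation t
    pts = np.asarray(points, dtype=np.float64)
    homogeneous = np.hstack([pts, np.ones((pts.shape[0], 1))])
    # The last row of h is (0, 0, 1), so the third component stays 1.
    return (homogeneous @ h.T)[:, :2]
```

All distances are multiplied by the same factor, so shapes and angles are preserved while the overall size changes.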