

        A Fast Image Matching Algorithm Based on Yolov3

        2021-11-23 08:59:26


        College of Astronautics,Nanjing University of Aeronautics and Astronautics,Nanjing 211106,P.R.China

        Abstract: The traditional Hausdorff image matching algorithm is highly sensitive to image size, and its real-time performance in practical applications is unsatisfactory. To address this, an image matching algorithm combining Yolov3 is proposed. Firstly, the features of the reference image are selected for pretraining, and the training results are then used to extract the features of the real images, whose feature-area center coordinates complete the coarse matching. Finally, the Hausdorff algorithm completes the fine image matching. Experiments show that the proposed algorithm significantly improves both the speed and the accuracy of image matching, and that it is robust to rotation changes.

        Key words: Yolov3; image matching; Hausdorff; two-stage matching

        0 Introduction

        Image matching navigation technology is an autonomous navigation method based on image matching for the precise navigation of carriers. It has broad application prospects in the fields of precision attack, unmanned aerial vehicle (UAV) navigation, deep space exploration, etc. However, autonomous navigation systems place special demands on the real-time performance, accuracy and robustness of the image matching algorithm[1].

        Image matching algorithms include grayscale matching and feature matching, among which feature matching algorithms are more suitable for autonomous navigation systems. In terms of feature-based image matching, domestic and foreign scholars have proposed many algorithms, such as the image matching algorithm based on the scale-invariant feature transform (SIFT) proposed by Lowe in 1999[2]. SIFT solves the scale and rotation problems, but its feature operators seriously slow down the speed of operation. Bay et al.[3] proposed the speeded up robust feature (SURF) algorithm in 2006. The feature descriptor constructed by SURF differs from SIFT's, but the overall idea is similar; the computing speed of SURF is about three times that of SIFT. In 2011, Rublee et al.[4] proposed a fast feature point extraction and description algorithm called oriented FAST and rotated BRIEF (ORB), which greatly improves the running speed. In 2017, Zahra et al.[5] proposed a method to eliminate redundant key points, addressing the problem that feature points extracted by SIFT lie too close to each other, so as to improve the running efficiency of the algorithm. The above algorithms are sensitive to the image size, which decreases the matching speed. The navigation task places relatively high real-time requirements on the matching algorithm, and a huge amount of data computation will seriously slow down the image matching process. Even the fast ORB algorithm can hardly meet the navigation requirements at a huge magnitude of computation[6].

        In 2012, image recognition was realized for the first time through the convolutional neural network (CNN), which promoted the development of deep learning in the field of images. Since then, various CNN frameworks have been extensively studied and applied[7]. Compared with artificial feature operators, feature extraction and target detection algorithms based on deep learning can extract image information more directly and stably, and reduce the number of features[8]. In 2014, Ross et al.[9] proposed the region-CNN (R-CNN) network. By extracting candidate regions and classifying the corresponding regions, it performed well in image recognition. In recent years, a variety of algorithms based on R-CNN have therefore been proposed, such as fast R-CNN, faster R-CNN, and mask R-CNN[10-11]. Yolo (You only look once), a target detection system based on a single neural network, was proposed by Joseph et al. in 2016[12], which further improved the accuracy and speed of detection. The author later proposed Yolo9000 and Yolov3[13-14], which improved the detection speed further. MatchNet, proposed by Han et al.[15], and Deep Compare, proposed by Zagoruyko et al.[16], judge whether images match through deep-learning-based similarity prediction. Fan et al.[17] proposed a spatial-scale double-channel deep CNN method and Tian et al.[18] proposed L2-Net, both of which match the tags of feature areas through deep learning. However, the positioning accuracy of these algorithms does not reach the pixel level, so it is difficult to satisfy the accuracy of navigation.

        In view of the special requirements of navigation systems for the accuracy and real-time capability of image matching algorithms, this paper divides image matching into two steps based on the idea of two-stage matching: coarse matching and fine matching. Firstly, on the strength of Yolov3, we detect the feature area with the target recognition model, which effectively avoids the operation delay caused by an oversized image and improves the real-time performance of the image matching algorithm. At the same time, it effectively removes the influence of the image background on matching and improves the matching accuracy. Then, we apply the partial Hausdorff distance (PHD) image matching algorithm for matching position correction, which avoids the error in the feature-area coordinates determined by deep learning and improves the accuracy of the image matching algorithm.

        1 Coarse Matching Based on Deep Learning

        1.1 Yolov3 network structure

        Yolov3 extracts feature information from the input image with its feature extraction network and obtains feature information at different scales. The first part of the Yolov3 network is Darknet-53, spanning layer 0 to layer 74, of which 53 layers are convolutional. The network extracts input image features with 3×3 and 1×1 convolution kernels; the rest of the network consists of residual connections, which make the model easier to optimize and better able to access deep information[19]. The Yolov3 multi-scale detection network is shown in Fig.1.

        Fig.1 Yolov3 multi-scale detection

        From the 75th layer to the 105th layer, the feature information interaction network of Yolov3 is equivalent to the fully connected layers of a traditional neural network. The network is divided into three scales. The minimum input scale Yolo layer of Yolov3 takes a 13×13 feature image containing 1 024 channels. The input of the mesoscale Yolo layer is a 26×26 feature image containing 512 channels. The maximum input scale Yolo layer takes a 52×52 feature image containing 256 channels. Local feature interaction is achieved by convolution within each scale. The convolution operation does not change the scale of the feature image, but it reduces the number of feature image channels to n, where the size of n is determined by the classification type. Finally, the output 13×13×n feature images are used for regression and classification.
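        The channel count n above follows directly from the prediction layout: each grid cell predicts three boxes, and each box carries four coordinates, one objectness score and one confidence score per class. A minimal illustrative computation (the function name is ours, not part of Yolov3):

```python
def yolo_output_channels(num_classes, boxes_per_cell=3):
    # Each predicted box carries 4 coordinates (tx, ty, tw, th),
    # 1 objectness score, and one confidence score per class.
    return boxes_per_cell * (4 + 1 + num_classes)

# The 80-class COCO dataset yields the familiar 255-channel Yolo head.
print(yolo_output_channels(80))  # 255
# A single-label matching task needs only 18 output channels.
print(yolo_output_channels(1))   # 18
```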

        1.2 Prediction of bounding box

        Yolov3 uses K-means clustering to obtain the sizes of the prior frames, setting up three prior frames for each of the three down-sampling scales, so nine kinds of prior frames are clustered in total. In this algorithm, the prior frames provided by the COCO dataset are (10×13), (16×30), (33×23), (30×61), (62×45), (59×119), (116×90), (156×198) and (373×326). The minimum 13×13 feature map is assigned the largest prior frames (116×90), (156×198), (373×326), which are suitable for detecting larger objects. The middle 26×26 feature map is assigned the median prior frames (30×61), (62×45), (59×119), which are suitable for detecting medium-sized objects. The maximum 52×52 feature map is assigned the smallest prior frames (10×13), (16×30), (33×23), which are suitable for detecting smaller objects.
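        The clustering of box sizes into prior frames can be sketched as follows. Yolov3 itself clusters the training labels with an IOU-based distance; this illustrative version uses plain Euclidean K-means on (width, height) pairs to keep the sketch short, and all names are ours:

```python
import numpy as np

def kmeans_priors(box_wh, k, iters=50, seed=0):
    """Cluster (w, h) box sizes into k prior frames. Yolov3 uses an
    IOU-based distance; Euclidean distance is used here for brevity."""
    rng = np.random.default_rng(seed)
    centers = box_wh[rng.choice(len(box_wh), k, replace=False)]
    for _ in range(iters):
        # Assign each box to its nearest cluster center.
        d = np.linalg.norm(box_wh[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Move each center to the mean of its assigned boxes.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = box_wh[labels == j].mean(axis=0)
    return centers[np.argsort(centers.prod(axis=1))]  # sort priors by area

# Toy example with two clearly separated size clusters.
boxes = np.array([[10.0, 12.0], [11.0, 14.0], [100.0, 90.0], [110.0, 95.0]])
print(kmeans_priors(boxes, k=2))
```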

        The algorithm uses the intersection over union (IOU) to measure the accuracy of detecting corresponding objects in a specific dataset. When the statistical IOU is below 0.5, the prediction of the algorithm is considered inaccurate. Bounding-box regression is used to fine-tune the window so that the inaccurate feature box moves closer to the correct mark and the location becomes more accurate, from which the coordinate information of the feature region is obtained. The specific mapping relations are as follows

        bx = σ(tx) + cx
        by = σ(ty) + cy
        bw = pw·e^(tw)
        bh = ph·e^(th)

        where tx, ty, tw, th are the actual outputs of the Yolov3 model, used as the input of the window mapping transformation; σ(t) is the Sigmoid function, which constrains the center offsets; bx, by, bw, bh are the center coordinates and the size of the bounding box; cx, cy are the coordinates of the corresponding grid cell in the feature image; and pw, ph are the width and height of the corresponding prior frame.
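        The mapping relations above, together with the IOU criterion, can be sketched as follows (function names are ours; the boxes passed to the IOU check are given as corner coordinates):

```python
import math

def sigmoid(t):
    # Squashes the raw offset into (0, 1) so the predicted
    # center stays inside its grid cell.
    return 1.0 / (1.0 + math.exp(-t))

def decode_box(tx, ty, tw, th, cx, cy, pw, ph):
    """Map raw Yolov3 outputs (tx, ty, tw, th) to a bounding box, where
    (cx, cy) is the top-left corner of the grid cell and (pw, ph) the
    width and height of the matched prior frame."""
    bx = sigmoid(tx) + cx      # center x
    by = sigmoid(ty) + cy      # center y
    bw = pw * math.exp(tw)     # width scales the prior exponentially
    bh = ph * math.exp(th)     # height scales the prior exponentially
    return bx, by, bw, bh

def iou(box1, box2):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(box1[2], box2[2]) - max(box1[0], box2[0]))
    iy = max(0.0, min(box1[3], box2[3]) - max(box1[1], box2[1]))
    inter = ix * iy
    area1 = (box1[2] - box1[0]) * (box1[3] - box1[1])
    area2 = (box2[2] - box2[0]) * (box2[3] - box2[1])
    return inter / (area1 + area2 - inter)

# Zero offsets put the box half a cell past (cx, cy) at the prior's size.
print(decode_box(0, 0, 0, 0, cx=6, cy=6, pw=116, ph=90))  # (6.5, 6.5, 116.0, 90.0)
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # overlap 1 over union 7
```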

        1.3 Dataset and experiments

        Satellite images are selected together with their features. The data are enhanced by rotating through arbitrary angles, adding random noise (both salt-and-pepper and Gaussian), and applying random scale transformations between 0.8 and 1.4, expanding the dataset to 2 000 images, as shown in Fig.2. Part of the dataset is shown in Fig.3.
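        The enhancement pass described above can be sketched as follows. Noise injection is shown directly with NumPy; the rotation angle and scale factor are only sampled here, since applying them would normally be delegated to an image library such as OpenCV (warpAffine). All names and noise parameters are ours, chosen for illustration:

```python
import numpy as np

def augment(img, rng):
    """One random enhancement pass: Gaussian noise, salt-and-pepper
    noise, plus a sampled arbitrary rotation angle and a random scale
    factor in [0.8, 1.4]. Geometric transforms are sampled only, to
    keep the sketch dependency-free."""
    out = img.astype(np.float64)
    out += rng.normal(0.0, 5.0, size=out.shape)        # Gaussian noise
    mask = rng.random(out.shape) < 0.01                # ~1% of pixels get
    out[mask] = rng.choice([0.0, 255.0], size=int(mask.sum()))  # salt/pepper
    angle = rng.uniform(0.0, 360.0)                    # arbitrary rotation angle
    scale = rng.uniform(0.8, 1.4)                      # random scale factor
    return np.clip(out, 0, 255).astype(np.uint8), angle, scale

rng = np.random.default_rng(0)
img = np.full((32, 32), 128, dtype=np.uint8)           # flat toy image
aug, angle, scale = augment(img, rng)
print(aug.shape, 0.8 <= scale <= 1.4)
```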

        Fig.2 Schematic diagram of data enhancement

        Fig.3 Part of training set samples

        After the dataset is completed, the Yolov3 network is trained for 2 000 iterations. This experiment runs on a computer with a Ryzen 3600 CPU with 16 GB RAM and an RTX 2060 GPU with 6 GB video memory; the operating system is Linux Ubuntu 16.04; the code is written in Python 2.7; the main frameworks are Darknet and OpenCV 2. After training, the weight file is tested and verified. The accuracy on the test set is 98.36% and the detection time for a single picture is 0.094 s. Experimental results are shown in Fig.4.

        Fig.4 Results of Yolov3 image detection

        In this paper,other datasets are also established and the results of the detection are verified.Some of the results are shown in Fig.5.

        Fig.5 Results of Yolov3 image detection on other datasets

        The prediction label will be used as the basis for judging coarse matches.If the labels are consistent,the feature areas extracted from the two images are considered to be matched successfully.Then,the next fine matching is carried out.

        2 Fine Matching Based on PHD Image Matching Algorithm

        Hausdorff distance is a way to describe the least similarity between two images.PHD image matching algorithm can avoid the influence of noise and false detection points.PHD image matching algorithm is introduced to carry out two-stage fine matching.

        PHD between the given image A and the image B is defined as[20]

        H(A,B) = max(hf(A,B), hg(B,A))
        hf(A,B) = (f·NA)-th ranked value of d(a,B) over a∈A
        hg(B,A) = (g·NB)-th ranked value of d(b,A) over b∈B

        where f and g are the forward and backward fractions, respectively; NA, NB represent the total numbers of points in images A and B, respectively; d(a,B) = min(b∈B)‖a−b‖ represents the minimum distance from point a to the point set B; and d(b,A) = min(a∈A)‖b−a‖ represents the minimum distance from point b to the point set A. The Hausdorff distance measures the least similarity between the two images. By traversing all the feature points, the point corresponding to the smallest Hausdorff distance is taken as the best matching point.
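        The definition above can be sketched directly. Taking a ranked (quantile) value of the nearest-neighbour distances instead of their maximum is what makes PHD tolerant of noise and false detection points. Function names are ours, and the rank index is a simplified form of the (f·N)-th ranked value:

```python
import numpy as np

def directed_phd(A, B, f=0.9):
    """Directed partial Hausdorff distance h_f(A, B): the f-quantile of
    the nearest-neighbour distances d(a, B) over the points a of A."""
    # d(a, B) = min over b in B of ||a - b||, computed for every a.
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2).min(axis=1)
    d.sort()
    return d[int(f * (len(d) - 1))]   # simplified (f*N)-th ranked value

def phd(A, B, f=0.9, g=0.9):
    """Symmetric PHD: max of the forward and backward directed distances."""
    return max(directed_phd(A, B, f), directed_phd(B, A, g))

A = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
B = A + np.array([0.5, 0.0])     # the same point set shifted right by 0.5
print(phd(A, A))                 # identical sets: distance 0.0
print(phd(A, B, f=1.0, g=1.0))   # f = g = 1 recovers the classic Hausdorff distance
```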

        The PHD image matching algorithm offers a high matching rate and simple computation logic. However, the algorithm flow shows that its computational load is determined by the number of feature points in the image, so the algorithm is very sensitive to the complexity and size of the image. Therefore, the Yolov3 algorithm can be used to complement it and accelerate the Hausdorff algorithm, while the Hausdorff algorithm performs the two-stage fine image matching.

        3 Two-Stage Image Matching Algorithm Flow

        In this paper, a two-stage image matching algorithm combining the Yolov3 algorithm and the Hausdorff image matching algorithm is proposed. Firstly, coarse matching is carried out by the Yolov3 algorithm, and then two-stage fine matching is performed with the PHD image matching algorithm. Because of the large image size, the PHD image matching algorithm alone does not meet the real-time requirement of image matching navigation; the Yolov3 algorithm improves the real-time performance so that this requirement is satisfied.

        Although there are many convolutional layers, the convolutional layers of many channels have no inheritance. In addition, the estimation of feature area prediction becomes simpler. On one hand, each feature region only matches one prior box, which reduces the complexity of the algorithm. On the other hand, the time complexity of the Hausdorff image matching algorithm is O(nm), i.e., its complexity is related to the numbers of feature points of the prestored and measured graphs. Coarse matching greatly reduces the number of feature points and thus the complexity of the algorithm. The detailed flow chart of the algorithm is shown in Fig.6.

        Fig.6 Flow chart of two-stage matching algorithm

        The specific process is as follows:

        (1) Read in the refer image, named Refer, and the real image, named Real. Use the Yolov3 algorithm to extract the feature regions, named Refer_roi and Real_roi respectively, and the corresponding region labels, named Label_refer and Label_real respectively.

        (2)If Label_refer and Label_real are consistent,the next step will be fine matching.Otherwise,the two images do not match.

        (3)Label_refer and Label_real correspond to Refer_roi and Real_roi,respectively.By using the center point coordinates as the center,intercept Refer_cut and Real_cut from Refer_roi and Real_roi,respectively.

        (4)Use Refer_cut and Real_cut as the input of PHD image matching algorithm to carry out the fine image matching.

        (5)Output the best matching point between Refer_cut and Real_cut.

        (6) Map the best matching point output in step (5) to the corresponding point in the original images, and output the final best matching point.
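        The six steps above can be condensed into the following sketch. The detector outputs and the PHD matcher are hypothetical stand-ins (plain tuples and a lambda), not the paper's implementation; the coarse center used here is an invented number chosen so that the arithmetic reproduces the fine matching point (2,4) and final point (120,40) reported in Section 4:

```python
def two_stage_match(refer_det, real_det, phd_match):
    """Condensed two-stage flow. refer_det and real_det stand for the
    (label, feature-area center) pairs produced by the Yolov3 stage,
    and phd_match for the PHD fine matcher; all three are hypothetical
    placeholders rather than the paper's actual implementation."""
    label_refer, refer_center = refer_det
    label_real, real_center = real_det
    # Step (2): if the labels differ, the two images do not match.
    if label_refer != label_real:
        return None
    # Steps (3)-(5): fine matching on windows cut around the two
    # centers yields a sub-window offset (dx, dy).
    dx, dy = phd_match(refer_center, real_center)
    # Step (6): map the fine offset back onto the reference image.
    return refer_center[0] + dx, refer_center[1] + dy

# Toy run: same label, fine offset (2, 4) on an assumed coarse
# center (118, 36) gives the final matching point (120, 40).
result = two_stage_match(("pentagon", (118, 36)),
                         ("pentagon", (60, 50)),
                         lambda rc, tc: (2, 4))
print(result)  # (120, 40)
```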

        4 Experiment

        In order to verify the effectiveness of the algorithm, a matching experiment is carried out. This paper validates the algorithm on ordinary optical images, synthetic aperture radar (SAR) images and satellite images. The experiment runs on a computer with a Ryzen 3600 CPU with 16 GB RAM and an RTX 2060 GPU with 6 GB video memory; the operating system is Linux Ubuntu 16.04; the code is written in Python 2.7 in the PyCharm 2018.1 environment; the main frameworks are Darknet and OpenCV 2.

        4.1 Coarse matching based on Yolov3

        The Washington SAR image is selected, with the Pentagon chosen as the feature. We randomly intercept 500 images and expand the dataset to 2 000 by rotating through arbitrary angles and adding Gaussian and salt-and-pepper noise. Part of the dataset is shown in Fig.7.

        Fig.7 Schematic diagram of image enhancement

        After completing the dataset,the Yolov3 network is used for training.The number of iterations is 2 000 times.After the training is completed,the weight files are tested and verified.The accuracy rate of the test set is 98.36%,and the detection time of the single picture is 0.094 s.

        We select the refer image named Refer with the size of 329 pixel×214 pixel and the real image named Real with the size of 130 pixel×100 pixel,as shown in Fig.8.Using the trained model for feature detection,we can predict the bounding boxes as the feature area named Refer_roi of the refer image and Real_roi of the real image.These images are shown in Fig.9.The labels of the two feature areas are the same,so the center points of the two feature areas are the coarse matching results for fine matching.

        Fig.8 Refer image and real image

        Fig.9 Detection results of refer and real images

        4.2 Fine matching based on PHD image matching algorithm

        Taking the center point coordinates of the bounding box as the center, we intercept the two-stage feature images: one of size 40 pixel×40 pixel from Refer_roi and one of size 30 pixel×30 pixel from Real_roi. The two images, called Refer_cut and Real_cut, are shown in Fig.10. Using the intercepted images as the input of the PHD image matching algorithm, the experiment shows that the exact matching point is (2,4) and the matching time is 0.275 s. The matching result of the Refer_cut and Real_cut images is shown in Fig.11.

        Fig.10 Refer_cut and Real_cut images

        Fig.11 Matching result of Refer_cut and Real_cut images

        The final matching point of the algorithm is(120,40)and the matching time is 0.476 s.Result is shown in Fig.12.

        Fig.12 Final matching result of Refer and Real images

        If only Yolov3 is used as the matching algorithm, the matching point is (123,44) while the correct matching point is (120,40), a matching error of 5 pixels. The experimental results thus show that Yolov3 alone does not meet the matching accuracy requirement. If only the Hausdorff matching algorithm is used, the matching error is 0 but the matching time is 12 s, which satisfies the accuracy requirement but not the real-time requirement. When the two algorithms are combined, the matching error is 0 and the matching time is 0.476 s, so the real-time and accuracy requirements are satisfied at the same time.

        Similarly,the above algorithm is applied to optical images and satellite images.The total matching time of optical images is 0.476 s,and the final matching point is(99,357).The matching result is shown in Fig.13.The total matching time of the satellite image is 0.391 s,and the final matching point is(120,40),as shown in Fig.14.

        Fig.13 Matching results of optical images

        Fig.14 Matching results of satellite images

        From the matching results, we find that the algorithm achieves high matching accuracy on all kinds of images, and the total matching time is less than 1 s, which meets the real-time requirement. The statistical results of the two-stage matching algorithm are shown in Table 1. From Table 1, we can also see that the matching error of the proposed algorithm under a certain range of rotation is within 1 pixel, which meets the matching requirements of navigation and demonstrates that the algorithm is robust.

        Table 1 Statistical results of the two-stage matching algorithm

        The proposed algorithm is compared with image matching algorithms based on SURF, ORB and the wavelet transform (WT). In the experiment, the matching time of the WT-based two-stage image matching algorithm is about 10 s, significantly higher than that of the other algorithms. The matching times of the remaining three algorithms are shown in Fig.15.

        Fig.15 Statistical chart of matching error under rotation

        From the comparison results, it can be found that the real-time performance of the algorithm in this paper and of the ORB-based image matching algorithm is obviously better than that of the SURF-based algorithm. At the same time, the matching time of the ORB- and SURF-based algorithms increases obviously with image size, whereas the matching time of the proposed algorithm stays below 0.5 s as the image size grows, which is better than the ORB-based algorithm.

        Considering that the ORB algorithm is similar to the proposed algorithm in time performance, only these two algorithms are compared further. Fig.16 shows the matching results of a 1 080 pixel×720 pixel image using ORB. From the result, it can be found that only one of the seven successful matching points is correct. The image matching accuracy comparison is shown in Fig.17, from which it can be seen that the ORB-based image matching algorithm produces obvious matching errors in some practical cases. With similar matching time, the proposed algorithm has the advantage of higher matching accuracy.

        Fig.16 Matching results based on ORB

        Fig.17 Accuracy comparison of image matching algorithms

        5 Conclusions

        This paper first introduces the structures of the Yolov3 CNN and the PHD image matching algorithm. Aiming at the problems that the bounding box predicted by the Yolov3 CNN deviates and thus fails the accuracy requirement, and that the PHD image matching algorithm is sensitive to image size and thus fails the real-time requirement, we make a targeted improvement by combining the two algorithms. Experimental results show that the algorithm is effective, accurate and real-time, and that it can resist a certain degree of rotation. However, in practical situations, how to match incomplete features and how to distinguish two similar features still remain to be solved.
