Hang CHEN, Dao-zhong SUN
(1Guangdong Industrial Robot Integration and Application Engineering Research Center, Guangzhou 510540, China) (2Department of Computer Science and Engineering, Tianhe College, Guangdong Polytechnic Normal University, Guangzhou 510540, China) (3College of Electronic Engineering, South China Agricultural University, Guangzhou 510642, China)
Abstract: Traditional visual sorting technology for industrial robots cannot effectively detect and identify workpieces with complicated shapes and dense placement. To improve the accuracy of workpiece detection for sorting on the production line, a target detection algorithm based on a deep learning Convolutional Neural Network (CNN) optimized by Cuckoo Search (CS) is proposed. The composition of the visual sorting system is first analyzed. Then the model structure of the classic Faster R-CNN is used to achieve target detection, and the CS optimization algorithm is applied to the parameter training of the CNN model, which addresses the local-optimum problem of back propagation and improves the iteration speed. The experimental results of workpiece detection show that, compared with the traditional CNN model, the proposed CS-CNN model achieves higher target detection accuracy and faster network convergence.
Key words: Deep learning, Convolutional neural network, Industrial robot, Visual sorting, Target detection, Cuckoo search algorithm
With the advent of the Industry 4.0 era, industrial robots have been widely used in manufacturing to replace manual sorting of materials on the production line, effectively increasing productivity and reducing labor costs. Intelligent industrial robots not only free workers from boring and tiring work, but also improve productivity. Industrial robot vision sorting, an important application branch of robotics, is a key technology for industrial automation and intelligence and involves many technical fields, such as optics, digital image processing and artificial intelligence [1-3]. Enabling industrial robots to grasp targets under visual guidance is therefore important.
Research on visual sorting for industrial robots has become a hot topic in the field of industrial automation [4-5]. Literature [6] proposed a simulation method for an industrial robot sorting system based on the VC++6.0 development platform and the OpenGL graphics library, which intuitively realizes the dynamic simulation of a robot sorting workpieces. Literature [7] proposed a PLC-based industrial robot bottle sorting system and gave detailed hardware and software design methods to achieve single-target and multi-target tracking and grasping. Literature [8] and Literature [9] proposed robot sorting systems based on machine vision, in which an improved Canny edge extraction operator obtains edge information that serves as the image feature for matching and improves the target detection accuracy. However, the existing sorting systems cannot achieve ideal target detection and recognition when the shape, size and placement position of the objects to be sorted are uncertain. Because deep convolutional neural networks show excellent performance in image processing, researchers have begun to apply them to industrial robot vision sorting tasks. For example, literature [10] proposed a fast visual recognition algorithm for industrial sorting robots based on deep learning, whose recognition accuracy remains at a high level in a laboratory environment.
Therefore, in order to further improve the accuracy of workpiece detection for sorting in an actual production line environment, a target detection algorithm based on a deep learning convolutional neural network is proposed, and the cuckoo search (CS) algorithm is introduced to improve the parameter training of the CNN. The neural network is initialized with the network parameters trained by the CS algorithm and then further optimized by the stochastic gradient descent algorithm. The test data show that the CS-CNN model can effectively complete the visual detection of the target workpiece and improve the accuracy and efficiency of the sorting robot.
A typical industrial robot vision sorting system is shown in Fig.1. It can be divided into hardware and software. The most important module of the hardware part is the industrial robot body (a 6-axis industrial robot), as shown in Fig.2. The hardware also includes an image acquisition module, an industrial control computer module, and a conveyor module. The image acquisition module is mainly composed of a camera, a lens and a light source. The software part is responsible for calling the deep learning model to identify the location of the target, transmitting the positioning information, and controlling the grasping action, so that the robot finally carries out the sorting action. The deep learning model in the industrial control computer module is the focus of this paper.
Fig.1 Schematic diagram of industrial robot sorting system
Fig.2 6-axis industrial robot
Traditional artificial neural network models are trained with back propagation but usually have only three layers. Deep learning increases the depth of the network by adding multiple hidden layers. The model most commonly used at present is the convolutional neural network [11]. Its weight parameters are shared across the network structure, which reduces the amount of calculation, and it has received extensive attention in image processing.
The convolutional neural network consists of an input layer, convolutional layers, pooling layers, fully connected layers, and a final output layer. It can take the original image directly as input without a preprocessing operation, as shown in Fig.3 [12].
Fig.3 Convolutional neural network structure
First, the convolutional layer is responsible for extracting target features from the image, and the convolution kernels in a layer all have the same size. As shown in Fig.4, convolving a 4×4 input matrix with a 2×2 convolution kernel (step size 1, no padding) yields a 3×3 output matrix.
Fig.4 Convolution layer principle
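The sliding-window computation in Fig.4 can be illustrated with a minimal sketch. The input values and the kernel below are hypothetical and are not taken from the figure. As in most CNN frameworks, the kernel is applied without flipping (cross-correlation), and a step size of 1 with no padding is assumed.

```python
def conv2d_valid(image, kernel, stride=1):
    """Slide a square kernel over a square image without padding.

    The kernel is not flipped, so this is the cross-correlation
    that CNN layers usually call "convolution".
    """
    n, f = len(image), len(kernel)
    out_side = (n - f) // stride + 1
    result = []
    for r in range(out_side):
        row = []
        for c in range(out_side):
            acc = 0
            # Multiply the kernel with the window whose top-left corner
            # is (r * stride, c * stride) and sum the products.
            for i in range(f):
                for j in range(f):
                    acc += image[r * stride + i][c * stride + j] * kernel[i][j]
            row.append(acc)
        result.append(row)
    return result


# Hypothetical 4x4 input and 2x2 kernel (illustrative values only).
image = [[1, 2, 3, 0],
         [0, 1, 2, 3],
         [3, 0, 1, 2],
         [2, 3, 0, 1]]
kernel = [[1, 0],
          [0, 1]]

output = conv2d_valid(image, kernel)
print(output)  # [[2, 4, 6], [0, 2, 4], [6, 0, 2]]
```

Each output entry is the sum of the element-wise products between the kernel and one 2×2 window, so the 4×4 input produces a 3×3 output.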
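Suppose the size of the input image is $n \times n$, the number of convolution kernels is $n_c$, the convolution kernel size is $f \times f$, the padding length is $p$, and the step size is $s$. Then the size of the output image after convolution is calculated as follows:
$$\left\lfloor \frac{n + 2p - f}{s} + 1 \right\rfloor \times \left\lfloor \frac{n + 2p - f}{s} + 1 \right\rfloor \times n_c \tag{1}$$
where $\lfloor \cdot \rfloor$ denotes rounding down.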
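The output-size rule of Eq. (1) can be checked with a small helper. This is only a sketch: the function name `conv_output_shape` and the sample layer settings are illustrative assumptions, not values from the paper, apart from the 4×4 input with a 2×2 kernel of Fig.4.

```python
def conv_output_shape(n, f, p=0, s=1, n_c=1):
    """Return (height, width, channels) of the output of a convolution.

    n: input side length, f: kernel side length, p: padding length,
    s: step size, n_c: number of convolution kernels.
    """
    # Integer floor division implements the rounding-down in Eq. (1).
    side = (n + 2 * p - f) // s + 1
    return (side, side, n_c)


print(conv_output_shape(4, 2))                     # (3, 3, 1), the case of Fig.4
print(conv_output_shape(40, 3, p=1, s=1, n_c=16))  # (40, 40, 16)
print(conv_output_shape(40, 5, p=0, s=2, n_c=8))   # (18, 18, 8)
```

The last call shows the effect of rounding down: (40 − 5)/2 + 1 = 18.5, so the output side length is 18.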
There are two common pooling methods: (1) max pooling and (2) average pooling. Since this paper uses the Faster R-CNN framework [13], the pooling layer used here is the RoI pooling layer. Neurons are the basic units of a convolutional neural network. Assuming there are three inputs $x_1, x_2, x_3$, the output $y$ of a perceptron unit is calculated as follows:
$$y = f\left(\sum_{i=1}^{3} w_i x_i + b\right) \tag{2}$$
where $w$ represents the weights, $b$ represents the bias, and $f$ is the activation function. The activation function selected in this paper is ReLU, whose form is:
$$f(x) = \max(0, x) \tag{3}$$
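As a minimal sketch of Eq. (2) and Eq. (3), the perceptron unit with three inputs and a ReLU activation can be written as follows. The input, weight and bias values are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
def relu(x):
    """ReLU activation: passes positive values, clamps the rest to zero."""
    return max(0.0, x)


def perceptron(inputs, weights, bias, activation=relu):
    """Weighted sum of the inputs plus a bias, passed through the activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


# Hypothetical values: 0.5*1 - 1.0*2 + 0.5*3 = 0.0 before the bias is added.
inputs = [1.0, 2.0, 3.0]
weights = [0.5, -1.0, 0.5]

y_pos = perceptron(inputs, weights, bias=1.0)   # z = 1.0, ReLU keeps it
y_neg = perceptron(inputs, weights, bias=-2.0)  # z = -2.0, ReLU clamps it
print(y_pos, y_neg)  # 1.0 0.0
```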
The fully connected layer with the softmax function calculates the activity value of each neuron as follows:
$$z^{(l)} = W^{(l)} a^{(l-1)} + b^{(l)} \tag{4}$$
$$a^{(l)} = f_l\left(z^{(l)}\right) \tag{5}$$
where $a^{(l)}$ represents the activity values of the neurons in layer $l$, and $W^{(l)}$ represents the weight matrix.
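The forward pass of Eq. (4) and Eq. (5) can be sketched in a few lines. Here it is assumed that the layer activation is softmax, as stated for the fully connected layer. The weight matrix, bias vector and input activations are hypothetical.

```python
import math


def dense(W, a_prev, b):
    """Eq. (4): one pre-activation value per row of the weight matrix."""
    return [sum(w * a for w, a in zip(row, a_prev)) + bias
            for row, bias in zip(W, b)]


def softmax(z):
    """Eq. (5) with softmax as the activation; shifted by max(z) for stability."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical layer with 2 inputs and 3 output neurons.
W = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]
b = [0.0, 0.0, -1.0]
a_prev = [2.0, 1.0]

z = dense(W, a_prev, b)  # [2.0, 1.0, 2.0]
a = softmax(z)           # probabilities that sum to 1
print(z, a)
```

The first and third neurons receive the same pre-activation value and therefore the same probability, which is larger than that of the second neuron.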
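The cuckoo search (CS) algorithm is a relatively new meta-heuristic optimization algorithm that simulates the nesting and egg-laying behavior of cuckoos combined with Lévy flights [14]. As in particle swarm optimization, the positions of the individuals in the population must be updated. The update rule is as follows:
$$x_i^{(t+1)} = x_i^{(t)} + \alpha \oplus \mathrm{Levy}(s, \lambda) \tag{6}$$
Here $x_i^{(t)}$ is the position of the $i$-th nest at iteration $t$, $\alpha$ is the step-size scaling factor, and $\oplus$ denotes entry-wise multiplication. The Lévy step follows a power law:
$$\mathrm{Levy}(s, \lambda) \sim s^{-\lambda} \tag{7}$$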
where $s$ is the random step size obtained by Lévy flight. Suppose the probability that a cuckoo egg is discovered by the nest owner is $P \in [0,1]$. After the positions are updated, a random number $r \in [0,1]$ is generated and compared with $P$. If $r > P$, the position of the nest is randomly changed once; otherwise, the position of the nest is not changed. The steps of the cuckoo search algorithm are as follows:
(1) Initialize the bird’s nest position;
(2) Calculate the value of the objective function corresponding to each nest position;
(3) Update nest location via Levy flight;
(4) Update part of the nest position with a certain probability;
(5) Repeat steps (2) to (4) until the iteration stop condition is met.
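The steps above can be sketched as a short minimization routine. This is an illustrative sketch under several assumptions that are not stated in the paper: the Lévy step magnitude is drawn by inverse-transform sampling from a power law on $[1, \infty)$, new positions are accepted only if they improve the nest (greedy selection), positions are clipped to fixed bounds, and a simple sphere function stands in for the objective. The values of `alpha`, `lam` and `bounds` are hypothetical.

```python
import random


def levy_step(lam, rng):
    """Signed step whose magnitude s >= 1 has density proportional to s**(-lam)."""
    u = 1.0 - rng.random()  # u lies in (0, 1], so the power below is finite
    magnitude = u ** (-1.0 / (lam - 1.0))
    return magnitude if rng.random() < 0.5 else -magnitude


def cuckoo_search(objective, dim, n_nests=80, n_iter=10, p_detect=0.1,
                  alpha=0.1, lam=1.5, bounds=(-5.0, 5.0), seed=0):
    rng = random.Random(seed)
    lo, hi = bounds

    def clip(v):
        return min(hi, max(lo, v))

    def random_nest():
        return [rng.uniform(lo, hi) for _ in range(dim)]

    # Step (1): initialize the nest positions.
    nests = [random_nest() for _ in range(n_nests)]
    # Step (2): evaluate the objective at each nest.
    fitness = [objective(x) for x in nests]
    history = [min(fitness)]

    for _ in range(n_iter):
        for i in range(n_nests):
            # Step (3): move the nest by a Levy flight (assumed greedy acceptance).
            candidate = [clip(x + alpha * levy_step(lam, rng)) for x in nests[i]]
            f_new = objective(candidate)
            if f_new < fitness[i]:
                nests[i], fitness[i] = candidate, f_new
            # Step (4): if r > P, try one random relocation of the nest.
            if rng.random() > p_detect:
                candidate = random_nest()
                f_new = objective(candidate)
                if f_new < fitness[i]:
                    nests[i], fitness[i] = candidate, f_new
        # Step (5): record the best value and continue until n_iter is reached.
        history.append(min(fitness))

    best = min(range(n_nests), key=lambda i: fitness[i])
    return nests[best], fitness[best], history


def sphere(x):
    """Stand-in objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best_x, best_f, history = cuckoo_search(sphere, dim=3)
print(best_f, history[0])
```

Because a nest is only replaced by a better candidate, the best objective value recorded in `history` never increases from one iteration to the next.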
Compared with other swarm intelligence optimization algorithms, the cuckoo search algorithm has strong search capability, requires few parameters to be set, and is easy to implement. In addition, its search ability and robustness are both strong, which makes it suitable for continuous optimization problems and therefore for the parameter training of a CNN.
The network structure of the Faster R-CNN used in this paper is shown in Fig.5. To avoid vanishing gradients caused by the parameter initialization, the CS algorithm is first used to train the network parameters. The parameters obtained are then further optimized by the stochastic gradient descent algorithm, which addresses the local-optimum problem of back propagation and improves the iteration speed. The initial nest position X(x1, x2, x3, …, xn) is randomly generated, where n represents the number of weight parameters in the convolutional neural network. The fitness function F is the sum of the absolute values of the errors between the desired output and the predicted output:
$$F = k \sum_{i=1}^{N} \left| y_i - o_i \right| \tag{8}$$
where $N$ represents the number of output layer nodes, $k$ is a coefficient, and $y_i$ and $o_i$ are the desired output and predicted output of output node $i$, respectively. The training process of the CS-CNN model is shown in Fig.6.
Fig.5 Faster R-CNN network structure
Fig.6 CS-CNN based convolutional neural network training process
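The fitness of Eq. (8) and its evaluation for one nest can be sketched as follows. The single ReLU layer, the flat layout of the nest vector (weights row by row, then biases) and all numerical values are hypothetical stand-ins. The paper's network is far larger, and the helper names `unpack` and `nest_fitness` are assumptions made for illustration.

```python
def fitness(desired, predicted, k=1.0):
    """Eq. (8): k times the sum of absolute output errors."""
    if len(desired) != len(predicted):
        raise ValueError("desired and predicted must have the same length")
    return k * sum(abs(y - o) for y, o in zip(desired, predicted))


def unpack(nest, n_in, n_out):
    """Split a flat nest vector into a weight matrix and a bias vector."""
    W = [nest[r * n_in:(r + 1) * n_in] for r in range(n_out)]
    b = nest[n_out * n_in:n_out * n_in + n_out]
    return W, b


def nest_fitness(nest, x, desired, n_in, n_out, k=1.0):
    """Decode the nest into layer parameters, run the layer, and score it."""
    W, b = unpack(nest, n_in, n_out)
    predicted = [max(0.0, sum(w * a for w, a in zip(row, x)) + bias)
                 for row, bias in zip(W, b)]
    return fitness(desired, predicted, k)


F = fitness([1.0, 0.0, 0.0], [0.5, 0.25, 0.25])
print(F)  # 1.0

# A nest encoding the identity matrix and zero biases reproduces the input.
good_nest = [1.0, 0.0, 0.0, 1.0, 0.0, 0.0]
zero_nest = [0.0] * 6
x, desired = [1.0, 0.0], [1.0, 0.0]
print(nest_fitness(good_nest, x, desired, 2, 2))  # 0.0
print(nest_fitness(zero_nest, x, desired, 2, 2))  # 1.0
```

A lower value of F indicates a better nest, so an optimizer that minimizes this function moves the nests toward parameter vectors whose outputs match the desired outputs.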
In order to verify the target detection performance of the proposed CS-CNN model, real workpiece images were used as training samples. The training set contained 500 images, each of size 40×40. Under the same experimental conditions, the results of the proposed model are compared with those of the standard CNN model.
The system hardware configuration is as follows: GPU 950M; memory 8 GB; Intel(R) Core(TM) i3 CPU with a main frequency of 3.07 GHz; memory 4 GB. The operating system is Windows 10. The configuration parameters of the CS algorithm are: detection probability = 0.1, maximum number of iterations = 10, and population size = 80.
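For reference, the CS settings listed above can be collected in a small validated configuration object. This is only a sketch: the class name `CSConfig` and its field names are hypothetical, while the three default values are the ones reported in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CSConfig:
    """Cuckoo search settings reported for the experiment."""
    detection_probability: float = 0.1  # P, probability that an egg is found
    max_iterations: int = 10
    population_size: int = 80           # number of nests

    def __post_init__(self):
        # Reject settings that cannot describe a valid run.
        if not 0.0 <= self.detection_probability <= 1.0:
            raise ValueError("detection_probability must lie in [0, 1]")
        if self.max_iterations < 1 or self.population_size < 1:
            raise ValueError("iterations and population size must be positive")


config = CSConfig()
print(config)
```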
The CS-CNN model was tested on several irregularly shaped and stacked workpieces. The test results are shown in Fig.7. It can be seen that the CS-CNN model can effectively complete the visual detection of the target workpiece.
Fig.7 The detection effect of the CS-CNN model
Table 1 shows the recognition accuracy of the CS-CNN model. Table 2 compares the recognition accuracy of three different models. Fig.8 shows the training error and accuracy of the CNN and CS-CNN models. It can be seen from Table 1 and Table 2 that using the CS optimization algorithm to update the CNN training parameters yields higher accuracy than training the network parameters with the conventional gradient descent algorithm alone. In addition, as can be seen from Fig.8, the convergence curve of the CS-CNN model is smoother than that of the CNN model, and the CS-CNN model converges faster.
Table 1 CS-CNN model identification accuracy
Table 2 Comparison of recognition accuracy of different models
Fig.8 Training error of CNN and CS-CNN models
In order to further improve the accuracy of workpiece detection for sorting in an actual production line environment, a target detection algorithm based on a deep learning convolutional neural network is proposed, and the cuckoo search (CS) algorithm is introduced to improve the parameter training of the CNN. From the workpiece detection test, the following conclusions can be drawn: (1) the proposed CS-CNN model is effective; (2) updating the CNN training parameters through CS optimization speeds up convergence and improves detection efficiency; (3) compared with the typical CNN model, the CS-CNN model has higher target detection accuracy.