

        Improved Medical Image Segmentation Model Based on 3D U-Net


        LIN Wei(林 威), FAN Hong(范 紅), HU Chenxi(胡晨熙), YANG Yi(楊 宜), YU Suping(禹素萍), NI Lin(倪 林)

        College of Information Science and Technology, Donghua University, Shanghai 201620, China

        Abstract: With the widespread application of deep learning in the field of computer vision, gradually allowing medical image technology to assist doctors in making diagnoses has great practical and research significance. Aiming at the shortcomings of the traditional U-Net model in 3D spatial information extraction, model over-fitting, and low degree of semantic information fusion, an improved medical image segmentation model is proposed to achieve more accurate segmentation of medical images. In this model, we make full use of the residual network (ResNet) to alleviate the over-fitting problem. In order to process and aggregate data at different scales, the inception module is used instead of the traditional convolutional layer, and dilated convolution is used to increase the receptive field. A conditional random field (CRF) refines the segmentation contours. Compared with the traditional 3D U-Net network, the segmentation accuracy of the improved model on liver and tumor images increases by 2.89% and 7.66%, respectively. As part of the image processing pipeline, the method in this paper can not only be used for medical image segmentation but also lay the foundation for subsequent 3D image reconstruction.

        Key words: medical image segmentation; 3D U-Net; residual network (ResNet); inception model; conditional random field (CRF)

        Introduction

        In daily practice, doctors must analyze large volumes of medical images, relying heavily on their own medical knowledge and subjective judgment; this workload can be overwhelming and sometimes even leads to misdiagnosis. Nowadays, deep learning has been widely used in computer vision, and it is of great practical significance to gradually introduce computer-aided diagnosis to assist doctors. Wang et al. [1] added an attention mechanism and dilated convolution to the U-Net network and improved the traditional skip-connection structure, achieving good results in lung and cell segmentation. Ma et al. [2] used a multiscale convolutional block (MCB), a hybrid down-sampling block (HDSB), and a context information fetching (CIF) module to replace the traditional convolution, pooling, and skip-connection operations; the U-Net network combined with MCB, HDSB, and CIF achieved high accuracy in segmenting lungs, cell walls, and the pancreas. Peng et al. [3] proposed a new feature extraction module that improved the utilization of context information by introducing parallel dilated convolution (PDC) and local context embedding (LCE) modules; it alleviated over-smoothing and under-segmentation of small objects and thereby improved segmentation accuracy. Gridach [4] proposed a new pyramid dilated module (PDM) composed of multiple parallel stacked dilated convolutions, which achieved good results in medical image processing within a certain range. The Cornell University research team designed an efficient convolutional neural network architecture to replace the encoder and decoder, applied a residual module to replace the skip connection between the U-Net encoder and decoder, and achieved a better segmentation effect [5]. In recent years, scholars have successively proposed improved models such as V-Net [6], multi-scale densely connected U-Net (MDU-Net) [7], and Bridged U-Net [8], which have achieved good segmentation results in their respective application fields.

        The above models show superior performance to a certain extent, but still have shortcomings. The hybrid dilation and attention residual U-Net has low segmentation efficiency and requires many weight parameters to be calculated during training, while the U-Net combined with MCB, HDSB, and CIF (MHSU-Net) still needs improvement in model complexity and loss function selection. The local context embedding network has shown advantages in feature extraction, but further research is needed to test its performance on 3D medical images. The main disadvantage of the pyramid dilated network (PyDiNet) is that it cannot incorporate a large amount of feature information into the evaluation, and the amount of image information it can process is small. The segmentation accuracy of the double-channel U-Net structure (DC-UNet) can only be guaranteed with a large amount of data. In response to these problems, and considering the application scenarios of liver and liver tumor segmentation, this paper proposes an improved medical image segmentation model based on 3D U-Net. The main contributions of this article are listed as follows.

        (1) Data augmentation is introduced. (2) Residual network (ResNet), inception, and dilated convolution modules are added to the traditional U-Net network. (3) A conditional random field (CRF) is added after the network.

        1 Theoretical Introduction

        Traditional neural network medical image segmentation usually requires the following steps: data pre-processing, neural network training and prediction, and post-processing of prediction results. Neural networks and network modules commonly used in the field of computer vision include U-Net, ResNet, and CRF.

        1.1 U-Net

        U-Net can be trained end-to-end on a small number of data sets and still achieves good segmentation results, so it has become a classic model for medical image segmentation [9]. Improving and optimizing U-Net can better meet the specific needs of medical image segmentation. The structure of U-Net is shown in Fig. 1.

        Fig. 1 U-Net network structure

        The network uses an encoder and a decoder connected by a skip-connection topology. The left side of the network is a series of down-sampling operations composed of convolution kernels and max pooling; this part mainly extracts high-order feature information from the image. The right side of the network is composed of four up-sampling blocks. Because the contracting path and the expanding path are completely symmetrical, the network is named U-Net.
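        For illustration, the following is a minimal sketch of this encoder-decoder structure with skip connections, assuming PyTorch; the depth, channel widths, and 2D setting are simplifying assumptions and do not reproduce the exact configuration of Fig. 1.

```python
# Minimal U-Net-style encoder-decoder sketch (illustrative only; channel sizes
# and depth are assumptions, not the exact configuration of Fig. 1).
import torch
import torch.nn as nn

def double_conv(in_ch, out_ch):
    # Two 3x3 convolutions followed by ReLU, as in each U-Net block
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(inplace=True),
    )

class TinyUNet(nn.Module):
    def __init__(self, in_ch=1, num_classes=2):
        super().__init__()
        self.enc1 = double_conv(in_ch, 32)
        self.enc2 = double_conv(32, 64)
        self.pool = nn.MaxPool2d(2)                         # down-sampling (contracting path)
        self.up = nn.ConvTranspose2d(64, 32, 2, stride=2)   # up-sampling (expanding path)
        self.dec1 = double_conv(64, 32)                      # 64 = 32 (skip) + 32 (upsampled)
        self.head = nn.Conv2d(32, num_classes, kernel_size=1)

    def forward(self, x):
        e1 = self.enc1(x)                  # high-resolution features
        e2 = self.enc2(self.pool(e1))      # deeper, lower-resolution features
        d1 = self.up(e2)                   # restore spatial resolution
        d1 = torch.cat([e1, d1], dim=1)    # skip connection: concatenate encoder features
        return self.head(self.dec1(d1))
```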

        1.2 ResNet

        ResNet was proposed to solve the problem that deep convolutional neural network models are difficult to train [10]. For a stacked structure, suppose that the learned feature output is H(x) when the input is x; if the target value (also known as the residual value) is set to be F(x) = H(x) - x, the original learned feature becomes F(x) + x. Residual learning is used instead of direct feature learning because it more easily learns new features from small changes in the input features, and therefore gives better performance. The processing flow of the residual learning unit is shown in Fig. 2.

        Fig. 2 Residual learning unit

        The residual learning unit is similar to a "short circuit" in an electrical circuit, so it is also called a short-circuit connection.
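        A minimal sketch of such a residual learning unit is shown below, assuming PyTorch; 3D convolutions are used to match the 3D U-Net setting, and the layer arrangement is illustrative rather than the exact unit of Fig. 2.

```python
# Sketch of a residual learning unit: the block learns the residual F(x) and the
# output is F(x) + x via the short-circuit (identity) connection.
import torch.nn as nn

class ResidualUnit(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(                      # F(x): the residual mapping
            nn.Conv3d(channels, channels, 3, padding=1),
            nn.BatchNorm3d(channels),
            nn.ReLU(inplace=True),
            nn.Conv3d(channels, channels, 3, padding=1),
            nn.BatchNorm3d(channels),
        )
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.body(x) + x)               # H(x) = F(x) + x
```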

        1.3 CRF network

        The CRF was originally a basic model in natural language processing. With the popularity of deep learning, it has also been widely used in deep learning models [11], where it significantly improves the final processing effect.

        The processing of the CRF is mainly derived from mathematical analysis. Suppose there are two linearly represented random variable sequences

        X = (x_1, x_2, …, x_n), Y = (y_1, y_2, …, y_n).

        Under the condition of a given random sequence X, the conditional probability distribution P(Y|X) of the random variable sequence Y constitutes a CRF. When this distribution satisfies the Markov property, it is a linear-chain CRF. The two main undirected graph structures of the linear-chain CRF are shown in Fig. 3.

        Fig. 3 CRF diagrams: (a) CRF probabilistic undirected graph; (b) simplified CRF probabilistic undirected graph

        In the neural network, the CRF enhances the influence of neighborhood pixels on the center pixel, which sharpens image details and strengthens the contour.
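        For reference, the conditional distribution of a linear-chain CRF can be written in its standard textbook form (this explicit formula is not given in the paper), where t_k are transition feature functions, s_l are state feature functions, and λ_k, μ_l are their weights:

```latex
% Standard linear-chain CRF conditional distribution (textbook form, added for reference)
P(Y \mid X) = \frac{1}{Z(X)}
  \exp\!\Bigg( \sum_{i,k} \lambda_k\, t_k(y_{i-1}, y_i, X, i)
             + \sum_{i,l} \mu_l\, s_l(y_i, X, i) \Bigg),
\qquad
Z(X) = \sum_{Y} \exp\!\Bigg( \sum_{i,k} \lambda_k\, t_k(y_{i-1}, y_i, X, i)
             + \sum_{i,l} \mu_l\, s_l(y_i, X, i) \Bigg).
```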

        2 Improvement Method and Its Introduction

        In data processing, data augmentation is introduced to improve the generalization ability of the model on the test set. ResNet and inception modules are introduced to replace the traditional convolutional layers in the 3D U-Net structure [12], which enhances the relevance of image information and enlarges the receptive field. A dropout operation is added in the up- and down-sampling processes to prevent model over-fitting [13]. The Dice coefficient is used in the evaluation of model training to enhance the accuracy of image training. Finally, a CRF is introduced in post-processing so that the edges of the segmented image can be sharpened further.
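        As an example of the Dice-based evaluation mentioned above, the following is a minimal sketch of the Dice coefficient for binary masks, assuming PyTorch tensors; the smoothing constant eps is an assumption, not a value given in the paper.

```python
# Dice coefficient sketch for binary segmentation masks (assumed form; the paper
# does not give its exact implementation). `pred` and `target` have the same
# shape with values in [0, 1]; `eps` avoids division by zero.
import torch

def dice_coefficient(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    pred = pred.reshape(-1)
    target = target.reshape(-1)
    intersection = (pred * target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

# A Dice loss for training can then be defined as 1 - dice_coefficient(pred, target).
```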

        The residual module is mainly applied within each block of the U-Net. Because the numbers of input and output channels are the same, the identity addition can be used directly here. The structure of the block network is shown in Fig. 4.

        Fig. 4 Block network structure

        The inception V1 module is mainly used to replace the convolution operations in U-Net. Inception lets the network itself decide whether to use convolution or max pooling, and it also enlarges the receptive field of the output features.
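        A rough sketch of such an inception-style replacement for a plain convolution layer is shown below (PyTorch assumed); the branch widths and the dilation rate are illustrative assumptions rather than the paper's exact settings.

```python
# Inception-style block: parallel branches (1x1 conv, 3x3 conv, dilated 3x3 conv,
# max pooling + 1x1 conv) whose outputs are concatenated along the channel axis.
import torch
import torch.nn as nn

class InceptionBlock3D(nn.Module):
    def __init__(self, in_ch, branch_ch):
        super().__init__()
        self.b1 = nn.Conv3d(in_ch, branch_ch, kernel_size=1)
        self.b2 = nn.Conv3d(in_ch, branch_ch, kernel_size=3, padding=1)
        self.b3 = nn.Conv3d(in_ch, branch_ch, kernel_size=3, padding=2, dilation=2)  # larger receptive field
        self.b4 = nn.Sequential(nn.MaxPool3d(3, stride=1, padding=1),
                                nn.Conv3d(in_ch, branch_ch, kernel_size=1))

    def forward(self, x):
        # Output has 4 * branch_ch channels; spatial size is preserved by all branches
        return torch.cat([self.b1(x), self.b2(x), self.b3(x), self.b4(x)], dim=1)
```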

        The overall network architecture (shown in Table 1) follows the traditional U-Net symmetric structure (shown in Fig. 1). The difference is that the convolution operation in each block is replaced with an inception module, which enlarges the receptive field through convolutions of different sizes. A residual connection is added at each encoder and decoder stage, which links the network context more closely and makes the network easier to train.

        Fig. 5 Segmentation of liver and multiple tumors: (a) original CT image; (b) split gold standard; (c) U-Net liver; (d) V-Net+CRF liver; (e) improved 3D U-Net liver; (f) U-Net tumor; (g) V-Net+CRF tumor; (h) improved 3D U-Net tumor

        Table 1 Improved U-Net network structure

        (Table 1 continued)

        In order to avoid over-fitting caused by the large number of trainable parameters, a dropout operation is applied to the feature maps after each stage.
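        A minimal sketch of this placement is shown below; the dropout probability of 0.3 is an assumed value, not one reported in the paper.

```python
# Appending a dropout layer after an encoder/decoder stage to curb over-fitting.
import torch.nn as nn

def stage_with_dropout(stage: nn.Module, p: float = 0.3) -> nn.Sequential:
    # Dropout3d zeroes entire feature channels of the 3D feature maps with probability p
    return nn.Sequential(stage, nn.Dropout3d(p))
```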

        3 Experimental Results and Analysis

        3.1 Data set and experimental platform

        This article mainly studies liver and liver tumor segmentation. The liver tumor segmentation challenge (LiTS) competition data set is used, which contains 130 training volumes and 70 test volumes. Since the segmentation ground truth of the 70 test volumes is not released, the last 120 training volumes are used for training and the first 10 training volumes are used as the test set to evaluate segmentation performance. The images are in the NIfTI (.nii) format.
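        A sketch of loading the NIfTI volumes and applying this split is given below, using the nibabel library; the directory layout and the LiTS file naming convention ("volume-<i>.nii") are assumptions about the local setup.

```python
# Load LiTS volumes (NIfTI format) and apply the split described above:
# the last 120 training volumes for training, the first 10 for testing.
import nibabel as nib
import numpy as np

def load_volume(path: str) -> np.ndarray:
    return nib.load(path).get_fdata().astype(np.float32)

train_ids = list(range(10, 130))   # volumes 10..129 used for training
test_ids = list(range(0, 10))      # volumes 0..9 held out for testing
# example (hypothetical path): vol = load_volume(f"data/volume-{train_ids[0]}.nii")
```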

        The main list of experimental environments used in the experiment is shown in Table 2.

        Table 2 Software and hardware configuration

        3.2 Results display and analysis

        The experiment mainly uses the first 10 training sets of LiTS as the test set. The results of the segmentation experiment based on the improved U-Net method in this article are mainly shown in Figs. 5 and 6.

        Figure 5 shows that U-Net and V-Net+CRF are not effective in segmenting the details of the liver and fail to reflect the cracks in the center of the liver. The method in this paper (improved 3D U-Net) applies a hole-filling operation during image post-processing, which also leads to improper handling of the segmentation cracks. Nevertheless, in terms of overall shape and edges, the segmentation effect in this article is better. For tumor segmentation, U-Net shows under-segmentation, V-Net+CRF produces a false-positive segmentation, and the method in this paper is basically correct.

        It can be seen from Fig. 6 that U-Net, V-Net+CRF, and the improved 3D U-Net in this paper all achieve extremely high accuracy for liver segmentation on the test set. However, as shown in Fig. 6(f), the U-Net segmentation result of the tumor has a serious position deviation, and V-Net+CRF also shows a certain shape error in the tumor segmentation. The result of the method in this paper for tumor segmentation is very impressive.

        Fig. 6 Segmentation of liver and single tumor: (a) original CT image; (b) split gold standard; (c) U-Net liver; (d) V-Net+CRF liver; (e) improved 3D U-Net liver;(f) U-Net tumor;(g) V-Net+CRF tumor; (h) improved 3D U-Net tumor

        In general, the segmentation performance of this network for the liver and tumors is better than that of traditional neural networks; especially for tumor segmentation, the performance is greatly improved. This paper mainly selects the Dice coefficient, volumetric overlap error (VOE), relative volume difference (RVD), average symmetric surface distance (ASD), and maximum symmetric surface distance (MSD) to evaluate the segmentation results.
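        The overlap-based metrics among these (Dice, VOE, RVD) can be computed from binary masks as in the following sketch, using their standard definitions with NumPy; ASD and MSD require a surface-distance computation and are omitted here.

```python
# Overlap-based evaluation metrics from binary masks (assumes non-empty masks).
import numpy as np

def overlap_metrics(pred: np.ndarray, gt: np.ndarray):
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    dice = 2.0 * inter / (pred.sum() + gt.sum())    # Dice similarity coefficient
    voe = 1.0 - inter / union                       # volumetric overlap error
    rvd = (pred.sum() - gt.sum()) / gt.sum()        # relative volume difference
    return dice, voe, rvd
```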

        It can be seen from Table 3 that the method in this paper improves liver segmentation accuracy more than the traditional methods: it is 2.89% higher than U-Net and 1.10% higher than V-Net+CRF, while its segmentation performance is basically the same as that of existing methods. The RVD, ASD, and MSD evaluation indicators are also improved accordingly.

        Table 3 Comparison of liver segmentation indicators

        It can be seen from Table 4 that the method in this paper has a greater advantage in tumor segmentation accuracy than other methods. In terms of tumor segmentation accuracy, the method in this paper is 7.66% higher than U-Net. Compared with hybrid dilation and attention residual U-Net (HDA-ResUNet), the method increases by 0.53%, and compared with MHSU-Net, it increases by 1.81%. The performance on the RVD, ASD, and MSD evaluation indicators has been optimized accordingly.

        Table 4 Comparison of tumor segmentation indicators

        4 Conclusions

        This paper proposes an improved medical image segmentation model based on the 3D U-Net network, which achieves good results in liver and tumor segmentation. However, the training time is too long, and the accuracy of tumor segmentation still needs to be improved. The focus of the next step is to reduce the complexity of the network. Given the irregular shape and diffuseness of tumors, a specialized neural network needs to be designed to further improve segmentation accuracy.

        In subsequent image processing work, the 2D data sequences can be further reconstructed in 3D from the liver and tumor NIfTI data obtained by the segmentation, so that the visibility and practicability of the images can be further enhanced.
