

        A comparison of deep learning methods for seismic impedance inversion

Petroleum Science, 2022, Issue 3

Si-Bo Zhang, Hong-Jie Si, Xin-Ming Wu, Shang-Sheng Yan

a Huawei Cloud EI Product Department, Xi'an, Shaanxi, 710077, China

b School of Earth and Space Sciences, University of Science and Technology of China, Hefei, Anhui, 230026, China


1. Introduction

Seismic impedance inversion has been studied for decades as it is one of the most effective methods for reservoir characterization in seismic exploration. It aims at reconstructing impedance sequences from their corresponding seismic traces, based on the forward model:

$$ s = w * r, \tag{1} $$

where s is a seismic trace, approximated as the convolution of a wavelet w and a reflectivity sequence r. Impedance i and reflectivity r have the following relationship:

$$ r[k] = \frac{i[k+1] - i[k]}{i[k+1] + i[k]}, \tag{2} $$

where i[k] represents the value of the vertical impedance sequence at depth k. Solving for i from s is an underdetermined problem (Jackson, 1972), so traditional methods (Hu et al., 2009; Zhang and Castagna, 2011; Zhang et al., 2013) use different regularizations to constrain the solution space, and many structure-guided methods have also been proposed (Ma et al., 2012; Zhang and Revil, 2015; Zhou et al., 2016; Wu, 2017). Although these methods are widely used in industry, some drawbacks remain: the regularization terms need to be designed for the problem at hand, which may limit the generalization of the model. Besides, these model-driven methods usually need to solve an optimization problem, which is time-consuming and often yields a smooth result. In addition, the wavelet w in Equation (1) is typically unknown and can be hard to estimate as it often varies in time and space. In practice, the relationship between the recorded seismogram and the true impedance is much more complicated than the simple convolution model described in Equations (1) and (2). Acquisition limitations, potential measurement errors, processing errors, and noise make impedance estimation from seismograms a highly nonlinear problem with large uncertainties. Therefore, a data-driven deep learning method is expected to better approximate the complicated, nonlinear relationship between seismic traces and impedance sequences.
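To make the convolution model concrete, the following is a minimal NumPy sketch of Equations (1) and (2). The Ricker wavelet and all numerical values are illustrative assumptions, not settings from this paper.

```python
import numpy as np

def reflectivity_from_impedance(i):
    """Eq. (2): reflectivity from a vertical impedance sequence i[k]."""
    return (i[1:] - i[:-1]) / (i[1:] + i[:-1])

def ricker(f=25.0, dt=0.002, length=0.128):
    """A Ricker wavelet; peak frequency and sampling are illustrative."""
    t = np.arange(-length / 2, length / 2, dt)
    a = (np.pi * f * t) ** 2
    return (1.0 - 2.0 * a) * np.exp(-a)

# Eq. (1): a seismic trace as the convolution of wavelet and reflectivity.
impedance = np.linspace(4000.0, 9000.0, 256)   # a toy impedance model
r = reflectivity_from_impedance(impedance)
s = np.convolve(r, ricker(), mode="same")      # synthetic seismic trace
```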

In recent years, Deep Learning (DL) has seen explosive development in Computer Vision (CV) (Krizhevsky et al., 2012). Various architectures and techniques (Szegedy et al., 2015; He et al., 2016; Huang et al., 2017) have been proposed to push the benchmarks in this area. Other fields, such as medicine, meteorology, remote sensing, and seismic exploration, have also benefited from these developments and made significant breakthroughs (Zhao, 2018, 2019; Di et al., 2018, 2019; Wu et al., 2019, 2020). Compared with traditional model-driven methods, the advantage of DL mainly lies in that feature learning, extraction and prediction are all included in an end-to-end process, which avoids tedious manual design and achieves smaller errors by jointly optimizing all parameters. Consequently, many DL-based methods (Das et al., 2018; Wang et al., 2019a, b; Phan and Sen, 2018; Biswas et al., 2019; Alfarraj and AlRegib, 2019a, b; Zheng et al., 2019) have been put forward to solve seismic impedance inversion. The critical technology is the Convolutional Neural Network (CNN), whose basic idea is to hierarchically extract features using stacked convolutional layers and nonlinear activations. Due to the strong feature representation ability of CNNs, DL-based methods can more accurately approximate the relationship between seismograms and impedance sequences and therefore generate more accurate inversion results. However, in most cases, CNNs are used as black boxes; little work has investigated in depth how to appropriately design an effective and efficient DL mechanism for the inversion problem.

In this paper, we focus on further research into DL-based inversion methods. The influence of various network hyperparameters and architectures on the inversion results is explored. Specifically, we carry out comparative experiments on three basic hyperparameters (i.e., kernel size, number of channels, and number of layers) and two multi-scale architectures, and make a comprehensive analysis of the experimental results. In addition, we design a series of methods inspired by perceptual losses (Johnson et al., 2016) and the Generative Adversarial Network (GAN) (Goodfellow et al., 2014) to promote high-frequency information. The contributions of this paper can be summarized as follows:

• We provide important bases for inversion network design by revealing the influence of network hyperparameters and structures on inversion performance.

• We show a clear path to recovering high-frequency details in the inversion by borrowing ideas from CV, and achieve the desired results.

2. Methods

As a data-driven technology, DL-based methods learn mapping functions from seismograms to impedance sequences. We use a conventional CNN as the baseline, on which other architectures and techniques are built step by step to improve the inversion performance.

2.1. Conventional CNN

2.1.1. Architecture

A CNN consists of stacked convolutional layers as shown in Fig. 1. A convolutional layer can be defined as follows:

$$ x_l = \sigma(w_l * x_{l-1} + b_l), \tag{3} $$

Fig. 1. A conventional CNN containing 6 convolutional layers; the input is a seismic trace and the output is an impedance sequence.

where w_l and b_l represent the kernel and bias at the l-th layer, and x_{l-1} and x_l are the input and output, respectively. In addition, we use the Parametric Rectified Linear Unit (PReLU) (He et al., 2015) as the nonlinear activation σ(x), which is formulated as

$$ \sigma(x) = \max(0, x) + a \cdot \min(0, x), \tag{4} $$

where a is a learnable coefficient for the negative part.

Fig. 2. (a) A seismic-impedance training pair. (b) A crossline seismic section. (c) An inline seismic section.

The most intuitive inversion method is to use a 1-dimensional CNN as a mapping function from a seismic trace to an impedance sequence. Unlike model-driven methods, training the CNN requires a large amount of training data.
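As an illustration, a 1-D inversion CNN of this kind might be sketched in PyTorch as below. The framework choice and the length-preserving padding are our assumptions; the depth, width and kernel size follow the baseline settings chosen later in Section 2.1.3.

```python
import torch
import torch.nn as nn

class BaselineCNN(nn.Module):
    """A 1-D CNN mapping a seismic trace to an impedance sequence.

    Depth, width and kernel size follow the baseline chosen in Section
    2.1.3 (6 layers, 16 channels, kernel size 13); padding keeps the
    output length equal to the input length.
    """

    def __init__(self, layers=6, channels=16, kernel=13):
        super().__init__()
        blocks, in_ch = [], 1
        for _ in range(layers - 1):
            blocks += [nn.Conv1d(in_ch, channels, kernel, padding=kernel // 2),
                       nn.PReLU()]
            in_ch = channels
        # Final layer maps the features back to a single impedance channel.
        blocks.append(nn.Conv1d(in_ch, 1, kernel, padding=kernel // 2))
        self.net = nn.Sequential(*blocks)

    def forward(self, s):          # s: (batch, 1, trace_length)
        return self.net(s)
```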

2.1.2. Dataset

The seismic data and well logs that we use in this paper are extracted from the freely available Teapot Dome dataset (Anderson, 2009). The seismic data have already been converted to the depth domain and matched with the well logs. Hundreds of wells are provided along with the seismic data; however, in our experiments of seismic impedance estimation, we choose only the wells that contain both velocity and density logs over significantly long depth ranges. Consequently, we choose 27 wells in total and extract the seismic traces near the wells to obtain 27 pairs of impedance sequences and seismograms, of which 22 pairs are randomly selected to train our DL networks for the impedance estimation and the remaining 5 pairs are used as the validation set. Fig. 2a shows one of the training data pairs, where the smooth blue curve represents a seismogram while the red curve with more details denotes a target impedance sequence that we expect to estimate from the seismogram. Fig. 2b and c shows the crossline and inline seismic sections extracted from the original 3D seismic volume. These two seismic sections are used in this paper to demonstrate the effectiveness of our trained neural networks for impedance estimation.

2.1.3. Experiments

Hyperparameters have a great impact on CNN performance. In order to figure out network design principles for the inversion problem, we study three key structural parameters: kernel size, number of layers, and number of channels. In the experiments, we adopt the Adadelta (Zeiler, 2012) optimizer with an initial learning rate of 0.1, and the learning rate decays to 0.9 times its value every 50 epochs. The batch size is set to 8. The Mean Squared Error (MSE) is used as the loss function:

$$ L_{\mathrm{MSE}} = \frac{1}{NK} \sum_{n=1}^{N} \lVert i_n - f(s_n) \rVert_2^2, \tag{5} $$

where i_n and f(s_n) are the true and predicted impedances of the n-th training pair, ‖·‖₂ is the ℓ2-norm operator, N is the number of training pairs, and K is the signal length. Note that all input seismic traces and target impedance sequences are normalized by subtracting the mean and dividing by the standard deviation. All experiments adopt the above settings by default.
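A training loop matching these settings might look as follows. `BaselineCNN` refers to the sketch in Section 2.1.1; `loader` stands for a hypothetical DataLoader over the 22 normalized training pairs with a batch size of 8, and the epoch count is an assumption.

```python
import torch

model = BaselineCNN()                     # sketch from Section 2.1.1
opt = torch.optim.Adadelta(model.parameters(), lr=0.1)
sched = torch.optim.lr_scheduler.StepLR(opt, step_size=50, gamma=0.9)
mse = torch.nn.MSELoss()

def normalize(x):
    """Zero mean, unit standard deviation, as applied to traces and targets."""
    return (x - x.mean()) / x.std()

for epoch in range(500):                  # epoch count is an assumption
    for s, i_true in loader:              # hypothetical DataLoader, batch_size=8
        opt.zero_grad()
        loss = mse(model(normalize(s)), normalize(i_true))
        loss.backward()
        opt.step()
    sched.step()                          # lr -> 0.9 * lr every 50 epochs
```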

First, we fix the number of convolutional layers to 5 and the number of channels of each layer to 16, and observe the effect of the kernel size on the inversion result. The kernel size increases from 5 to 23 in steps of 6. As shown in Fig. 4a and b, the larger the kernel size, the better the network convergence. We also observe that a larger kernel size brings more high-frequency information, as shown in Fig. 3a.

We then adjust the number of output channels of each layer from 8 to 64, fixing the kernel size to 11 and the number of layers to 5. We observe a trend similar to that of the kernel size experiment, as shown in Fig. 4c and d. Networks with more channels converge to lower training losses, but they converge to the same level of validation loss as the epochs increase. Despite this, their visual effects on the predicted impedance sequences are quite different, as shown in Fig. 3b. We can see that the high-frequency content becomes richer as the number of channels increases, especially within the depth window between 160 and 180.

Fig. 3. Inversion results of a seismic trace with different hyperparameters. The red solid curve and black dashed curve represent the true and predicted values, respectively. (a) Results with different kernel sizes. (b) Results with different numbers of channels. (c) Results with different numbers of layers.

Fig. 4. Training and validation loss curves. Left column: training loss. Right column: validation loss. Top row: loss curves with different kernel sizes. Middle row: loss curves with different numbers of channels. Bottom row: loss curves with different numbers of layers.

Furthermore, we study the effect of the number of layers on the inversion results. In this experiment, the kernel size and the number of channels of each layer are fixed to 11 and 16, respectively, and the number of layers ranges from 2 to 16, doubling each time. Fig. 4e and f shows that the shallow network with 2 layers underfits. When the number of layers increases to 8, the network achieves the best convergence. It is worth noting that the performance of the network with 16 layers degrades considerably. This is not caused by overfitting, since both the training and validation losses degrade. The reason is that a deeper architecture brings huge challenges to gradient backpropagation (He et al., 2016). From the visual effects in Fig. 3c, the results of #layers = 2 and 16 are underfitting, and the result of #layers = 4 yields more details than that of #layers = 8.

In general, increasing the complexity of the architecture can improve the network's representation ability, but such improvement is limited. Different hyperparameters may lead to different visual effects. Therefore, it is necessary to consider various factors when designing the network. In order to compare all the methods designed in later sections, we use a conventional CNN with a kernel size of 13, 16 channels and 6 layers as the baseline model.

2.2. Multi-scale architecture

A conventional CNN uses a fixed kernel size to extract seismic features at a specific scale, which limits its feature representation. To improve the multi-scale representation capability of the network, we propose two methods in this section.

2.2.1. Multi-scale CNN

Inspired by the inception module (Szegedy et al., 2015), a Multi-Scale CNN (MSCNN) is designed as shown in Fig. 5. It is composed of stacked multi-scale blocks, one of which is marked by a red frame: the input feature is fed in parallel into three convolutional layers with different kernel sizes, and the three output features are then concatenated along the channel dimension to form the final output of the multi-scale block. The MSCNN extracts multi-scale features of seismic traces block by block, and uses a normal convolutional layer to compute the impedance at the end of the network.
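A sketch of one such block in PyTorch might look as follows; the kernel sizes (7, 13, 19) and 5 channels per branch follow the MSCNN settings given in Section 2.2.3, while the PReLU placement is our assumption.

```python
import torch
import torch.nn as nn

class MultiScaleBlock(nn.Module):
    """Inception-style block: three parallel 1-D convolutions with different
    kernel sizes, concatenated along the channel dimension."""

    def __init__(self, in_ch, branch_ch=5, kernels=(7, 13, 19)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Sequential(nn.Conv1d(in_ch, branch_ch, k, padding=k // 2),
                          nn.PReLU())
            for k in kernels)

    def forward(self, x):
        # Concatenate the three same-length branch outputs channel-wise.
        return torch.cat([b(x) for b in self.branches], dim=1)
```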

2.2.2. UNet

Fig. 5. MSCNN with three stacked multi-scale blocks. Three shades of blue rectangles stand for features extracted by convolutional layers with three different kernel sizes, and the bottleneck containing the three rectangles represents the concatenated features.

Fig. 6. UNet architecture. k, c and s stand for kernel size, number of output channels and stride, respectively; pool and tconv represent max pooling and transposed convolutional layers, respectively.

UNet (Ronneberger et al., 2015) is another multi-scale architecture, originally proposed for image segmentation. As shown in Fig. 6, the UNet has two basic components: an encoder and a decoder. The encoder is similar to the backbone of a classification network, consisting of convolutional layers and max pooling layers. A max pooling layer downsamples features with a stride of 2 and obtains larger-scale seismic representations. The decoder acts as an upsampling process, using transposed convolutions as the upsampling operator. In the decoder, each upsampled feature is concatenated with the feature of the same scale from the encoder. This concatenation contributes to high-resolution information reconstruction.
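The following is a reduced sketch of such an encoder-decoder in PyTorch, with only two pooling levels instead of the full network of Fig. 6; the channel counts are illustrative.

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch, k=11):
    return nn.Sequential(nn.Conv1d(in_ch, out_ch, k, padding=k // 2), nn.PReLU())

class UNet1D(nn.Module):
    """A minimal 1-D UNet: max pooling in the encoder, transposed
    convolutions in the decoder, skip concatenations at equal scales."""

    def __init__(self, ch=16):
        super().__init__()
        self.enc1 = conv_block(1, ch)
        self.enc2 = conv_block(ch, ch * 2)
        self.bottom = conv_block(ch * 2, ch * 2)
        self.pool = nn.MaxPool1d(2)                     # stride-2 downsampling
        self.up2 = nn.ConvTranspose1d(ch * 2, ch * 2, 2, stride=2)
        self.dec2 = conv_block(ch * 4, ch * 2)          # after skip concat
        self.up1 = nn.ConvTranspose1d(ch * 2, ch, 2, stride=2)
        self.dec1 = conv_block(ch * 2, ch)
        self.out = nn.Conv1d(ch, 1, 1)

    def forward(self, x):                               # x: (batch, 1, K), K divisible by 4
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        b = self.bottom(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.out(d1)
```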

2.2.3. Experiments

To make a relatively fair comparison of the baseline CNN, MSCNN and UNet, we keep their parameter counts at the same level. For MSCNN, we use 5 multi-scale blocks whose three branches have kernel sizes of 7, 13 and 19, and each branch has 5 output channels. The kernel size of the final convolutional layer is 11. For UNet, the hyperparameters are shown in Fig. 6. The parameter counts of the three methods are given in Table 1.

Table 1. Parameter counts of the three methods.

The same inversion experiments are executed for the three methods. Fig. 7 shows that the three methods converge to almost the same level on the training set, but MSCNN and UNet perform better on the validation set. This means the three networks have the same learning ability, since they have similar numbers of parameters, but the multi-scale architectures show better generalization.

The first column of Fig. 8 shows the trace inversion results of the three methods. We can observe that MSCNN and UNet obtain relatively better results than the conventional CNN, especially within the depth window between 140 and 180, where the CNN yields highly smooth predictions. The same observation can be made in the first columns of Figs. 13 and 14, where the layers, especially the thin ones, can hardly be resolved because the conventional CNN yields smooth predictions with limited details in the vertical dimension.

2.3. Perceptual loss

Even though the multi-scale methods achieve better results than the baseline model, they all lose much high-frequency information. This is because they are all trained with the MSE loss function, which tends to produce over-smooth results. From the CV perspective, MSE only penalizes the Euclidean distance between two images and ignores the image content. To overcome this problem, we introduce the perceptual loss (Johnson et al., 2016), which measures content similarity, into the networks.

Fig. 7. Training and validation loss curves of the baseline, MSCNN and UNet.

2.3.1. Definition

Seismic impedance inversion can be considered a signal reconstruction problem similar to image super-resolution, as it recovers high-frequency impedance from a low-frequency seismic trace. The perceptual loss states that the reconstructed image should be similar to the ground truth not only in pixels, but also in the feature domain. The common idea is to use a pre-trained network, e.g., VGG-16 (Simonyan and Zisserman, 2014), as a loss network to extract features at different layers, and to calculate the Euclidean distance between the true and predicted features to measure the content difference. The perceptual loss has experimentally proven effective for reconstructing high-frequency information.

Inspired by the above ideas, we design a simple autoencoder as the loss network, as shown in Fig. 9. The autoencoder has the same structure as the UNet, but without the links from the encoder to the decoder. The hyperparameters of each layer are displayed in the figure. The autoencoder learns a mapping function from the impedance to itself; in other words, the input and output of the network are the same impedance. The main purpose is to extract proper features at different scales, as shown in Fig. 9; we can then use these features to calculate the perceptual loss, which is defined as follows:

$$ L_{\mathrm{p}} = \frac{1}{N} \sum_{n=1}^{N} \lVert \varphi_l(i_n) - \varphi_l(f(s_n)) \rVert_2^2, \tag{6} $$

where φ_l(i) is the l-th layer's feature of the impedance i extracted by the loss network.
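In code, the perceptual term of Equation (6) might be computed as below, where `phi` stands for a hypothetical helper wrapping the pretrained loss network truncated at endpoint layer l; keeping its weights frozen is implied by its role as a fixed feature extractor.

```python
import torch
import torch.nn.functional as F

def perceptual_loss(phi, i_pred, i_true):
    """Eq. (6): feature-space distance measured by the frozen loss network."""
    with torch.no_grad():
        f_true = phi(i_true)      # target features need no gradient
    f_pred = phi(i_pred)          # gradients flow back to the inversion net
    return F.mse_loss(f_pred, f_true)
```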

2.3.2. Experiments

We train the autoencoder on the impedance sequences of the training set with the same implementation as the previous experiments. Fig. 10 shows the reconstruction performance of the autoencoder; we can see that all the curves are well fitted. It should be noted that, in all experiments, we add small Gaussian noise to the training samples at each step to relieve overfitting, since the training set is small.
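The noise augmentation mentioned above can be as simple as the following sketch; the noise level sigma is an illustrative assumption.

```python
import torch

def augment(i_true, sigma=0.01):
    """Add small Gaussian noise to a training target at each step to
    relieve overfitting on the small training set."""
    return i_true + sigma * torch.randn_like(i_true)
```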

Fig. 8. Trace inversion results by different methods. Top row: the baseline CNN. Middle row: MSCNN. Bottom row: UNet. First column: the pure networks. Second column: networks with perceptual loss. Third column: networks with GAN.

Fig. 9. Training with perceptual loss. The inversion network in the dashed box can be any architecture. The black and red traces are the predicted impedance f(s) and the true impedance i, respectively. Three layers are used as endpoints to export features, represented by φ_l(·), where the subscript l is the layer index.

Fig. 10. The impedance sequences recovered by the autoencoder for the validation set. The red and black curves are the input ground truth and the output predictions, respectively.

By combining the MSE and perceptual losses, we can train the networks with the following loss function:

$$ L = L_{\mathrm{MSE}} + \lambda_p L_{\mathrm{p}}, \tag{7} $$

where λ_p is the weight factor of the perceptual loss. We conduct a series of experiments to study how to select λ_p and l. The baseline model is used as the inversion network. First, λ_p is fixed to 1.0, and we use different endpoints (i.e., l = 2, 4, 6) as the feature extraction layers. The results in Fig. 11a show that when l reaches 6, the ability to reconstruct high-frequency information is limited; the results of l = 2 and 4 recover relatively better details around a depth of 170. Then we set l = 2 and increase λ_p from 0.01 to 10.0 by factors of 10. We can see from Fig. 11b that as the weight of the perceptual loss increases, more details are reconstructed, but some peak values may exceed the ground truth, e.g., at depths of 100 and 120 with λ_p = 10.0.
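Putting the combined objective of Equation (7) into code is then a one-liner on top of the earlier perceptual-loss sketch:

```python
import torch.nn.functional as F

def total_loss(i_pred, i_true, phi, lambda_p=1.0):
    """Eq. (7): MSE plus the weighted perceptual term; lambda_p = 1.0 and
    endpoint l = 4 are the defaults chosen in this section."""
    return F.mse_loss(i_pred, i_true) + lambda_p * perceptual_loss(phi, i_pred, i_true)
```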

The above observations validate the effectiveness of the perceptual loss, but we need to strike a balance between detail reconstruction ability and amplitude fitting stability. We use l = 4 and λ_p = 1.0 as the default setting for comparisons with the other methods. The first two columns of Fig. 8 show the inversion results of the three architectures with and without the perceptual loss, illustrating that the perceptual loss greatly improves the reconstruction of high-frequency information. Besides, we make a consistent observation on the inversion sections in Figs. 13 and 14: sections inverted with the perceptual loss show more, and clearer, horizons than those using the pure MSE loss.

Fig. 11. Inversion results using different endpoint layers l and weights λ_p.

2.4. GAN

The previous methods focus on the design of backbones and loss functions to achieve the desired results, which demonstrates that developments and techniques from the CV field can be used to tackle the seismic inversion problem. Following this clue, we further explore how to estimate more realistic impedance sequences using a GAN, which has achieved great success in image generation.

2.4.1. Architecture

A GAN has two basic modules, a generator and a discriminator, as shown in Fig. 12. The generator can be any inversion network, and it aims at fooling the discriminator. The discriminator is a classification network that should distinguish, as well as possible, between real impedance sequences and those produced by the generator. The two modules form an adversarial mechanism during training; as a result, the generator produces realistic impedance sequences and the discriminator can no longer distinguish between true and generated ones.

Fig. 12. GAN architecture. The generator is an inversion network that generates impedance sequences from seismic traces. The discriminator distinguishes between generated and real impedance sequences. The hyperparameters of the discriminator are given in the yellow frame, where h and fc stand for the number of hidden nodes and fully connected layer, respectively.

There is a strong correlation between the seismic inversion and image super-resolution problems, as both reconstruct high-frequency signals from low-frequency signals. We therefore refer to the Enhanced Super-Resolution GAN (ESRGAN) (Wang et al., 2018) to design an inversion GAN. The discriminator architecture and hyperparameters are given in Fig. 12. The final fully connected layer has only one hidden node, as it performs binary classification. Different from a standard classifier, we use the Relativistic average Discriminator (RaD) (Jolicoeur-Martineau, 2019) to predict how realistic a generated impedance is relative to the true impedance. The RaD is formulated as follows:

$$ D_{\mathrm{Ra}}(i_r, i_g) = \delta\!\left(f_D(i_r) - \mathbb{E}_{i_g}[f_D(i_g)]\right), \tag{8} $$

where i_r and i_g are the real and generated impedances, f_D(i) represents the output of the final fully connected layer for impedance i, E_{i_g}[·] represents the average over the generated impedances in a mini-batch, and δ(·) is the sigmoid function. An ideal discriminator makes D_Ra(i_r, i_g) = 1 and D_Ra(i_g, i_r) = 0. The discriminator loss is then defined as:

$$ L_D = -\mathbb{E}_{i_r}\!\left[\log D_{\mathrm{Ra}}(i_r, i_g)\right] - \mathbb{E}_{i_g}\!\left[\log\left(1 - D_{\mathrm{Ra}}(i_g, i_r)\right)\right], \tag{9} $$

The adversarial loss of the generator is defined symmetrically, with the roles of real and generated impedances swapped:

$$ L_{\mathrm{adv}} = -\mathbb{E}_{i_r}\!\left[\log\left(1 - D_{\mathrm{Ra}}(i_r, i_g)\right)\right] - \mathbb{E}_{i_g}\!\left[\log D_{\mathrm{Ra}}(i_g, i_r)\right], \tag{10} $$

The total loss for the generator is then defined as follows:

$$ L_G = L_{\mathrm{MSE}} + \lambda_p L_{\mathrm{p}} + \lambda_g L_{\mathrm{adv}}, \tag{11} $$

where λ_p and λ_g are the weight factors for the perceptual and adversarial losses, respectively. In the training process, the generator and discriminator are alternately updated by minimizing L_G and L_D.
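Under these definitions, Equations (8)-(10) might be sketched as follows; the epsilon terms for numerical stability are our addition.

```python
import torch

def rad(fd_real, fd_fake):
    """Relativistic average discriminator, Eq. (8): sigmoid of the real
    logits minus the mini-batch mean of the fake logits."""
    return torch.sigmoid(fd_real - fd_fake.mean())

def d_loss(fd_real, fd_fake):
    """Discriminator loss, Eq. (9)."""
    eps = 1e-8  # numerical safety for the logarithms
    return -(torch.log(rad(fd_real, fd_fake) + eps).mean()
             + torch.log(1 - rad(fd_fake, fd_real) + eps).mean())

def g_adv_loss(fd_real, fd_fake):
    """Generator adversarial loss, Eq. (10): the relativistic roles swapped."""
    eps = 1e-8
    return -(torch.log(1 - rad(fd_real, fd_fake) + eps).mean()
             + torch.log(rad(fd_fake, fd_real) + eps).mean())
```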

2.4.2. Experiments

In order to speed up convergence, we first train an inversion network with the MSE loss using the default settings, and then use the pre-trained model as the initial generator. The parameters λ_p, λ_g and l are empirically set to 1.0, 7e-3 and 4, respectively. The initial learning rates of the generator and discriminator are 0.7 and 0.9, and they decay by a factor of 0.95 with decay steps of 50 and 100, respectively. The GAN is trained for 1000 epochs. We adopt the three networks, i.e., CNN, MSCNN and UNet, as generators of the GAN. The trace inversion results are shown in Fig. 8; we can see that the GANs recover more details than the pure networks and have a visual effect similar to the networks with perceptual loss. But according to the section inversion results in Figs. 13 and 14, the GANs generate finer layers than the other two methods, especially within the depth window between 50 and 250. In addition, the GANs produce some dark layers (with low impedances) near a depth of 200 which cannot be observed in the results of the other methods.
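The alternating update scheme described above might be organized as in this sketch, reusing the loss functions defined earlier; `loader`, the two optimizers, `generator`, `discriminator` and `phi` are assumed to be set up as in the previous sections.

```python
import torch.nn.functional as F

lambda_p, lambda_g = 1.0, 7e-3            # weights as set in this section

for s, i_true in loader:                  # hypothetical DataLoader
    # --- discriminator step: real vs. generated impedances ---
    opt_d.zero_grad()
    i_fake = generator(s).detach()        # block gradients into the generator
    d_loss(discriminator(i_true), discriminator(i_fake)).backward()
    opt_d.step()

    # --- generator step: MSE + perceptual + adversarial terms, Eq. (11) ---
    opt_g.zero_grad()
    i_fake = generator(s)
    loss_g = (F.mse_loss(i_fake, i_true)
              + lambda_p * perceptual_loss(phi, i_fake, i_true)
              + lambda_g * g_adv_loss(discriminator(i_true).detach(),
                                      discriminator(i_fake)))
    loss_g.backward()
    opt_g.step()
```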

3. Discussion

The hyperparameter experiments on the conventional CNN demonstrate that networks with more parameters show stronger fitting ability, from the two perspectives of the number of channels and the kernel size. But this improvement tends to vanish as the number of parameters increases, as shown in Fig. 4a-d, because each network's ability to fit the small dataset saturates. From the layer-number perspective, as shown in Fig. 4e and f, an excessive increase in the number of layers leads to degraded convergence. A common view is that a deeper network makes gradient backpropagation difficult and may even produce the vanishing gradient problem (He et al., 2016). Fig. 3a-c indicates that the curve fitting performance varies considerably with the hyperparameters, which is mainly reflected in the reconstruction of high-frequency details.

Using a conventional CNN to solve the inversion problem is an intuitive approach. However, it is hard to choose proper hyperparameters, since the inversion result is hyperparameter-sensitive. A multi-scale architecture can extract features at different scales and is therefore able to recover more details than a conventional CNN with the same number of parameters. As a result, multi-scale architectures reduce the cost of hyperparameter selection. But we note that even though the three methods converge to the same level, as shown in Fig. 7, they yield quite different visual effects in the inversion sections, as shown in the first columns of Figs. 13 and 14. Overall, the multi-scale inversion sections show more thin layers, but MSCNN and UNet produce different high-impedance areas. Therefore, it is important to adopt an appropriate architecture.

From the inversion experiments, the key point is the reconstruction of high-frequency information. In the CV field, the MSE loss is known to produce over-smooth results, which can be improved by the perceptual loss in Equation (6). In order to build an impedance feature space in which to calculate the perceptual loss, we design an autoencoder that learns a mapping function from the impedance to itself, as shown in Fig. 9, and then extract features at the endpoints of the autoencoder. Figs. 8, 13 and 14 show that the perceptual loss contributes greatly to reconstructing high-frequency information. On the other hand, the endpoint layer l in Equation (6) and the weight factor λ_p in Equation (7) also affect the inversion results, as shown in Fig. 11a and b, so a trade-off must be made between detail reconstruction and fitting stability. The GAN experiments demonstrate that the adversarial training mechanism further promotes the reconstruction of details and generates finer layers, as shown in Figs. 13 and 14. Besides, some dark layers with low impedance values appear in the GAN inversion results, which may indicate its ability to recover high-frequency information.

Fig. 13. Inline inversion results by different methods. Top row: the baseline CNN. Middle row: MSCNN. Bottom row: UNet. First column: the pure networks. Second column: networks with perceptual loss. Third column: networks with GAN.

DL-based methods achieve promising results, but also show some limitations. Different architectures produce various visual effects, which may bring confusion to practical applications, yet there is no objective evaluation index to indicate which network should be used. The widely used MSE can provide a reference for the fitting performance, but it rewards smoothness. So it is necessary to build an evaluation function related to the structure and content of the impedance. The other obvious problem is the lack of training data. In practice, the number of well logs is highly limited, which often results in network overfitting. Tricks such as adding Gaussian noise cannot completely avoid the risk of overfitting. One way to address this is to build realistic structure models (Wu et al., 2020) to simulate more seismic and impedance pairs. Another meaningful way is to introduce the physical mechanism of Equation (1) into the network architecture to make full use of seismic traces that do not correspond to any true impedance sequence, which amounts to semi-supervised learning.

4. Conclusion

This paper comprehensively studies DL-based methods for the seismic impedance inversion problem. A series of networks is designed to improve the reconstruction of high-frequency information. Through experiments, we reveal the influence of network hyperparameters and architectures on the inversion performance. The differences between the conventional CNN and the multi-scale architectures in convergence, trace fitting and visual effect are studied in detail. Inspired by developments in the CV field, we adopt the perceptual loss and the GAN mechanism, which prove effective for enhancing high-frequency details. In spite of their success, DL-based methods still show the aforementioned limitations regarding an objective evaluation index and training data. We plan to address these two issues in the future.

        Acknowledgments

This research was supported by the National Natural Science Foundation of China under Grant No. 42050104.
