Research and Application of an LSTM-Based DGA Domain Name Detection Algorithm

2021-09-14 23:47:23 查偉金

Computer Knowledge and Technology (電腦知識與技術), 2021, Issue 22

查偉金

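Abstract: With the rapid development of Internet technology, networks now serve every kind of industry. As the number of domain names grows daily, detecting malicious domain names becomes both harder and more important. Malicious services often use domain generation algorithms (DGAs) to evade domain name detection, and DGA domain names are common in botnets and APT attacks. DGA domain names easily bypass traditional firewalls and intrusion detection devices, and existing methods are slow and of limited practical use. To address these problems, this paper applies deep learning and designs an LSTM-based DGA domain name detection method that picks out abnormal domain names from massive domain name samples, letting machines take over this repetitive work from people. Experiments show that the method reaches a detection accuracy of over 99.1%, which indicates that it is effective and feasible. Combined with a traffic probe, the method is also used to build a real-time monitoring system that accurately detects DGA domain names in live traffic and improves cyberspace security.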

Keywords: domain generation algorithm; botnet; deep learning; LSTM; cyberspace security

Abstract: With the rapid development of Internet technology, networks serve a wide range of industries. As the number of domain names increases day by day, the detection of malicious domain names has become both more difficult and more important. Malicious services use domain generation algorithms (DGAs) to evade domain detection, and DGA domains are common in botnets and APT attacks. DGA domains can easily bypass traditional firewalls and intrusion detection devices, and existing detection methods suffer from slow detection speed and poor real-time performance. To address these problems, a DGA domain detection algorithm based on the Long Short-Term Memory (LSTM) model was designed using deep learning. It can distinguish abnormal domain names among a large number of domain name samples, using machines in place of humans for such repetitive tasks. The experimental results show that the detection accuracy of this method is as high as 99.1%, indicating that it is effective and feasible. In addition, an LSTM-based real-time monitoring system for DGA domains is proposed in combination with a traffic probe to monitor network traffic in real time and improve cyberspace protection capabilities.

Key words: domain generation algorithm; botnet; deep learning; LSTM; cyberspace security

1 Introduction

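Network security problems are becoming increasingly prominent, and security incidents such as cyberattacks and cyberterrorism occur from time to time. With public clouds, private clouds, and large local area networks widely deployed in enterprises, the military, and schools, users' online operations and behavior generate large amounts of information every day, and criminals continually attempt to obtain confidential information and intelligence through cyberattacks and similar means.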

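Malware often uses DGA domain names to make its communication with C&C servers more reliable, thereby evading conventional blacklist detection. Identifying abnormal domain names among large numbers of domain name samples is a task for machines rather than for manual effort. Traditional DGA domain detection methods have significant shortcomings. Blacklist filtering [1] is accurate, but DGA domain names that are not yet on the blacklist must be added by hand, so it cannot keep up with the rapid growth of DGA domain names. Machine learning detection methods [2-7] rely on technicians to construct feature values through experiments and to design detection algorithms in order to detect unknown DGA domain names. These methods have problems of their own: manual feature extraction is labor-intensive, it cannot accurately capture all the required features, detection is slow, and detection accuracy is low.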

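In recent years deep learning [8] has performed well in natural language processing and has advantages over traditional machine learning: it extracts features automatically and achieves high accuracy by training on large numbers of samples. The classic recurrent neural network (RNN) retains contextual information well in language processing. However, as input sequences grow longer, the gradients propagated through time during training tend to vanish or explode, and the RNN's ability to perceive contextual information in a sentence declines. LSTM [9-11], an improvement on the RNN, addresses this problem and has achieved good results in DGA domain name detection. On this basis, and in combination with a traffic probe, we designed an LSTM-based real-time monitoring system for DGA domain names. To ensure good detection performance, this paper uses DGA domain names provided by 360 and legitimate domain names provided by Alexa for modeling and evaluation, in order to obtain the best detection algorithm.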
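For reference, the textbook LSTM update can be written as follows. This is the standard formulation, not necessarily the exact variant used in this work; the symbols ($x_t$ for the input at step $t$, $h_t$ for the hidden state, $c_t$ for the cell state, $W$ and $b$ for learned weights and biases) are introduced here for illustration only.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f [h_{t-1}, x_t] + b_f\right) \\
i_t &= \sigma\left(W_i [h_{t-1}, x_t] + b_i\right) \\
o_t &= \sigma\left(W_o [h_{t-1}, x_t] + b_o\right) \\
\tilde{c}_t &= \tanh\left(W_c [h_{t-1}, x_t] + b_c\right) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```

The cell state $c_t$ is updated additively, so when the forget gate $f_t$ stays close to 1 the gradient can flow back across many time steps without being repeatedly squashed. This is the usual explanation for why LSTM mitigates vanishing gradients.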

2 LSTM-Based DGA Domain Name Detection Algorithm

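The LSTM-based DGA domain name detection algorithm consists of three steps: domain name vectorization, contextual information extraction, and classification output.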
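The three steps can be sketched end to end in plain Python. This is a toy illustration under several assumptions, not the implementation used here: the alphabet, the embedding lookup, the layer sizes, and all function names (`lstm_step`, `score_domain`) are hypothetical, the weights are random and untrained, and bias terms are omitted for brevity.

```python
import math
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-."  # assumed character set
VOCAB = len(ALPHABET) + 1   # index 0 is reserved for unknown characters
EMBED, HIDDEN = 3, 4        # toy sizes, chosen only for illustration

rng = random.Random(0)      # fixed seed so the sketch is reproducible


def rand_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Untrained parameters; a real model would learn these from labeled domains.
embedding = rand_matrix(VOCAB, EMBED)
gates = {name: rand_matrix(HIDDEN, HIDDEN + EMBED) for name in ("i", "f", "o", "g")}
w_out = [rng.uniform(-0.5, 0.5) for _ in range(HIDDEN)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def lstm_step(x, h, c):
    """One LSTM time step (standard gate equations, biases omitted)."""
    z = h + x  # concatenate previous hidden state and current input
    i = [sigmoid(a) for a in matvec(gates["i"], z)]    # input gate
    f = [sigmoid(a) for a in matvec(gates["f"], z)]    # forget gate
    o = [sigmoid(a) for a in matvec(gates["o"], z)]    # output gate
    g = [math.tanh(a) for a in matvec(gates["g"], z)]  # candidate cell state
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new


def score_domain(domain):
    # Step 1: vectorization (character -> integer index, 0 if unknown).
    ids = [ALPHABET.find(ch) + 1 for ch in domain.lower()]
    # Step 2: context extraction (run the LSTM over the character sequence).
    h, c = [0.0] * HIDDEN, [0.0] * HIDDEN
    for idx in ids:
        h, c = lstm_step(embedding[idx], h, c)
    # Step 3: classification output (sigmoid over the final hidden state).
    return sigmoid(sum(w * v for w, v in zip(w_out, h)))
```

Calling `score_domain("example.com")` returns a value between 0 and 1 that a trained model would interpret as the probability of the domain being DGA-generated. With the random weights above the number carries no meaning.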

2.1 Domain Name Vectorization

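Before a domain name can be fed into the model, it must be vectorized. Common vectorization methods include bag of words (BoW), one-hot encoding, and n-grams. Because domain name strings have no grammar or word-order elements, we chose the BoW model: a character dictionary is generated from all characters that occur in the dataset and is stored as key-value pairs, for example ('a': 2).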
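The character dictionary described above might be built along the following lines. This is a minimal sketch of one plausible reading of the text. The function names `build_char_dict` and `vectorize`, the reserved index 0 for padding and unknown characters, the left-padding, and the sample domains are all assumptions made for illustration.

```python
def build_char_dict(domains):
    """Map every character seen in the corpus to a positive integer index.

    Index 0 is reserved for padding and unknown characters (an assumption,
    not something the text specifies).
    """
    chars = sorted({ch for domain in domains for ch in domain.lower()})
    return {ch: index for index, ch in enumerate(chars, start=1)}


def vectorize(domain, char_dict, max_len):
    """Encode a domain as a fixed-length list of indices, left-padded with 0."""
    ids = [char_dict.get(ch, 0) for ch in domain.lower()][:max_len]
    return [0] * (max_len - len(ids)) + ids


# Hypothetical sample data; a real dictionary would be built from the full dataset.
sample_domains = ["baidu", "google", "xjkqzvbn"]
char_dict = build_char_dict(sample_domains)
encoded = [vectorize(d, char_dict, max_len=10) for d in sample_domains]
```

The concrete index assigned to each character depends on the corpus and on the ordering rule, so the values produced here need not match the ('a': 2) example in the text.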