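Adaptive Sliding Mode Attitude Control of Spacecraft Based on Gaussian Processes
ZHAO Yuxin, HE Yongxu, XU Geng, CHEN Liheng
(College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin 150001, Heilongjiang, China)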
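For rigid spacecraft subject to model uncertainties and external disturbances, a novel adaptive sliding mode attitude control algorithm based on Gaussian process regression (GPR) is proposed. The algorithm is self-learning and achieves high-accuracy, strongly robust, and efficient attitude tracking across different attitude control tasks. First, building on the nominal quaternion-based dynamic model of the spacecraft, the sparse online Gaussian process (SOGP) regression method is applied to learn the unknown system dynamics. Second, a sliding mode control algorithm is designed using the GP predictive mean, the control gain is adapted using the GP predictive variance, and the stability of the closed-loop system is rigorously proved with the Lyapunov method, which guarantees asymptotic convergence of the attitude tracking error. Finally, numerical simulations verify the effectiveness of the designed controller. The results show that, compared with adaptive sliding mode control (ASMC) and neural network adaptive control, the proposed self-learning algorithm achieves faster convergence, higher tracking accuracy, and lower control cost.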
attitude tracking; quaternion; Gaussian process regression; adaptive control; sliding mode control
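High-accuracy attitude control is key to the successful execution of spacecraft missions such as patrol, formation flying, and rendezvous and docking [1-3]. However, inertia uncertainty, external disturbance torques, and the highly nonlinear, strongly coupled attitude dynamics all make high-accuracy attitude control difficult. Studying attitude control for spacecraft with model uncertainties and external disturbances is therefore of practical importance.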
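Many nonlinear control algorithms have been proposed for spacecraft attitude control under uncertainty, such as sliding mode control [4], backstepping control [5], and adaptive control [6]. Although these conventional algorithms improve attitude control performance by compensating for uncertainty with observers or adaptive laws, they lack self-learning ability: when the attitude tracking task changes, performance may degrade sharply unless the control parameters are retuned manually [7]. In recent years, researchers have combined machine learning methods such as reinforcement learning and neural networks with conventional control algorithms, making full use of spacecraft observation data to improve the self-learning ability of attitude control. Reference [7] designed a reinforcement-learning-based sliding mode control algorithm that maintains control performance when the tracking task changes. However, the algorithm needs offline observation data to ensure the learning speed of the control parameters, which makes it harder to apply in practice. Reference [8] combined a Chebyshev neural network with terminal sliding mode control to solve the finite-time attitude tracking problem for spacecraft with uncertainties. Reference [9] designed a sliding mode control algorithm based on a radial basis function neural network, which enlarges the region in which the neural network is effective and thereby ensures robustness to unknown disturbances. Because a neural network is a deterministic machine learning method, it is difficult to assess directly how reliable its uncertainty prediction is. Neural network control algorithms therefore usually use high feedback gains to keep prediction errors from degrading closed-loop performance, which makes their control cost high.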
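Gaussian process regression (GPR) is a probabilistic machine learning method with a rigorous statistical foundation that can effectively model complex functions that are high-dimensional, nonlinear, and supported by few samples [10]. GPR describes the distribution of an unknown function with a Gaussian process (GP) and predicts function values from input-output data. A GP model is nonparametric, handles observation noise effectively, and provides variance information with which the reliability of its predictions can be assessed [11]. GPR can therefore also be combined with conventional control algorithms to improve the control performance and self-learning ability of uncertain nonlinear systems. Reference [12] designed a GPR-based computed torque control algorithm that, by compensating for uncertainty more accurately, effectively reduces the feedback gain and thus improves control efficiency. Although this method lowers the control cost by introducing the predictive variance of the GP model, it requires a large amount of training data collected offline to ensure that the GP model is valid, which limits its practicality. Reference [13] combined online GPR with model reference adaptive control, removing the persistent excitation requirement on the input signal in conventional adaptive control. This method is robust to model uncertainty, but it is difficult to guarantee tracking performance in the presence of external disturbances.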
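As a rough illustration of the GP prediction step described above, the following minimal sketch computes the posterior mean and variance of a scalar function from a few noisy samples. It is a plain batch GP with a squared-exponential kernel, not the sparse online variant used in the paper; the kernel choice, the hyperparameter values, the noise level, and all function names are assumptions made for this example.

```python
import math


def rbf_kernel(x1, x2, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance between two scalar inputs (assumed kernel)."""
    return signal_var * math.exp(-0.5 * ((x1 - x2) / length_scale) ** 2)


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(acc) if i == j else acc / low[j][j]
    return low


def solve_lower(low, b):
    """Solve low @ y = b by forward substitution."""
    y = []
    for i, row in enumerate(low):
        y.append((b[i] - sum(row[k] * y[k] for k in range(i))) / row[i])
    return y


def solve_upper_from_lower(low, y):
    """Solve low.T @ x = y by back substitution."""
    n = len(low)
    x = [0.0] * n
    for i in reversed(range(n)):
        acc = y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))
        x[i] = acc / low[i][i]
    return x


def gp_predict(x_train, y_train, x_query, noise_var=1e-4):
    """Return the GP posterior mean and variance of the latent function at x_query."""
    n = len(x_train)
    # Gram matrix of the training inputs plus observation noise on the diagonal.
    gram = [[rbf_kernel(x_train[i], x_train[j]) + (noise_var if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    low = cholesky(gram)
    alpha = solve_upper_from_lower(low, solve_lower(low, y_train))
    k_star = [rbf_kernel(x, x_query) for x in x_train]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(low, k_star)
    var = rbf_kernel(x_query, x_query) - sum(vi * vi for vi in v)
    return mean, max(var, 0.0)  # clamp tiny negative round-off


# Toy data: samples of sin(x) stand in for measurements of an unknown dynamics term.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [math.sin(x) for x in xs]
mean_at, var_at = gp_predict(xs, ys, 1.0)      # query at a training input
mean_near, var_near = gp_predict(xs, ys, 0.5)  # query between training inputs
mean_far, var_far = gp_predict(xs, ys, 10.0)   # query far from all data
```

Far from the data the variance returns to the prior value and the mean returns to zero, which is the signal a controller can use to decide how much to trust the learned compensation.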
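For spacecraft with model uncertainties and external disturbances, this paper proposes a GPR-based adaptive sliding mode control (ASMC) algorithm that achieves high-accuracy, strongly robust, and efficient attitude tracking across different attitude control tasks. First, the mapping of the uncertainty is learned from system observation data with GPR, so that the predictive mean of the GP model provides accurate dynamic compensation. Then, the ASMC algorithm is designed around the GP model: the predictive variance actively adjusts both the feedback gain and the update rate of the control-parameter adaptive law, so that the control parameters tune themselves according to how reliable the uncertainty prediction is. Finally, the Lyapunov method is used to prove that the attitude and angular velocity tracking errors converge globally asymptotically with any given probability. Simulation comparisons with ASMC and neural network sliding mode control show that the proposed self-learning control algorithm achieves faster convergence, higher tracking accuracy, and lower control cost across different attitude control tasks.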
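The idea of letting the predictive variance drive the feedback gain can be illustrated with a small sketch. The gain rule `k_min + beta * sqrt(var)`, the boundary-layer saturation, the omission of the nominal model terms, and every constant below are assumptions chosen for illustration only; they are not the control law or adaptive law derived in the paper.

```python
import math


def saturate(value, width):
    """Boundary-layer replacement for sign(): linear inside |value| <= width."""
    return max(-1.0, min(1.0, value / width))


def gp_smc_torque(s, gp_mean, gp_var, k_min=0.1, beta=2.0, width=0.01):
    """Per-axis torque: cancel the GP mean, add a switching term scaled by the GP std.

    s       : sliding variable, one entry per body axis
    gp_mean : GP predictive mean of the lumped uncertainty, per axis
    gp_var  : GP predictive variance, per axis
    k_min, beta, width : hypothetical tuning constants (not from the paper)
    """
    torque = []
    for s_i, mu_i, var_i in zip(s, gp_mean, gp_var):
        gain = k_min + beta * math.sqrt(var_i)  # larger when the model is less certain
        torque.append(-mu_i - gain * saturate(s_i, width))
    return torque


s = [0.05, -0.02, 0.0]
mu = [0.01, 0.00, -0.03]
confident = gp_smc_torque(s, mu, gp_var=[1e-4, 1e-4, 1e-4])  # well-learned region
uncertain = gp_smc_torque(s, mu, gp_var=[0.25, 0.25, 0.25])  # poorly-learned region
```

With the same sliding variable and mean prediction, the switching effort is small where the GP is confident and grows where it is not, which is how a variance-aware design can avoid the uniformly high gains discussed above.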
Then the attitude tracking error kinematics and dynamics of the spacecraft can be expressed as [14]
where:
The sliding variable is defined as
where:
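For orientation, a form widely used in quaternion-based sliding mode designs is s = ω_e + λ q_ev, where q_ev is the vector part of the error quaternion and ω_e is the angular velocity error. The sketch below computes it with scalar-first quaternions and q_e = conj(q_d) * q. Whether the sliding variable defined in this paper has exactly this form is an assumption, as are the function names and the gain `lam`.

```python
def quat_conj(q):
    """Conjugate of a scalar-first quaternion (q0, q1, q2, q3)."""
    return (q[0], -q[1], -q[2], -q[3])


def quat_mul(a, b):
    """Hamilton product a * b for scalar-first quaternions."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (
        a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
        a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
        a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
        a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0,
    )


def sliding_variable(q, q_des, omega_err, lam=1.0):
    """Return s = omega_err + lam * q_ev, with q_e = conj(q_des) * q (assumed form)."""
    q_err = quat_mul(quat_conj(q_des), q)
    return [w + lam * qv for w, qv in zip(omega_err, q_err[1:])]


# Attitude already on target: the error quaternion is the identity, so s = omega_err.
q_des = (0.5, 0.5, 0.5, 0.5)
s_aligned = sliding_variable(q_des, q_des, [0.1, -0.2, 0.3])

# 180 degree rotation about the x axis relative to the identity attitude, at rest.
s_flipped = sliding_variable((0.0, 1.0, 0.0, 0.0), (1.0, 0.0, 0.0, 0.0),
                             [0.0, 0.0, 0.0], lam=2.0)
```

On the surface s = 0 the angular velocity error equals -λ q_ev, so the attitude error decays along the surface at a rate set by λ.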
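are the predictive mean vector and the variance matrix, respectively, whose elements are computed from Eq. (12).
For the attitude tracking control problem of spacecraft (1), the following GP-based ASMC (GP-ASMC) algorithm is designed from Eqs. (17), (20), and (22):
Proof. Choose the Lyapunov function as
Then, from Eq. (29),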
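This section uses numerical simulation examples to demonstrate the effectiveness of the GP-ASMC algorithm for spacecraft attitude tracking control. In the simulation, the true inertia matrix of the spacecraft is taken as
The vector part of the initial attitude quaternion and the initial angular velocity of the spacecraft are, respectively,
ASMC [6] and neural network ASMC (NN-ASMC) [9] are used as comparison methods to demonstrate the advantages of GP-ASMC. The control parameters of the three methods are listed in Table 1. In addition, the spacecraft performs two different attitude control tasks with the control parameters left unchanged, to verify the self-learning ability of GP-ASMC. The desired attitude parameters of the two tasks are listed in Table 2.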
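Table 1 Parameter settings of the three control algorithms
Table 2 Desired attitude parameters of the two attitude control tasks
Fig. 2 Attitude quaternion tracking error in Task 1
Fig. 3 Angular velocity tracking error in Task 1
Fig. 4 Control torque in Task 1
Fig. 5 GP-based uncertainty prediction in Task 1
Fig. 6 Adaptive update curves of the control gain in Task 1
Fig. 7 Attitude quaternion tracking error in Task 2
Fig. 8 Angular velocity tracking error in Task 2
Fig. 9 Control torque in Task 2
Table 3 Performance comparison of the three control algorithms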
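For spacecraft with strong model uncertainties and external disturbances, this paper combines the GPR learning algorithm with the sliding mode control framework to design a self-learning adaptive sliding mode attitude tracking control algorithm, and proves that the attitude quaternion and angular velocity tracking errors converge globally asymptotically. The proposed GP-ASMC algorithm learns the lumped system uncertainty with a probabilistic GP model: the GP predictive mean provides accurate dynamic compensation, and the predictive variance is used to adjust the control gain, which improves control efficiency while maintaining tracking accuracy. In addition, the ASMC framework gives GP-ASMC a degree of robustness to errors in the uncertainty prediction. Simulation results show that GP-ASMC converges quickly, tracks accurately, and has a low control cost across different attitude control tasks, which gives it practical engineering value. To further characterize the algorithm's tolerance to component errors in the control system, future work will analyze how actuator nonlinearity, missing sensor measurements, and measurement errors affect attitude tracking performance, and will improve the control algorithm to ensure its effectiveness in practical applications.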
[1] LIU X, MENG Z, YOU Z. Adaptive collision-free formation control for under-actuated spacecraft[J]. Aerospace Science and Technology, 2018, 79: 223-232.
[2] NASTASI K M, BLACK J T. Adaptively tracking maneuvering spacecraft with a globally distributed, diversely populated surveillance network[J]. Journal of Guidance, Control, and Dynamics, 2019, 42(5): 1033-1048.
[3] SUN L. Adaptive fault-tolerant constrained control of cooperative spacecraft rendezvous and docking[J]. IEEE Transactions on Industrial Electronics, 2020, 67(4): 3107-3115.
[4] QIAO J, LI Z, XU J, et al. Composite nonsingular terminal sliding mode attitude controller for spacecraft with actuator dynamics under matched and mismatched disturbances[J]. IEEE Transactions on Industrial Informatics, 2020, 16(2): 1153-1162.
[5] ZHUANG H, SUN Q, CHEN Z, et al. Back-stepping active disturbance rejection control for attitude control of aircraft systems based on extended state observer[J]. International Journal of Control, Automation and Systems, 2021, 19(6): 2134-2149.
[6] ZHU Z, XIA Y, FU M. Adaptive sliding mode control for attitude stabilization with actuator saturation[J]. IEEE Transactions on Industrial Electronics, 2011, 58(10): 4898-4907.
[7] ZHENG M, WU Y, LI C. Reinforcement learning strategy for spacecraft attitude hyperagile tracking control with uncertainties[J]. Aerospace Science and Technology, 2021, 119: 107-126.
[8] ZOU A, KUMAR K D, HOU Z, et al. Finite-time attitude tracking control for spacecraft using terminal sliding mode and Chebyshev neural network[J]. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 2011, 41(4): 950-963.
[9] ZOU Y. Attitude tracking control for spacecraft with robust adaptive RBFNN augmenting sliding mode control[J]. Aerospace Science and Technology, 2016, 56: 197-204.
[10] RASMUSSEN C E, WILLIAMS C K I. Gaussian processes for machine learning[M]. Cambridge, Mass: MIT Press, 2006: 1-83.
[11] SRINIVAS N, KRAUSE A, KAKADE S M, et al. Information-theoretic regret bounds for Gaussian process optimization in the bandit setting[J]. IEEE Transactions on Information Theory, 2012, 58(5): 3250-3265.
[12] BECKERS T, KULIĆ D, HIRCHE S. Stable Gaussian process based tracking control of Euler-Lagrange systems[J]. Automatica, 2019, 103: 390-397.
[13] CHOWDHARY G, KINGRAVI H A, HOW J P, et al. Bayesian nonparametric adaptive control using Gaussian processes[J]. IEEE Transactions on Neural Networks and Learning Systems, 2015, 26(3): 537-550.
[14] SHUSTER M D. A survey of attitude representations[J]. The Journal of the Astronautical Sciences, 1993, 41(4): 439-517.
[15] YANG Y. Spacecraft modeling, attitude determination, and control: quaternion-based approach[M]. CRC Press, 2019: 43-52.
[16] SIDI M J. Spacecraft dynamics and control: a practical engineering approach[M]. New York: Cambridge University Press, 1997: 88-111.
[17] CSATÓ L, OPPER M. Sparse online Gaussian processes[J]. Neural Computation, 2002, 14(3): 641-669.
[18] FIEDLER C, SCHERER C W, TRIMPE S. Practical and rigorous uncertainty bounds for Gaussian process regression[C]// Proceedings of the AAAI Conference on Artificial Intelligence. Palo Alto, USA: AAAI Press, 2021: 7439-7447.
[19] ZHONG Jingjia, ZHAO Hong, TONG Zeyou, et al. Research on controller parameter optimization design based on RBF neural network[J]. Missiles and Space Vehicles, 2020(3): 76-80. (in Chinese)
[20] KRSTIC M, KOKOTOVIC P V, KANELLAKOPOULOS I. Nonlinear and adaptive control design[M]. Hoboken, USA: John Wiley & Sons, Inc., 1995: 489-491.
Adaptive Sliding Mode Attitude Control of Spacecraft Based on Gaussian Processes
ZHAO Yuxin, HE Yongxu, XU Geng, CHEN Liheng
(College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin 150001, Heilongjiang, China)
A novel adaptive sliding mode attitude control algorithm based on Gaussian process regression (GPR) is proposed for rigid spacecraft with model uncertainties and external disturbances. The proposed algorithm is self-learning and achieves attitude tracking with high accuracy, robustness, and efficiency under different attitude control tasks. First, building on the nominal quaternion-based dynamic model of the spacecraft, the sparse online Gaussian process (SOGP) technique is applied to learn the unknown system dynamics. Second, a sliding mode control algorithm is designed using the predictive means of the GPs, and the control gain is adapted based on the predictive variances. Moreover, the stability of the closed-loop system is proved with the Lyapunov approach, which guarantees the asymptotic convergence of the attitude tracking error. Finally, the effectiveness of the designed controller is verified by numerical simulation. The results show that the proposed self-learning controller has faster convergence, higher tracking accuracy, and lower energy cost than the adaptive sliding mode control (ASMC) and neural network adaptive control algorithms.
attitude tracking; quaternion; Gaussian process regression; adaptive control; sliding mode control
2022-04-26;
2022-06-13
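National Natural Science Foundation of China (61903098)
ZHAO Yuxin (1980-), male, Ph.D., professor. His main research interests are underwater navigation technology and applications, and intelligent control and decision-making.
HE Yongxu (1993-), female, Ph.D. candidate. Her main research interests are nonlinear system control, and intelligent control and decision-making.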
TP 273
A
10.19328/j.cnki.2096-8655.2022.04.010