Wenbo Li and Baoling Ning
Dear Editor,
This letter addresses the problem of algorithm recommendation for online fault detection of spacecraft. By transforming the time series data into distributions and introducing a distribution-aware measure, a principled method is designed for quantifying the detectability of fault detection algorithms over specific datasets. Based on a sublinear-time filtering method, an efficient algorithm for evaluating detectability is designed. Combining these techniques, RecAD is proposed for recommending fault detection algorithms. Experimental results on typical datasets show that RecAD efficiently selects a detection algorithm with better performance, and that the cost of the recommendation is small.
As a typical kind of autonomous intelligent system, a spacecraft is usually composed of many complex components, each typically equipped with a number of sensors that produce many kinds of telemetry data. Because they operate in extreme environments, spacecraft are prone to failure or even damage caused by the failure of a single device or subsystem. To reduce the risk of such failures, a key task of spacecraft operations is anomaly detection, that is, discovering anomalies in the telemetry data.
Many research efforts have focused on anomaly detection over spacecraft telemetry data [1]–[3]. The out-of-limits (OOL) method is the most popular one due to its simplicity, low cost, and understandability [4], [5]. To overcome the limitations of OOL methods, many data-driven anomaly detection methods have been introduced [6]–[9]. Recently, an increasing number of deep anomaly detection methods have been designed [2]. The most typical ones include reconstruction-based approaches [10], generation-based approaches [11], and prediction-based approaches [4]. However, it has been found that no single method always outperforms the others [12]. A natural and feasible solution is to maintain several detection algorithms at the same time and select the most suitable one according to the actual situation. It is therefore necessary to study the problem of algorithm recommendation for detecting spacecraft anomalies.
Two principal challenges are identified.
1) The first one is the lack of labels and universal objective functions. Due to the limited computational resources of spacecraft and the scarcity of anomalies, labels are hardly ever available while online anomaly detection is running, so algorithm recommendation methods must work in a nearly fully unsupervised way. Even worse, there is no universal objective function that could guide algorithm recommendation.
2) The second one is the limited computational resources available for online anomaly detection on spacecraft. Because of the extreme working environments, and in order to extend the lifetime of the spacecraft as much as possible, only recommendation algorithms with sufficiently high efficiency can be deployed.
In this letter, to address the above challenges, an efficient automated algorithm recommendation method for detecting spacecraft anomalies is proposed, built on two ideas: measuring detectability by distributions and distinguishing distributions by sampling. The main contributions are: a formal definition of the fault detection algorithm recommendation problem, a Kullback-Leibler (KL) divergence based method for measuring the detectability of algorithms, a sublinear algorithm for efficiently estimating the measures and selecting the recommended detection algorithm, and detailed experimental results verifying the effectiveness of the proposed method.
Notations and problem description: The telemetry data of a spacecraft is usually represented by a time series $X=\{x_1,x_2,\ldots,x_n\}$, where each $x_t \in \mathbb{R}^m$ ($t \in [1,n]$) is an $m$-dimensional vector corresponding to the data on each dimension. In the following, for the sake of simplicity, the proposed method is explained under the assumption that the telemetry data has only one dimension, so that $X$ can be denoted as $\{x_1,x_2,\ldots,x_n\}$ with scalar entries. It is not hard to verify that the method extends trivially to the general case.
The goal of anomaly detection is to determine whether an observation $x_t$ is an anomaly or not. Let $\mathcal{A}=\{A_1,A_2,\ldots,A_h\}$ be the set of algorithms utilized for online anomaly detection. Each algorithm $A_i$ is obtained by training over a specific dataset $D_i$ and was selected beforehand as the best among many candidate algorithms. Then, given a new dataset $D_{dec}$, the problem of algorithm recommendation for anomaly detection (ARAD) is to select the detector/algorithm $A_i$ in $\mathcal{A}$ with the best performance.
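Algorithm 1 RecAD (Recommendation for Anomaly Detection)
Input: A set of algorithms $\mathcal{A}=\{A_1,\ldots,A_n\}$ and the corresponding training datasets $\mathcal{D}=\{D_1,\ldots,D_n\}$, the new dataset $D_{dec}$.
Output: The recommended algorithm $A$.
1: Construct the discretized data $\hat{\mathcal{D}}=\{\hat{D}_i\}$ with $S \in \mathbb{N}^+$;
2: Construct the discretized data $\hat{D}_{dec}$ with $S \in \mathbb{N}^+$;
3: for each dataset $\hat{D} \in \hat{\mathcal{D}} \cup \hat{D}_{dec}$ do
4:     Initialize $\hat{D}^{(l)}$ to be empty;
5:     for each $\hat{x}_i \in \hat{D}$ such that $i \in [1, |\hat{D}|-l+1]$ do
6:         $\hat{x}^{(l)}_i \leftarrow \hat{x}_i \hat{x}_{i+1} \cdots \hat{x}_{i+l-1}$;
7:         insert $\hat{x}^{(l)}_i$ to $\hat{D}^{(l)}$;
8: $A \leftarrow \arg\max_{A_i \in \mathcal{A}} \mathrm{Match}(\hat{D}^{(l)}_i, \hat{D}^{(l)}_{dec})$;
9: return $A$;
Fig. 1. Overview of the RecAD method.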
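The steps of Algorithm 1 can be illustrated with a minimal Python sketch. Several details are assumptions made only for illustration and are not taken from the letter: equal-width binning into S symbols is used as the discretization, the function names (`discretize`, `l_grams`, `recommend`) are hypothetical, and `overlap_match` is a simple stand-in for the Match procedure, not the KL-based measure of Algorithm 2.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence


def discretize(xs: Sequence[float], s: int, lo: float, hi: float) -> List[int]:
    """Map each value to one of s equal-width bins on [lo, hi] (assumed scheme)."""
    width = (hi - lo) / s if hi > lo else 1.0
    return [min(s - 1, max(0, int((x - lo) / width))) for x in xs]


def l_grams(symbols: Sequence[int], l: int) -> Counter:
    """Count every window of l consecutive symbols (steps 4-7 of Algorithm 1)."""
    return Counter(tuple(symbols[i:i + l]) for i in range(len(symbols) - l + 1))


def overlap_match(train_grams: Counter, new_grams: Counter) -> float:
    """Stand-in score: shared probability mass of the two l-gram distributions."""
    n_train, n_new = sum(train_grams.values()), sum(new_grams.values())
    if n_train == 0 or n_new == 0:
        return 0.0
    return sum(min(c / n_train, new_grams[g] / n_new) for g, c in train_grams.items())


def recommend(train: Dict[str, Sequence[float]], new: Sequence[float],
              match: Callable[[Counter, Counter], float], s: int = 8, l: int = 2) -> str:
    """Return the name of the algorithm whose training data best matches `new`."""
    values = [x for d in train.values() for x in d] + list(new)
    lo, hi = min(values), max(values)
    new_grams = l_grams(discretize(new, s, lo, hi), l)
    scores = {name: match(l_grams(discretize(d, s, lo, hi), l), new_grams)
              for name, d in train.items()}
    return max(scores, key=scores.get)  # step 8: argmax over candidates


# Toy data: the new slice resembles the data that "A_high" was trained on.
train = {"A_low": [0.0, 0.1, 0.2, 0.1, 0.0, 0.2],
         "A_high": [0.8, 0.9, 1.0, 0.9, 0.8, 1.0]}
new = [0.9, 1.0, 0.8, 0.9, 1.0]
best = recommend(train, new, overlap_match, s=4, l=2)
```

Passing the match function as a parameter keeps the windowing and selection logic independent of how similarity between the two discretized datasets is scored.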
Detectability evaluation: The essential part of computing Dec($A_i$, $D$) is calculating the corresponding KL-divergence. The KL-divergence can be calculated directly from its definition, with a time cost bounded by $O(n \log n)$. To satisfy the requirements of online algorithm recommendation and anomaly detection, the total cost of RecAD (Algorithm 1) should be reduced as much as possible. Therefore, we propose a method that filters out unnecessary computations of Dec($A_i$, $D$) and reduces the number of KL-divergence computations.
Sublinear algorithms for filtering: In this part, a sublinear algorithm for filtering the candidate algorithms is introduced. The filtering algorithm reduces the number of divergence computations and improves the performance of RecAD significantly at only a small extra cost. Intuitively, if the new data $D_{dec}$ is quite different from a specific training dataset $D_i$, that is, the corresponding KL-divergence is quite large, the algorithm matched to $D_i$ can be filtered out.
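As a concrete illustration of computing a KL-divergence directly from its definition, the following Python sketch estimates KL(P||Q) from two samples of discrete symbols. It is a sketch under stated assumptions, not the authors' procedure: the additive smoothing constant `alpha` is introduced here only to keep the divergence finite when a symbol is missing from one sample, the function names are hypothetical, and taking the reciprocal of the divergence as the score follows the final return step of Algorithm 2.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def empirical_kl(p_samples: Sequence[Hashable], q_samples: Sequence[Hashable],
                 alpha: float = 1.0) -> float:
    """Estimate KL(P || Q) = sum_a p(a) * log(p(a) / q(a)) from two samples.

    alpha is an additive-smoothing constant (an assumption of this sketch).
    """
    p_counts, q_counts = Counter(p_samples), Counter(q_samples)
    support = set(p_counts) | set(q_counts)
    p_total = len(p_samples) + alpha * len(support)
    q_total = len(q_samples) + alpha * len(support)
    kl = 0.0
    for a in support:
        p = (p_counts[a] + alpha) / p_total
        q = (q_counts[a] + alpha) / q_total
        kl += p * math.log(p / q)
    return max(kl, 0.0)  # guard against tiny negative rounding errors


def detectability(train_samples: Sequence[Hashable], new_samples: Sequence[Hashable],
                  alpha: float = 1.0) -> float:
    """Reciprocal of the divergence: a larger value means a closer match."""
    kl = empirical_kl(train_samples, new_samples, alpha)
    return math.inf if kl == 0.0 else 1.0 / kl
```

For example, `empirical_kl("aaab", "abbb")` gives the smoothed distributions (2/3, 1/3) and (1/3, 2/3), whose divergence is (1/3) ln 2.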
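Algorithm 2 Match (Detectability Evaluation)
Input: The new dataset $D_{dec}$, a threshold $\gamma \in (0,1)$, a training dataset $D^*$, $\epsilon \in (0,1)$, and an input algorithm $A^*$.
Output: The detectability of $A^*$ on $D_{dec}$.
1: $Z \leftarrow \{a \mid p_{D_{dec}}(a) < \gamma\}$;
2: Construct $D$ by removing data points in $Z$ from $D_{dec}$;
3: Let $Z'$ be the domain of $D$;
4: Construct $D'$ by removing data points not in $Z'$ from $D^*$;
5: Let $k$ be set from $n$ and $\epsilon$ (the source expression, beginning $2\log n + \log 2$, is truncated), where $n$ is the domain size;
6: $B_0 \leftarrow \{a : D'(a) < \frac{\epsilon}{2n}\}$;
7: for $j$ from 1 to $k-1$ do
8:     $B_j \leftarrow \{a : \frac{\epsilon(1+\epsilon)^{j-1}}{2n} \le D'(a) < \frac{\epsilon(1+\epsilon)^{j}}{2n}\}$;
9: count $= 0$;
10: for each random sample $d$ from $D$ do
11:     Let $B_i$ be the one satisfying $d \in B_i$;
12:     Insert $d$ to the multiset $S_i$;
13:     if $|S_i| == C \times \sqrt{n}$ then
14:         if ($\|p_{S_i}\|_2^2 > (1+\epsilon^2)/|B_i|$) and ($D'(B_i) \ge \epsilon/k$) then
15:             return 0;
16:         else
17:             count = count + 1;
18:         if count $== k$ then
19:             break;
20: return $1/\mathrm{KL}(p_{D^*} \,\|\, p_{D_{dec}})$;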
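Two building blocks of the filtering step can be sketched in Python: assigning a domain element to a geometric probability bucket, and testing a bucket's samples for non-uniformity through a collision-based estimate of the squared L2 norm. This is a hedged sketch rather than the authors' implementation. The function names are hypothetical, the pairwise-collision estimator is one standard way to estimate the squared norm and is assumed here, and the values of `k` and of the constant C that controls when a bucket is tested are left to the caller.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def bucket_index(prob: float, eps: float, n: int) -> int:
    """Bucket 0 holds prob < eps/(2n); bucket j >= 1 holds
    eps*(1+eps)**(j-1)/(2n) <= prob < eps*(1+eps)**j/(2n)."""
    base = eps / (2 * n)
    if prob < base:
        return 0
    return int(math.floor(math.log(prob / base, 1 + eps))) + 1


def collision_norm_sq(samples: Sequence[Hashable]) -> float:
    """Estimate ||p||_2^2 as the fraction of sample pairs that collide."""
    m = len(samples)
    if m < 2:
        return 0.0
    colliding = sum(c * (c - 1) // 2 for c in Counter(samples).values())
    return colliding / (m * (m - 1) // 2)


def bucket_rejects(samples: Sequence[Hashable], bucket_size: int,
                   bucket_weight: float, eps: float, k: int) -> bool:
    """True when a bucket that carries enough probability mass looks far from
    uniform, which is the case where the filter returns 0 for the candidate."""
    far_from_uniform = collision_norm_sq(samples) > (1 + eps ** 2) / bucket_size
    heavy_enough = bucket_weight >= eps / k
    return far_from_uniform and heavy_enough
```

Within a single bucket all probabilities under the training distribution agree up to a factor of (1 + ε), so samples of the new data that fall into that bucket should look nearly uniform if the two distributions are close. A collision rate well above 1/|B_i| is evidence that they are not, and the candidate can be discarded without computing the full KL-divergence.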
Experimental results: This part introduces the experimental results.
Datasets: Seven real-life datasets, SWaT, WADI, DMDS, SKAB, MSL, SMAP, and SMD, are used. They are also often used in previous work [4], [10], and the details can be found in [12].
Algorithms: The fault detection algorithms considered as candidates for recommendation in this letter are PCA, UAE [15], LSTM-AE [16], TCN-AE [17], LSTM-VAE [18], MSCRED [19], BeatGAN [20], and NASALSTM [4].
Results: The comparison of the algorithms used in the experiments and the recommendation results are shown in Table 1, where the comparison is based on F1-scores [12]. UAE outperforms the other methods on five of the datasets, TCN-AE is the best detection algorithm on SMD, and LSTM-VAE performs best on WADI. The recommendation results are shown in the last row of the table, where the recommended algorithm is denoted by RecAD. RecAD cannot always find the best algorithm, but it selects the best one on six datasets. RecAD also effectively avoids the worst algorithm; for example, on SKAB, the algorithm recommended by RecAD is not the best but has comparable performance.
Table 1. The Result of Recommendation
To verify the efficiency of RecAD, we selected 10 data slices generated from the given datasets and ran RecAD both with and without the filtering procedure. The time costs are compared in Fig. 2, where the labels on the x-axis represent different data slices, the time cost of the no-filter method is normalized to 1, and the values on the y-axis represent the ratio of the time cost of RecAD with filtering to that without filtering. The filtering procedure proposed in this letter improves the performance of RecAD significantly in most cases. The only exception is the third data slice, where the cost with filtering is 6% higher than without it: the data slice is too common for the filtering procedure to rule out any algorithm, so the filtering procedure itself only adds extra cost.
Fig. 2. Efficiency of the filtering procedure.
To verify the end-to-end performance of RecAD, three kinds of time costs are compared. The first is All-Check, the cost of using all algorithms to check for anomalies; the second is Rec-Check, the cost of using only the recommended algorithm; and the third is RecAD, the cost of selecting the recommended algorithm. The detailed results are shown in Table 2. Algorithm recommendation itself incurs only a small cost, and the time cost is greatly reduced by running only the recommended algorithm.
Table 2. The Result of Time Costs (in seconds)
Conclusion: This letter has investigated the problem of algorithm recommendation for online anomaly detection of spacecraft. Using the idea of measuring detectability by distributions, RecAD is proposed to support efficient automated algorithm recommendation for detecting spacecraft anomalies. Experimental results show that the proposed method is effective and efficient.
Acknowledgments: This work was supported by the National Key R&D Program of China (2021YFB1715000) and the National Natural Science Foundation of China (U1811461, 62022013, 12150007, 62103450, 61832003, 62272137).
IEEE/CAA Journal of Automatica Sinica, 2024, Issue 1