Kehua Yang , Tian Tan and Wei Zhang
Abstract: Dempster-Shafer (D-S) evidence theory is a key technology for integrating uncertain information from multiple sources. However, the combination rules can be paradoxical when the evidence seriously conflict with each other. In the paper, we propose a novel combination algorithm based on unsupervised Density-Based Spatial Clustering of Applications with Noise (DBSCAN) density clustering. In the proposed mechanism, firstly, the original evidence sets are preprocessed by DBSCAN density clustering, and a successfully focal element similarity criteria is used to mine the potential information between the evidence, and make a correct measure of the conflict evidence. Then, two different discount factors are adopted to revise the original evidence sets, based on the result of DBSCAN density clustering. Finally, we conduct the information fusion for the revised evidence sets by D-S combination rules. Simulation results show that the proposed method can effectively solve the synthesis problem of high-conflict evidence, with better accuracy, stability and convergence speed.
Keywords: D-S evidence theory, information fusion, DBSCAN, combination rules.
D-S evidence theory is a method of information processing to solve multi-valued mapping problems [Dempster (1967); Shafer (1979)]. Compared with the Probability Decision Theory, the D-S evidence theory can distinguish the unknown and uncertainty of problems, and the prior knowledge is not essential. Therefore, the D-S evidence theory has been widely used in many fields, such as reliability analysis [Ma, Dui and Yang(2017)], identification and decision analysis [Mokhtari, Ren, Roberts et al. (2012); Yu,Wang, Feng et al. (2013)]. In D-S evidence theory, the updating of probability value is achieved by Dempster combination rules, which is simple and is only used by machine.However, in order to combine the evidence with seriously conflict, the combination results can be paradoxical [Zadeh (1986)]. The problem has attracted many academics and research institutions in the recent years, and they have made many improvements from different perspectives and aspects.
Modifying the combination rules is an effective method to solve the paradoxical problem.Smets [Smets (1990)] thought that the conflicting mass assignments to empty set in belief function theory. In Yager [Yager (1987)], Yager proposed the closed world assumption of the identification framework, and identified the conflict as an unknown area of the objective world, and assigned the probability of supporting evidence conflict to the unknown domain, which actually negated all the conflicts between the evidence. Sun et al.[Sun, Xiu and Wei (2000)] proposed a weighted method to allocate conflict, where the conflict be allocated to all focal elements in a certain proportion. Lefevre et al. [Lefevre,Colot and Vannoorenberghe (2002)] proposed a new approach, named combination with adapted conflict (CWAC), providing a formalism preserving the initial role of the conflict as an alarm signal announcing that there was a kind of disagreement between sources. The scheme was based on dissimilarity measures and on a normalization process between belieffunctions. Lu et al. [Lu and Qin (2011)] proposed a scheme, which assigned the collision coefficients to each coke element by using the weight of the Pignistic probability distance calculations, and then synthesize them by the unified reliability function.
Changing the model of the evidence is also a valid way to deal with the paradoxical problem, which makes the conflict resolved by the complete set among evidence and then using D-S rules to synthesize. Therefore, in recent years, some researches have concentrated on the how to change the model of the evidence. In Murphy [Murphy(2000), Murphy et al. proposed a scheme to obtain the modified evidence by calculating the arithmetic mean of the N pieces of evidence based on basic trust distribution function,and then modify the evidence for N-1 D-S rule combinations. Deng et al. [Deng, Shi and Zhu (2004)] proposed the way to calculating the weight of each piece of evidence by introducing Jousselme evidence distance, and the revised evidence was obtained by calculating arithmetical average value according to the weight of the original evidence set.In Xiong et al. [Xiong, Yang and Qu (2011)], an effective method to calculating the weight of each evidence by Euclidean distance was proposed, then obtaining a reference evidence based on the weight, and using the reference evidence to modify the original evidence set, and synthesizing by using the D-S rule. Bi et al. [Bi, An and Li (2016)]abandoned the calculation of the similarity of the distance method, and introduced the idea of Tanimoto’s measure, which comprehensively considered the inclusiveness and difference among evidence to modify the original evidence set.
In the paper, we present a novel combination algorithm based on unsupervised Density-Based Spatial Clustering of Applications with Noise (DBSCAN) density clustering on account of existing methods, which reserves some excellent mathematics properties, such as commutative law and associative of D-S, from the view of correcting the evidence model. The conflict measurement method is based on the Jousselme distance. And according to the degree of trust of each focal element in the evidence, the similarity among the evidence is considered by combining the spatial distance of each evidence and the difference in the overall decision output. Then, we make the dynamic DBSCAN density clustering to preprocess the original evidence set, and analyze the clustering results, if the conflict evidence occurs, the discount factor is used to correct them. Finally,we conduct the information fusion for the revised evidence sets by D-S combination rules.Simulation results show that the proposed method is reasonable and effective, which has better performance compared to the typical existing methods.
The rest of paper is organized as follows. In Section 2, the D-S evidence theory is detailed. Next, Section 3 presents a novel combination algorithm based on unsupervised Density-Based Spatial Clustering of Applications with Noise (DBSCAN) density clustering.Some numerical results and conclusions are presented in Section 4 and Section 5.
Definition 2.1 Let Θ be a finite non-empty set of mutually exclusive elements, called the frame of discernment. And let 2Θbe the power set composed of all possible subsets of Θ.
Definition 2.2 A basic probability assignment (BPA) is a mapping of m from 2Θto[0,1] ,which assigns a mass to each subset of the power set. The BPA has the following properties.
Definition 2.3 We assume m1, m2,…,mnis n BPA function in the frame of discernment,and their focus element is Ai(i = 1, 2,…, n), and the D-S synthesis rules are as follows
When the original evidence set contains serious conflict evidence, we obtain the abnormal result if D-S synthesis rules are used.
Example 2.1 We assume the frame of discernment Θ={A, B, C}, and there are BPA function for two evidence.
We can get the value of conflict coefficient (K) by D-S synthesis rules, K=0.99, and result of m1⊕m2is as follows.
From the above results, it can be seen that the results of synthesis do not satisfy obviously the original evidence set. In evidence E1and E2, the degree of confidence of focus element B assigned is all 0.1, but it becomes 1 after the synthesis, which is abnormal.
We realize that it is not comprehensive to accurately measure the conflict among evidence from one angle by the analysis of the existing methods to measure conflicts.And we need to measure the conflicts from many criteria. In the paper, we measure the conflict of evidence comprehensively by combining evidence distance of Jousselme and the overall evidence decision.
Definition 3.1 In the frame of identification, the distance of Jousselme for the BPA function m1and m2is as follows [Jousselme, Grenier and Bossé (2001)]:
Definition 3.2 Assume the frame of discernment Θ= {θ1, θ2,… ,θi, … ,θn}, the focus elements order groupSeq = {S1, S2,… ,Sj, …Sn}is obtained for an evidence according to the ascending order of magnitude in BPA function. And in the frame of discernment,two evidence BPA function m1and m2get the two order groups Seq1and Seq2after sorting, and the focal distance of the two evidence is as follows [Li and Wang (2015)]:
Where the range of dSequenceis between 0 and 1, and the larger the value is, the smaller the similarity is between the two evidence and the more serious the conflict is.
According to the spatial distance among evidence and the overall decision output, we introduce the degree of confidence of the order similarity among the focus elements and propose a new distance function to measure the conflict among evidence based on the Jousselme’s distance.
We can measure conflict synthetically and avoid the non-uniform influence of evidence on the allocation of the degree of confidence of the focus elements by combining with the order distance of focus elements, which makes the similarity among evidence more accurate on the output of the overall decision.
DBSCAN is a density-based clustering algorithm, whose category is depended on the tightness of the sample distribution. In the same category of samples, they are closely linked. By classifying the closely linked samples as the single class, a clustering category can be obtained. By grouping all closely linked samples into different categories, we get the final results for all cluster categories. For a given data set D = {x1, x2,… xn}, some definitions are presented below [Borah and Bhattacharyya (2004)]:
Definition 3.3 (density) We use the any point xjas the center of the circle, the number of the point contained with a radius Eps is called the density of the point.
Definition 3.4 (Eps-neighborhood) For an object in data set, the region within its radius is called the Eps-neighborhood of the object. The expression is as follows.
Definition 3.5 (core object, MinPts)If the Eps-neighborhood of an object contains at least a minimum number, MinPts, of objects, then the object is called a core object. The expression is as follows.
We use the focal distance and the Jousselme evidence distance proposed in the previous section as a distance function for density clustering. The time complexity of DBSCAN is O(N ? M), where N denotes the number of points and M represents the number of points in the Eps-neighborhood. And in the worst condition, the time complexity is O( N2). The algorithm description is as follows [Zhou (2016)].
Algorithm 1 DBSCAN algorithm Input: A set of Objects { }1,2, ,n D x x x= …Neighbourhood parameters ( , )Eps MinPts Sample distance matrix n n Dist×Output: Cluster division { }1, 2, , n C C C C= …1 Initialize:, 0, ,k D C?=? = Γ= =?2 for j=1; j<=n; j++ do 3 Determine the NEpsof sample xj 4 if Eps j N x MinPts≥()then 5 Add sample xjto the core object set:{}j x?=?∪6 end 7 end 8 while ? ≠?do 9 Record the current unvisited sample set:Γ old=Γ 10 Randomly select a core object o∈?
11 Initialize the current cluster core object queue {}cur o? =12{}o Γ=Γ-13 while cur? ≠?do 14 Take the first sample o in queue cur?15 if EpsN o MinPts≥()then 16 Let ()Eps N o Δ= ∩Γ 17 18()( )cur cur NEpso? =? ?∪ ∩Γ=Γ-Δ 19 end 20 end 211k=k+22 Generate cluster k old C=Γ -Γ 23kC?=?-24 end Output: Cluster division { }1, 2, , n C C C C= …
In the processing of the original evidence set, we assume the value of the parameter Eps is, which ensures that all reliable evidence can be clustered into core clusters.Combining the previous function of distance about focus element sequence, the conflict among evidence with the same decision as a whole is little. Although there is a slight difference in the degree of confidence among evidence in the focal element, it can make the evidence with less conflict into clusters by using the proposed function of distance.Therefore, after preprocessing the original evidence set by using density clustering, the evidence in the final core clusters has a high credibility, and we only deal with the noise evidence with little conflicts and use the discount factor to revise it. If the core cluster is formed finally, it indicates that the density among the original evidence is not uniform,and most of evidence is inconsistent for the allocation of confidence in the focus element or there is the serious conflict among evidence. Then, we calculate the weight of each evidence by considering the correlation of the evidence set and the support among evidence. Meanwhile, the weight of all evidence is revised to ensure the rationality and accuracy of the final results of synthesis.
The distance function can well combine the advantages of DBSCAN density clustering based on the focal element sequence. Besides, the evidence set with the same output of decision is usually closely linked, and the clustering can be clustered after preprocessing.Therefore, the abnormal evidence can also be well distinguished. This section mainly describes how to revise and compose the conflict evidence.
Based on the new function of distance to measure the conflict mentioned in the previous chapters, the similarity among evidence can be defined as
After calculating the similarity between evidence, the degree of supporting to evidence Eican be obtained as follows about all evidence
By normalizing the support of each evidence, we can receive that the credibility of the evidence is as follows.
Based on results of the unsupervised dynamic clustering in the previous chapters, we revise the conflict evidence and calculate the discount factor of conflict evidence by the credibility of the evidence [Lefevre, Colot and Vannoorenberghe (2002)].
The above formula denotes that the clustering results can generate the core cluster, where the evidence set has the higher credibility. Therefore, we can remain the information from the original evidence as much as possible by only discounting the conflict evidence,which is helpful to enhance the reliability and stability of the result of fusion.
After DBSCAN clustering, if the core cluster is not generated, it is showed that the conflict is serious among evidence and the density distribution is not uniform. Therefore,we need to revise the all evidence, and the discount factor can be calculated as follows.
And the revised model for evidence of conflict is as follows.
Finally, the modified evidence is added to the evidence set and synthesized by using DS rules to obtain the final result of decision. To sum up, the detailed steps of the proposed algorithm are as follows.
Algorithm 2 Evidence fusion step Input: The frame of discernment:{ }1,2, ,n θ θ θ Θ= …N pieces of evidence:{ }1, 1, , n E m m m= …Neighbourhood parameters ( , )Eps MinPts Output: The fusion results 1 for i=1;i<=n;i++ do 2 for j=1;j<=n;j++ do 3 Calculate dJoussemle(mi, mj) using Eq. (3);4 Calculate dSequence(mi, mj) using Eq. (4);5 Calculate the Synthetic Distance d(mi, mj) using Eq. (5);6 Get Distance matrix Dist(i, j) = d(mi, mj)7 end 8 end 9 Cluster analysis of original evidence sets using DBSCAN algorithm 10 if Cluster division C=?|| there is noise C≠?then 11 Calculate similarity matrix using Eq. (8);12 Calculate Support matrix using Eq. (9);13 Calculate Credence matrix using Eq. (10);14 if C=?then 15 Calculate the discount factor using Eq. (12);16 end 17 else 18 Calculate the discount factor using Eq. (11);19 end 20 for i=1;i<=n;i++ do 21 Revise the evidence model using Eq. (13)22 end 23 end 24 Combine revise evidence using Eq. (2);
In the above steps, firstly, we calculate the total distance among evidence based on the formula (3), (4) and (5), and the algorithm of DBSCAN is used to perform the analysis of density clustering according to the overall distance. And if the core cluster is generated and there is noise (line 14 of the algorithm), which indicates that the less conflicts exist in the original evidence set. But the evidence in the core cluster has high reliability, and we only need to revise the conflict evidence and calculate the discount factor based on the formula (11) to reduce the impact of conflict evidence on the result of fusion. However, if the core cluster is not generated (line 16 of the algorithm), we need to revise the all evidence. Meanwhile, we need to consider the degree of support for all evidence and calculate the discount factor based on the formula (12), and finally, we use the D-S rules to synthesize the revised evidence. Through the above analysis, we propose the algorithm of composing evidence based on DBSCAN, which can well distinguish the uniformity of decision of the original evidence set. When the conflicts of evidence are little, we use the D-S rules to synthesize directly. And when there are the serious conflicts among evidence,we use different discount factors to revise the evidence model, which can reduce the impact of conflict evidence and improve the accuracy and rationality of result of the fusion.
In order to verify the availability of the novel evidence synthesis method, some typical examples are provided to absolutely compare the ability to handle conflict evidence. The proposed approach is compared with Dempster’s [Dempster (1967)], Yager’s [Xiong,Yang and Qu (2011)], Murphy’s [Murphy (2000)], Deng et al.’s [Deng, Shi and Zhu(2004)], Lu et al.’s [Lu and Qin (2011)], Xiong et al.’s [Xiong, Yang and Qu (2011)] and Bi et al.’s [Bi, An and Li (2016)] methods. There are six BPA functions on the frame of discernment Θ={A, B, C}defined as.
From six evidence above, it is obvious that E1, E2, E4and E6prefer to target A and they have the higher similarity with target A. Besides, E3prefers target B and E5prefers target C because that E5has more conflict with other targets. Next, we make use of the 8 methods above to focus the evidence set. The results are shown in Tab. 1.
According to the results from the combination with E1and E2, 8 methods are all ideal above.Although the Yager’s method gives the highest degree of confidence, the value of m(A) is much larger than m(B) and m(C), and it can accurately identify the target A. Therefore, 8 methods can effectively combine the evidence when the conflict is relatively small among evidence. When E3adds the fusion, different methods have the various results because E3has the serious conflict with E1and E2. D-S combination rule has an unnatural result.Conflicts are assigned to the complete set in Yager’s combination rules, which makes the uncertainty of the fusion result increase so that the target is recognized accurately.
Table 1: Comparison of synthesis results of 8 methods
Fig. 1 shows that the BPA of different methods are compared about the combination results of evidence set for target A. It is obvious that the proposed algorithm makes target A has more trust and the result of fusion is prime. According to the results of analysis of E1, E2and E3, E1and E2prefer to target A, but E3has the serious conflict with E1and E2.In the proposed algorithm, it can identify the conflict of evidence E3and reduce the impact of the conflict evidence E3about results of the evidence fusion. Therefore, the proposed algorithm can identify the target A more accurately and has better performance compared with the other methods. When E5is added, some differences occur, we can learn that Deng’s results of fusion can identify the correct target more accurately from Fig. 1. Because the evidence E5prefers to target C, which has a great conflict with the previous evidence. However, Deng’s synthesis method is improved from the average revision idea proposed by Murphy. Although Deng’s method, based on Murphy,considers the correlation among evidence by calculating the Jousselme distance among evidence. But Deng’s method only repeats and combines with the weighted average evidence and abandons the original evidence.
Figure 1: Comparison of different methods about target A
Figure 2: Comparison of results for evidence m1~5
Fig. 2 shows results of combination the evidence E1~5. It is obvious that m(B) is greater than m(C) by using Murphy’s combination method, which is unnatural. However,although Deng improved the Murphy, the results of identification are unreasonable with conflict evidence affected for target B and C, which make more evidence provided to eliminate the impact of conflict evidence.
From the above data, we realize that the confidence of the correct target increases as more reliable evidence, and the fluctuations of the correct target are also relatively small under the influence of conflict evidence. Based on the analysis of relevant documents, the evaluation of the advantages and disadvantages of the combined method proposed in[Han, Deng, Han et al. (2011)], and the results of the fusion data in the chapter, it is obvious that the proposed algorithm, based on unsupervised dynamic aggregation DBSCAN by preprocessing evidence set, is effective. Moreover, the proposed algorithm can reduce the uncertainty of the results and get the correct consequence of fusion with or without conflict evidence.
In the paper, we propose a novel DBSCAN algorithm based on unsupervised dynamic clustering in order to solve the abnormal problem that the D-S evidence theory causes high conflict evidence in synthesizing. In the proposed algorithm, we preprocess the evidence sets and use D-S rules to combine evidence according to the conflict conditions.During dynamic clustering, a scheme of similarity based on the order of focal elements is introduced, and we consider the mutual support of evidence on the basis of the difference in the overall output of each evidence about decision. Moreover, we also mine accurately the potential information among evidence by combining with Jousselme's spatial distance.Besides, the proposed function of calculating distance can fully combine the characteristics of DBSCAN density clustering, which can identify accurately the distribution of the density of evidence sets by dealing with the original evidence sets, transform the reliable evidence into the core clusters by clustering and recognize the conflict evidence in noise. And in the final combining process, we use two different methods to calculate the discount factor in order to correct the original evidence set and ensure the accuracy and stability of result of fusion according to the difference of dynamic density clustering results. The comparison of experimental results shows the proposed algorithm can minimize the impact of conflict evidence on the synthesis results, and it has the ability of anti-interference and fast convergence speed. Besides, the proposed algorithm can solve the abnormal problem that the D-S evidence theory brings about high conflict evidence in synthesizing, which is great significance to the reliability of information fusion.
Acknowledgement:The authors gratefully acknowledge financial support from China Scholarship Council.
Computers Materials&Continua2018年11期