Ejay Nsugbe | Oluwarotimi William Samuel | Mojisola Grace Asogbon | Guanglin Li
1 Independent Scholar
2 Key Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences (CAS), Shenzhen, China
Abstract In signal processing, multiresolution decomposition techniques allow for the separation of an acquired signal into sub-levels, where the optimal level within the signal minimises redundancy and uncertainty, and contains the information required for the characterisation of the sensed phenomena. In the area of physiological signal processing for prosthesis control, there are scenarios where a signal decomposition analysis is required: the wavelet decomposition (WD) has been the favoured time-frequency approach for the decomposition of non-stationary signals. From a research perspective, the WD has in certain cases allowed for a more accurate motion intent decoding process following feature extraction and classification. Despite this, there is yet to be a widespread adoption of the WD in a practical setting, due to its perceived computational complexity. Here, for neuromuscular (electromyography) and brainwave (electroencephalography) signals acquired from a transhumeral amputee, a computationally efficient time-domain signal decomposition method based on a series of heuristics was applied to process the acquired signals before feature extraction. The results showed an improvement in motion intent decoding for the proposed time-domain signal decomposition across four different classifiers, for both the neuromuscular and brainwave signals, when compared with the WD and the raw signal.
The functional bionic arm, commonly referred to as a myoelectric prosthesis, represents the most technologically advanced replacement for the loss of an upper limb [1, 2]. The myoelectric prosthesis comprises a control system that is central to the effective functioning of the arm, as it serves as the intermediary between an acquired bio-potential signal from the amputee, which encodes a desired gesture motion, and the respective machine actuation signal which drives the motors and actuators in the prosthesis arm towards the completion of the desired motion [1, 2]. Different control schemes have been investigated for the prosthesis controller, and the pattern recognition-based controller is seen to be the favoured scheme due to its intuitiveness and its facilitation of control of the prosthesis limb, as reported by sources in the literature [1, 2]. The pattern recognition control scheme is based around the association and decoding of various bio-potential signals into relevant respective control signals, which drive the prosthesis limb into performing an inferred gesture motion [3, 4]. Assuming the acquisition of a bio-potential signal, typically in the form of a time-series, the pattern recognition process can be broken down into a feature extraction phase, where relevant parametric descriptors are computed from the signal (which can also serve as a means of dimensionality reduction) and from which a feature vector is formed [3, 4].
The feature vector is used to train a classifier whose role is to form the best representation that maps a set of input features to an associated label and motion intent. Supervised learning methods are the most popular classification methods in this area of research. Results have shown that the range of features extracted to build a feature vector, alongside the chosen classifier, influences the extent to which motion intent signals can be robustly decoded [5]. The signal analysis and projection domain typically dictates what category of features can be extracted from the signal and, to a lesser extent, has been subject to investigation in this area of research [6]. Owing to the manner in which signals are acquired, and in terms of computational efficiency, the time domain can be said to be the default analysis domain for the majority of signals [6].
Depending on the application in question, the frequency domain has also been seen to be a favoured analysis domain, as this allows for the deconvolution of signals into a spectrum comprising the various frequency contributions in the signal, and the extraction of frequency-based features [7, 8]. The fast Fourier transform (FFT), which uses a geometric series of sines and cosines of increasing frequencies as its basis function, is the widely used frequency transformation method that decomposes time-series signals into their frequency components, but has the shortcoming of discarding the temporal information associated with the signal [7, 8]. The short time Fourier transform (STFT) has been seen to be a workaround for this shortcoming, as it allows for a time-frequency representation of the signal, since it is a windowed Fourier transform across all time intervals in a time-series [9, 10].
The windowing requirement for the STFT implies that a fixed window is used for all frequency components, which in practice causes inaccuracies in the transform depending on the windowing option used [9, 10]. An uncertainty principle exists in the STFT, where accuracy in either the signal resolution or the time localisation is lost [9, 10]. The wavelet decomposition (WD) is a multiresolution method which provides time-frequency information on a signal, and is an improvement on the STFT's windowing problem, as it performs windowing-based adaptation as a means of customising its analysis approach to the varying frequency levels within a signal [11]. Unlike the FFT, which decomposes a signal using sines and cosines, the WD allows for flexibility in the choice of 'mother wavelet', which serves as the basis function for the decomposition process; as a result, this approach is favourable for both stationary and non-stationary time-series signals [11].
In the area of prosthesis control, the wavelet transform has been used for intent decoding from acquired bio-potential signals; although the results have shown that this is an effective domain for signal analysis and feature extraction, its widespread adoption in real-time prosthesis control interfaces is constrained by its computational complexity [12]. From the signal processing literature, Nsugbe et al. designed a time-domain-based algorithm for the source separation of a mixing process, using a resulting non-stationary signal from the process [13–19]. This approach is a time-domain-based decomposition method centred around an iterative optimisation flow, and uses heuristic reasoning and tuned linear amplitude thresholds as the basis for the signal separation process [13–21]. This method has been seen to be more computationally efficient than the WD, since once the optimal threshold level is determined, it can be implemented using the relevant parameters, or using analogue circuitry to rectify the signal and reduce the redundancies, leaving just the optimal segment of the signal for further processing [13–19].
The main shortcoming of this signal processing method is the lack of frequency information offered as part of the decomposition, as it is a purely time-domain-based approach; hence, a further computation of the FFT would be necessary for applications where frequency information is relevant [13–19]. This is not deemed to be an immediate limitation for the case of the pattern recognition-based control scheme considered as part of this manuscript. Thus, here, the specific investigation and contributions are as follows:
‐ A mathematical formalism of the proposed time‐domain‐based decomposition algorithm.
- Investigation of the effectiveness of a motion intent decoding approach from bio-potentials acquired from electromyography (EMG) and electroencephalography (EEG), with the application of the proposed time-domain-based method alongside a reduced set of features.
- A results benchmarking exercise between the proposed method, the raw signal and the wavelet decomposed version of the signal, across four different classifiers.
The data used for the signal processing case study presented were acquired from the study conducted by Li et al. [22]. This section describes the theoretical model behind the various sensing modules used to acquire data, assuming the solution to the forward problem and according to the data collection procedure by Li et al. [22], followed by the signal processing and classification architectures employed here.
The theoretical formulations behind action potentials are useful numerical tools that help to describe the individual anatomical and physiological contributors which, under fixed conditions, produce superimposed extracellular recordable action potentials acquired using a set of surface electrodes. Assuming the forward problem framework, an overview of the theoretical concepts behind the EMG and EEG signals, alongside the recording instruments used by Li et al. [22], is described as follows:
2.1.1 | EMG underlying principle
EMG signals are electrophysiological signals present within muscle tissue and represent superimposed motor unit action potentials (MUAPs). During muscular contraction, the simultaneous firing of MUAPs occurs; thus, the resulting EMG signals are in the form of non-stationary time-series, dependent on intrinsic anatomical properties [23, 24]. The dynamics of electrophysiology can be modelled as electrical current flowing through tissue, using the principle of volume conduction and a three-dimensional (3D) view of Ohm's law [23, 24].
For a biological tissue with uniform conductivity σi, recorded at point P0 at (x0, y0, z0) and generated by a source current Is at point P (x, y, z), the 3D electric potential can be described as seen in Equation (1) [23, 24]:

VP0 = Is / (4π σi ri)  (1)

where VP0 is the electrical voltage potential, and ri represents the shortest distance between points P0 and P.
Equation (1) helps to show the dependency of the voltage potential recorded at a specific point on the source current, and its inverse relationship with the σi and ri values [23–25]. However, due to the biophysics of motor units and their simultaneous firing patterns, these sorts of bioelectric phenomena are represented as linear superpositions emanating from multiple sources [23–25]. Using the theory of dipoles and the framework of the variation of electric potential membranes within an electrical field in an extracellular medium, Wilson et al. [26] and Plonsey and Barr [27] postulated that the behaviour of bioelectricity is similar to the fields produced by a lumped dipole.
On this note, a fibre element with length dx, with a focused current flow in the vicinity of an extracellular potential, is expressed as p̄·dx, where p̄ is the dipole current per unit length [26, 27]. As the current propagates from the source into an unbounded space, the resulting extracellular potential can be expressed as seen in Equation (2) [26, 27]:

where φe is the extracellular potential, σe is the conductivity of the extracellular medium, and r is the distance from the excitation source to the recording point P0. If r is located along the co-ordinates of P and P0, then the distance can be calculated using Equation (3) [26, 27]:

r = √((x − x0)² + (y − y0)² + (z − z0)²)  (3)
From these equations, an integral sum of the various potentials can then be computed to obtain the resulting field from the lumped dipole element as shown in Equation (4) [26, 27]:
where t is time.
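As a brief numerical illustration of the point-source relationship of Equation (1), a minimal sketch is given below; the current amplitude, conductivity and electrode position are illustrative assumptions rather than values from the study:

```python
import numpy as np

def point_source_potential(i_s, sigma, source, electrode):
    """Potential at `electrode` from a point current source of amplitude
    `i_s` (A) in a homogeneous volume conductor of conductivity `sigma`
    (S/m): V_P0 = I_s / (4 * pi * sigma * r_i), per Equation (1)."""
    r_i = np.linalg.norm(np.asarray(electrode, float) - np.asarray(source, float))
    return i_s / (4.0 * np.pi * sigma * r_i)

# 10 uA source, muscle-like conductivity of ~0.4 S/m, electrode 1 cm away
v = point_source_potential(10e-6, 0.4, (0.0, 0.0, 0.0), (0.01, 0.0, 0.0))
```

Doubling the distance r_i or the conductivity σ halves the recorded potential, which is the inverse relationship the surrounding text describes.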
2.1.2 | EMG sensors and signal acquisition
Li et al. [22] utilised 32-channel high-density surface electrodes (REFA 128 model, TMS International BV, Netherlands), distributed around the stump and deltoid of the amputee subject. The data were sampled at 1024 Hz, and the acquisition electronics contained a bandpass filter covering the frequency region of 10–500 Hz, with a 24-bit resolution [22].
2.1.3 | EEG underlying principle
The human brain comprises billions of neurons of varying geometries, depending on the section. An action potential from a single neuronal cell can be said to produce a negligible amount of electrical potential, which in turn is challenging to record with acquisition electrodes [28, 29]. When a cluster of neuronal cells activate simultaneously, the resulting electric potential is substantial enough to be acquired with electrodes to form an EEG signal [28, 29]. The flow of bioelectrical current within the brain tissue causes the production of an electromagnetic field within the skull which, like the EMG phenomena, can be modelled using dipole theory [28, 29]. The forward problem in the case of EEG can be numerically established assuming a set of acquisition electrodes, tissue conductivities and a model of the head [28, 29].
For a dipole at a point S and a multilayer head model L, the recorded scalp potential can be expressed as a series expansion in Legendre polynomials, where α is the angle between S and q, β is the angle between S and the signal acquisition point x, γ is the angle between the two vectors denoted by points S and q on one side and S and x on the other side, and Pn and P1n represent the Legendre polynomial coefficients associated with the series.

2.1.4 | EEG sensors and signal acquisition

The EEG sensors used by Li et al. [22] were the 64-channel EasyCap, Herrsching, Germany, with Al-AgCl electrodes and the Neuroscan system version 4.3. The signals were acquired at a sample rate of 1024 Hz with bandpass filtering at 0.05–100 Hz [22].

The data collection process by Li et al. [22] included the acquisition of EMG and EEG from a group of subjects who had been amputated for traumatic reasons, and the study was granted ethical approval by the Institutional Review Board of Shenzhen Institutes of Advanced Technology, with a unique reference number of SIAT-IRB-150515-H0077. A single amputee's data set has been used for the work presented. The subject is a 49-year-old transhumeral amputee whose left side has been amputated, three years post amputation, with a stump length of 20 cm measured from the shoulder downwards [22]. Data was collected for five gesture motions, namely Hand Open (HO), Hand Close (HC), Wrist Pronation (WP), Wrist Supination (WS) and No Movement (NM). A sum of 10 repetitions was performed for each gesture set, with as close as possible to a constant contraction level, and with breaks factored in as required. The experimental setup can be seen in Figure 1.

2.2.1 | Electrode channel selection

An electrode channel selection process was employed by Li et al. [22] as part of a pre-processing step prior to the signal processing, as a means of channel reduction, which can also be considered a data dimensionality reduction phase allowing for a quicker computation time [32].
A greedy search algorithm termed sequential forward selection (SFS) was applied by Li et al. [22] to prune out 10 optimal electrode channels, a summary of which can be seen in the flow sequence below:

Step 1: Iteration loop initialisation S0 = {∅}
Step 2: Maximise accuracy during selection, as in: Acc(Sk + x*) = argmax Acc(Sk + xj)
Step 3: Repeat for all k ∈ {1, 2, …, n}
Step 4: Terminate loop after the nth item in set k has been iterated

where Sk is the set of already selected electrodes, Acc is the classification accuracy, xj is an electrode channel, k represents the full set of electrode channels, and j is the jth element in set k. From this, a reduction from 96 electrode channels (EMG and EEG) to a sum of 20 electrodes was achieved, with the optimal channels for the EMG spread across the deltoids, bicep and triceps. It was seen that a number of the optimal EEG electrode channels were spread around the visual portion of the cerebral cortex responsible for the projection of visual information, thereby implying the importance of motor imagery in the control of phantom motions [22, 32, 33]. The gesture motion data from each electrode was divided into 10 segments, corresponding to the number of repetitions made during each acquisition exercise, with each division comprising 512 sample points.

2.3.1 | Analysis of domains and signal decomposition

Time domain

The proposed method works with the concept that, for an absolute representation of a non-stationary time-series signal, a transduced manifestation of a single event in time can be denoted by a unit impulse peak and an exponential decay:

where t is the impulse time, u(t) is a step function, a indicates the onset (the function is 0 until t ≥ a), and ζ is additive white noise. The unit impulse can be characterised by its time, amplitude and characteristics related to its decay [34].
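The impulse-plus-exponential-decay model described above can be sketched numerically; the decay rate, noise level and sampling parameters below are illustrative assumptions rather than values from the study:

```python
import numpy as np

def impulse_event(t, a, amplitude=1.0, decay=50.0, noise_std=0.01, rng=None):
    """A single transduced event: a unit impulse at onset `a` with an
    exponential decay and additive white noise zeta, i.e.
    x(t) = A * exp(-decay * (t - a)) * u(t - a) + zeta."""
    rng = np.random.default_rng(0) if rng is None else rng
    u = (t >= a).astype(float)                       # step function u(t - a)
    x = amplitude * np.exp(-decay * np.clip(t - a, 0.0, None)) * u
    return x + rng.normal(0.0, noise_std, size=t.shape)

t = np.linspace(0.0, 0.5, 512)                       # 512 samples, as per one repetition
x = impulse_event(t, a=0.1)                          # peak appears at t = 0.1 s
```

With consecutive events, several such impulses are summed, which is what makes the decay tails overlap and become hard to localise.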
For signals with consecutive events, the time-series contains overlapping impulse peaks, which cause the decay characteristics associated with each impulse peak to become challenging to localise and characterise [16]. Due to this, the time and amplitude characteristics represent key properties of the signal, as described by Nsugbe et al. [16]. Thus, it can be said that an optimal amplitude region within a given time-series is one where source information is maximised and interferences (noise and uncertainties) are minimal [16]. As mentioned, the proposed method applies a series of heuristically tuned linear thresholds of varying amplitudes where, for each amplitude region, the amplitudes of the peaks are localised, from which a sub time-series is formed [13–19]. That is to say, given a signal s(t) and a single amplitude threshold, a sub time-series is formed: Xij = tl s(t), where Xij is a sub time-series for the ith threshold iteration and jth division within the signal, for i = (1, 2, …, n) and j = (1, 2, …, n), and tl is the amplitude level of the threshold. The first iteration of a single heuristically tuned threshold within a time-series yields two time-series divisions: X = {X11, X12}, where X11 represents the first threshold iteration and first time-series division, and X12 represents the first threshold iteration and second time-series division. From this, features are extracted for each item in set X, and an iterative improvement exercise is conducted to find an optimal region within the signal using a chosen performance index J, and the jth division which yields argmax(J). The resulting X11, X12 are further decomposed using successively tuned linear thresholds based on heuristic reasoning for further ith threshold iterations, until a conditional minimum is found, mimicking a convex optimisation problem.
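A single threshold iteration, forming the two sub time-series X11 and X12, might be sketched as follows; the signal values and the threshold level are illustrative, and in the actual method the threshold is heuristically tuned rather than fixed:

```python
import numpy as np

def threshold_split(signal, t_l):
    """Split the absolute form |Sn| of a signal by a linear amplitude
    threshold t_l: samples at or above the threshold form the upper sub
    time-series X11, and samples below it form the lower one, X12."""
    s = np.abs(np.asarray(signal, dtype=float))   # absolute form |Sn|
    upper = s[s >= t_l]                           # X11: first division
    lower = s[s < t_l]                            # X12: second division
    return upper, lower

x11, x12 = threshold_split([0.1, -0.9, 0.4, -0.2, 0.8], t_l=0.5)
```

Each sub time-series would then be scored with the performance index J, and the better-scoring region re-thresholded in the next iteration.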
The amplitude threshold parameter used to compute the signal level which produces the minimum is referred to as Xopt. The threshold parameters of Xopt are referred to as the optimal amplitude decomposition level within the time-series, and are used for the analysis and further processing of subsequent signals from the same source, as it is believed that this generalises across further kinds of signals acquired from the source, assuming the recording instrumentation remains unchanged. A resulting time-series signal from a tuned threshold region, accompanied by a peak detection process, represents a filtered, decomposed and reduced dimensional time-series.

FIGURE 1 A representative subject performing the arm tasks during the experimental session [22]

The following are the heuristics and steps taken to implement the proposed method and determine the optimal region in the time-series:

Step 1: Assuming a multitude of time-series, Sn = x1, x2, …, xN, containing information for the various signals that need to be classified, express each time-series in its absolute form, denoted by |Sn|.

Step 3: For each of the subsequent sub time-series X = {X11, X12} obtained for the various Sn, identify the peaks within the signal and form a filtered sub time-series Xfilt = {X11filt, X12filt}, where a peak can be said to be a data sample whose amplitude is greater than or equal to its nearest neighbours; that is, for a time-series Sn.
The peaks within this time-series can be said to be samples that satisfy this criterion, expressed mathematically as:

At this stage, a selection process is conducted to assess the information quality and 'goodness' within each sub time-series. For the purpose of this exercise, the mean of the peaks (MP) and cumulative sum of peaks (SP) are extracted to form a feature vector. J was chosen to be the Euclidean distance metric, used to compute the distance between points in Euclidean space, normalised by the standard deviation of each of the time-series in question; it can be mathematically formulated as seen in Equations (6)–(8) for two distinct time-series Sn and Tn (assuming steps one to two have also been applied to this time-series) [35]:

where ED is the Euclidean distance, p and q are co-ordinates of the features in the feature vector projected in a Euclidean space from select electrode channels, w is the wth feature within the feature vector Nw, rw is a feature within the feature vector, μ is the mean of the feature vector, and σm is the mean of the standard deviations of the feature vectors from the two time-series [35].

Step 4: All further signal decompositions of X11filt (upper threshold) and X12filt (lower threshold) should be done to the threshold levels defined in Table 1, prior to achieving the stopping criteria. The equations and scale factors in the table represent a mathematical formalisation of the threshold tuning methods adopted in previous works [13–19]. The stopping criterion for the iterations is the point where the performance index J indicates Xij > Xi−1,j−1 and Xij > Xi+1,j+1. This stopping criterion is put in place with the postulation that any local solution is also a global one, and hence a convex optimisation formulation. The assumption of a convex optimisation formulation simplifies the search for the maximum value of J, thus reducing the computational load in contrast to other optimisers in the literature, such as the grey wolf and conflict monitoring optimisation heuristics [36, 37].
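The peak criterion and the performance index J can be sketched together; the sample values are illustrative, and the normalisation shown (mean of the two vectors' standard deviations) is one reading of Equations (6)–(8):

```python
import numpy as np

def find_peaks(x):
    """Indices of samples whose amplitude is greater than or equal to both
    nearest neighbours, per the peak criterion above."""
    x = np.asarray(x, dtype=float)
    mid = x[1:-1]
    return np.where((mid >= x[:-2]) & (mid >= x[2:]))[0] + 1

def performance_index(feats_a, feats_b):
    """Euclidean distance between two feature vectors (e.g. [MP, SP]),
    normalised by the mean of the two vectors' standard deviations, as a
    stand-in for the index J of Equations (6)-(8)."""
    p, q = np.asarray(feats_a, float), np.asarray(feats_b, float)
    return np.linalg.norm(p - q) / (0.5 * (p.std() + q.std()))

s = [0.0, 1.0, 0.2, 0.8, 0.3, 0.9, 0.1]
idx = find_peaks(s)                                 # local peak locations
mp = float(np.mean(np.take(s, idx)))                # mean of peaks (MP)
sp = float(np.sum(np.take(s, idx)))                 # cumulative sum of peaks (SP)
J = performance_index([mp, sp], [0.3, 5.0])         # vs. a second class's features
```

A larger J indicates better separation between the two classes' sub time-series at that threshold level.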
To ensure a standardised basis of comparison, J should only be computed for equivalent threshold regions and iteration pairings, that is, Xxy,filt for class 1 and Xxy,filt for class 2.

TABLE 1 The various adjustment factors for the thresholds

For all further decompositions of X11filt (upper threshold) and X12filt (lower threshold): in the case of the upper threshold, only the peaks above the re-tuned upper threshold are considered for subsequent analysis, while in the case of the lower threshold, only the peaks below the re-tuned lower threshold are taken into account. This framework and heuristic was put in place as a means of 'best practice' for the algorithm, as determined from previous studies, and should be followed for all further threshold iterations and decompositions [13–19]. An illustration of the algorithm alongside various iterations of the decomposition can be seen in Figure 2, while a flowchart of the process can be seen in Figure 3, and the optimisation objective of the algorithm can be formulated as follows, where R is the set of real numbers and xi is a value within the set, for values of J which satisfy Xij > Xi−1,j−1 and Xij > Xi+1,j+1.

Step 5: For the arrays of J (upper threshold) and J (lower threshold), Jmax should be identified from each array, followed by a subsequent maxima-seeking exercise to identify Joptimal. The accompanying threshold parameters are stored and therein serve as the parameters of the optimal decomposition region within the signal, which maximises its information quality and represents the region from which subsequent signals are decomposed and used for classification exercises.

Time frequency

Wavelet decomposition

A wavelet can be described as a highly dynamic function that can be localised in time and frequency. The wavelet transform projects a time-frequency representation of a signal and decomposes it into sub-bands that can be characterised by their frequency contents in time [38–40].
A continuous representation of a wavelet can be seen in Equation (9):

ψa,b(t) = (1/√|a|) ψ((t − b)/a)  (9)

where ψ is the mother wavelet, a and b represent scale and translation values respectively (with a, b ∈ R, and a ≠ 0), and t is the time parameter. A sampled and computationally efficient version of the continuous wavelet transform is termed the discrete wavelet transform (DWT), and this allows for the decomposition of the signal into a pair of orthogonal wavelets using a set of low and high pass filters (LPF and HPF), which contribute to signal compression, dimensional reduction and the elimination of noise and uncertainties within the signal [38–40]. The DWT discretises the continuous version of the wavelet transform shown in Equation (9) using dyadic sampling, and an equation of the DWT can be seen in Equation (10):

ψj,k(t) = 2^(j/2) ψ(2^j t − k)  (10)

where j is the wavelet level and k is the location. The WD process involves the splitting of the signal into low and high-frequency components, named approximate (A) and detail (D), from which the approximate and detail coefficients are formed [38–40].

Mother wavelet

As part of the WD process, the first step is the selection of an appropriate mother wavelet, which serves as the basis function. The Daubechies (db) wavelets were selected as the mother wavelet for the work done here due to findings from previous related work, and are used for complex signals which exhibit self-similarities [11]. The db wavelets are a group of orthogonal wavelets noted for having a high number of vanishing moments.

2.3.2 | Feature extraction and classification methods

Feature extraction

As part of the feature extraction stage, two groups of features were extracted and used to train the chosen classifiers, as described below.
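The DWT's splitting into approximate (A) and detail (D) coefficients can be sketched with the Haar (db1) wavelet, whose filters are short enough to write out directly; this is a simplification for illustration, as the study itself uses the db7 wavelet (obtainable via, e.g., PyWavelets' `pywt.wavedec(x, 'db7', level=4)`):

```python
import numpy as np

def haar_dwt_level(x):
    """One DWT level with the Haar wavelet: the low-pass branch gives the
    approximate (A) coefficients and the high-pass branch the detail (D)
    coefficients, each downsampled by two (dyadic sampling)."""
    x = np.asarray(x, dtype=float)
    s = np.sqrt(2.0)
    a = (x[0::2] + x[1::2]) / s   # approximate: low-pass + downsample
    d = (x[0::2] - x[1::2]) / s   # detail: high-pass + downsample
    return a, d

def haar_wavedec(x, levels):
    """Multi-level decomposition: recursively re-split the approximate band."""
    coeffs, a = [], np.asarray(x, dtype=float)
    for _ in range(levels):
        a, d = haar_dwt_level(a)
        coeffs.append(d)
    coeffs.append(a)
    return coeffs[::-1]           # [A_n, D_n, ..., D_1]

coeffs = haar_wavedec(np.arange(8.0), 2)   # two-level decomposition of a ramp
```

The orthogonality of the filters means the sub-band coefficients preserve the signal's energy, which is what makes features extracted from a chosen sub-band (e.g. the level-4 approximate coefficients used for the EEG) meaningful.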
Group 1

Due to the steps and heuristics employed as part of the proposed method, a number of the features typically extracted from these sorts of signals are inapplicable; thus, a list of four features was identified and extracted from the acquired signals after decomposition with the proposed method, as follows, for a sub time-series with peaks Xopt = {xpeak.1, xpeak.2, …, xpeak.n}:

- MP: this feature has been frequently used in previous studies alongside the proposed method and involves taking the arithmetic mean of the peaks in the optimal signal region, as expressed in Equation (11) [41]:
- SP: the peaks in a time-series give an indication of the bio-potentials of anatomical activation, the degree of which is expected to vary amongst different gesture intent motions. An expression for this can be seen in Equation (12):
- Simple square integral (SSI): this feature provides a quantification of the power within a time-series by computing the sum of squares, as expressed in Equation (13) [42]:
- Enhanced mean absolute value (EMAV): this feature imposes amplification weightings on various regions of the signal as a means of increasing their contributions to the final value, as seen in Equation (14) [42]:

Group 2

The feature sets in this group were used as a basis of comparison and comprise the original four feature sets used by Li et al. [22], as seen in Equations (15)–(18), where N is the number of samples and xn is the nth sample of the time-series/signal.

For the data collected, which spanned 5 gestures, 10 repetitions, 10 electrode channels and 4 features (for each feature group), the total number of samples was 2000 for the EMG/EEG sensor configuration, summing to 4000 samples for the EMG-EEG case, across the various classification techniques applied here and described in the subsequent section.

Gesture classification methods

Four variants of classifiers were tested for the final part of the intent decoding process.
These candidate classifiers include computationally efficient classifiers, iterative classifiers and a black-box-based classifier [41, 43]. This step was taken to allow for a contrast of the intent decoding accuracy across different classifier variants, for the different multiscale methods and accompanying extracted features. For the validation phase of each of the listed classifiers, a k-fold cross validation was used with k chosen as 10, and a mean calculated over all k-fold repetitions to form a final metric (referred to as the classification accuracy), which was used to characterise the classification prowess and thus the motion intent decoding capability.

(i) Discriminant analysis (LDA and QDA): the discriminant analysis method has been seen to be computationally effective and to provide acceptable classification performance in this area of research, and as a result is a favoured classification method [44, 45]. The class boundaries can be chosen to be either linear (LDA) or non-linear (QDA), as was implemented here [44, 45]. This classification framework is centred on the dimensional reduction of a high dimensional feature vector into a lower sub-space where separation boundaries are implemented, while preserving its overall structure and maximising class separability [44, 45]. The discriminant functions for the LDA and QDA can be seen in Equations (19) and (20), respectively:

where ek is the QDA discriminant function, μk is the mean vector for a specific class k, πk is a prior probability value for each class k, and Σk is the covariance matrix (pooled across classes in the LDA case).

(ii) Support vector machine (SVM): the objective of this classification method is to find an optimal separation boundary, known as a hyperplane, for a classification problem, using a key subset of the training data (referred to as support vectors) and a high dimensional projection of the data in feature space [46–49].
The support vectors are used to create margins around the hyperplane which contribute towards separability between classes [46–49]. This technique works with the intuition that class separability is further maximised when the data is projected into a higher dimensional feature space, where it sets the hyperplane margins, preserves the structure, and projects back down to a low dimensional representation through a process referred to as the 'kernel trick', where the choice of the kernel is user dependent [46–49]. The SVM is an iterative classifier which solves an optimisation problem as formulated in Equation (21), assuming a binary classification with a linear model:

where w is the weight vector, Φ(x) represents the kernel map, and b is the offset. For a feature vector containing training samples xi, i = 1, …, N, the goal is to solve an optimisation problem framed as follows in Equation (22):

where ζ is a slack variable introduced to ensure that distributions with overlapping classes have solutions, R is a regularisation parameter whose role is to help minimise boundary overfitting, and y is an indicator vector. The parameters used for the SVM implementation were a quadratic kernel function with a one-vs-one multiclass method.

(iii) Multi-layer perceptron neural network (MLP): a feedforward neural network comprising an input layer where weights are applied to the various inputs, a hidden layer with a designated activation function through which the weighted and summed inputs are interpolated, and an output layer which is responsible for outputting the relevant class labels [41, 50]. These classifiers are regarded as data-driven black boxes, as they are non-parametric function approximators for which a full understanding of the decision-making process is not yet available, especially with regard to the hidden layers [41, 50].
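The forward pass of an MLP of the kind described (a sigmoid hidden layer of 30 units and a softmax output over the five gesture classes) can be sketched as follows; the weights here are random placeholders standing in for trained values:

```python
import numpy as np

def sigmoid(z):
    """Logistic activation applied element-wise to the hidden layer."""
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    """Normalised exponentials over the output layer (class probabilities)."""
    e = np.exp(z - z.max())
    return e / e.sum()

def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Single-hidden-layer MLP: each hidden node computes a weighted sum of
    the inputs plus a bias, passed through the activation function, and the
    output layer maps the hidden activations to class probabilities."""
    hidden = sigmoid(w_hidden @ x + b_hidden)
    return softmax(w_out @ hidden + b_out)

rng = np.random.default_rng(0)
x = rng.normal(size=4)                               # 4 features per group
w_h, b_h = rng.normal(size=(30, 4)), np.zeros(30)    # 30 hidden units
w_o, b_o = rng.normal(size=(5, 30)), np.zeros(5)     # 5 gesture classes
probs = mlp_forward(x, w_h, b_h, w_o, b_o)           # class probability vector
```

Training such a network with backpropagation and a cross-entropy loss, as done in the study, amounts to iteratively adjusting w_h, b_h, w_o and b_o to maximise the probability assigned to the correct gesture label.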
Equation (23) shows the interpolation of a weighted sum from the input layer with an activation function:

an = s(Σ(i=1 to m) wi xi + b)  (23)

where an denotes a node in a hidden layer, s the activation function, m the set of input values, wi the ith weight, xi the ith input, and b the bias. The implemented architecture of the MLP contained a variable number of input features, depending on which feature group was being considered, with 30 units in the hidden layer and a sigmoid activation function, making it a non-linear function approximation variant, which has been seen to be beneficial when classifying neuromuscular data, said to be a non-linear phenomenon [41, 51]. The softmax function was used in the output layer of the network, and the training was done using iterative backpropagation, with cross-entropy as the loss function [51]. For all classifiers, the split of the dataset was 70% for training, 15% for validation and 15% for the test set, with the validation set in place to prevent any overfitting from occurring.

For the EMG signals, the results from the proposed method show that the optimal region in the signal lies within the upper portion of the first iteration of the proposed method's process. In a sense, it can be inferred that acquired EMG signals have a broadband information scale associated with them. Results from the proposed method showing the threshold iterations and associated performance index can be seen in Table 2. For the EEG signals, the results in Table 3 show that more iterations of the proposed method were required to find the optimal region in the signal, in contrast to the EMG signal. This set of results reinforces the fine-scaled nature of neural oscillations when compared with the neuromuscular signal, which appears to be more broadband as opposed to fine-scaled. Contrasting the convergence properties for the EMG and EEG, it can be seen that the algorithm converges within two iterations for the EMG, while four iterations are required for convergence in the case of the EEG.
This difference in convergence times is thought to be due to the respective characteristics of the two signals. In the case of the wavelet, from previous work in the literature, the optimal parameter for the EMG was seen to be the detail wavelet coefficient from the second decomposition level, using the db7 mother wavelet [12]. For the EEG signal, prior literature has suggested that the neural oscillations associated with various degrees of cognitive function span various low frequencies (0.1–43 Hz) and are broken down into the delta, theta, alpha, beta and gamma brainwave frequencies, as can be seen in Table 4 [52–54]. Although various studies have associated specific frequency bands with enhanced cognitive function, it is still unclear whether this generalises to amputated individuals, due to the neuroplasticity-induced reorganisation of the cerebral cortex [52–54]. Because of this, a concatenation of the frequencies associated with the various levels of cognitive function was used for analysis, again using the db7 wavelet. The approximate wavelet coefficient from the fourth decomposition level, in the frequency region of 0–31 Hz, was used for the feature extraction stage. Details of this can be seen in Table 5.

TABLE 2 Performance index for EMG signal with proposed method

TABLE 3 Performance index for EEG signal with proposed method

The results across all four classifiers can be seen in Figure 4. It can be seen from this that the superior classification accuracy is obtained from the MLP and SVM classifiers, both of which are black-box and iterative classifiers that require greater computation than their counterparts.
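The wavelet settings described above (db7 mother wavelet, level-2 detail coefficients for the EMG, level-4 approximation for the EEG) can be sketched with the PyWavelets library; the signals below are synthetic stand-ins for the recorded windows:

```python
# Sketch of the wavelet decomposition settings: db7 mother wavelet,
# level-2 detail coefficients for EMG and the level-4 approximation
# for EEG. The signals are random placeholders.
import numpy as np
import pywt

rng = np.random.default_rng(0)
emg = rng.normal(size=1024)   # placeholder EMG window
eeg = rng.normal(size=1024)   # placeholder EEG window

# wavedec returns [cA_L, cD_L, ..., cD_1]; index 1 is the level-L detail.
emg_coeffs = pywt.wavedec(emg, "db7", level=2)
d2 = emg_coeffs[1]            # detail coefficients, second level (EMG)

eeg_coeffs = pywt.wavedec(eeg, "db7", level=4)
a4 = eeg_coeffs[0]            # approximation coefficients, fourth level (EEG)

print(len(d2), len(a4))
```

The selected coefficient arrays would then feed the feature extraction stage in place of the raw signal.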
The MLP and SVM are followed by the QDA, which applies non-linear decision boundaries for class separation, thereby reinforcing that neuromuscular and brainwave-based physiological signals are non-linear phenomena that also require non-linear separation boundaries. Amongst the various classifiers and different biosensor combinations, it can be seen that the combination of the proposed method and the Group 1 features provides the best classification performance across all classifiers. All other feature groups, for both the original signals and the WD, appear to be within range of each other for all classifiers, with slight variation depending on the classifier. For the proposed method with Group 1 features and the EMG-only results, the highest classification accuracy was in the region of 93% for the MLP, while the lowest was in the region of 70% for the LDA classifier, thereby exhibiting relatively high accuracy across all classifiers. For the EEG-only results, the highest accuracy was 63% while the lowest was 50%, a notably reduced accuracy when compared with the EMG-only results. This is thought to be because EEG is predominantly a high-channel monitoring biosensor that requires more channels to operate than EMG, thus resulting in the lower classification accuracy obtained here.
TABLE 4 Neural oscillation frequencies and their associated cognitive functions

TABLE 5 Information regarding chosen decomposition levels and mother wavelets

FIGURE 4 Results across four different classifiers

In the case of the sensor fusion comprising EMG-EEG, increases in classification accuracy of 13% for the LDA, 6% for the QDA, 3% for the MLP and 2% for the SVM were observed when compared with the EMG-only results. The LDA showed the most notable enhancement from the biosensor fusion, closely followed by the QDA, suggesting that linear and non-iterative classifiers benefit the most from sensor fusion, in contrast to black-box (MLP) and iterative (SVM) classifiers. From a practical perspective, the results suggest that a combination of parsimonious classifiers with a fusion of biosensor information provides enhanced classification, and is also likely to yield a robust pattern recognition control scheme, as the classification decision is made from information from a number of sources. The downside of this is the additional sensing channels required, alongside the supporting signal acquisition and conditioning electronic hardware. For the case where a more powerful classifier is applied, and assuming an EMG biosensor, the results show that there is no notable benefit in a multi-modal sensing architecture as far as classification accuracy is concerned. This would allow for simpler and more affordable electronic hardware, but with the downside of a longer computation and thus response time due to the nature of the classifiers, along with susceptibility to classification degradation due to the limitations in the sensing principles of the single candidate biosensor.
These factors make it challenging to recommend a single biosensor and classifier architecture to adopt, but from a signal processing perspective they provide evidence to suggest that the proposed method allows for enhanced classification of physiologically based signals for prosthesis control. It is also worth mentioning that, for the proposed method, all the classifiers exhibited a highly repeatable classification accuracy, aside from the MLP. This has been attributed to the black-box architecture of the classification method for the time being, but is likely an area for further research. A confusion matrix plot for the LDA classifier for the EMG-EEG sensor configuration can be seen in Figure 5. From this result, it can be noted, from the perspective of a linear classifier, that the gesture intent signals with the highest classification accuracies belong to HO and NM, while HC produced the lowest classification accuracy. At this stage, it is not clear whether the relatively lower classification accuracy for this gesture is due to physiological changes in the amputee post-surgery or due to uncertainties in the data collection process. Subsequent work will involve the extraction of further key features from the signal, which can contribute towards enhanced motion intent decoding and greater linear classification accuracy for this gesture. As the MLP classifier provided the best classification results for the proposed method, a further investigation was carried out to observe the individual contributions and weightings of the various features from Group 1 for both the EMG and EEG. The results of this can be seen in Figure 6.

FIGURE 5 Confusion matrix plot showing results for the LDA for proposed method features and EMG sensor. EMG, electromyography; LDA, linear discriminant analysis
For the MLP-EMG results shown in Figure 6, it can be seen that the amplitude-based features produced the biggest contribution, implying that amplitude quantification features would be the more impactful feature group for this sensor and classifier configuration, should further feature extraction be required. The same was seen for the MLP-EEG sensor configuration, with the only notable change being the number of peaks within the signals, which showed a greater contribution relative to the EMG results. This suggests that, in addition to amplitude quantification features, features similar to the SP, which are capable of characterising the level of activity and dynamic behaviour within the signals, would be impactful for the classification of the EEG signal. Using principal component analysis (PCA), a visualisation of the projection of the various hand gestures in feature space can be seen in Figure 7. Figure 7 provides a visual illustration of the degree of separation existing between the various gesture intent motions for the first two principal components. From this, it can be seen that HO and NM qualitatively show a reasonable amount of separation and distinction in the feature space. A notable overlap can be seen between the HC and WP motion intents, which provides further intuition behind the low classification accuracy for the HC motion intent in Figure 5. Although a shortcoming of the data collection is that the data emanates from a single amputee participant, to help compensate for this, the signal processing involved the extraction of multiple groups of features across multiple domains and decomposition methods (time domain, WD, proposed method), allowing for further sample generation alongside a comparison of the obtained results across four distinct classifiers.
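The PCA projection used for the Figure 7 visualisation can be sketched with scikit-learn; the feature matrix below is a random placeholder for the Group 1 feature vectors:

```python
# Sketch of the PCA projection: fit on a placeholder feature matrix and
# keep the first two principal components, as used for the gesture plot.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))       # placeholder Group 1 feature vectors

pca = PCA(n_components=2)
scores = pca.fit_transform(X)       # coordinates in the first two PCs
print(scores.shape, pca.explained_variance_ratio_.sum())
```

Plotting the two score columns, coloured by gesture label, gives the qualitative separation view described above.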
The proposed software-based implementation for the proposed method is similar to that of the wavelet transform. First, an offline process would need to be carried out to determine the optimal region of the signal. Once this is achieved, subsequent signals would be analysed and have the relevant features extracted from within the optimal region of the signal before classification (assuming a trained classifier). In addition to greater classification accuracy, it is expected that there would be computational benefits in terms of the response and selection times of the control system, due to a low-dimensional signal being used for the pattern recognition process. In a situation where the optimal signal level is known after offline processing with the proposed method, analogue circuitry could be used to clip and rectify the signal to remove redundancy before feature extraction, to further minimise computation and maximise response time, and thereby further enhance the connection between the amputee and the prosthesis limb. For this, clipping circuitry comprising a mixture of linear and non-linear analogue components, such as resistors and diodes, could be used for amplitude selection of the signal based on the recommendations obtained from the proposed process [59].

FIGURE 6 Classification results for single features for the MLP-EMG setting. MLP, multi-layer perceptron neural network; EMG, electromyography
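A software analogue of the clip-and-rectify step described above can be sketched in NumPy. The amplitude band limits here are illustrative placeholders, not values from the paper's offline tuning:

```python
# Sketch of rectifying a bio-potential window and keeping only samples
# inside an optimal amplitude band found offline. The band limits are
# hypothetical; the real limits come from the offline tuning process.
import numpy as np

rng = np.random.default_rng(0)
signal = rng.normal(size=500)       # placeholder bio-potential window

lower, upper = 0.5, 2.0             # hypothetical optimal-region thresholds
rectified = np.abs(signal)          # full-wave rectification
optimal = rectified[(rectified >= lower) & (rectified <= upper)]

# Features are then extracted from this reduced, lower-dimensional segment.
print(optimal.size, signal.size)
```

In hardware, the same selection would be performed by the diode-and-resistor clipping network before the signal reaches the feature extraction stage.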
FIGURE 7 Principal component analysis plot for the proposed method, electromyography only, for the first two principal components, accounting for 95% of the variability in the data

Here, a time-domain signal decomposition method has been applied for the separation of the signal into what is deemed an optimal region with respect to a performance index, before feature extraction and classification for motion intent decoding for transhumeral prosthesis control. Using a selection of four features extracted from the resulting signals, the proposed method was seen to allow for enhanced gesture classification, and thereby superior motion intent decoding, when compared with the results from the original signal and the wavelet transform across four different types of classifier. This performance was consistent across both the EMG and EEG sensor configurations. The proposed method has also been seen to be computationally efficient due to its algorithmic process, and can be implemented using analogue electronic circuitry, thus allowing for economical hardware requirements, low computational demands, and enhanced motion intent decoding accuracy [13–19]. Based on the findings from this work and prior research done with the proposed method, the limitations of the technique include: (i) a lack of insight into the frequency content of the signal or the optimal region, despite being a separation technique; thus, a spectral analysis using the FFT would need to be done to further observe and contrast the redundant portion of the signal against the optimal content from a frequency perspective, for both the EMG and EEG signals; (ii) EMG signals have been seen to degrade due to muscle physiology resulting from fatigue and a concurrent build-up of lactic acid, which could perhaps affect the tuned threshold for the proposed method.
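The spectral check described in limitation (i) could be sketched with NumPy's FFT, comparing the magnitude spectrum of the retained segment against the discarded portion. The signals and sampling rate below are placeholders for illustration:

```python
# Sketch of contrasting the optimal region against the redundant portion
# in the frequency domain. Signals and sampling rate are hypothetical.
import numpy as np

fs = 1000.0                               # hypothetical sampling rate, Hz
t = np.arange(0, 1.0, 1.0 / fs)
optimal = np.sin(2 * np.pi * 50 * t)      # stand-in for the optimal region
redundant = np.sin(2 * np.pi * 200 * t)   # stand-in for the redundant part

freqs = np.fft.rfftfreq(t.size, d=1.0 / fs)
opt_mag = np.abs(np.fft.rfft(optimal))
red_mag = np.abs(np.fft.rfft(redundant))

# Dominant frequency of each segment.
print(freqs[opt_mag.argmax()], freqs[red_mag.argmax()])  # 50.0 200.0
```

Comparing the two spectra in this way would reveal which frequency bands the proposed method's thresholding retains or discards.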
As a means of dealing with the fatigue effect, further work would involve attempts to increase the robustness of the proposed method with an adaptive threshold which polls for the level of contraction force and then follows up with a parameter estimation process, which would help correlate and adjust the threshold level of the proposed method to the state of the anatomy before motion intent decoding [43]. Another approach towards making the method robust to fatigue and variation of contraction force is a data-driven one, using a broader set of samples with varying contraction forces and then finding the optimum threshold by taking into account the unique properties of the signals from each contraction level [60]. To improve aspects of the pattern recognition control scheme, further work would involve the exploration of additional features that can be extracted from the signal decomposed by the proposed method, to boost classification accuracy in particular for the case of the EEG signal and the LDA classifier. Furthermore, now that the results from the proposed method have been compared with those of both the raw signal and the WT, and its intent decoding capabilities have been seen, further validation of the proposed method across a broader cohort of transhumeral amputees with varying degrees of amputation would now be conducted, to observe the overall performance and consistency of the method [61–64].

ACKNOWLEDGEMENTS
The research work was supported in part by the National Natural Science Foundation of China under Grants (#U1613222, #81850410557, #8201101443), the Shenzhen Science and Technology Program (#SGLH20180625142402055), and the CAS President's International Fellowship Initiative Grant (#2019PB0036). Mojisola G. Asogbon sincerely appreciates the support of the Chinese Government Scholarship in the pursuit of a PhD degree at the University of Chinese Academy of Sciences, Beijing.
The authors would like to thank Brian Kerr from Kerr Editing for proofreading the manuscript and Dr Li Xiangxin for assistance in the data acquisition process. The authors would also like to extend their appreciation to Dr Cristobal Ruiz-Carcel and a number of other academics from WP6 on the CHARIOT project, who guided Dr Ejay Nsugbe in the scientific formulation of the proposed method, incepted as part of his doctoral research.

ORCID
Ejay Nsugbe https://orcid.org/0000-0003-0674-1611
CAAI Transactions on Intelligence Technology, 2021, Issue 3