A Real-Time Method for Decoding the Neural Drive to Muscles Using Single-Channel Intra-Muscular EMG Recordings

The neural command from motor neurons to muscles — sometimes referred to as the neural drive to muscle — can be identiﬁed by decomposition of electromyographic (EMG) signals. This approach can be used for inferring the voluntary commands in neural interfaces in patients with limb amputations. This paper proposes for the ﬁrst time an innovative method for fully automatic and real-time intramuscular EMG (iEMG) decomposition. The method is based on online single-pass density-based clustering and adaptive classiﬁcation of bivariate features, using the concept of potential measure. No attempt was made to resolve superimposed motor unit action potentials. The proposed algorithm was validated on sets of simulated and experimental iEMG signals. Signals were recorded from the biceps femoris long-head, vastus medialis and lateralis and tibialis anterior muscles during low-to-moderate isometric constant-force and linearly-varying force contractions. The average number of missed, duplicated and erroneous clusters for the examined signals was 0 . 5 ± 0 . 8, 1 . 2 ± 1 . 0, and 1 . 0 ± 0 . 8, respectively. The average decomposition accuracy (deﬁned similar to signal detection theory but without using True Negatives in the denominator) and coeﬃcient of determination (variance accounted for) for the cumulative discharge rate estimation were 70 ± 9%, and 94 ± 5%, respectively. The time cost for processing each 200 ms iEMG interval was 43 ± 16 (21–97) ms. However, computational time generally increases over time as a function of frames/signal epochs. Meanwhile, the incremental accuracy deﬁned as the accuracy of real-time analysis of each signal epoch, was 74 ± 18% for epochs recorded after initial one second. The proposed algorithm is thus a promising new tool for neural decoding in the next-generation of prosthetic control.


Introduction
Muscles are typically controlled by a few hundred motor neurons whose cell bodies are in motor nuclei in the spinal cord or brain stem. 1 The electrical activity of a muscle -electromyogram (EMG) -is the algebraic sum of the muscle fiber action potentials activated by the neural activation of the innervating motor neurons. A motor neuron and the muscle fibers it innervates is the smallest functional voluntary unit by which the nervous system controls the movements and is called motor unit. The neural command from motor neurons conveying information about the control of muscle voluntary contractions is often referred to as the neural drive to muscle. 2 The neural drive to muscles can be identified from EMG signals recorded either noninvasively (surface EMG, sEMG) or invasively with needle and wire electrodes (intramuscular EMG, iEMG). iEMG has high selectivity for individual motor unit action potentials, compared with sEMG, and is thus used to measure motor unit activity. 3 The identification of the neural drive to muscles is obtained by the so called decomposition of EMG signals. According to this procedure, the motor unit action potential (MUAP) trains are extracted from the EMG. 4 Applications of EMG decomposition are in diagnosing neuromuscular diseases based on morphological measures of MUAPs, 5 estimating MU architectural properties, physiological and anatomical studies of the neuromuscular system, 6 studies on MU control [7][8][9] and neural connectivity, 10,11 and neurological assessments. 12 Although, Brain-Computer interface (BCI) is usually performed using electroencephalogram (EEG), 13,14 it is possible to use EMG decomposition for human-machine interfacing to control external devices. 2 The automatic decomposing of EMG signals was pioneered by LeFever and De Luca. 15,16 Subsequently, several other EMG decomposition algorithms have been developed. For example, Stashuk proposed a quantitative approach for decomposing EMG signals. 17 A wavelet-based feature extraction for multichannel EMG decomposition was introduced by Zennaro and collaborators. 18 Breakthroughs in EMG decomposition have been the EMGLAB software for decomposing intramuscular EMG signals which is freely available online 19 and two commercial methods for decomposing surface EMG signals. 20,21 Recently single-channel intramuscular EMG signal decomposition methods have been proposed, 22,23 as well as multichannel approaches. 24,25 For most applications, offline decomposition of the EMG is sufficient and indeed the vast majority of decomposition approaches did not deal with constraints in computation time. 26 However, recently, the use of EMG decomposition has been proposed for establishing neural interfaces in patients with limb amputations. 27 In these patients, nerve activity can be indirectly recorded by guiding nerves that previously innervated the missing limb muscles into accessary muscles that are used as biological amplifiers of nerve activity. 28 The resulting EMG signals are usually processed as global interference signals for extracting their amplitude as a crude estimate of nerve activity. [29][30][31][32] However, recently, it has been proposed that the EMG signals from reinnervated muscles can be decomposed so that the efferent neural information of the innervating nerves can be extracted directly. 2,27 In this way, a neural interface is established which provides the same information as if the nerve would be directly interfaced with neural intrafascicular electrodes. This approach has been previously discussed as a theoretical possible way to control active prostheses 33 and is based on a strong evidence of direct association between the neural drive to muscles estimated from EMG decomposition and muscle force. 34 Nonetheless, for this approach to be feasible in practice, the EMG decomposition needs to be performed in real-time, whereas, all current decomposition methods are in fact time consuming (for instance, the run-time of the state-of-the art decomposition program PD II for different muscles and force levels are listed in Table 1 in). 35 Actually EMG decomposition is in most cases not even fully automatic but requires a long interaction with expert operators. These restrictions are not acceptable for controlling active prostheses which requires a maximum processing delay of few hundreds milliseconds. 36 Therefore, the present study proposes an innovative method for fully automatic and online intramuscular EMG decomposition, which is not possible with current methods. We focus on invasive EMG rather than surface EMG because future prosthetic systems will require implantable technology that has the advantage of not requiring electrode replacement.
Here, we propose for the first time a method for the decomposition of intramuscular EMG signals with processing time limited to 200 ms for each processing interval of 200 ms. Therefore, this method can be applied for the online extraction of MUAP trains to control a new generation of active prostheses based on direct decoding of the neural drive to muscles. Since the control variable would be a series of action potentials discharged by motor neurons, this approach is theoretically equivalent to recording nerve activity but with the advantage of requiring an implantation into muscles instead of nerves, with greater signal-to-noise ratio and less invasive surgery. This paper presents in detail the proposed EMG decomposition algorithm and its implementation, with specific emphasis on the online aspects, as well as a full validation of the online decomposition for both synthetic and experimental signals. The focus is on the decomposition while its use for controlling external devices in man-machine interfacing is left to subsequent studies.

Materials
We assessed the performance of the proposed EMG decomposition method on simulated as well as experimental iEMG signals.

Simulated signals
The model of intramuscular EMG signal generation has been previously described 37 and used for validating EMG decomposition algorithms. 38 This model generates synthetic signals with the possibility to vary the number of MUs, percentage of superimposition between MUAPs, inter-and intraclass variability in the MUAP shapes and recruitment conditions. Twelve simulated signals were generated in this study. Five of these signals represented linearly increasing muscle activation over 10 s and the remaining signals corresponded to 20 s linearly increasing and decreasing activation. The average number of motor units (MUs) and mean discharge rate (MDR) in the simulated dataset was 6.6 ± 1.3, and 32.8 ± 14.3 pulses/s, respectively. Other parameters used for generating the simulated signals are provided in Table 1.

Experimental signals
Experimental signals were recorded during isometric contractions from the vastus medialis obliquus (VM), vastus lateralis (VL), biceps femoris long-head (BFlh), and tibialis anterior (TA). The subjects were healthy, without any neuromuscular disorder, pain, or regular training of the lower limb. All subjects were informed about the procedures of the study, which was conducted in accordance with the Declaration of Helsinki and approved by the local ethics committee. The reference maximal voluntary contraction (MVC) for the definition of the submaximal force levels was selected as the highest value of two knee extensions for VM and VL, flexion for BFlh, dorsiflexion of the ankle for TA over a period of 5 s, separated by 2 min of rest.
Experimental Protocol VM and VL. Five men (mean ± SD, age: 26 ± 2.9 years; stature: 1.83 ± 0.03 m; body mass: 73 ± 12.2 kg) participated in this experiment. Each subject seated relaxed on an Isokinetic dynamometer (KinCom Dynamometer-Chattanooga, TN, USA) with the hip and distal thigh firmly strapped to the chair, the lower right leg secured to the dynamometer lever arm above the lateral malleolus, and the rotational axis of the dynamometer aligned with the right lateral femoral epicondyle. The knee was flexed at 90 • .
Signals were recorded from VM and VL with a pair of Teflon-coated stainless steel wires (diameter: 0.1 mm; A-M Systems, Carlsborg, W) inserted with 23-gauge hypodermic needles. The wires were cut to expose only their cross sections, and provided bipolar signals which were amplified (Counterpoint EMG, DANTEC Medical, Skovlunde, Denmark), band-pass filtered (500 Hz-5 kHz), sampled at 10 kHz, and stored after 12-bit A/D conversion. The ground electrode was mounted at the right ankle.
The subjects performed two isometric knee extensions at 10% and 30% MVC (random order) for 10 s, with 2 min of rest in between. An oscilloscope provided a visual feedback on the force output for the subjects.
BFlh. Five men participated in this experiment (mean ± SD, age: 35 ± 5.2 years; stature: 1.79 ± 0.05 m; body mass: 80 ± 7.5 kg). Each subject laid prone on a bed with the knee of the left leg flexed at 45 • and the thigh in slight lateral rotation according to SENIAM recommendations. 39 The torque in isometric knee flexion was measured by a cuff, placed around the ankle and connected to a load cell. A monopolar needle electrode (27 gauge, 37 mm, Gilroy, CA, US) was inserted into the muscle. The monopolar iEMG signal was amplified with filter settings of 5 Hz-5 kHz (Nicolet Viking, Madison, WI), sampled at 10 kHz, and stored on a computer. The monopolar reference was located proximal to a surface electrode on the thigh, while the ground electrode was located on the medial knee. A load cell and custom made amplifier measured the knee flexion force, recorded concurrently with the iEMG signals. The force signal was also provided as a feedback to the subject on a circular bar graph display. The subjects performed one-minute isometric constantforce knee flexion contractions at 5% and 10% MVC, each lasting 20 s.

TA.
Five men participated in the experiment (mean ± SD, age: 36.4 ± 12.8 years; stature: 1.77 ± 0.10 m; body mass: 81 ± 11.0 kg). While supine on a bed, they were asked to dorsiflex the ankle against a cuff around the foot connected to a load cell, with the foot inverted and without extending the great toe. A monopolar needle electrode (27 gauge, 37 mm, Gilroy, CA, US) was placed in the third distal part of the muscle. The monopolar iEMG signals were amplified in the same way as for BFlh, with the monopolar reference located on the distal tendon and the ground electrode located on the patella. The iEMG signals were also high-pass filtered at 1 kHz and displayed in real time to enable the investigators to visualize the signal complexity and quality during the experiment. The subjects were asked to increase the strength of the contraction until the iEMG signal contained from 6 to 12 active MU trains, as judged by the investigators. Also, signals were recorded during ramp contractions up to 40% MVC. Audio feedback of the iEMG signals was provided to help the subjects maintain steady contractions. Each subject performed one-minute isometric contractions twice.

Algorithm
The flowchart of the proposed algorithm is depicted in Fig. 1. The details of this method are provided in the following:

Signal Conditioning
The proposed recursive algorithm uses 200 ms iEMG signal epochs for the analysis. The frame size was selected as to be suitable for prosthesis control. 36,40 The input signal frame is passed to the first-order high-pass Butterworth filter with the cut-off frequency of 1 kHz in the forward and reverse direction. Moreover the initial condition of the filter was set as to have zero-phase distortion and minimize start-up and ending transients. 41 This filter decreases the duration of the Active Segments (AcS's). Moreover, it reduces the temporal overlaps and removes the low-frequency and nondiscriminating components. 19

Segmentation
The segmentation is performed by a detection threshold of Kσ noise (with K set to 4.0), 19 on 2 ms intervals. The standard deviation of the background noise, σ noise , is estimated using the algorithm introduced in Ref. 42. The noise properties are adaptively estimated for each signal epoch.

Feature Extraction
The feature extraction consists of two steps (Fig. 1). The first step is the alignment phase in which the detected active segments are aligned on the highest peak using a high-resolution peak alignment method introduced in Ref. 43. The alignment is performed to sub-sampling-interval using trigonometric interpolation to eliminate residual error due to time quantization. This makes it possible to analyze EMG signals sampled at the Nyquist rate without oversampling. The time instants of the aligned peaks were considered as the discharge times of the active motor units. In the second step, each AcS was projected in a two-dimensional feature space. The two features were the Root Mean Square (RMS) and the Difference Absolute Standard Deviation Value (DASDV), 44 defined as follows: where x is the vector of the AcS time samples and N is the number of samples in each AcS.

Online clustering and classification
An online single-pass density-based clustering method was developed to detect the number of MUs and their corresponding clusters in the feature space ( Fig. 1). The pseudo-code of the algorithm was provided in the appendix.
In the proposed algorithm, the representative of each cluster is the time average of all the MUAPs in that cluster and the neighborhood of a cluster center is defined by two numbers, NB1 and NB2, in the 2-D feature space ( Fig. 2A). If there is only one point in a cluster (i.e. a new cluster has been just defined), that point is the cluster center and the neighborhood borders are then calculated using Eq. (3).
where f 1 var and f 2 var are the variability of RMS and DASDV values based on a high resolution shift, respectively. In fact, we used high-resolution alignment of 0.2 samples for the AcS using the algorithm proposed by McGill and Dorfman 43 and the maximum variability was estimated in the feature space. The parameters f 1 noise and f 2 noise are the variability of RMS and DASDV value for background noise, respectively. 19 When a cluster contains more than one member (i.e., it starts to grow), the neighborhood borders are adaptively calculated according to Eq. (4).
where µ and δ are the mean and standard deviation of the cluster members in the feature space. Accordingly, the neighborhood of each cluster is defined independently. Clusters with higher amplitude MUAPs or higher MUAP variability, have wider neighborhood with respect to smaller MUAPs. The correlation parameters used in Algorithm 1 are defined as follows. Pseudo-correlation and Correlation are measures of similarity between time samples of two AcS's, defined by Eqs. (5)-(6): 45 where m is the length of the sought-after cluster representative x, equal to that of the AcS y in our algorirthm. In fact, the lower bound of the correlations was set to zero. Meanwhile, the threshold for pseudo-correlation (Thr PsC) was set to 0.5, in the algorithm 1. It was tuned based on the sensitivity analysis. The pseudo-correlation function was implemented in C++ with Optivec vectorization package 46 for vector and matrix operations (http://www.optivec.com/). The optimal performance of the proposed algorithm is achieved when using pseudo-correlation instead of correlation in Algorithm 1. However, we used correlation in AcS assignment to decrease the running time. It did not significantly degrade the performance of the program.
The density-based clustering method proposed here is based on the concept of the potential measure. 47 The potential of an AcS x in a group of M AcS's is defined as where α is a weight constant fixed to 0.5 in this study. Any values between 0.3 and 0.8 were acceptable. From this definition, any point with many close neighbors (densely surrounded) has high potential while sparsely located points have low potential.
Using the concept of potential, the center of any cluster is defined as the point in the cluster with the highest potential value. When a new point (x k ) is added (removed) to (from) a cluster, the potentials of all the points (x i ) within the cluster are increased (decreased) with one term as in Eq. (8) Thus, it is possible to save the previous potential of all the points and only add or remove one term to update the potential of any point in the cluster. When a new point is added to a cluster, if its potential is greater than the potential of the previous cluster center it becomes the new center, otherwise the cluster center is unchanged.
The following two sections of the algorithm start after five initial frames and run at each iteration.

Sparse cluster re-assignment
The parameter cluster age indicates the elapsed time from the birth of a cluster. Because the minimum firing rate for motor units in humans is usually not below 5 Hz, 48 a cluster with less than 5 assigned AcS's and with cluster age of 1 s is considered as a sparse cluster and is eliminated (Fig. 2B). The elements of this cluster are then reassigned to the existing clusters. If no cluster is assigned to such AcS's, they are considered as outliers (i.e. superpositions).

Cluster merging
Two or more clusters are merged together (Fig. 2C), under the following two conditions: Clusters with Pseudo-correlation of the representatives >0.5 (Thr merge) and up to 5 inter-spike interval violations (or namely as firing-time inconsistencies) (n inc ) in the merged firing times. A firing time is inconsistent if it follows the previous one with a separation interval <1/60 s. 22,49

CDR calculation and report
The estimate of the neural drive to the muscle is obtained from the Cumulative Discharge Rate (CDR). 50 After decomposition of each epoch, the firing rate of each motor unit is estimated using a robust method proposed by McGill et al. 42 and Marateb et al. 22 and the CDR is estimated in pulse/s (pps). In order to decrease the fluctuations of CDR in different frames, the CDR is low-pass filtered with an exponential smoothing filter (Eq. 9)) with the smoothing factor equal to 0.8. 51 Since the CDR estimation is based on the firing rate distribution, it is robust to missed firings. (9) where x, and y are the input and output of the filter, respectively. The parameter β is the smoothing factor.

Validation
For the experimental iEMG recordings, the decomposition with the proposed algorithm was compared with that performed by an experienced investigator using the EMGlab decomposition software (http://www.emglab.net). 19 The investigator checked and edited the results to make sure that the identified firing patterns were valid. Then the manual results of the constant-force isometric contractions were assessed using a rigorous a posteriori statistical analysis 52 rating each discharge as highly confident if it was found to be accurate within ± 0.5 ms with a confidence level of >99% (LTP: likely true positive), as approximate when it was accurate within ± 5.0 ms with a confidence level of >95%, and as uncertain otherwise. 22 LTP firings were used as the gold standard. The quality of the manual decomposition was assessed based on the percentage of the LTPs in the decomposition. These annotations were considered as the gold standard. 45,53 Each identified MUAP train was assessed as valid if at least 50% of its discharges were time-locked to the discharges of a valid manually identified train within ± 0.5 ms.
Three classes of validation measures were used: (i) Measures to assess the quality of clustering: Sensitivity (Se%), Precision (Pr%) and Accuracy (Acc%) (ii) Measures to assess the quality of decomposition: Assignment accuracy (A c %), Assignment rate (A r %) and General Assignment Performance (GAP%) (iii) Measures to assess the quality of CDR estimation: R-squared (R 2 ), Variance Accounted For (VAF) and Pearson Correlation Coefficient (Pea) For each identified MUAP, the measures of clustering quality are according to Eqs. (10)- (12), where TP (true positive) is the number of highly confident discharges that the algorithm identified within ± 1 ms, FN (false negative) is the number of highly confident discharges that the algorithm failed to identify within ± 1 ms, and FP (false positive) is the number of discharges identified by the algorithm that did not match any manually identified discharge within ± 1 ms. The assignment accuracy (A c ), assignment rate (A r ) and general assignment performance (GAP) were defined as follows: 54 A c % = number of correctly assigned MUAPs total number of MUAPs assigned A r % = total number of MUAPs assigned total number of MUAPs detected × 100, The complexity of each EMG signal was characterized in terms of the signal-to-interference ratio (SIR), 55 which is an estimate of the percentage of the signal energy explained by the manual decomposition results. The MDR and coefficient of inter-spike interval (ISI) variation (CoV) of each MUAP train were estimated using the algorithm described in Ref. 43. The distinguishability of each MUAP was characterized in terms of the decomposability index (DI), 56 which is a measure of how different the MUAP is from all the other MUAPs in the signal, defined as below: where mu k is the k th MUAP and mu k * is the most similar template to mu k among the other templates in the signal and V RMS is the Root-Mean-Square value of the entire signal.
For assessing the quality of CDR estimation, the following goodness-of-fit measures were used: 57,58 where CDR orig is the CDR from the manually decomposed signal,CDR is the mean value of CDR, and CDR est is the estimated CDR from the algorithm. Also, n is the length of the CDR vector. In fact, Precision (defined in Eq. (11)) and Assignment Accuracy (Eq. (13)) are identical since the number of correctly assigned MUAPs is equal to TP and the total number of MUAPs assigned equals TP + FP. The total number of detected MUAPs is calculated in the segmentation step.
Further, we used an online measure named Incremental Accuracy (IA). 59 The definition of IA is the same as in Eq. (11) but it is calculated in each frame during the algorithm progression.
The analysis was performed on an Intel Core i7 2.4 GHz CPU with 4 GB of RAM. The algorithm was mainly implemented in MATLAB 8.2 (The Math-Works Inc., Natick, MA, 2013).

Statistical analysis
Parameter CDR R 2 was analyzed with a three-way ANOVA with factors the muscle (BFlh, TA, VM, VL, Sim) and the contraction type (isometric or ramp) and level. When the ANOVA identified a significant difference, the Tukey's honestly significant difference (HSD) post-hoc test was used for pair-wise comparisons. The Pearson's correlation coefficient (Pea) was calculated to investigate whether the number of missed, duplicate, erroneous MUs and SIR influenced CDR R. 2 Moreover, for all correctly identified MUs, the parameter r was calculated among Se, Sp, Acc and DI. The level of significance was set to 0.05. Data was analyzed using STATA 10 (StataCorp, 2007).

Results
The analyzed EMG signals are listed in Table 2. They were of low to moderate complexity, containing from 2 to 14 manually detected MUAP trains (average 7.5 ± 2.5 units). The value of mean discharge rate and coefficient of variation for the signals in Table 2 were 12.0 ± 2.1 pps and 17.0 ± 10.5%, respectively. The signal to interference ratio and DI of the analyzed signals were 87.0 ± 10.3 and 7.7 ± 4.2, respectively. The average LTP percentage of the gold standard was 85 ± 8.
The results of the proposed algorithm for a representative signal are depicted in Fig. 3. The SIR of this signal is 91% and the DI is 4.8 ± 2.4. The algorithm detected all 7 MUs without any miss, duplicate or erroneous cluster.
The overall performance measures of the algorithm are presented in Tables 3 and 4. The average number of missed, duplicated and erroneous clusters for the examined signals was 0.5 ± 0.8, 1.2 ± 1.0, and 1.0 ± 0.8, respectively. Overall, the sensitivity of the algorithm was 75 ± 7% while the assignment accuracy was 79 ± 6%. The accuracy was 70 ± 9% and the assignment rate was 91 ± 4%. The precision for all the evaluated signals was 73 ± 7%. Overall, eighty percent of the energy of raw iEMG signals was explained by the decomposition results. However, higher percentages shown for VM and VL could be due to the fact that the analog high-pass filter used in the recording system had a relatively high cut-off frequency (500 Hz) compared to 50 Hz used for TA and BFlh. Statistical analysis showed that the parameter Acc was strongly correlated with Se (Pea = 0.9; p < 0.05) and moderately correlated with Pr and DI (Pea = 0.5; p < 0.05). Table 4 shows the ability of the proposed algorithm in the online estimation of CDR. Over all the analyzed signals, the R-squared, Variance Accounted For and Pearson correlation coefficient were 69 ± 21%, 94 ± 5% and 95 ± 4%, respectively. The time cost for processing individual 200 ms intervals was 43 ± 16 ms. Statistical analysis revealed that none of the factors duplicate, erroneous, missed MUs and SIR influenced the CDR R 2 (|Pea| < 0.3; p < 0.05). Meanwhile, the only significant difference for CDR R 2 was between 40% MVC and 30% MVC contraction level (F = 49.4; p < 0.05). Figure 4 shows the estimate of CDR in a representative experimental signal with two consecutive     30% MVC ramp contractions. In this example, no MU was missed, thus the CDR estimation was unbiased.
In an online algorithm it is essential that the accuracy is high for the entire duration of the signal. Figure 5 shows the incremental accuracy (index IA) over time for a representative EMG signal from BFlh at 5% MVC.
In the proposed algorithm, each interval of analysis was 200 ms, therefore it is important that the computational cost for each interval is smaller than 200 ms. Figure 6 shows the computational cost for an experimental ramp signal at 30% MVC contraction level. The average time for processing each 200-ms interval over the 50 s long signal was 61 ms, and the maximum computational time was <92 ms. For this signal the number of identified MUs was 5 MUs with MDR of 10.6 ± 0.5 pps.

Discussion
The validation results (Table 3) showed that the proposed online decomposition method could decompose single-channel iEMG recordings during low to moderate isometric constant-force and ramp contractions with an average accuracy of 70%. The processing time of signals recorded at 30% MVC was in average twice of what obtained at 10% MVC for VM and VL muscles (Table 4). Thus, the algorithm could not operate real-time during higher contraction levels.
An attempt was made to implement resolving superimposed MUAPs. The resolving algorithms proposed by Florestal et al. 45 and Marateb and McGill 46 were implemented in Vectorized C++. Although the decomposition accuracy significantly increased (an average increase of 19%), the average running time was 287 ms for a 200 ms epoch. Thus, it was not practical in real-time implementation.
Since the program is online, superimposed MUAPs were not resolved. In fact, ten percent of the entirely analyzed AcS's in the experimental data were marked as outlier by our algorithm. Such superpositions frequently occur at higher contractions, among which constructive and destructive cases could reduce the performance of the proposed decomposition algorithm. Thus, the accuracy of our online algorithm is lower than typical offline complete decomposition programs. 22,35 However, its performance is comparable to what obtained with other offline incomplete decomposition programs, in terms of the assignment rate, accuracy and general assignment performance. 23,54,60 In fact, the later parameter is not only dependent on the number of superimposed MUAPs but also on the performance of the decomposition program. 23 Overall, 261 MUs (92% of the entire identified MUs) had at least 50% of their discharges time-locked to those of the manually identified trains, and where therefore considered for performance evaluation. Seventy-five percent of the missed MUs had DI lower than 5 dB. Thus, they were less distinguishable MUs. Meanwhile, 70% of the MUs with DI higher than 7 dB had Acc higher than 75%.
Among the entire performance measures reported in this study, the most important criteria are the CDR estimation quality measures (Table 4). Indeed, the estimate of the neural drive to the muscle is obtained from the CDR parameter. 50 Three goodness-of-fit measures R, 2 VAF and Pea were reported for CDR estimation. Although the parameter R 2 is usually known as Coefficient Of Determination (COD) to show the proportion of explained variance in the data by the predictor, VAF was reported as an alternative formula 61 and has been also used in the literature for the same purpose. 58,[62][63][64] Pea was also reported to show the linear dependence between two variables. The CDR estimation was better for the experimental ramp data (Table 4). This could be due to the fact that the level of force is not always high and the CDR estimation error is biased toward higher MVC levels (Fig. 5). On the other hand, the CDR estimation is robust enough to take into account the variations of MDR and CoV during the ramp contraction. Such a trend is bolded during ramp contraction; shown by COD.
The parameter CDR R 2 was not dependent on SIR in our study. Thus, CDR could be correctly estimated using incomplete decomposition where robust measures of MDR and CoV are used for its prediction. However, low-to-moderate force levels/complex signals ( Table 2) were used in our study. It is expected that with the increased force, the CDR goodness-of-fit drops.
In our study, the clustering validity measures Se, Pr and Acc as defined in Eqs. (10)-(12), were reported for each identified MU. The parameter TNs (True Negatives) defined based on information theory as the number of firings not assigned to each cluster and not belonging to that cluster was relatively high. Any criteria including this parameter is not used in multi-class classification problems. 65 Meanwhile, the classical accuracy measure overestimates the performance of the algorithm and thus it was not used. The accuracy measure, Acc, used in our study was similar to that proposed by Holobar et al.
and Negro et al. as the rate of agreement between two decomposition results. 24,55 Other similar accuracy measures have been used in the literature. 22,35 Although the classification performance of the proposed algorithm is not high (Table 3), the quality of clustering and also CDR estimation is high enough to use in prosthesis control.
The MU assignment threshold of 50% was used in our study. This threshold sets a lower bound for highly confident discharges, TP, assignment accuracy and assignment rate. However, lowering such a threshold, it may be difficult to determine whether the two trains relate to the same MU. Also, there is a trade-off between the number of reported missed MUs and the reported classification accuracy. If the threshold is high, identification accuracy is biased toward higher values while the number of missed MUs increases due to unassigned identified trains. Moreover, the assignment of identified MUAPs to the gold standard MUAPs was based on the agreement rate between their firing times. However, the correlation between the MUAP shapes extracted by our algorithm and those obtained in EMGLAB ranged between 0.66 and 1.00.
Moreover, the A-posteriori accuracy assessment method used to check the accuracy of the manuallydecomposed signals, works on single-channel, timeinvariant signals. Thus, it was not used for the ramp signals. In fact, the performance measures reported for the ramp signals reflect the agreement rate between the manually and automatically decomposed signals not the absolute accuracy.
In our study, the manual decomposition results were used as the gold standard. Like other studies, it is probable the superpositions were not correctly resolved by manual decomposition. Thus, the same gap would appear in the output of the algorithm and also the gold standard, resulting in overestimation of the decomposition accuracy. However, the detection probability (P d ) of the LTP IPIs, defined as the percentage of regular IPI in the IPI histogram, was estimated using the method proposed by McGill. 66 The average P d was 93 ± 10 (%) in constant-force isometric contractions, showing that in average 93% of the firing times of such signals were complete.
In our algorithm a high-pass filter of 1 kHz was used for signal conditioning. Thus, most of the volume-conducted distant MUAPs are removed from the iEMG signal. 26 However, these MUAPs are generally small, and have similar shapes. Therefore, it is difficult to correctly classify such MUAPs. Also, when they are in close temporal proximity to larger MUAPs, they are missed. Thus, it is practical to only detect MUAPs that can be consistently correctly assigned using such a high-pass filtering. 67 Such a filtering, on the other hand, limits the decomposition to MUs that have at least one muscle fiber within a certain radius of the recording electrode. This approach thus yields the activity of the entire MUs having muscle fibers within a given crosssectional region of the muscle. The amplitudes of MUAPs in the unfiltered signal depend both on the size of the MU (the number of muscle fibers in the MU, which is related to the recruitment threshold) and also on the distance of the MU from the recording electrode. The amplitude of the spike in the highpass filtered signal depends less on the total number of muscle fibers in the MU and more on the distance of the fibers that are closest to the electrode. Given two MUs, the MU with a smaller MUAP (in the unfiltered signal) can have a larger spike (in the high-pass-filtered signal) if it has a fiber closer to the electrode. This is one advantage of filteringit selects MUs that have at least one fiber within a certain radius of the electrode, regardless of their recruitment thresholds. Thus it provides a reasonably objective sampling of the MU activity in the vicinity of the recording electrode.
Low-amplitude MUs are not necessarily low threshold MUs, they can also be higher-threshold MUs that are distant from the electrode. In a case of homogeneous force generation of the nervous system across the muscle cross section, the decomposed signal would be a reasonable characterization of the neural drive to the muscle.
The neural drive to the muscle is the ensemble of action potential trains from the entire pool of motor neurons innervating the muscle. The larger the population of motor units considered, the better the estimation of the neural drive received by the muscle. Owing to their selectivity, intramuscular electrodes can detect only a small fraction of the MUs active during a certain contraction. However, relative changes in neural drive can also be estimated from a limited sample of MUs as those obtained from intramuscular recordings. 34 In fact, the estimate of the neural drive would not be influenced much assuming that the motor neurons receive a common synaptic input. 68 A further improvement of the neural drive estimation could be obtained using multichannel intramuscular electrodes. 69 Real-time processing should prevent erroneous classification in a given segment to be corrected later on. Procedures mentioned in Secs. 3.1.5 and 3.1.6 are related to such modifications. However, estimation of CDR (as mentioned in Sec. 3.1.7) is real-time and no correction is subsequently performed. In fact CDR is the single real-time output of our algorithm for prosthesis control. Meanwhile, the average IA (as a true real-time performance index) was 74 ± 18 (%) for the entire frames except the first 5 frames in which Secs. 3.1.5 and 3.1. 6 were not yet activated.
In our algorithm, the epoch length of 200 ms was used. This could be problematic in real-time applications. We used sensitivity analysis in which the epoch length changed from 100 ms to 300 ms in the step of 50 ms (Table 5). It is possible to use shorter epochs e.g. 150 ms or even 100 ms to improve the entire recording and analysis time for real-time applications. The overall performance of the algorithm did not significantly decrease compared with what obtains using 200 ms analysis epochs. It is also possible to further reduce the running time of the algorithm by implementing the entire algorithm in Vectorized C++ which is the focus of our future work.
Two features namely RMS and DASDV were used in our study. We further analyzed 94 features used in EMG signal processing. 44,[70][71][72][73] Different combinations of such features were analyzed using a ranking method. It was shown that among bivariate features, those used in our study were optimal in terms of accuracy and efficiency. We also added peak-to-peak and then the irregularity coefficient to our features to form the set of three and four features. The overall accuracy of the algorithm when using 2, 3, and 4 features was identical to 70%. The average running time was 43, 44 and 44 ms for analyzing 200 ms signal epoch. The improvement was not significant when moving from bivariate to quadrivariate features. The improvement of the accuracy of the method was however significant when using up to 14 features that was not practical in online implementation. Moreover, incorporating superposition resolution algorithms, an average increase of 19% in accuracy was obtained using our bivariate features. Thus, there is a higher bound on the expected accuracy when superimposed MUAPs are not resolved. A fixed threshold was used in our study for merging clusters (Fig. 2C). Adaptive thresholding such as those proposed by Stashuk and Qu 54 or soft thresholding based on an expert-based fuzzy system proposed by Marateb et al. 22 if implemented online, could in principle improve the performance of the algorithm. On the other hand, the assignment of an AcS to a cluster was performed adaptively in our algorithm (Eqs. (3) and (4)), in which the neighborhood of each cluster was estimated based on its MUAP variability and general background noise characteristics resulting in higher classification accuracy. Also, the proposed algorithm is used to decompose single-channel iEMG signals. Multiple channels provide multiple views of the MUAPs, so that MUAPs difficult to identify in one channel can often be identified more reliably in another. The online implementation of the multiple-channel version of the proposed algorithm is expected to improve the decomposition accuracy which is the focus of our future activity. It could be possible to use the proposed algorithm for decomposing other multi-unit biosignals, such as cortical recordings. 74

Conclusion
In conclusion, we have presented a novel online method for the decomposition of single-channel iEMG recordings. The method has been tested on different muscles (BFlh, VM, VL and TA) as well as simulated data during low-to-moderate isometric constant-force or ramp contractions. The proposed algorithm is thus a promising new tool for neural decoding in the next-generation of prosthetic control.
Update potential and center of the k-th cluster Else Make a new cluster and put AcS(i) as its cluster center End End Else Find the most similar cluster center (Cntr(k)) to AcS(i) using PsC between cluster representatives and AcS(i) If PsC(Clus Rep(k), AcS(i)) > Thr PsC Add AcS(i) to k-th cluster Update potential and center of the k-th cluster Else Make a new cluster and put AcS(i) as its cluster center End End End Note: The above algorithm is used for the first 10 s signal recordings. Then the below information is retained, the signal buffer is emptied and the next frame is analyzed. This procedure continues for the entire signal.