Extremum Seeking-based Adaptive PID Control applied to Neuromuscular Electrical Stimulation

Amultivariable deterministic extremum seeking (ES) is being evaluated to construct an adaptive Proportional-Integral-Derivative (PID) control law for the functional Neuromuscular Electrical Stimulation (NMES) of stroke patients. The developed scheme is applied to control the position of the patient’s arm so that movements of flexion/extension for its elbow can be produced. The true limitations of a PID controller for these types of applications is that a PID controller is designed for linear systems, but the system which is being controlled is nonlinear. Moreover, it is worth mention that clinicians’knowledge of control systems is limited. Therefore, their expertise in tuning controllers is limited. Also, in NMES applications each patient is unique and requires a unique set of PID parameters. Since it can be time consuming and difficult to find proper parameters for each patient, a better procedure, or a more intelligent adaptive controller, is needed. The PID parameters are updated by means of ES in order to minimize a cost function which brings the desired performance attributes. Experiments are performed with healthy volunteers and stroke patients, including significant advances based on real data and validation. Quantitative results show a reduction of 64.1% in terms of RMSE (Root-Mean-Square Error) – from 8.94◦ to 3.21◦ – when comparing the tracking curves of the last cycle to the first cycle in the experiments with all stroke patients.


INTRODUCTION
Neuromuscular Electrical Stimulation (NMES) is a technique based on the artificial activation of the second motor neurons using exogenous electrical impulses (Sheffler and Chae 2007).This activation may be used to increase muscle fatigue resistance, strength and in subjects with some neurological disorders such as stroke it can help them to make movements that they would not be able to perform.
NMES can be divided in two branches: the applications used as functional substitute and those ones intended to therapeutic intervention.For example, in a stroke patient with drop-foot (Seel et al. 2016), the NMES can activate the tibialis anterior muscle of the patient during swing phase.In this case, it is used as a functional substitute of a damaged central nervous system that is not able to recruit the necessary muscles during the gait.On the other hand, when a patient makes repetitive and voluntary training with his paretic arm it is shown that the central nervous system can be able to adapt and recover some functions, in a process called motor relearning (Sheffler and Chae 2007).The sensorimotor experience is believed to be of paramount relevance to neural plasticity and motor relearning and it has been shown the NMES effects are enhanced when it is used concomitantly with voluntary effort (Lynch andPopovic 2008, Maffiuletti 2010).
Most NMES devices used at clinics works in an open-loop approach are adjusted at the beginning of the therapy (Lynch and Popovic 2008).The amount of stimulation follows a pre-determined profile and demands user control to change stimulation parameters.This allows protocols aiming to enhance muscle contraction, sometimes simultaneously to the execution of intended contractions (Knutson et al. 2015).The drawback of this approach is that the device always gives the same amount of aid to the subject unless the therapist intervenes (Hara 2008, Freeman et al. 2009).Also, it has been shown that the open-loop devices are not well suited to promote an adequate association between the subject's intended movement and the artificial activation produced by NMES.
In this context, Proportional-Integral-Derivative (PID) controllers continue to represent a good option to closed-loop control the NMES electrical current amplitude based on the angular displacement of the arms because it has a simple implementation and its behavior is well known (Freeman et al. 2009).Although PID controllers are widely used in many distinct and general control processes, their effectiveness is often limited due to poor tuning.On the other hand, manual tuning is a time-consuming task and systematic methods rely on knowledge of the plant model or require special experiments to identify a suitable plant model.However, in NMES an exact plant model is not known, and it is not desirable to perform long system identification procedures with the patients in open loop (Oliveira et al. 2017).Note that the neuromuscular model is highly nonlinear and time-varying (Lynch and Popovic 2008), which means those tests may be often unfruitful.This adverse scenario of modeling motivates the application of robust and adaptive control techniques (Oliveira et al. 2017).
As a model-free real time optimization approach, extremum seeking (ES) is well suited for systems with unknown dynamics or those that are affected by high levels of uncertainty and or external dynamics (Krstic 2014).Thus, the method does not rely on the knowledge of system modeling parameters being robust to parametric uncertainties and unmodeled dynamics.In particular, extremum seeking does not merely monitor the direction of the output response but explores the measured response to estimate the gradient of the map and update the control input in proportion to the gradient of the map.Extremum seeking has the dual benefit of rigorously provable convergence and the simplicity of implementation, by employing only an integrator as well as an optional high-pass filter (Krstic and Wang 2000).For dynamic systems, it is enough to select the extremum seeking probing frequency reasonably smaller than the highest frequency that can pass the system without significant attenuation.This is justifiable in order to preserve the time-scale separation between plant and controller dynamics.Then, it can be shown that an attractive manifold is revealed in the new time scale by using a singular perturbation method (Khalil 2002), which essentially reduces the order of the considered dynamical system to a static nonlinear map perturbed by some fast stable dynamics, which in turn ultimately converges to a small residual set.In a nutshell, any stable plant dynamics (such as that in the human neuromuscular model) can be neglected for analysis and design purposes when the ES probing frequency is chosen sufficiently small.In this case, the price to be paid is the time dilation which slows down the closed-loop system responses.
From the early days of PID control, there has, additionally, been interest in providing recipes to the user for not only safe choices, but the best choices of the three parameters.The Ziegler-Nichols procedure is the best-known recipe, but also known to exhibit interior transients while asymptotically rejecting constant disturbances (Krstic 2017).Most of the existing methods are developed based on the assumption of linearity of the plant.The only method known to us in which the parameter optimization is conducted regardless of the plant's linearity or nonlinearity, or event the plant's dimension, is the ES method, as discussed by Killingsworth and Krstic 2006.This method views the system's response as a map from the three gains into a functional of the system's response over a time interval of interest.For linear systems, ES matches or beats the best performance attainable by linearity-based methods (Krstic 2017).
The paper presents a deterministic multivariable ES to adapt the gains of a PID controller used in upper-limb flexion/extension tasks using neuromuscular electrical stimulation.Differently of our previous publication based on healthy volunteers (Oliveira et al. 2016), experimental results are presented for stroke patients.We believe this is a meaningful and relevant contribution since the self-tuning of PID controllers for NMES-based rehabilitation would make their use much more practical in clinical settings by means of closed-loop control.Unlike (Oliveira et al. 2016), a full statistical analysis of the results of the proposed ES based PID controller is presented to show its efficacy in terms of Root-Mean-Square Error (RMSE).Quantitative results show a reduction of 64% of the total RMSE (from 8.94 • to 3.21 • ).
In this sense, optimization and control seem to be fundamental tools to face the unknown neuromuscular system (Alibeji et al. 2015, Bellman et al 2017, Merad et al. 2016, Sharma et al. 2009, 2011, 2012).In particular, our ES approach for simple adaptation of PID controller parameters in NMES is model-free having the interesting ability of controlling on multiple subjects without tediously tuning the controller, which is the main novelty of the paper.
It is also worth mention that ES has already been used in NMES literature to generate the desired trajectory (Zhang et al. 2006) or identify stimulation parameters for set-point regulation (Stegath et al. 2007) in open-loop tests.On the other hand, we can even find adaptive feedback strategies based on recursive least squares or general iterative learning control (Seel et al. 2016, Freeman et al. 2009, Freeman 2015), multiplemodel switched adaptive control (Brend et al. 2015) and robust feedback control (Sharma et al. 2012) applied to NMES or rehabilitation.However, to the best of our knowledge, the present paper is the first work which proposes the use of deterministic ES as a tool for adaptation of NMES closed-loop controllers experimental validation with stroke patients.

MATERIALS AND METHODS
A custom NMES device with a USB communication has been developed.Briefly, its analog module is a transconductance amplifier, which produces rectangular biphasic current with an amplitude controlled by a voltage at its input.A computer has been used to control the applied amplitude, pulse width and frequency of the electrical current stimulation.To restrain the arm movement of the subjects, a lightweight device has been built (Figure 1).It measures the elbow-joint angle using a goniometer (a) and allows mechanical adjustments to arm length (c) and lateral distance between two limbs (d).shows that the wrist has an attachment with linear freedom of movement along the aluminium square rod, while d points out that there is an adjustment for the lateral distance of the elbows.In picture on the bottom, the controlled joint angle, denoted by y, and the equipment are shown in a different view.A pair of electrodes per muscle are been used for the electrical stimulation.For instance, the pair of electrodes on the biceps muscle of the volunteer can be seen in picture on the top.
Patients were comfortably seated and had their arms adjusted to the device of Figure 1.Then, a pair of self-adhesive square electrodes (with 5 cm each side) was positioned at the distal portion of the biceps brachii muscle (BB) and another pair at the triceps brachii muscle (TB).The motor point was detected using a small round electrode of 1 cm 2 and defined as the point where the smallest amount of current could produce a muscle twitch when 1 Hz stimulation was used.
With the electrodes placed, each muscle is briefly stimulated using progressively increasing current up to the point that the subjects understood it was their limit before discomfort or when there was enough current to produce full elbow flexion or extension.All subjects are instructed not to exceed their limits because the results could be affected if they were not comfortable during the experimental protocols.The NMES pulses were balanced symmetrical biphasic with 400 µs pulse width at 50 Hz.The controller only modulated the current amplitude at each pulse.
Unilateral movements were made with one of the arms receiving NMES, while the output error signal e(t) := r(t) − y(t) , (1) to be decreased was calculated using the difference between the reference signal (r) and the angular position of controlled elbow (denoted by y), measured with a goniometer.
The same trapezoidal-shape reference was used in all experimental conditions (Figure 2).Each flexion and extension ramp has an angular velocity of 15 o per second up to a maximum angle of 45 o , and then a return to the baseline angle at same angular velocity.

PRINCIPLES OF DETERMINISTIC EXTREMUM SEEKING
Many versions of extremum seeking exist, with various approaches to their stability study.The most common version employs deterministic (periodic) perturbation signals for the purpose of estimating the gradient of the unknown map that is being optimized (Ariyur andKrstic 2003, Krstic 2014).To understand the basic idea of extremum seeking, we start our discussion with the case of static maps and then we move on to the more general setup of dynamic maps.For the sake of simplicity and clarity, it is best to first consider the case of a static single-input map of the quadratic form, as shown in Figure 3: where f * , f > 0, θ * are all unknown scalars.The map ( 2) is simply the local quadratic approximation (via Taylor series) for any C 2 convex function in our optimization (minimum seeking) problem, where (θ * , f * ) is the unknown extremum point and f is the unknown Hessian of the static map.
Three different θ 's appear in Figure 3: θ * is the unknown optimizer of the map, θ (t) is the real-time estimate of θ * , and θ (t) is the actual input into the map.The actual input θ (t) is based on the estimate θ (t) but is perturbed by the signal a sin(ωt) for the purpose of estimating the unknown gradient f • (θ − θ * ) of the map f (θ ).The estimate θ (t) is generated with the integrator −k/s with the adaptation gain k controlling the speed of estimation.
, where f * , f , θ * are all unknown.The user has to only know the sign of f (> 0), namely, whether the quadratic map has a minimum, and has to choose the adaptation gain k such that sgn(−k) = −sgn( f ).The user has to also choose the frequency ω as relatively large compared to a, k, and f .Regarding the Laplace symbol s, with some abuse of notation, we mix the frequency and time domains for brevity and conceptual clarity, i.e., the integrator transfer function k/s in the block diagram should be understood as an operator on a time-domain function.

DETERMINISTIC EXTREMUM SEEKING FOR NEUROMUSCULAR ELECTRICAL STIMULATION
The ES algorithm is successful if the error between the estimate θ (t) and the unknown θ * , namely the signal θ (t) := θ (t) − θ * (3) converges towards zero.Based on Figure 3, the estimate is governed by the differential equation θ (t) = −ka sin(ωt) f (θ ), which means that the estimation error is governed by By expanding the right-hand side one obtains A theoretically rigorous time-averaging procedure (Ariyur and Krstic 2003, Krstic 2014) allows to replace the above sinusoidal signals by their means, yielding the "average system" (Khalil 2002): which is exponentially stable since k f > 0. The averaging theory guarantees that there exists sufficiently large ω such that, if the initial estimate θ (0) is sufficiently close to the unknown θ * , ∀t ≥ 0: For the user, (7) guarantees that, if a is chosen small and ω is chosen large, the input θ (t) exponentially converges to a small interval around the unknown θ * and, consequently, the output f (θ (t)) converges to the vicinity of the optimal output f * .

FROM STATIC TO DYNAMIC MAPS
When we are interested in minimizing (w.l.o.g.) the output of an arbitrary, unknown, nonlinear dynamic map, we can consider the following general representation where x ∈ R m is the m-dimensional state vector, u ∈ R and y ∈ R represent the scalar input and output, respectively, and f : R m × R → R m as well as g : R m → R are smooth (Krstic and Wang 2000).Establishing that a smooth control law α : parametrized by a scalar parameter θ is acting upon the plant 1 , one obtains the closed-loop system 1 For simplicity, we assume that we have a static state-feedback control law; it would be trivial to extend the result to dynamic output feedback since y = g(x) in ( 9) (Krstic and Wang 2000).

DETERMINISTIC EXTREMUM SEEKING FOR NEUROMUSCULAR ELECTRICAL STIMULATION
Its equilibria, characterized by the scalar parameter θ , are specified through the following assumptions (Krstic and Wang 2000).Assumption A1: There exists a smooth vector field l : R → R m such that if and only if Assumption A2: For every value of the parameter θ ∈ R, the equilibrium of the system ( 11) is locally exponentially stable with decay and overshoot constants uniform in θ .
Assumption A2 is not restrictive since in a general context we can encompass a complete wide class of nonlinear plants given by ( 8)-( 9).Basically, we assume that we have a control law (10) which is robust with respect to its own parameter θ in the sense that it exponentially stabilizes any of the equilibria that θ may produce.It simply means that we have a control law designed for local stabilization and this control law need not be based on modeling knowledge of either f (x , u) or l(θ ).
Thus, we assume that the output equilibrium map is according to Assumptions A1 and A2.Defining J : R → R as the composition of the SISO output function g in ( 9) and the state vector function l in ( 13) we formulate our optimization problem as where the corresponding minimizing value is denoted by θ * , according to the next assumption.Assumption A3: There exists θ * ∈ R in the interior of some (local Hence, we assume the output equilibrium map y = J(θ ) has a minimum at θ = θ * , and the objective is to develop the feedback mechanism which minimizes the steady-state value of y, but without knowing either θ * or the functions g and l.
Under the conditions above, ES extends in a relatively straightforward manner from static maps to dynamic systems, provided the dynamics are stable (or stabilized by the control law u = α(x, θ )) and the ES algorithm's parameters are chosen so that the algorithm's dynamics are slower than those of the closed-loop plant.In the presence of dynamics, the equilibrium map ( 14)-( 15) will satisfy at least the same conditions as in the static map (2) and, therefore, convergence is guaranteed.The main difference is that upper bound for |θ (t) − θ * | in (7) would contain an exponential term with slower decaying rate and the residual set of order O(1/ω) is replaced by a term of order O(ω).
The stability analysis in the presence of dynamics employs both averaging and singular perturbations, in a specific order.The design guidelines for the selection of the algorithm's parameters follow the analysis.Though the guidelines are too lengthy to state here, they ensure that the plant's dynamics are on a fast time scale, the perturbations are on a medium time scale, and the ES algorithm is on a slow time scale (Krstic and Wang 2000).

ADAPTIVE PID CONTROL
In this section, we present a method for optimizing in real time the trapezoidal response of a closed-loop system consisting of a PID-NMES controller and the unknown human neuro-motor system with a discrete version of extremum seeking.
Specifically, ES minimizes a cost function which quantifies the performance of the PID controller and iteratively modifies the arguments of the cost function (the PID parameters) so that it's output reaches a local minimum.

COST FUNCTION AND PID CONTROLLER
ES is used to tune the parameters of a PID controller so as to minimize a given cost function.The cost function, which quantifies the effectiveness of a given PID controller, is evaluated at the conclusion of a trapezoidal-response experiment.We use the integrated square error (ISE) cost function 2 : where the output error in (1), parametrized in θ , is the difference between the reference and the output signal of the closed-loop system e(t, θ ) = r(t) − y(t, θ ) in (1), and contains the PID parameters.The PID controller structure and the meaning of K, T i , and T d are given in what follows.Equation ( 15) is the general cost function for the generic nonlinear model ( 8)-(9).Equation ( 18) is the cost function used in our particular NMES problem assuming that the neuromuscular plant is a specific class of system which falls into (8)-( 9).The cost function J(θ ) defined in (18) takes into account the error over the time interval [t 0 , T ].By setting t 0 to approximate the time at which the trapezoidal response of the closed-loop system reaches the flexion plateau (see Figure 2), the cost function J(θ ) effectively places zero weighting on the initial transient portion of the response.Hence, the controller is tuned to minimize the error beyond this phase without constraints on the initial transient.
We use a standard PID controller, in particular the derivative term acts on the measured output error (including the reference signal).This PID controller avoids large control effort once the reference signal is chosen as a trapezoidal function with bounded first-order derivative.
2 A comparative study involving other cost functions, such as IAE, ITAE and ITSE was already carried out in (Killingsworth and Krstic 2006) by showing that ISE produces responses with the smallest overshoots and fastest settling times.Considering that our manuscript deals with a real-world application involving stroke patients, it is not trivial to perform innumerable comparisons with these patients, which supports our choice for ISE as the unique cost function for experimental evaluation.

DETERMINISTIC EXTREMUM SEEKING FOR NEUROMUSCULAR ELECTRICAL STIMULATION
The optimal operation of this kind of process is complicated to reach, mainly due to their highly nonlinear nature and by the substantial unmodeled dynamics (such as fatigue) present in the neuromuscular system.In this way, optimization and control are fundamental tools to face such unknown plants.This research proposes an extremum-seeking control based on periodic perturbation signals to adapt the PID control parameters plugged into the closed-loop system and minimize the cost function which is chosen to reflect the desired performance attributes.The controller is parametrized as: where u(t) is the control signal and the constants are the proportional, integral and derivative gains.

EXTREMUM-SEEKING ONLINE TUNING
The cost function J(θ ) should be understood as a mapping from the PID parameters K, T i , and T d to the tracking performance.Another possibility would be to optimize K p , K i , and K d directly.ES seeks to tune the PID controller by finding a minimum of J(θ ).However, since ES is a gradient method, the PID parameters found by ES are not necessarily a global minimum of J(θ ).
where k is the discrete iteration number, and the subscript i indicates the ith entry of a vector.The adaptation gain is γ and α i are the perturbation amplitudes.Stability and convergence are influenced by the values of γ, α i and the shape of the cost function J(θ ) near the minimizer, as explained in section Principles of Deterministic Extremum Seeking.
Since the system is multivariable, it has high dimensionality (i = 1, . . ., 3), thus the orthogonality requirements on the elements of the periodic perturbation vector pose an implementation challenge.According to Ghaffari et al. 2012, Ariyur andKrstic 2003, in order to guarantee convergence, the user should choose appropriate frequencies ω i = ω j .The former is a key condition that differentiates the multi-input case (Ghaffari et al. 2012) from the single-input case (Krstic and Wang 2000).
In a general context, the probing frequencies ω i 's can be selected as where 0 < ω < 1 is a positive constant and ω i is a rational number.One classical choice for the perturbation vector (dither signals) in multivariable extremum seeking is given in (Ghaffari et al. 2012) as for all distinct i, j, k and l.Here, for the three-dimensional case (i ∈ 1, 2, 3), the modulation frequency ω i is simply chosen such that which satisfy (26) above.

ON THE ES PARAMETERS, THE SEARCH SPACE, COMPUTATIONAL COST AND HOW ES-BASED ADAPTATION WORKS
• In general the deterministic method seems to behave better/faster with increasing dither frequencies ω i (or ω) for static maps.However, in the presence of plant dynamics, it is prohibited in theory to increase arbitrarily the perturbation frequency ω due to the application of the singular perturbation reduction (Krstic and Wang 2000) in order to "freeze" the dynamic state in its quasi-steady state value.In fact, the frequency of the dither is forced to be sufficiently slow to guarantee the time-scale separation between the plant and controller dynamics.Of course, the price to be paid in these cases is the decrease of the algorithm's convergence speed.On the other hand, the adaptation gain is γ and α i is the perturbation amplitude.
Basically the speed of convergence and the ultimate residual sets of the ES algorithm are influenced by the values of α i and γ in (24) and ω in (27).By reducing the parameters α i , γ and ω, lead us to smaller convergence rates, but also smaller residual sets around the unknown extremum point.
• A major concern is that ES algorithm must constrain the PID gains such that they are always positive.This is a crucial point because these gains are required to be positive for stability, and instability of the PID controller could cause harm to the participants.These dangers were carefully addressed in our implementation by including them as optimization constraints for the search space.In a nutshell, every time the adapted parameters achieve negative values, we have set randomly them to zero or to the value of the previous iteration cycle, such that the output of the ES algorithm ensures the control gains are always positive.
• As a model-free, real time optimization approach, ES is well suited for systems with unknown dynamics or those that are affected by high levels of uncertainty or external dynamics.Thus, the method does not rely on the knowledge of system modeling parameters being robust to parametric uncertainties and unmodeled dynamics.In particular, ES does not merely monitor the direction of the output response but explores the measured response to estimate the gradient of the map and update the control input in proportion to the gradient of the map.ES has the dual benefit of rigorously provable convergence and the simplicity of implementation.
• Regarding the computational cost, ES employs continuous oscillatory perturbation, also known as a "probing function" and only an integrator (as well as optional high-pass and low-pass filters).For dynamic systems, it is enough to select the extremum seeking probing frequency reasonably smaller than the highest frequency that can pass the system without significant attenuation.This choice allows for the time-scale separation (discussed before) between plant and controller dynamics facilitating the convergence proof via singular perturbation and averaging theory.
• The trapezoidal-response experiment is run iteratively.The cost J(θ (k)) is calculated at the conclusion of the trapezoidal-response experiment.The ES algorithm uses the value J(θ (k)) of the cost function to compute new controller parameters θ (k).Another trapezoidal function experiment (cycle) is then performed with the new controller parameters, and the process continues iteratively.
ES is a nonmodel-based method that iteratively modifies the input θ of the cost function J(θ (k)) to reach a local minimizer.ES achieves this optimization by sinusoidally perturbing the input parameters θ (k) of the system and then estimating the gradient ∇J(θ (k)).Note that k is the index of the trapezoidal-response experiment, whereas t is the continuous-time variable within an individual trapezoidal-response experiment.The gradient is determined by the multiplication of the discrete-time signal J(θ (k)) with a discrete-time sinusoid of the same frequency as the perturbation signal.This procedure estimates the gradient by picking off the portion of J(θ (k)) that arises due to perturbation of the parameter estimate θ (k) (see section Principles of Deterministic Extremum Seeking).The gradient information is then used to modify the input parameters in the next iteration; specifically, the gradient estimate is integrated with a step size γ, yielding a new parameter estimate θ (k).The integrator both performs the adaptation function and acts as a low-pass filter.

EXPERIMENTS
Experimental results are presented in order to validate the proposed NMES controller following the protocol in section Materials and Methods.A detailed discussion of the closed-loop responses is also provided.In the next, all tests performed with healthy volunteers and stroke patients are given for the proposed NMES controller in order to show the robust-adaptive nature of the proposed scheme.
A trapezoidal reference with multiple cycles depicted in Figure 2 is explored in the experiments.In particular, the second derivative of the trapezoidal curve contains impulse which offer an additional challenge to the controller since it sets the desired angular acceleration of the controlled elbow.In general, we only find in the literature of NMES sufficiently smooth trajectories (with smooth derivatives), which facilitates the tracking control problem.It is worth mentioning that the final objective of our control application is for motor rehabilitation of stroke patients.Thus, the use of more sophisticated trajectories is not the focus in this kind of therapy, where simple and cyclic movements are applied to motor training.Despite the large number of research papers working on more complex control algorithms for producing coordinate movements, we can find few works focusing on physiotherapy.In this sense, the use of NMES oriented to physiotherapy until the occurrence of fatigue should be avoided in a real scenario since the idea here is motor relearning.
The cost function (18) used in the minimization problem is defined with t 0 = 5 and T = 20 for both healthy volunteers and stroke patients.By increasing the time-range of evaluation in the cost function (making t 0 = 0), transient errors would impact the results in terms of larger errors to be minimized along of the overall interval and, consequently, also in the steady-state portion of the output response (flexion plateau).Since our final objective is oriented to stroke patients, the most important part in the execution of the physiotherapy movement is the flexion segment, we have focused on minimizing this flexion errors to avoid undesired oscillations rather than the tracking error along of the entire trapezoidal cycle.In addition, by increasing the errors along of the overall trapezoidal cycle would make necessary a increased number of iterations to achieve a better tracking performance, which may not be advantageous in the physiotherapy procedure.Finally, for stroke patients, spasticity clearly helps the control task against the gravity effects because they naturally present flexor patterns, thus, smaller overshoots in the transient responses are expected in comparison with healthy volunteers due to this "natural damping".The joint stiffness in these patients and their inability to make the movement with the injured limb reduce significantly their mobility, which means the disturbance effect of gravity itself is not enough to produce the angular momentum necessary to move the arm down, thus facilitating the control actuation.Even if the tracking accuracy for stroke patients were not perfectly achieved as in healthy volunteers, it would not mean the proposed NMES controller could not be potentially used for treatment since the objective in these cases is motor relearning for physiotherapytherapeutic application, as mentioned before.In a more general context, we can also find publications which indicate that by choosing t 0 beyond the rising time of the reference signal will bring advantages in terms of achieving faster settling times and smaller overshoots in the output responses.Please see (Lequin et al. 1999, Killingsworth andKrstic 2006) for more details.Initially, two volunteers with no orthopedic nor neurological injuries were recruited.Prior to participation, written informed consent was obtained from each individual.The plots of Figures 5 and 6 present the representative signals and movements for the subject 1, whereas Figures 7 and 8 show the results for the subject 2. Although all experiments were performed in a longer time-window, we have just shown the first cycles of interested in our tests for the sake of clarity.
The control signal u was saturated according to the discomfort of each volunteer.Note that, when the biceps and triceps are active, the control signals are always greater than some minimal value (non zero lower bound) due to the dead-zone nature of the muscle actuator.Although the manuscript does not bring explicitly statistical results showing the advantages of the proposed adaptive PID control scheme in comparison with classical fixed PID, quantitative/qualitative advances can be clearly verified by comparing the tracking responses obtained in the first cycle (which would be the permanent results if we have fixed the PID gains with those values) and the last cycle of adaptation displayed in the plots.The response curves ratify the improved behavior of the proposed control scheme even in this adversary scenario for NMES.In a few words, the adverse environment of modeling uncertainties inspires the application of adaptive-robust control methodologies and automatic tuning techniques (Oliveira et al. 2017).In addition, results for PID controllers with fixed gains can be found in some of our previous publications (Oliveira et al. 2017, Catunda 2016).
Observing the performance of the controller in Figures 5 to 8, it shows a low error in steady state and low percentage overshoot for a practical scenario after the adaptation law had minimized the cost function for each cycle of the trapezoidal curve given in Figure 2. The responses obtained during the experiments were certainly influenced by the effects from external disturbances and the effects of nonlinearities in the actuators (BB and TB), such as saturation and dead zones.These ingredients were ignored in the initial modeling of the problem.Note that the control signal of volunteer 2 has saturated less than the control signal of volunteer 1, making necessary a reduced number of iterations (k = 3) to achieve a better tracking performance -see Figure 8.
The oscillatory movement ("chattering") in the first 50 seconds in Figure 6 is due to the saturated control signal observed in the current plots applied to biceps and triceps muscles.For instance, the graphics at the bottom of Figure 6 show that the limits of current allowed for volunteer 1 are smaller than those used for volunteer 2 in Figure 8.Hence, the stringent saturation of the stimulation signals is responsible for this larger oscillatory behavior.However, it does not imply the volunteers are subject to any pain nor discomfort, as discussed in Materials and Methods.
The most important condition in our experiments with healthy volunteers is that they need to be relaxed and apart of the desired trajectory for the movement in order to minimize their active participation.Although there is no guarantee that the volunteers are or are not adapting their reaction to the controller, our results present an interesting approach to refine the performance of PID NMES control in a strongly nonlinear and time-varying scenario for use in rehabilitation.This will be particularly important in the next tests with stroke patients since the idea in this case is that the electrical stimulation increases progressively the participation of the patient in the execution of the movement.In other words, if the proposed NMES controller helps the stroke patient to participate in the movement execution, the treatment can be rated as highly successful.I, the stroke patients were classified according to Ashworth and Rankin scales and Fugl Meyer assessment (see Bohannon and Smith 1987, Wilson et al 2002, Meyer et al 1975, respectively).The plots of Figure 9 present the tracking signals for the stroke patients P1 to P8.
As a metric to evaluate the improvement of the proposed adaptive controller, the RMSE of the trajectory was compared for both the first and last cycles using Wilcoxon signed rank test, a non parametric statistical hypothesis test to compare small groups with repeated samples.In this case, it is not necessary to evaluate if the data is normally distributed.The Wilcoxon signed rank test had p-value of 0.01, indicating significant difference between the RMSEs.
The RMSE of the tracking signals for the eight stroke patients were calculated and are available in

CONCLUSIONS
In spite of the nonlinearities and time-varying properties of the NMES process, it was satisfactorily approached by the proposed adaptive PID controller via ES.
The results achieved in experiments with stroke patients are commendable since in general only healthy volunteers are recruited for closed-loop control tests.One of the main points of this paper was to introduce a technique which requires little to none tuning procedures.Since there is a clear subject-to-subject variability, the control parameters of a non-adaptive approach should be tuned for each person, which would take a long time or even induce muscle fatigue, reducing the efficacy of the process and its clinical viability.On the other hand, manual tuning is a time-consuming task and analytical methods are based on an exaggerated knowledge of the plant, requiring particular experimental validations to the identification of an acceptable plant model.Nevertheless, a precise plant model in NMES is not known, and very long identification procedures are not desirable with the patients in open loop (Oliveira et al. 2017).Notice that a mathematical model for the neuromuscular plant would be time-varying and highly nonlinear (Lynch and Popovic 2008), which means these modeling/identification tasks may be totally unproductive.These drawbacks are not shared by our model-free adaptive approach.
From the point of view of the proposed controller, it is not important if a linear or nonlinear model is assumed for the neuromuscular system but whether its open-loop input-to-state stability (ISS) properties (Khalil 2002) can be preserved.The time-scaling separation is the next key element to the solution of the problem.This procedure reduces the order of the stable dynamic system to be studied and, hence, allows the analysis and design of the ES based controller regardless of the order or relative degree of the model and the exact knowledge of its parameters.
Good tracking performance can be achieved after a reduced number of the algorithm iterations.According to our experiments, the proposed adaptive control approach had presented good performance results to reach the target angle and had ultimately assured comfort conditions for both healthy volunteers and stroke patients with reduced tracking errors, by presenting an average of the RMSEs close to 5 • in the latter group.
For future research, an alternative to deterministic ES (Krstic and Wang 2000) is the stochastic ES method in (Liu and Krstic 2016).Hence, the papers (Krstic and Wang 2000, Liu andKrstic 2016) can be viewed as companion references for the practical user of the results in the highlighted paper.However, the user should be aware that, for nonlinear plants, optimal parameter choices will be dependent on the plant's initial condition and the setpoint value.Morevoer, other extremum seeking algorithms based on periodic switching signals (Oliveira et al. 2011) or monitoring functions (Aminde et al. 2013) could be explored for PID tuning applied to neuromuscular electrical stimulation.

Figure 1 -
Figure 1 -Mechanical apparatus for NMES experimental tests.The point a in the image indicates a goniometer (simple potentiometer) linked to a steel axis b allowing angular displacement readings.Letter cshows that the wrist has an attachment with linear freedom of movement along the aluminium square rod, while d points out that there is an adjustment for the lateral distance of the elbows.In picture on the bottom, the controlled joint angle, denoted by y, and the equipment are shown in a different view.A pair of electrodes per muscle are been used for the electrical stimulation.For instance, the pair of electrodes on the biceps muscle of the volunteer can be seen in picture on the top.

Figure 2 -
Figure 2 -Graphic of the reference signal and its divisions.
TIAGO ROUX-OLIVEIRA et al.DETERMINISTIC EXTREMUM SEEKING FOR NEUROMUSCULAR ELECTRICAL STIMULATION EXTREMUM SEEKING FOR STATIC AND LOCALLY QUADRACTIC MAPS

Figure 3 -
Figure 3 -Block diagram of deterministic ES for a static map using periodic perturbations.The simplest perturbation-based extremum seeking scheme for a quadratic single-input map J(θ ) = f * + f 2 (θ − θ * ) 2, where f * , f , θ * are all unknown.The user has to only know the sign of f (> 0), namely, whether the quadratic map has a minimum, and has to choose the adaptation gain k such that sgn(−k) = −sgn( f ).The user has to also choose the frequency ω as relatively large compared to a, k, and f .Regarding the Laplace symbol s, with some abuse of notation, we mix the frequency and time domains for brevity and conceptual clarity, i.e., the integrator transfer function k/s in the block diagram should be understood as an operator on a time-domain function.

Figure 4 -
Figure 4 -Block diagram of the closed-loop system for NMES using deterministic discrete-time ES, where α i sin(ω i k) is the perturbation vector.
TIAGO ROUX-OLIVEIRA et al.DETERMINISTIC EXTREMUM SEEKING FOR NEUROMUSCULAR ELECTRICAL STIMULATION TESTS WITH HEALTHY VOLUNTEERS

Figure 5 -
Figure 5 -Adaptive PID (volunteer 1): PID parameters during ES adaptation of the closed-loop system, the evolution of the cost function, and the best response y ultimately obtained after k = 4 iterations.ES reduces the cost function J(θ (k)), which produces a more favorable tracking response.Note that the derivative portion is barely activated during the experiments.

Figure 9 -
Figure 9 -Result of angular elbow joint movement with deterministic extremum seeking for eight stroke patients and their individual RMS errors.The reference signal is represented by the dashed line.
TIAGO ROUX-OLIVEIRA et al.DETERMINISTIC EXTREMUM SEEKING FOR NEUROMUSCULAR ELECTRICAL STIMULATION

TABLE II .
The tests indicate that the RMSE are statistically less for the last cycle of evaluation for the proposed ES based PID controller when compared to the first cycle, representing a reduction of 64% in terms of the RMSE (from 8.94 • to 3.21 • ).The tests that indicate the individual RMSE for the patients are also satisfactory by presenting a mean value of the RMSEs around 5 • .Although good results have been obtained in our new experiments with stroke patients (rather than healthy volunteers only), the main clinical use of the proposed controller is indeed for rehabilitation and not for assistance.At the present stage of our research, it means we are not pursuing to outperform other adaptive strategies, but introduce a new one which potentially achieves the control objectives.