A Pilot Study on the Functional Stability of Phonation in EEG Bands After Repetitive Transcranial Magnetic Stimulation in Parkinson’s Disease
Abstract
Parkinson’s disease (PD) is a neurodegenerative condition with steadily increasing prevalence that strongly affects quality of life in terms of neuromotor and cognitive performance. PD symptoms include voice and speech alterations, known as hypokinetic dysarthria (HD). Unstable phonation is one of the manifestations of HD. Repetitive transcranial magnetic stimulation (rTMS) is a rehabilitative treatment that has been shown to improve some motor and non-motor symptoms of persons with PD (PwP). This study analyzed the phonation functional behavior of 18 participants (13 males, 5 females) with PD diagnosis before (one pre-stimulus) and after (four post-stimulus) evaluation sessions of rTMS treatment, to assess the extent of changes in their phonation stability. Participants were randomized 1:1 to receive either rTMS or sham stimulation. Voice recordings of a sustained vowel [a:] taken immediately before and after the treatment, and at follow-up evaluation sessions (immediately after, at six, ten, and fourteen weeks after the baseline assessment) were processed by inverse filtering to estimate a biomechanical correlate of vocal fold tension. This estimate was further band-pass filtered into EEG-related frequency bands. Log-likelihood ratios (LLRs) between pre- and post-stimulus amplitude distributions of each frequency band showed significant differences in five cases actively stimulated. Seven cases submitted to the sham protocol did not show relevant improvements in phonation instability. Conversely, four active cases did not show phonation improvements, whereas two sham cases did. The study provides preliminary insights into the capability of phonation quality assessment by monitoring neuromechanical activity from acoustic signals in frequency bands aligned with EEG ones.
1. Introduction
Parkinson’s disease (PD) is a neurodegenerative disease first described by James Parkinson in 1817.1 Recent studies put its incidence at 15 cases per 100,000 people, with a prevalence ranging from 100 to 200 cases per 100,000.2 The impact of PD on the quality of life of persons with PD (PwP) is associated with motor symptoms such as ineffective neuromotor control, difficulty in walking and handling objects, resting tremor, facial rigidity, etc.,3 and with nonmotor symptoms that also affect PwPs’ ability to lead an independent life. PwPs also experience alterations in respiration, phonation, articulation, and prosody, collectively referred to as hypokinetic dysarthria (HD), which is a complex motor speech disorder characterized by manifestations such as mono-pitch and monoloudness, imprecise articulation, impaired speech rate and rhythm, and irregular pitch fluctuations.4 PD symptoms may be mitigated by medication, neurostimulation, and rehabilitation, and stabilized temporarily to facilitate PwPs’ motor functions.5 Transcranial magnetic stimulation (TMS) is a noninvasive method for the stimulation of neural tissue, including the cerebral cortex, spinal roots, and cranial and peripheral nerves. In this sense, repetitive TMS (rTMS) has been proposed as a therapy to improve patients’ neuromotor conditions.6
The main objective of this study is to explore the potential of assessing the functional improvement of phonation conditions of PwPs after rTMS treatment, using electroencephalography (EEG)-related frequency bands of glottal neuromechanical correlates. Phonation symptoms such as thyroarytenoid muscle hypotonia, vocal fold imbalance, and tremor in voice (altered neuromotor feedback) are some manifestations of PD-related neurodegeneration on speech.7,8 The process of phonation is driven by lung air pressure forcing flow through the vocal folds. These structures are composed of muscle (thyroarytenoid) and connective tissues, fixed to the thyroarytenoid cartilage structure.9 The thyroarytenoid muscle, also known as musculus vocalis, together with the cricothyroid and the transverse and oblique arytenoid muscles, is responsible for vocal production. In particular, the cricothyroid and thyroarytenoid muscles are involved in contracting and stretching the vocal folds, thereby controlling phonation pitch. The inferior (recurrent) laryngeal nerve is the principal pathway in neuromotor innervation of the mentioned muscles in the larynx, except for the cricothyroid, which is innervated by the superior laryngeal nerve. The function of these nerves is to induce contraction and adduction of the vocal folds so that they vibrate under lung pressure.10 Therefore, it can be hypothesized that vocal fold tension (a biomechanical correlate of phonation) might be the direct consequence of the combined neuromotor activity (NMA) of the inferior and superior laryngeal nerves (neuromechanical correlates), both being branches of a cranial nerve (the vagus). The control of larynx neuromechanics can be directly related to the activity of premotor (PM), motor (M), and supplementary motor (SM) areas of the brain, which is ultimately related to the nature of speech being produced (voiced or voiceless).
Therefore, meaningful cognitive traits determining prosody (energy, intonation, rhythm) will intervene in the control of vocal fold adduction, abduction, and tension. Many brain areas are involved in laryngeal motor control, both in phonation and nonphonation tasks. Primary vocal M areas include the laryngeal motor cortex,11 PM cortex, SM area, and cerebellar lobule VI. Secondary areas comprise the cingulate M area, the ventral nuclei of the thalamus, the putamen, the frontal operculum, and the anterior insula.12 The structural and functional relationship among them is still a matter of ongoing research, beyond the scope of the present contribution, although there is a consensus that, as far as vocalization is concerned, the direct pathway is mainly controlled by the ventromedial central sulcus peak (VmCSP), corresponding to Brodmann area 4, and the dorsolateral peak (DsP) in area 6, both on the precentral gyrus next to articulator control areas. In close relationship with this paper’s primary objective, Rödel et al.13 reported selective stimulation of the vocal fold tensor (cricothyroid) or relaxer (thyroarytenoid) using TMS. In this sense, Brown et al.11 suggested that both areas mentioned before may have functional roles representing different muscles in the larynx because both muscular systems are innervated differently, specifically by the external laryngeal (cricothyroid) and the recurrent laryngeal nerves (thyroarytenoid).4 The working assumption in this study is that the main neuromotor activity (NMA) projected to the lower neuromotor units through the direct activation pathway of laryngeal neuromotor control is mainly related to VmCSP and DsP activity, given that the exercises examined consist solely of the phonation (sustained vocalization) of a maintained vowel [a:]; therefore, the secondary areas (indirect activation pathway), more relevant in other dynamic speech tasks, would play a subsidiary role.
The vocal fold stiffness (VFS) resulting from the tension and stretching of larynx muscles will condition the patterns of the glottal source (GS), a correlate of the pressure in the supraglottal ridge of the vocal folds, and the GS will induce the vocal sound pressure (VSP) at the radiation extreme when propagated through the oro-nasopharyngeal tract (ONPT). A simplified view of the information flow would be NMA → VFS → GS → VSP. Specifically, the relationship between NMA and muscle activity is known as cortical muscle coupling (CMC).14 As is well known, alterations in NMA coherence seen in the EEG may be present in motor-related disorder treatment and rehabilitation.15,16,17 A consequence of the working hypothesis is that these alterations might be in part modeled from VSP as frequency bands related to classical EEG activity, using signal model inversion to reconstruct the way-back path as VSP → GS → VFS → NMA. This possibility would benefit from ongoing studies on CMC, as it can be used to explore the coupling relationship of different functional frequency bands.18,19 Therefore, this study proposes to reconstruct the way-back path from VSP to EEG-related frequency bands using long-lasting utterances of an open vowel such as [a:] to produce functional phonation descriptions in terms of NMA, and to use them in detecting functional phonation improvements after rTMS treatment. Maintained phonations of [a:] are well suited for functional phonation evaluation in speech pathology studies, as recognized by their wide application in clinical practice, given that slight variations in NMA will be immediately reflected in GS, and thus easily tracked and monitored.
Having fixed the conducting narrative justifying the reconstruction process from VSP to NMA, it should be decided which frequency bands would be of higher interest for being explored further. It would seem reasonable to focus on the activity in the - and -bands following the description of the nonlinear character of motor unit recruitment in muscular agonist–antagonist activation in Darbin and Montgomery,20 taking into account the direct neuromotor pathways involving the cortex-thalamus-basal ganglia (Ctx-Th-BG) circuitry activating the cricothyroid and thyroarytenoid muscles, and the projection of the organized oscillators that explain the ultimate nonlinear character of motor unit activity involved in the laryngeal function, because the -band is strongly related to unstable nonlinear NMA,21 whereas, on the other hand, the – coupling should be related to the Ctx-Th-BG activity.22
Thus, a primary objective of the study would be to evaluate the functional competence of phonation in TMS participants. A secondary objective is based on the existence of strong relationships between muscular contraction under biomechanical drive and neuromotor EEG activity on the brain areas responsible for PM and motor control18,23,24 in the sense that laryngeal motor activity is transferred by larynx muscles, inducing the contraction of the thyroarytenoid muscle, measured by the unbiased vocal fold stiffness (UVFS) through nonlinear projection operators transforming neural discharges into muscle contraction. This modeling would allow reverse system inversion, provided that adequate inverse operators could be designed based on system identification methodologies. Therefore, it would be possible to advance in the projection of the NMA estimated from phonation biomechanics over the brain area activity, modeled as EEG frequency bands.
According to the primary objective, this work is intended to assess the validity of EEG-aligned frequency contents of glottal neuromechanics, but using only indirect estimations from vowel phonation acoustic recordings, to characterize the stability of phonation in pre-stimulus and post-stimulus vocal emissions from a limited set of PD participants, using a methodology aligned with the secondary objective. This paper is structured as follows. Section 2 gives a description of the participants in the experimental setup, the stimulation protocol, and the signal processing methods to estimate the signal correlates used in the study, with special attention to EEG-related frequency bands of the vocal fold biomechanical stiffness, and the statistical methods to evaluate phonation stability conditions from each participant, evaluation session, and frequency band. Section 3 presents a descriptive example of the results of processing a segment from a pre-stimulus phonation, the statistical distributions of frequency band estimates, the results from comparing pre- and post-stimuli distributions using log-likelihood ratios (LLRs) and hypothesis tests, and the global scores helping to assess the effects of stimulation on phonation stability conditions of participants. Section 4 discusses the meaning of results in terms of treatment efficiency and methodological accuracy, highlighting achievements and limitations. Section 5 summarizes the findings, contributions, and conclusions derived from the study. An Appendix has been added to include supplementary data tables.
2. Materials and Methods
2.1. Sample description
This study was conducted on PwP participants showing mild to moderate HD directly related to PD, all of them right-handed native speakers of Czech. All were on stable dopaminergic medication for the duration of the whole study. The patients were tested in the ON state, which means that they had received the adequate dosage of dopaminergic medication according to their respective prescriptions 2 h before the evaluations were conducted (patients showing levodopa-induced dyskinesias were excluded from the study). They were informed of the nature of the research and gave their written consent. The study considered only those cases presenting a pre-stimulus and four post-stimulus evaluations spaced in time. Eighteen cases were selected from the 33 participants included in the original study database5 because those were the ones fulfilling this condition. All other cases completed fewer than four post-stimulus examinations and were therefore not considered for this first study. The demographic description of the participants is given in Table 1. The cohort distributions are broadly similar in terms of UPDRS grade (females: ; males: ) and age (females: ; males: ).
PwP code | Active/Sham | Gender | Age (Y) | UPDRS-III | DS |
---|---|---|---|---|---|
0100 | Active | F | 71 | 10 | 74 |
0800 | Active | M | 58 | 9 | 83 |
1100 | Active | M | 73 | 14 | 66 |
1200 | Active | M | 72 | 21 | 57 |
1400 | Active | M | 64 | 10 | 71 |
1600 | Sham | F | 79 | 20 | 58 |
1700 | Sham | M | 70 | 16 | 61 |
1800 | Sham | M | 61 | 9 | 66 |
1900 | Sham | M | 77 | 8 | 75 |
2000 | Active | F | 76 | 28 | 54 |
2200 | Sham | M | 66 | 13 | 73 |
2300 | Sham | M | 55 | 7 | 76 |
2400 | Sham | M | 72 | 10 | 81 |
2500 | Sham | M | 81 | 14 | 63 |
2600 | Sham | F | 73 | 16 | 57 |
2700 | Active | M | 77 | 14 | 47 |
2800 | Active | M | 80 | 15 | 84 |
2900 | Active | F | 74 | 17 | 65 |
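The similarity of the two gender cohorts can be checked directly from the Age and UPDRS-III columns of Table 1. The following Python sketch is illustrative only: the names `TABLE_1` and `cohort_summary` are hypothetical, and reporting the mean and the sample standard deviation is an assumption about the summary statistic, not a statement of how the cohort figures were originally obtained.

```python
from statistics import mean, stdev

# (gender, age in years, UPDRS-III), copied row by row from Table 1
TABLE_1 = [
    ("F", 71, 10), ("M", 58, 9), ("M", 73, 14), ("M", 72, 21),
    ("M", 64, 10), ("F", 79, 20), ("M", 70, 16), ("M", 61, 9),
    ("M", 77, 8), ("F", 76, 28), ("M", 66, 13), ("M", 55, 7),
    ("M", 72, 10), ("M", 81, 14), ("F", 73, 16), ("M", 77, 14),
    ("M", 80, 15), ("F", 74, 17),
]


def cohort_summary(rows, gender):
    """Return size, (mean, sample std) of age and of UPDRS-III for one gender."""
    ages = [age for g, age, _ in rows if g == gender]
    updrs = [u for g, _, u in rows if g == gender]
    return {
        "n": len(ages),
        "age": (mean(ages), stdev(ages)),
        "updrs": (mean(updrs), stdev(updrs)),
    }


if __name__ == "__main__":
    for gender in ("F", "M"):
        s = cohort_summary(TABLE_1, gender)
        print(f"{gender}: n={s['n']}, "
              f"age={s['age'][0]:.1f} (sd {s['age'][1]:.1f}), "
              f"UPDRS-III={s['updrs'][0]:.1f} (sd {s['updrs'][1]:.1f})")
```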
2.2. Pre- and post-rTMS audio recordings
Each participant went through a baseline assessment (pre-stimulus evaluation at T0) before undergoing 10 stimulation sessions (stimulation process) within an interval of two weeks; a follow-up evaluation session two weeks after stimulation (post-stimulus at T1); and additional follow-up evaluations around six weeks (post-stimulus at T2) and around 10 weeks (post-stimulus at T3). The 18 participants of the subset included in this study also underwent a fourth post-treatment evaluation session around fourteen weeks after the stimulation process (post-stimulus at T4). The evaluation dates are listed in Appendix A in Table A.1. Each participant in the study was randomly assigned to active or sham stimulation. A perceptual assessment was conducted by a speech therapist rating speech performance (evaluation of dysarthria profile), faciokinesis, phonorespiration, and phonetic competence at each evaluation step. Audio recordings of the following utterances from each participant were taken before and after the stimulation process (pre- and post-stimulus): one free-topic monologue; one short neutral reading; one short emission of vowels [a:], [i:], [u:] (of s); one long emission of a sustained [a:] (of around 15 s); one long emission of a diadochokinetic exercise consisting of the repetition of [pataka] (s), and one single emission of 10 different selected tri-syllabic words. The trial was registered on clinicaltrials.gov (Number NCT04203615).
2.3. Speech processing methods
The study was conducted on 4 s fragments of the long emissions of a sustained [a:], selected 2 s after the vocal onset, to avoid phonation start transients and also fatigue effects on phonation.25 The stimulation protocol and speech recording conditions are described in detail in Brabenec et al.5 Participants were submitted to rTMS (DuoMAGXT-100, Deymed Diagnostic) in 10 stimulation sessions over two weeks at CEITEC, Masaryk University. Each stimulation session took 40 min to complete, during which an eight-shaped coil over the right posterior superior temporal gyrus (STG, MNI coordinates , , ) applied trains of pulses at a frequency of 1 Hz and 100% intensity of the pre-estimated resting motor threshold (1800 pulses per stimulation session). In the sham stimulation sessions, the same coil was used in a similar fixture, producing the same sounds as in the active case, but no magnetic field was applied. The stimulation thresholds and settings were established in a preliminary study; because of their complexity, the interested reader is referred to it for further clarification.5 Audio recordings were taken using a large-capsule cardioid microphone (M-AUDIO Nova) mounted on a boom arm (RODE PSA1) at a distance of approximately 20 cm from the patient’s mouth. Acoustic signals were digitized by the M-AUDIO Fast Track Pro audio interface with a 48 kHz sampling frequency and 16-bit resolution. The processing methods used to estimate NMA descriptions, consisting of EEG-aligned frequency bands of the biomechanical VFS estimated from speech recordings, are described in what follows:
• Fragments 4 s long from the recordings of the vowel [a:] were down-sampled to a sampling rate of 16 kHz and selected between the time instants at 2 s and 6 s, to skip potential onset and decay effects. This sampling rate preserves the frequency contents of glottal signals.
• The ONPT transfer function was evaluated by a 24-pole inverse adaptive lattice-ladder filter26 based on the iterative adaptive inverse filtering (IAIF) algorithm.27 The size of the filter was fixed according to Akaike’s criterion,28 using a factor of 1.5 over the sampling frequency divided by 1000 for an optimal filter size on linear predictive error estimation; therefore, for a sampling frequency of 16 kHz, the filter order was set to 24. The adaptive lattice-ladder inverse filter estimates a prediction-error polynomial reducing the speech segment being analyzed, $s(n)$, to a residual $r(n)$ by classical deconvolution as $r(n) = s(n) * h(n)$, where $h(n)$ is the impulse response of the prediction-error polynomial emulating the inverse transfer function of the ONPT, such that $H(z) = \sum_{k=0}^{K} a_k z^{-k}$, where $z = e^{j\omega}$, $K$ is the filter coefficient order, and $\omega$ is the angular frequency. The lattice structure thus reconstructs the tube chain structure of the ONPT, and its associated transfer function is removed from the spectral contents of the speech signal. A description of the inversion filter details can be found in Gómez et al.29
• The GS was estimated in pitch-synchronous cycles30 by numerically integrating the inverse filter residual.
• The VFS (), given in was estimated from the spectral tilt of the GS adjusted on a 2-mass model of the vocal fold biomechanics.29,31
• The VFS was de-biased and de-trended by a moving-average filter.
• The working hypothesis established that VFS is the direct consequence of the neuromotor activation of the cricothyroid and thyroarytenoid muscles. To sustain a given stable phonation frequency $f_0$, a delicate equilibrium between both activations is necessary.11 This equilibrium is represented by an average baseline value of VFS (trend). Oscillations around this trend would reproduce small-signal alterations of the neuromotor activation; therefore, a de-trended VFS would produce a good estimate of neuromotor instability due to agonist–antagonist balance misadjustment. To obtain a frequency-band description of neuromotor instability, the de-trended VFS was processed by a bank of fifth-order Butterworth band-pass filters tuned at the respective EEG-related frequency bands, (: Hz; : Hz; : Hz; : Hz; : Hz; : Hz). As a result, a set of de-trended VFS frequency-band time signals is produced, where is the evaluation session index (), is the participant index (), and is the frequency band index (), pointing to the six frequency bands defined above, and is the time index.
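Three of the steps listed above (segment selection, the filter-order rule, and moving-average de-trending) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the processing chain used in the study: the function names are hypothetical, the centered window and its length are assumptions (the text does not give the moving-average length), and the inverse filtering, the 2-mass model fit, and the Butterworth filter bank are not reproduced.

```python
def segment_bounds(fs_hz, start_s=2.0, end_s=6.0):
    """Sample indices of the analysis segment (2 s to 6 s after the recording start)."""
    return int(round(start_s * fs_hz)), int(round(end_s * fs_hz))


def inverse_filter_order(fs_hz, factor=1.5):
    """Order rule quoted in the text: factor times the sampling frequency in kHz."""
    return int(round(factor * fs_hz / 1000.0))


def detrend_moving_average(x, window):
    """Subtract a centered moving average (the trend) from the sequence x.

    The window is shortened near the edges; its length is an assumption.
    """
    half = window // 2
    out = []
    for i in range(len(x)):
        lo = max(0, i - half)
        hi = min(len(x), i + half + 1)
        trend = sum(x[lo:hi]) / (hi - lo)
        out.append(x[i] - trend)
    return out


if __name__ == "__main__":
    fs = 16000
    print(segment_bounds(fs))        # start and end sample of the 4 s fragment
    print(inverse_filter_order(fs))  # order used for the inverse filter
    print(detrend_moving_average([5.0] * 6, window=3))
```

A constant input is mapped to zeros, which is the de-biasing behavior described above; slow drifts are removed to the extent that they are smooth over the chosen window.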
2.4. Data analysis
An estimation of the VFS was produced using the methods described in subsection 2.3, as VFS is assumed to be directly related to the NMA of the areas responsible for laryngeal control during phonation. Under the hypothesis stated in Sec. 1, this signal has been decomposed in frequency bands corresponding to EEG () activity. The amplitude distributions of each EEG-related frequency were estimated from their histograms. Distributions from post-stimulus recordings were compared with the corresponding ones from pre-stimulus conditions to produce explainable interpretations of potential behavioral changes in the phonation function.
The comparison methods proposed were based on LLRs and hypothesis tests. The LLR between two given probability density functions (pdfs) and may be defined as
It may be seen that the sign of one integral is the opposite of the other, therefore, it will be expected that in the case of unimodal distributions when there is a strong difference in variance , will be much narrower than , and consequently will be much smaller in size than , and . Conversely, when the opposite condition will prevail (). This is especially evident in normal and near-normal distributions. To put it otherwise, given the properties of probability densities, if on the interval , most likely it will be expected to be narrower (lower variance) than , or in other words, that the generating process of would produce less dispersed outcomes (more stable) than that of . Therefore, it is expected that functional improvements in phonation will produce lower variance post-stimulus frequency band distributions, and therefore, positive LLRs, and on the contrary, worsening phonation conditions will produce negative LLRs.
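The statement that a narrower distribution exceeds a wider one on a central interval can be made explicit for the zero-mean normal case. The derivation below is a worked illustration under assumptions: the symbols $p_1$, $p_2$, $\sigma_1$, $\sigma_2$, and $x_c$ are introduced here for convenience and are not necessarily those of expression (1).

```latex
% Assumed zero-mean normal pdfs; p_1: pre-stimulus, p_2: post-stimulus
p_i(x) = \frac{1}{\sigma_i\sqrt{2\pi}}
         \exp\!\left(-\frac{x^2}{2\sigma_i^2}\right), \qquad i = 1, 2

% Pointwise log-likelihood ratio
\ln\frac{p_2(x)}{p_1(x)}
  = \ln\frac{\sigma_1}{\sigma_2}
  - \frac{x^2}{2}\left(\frac{1}{\sigma_2^2} - \frac{1}{\sigma_1^2}\right)

% If \sigma_2 < \sigma_1, the ratio is positive exactly for |x| < x_c, with
x_c^2 = \frac{2\,\ln(\sigma_1/\sigma_2)}
             {\dfrac{1}{\sigma_2^2} - \dfrac{1}{\sigma_1^2}}
```

Under these assumptions, $p_2 > p_1$ holds on the central interval $(-x_c, x_c)$ and $p_2 < p_1$ in the tails, so the two regions contribute log-ratios of opposite sign, in line with the argument above.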
The significance of the comparisons among pre- and post-stimulus distributions was assessed by Mann–Whitney (MW) -tests, on the null hypothesis of equal medians, because typically distribution patterns might differ from normality. The tests were carried out on EEG-related band-frequency feature samples , therefore, hypothesis tests for a given participant and a given feature would be conducted on each post-stimulus sample () concerning the pre-stimulus one (), would be given by
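The kind of test applied to each pre- and post-stimulus pair can be sketched as follows. This is a minimal, standard-library implementation of the Mann–Whitney U statistic with a two-sided p-value from the normal approximation; it is not expression (4) of this paper. The function names are hypothetical, ties receive average ranks, and, as simplifying assumptions, no tie or continuity correction is applied to the variance.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0  # average of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def mann_whitney_u(pre, post):
    """Return (U statistic of `pre`, two-sided p-value, normal approximation)."""
    n1, n2 = len(pre), len(post)
    ranks = average_ranks(list(pre) + list(post))
    rank_sum_pre = sum(ranks[:n1])
    u_pre = rank_sum_pre - n1 * (n1 + 1) / 2.0
    mu = n1 * n2 / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)  # no tie correction
    z = (u_pre - mu) / sigma
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return u_pre, p_value


if __name__ == "__main__":
    pre_sample = [0.9, 1.4, 1.1, 1.8, 1.3, 1.6]   # hypothetical amplitudes
    post_sample = [0.4, 0.6, 0.5, 0.8, 0.7, 1.0]  # hypothetical amplitudes
    print(mann_whitney_u(pre_sample, post_sample))
```

The normal approximation is reasonable for the long time series considered here, where each sample contains many points; for very small samples an exact test would be preferable.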
The above-described methods allow us to determine the behavior of each frequency band based on LLR estimations (1) to explore whether these features improve significantly as a result of the intervention by rTMS, with special attention to which frequency bands would be more sensitive to changes in functional behavior. A global score to describe the general behavior of potential improvements regarding a specific participant may be defined on the averages over time of each feature sample . As is an unbiased estimation of VFS, it may be associated with the tremor (oscillating instability) of the vocal fold. Its amplitude is expected to be larger the more acute the functional disorder affection of each participant’s phonation. Therefore, lower values of will be associated with a less unstable phonation, and therefore, with larger stimulation beneficial effects. Having this consideration in mind, the following definition was used as a normalized weighted score associated with each frequency band feature per evaluation session :
Another relevant score was defined on the progression trend of improvements, based on the first difference of average estimations between successive recordings, as
3. Results
An example from a 4 s sustained emission of the vowel [a:] by one of the participants actively stimulated with rTMS is shown in Fig. 1. This example is included as a prototype to describe the speech processing protocol; it should not be taken as a mark of generalized behavior, but as a particular phenomenological description of the examination procedures conducted on each phonation analyzed, worth examining in detail as a single case study. The VFS was estimated from the GS using the methods described in subsection 2.3.

Fig. 1. (Color online) EEG-band description of a 4 s segment of phonation from the pre-stimulus recording of active case 1400, during the utterance of a sustained vowel [a:]: (a) original speech signal, with the line superimposed in red; (b) estimation of the UVFS; (c) UVFS logarithmic power spectrogram; (d), (e) activity on the -band and its linear spectrogram; (f), (g) id. on the -band; (h), (i) id. on the -band; (j), (k) id. on the -band; (l), (m) id. on the -band; (n), (o) id. on the -band. The activity in the -band is especially relevant following the event in the interval 4.0–4.2 s. Clarification note: the labeling “Rel. Amp.” in template (h) applies to all the left-hand side vertical templates, from (b) to (n), whereas the label “Frequency (Hz)” in template (i) applies to all right-hand side vertical templates, from (c) to (o).
The -band and -band frequency distributions, corresponding to the best-behaving active case (1400) are presented in Figs. 2(a) and 2(b). A similar set of -band and -band distributions, corresponding to the worst-behaving sham case (2200) is shown in Figs. 2(c) and 2(d).

Fig. 2. Tremor amplitude distribution boxplots for the best (active 1400) and worst (sham 2200) behaving cases; (a) case 1400 -band; (b) case 1400 -band; (c) case 2200 -band; (d) case 2200 -band. When two relevant features related to phonation stability are considered, such as the medians of the tremor amplitudes, and their dispersion measured by the interquartile range, it may be seen that in the active stimulation case (1400), a strong reduction in amplitude and dispersion is observed in the evaluations T1–T4 following the pre-stimulus evaluation session (T0) in both bands considered ( and ), whereas case 2200 experiences a clear deterioration in subsequent evaluation sessions (T1–T4) concerning the pre-stimulus one (T0) in both bands.
The results of evaluating the LLRs between the pre-stimulus evaluation (T0) and the four post-stimulus evaluations (T1–T4) on the frequency bands, following expression (1), are given in Table 2.
Part. Code | T1 LLR | T2 LLR | T3 LLR | T4 LLR | T1 LLR | T2 LLR | T3 LLR | T4 LLR |
---|---|---|---|---|---|---|---|---|
0100 () | 0.323 | 1.825 | 0.014 | 0.102 | 0.055 | 0.843 | 0.217 | 0.113 |
0800 () | 0.156 | 0.160 | 0.241 | 0.002 | 0.224 | 0.306 | 0.316 | 0.310 |
1100 () | 0.072 | 0.320 | 0.093 | 0.214 | 0.109 | 0.189 | 0.095 | 0.249 |
1200 () | 0.177 | 0.228 | 0.113 | 0.068 | 0.463 | 0.446 | 0.444 | 0.422 |
1400 () | 0.047 | 0.162 | 0.092 | 0.077 | 0.155 | 0.158 | 0.136 | 0.184 |
1600 () | 0.367 | 0.146 | 0.044 | 0.284 | 0.012 | 0.023 | 0.067 | 0.021 |
1700 () | 0.455 | 0.228 | 0.686 | 0.000 | 0.009 | 0.086 | 0.240 | 0.049 |
1800 () | 0.836 | 0.334 | 0.877 | 1.108 | 0.914 | 0.744 | 1.439 | 0.890 |
1900 () | 0.302 | 0.064 | 0.289 | 0.183 | 0.116 | 0.009 | 0.102 | 0.032 |
2000 () | 0.133 | 0.441 | 0.254 | 0.113 | 0.079 | 0.444 | 0.087 | 0.030 |
2200 () | 0.575 | 0.362 | 0.325 | 1.317 | 0.461 | 0.543 | 0.365 | 1.432 |
2300 () | 0.100 | 0.062 | 0.033 | 0.068 | 0.223 | 0.323 | 0.214 | 0.098 |
2400 () | 0.608 | 0.593 | 0.333 | 0.139 | 0.216 | 0.115 | 0.374 | 0.306 |
2500 () | 0.802 | 0.867 | 0.952 | 0.576 | 0.075 | 0.064 | 0.256 | 0.928 |
2600 () | 0.098 | 0.198 | 0.029 | 0.115 | 0.104 | 0.172 | 0.043 | 0.068 |
2700 () | 1.029 | 0.101 | 1.391 | -1.906 | 0.586 | 0.025 | 0.442 | 0.556 |
2800 () | 0.453 | 0.433 | 0.946 | -0.699 | 0.505 | 0.036 | 0.326 | 0.976 |
2900 () | 0.113 | 0.203 | 0.220 | 0.717 | 0.252 | 0.222 | 0.252 | 0.514 |
The results of the corresponding MW tests following expression (4), supporting the relevance of the LLRs, are given in Table 3, and the global scores (7) are given in Table 4.
Part. Pre-Code | T1 pvMW | T2 pvMW | T3 pvMW | T4 pvMW | T1 pvMW | T2 pvMW | T3 pvMW | T4 pvMW |
---|---|---|---|---|---|---|---|---|
0100 () | 0.001 | 0.001 | 0.899 | 0.006 | 0.101 | 0.001 | 0.001 | 0.001 |
0800 () | 0.001 | 0.001 | 0.001 | 0.006 | 0.001 | 0.001 | 0.001 | 0.001 |
1100 () | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 |
1200 () | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 |
1400 () | 0.001 | 0.001 | 0.001 | 0.006 | 0.001 | 0.001 | 0.001 | 0.001 |
1600 () | 0.001 | 0.001 | 0.934 | 0.001 | 0.172 | 0.007 | 0.001 | 0.010 |
1700 () | 0.001 | 0.001 | 0.001 | 0.281 | 0.574 | 0.007 | 0.001 | 0.632 |
1800 () | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 |
1900 () | 0.001 | 0.001 | 0.001 | 0.802 | 0.001 | 0.006 | 0.001 | 0.219 |
2000 () | 0.001 | 0.001 | 0.001 | 0.001 | 0.985 | 0.001 | 0.730 | 0.026 |
2200 () | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 |
2300 () | 0.001 | 0.001 | 0.114 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 |
2400 () | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 |
2500 () | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.089 | 0.001 | 0.001 |
2600 () | 0.001 | 0.001 | 0.921 | 0.004 | 0.001 | 0.001 | 0.078 | 0.013 |
2700 () | 0.001 | 0.018 | 0.001 | 0.001 | 0.001 | 0.216 | 0.001 | 0.001 |
2800 () | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 |
2900 () | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 |
Code | Active/Sham | Gender | Global Score () | Agreement | TP | TN | FP | FN |
---|---|---|---|---|---|---|---|---|
0100 | Active | F | 0.243 | 0 | 0 | 0 | 0 | 1 |
0800 | Active | M | 0.343 | 1 | 1 | 0 | 0 | 0 |
1100 | Active | M | 0.392 | 1 | 1 | 0 | 0 | 0 |
1200 | Active | M | 0.533 | 1 | 1 | 0 | 0 | 0 |
1400 | Active | M | 0.625 | 1 | 1 | 0 | 0 | 0 |
1600 | Sham | F | 0.220 | 0 | 0 | 0 | 1 | 0 |
1700 | Sham | M | 0.174 | 1 | 0 | 1 | 0 | 0 |
1800 | Sham | M | 1.870 | 1 | 0 | 1 | 0 | 0 |
1900 | Sham | M | 0.091 | 1 | 0 | 1 | 0 | 0 |
2000 | Active | F | 0.128 | 1 | 1 | 0 | 0 | 0 |
2200 | Sham | M | 3.374 | 1 | 0 | 1 | 0 | 0 |
2300 | Sham | M | 0.154 | 0 | 0 | 0 | 1 | 0 |
2400 | Sham | M | 0.336 | 1 | 0 | 1 | 0 | 0 |
2500 | Sham | M | 1.528 | 1 | 0 | 1 | 0 | 0 |
2600 | Sham | F | 0.046 | 1 | 0 | 1 | 0 | 0 |
2700 | Active | M | 2.116 | 0 | 0 | 0 | 0 | 1 |
2800 | Active | M | 1.456 | 0 | 0 | 0 | 0 | 1 |
2900 | Active | F | 1.328 | 0 | 0 | 0 | 0 | 1 |
Total | | | | 0.67 | 5 | 7 | 2 | 4 |
Threshold () | 0.1 | | | | | | | |
Sensitivity (St) | 0.56 | | | | | | | |
Specificity (Sp) | 0.78 | | | | | | | |
Accuracy (Ac) | 0.67 | | | | | | | |
F1 score | 0.63 | | | | | | | |
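The summary figures at the bottom of Table 4 follow from the confusion counts in its Total row. The sketch below recomputes them with the usual definitions; the function name is hypothetical, and treating an active case with improvement as a true positive and a sham case without improvement as a true negative is the convention assumed here.

```python
def classification_metrics(tp, tn, fp, fn):
    """Sensitivity, specificity, accuracy, and F1 from confusion counts."""
    sensitivity = tp / (tp + fn)              # recall on active cases
    specificity = tn / (tn + fp)              # recall on sham cases
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }


if __name__ == "__main__":
    # Totals from Table 4: TP = 5, TN = 7, FP = 2, FN = 4
    for name, value in classification_metrics(5, 7, 2, 4).items():
        print(f"{name}: {value:.3f}")
```

With these counts the sensitivity is 5/9, the specificity is 7/9, the accuracy is 12/18, and the F1 score is exactly 0.625, which Table 4 reports rounded to 0.63.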
A relevant question concerns the effects of rTMS on the perceptual assessment of dysarthric performance. This meta-information might serve to complement other studies; therefore, it was evaluated by a speech therapist and a neurologist for each participant at each evaluation session. For this purpose, a three-function battery of tests (3F) was used to evaluate faciokinesis, phonorespiration, and phonetics (articulation), each of them scored from 0 (lowest score, poorest performance) to 30 (highest score, best performance). A total dysarthric score (DS) was estimated by summing the three mentioned scores, ranging from 0 (anarthria), 1–15 (profound dysarthria), 16–31 (severe dysarthria), 32–47 (moderate dysarthria), 48–63 (mild dysarthria), 64–79 (very mild dysarthria), to 80–90 (no dysarthria). The following table presents the evaluation of DS for each participant in Table 1, and the relative percentage of evolution after T4 versus T0 to be estimated as , where for case 0100, and for all other cases.
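The DS severity ranges listed above map directly onto a lookup. The following sketch encodes them; the names `DS_RANGES` and `classify_ds` are hypothetical, and the example values are DS entries taken from Table 1.

```python
# (lower bound, upper bound, label), as listed in the text
DS_RANGES = [
    (0, 0, "anarthria"),
    (1, 15, "profound dysarthria"),
    (16, 31, "severe dysarthria"),
    (32, 47, "moderate dysarthria"),
    (48, 63, "mild dysarthria"),
    (64, 79, "very mild dysarthria"),
    (80, 90, "no dysarthria"),
]


def classify_ds(score):
    """Map a total dysarthric score (0 to 90) to its severity label."""
    for low, high, label in DS_RANGES:
        if low <= score <= high:
            return label
    raise ValueError(f"DS must lie between 0 and 90, got {score}")


if __name__ == "__main__":
    for ds in (74, 83, 47, 54):  # DS values of cases 0100, 0800, 2700, 2000
        print(ds, classify_ds(ds))
```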
4. Discussion
As already stated in the introduction, this work is intended to characterize the stability of phonation in pre-stimulus and post-stimulus vocal emissions following rTMS by assessing the validity of features estimated on glottal neuromechanics from a limited set of PD participants. The example depicted in Fig. 1(a) was given to illustrate the protocol followed in the assessment. This example is especially meaningful, as it shows two events of phonation activity breaks typical in PD dysphonia, at intervals 4.0–4.2 s and 5.1–5.2 s, resulting in a muscle stiffness drop at 4 s appreciated in Fig. 1(b), which is not that evident at 5.5 s, although the stiffness intensification from proprioceptive correction is quite well appreciated in both cases on the -band (Fig. 1(c)). A relaxation towards stable muscle tone is observed between 4.2 s and 5.0 s as a follow-up after stiffness restoration. Intuitively, it could be hypothesized that the -band signal would be associated with the group activity of sets of motor units reacting to proprioceptive feedback (coarse tuning), whereas the -band could be related to auditory feedback-induced motor actions to maintain muscle tone activity as stable as possible according to tonal settings (fine neuromotor control of the thyroarytenoid muscle tension). Although the findings presented in Fig. 1 are rather specific to that case, similar $f_0$ blocking events are not infrequent in PD dysarthria. These events and their associated EEG-band activity reveal interesting information, which could help in clarifying the phenomena behind muscular blocking and neuromotor failure.33,34 The first important observation is that an apparent correction takes place immediately after neuromotor blocking, probably from the auditory system, as revealed in the -band, which shows clear corrective actions to recover muscle tone. The -band (Figs. 1(f) and 1(g)) shows intensified activity before and after the first blocking, and a moderate degree of tremor over the whole interval studied. Interestingly, the -band shows a low level of activity before each blocking, which is reactivated immediately after (Figs. 1(h) and 1(i)). It is also seen as a ripple in the VFS (Fig. 1(b)). A similar phenomenon is observed in the -band (Figs. 1(j) and 1(k)), where a strong burst of activity is seen after the first blocking following an interval of very low activity. On the contrary, the activity on the -band (Figs. 1(l) and 1(m)) is very intense before the blocking, and becomes more diffuse and less intense after the incident. The activity in the -band (Figs. 1(n) and 1(o)) seems to intensify after each blocking. These observations open the possibility of conducting studies on $f_0$ blocking, depending on the dual neuromotor mechanism influencing VFS: on the one hand, the cricothyroid muscle, innervated by the superior laryngeal nerve, and on the other hand, the thyroarytenoid muscle, innervated by the recurrent laryngeal nerve,12 pointing to misadjustment in the agonist–antagonist balance exerted by these muscles when failing to ensure fine $f_0$ tuning. Of course, this assumption would demand further studies to provide generalization insights on the VFS neuromotor driving functionality.
The information provided by the activity in the and bands as presented in Figs. 2(a) and 2(b) for the same case (active stimulation case 1400) reveals a marked reduction of phonation instability after the stimulation: the tremor distributions notably reduce their variance and average values, the effect being stronger in the -band (Fig. 2(a), T1). This drastic improvement, in value and dispersion, deteriorates slightly in the subsequent observations, although post-stimulus tremor amplitudes remain well under 1%. This observation is especially relevant, as the -band is classically associated with the so-called “low-frequency tremor” in many voice quality studies, because it is easily perceived by listening and is typical of many cases of PD phonation.35,36
As mentioned before, bands and are considered especially relevant in the study according to previous knowledge on EEG-related NMA.37 In this sense, it is of special interest to review the cases with the best and worst functional behavior by comparing pre- and post-stimulus feature distributions in both bands, corresponding to cases 1400 and 2200 shown in Figs. 2(a)–2(d). The boxplots in Fig. 2 show the distributions of the tremor amplitude in the (a) and (b) frequency bands from case 1400, corresponding to the same participant (pre-stimulus code T0, post-stimulus codes T1, T2, T3, and T4). Assuming that tremor is associated with phonation instability, a decrement in the tremor amplitude should be considered an improvement. Therefore, as boxplots T1–T4 (post-stimulus) show a smaller median and a much smaller interquartile dispersion than boxplot T0 (pre-stimulus) in both bands ( and ), it might be concluded that post-stimulus phonation is less unstable than pre-stimulus phonation, and consequently, that an improvement is observed. The situation in Figs. 2(c) and 2(d) is just the opposite: post-stimulus distributions (T1–T4) show larger medians and much larger interquartile dispersion than the pre-stimulus one (T0) in both bands ( and ); therefore, it might be concluded that the situation, far from improving, has deteriorated substantially.
These two cases showed the largest improvement and the largest deterioration in phonation stability. Whereas the improvements seen in case 1400 could be attributed to the beneficial effects of rTMS, the increment in instability shown in case 2200 could not be attributed to the absence of stimulation; therefore, a possible explanation would be a worsening of the phonation conditions within the short time interval between T0 and T4. Other circumstances could also have contributed to the worse behavior observed. The same criticism could apply to case 1400, whose strong improvement stems from recording T0 showing a large instability that was not present in T1–T4. Of course, the possible benefits of rTMS might not be the only explanation for this evolution. This question remains fully open to discussion.
Table 2 presents interesting results when analyzing the phonation instability of each participant before and after stimulation. A subset of the participants who underwent active stimulation (0800, 1100, 1200, 1400) shows improvements in most of the evaluation sessions on the two bands of reference. Another subset of sham cases (1600 and 1900) experienced unexpected improvements, whereas two other sham cases (1700 and 1800) showed worsening behavior (quite strong in 1800, possibly pointing to an extremely stable condition in T0). Another subset of active cases (2000, 2700, 2800, 2900) experienced slight or no improvements at all. A fourth subset of sham cases showed worsening behavior (2200 and 2400), whereas the remaining sham cases showed mixed behavior (2300, 2500, and 2600).
As expected, a reduction in variance following an improvement in phonation stability is observed when each pre-stimulus frequency band of the VFS estimate is compared with the corresponding band from each post-stimulus recording in a longitudinal (intra-participant) sequence. These comparisons are given in Table 2 in terms of LLRs and validated by the corresponding nonparametric MW U-tests shown in Table 3, where most of the tests reject the null hypothesis, indicating that the majority of the tremor bands examined differ significantly between the first exploration (T0) and the subsequent ones (T1–T4). The tests failing to reject the null hypothesis correspond to small negative LLR values () in Table 2 (italicized) associated with potentially slight deterioration, whose significance is not sustained by the tests given in Table 3, except for case 1600 T1-band, which produces a residual LLR of 0.012, its significance not being supported by the MW U-test (p-value of ). In addition, a transversal (inter-participant) -test was conducted to assess how the general improvement scores given in Table 4 compare across all participants.
Table 4 summarizes the overall results, indicating which cases showed better phonation stability when pre- and post-stimulus examinations are compared over all participants, evaluation sessions, and frequency bands. The global score per participant , given in the fourth column from the left, is estimated including all post-stimulus evaluation sessions (T1–T4) and five out of the six frequency bands (), as per expression (7). The fifth column (Agreement) assigns a “1” to the active stimulation cases that produced a negative global score (true positives, meaning an improvement of tremor instability in agreement with active stimulation expectations), and to the sham (nonstimulation) cases that presented a deterioration of tremor instability (true negatives, corresponding to positive global scores, meaning no improvement). The true positives and true negatives are listed in columns TP and TN. Correspondingly, sham cases showing improvement (false positives) and active cases showing deterioration (false negatives) are listed in columns FP and FN, respectively. The totals of TP, TN, FP, and FN are given at the bottom of the respective columns (sixth to ninth). Fixing the detection threshold at (assuming a slight improvement), the sensitivity, specificity, and overall accuracy of the stimulation methodology would be around 56%, 78%, and 67%, respectively, corresponding to an F1 score of 0.63. The agreement rate, counting TP and TN over the whole set, would therefore be around 67% (12 out of 18 cases). Twelve cases presented results aligned with expectations: five active cases produced scores under the threshold (improvement in phonation stability), and seven sham cases produced scores over the threshold (worsening in phonation stability). The other six cases produced results contrary to expectations: four active cases produced scores over the threshold (worsening in phonation stability), and two sham cases produced scores under the threshold (improvement in phonation stability). As a general comment, two out of three cases included in the study, active and sham alike, behaved according to expectations after having been submitted to the stimulation protocol.
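The detection figures quoted above follow directly from the four counts stated in the text (five true positives, seven true negatives, two false positives, four false negatives). The short Python sketch below reproduces them; the function name is hypothetical and serves only as an illustration.

```python
def detection_metrics(tp, tn, fp, fn):
    """Standard confusion-matrix metrics from raw counts."""
    sensitivity = tp / (tp + fn)           # true positive rate (recall)
    specificity = tn / (tn + fp)           # true negative rate
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "precision": precision,
        "f1": f1,
    }


# Counts taken from the text: 5 active improved, 7 sham not improved,
# 2 sham improved, 4 active not improved.
metrics = detection_metrics(tp=5, tn=7, fp=2, fn=4)
```

With these counts the sensitivity is 5/9, the specificity 7/9, the accuracy 12/18, and the F1 score 10/16, which round to the percentages reported in the paragraph.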
The relatively large number of FP and FN cases could be attributed to the small sample size, as well as to possible confounding factors affecting the phonation of participants during the tests, such as the different types of HD possibly involved, and uncontrolled comorbid and emotional conditions affecting vocal emissions during the evaluation sessions (aging, depression and anxiety, and medication, among them38). These effects could play a rather important role in the evaluation results, considering the amount of time between the pre-stimulus and the last post-stimulus examinations, which ranges between 93 (minimum) and 119 days (maximum, see Appendix A Table A.1). Long time intervals might make the evaluation protocol susceptible to disruptive events in the participants’ daily life, such as episodes of emotional disturbance, changes in mood, dissatisfaction, discouragement, helplessness, depression, or despondency, as well as transitory respiratory or pharyngolaryngeal disorders, which could affect vocal production in one way or another and were not taken into account. Given the apparently high sensitivity of the frequency bands to unexplained alterations in vocal fold conditions, minor changes in these confounding factors could produce substantial changes in amplitude distributions, altering LLRs or null hypothesis rejection results. A further extended study should concentrate on comparing each examination from each participant to observe the extent to which improving or worsening phonation stability conditions could be observed among them.
An important aspect to be discussed is the robustness of the approach, and whether it can establish a causal relationship between rTMS and improvements in PD HD. Given the premises of the experimental setup, the comparison between pre-stimulus and post-stimulus results was undertaken using LLRs, because these ratios can be interpreted in terms of both similarity and direction of change. The fact that the evaluations include a full longitudinal analysis with four post-stimulus evaluations, taking into account the time elapsed after stimulation, adds robustness to the study, because the improvement assessment is not based on a single post-stimulus evaluation. Therefore, a tendency may be inferred from expressions (5) and (6), summarized in a single global improvement score given by Eq. (7), because each single summarized score is supported by four longitudinal estimations comparing vs versus , plus three others comparing vs versus , also taking the time intervals Ti–T0 into account. The LLRs given in Table 2 have been estimated using pdfs generated from normalized histograms of 100 bins summarizing tremor amplitude time series on a cycle-synchronous basis, which convey around 400 samples per phonation segment in a male voice, and about twice as many in a female voice. These same distributions have been used in computing the MW U-tests given in Table 3 to support the significance of the LLRs. Because the check for false-discovery cases has already been conducted using MW U-tests (Table 3), which validate that the results given in Table 2 are not produced by chance when their respective p-values are below 0.05 (significance level), global scores can be taken as reliable enough in the best and worst behaving cases, with the exceptions marked in bold in Table 3. Therefore, it could be asserted with a low margin of error that rTMS seems to have some influence on phonation stability in the cases commented. That these beneficial effects are not seen in other active cases can also be asserted with little margin of error. Why phonation stability may benefit from rTMS in some cases, and not in others, is a matter which cannot be determined within this study framework; therefore, the debate on causality is left for further study.
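The two steps described above, building 100-bin normalized histograms and running a Mann–Whitney test on the underlying samples, can be sketched in Python as follows. This is only an illustration under stated assumptions: the data are synthetic stand-ins for tremor amplitudes (not the study recordings), the helper name `histogram_pdf` is hypothetical, and the LLR itself is not computed here because it depends on expressions (5)–(7) defined earlier in the paper.

```python
import numpy as np
from scipy.stats import mannwhitneyu


def histogram_pdf(samples, edges):
    """Normalized histogram: bin probabilities that sum to 1."""
    counts, _ = np.histogram(samples, bins=edges)
    return counts / counts.sum()


rng = np.random.default_rng(0)
# Synthetic tremor amplitudes, about 400 cycle-synchronous samples per recording.
pre = rng.normal(loc=2.0, scale=0.8, size=400)    # stand-in for T0
post = rng.normal(loc=0.5, scale=0.2, size=400)   # stand-in for one of T1..T4

# Common support split into 100 bins, so both pdfs are directly comparable.
edges = np.linspace(min(pre.min(), post.min()), max(pre.max(), post.max()), 101)
pdf_pre = histogram_pdf(pre, edges)
pdf_post = histogram_pdf(post, edges)

# Two-sided Mann-Whitney U-test on the raw samples.
result = mannwhitneyu(pre, post, alternative="two-sided")
rejects_null = result.pvalue < 0.05
```

Using a shared set of bin edges for both recordings is a design choice of this sketch; it keeps the two pdfs defined on the same support, which any bin-wise ratio would require.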
This study follows up on our previous work published by Brabenec et al.5 In that work, we evaluated the effect of rTMS perceptually using the 3F test dysarthric profile.39 The test consists of three subtests: I. Faciokinesis, II. Phonorespiration, and III. Phonetics. The phonorespiration subtest can be further divided into the assessment of: (a) respiration, (b) respiration during phonation, and (c) phonation. We observed that rTMS had a significant inter-speaker effect on articulation, prosody, and intelligibility (as assessed by the III. Phonetics subtest). However, we did not identify any significant impact on the II. Phonorespiration sub-score, which is in line with the present findings. In this study, the general performance reliability has also been evaluated by a transversal (inter-participant) -test, conducted to assess how the general improvement scores given in Table 4 compare across all participants. The hypothesis -test between the global scores of the actively stimulated cases and those of the sham cases in Table 4 produces a p-value of 0.2, which does not reject the hypothesis of equal means between both sets. Does this mean that the functional assessment methodology proposed fails in its objectives? Evidently not. The most plausible interpretation is that, because actively stimulated cases 2700, 2800, and 2900 show a large deterioration after stimulation, the average global score of the active set (0.36) is not far enough from the average global score of the sham set (0.87), given the overlapping variances of the two sets (1.18 and 1.61, respectively). Nevertheless, this observation does not invalidate the individual results, because these are evaluated longitudinally on their own timeline statistics, independently of how the other participants behave, which is what the transversal comparison reflects. It must be added that longitudinal tests are fully aligned with the old medical maxim “Treat the patient, not the disease”, which favors intra-participant over inter-participant studies.
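The group comparison above can be approximated from the summary statistics quoted in the text. The sketch below assumes a Welch (unequal-variance) t-test, which the text does not confirm, treats the quoted variances as sample variances, and takes nine participants per group from the counts of active and sham cases. It is an illustration of why the means are not separated, not a reproduction of the authors' computation.

```python
import math
from scipy import stats


def welch_t_from_summary(mean_a, var_a, n_a, mean_b, var_b, n_b):
    """Welch t statistic, degrees of freedom, and two-sided p-value from summary data."""
    se2_a = var_a / n_a                      # squared standard error, group A
    se2_b = var_b / n_b                      # squared standard error, group B
    t_stat = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    dof = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    p_two_sided = 2.0 * stats.t.sf(abs(t_stat), dof)
    return t_stat, dof, p_two_sided


# Means and variances quoted in the text; group sizes assumed to be 9 and 9.
t_stat, dof, p_value = welch_t_from_summary(0.36, 1.18, 9, 0.87, 1.61, 9)
equal_means_rejected = p_value < 0.05
```

Under these assumptions the difference between means (0.51) is smaller than its standard error scaled by any usual critical value, so equal means are not rejected at the 0.05 level.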
Regarding the results of the perceptual score evaluation given in Table 5, some improvement is observed in all cases, although only improvements over 5% could be considered relevant. Among the active cases, these correspond to 0100, 1100, 1200, 1400, and 2700, and among the sham cases, to 1600, 1700, 1800, 1900, 2200, 2300, 2400, 2500, and 2600; the improvements are irrelevant in the remaining cases. It is interesting to compare the global scores from the phonation assessment (Table 4) to the dysarthria assessment results (Table 5) to verify whether there is some relationship between phonation and perceptual improvements. Pearson’s correlation coefficient between both scores is –0.06, indicating that there is almost no correlation between phonation and articulation results, although a two-tail -test on equal medians and different variances does not reject the null hypothesis under a 0.05 significance level (p-value equal to 0.11). This result should not be unexpected, because phonation and articulation are under the control of quite different neural pathways, and they need not be affected similarly in all the cases considered.4
PwP code | DS0 | DS1 | DS2 | DS3 | DS4 | Relative evolution (%) |
---|---|---|---|---|---|---|
0100 | 76 | 76 | 77 | 80 | | 5.26 |
0800 | 83 | 85 | 87 | 89 | 86 | 3.61 |
1100 | 66 | 42 | 72 | 71 | 70 | 6.06 |
1200 | 57 | 59 | 66 | 69 | 74 | 29.82 |
1400 | 71 | 79 | 80 | 84 | 85 | 19.72 |
1600 | 58 | 64 | 64 | 71 | 69 | 18.97 |
1700 | 61 | 65 | 70 | 71 | 72 | 18.03 |
1800 | 66 | 73 | 72 | 73 | 71 | 7.58 |
1900 | 75 | 81 | 80 | 80 | 80 | 6.67 |
2000 | 54 | 63 | 60 | 71 | 55 | 1.85 |
2200 | 73 | 80 | 81 | 83 | 83 | 13.70 |
2300 | 76 | 80 | 81 | 81 | 80 | 5.26 |
2400 | 81 | 86 | 86 | 84 | 86 | 6.17 |
2500 | 63 | 64 | 66 | 71 | 71 | 12.70 |
2600 | 57 | 61 | 63 | 62 | 63 | 10.53 |
2700 | 47 | 54 | 52 | 55 | 56 | 19.15 |
2800 | 65 | 68 | 67 | 69 | 67 | 3.08 |
2900 | 84 | 84 | 84 | 85 | 85 | 1.19 |
In summary, the findings shown in Table 4 support the use of LLRs and conducting hypothesis tests focusing in particular on the - and -bands, as these bands are believed to summarize well both neuromotor and cognitive activity.21,40,41 In this sense, it must be stressed that besides its clear neuromotor character, phonation is also associated with a clear cognitive profile. The compelling argument is that speakers require mastering a real-time perception of their phonation performance to implement and conduct pitch and loudness fine adjustments, therefore, phonation has a marked undeniable cognitive character.42 Different brain areas, such as the primary motor phonation area (VmCSP and DsP), the cerebellum, the basal ganglia, the superior temporal and lateral prefrontal cortex, and the SM area, among others, are involved in the direct, indirect, and feedback neuromotor control circuitry,4,43,44 in what seems to be a clear attention-driven cognitive function. The apparent synchronized activity around phonation motor blocking and disruption events could demand coordinated action of motor units spiking at different frequencies.36,45,46,47 Of course, the clarification of all these observations would require further studies combining other cooperating methods, such as EEG, although at the cost of complicating signal acquisition and pre-filtering to remove facial muscle activity during speech production, possibly by the use of surface electromyography (sEMG).
Although the reach of this study is hampered by evident limitations, such as the nonexhaustive protocol, the relatively small sample size, and the gender imbalance, some insights into the efficacy of the stimulation methodology and the data analytics methodology used might still be drawn from the results presented, although they should be taken as speculative assumptions that open new research lines rather than as proven facts based on exhaustive testing.
On the one hand, the effects of rTMS on phonation stability offer mixed results. Some of the active cases studied show notable improvements in the frequency bands studied, which could not be explained by chance, whereas other actively stimulated cases do not seem to show phonation stability improvements, or even show more unstable phonation. Conversely, some sham cases show clear improvements, whereas others show an undeniable worsening of phonation stability. Relating both behaviors to the absence of stimulation is not an easy task, and remains an issue for further discussion. It seems that phonation stability might be too sensitive to confounding factors to serve as a unique marker by itself, and it should be combined with other speech-based traits, including articulation acoustic features. Besides, tremor might not be a bi-univocal feature of PD HD.38 In summary, it could be asserted with a low margin of error that rTMS seems to have some influence on phonation stability in the cases commented, although these beneficial effects are not seen in other active cases, also with little margin of error. Why stimulation may seem beneficial for phonation stability in some cases, and not in others, is a matter which cannot be determined within this study framework, and it is left for further study.
On the other hand, it seems that the study of phonation instability to monitor disruption events in VFS, and its association with EEG-related frequency bands would be fully justified as a powerful introspective methodology to disclose interesting NMA in cortical areas affecting larynx control. In this sense, this approach would be well aligned with relevant studies on the structural complexity of the brain from EEG frequency sub-bands,48 the functional connectivity pattern assessment from EEG signals,49,50 or the detection of movement intention in brain-computer interface (BCI) systems from EEG signals,51 among others.
The analysis being proposed uses instability distributions of the VFS estimated from speech utterances instead of EEG recordings to evaluate differences in functional behavior based purely on acoustic signal analysis, shedding light beyond what can be provided by classical acoustic analysis. The ultimate objective driving future extensions of this study is to reproduce EEG-aligned descriptions of phonation estimated from audio recordings only, which could be used in neurodegenerative speech characterization. The justification for characterizing the VFS biomechanical correlate by frequency bands aligned with those of EEG studies, as if it were an EEG channel signal, comes from preliminary work done on the NMA of the muscles involved in articulation,52,53 using formants and glottal features. This relationship seems natural because the activity of the larynx, oropharyngeal, and facial muscles is the ultimate cause of source modulation and framing into acoustically perceived speech. Therefore, if muscles would respond to neuromotor stimulation, and eventually, to brain cortical activity, a way to study the intervening links of interest in modeling possible dysfunctions in the activation chain, would consider the important advancements in modeling and activity coding of CMC, to align EEG and acoustic signal descriptions on a common code.
This alignment offers the possibility of conducting a more ambitious, extensive, and exhaustive study including the combination of EEG recordings and other traits derived from articulatory movements to include EEG, sEMG, and audio recordings. In this sense, recent advancements in brain connectivity combining EEG, MEG, fMRI, and NIRS characterization by graph theory54,55,56,57 and probabilistic neural networks58 could offer new insights for future studies. The application of this methodology to synchronized mixed EEG-audio databases59 should offer new insights into speech production comprehension and eventually might allow further disclosure of brain functionality and physiological responses in PD.
Other possible methodologies for decomposing and characterizing the biomechanical correlate VFS derived from the acoustic phonation signal would include the Gabor transform,60 wavelet filter banks,61 the Fourier-based synchrosqueezing transform,51 transfer entropy,62 or fuzzy synchronization likelihood,48 among others. These studies would extend the present paper beyond reasonable limits; therefore, they are considered for future research. Likewise, phonation improvements from rTMS based on comparisons using glottal source features, such as the harmonic-noise ratios, first-second harmonic ratio, cepstral peak prominence, parabolic spectrum coefficient, open and closed quotients, and other indices of voice quality analysis63 could enrich the assessment protocol within a future study framework. Another important question left for future studies is the potential comparison of dysarthrophonia tests and limb bradykinesia based on a new assessment protocol.
5. Conclusions
In this study, the possibilities of predicting the interactions on the EEG-related – frequency bands of the NMA from the phonation acoustical signal have been explored. Although the findings are limited by the small number of participants and need to be verified in longer and larger trials, the preliminary results are well aligned with ongoing studies in the field, especially in the use of LLRs to assess functional improvements in phonation after rTMS. This finding is based on the possibility of using the VFS as a correlate to monitor disruption events in vocal emission attributable to PD. Consequently, visualizing EEG-related frequency bands could help in understanding some of the phenomena underlying vocal emission disruptions. Positive effects of rTMS are apparent in some of the active cases, although observations on phonation instability from active and sham cases in pre-stimulus and post-stimulus recordings offered mixed results, so the reasons behind the findings need to be further explored and explained. There is clear promise in these tentative findings, grounded on previous work, which suggest that differences in functional behavior can be shown based purely on acoustic signal analysis, providing insights beyond what can be assessed by classical acoustic analysis and allowing one to get closer to brain functionality and physiological responses in PD. However, limitations and confounding factors, such as emotional or comorbid conditions, might alter the sensitivity of the functional assessment, and their effects should be taken into account in future studies.
Acknowledgments
This research received funding from the European Union’s Horizon 2020 research and innovation program under the Marie Skłodowska-Curie Grant Agreement No. 734718 (CoBeN), a grant from the Czech Ministry of Health, 16-30805A, a grant from EU – Next Generation EU (Project No. LX22NPO5107 (MEYS)), and grants TEC2016-77791-C4-4-R (Ministry of Economic Affairs and Competitiveness of Spain), and Teca-Park-MonParLoc FGCSIC-CENIE 0348-CIE-6-E (InterReg Programme). Andrés Gómez-Rodellar holds a scholarship from the Medical Research Council Doctoral Training Programme in the Usher’s Institute (University of Edinburgh Medical School).
Appendix A
Code | Interval | Date | Time lapse (days) | Weight | Code | Interval | Date | Time lapse (days) | Weight |
---|---|---|---|---|---|---|---|---|---|
0100 | T0 | 4.9.2017 | 0.00 | 0.00 | 2000 | T0 | 27.9.2018 | 0.00 | 0.00 |
0101 | T1 | 15.9.2017 | 11.00 | 0.11 | 2001 | T1 | 12.10.2018 | 15.00 | 0.15 |
0102 | T2 | 16.10.2017 | 42.00 | 0.42 | 2002 | T2 | 9.11.2018 | 43.00 | 0.43 |
0103 | T3 | 13.11.2017 | 70.00 | 0.70 | 2003 | T3 | 7.12.2018 | 71.00 | 0.72 |
0104 | T4 | 13.12.2017 | 100.00 | 1.00 | 2004 | T4 | 4.1.2019 | 99.00 | 1.00 |
0800 | T0 | 8.2.2018 | 0.00 | 0.00 | 2200 | T0 | 9.11.2018 | 0.00 | 0.00 |
0801 | T1 | 23.2.2018 | 15.00 | 0.14 | 2201 | T1 | 23.11.2018 | 14.00 | 0.14 |
0802 | T2 | 6.3.2018 | 26.00 | 0.24 | 2202 | T2 | 18.12.2018 | 39.00 | 0.40 |
0803 | T3 | 27.4.2018 | 78.00 | 0.72 | 2203 | T3 | 18.1.2019 | 70.00 | 0.71 |
0804 | T4 | 28.5.2018 | 109.00 | 1.00 | 2204 | T4 | 15.2.2019 | 98.00 | 1.00 |
1100 | T0 | 6.4.2018 | 0.00 | 0.00 | 2300 | T0 | 12.11.2018 | 0.00 | 0.00 |
1101 | T1 | 20.4.2018 | 14.00 | 0.13 | 2301 | T1 | 4.12.2018 | 22.00 | 0.20 |
1102 | T2 | 21.5.2018 | 45.00 | 0.41 | 2302 | T2 | 4.1.2019 | 53.00 | 0.47 |
1103 | T3 | 22.6.2018 | 77.00 | 0.71 | 2303 | T3 | 1.2.2019 | 81.00 | 0.72 |
1104 | T4 | 24.7.2018 | 109.00 | 1.00 | 2304 | T4 | 4.3.2019 | 112.00 | 1.00 |
1200 | T0 | 4.6.2018 | 0.00 | 0.00 | 2400 | T0 | 12.11.2018 | 0.00 | 0.00 |
1201 | T1 | 15.6.2018 | 11.00 | 0.11 | 2401 | T1 | 4.12.2018 | 22.00 | 0.20 |
1202 | T2 | 18.7.2018 | 44.00 | 0.44 | 2402 | T2 | 4.1.2019 | 53.00 | 0.47 |
1203 | T3 | 15.8.2018 | 72.00 | 0.72 | 2403 | T3 | 1.2.2019 | 81.00 | 0.72 |
1204 | T4 | 12.9.2018 | 100.00 | 1.00 | 2404 | T4 | 4.3.2019 | 112.00 | 1.00 |
1400 | T0 | 30.8.2018 | 0.00 | 0.00 | 2500 | T0 | 3.12.2018 | 0.00 | 0.00 |
1401 | T1 | 14.9.2018 | 15.00 | 0.15 | 2501 | T1 | 14.12.2018 | 11.00 | 0.12 |
1402 | T2 | 16.10.2018 | 47.00 | 0.47 | 2502 | T2 | 10.1.2019 | 38.00 | 0.40 |
1403 | T3 | 12.11.2018 | 74.00 | 0.75 | 2503 | T3 | 8.2.2019 | 67.00 | 0.71 |
1404 | T4 | 7.12.2018 | 99.00 | 1.00 | 2504 | T4 | 8.3.2019 | 95.00 | 1.00 |
1600 | T0 | 6.9.2018 | 0.00 | 0.00 | 2600 | T0 | 4.2.2019 | 0.00 | 0.00 |
1601 | T1 | 21.9.2018 | 15.00 | 0.15 | 2601 | T1 | 15.2.2019 | 11.00 | 0.12 |
1602 | T2 | 23.10.2018 | 47.00 | 0.47 | 2602 | T2 | 18.3.2019 | 42.00 | 0.44 |
1603 | T3 | 19.11.2018 | 74.00 | 0.75 | 2603 | T3 | 12.4.2019 | 67.00 | 0.71 |
1604 | T4 | 14.12.2018 | 99.00 | 1.00 | 2604 | T4 | 10.5.2019 | 95.00 | 1.00 |
1700 | T0 | 7.9.2018 | 0.00 | 0.00 | 2700 | T0 | 18.2.2019 | 0.00 | 0.00 |
1701 | T1 | 1.10.2018 | 24.00 | 0.20 | 2701 | T1 | 1.3.2019 | 11.00 | 0.12 |
1702 | T2 | 31.10.2018 | 54.00 | 0.45 | 2702 | T2 | 1.4.2019 | 42.00 | 0.44 |
1703 | T3 | 27.11.2018 | 81.00 | 0.68 | 2703 | T3 | 3.5.2019 | 74.00 | 0.78 |
1704 | T4 | 4.1.2019 | 119.00 | 1.00 | 2704 | T4 | 24.5.2019 | 95.00 | 1.00 |
1800 | T0 | 8.10.2018 | 0.00 | 0.00 | 2800 | T0 | 11.3.2019 | 0.00 | 0.00 |
1801 | T1 | 19.10.2018 | 11.00 | 0.12 | 2801 | T1 | 25.3.2019 | 14.00 | 0.15 |
1802 | T2 | 15.11.2018 | 38.00 | 0.40 | 2802 | T2 | 18.4.2019 | 38.00 | 0.41 |
1803 | T3 | 14.12.2018 | 67.00 | 0.71 | 2803 | T3 | 17.5.2019 | 67.00 | 0.72 |
1804 | T4 | 11.1.2019 | 95.00 | 1.00 | 2804 | T4 | 12.6.2019 | 93.00 | 1.00 |
1900 | T0 | 3.10.2018 | 0.00 | 0.00 | 2900 | T0 | 11.3.2019 | 0.00 | 0.00 |
1901 | T1 | 12.10.2018 | 9.00 | 0.09 | 2901 | T1 | 25.3.2019 | 14.00 | 0.15 |
1902 | T2 | 12.11.2018 | 40.00 | 0.42 | 2902 | T2 | 18.4.2019 | 38.00 | 0.41 |
1903 | T3 | 7.12.2018 | 65.00 | 0.68 | 2903 | T3 | 17.5.2019 | 67.00 | 0.72 |
1904 | T4 | 7.1.2019 | 96.00 | 1.00 | 2904 | T4 | 12.6.2019 | 93.00 | 1.00 |
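The time lapses and weights in the table above appear to follow a simple rule: the lapse is the number of days elapsed since T0, and the weight is that lapse divided by the lapse at T4, rounded to two decimals. This rule is inferred from the tabulated values rather than stated in the text, so the Python sketch below (with a hypothetical function name) should be read as an illustration only.

```python
from datetime import datetime


def session_lapses_and_weights(dates, fmt="%d.%m.%Y"):
    """Days elapsed since the first session, and lapses normalized by the last one."""
    parsed = [datetime.strptime(d, fmt) for d in dates]
    lapses = [(p - parsed[0]).days for p in parsed]
    total = lapses[-1]                       # lapse at the last session (T4)
    weights = [round(lapse / total, 2) for lapse in lapses]
    return lapses, weights


# Session dates listed for case 0100 (T0..T4), in day.month.year format.
lapses, weights = session_lapses_and_weights(
    ["4.9.2017", "15.9.2017", "16.10.2017", "13.11.2017", "13.12.2017"]
)
```

For case 0100 this gives lapses of 0, 11, 42, 70, and 100 days and weights of 0.00, 0.11, 0.42, 0.70, and 1.00, matching the corresponding rows.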
References
- 1. , An essay on the shaking palsy, J. Neuropsychiatry Clin. Neurosci. 14 (2002) 223–236. https://doi.org/10.1176/jnp.14.2.223 Crossref, Medline, Web of Science, Google Scholar
- 2. , Epidemiology of Parkinson’s disease, J. Neural Transm. 124(8) (2017) 901–905. https://doi.org/10.1007/s00702-017-1686-y Crossref, Medline, Web of Science, Google Scholar
- 3. , Parkinson’s disease: Cause factors, measurable indicators, and early diagnosis, Comput. Biol. Med. 102 (2018) 234–241. https://doi.org/10.1016/j.compbiomed.2018.09.008 Crossref, Medline, Web of Science, Google Scholar
- 4. , Motor Speech Disorders: Substrates, Differential Diagnosis, and Management, 3 (Elsevier, St. Louis, 2013). Google Scholar
- 5. , Non-invasive brain stimulation for speech in Parkinson’s disease: A randomized controlled trial, Brain Stimul. 14 (2021) 571–578. https://doi.org/10.1016/j.brs.2021.03.010 Crossref, Medline, Web of Science, Google Scholar
- 6. , Transcranial magnetic stimulation: A primer, Neuron 55(2) (2007) 187–199. https://doi.org/10.1016/j.neuron.2007.06.026 Crossref, Medline, Web of Science, Google Scholar
- 7. , Speech disorders in Parkinson’s disease: Early diagnostics and effects of medication and brain stimulation, J. Neural Transm. 124(3) (2017) 303–334. https://doi.org/10.1007/s00702-017-1676-0 Crossref, Medline, Web of Science, Google Scholar
- 8. , Hypophonia in Parkinson’s disease: Neural correlates of voice treatment revealed by PET, Neurology 60 (2003) 432–440. https://doi.org/10.1212/WNL. 60.3.432 Crossref, Medline, Web of Science, Google Scholar
- 9. , Anatomiefonctionnelle des nerfs glossopharyngien, vague, accessoireethypoglosse, Neurochirurgie 55 (2009) 132–135. https://doi.org/10.1016/j.neuchi.2009.01.018 Crossref, Medline, Web of Science, Google Scholar
- 10. , Principles of Voice Production (Prentice-Hall, 1994). Google Scholar
- 11. , A larynx area in the human motor cortex, Cereb. Cortex 18 (2009) 837–845. https://doi.org/10. 1093/cercor/bhm131 Crossref, Web of Science, Google Scholar
- 12. Limbic and cortical control of phonation for speech in response to a public speech preparation stressor, Brain Imaging Behav. 14 (2020) 1696–1713. https://doi.org/10.1007/s11682-019-00102-x
- 13. Human cortical motor representation of the larynx as assessed by transcranial magnetic stimulation (TMS), Laryngoscope 114 (2004) 918–922. https://doi.org/10.1097/00005537-200405000-00026
- 14. Cortical muscle coupling in Parkinson’s disease (PD) bradykinesia, in Parkinson’s Disease and Related Disorders, eds. P. Reiderer (Springer, Vienna, 2006), pp. 31–40. https://doi.org/10.1007/978-3-211-45295-0_7
- 15. Defective cortical drive to muscle in Parkinson’s disease and its improvement with levodopa, Brain 125(3) (2002) 491–500. https://doi.org/10.1093/brain/awf042
- 16. Motor intention decoding from the upper limb by graph convolutional network based on functional connectivity, Int. J. Neural Syst. 31(12) (2021) 2150047. https://doi.org/10.1142/S0129065721500477
- 17. Movement-related EEG oscillations of contralesional hemisphere discloses compensation mechanisms of severely affected motor chronic stroke patients, Int. J. Neural Syst. 31(12) (2021) 2150053. https://doi.org/10.1142/S0129065721500532
- 18. Electroencephalogram–electromyography coupling analysis in stroke based on symbolic transfer entropy, Front. Neurol. 8 (2018) 176. https://doi.org/10.3389/fneur.2017.00716
- 19. Corticomuscular and intermuscular coupling in simple hand movements to enable a hybrid brain–computer interface, Int. J. Neural Syst. 31(11) (2021) 2150052. https://doi.org/10.1142/S0129065721500520
- 20. Challenges for future theories of Parkinson pathophysiology, Neurosci. Res. 177 (2022) 1–7. https://doi.org/10.1016/j.neures.2021.11.010
- 21. Widespread theta synchrony and high-frequency desynchronization underlies enhanced cognition, Nat. Commun. 8 (2017) 1704. https://doi.org/10.1038/s41467-017-01763-2
- 22. How many gammas? Redefining hippocampal theta-gamma dynamic during spatial learning, Front. Behav. Neurosci. 16 (2022) 811278. https://doi.org/10.3389/fnbeh.2022.811278
- 23. A multiblock PLS model of cortico-cortical and corticomuscular interactions in Parkinson’s disease, Neuroimage 63 (2012) 1498–1509. https://doi.org/10.1016/j.neuroimage.2012.08.023
- 24. Neurophysiological muscle activation scheme for controlling vocal fold models, IEEE Trans. Neural Syst. Rehab. Eng. 27(1) (2019) 1043–1052. https://doi.org/10.1109/TNSRE.2019.2906030
- 25. The impact of respiratory function on voice in patients with presbyphonia, J. Voice 36(2) (2020) 256–271. https://doi.org/10.1016/j.jvoice.2020.05.027
- 26. Discrete-Time Processing of Speech Signals (Macmillan, New York, 1993).
- 27. OPENGLOT – an open environment for the evaluation of glottal inverse filtering, Speech Commun. 107 (2019) 38–47.
- 28. The Akaike information criterion: Background, derivation, properties, application, interpretation, and refinements, WIREs Comput. Stat. 11(3) (2019) e1460. https://doi.org/10.1002/wics.1460
- 29. Glottal source biometrical signature for voice pathology detection, Speech Commun. 51(9) (2009) 759–781. https://doi.org/10.1016/j.specom.2008.09.005
- 30. Estimation of glottal closure instants in voiced speech using the DYPSA algorithm, IEEE Trans. Audio Speech Lang. Process. 15(1) (2007) 34–43. https://doi.org/10.1109/TASL.2006.876878
- 31. A novel pre-processing technique in pathologic voice detection: Application to Parkinson’s disease phonation, Biomed. Signal Process. Control 68 (2021) 102604. https://doi.org/10.1016/j.bspc.2021.102604
- 32. Probability, Random Variables, and Stochastic Processes (McGraw-Hill, New York, 1991).
- 33. A multiblock PLS model of cortico-cortical and corticomuscular interactions in Parkinson’s disease, Neuroimage 63 (2012) 1498–1509. https://doi.org/10.1016/j.neuroimage.2012.08.023
- 34. Coupling analysis of EEG and EMG signals based on transfer entropy after consistent empirical Fourier decomposition, in 8th Int. Conf. Control, Automation, and Robotics (2022), pp. 436–441. https://doi.org/10.1109/ICCAR55106.2022.9782665
- 35. Measurement of tremor in the voices of speakers with Parkinson’s disease, Procedia Comput. Sci. 128 (2018) 47–54. https://doi.org/10.1016/j.procs.2018.03.007
- 36. Breaking down a rhythm: Dissecting the mechanisms underlying task-related neural oscillations, Front. Neural Circuits 16 (2022) 846905. https://doi.org/10.3389/fncir.2022.846905
- 37. Combined use of EMG and EEG techniques for neuromotor assessment in rehabilitative applications: A systematic review, Sensors 21 (2021) 7014. https://doi.org/10.3390/s21217014
- 38. P. Gillivan-Murphy, Voice tremor in Parkinson’s disease (PD): Identification, characterisation, and relationship with speech, voice and disease variables, Doctoral dissertation, Newcastle University (2013). https://theses.ncl.ac.uk/jspui/bitstream/10443/2170/1/Gillivan-Murphy%2013.pdf
- 39. The 3F test dysarthric profile-normative speech values in Czech, Ces. Slov. Neurol. Neurochir. 76(5) (2013) 614–618.
- 40. Sensory-motor networks involved in speech production and motor control: An fMRI study, NeuroImage 109 (2015) 418–428. https://doi.org/10.1016/j.neuroimage.2015.01.040
- 41. Rhythms for cognition: Communication through coherence, Neuron 88 (2015) 220–235. https://doi.org/10.1016/j.neuron.2015.09.034
- 42. Overall perspective, in Principles of Neural Science, eds. E. Kandel, J. Schwartz, S. A. Siegelbaum and A. J. Hudspeth, 5th edn. (McGraw-Hill, New York, USA, 2013).
- 43. Neural pathways underlying vocal control, Neurosci. Biobehav. Rev. 26 (2002) 235–258. https://doi.org/10.1016/S0149-7634(01)00068-9
- 44. The somatotopy of speech: Phonation and articulation in the human motor cortex, Brain Cogn. 70 (2009) 31–41. https://doi.org/10.1016/j.bandc.2008.12.006
- 45. Efficient high-resolution TMS mapping of the human motor cortex by nonlinear regression, Neuroimage 245 (2021) 118654. https://doi.org/10.1016/j.neuroimage.2021.118654
- 46. Cortical mechanisms underlying variability in intermittent theta-burst stimulation-induced plasticity: A TMS-EEG study, Clin. Neurophysiol. 132 (2021) 2519–2531. https://doi.org/10.1016/j.clinph.2021.06.02
- 47. Impaired motor cortical facilitatory-inhibitory circuit interaction in Parkinson’s disease, Clin. Neurophysiol. 132(10) (2021) 2685–2692. https://doi.org/10.1016/j.clinph.2021.05.032
- 48. Complexity of weighted graph: A new technique to investigate structural complexity of brain activities with applications to aging and autism, Neurosci. Lett. 650 (2017) 103–108. https://doi.org/10.1016/j.neulet.2017.04.009
- 49. Brain functional connectivity patterns for emotional state classification in Parkinson’s disease patients without dementia, Behav. Brain Res. 298B (2016) 248–260. https://doi.org/10.1016/j.bbr.2015.10.036
- 50. Evaluation of brain functional connectivity from electroencephalographic signals under different emotional states, Int. J. Neural Syst. 32(10) (2022) 2250026. https://doi.org/10.1142/S0129065722500265
- 51. Detection of movement intention in EEG-based brain–computer interfaces using Fourier-based synchrosqueezing transform, Int. J. Neural Syst. 32(1) (2022). https://doi.org/10.1142/S0129065721500593
- 52. Neuromechanical modelling of articulatory movements from surface electromyography and speech formants, Int. J. Neural Syst. 29(2) (2019) 1850039. https://doi.org/10.1142/S0129065718500399
- 53. Acoustic to kinematic projection in Parkinson’s disease dysarthria, Biomed. Signal Process. Control 66 (2021) 102422. https://doi.org/10.1016/j.bspc.2021.102422
- 54. Graph theory and brain connectivity in Alzheimer’s disease, Neuroscientist 23(6) (2017) 616–626. https://doi.org/10.1177/1073858417702621
- 55. Complexity of functional connectivity networks in mild cognitive impairment subjects during a working memory task, Clin. Neurophysiol. 125(4) (2014) 694–702. https://doi.org/10.1016/j.clinph.2013.08.033
- 56. Control of transcranial direct current stimulation duration by assessing functional connectivity of near-infrared spectroscopy signals, Int. J. Neural Syst. 32(1) (2022) 2150050. https://doi.org/10.1142/S0129065721500507
- 57. Impact of machine learning pipeline choices in autism prediction from functional connectivity data, Int. J. Neural Syst. 31(4) (2021) 2150009. https://doi.org/10.1142/S012906572150009X
- 58. Computer-aided diagnosis of Parkinson’s disease using enhanced probabilistic neural network, J. Med. Syst. 39 (2015) 179. https://doi.org/10.1007/s10916-015-0353-9
- 59. Dataset of speech production in intracranial electroencephalography, Sci. Data 9 (2022) 434. https://doi.org/10.1038/s41597-022-01542-9
- 60. Gabor PDNet: Gabor transformation and deep neural network for Parkinson’s disease detection using EEG signals, Electronics 10 (2021) 1740. https://doi.org/10.3390/electronics10141740
- 61. Automated detection of abnormal EEG signals using localized wavelet filter banks, Pattern Recognit. Lett. 133 (2020) 188–194. https://doi.org/10.1016/j.patrec.2020.03.009
- 62. Transfer entropy – a model-free measure of effective connectivity for neurosciences, J. Comput. Neurosci. 30 (2011) 45–67. https://doi.org/10.1007/s10827-010-0262-3
- 63. Robust and complex approach of pathological speech signal analysis, Neurocomputing 167 (2015) 94–111. https://doi.org/10.1016/j.neucom.2015.02.085