EEG brain-machine interface to control a speech synthesizer
This project uses an established computational neural model of speech production, the DIVA model, as the theoretical foundation for a brain-machine interface (BMI) that allows individuals with near-total paralysis to control a real-time speech synthesizer. Following DIVA, we use formant frequencies as the key “intermediate representation” between neural activity and intended speech output. Formant frequencies provide a continuous, low-dimensional space for speech movement planning, which makes them well suited to brain-machine interface applications. This system was previously employed as a communication device that permitted a participant with an intracortical implant to control a formant synthesizer for real-time production of vowel sounds. We are now extending this work in three important directions.
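To make the formant-based control space concrete, the sketch below synthesizes vowels from just two control parameters, F1 and F2, using a textbook source-filter approach: a glottal impulse train passed through second-order formant resonators. The specific formant values, bandwidths, and pitch are illustrative assumptions, not the project's actual synthesizer.

```python
import numpy as np

def resonator(signal, freq, bw, fs):
    """Apply a second-order IIR resonator (one formant) to a signal."""
    r = np.exp(-np.pi * bw / fs)          # pole radius from the bandwidth
    theta = 2 * np.pi * freq / fs         # pole angle from the center frequency
    b0 = 1 - 2 * r * np.cos(theta) + r * r
    a1 = -2 * r * np.cos(theta)
    a2 = r * r
    out = np.zeros_like(signal)
    for n in range(len(signal)):
        y1 = out[n - 1] if n >= 1 else 0.0
        y2 = out[n - 2] if n >= 2 else 0.0
        out[n] = b0 * signal[n] - a1 * y1 - a2 * y2
    return out

def synthesize_vowel(f1, f2, dur=0.3, f0=120, fs=16000):
    """Source-filter synthesis: impulse train at the voice pitch,
    filtered by F1 and F2 resonators (bandwidths are rough guesses)."""
    n = int(dur * fs)
    source = np.zeros(n)
    source[::int(fs / f0)] = 1.0          # glottal impulse train
    for freq, bw in [(f1, 80), (f2, 100)]:
        source = resonator(source, freq, bw, fs)
    return source / np.max(np.abs(source))

# /i/ ("ee") has low F1 and high F2; /a/ ("ah") has high F1 and lower F2.
ee = synthesize_vowel(300, 2300)
ah = synthesize_vowel(700, 1200)
```

The point of the sketch is that an entire vowel is specified by a single point (F1, F2), so a user only has to steer a two-dimensional cursor to select among vowels.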
First, we are pursuing the possibility of using our current decoder/synthesizer approach in a non-invasive brain-machine interface based on electroencephalography (EEG), thereby expanding the utility of the system beyond locked-in patients willing to undergo intracortical implantation. We and others have established that standard EEG analysis techniques can discriminate between the imagined production of two different speech sounds, but these techniques are not yet sufficient for control of our speech synthesizer. Therefore, EEG signals related to imagined limb movements are being used in addition to imagined speech; a wealth of previous studies has shown limb movement signals to be readily identifiable and robust for BMI purposes. We have developed software that, in a pilot study, allowed a practiced user to control the synthesizer for vowel production. In future work we will investigate optimal methods for decoding EEG signals from imagined movement tasks, as well as training paradigms that allow the user to learn more quickly to produce “movements” of the synthesizer.
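As a rough illustration of how imagined limb-movement signals could drive a two-dimensional control space, the hypothetical sketch below maps mu-band (8–12 Hz) power at left- and right-hemisphere motor electrodes (C3, C4) to two control coordinates: lateralized desynchronization drives one axis and overall desynchronization the other. The channel names, band limits, gains, and the decoding rule itself are all assumptions for illustration, not the project's decoder.

```python
import numpy as np

def bandpower(x, fs, f_lo=8.0, f_hi=12.0):
    """Mean power in a frequency band, estimated from the periodogram."""
    spectrum = np.abs(np.fft.rfft(x)) ** 2 / len(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    band = (freqs >= f_lo) & (freqs <= f_hi)
    return spectrum[band].mean()

def decode_2d(c3, c4, fs, baseline=1.0, gain=1.0):
    """Toy motor-imagery decoder (hypothetical): the C3/C4 mu-power
    difference drives the horizontal coordinate, and overall mu
    suppression relative to a resting baseline drives the vertical."""
    p3, p4 = bandpower(c3, fs), bandpower(c4, fs)
    vx = gain * (p4 - p3)                      # left- vs right-hand imagery
    vy = gain * (baseline - 0.5 * (p3 + p4))   # imagery vs rest
    return vx, vy

# Example: simulate right-hand imagery (mu suppression over C3).
fs = 256
t = np.arange(fs) / fs
c3 = 0.2 * np.sin(2 * np.pi * 10 * t)   # desynchronized mu rhythm
c4 = 1.0 * np.sin(2 * np.pi * 10 * t)   # idling mu rhythm
vx, vy = decode_2d(c3, c4, fs)
```

In a closed-loop system these two coordinates could be treated as positions or velocities in the (F1, F2) plane, so that practiced motor imagery "moves" the synthesizer between vowels.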
A second new component of this work is the creation of improved speech synthesizers for real-time synthesis. Although a formant synthesizer allows the user to produce a few words, its output is limited by the extreme difficulty of producing natural-sounding consonants with formant synthesis. We are therefore developing software that, in real time, maps a formant-frequency representation onto a vocal tract articulatory model, which is then used to synthesize the speech output. This method allows the creation of realistic-sounding consonants from simple, smooth movements in the two-dimensional control space, dramatically increasing the user's potential vocabulary over the formant synthesizer, which is largely limited to vowels.
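One simple way such a real-time formant-to-articulator mapping could be realized is table-driven interpolation: store articulatory configurations for a few known (F1, F2) targets and blend them by inverse-distance weighting as the control point moves. The targets and parameter values below are illustrative placeholders, not the project's actual vocal tract model.

```python
import numpy as np

# Hypothetical target table: (F1, F2) in Hz -> (tongue height, tongue
# frontness) on a 0..1 scale.  Values are placeholders, not measured data.
TARGETS = {
    (300, 2300): (0.9, 0.9),   # /i/: high front tongue
    (300,  800): (0.9, 0.1),   # /u/: high back tongue
    (700, 1200): (0.1, 0.4),   # /a/: low tongue
    (500, 1700): (0.5, 0.6),   # neutral, schwa-like posture
}

def formants_to_articulators(f1, f2, eps=1e-9):
    """Map a point in 2D formant space to articulatory parameters by
    inverse-distance-weighted interpolation over the target table."""
    pts = np.array(list(TARGETS.keys()), dtype=float)
    arts = np.array(list(TARGETS.values()), dtype=float)
    d2 = ((pts - [f1, f2]) ** 2).sum(axis=1)
    if d2.min() < eps:                    # control point is exactly on a target
        return tuple(arts[d2.argmin()])
    w = 1.0 / d2                          # nearer targets dominate the blend
    return tuple((w[:, None] * arts).sum(axis=0) / w.sum())
```

Because the weights vary smoothly with the control point, a smooth trajectory in formant space yields a smooth articulatory trajectory, which is the property the paragraph above relies on for consonant production.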
Third, we plan to enroll severely paralyzed patients to test our EEG-based BMI system. This will be carried out in collaboration with Dr. Steve Williams of Boston Medical Center, Director of the New England Regional Spinal Cord Injury Center. We will test whether these patients can carry out the same tasks learned by neurologically normal subjects.
Successful completion of this project will contribute substantially to the rapidly growing field of neural prosthetics by further extending this work into the domain of speech BMIs, an area in which the supervising faculty members are recognized as pioneers.
Main personnel: Jon Brumberg, Andres Salazar Gomez
Collaborators: Frank Guenther, Steve Williams
Funding:
NIH/NIDCD: Investigating output modality for a brain-computer interface for communication
CELEST: Developing a brain-machine interface for speech communication