Etd

Mobile Paralinguistic Health Assessment from Speech: Energy-Efficient and Privacy-Preserving Neural Network Models

Public Deposited

Speech is an effective biomarker for evaluating neurological disorders, such as Traumatic Brain Injury (TBI), and mental health conditions. Speech production and communication difficulties are common manifestations of disability after TBI (2% of the population), whereas speech patterns such as low pitch and monotonous speech are effective indicators of depression (8.4% of the population). To alleviate healthcare burdens and reduce rehospitalization, passive speech monitoring through mobile devices offers a promising approach at scale, requiring minimal subject involvement while providing more accurate assessments than traditional methods that require active engagement and clinic visits.
This dissertation addresses three major challenges in employing Deep Neural Networks (DNNs) for continuous paralinguistic health assessment on smartphones: energy efficiency, adverse recording environments, and speaker privacy. DNNs are powerful tools for speech analysis but consume significant energy. To improve the power efficiency of DNNs and optimize speech recording on smartphones, a novel masking kernel is proposed to learn the most energy-efficient length and sampling rate of a speech sample during DNN training. Addressing the challenge of diverse recording environments, particularly the crowded spaces that patients may visit, requires isolating the target speaker's speech from other people's speech and background noise. We propose Target Speaker Isolation with Normal Distribution (TSI-N), which utilizes an N-vector speaker representation trained to follow an unbounded normal distribution for each speaker cluster, enabling precise isolation of the target speaker's speech. Furthermore, ensuring speaker privacy, including the protection of biometric and linguistic content from unauthorized access, is crucial. An adversarial pruning technique is proposed for extracting privacy-preserving speech features on smartphones.
Successfully overcoming these challenges is critical for the effective implementation of passive and continuous paralinguistic health assessments on smartphones, which is essential for enhancing healthcare monitoring and interventions at scale.

Creator
Contributors
Degree
Unit
Publisher
Identifier
  • etd-121225
Keyword
Advisor
Orcid
Committee
Defense date
Year
  • 2024
Date created
  • 2024-04-18
Resource type
Source
  • etd-121225
Rights statement
License
Last modified
  • 2024-05-29


Permanent link to this page: https://digital.wpi.edu/show/kw52jd39m