JI Wei, WANG Chuanyu, LI Yun, et al. Auxiliary detection method of Parkinson’s disease based on multi-source speech information fusion[J]. Journal of Signal Processing, 2023, 39(12): 2254-2264. DOI: 10.16798/j.issn.1003-0530.2023.12.012

Auxiliary Detection Method of Parkinson’s Disease Based on Multi-Source Speech Information Fusion

In the early stages of Parkinson’s disease, patients develop symptoms such as difficulty in pronunciation and unstable articulation because the vocal organs lose flexibility and coordination. Based on these physiological phenomena, experts have designed multi-type corpora, including sustained vowels, repetitive syllables, and situational dialogues, to analyze the speech ability of subjects. Existing research on speech-based Parkinson’s disease detection mostly relies on single-type corpora, which can evaluate the coordination of only certain vocal organs, cannot comprehensively reflect a subject’s vocal condition, and are susceptible to factors such as the recording environment and individual differences. To address these issues, this paper proposes a multi-source speech information fusion model for auxiliary detection of Parkinson’s disease. The aim is to make full use of the multi-source speech data obtained from different types of corpora, extract comprehensive and rich pathological information, and counteract the influence of non-pathological factors. The proposed model consists of an encoder module, a decoder module, and a classifier module. In the encoder module, multiple branches learn the information specific to each individual source of speech data. A fusion branch based on a multi-head attention mechanism enables finer-grained information interaction and learns the information common to the multi-source speech data, so that the pathological information carried by the multi-source data is extracted comprehensively. The decoder module assists the encoder module in information compression and redundancy elimination. The classifier module detects Parkinson’s disease from the output of the encoder module and also helps the encoder module learn compact representations of pathological information.
To further ensure that both specific and common information are extracted, the model imposes orthogonal constraints on these information components. Multiple comparative experiments were conducted on a self-collected dataset containing 340 speech samples. The experimental results show that the proposed model outperforms models based on single-source speech data in accuracy, sensitivity, and F1 score for Parkinson’s disease detection, with improvements of 6%, 3%, and 6%, respectively. Moreover, the effective integration of common and specific information enables the proposed model to achieve an accuracy improvement of more than 2.8% over other information fusion models.
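
The abstract does not state the exact form of the orthogonal constraint between the specific and common information. As a minimal sketch, assuming a squared Frobenius norm penalty on the cross-product of the two feature batches (a common choice in shared/private representation learning, not confirmed as the authors' loss), the idea could look as follows. The function name `orthogonality_penalty` and the toy feature values are hypothetical.

```python
# Illustrative sketch only: the abstract does not specify the constraint.
# Assumption: the penalty is ||S^T C||_F^2, where S holds source-specific
# features and C holds common (fused) features for the same batch.

def orthogonality_penalty(specific, common):
    """Return the squared Frobenius norm of S^T C.

    specific, common: lists with one row per sample; each row is a list
    of floats. The penalty is zero when every specific feature dimension
    is orthogonal to every common feature dimension across the batch.
    """
    if len(specific) != len(common):
        raise ValueError("batches must have the same number of samples")
    d_s = len(specific[0])
    d_c = len(common[0])
    penalty = 0.0
    for i in range(d_s):
        for j in range(d_c):
            # Entry (i, j) of S^T C: column i of S dotted with column j of C.
            entry = sum(s_row[i] * c_row[j]
                        for s_row, c_row in zip(specific, common))
            penalty += entry * entry
    return penalty


# Hypothetical toy batch: two samples with 2-D specific and common features.
toy_specific = [[1.0, 0.0], [0.0, 0.0]]
orthogonal_common = [[0.0, 0.0], [0.0, 1.0]]
overlapping_common = [[1.0, 0.0], [0.0, 1.0]]

print(orthogonality_penalty(toy_specific, orthogonal_common))   # 0.0
print(orthogonality_penalty(toy_specific, overlapping_common))  # 1.0
```

In a training setup, a term of this kind would typically be added to the classification and reconstruction losses with a weighting coefficient, which pushes the specific branches and the fusion branch toward encoding non-overlapping information.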