
Emina Alickovic

Associate Professor, Linköping University

Gautam Sridhar, Sofia Boselli, Martin A. Skoglund, Bo Bernhardsson, E. Alickovic

OBJECTIVE This study aimed to investigate the potential of contrastive learning to improve auditory attention decoding (AAD) using electroencephalography (EEG) data in challenging cocktail-party scenarios with competing speech and background noise. APPROACH Three different models were implemented for comparison: a baseline linear model (LM), a non-linear model without contrastive learning (NLM), and a non-linear model with contrastive learning (NLMwCL). The EEG data and speech envelopes were used to train these models. The NLMwCL model used SigLIP, a variant of the CLIP loss, to embed the data. The speech envelopes were reconstructed from the models and compared with the attended and ignored speech envelopes to assess reconstruction accuracy, measured as the correlation between the reconstructed and actual speech envelopes. These reconstruction accuracies were then compared to classify attention. All models were evaluated in 34 listeners with hearing impairment. RESULTS The reconstruction accuracy for attended and ignored speech, along with attention classification accuracy, was calculated for each model across various time windows. The NLMwCL model consistently outperformed the other models in both speech reconstruction and attention classification. For a 3-second time window, the NLMwCL model achieved a mean attended-speech reconstruction accuracy of 0.105 and a mean attention classification accuracy of 68.0%, while the NLM model scored 0.096 and 64.4%, and the LM achieved 0.084 and 62.6%, respectively. SIGNIFICANCE These findings demonstrate the promise of contrastive learning in improving AAD and highlight the potential of EEG-based tools for clinical applications and for progress in hearing technology, particularly in the design of new neuro-steered signal processing algorithms.
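
As a concrete illustration of the two key ingredients described above, the sketch below shows a SigLIP-style pairwise sigmoid loss over EEG/envelope embedding pairs and the correlation-based attention classification step. This is a minimal sketch, not the authors' code: the embedding networks, the learnable scale t and bias b, and all variable names are assumptions.

```python
import numpy as np
import torch
import torch.nn.functional as F

def siglip_loss(z_eeg, z_audio, t, b):
    """SigLIP-style pairwise sigmoid loss: matched EEG/envelope pairs
    (the diagonal) are pushed toward +1, mismatched pairs toward -1.
    z_eeg, z_audio: (batch, dim) L2-normalised embeddings (assumed)."""
    logits = t * z_eeg @ z_audio.T + b
    labels = 2.0 * torch.eye(z_eeg.shape[0], device=logits.device) - 1.0
    return -F.logsigmoid(labels * logits).mean()

def classify_attention(reconstructed, attended, ignored):
    """Assign attention to whichever speech envelope correlates more
    strongly with the envelope reconstructed from EEG."""
    r_att = np.corrcoef(reconstructed, attended)[0, 1]
    r_ign = np.corrcoef(reconstructed, ignored)[0, 1]
    return "attended" if r_att > r_ign else "ignored"
```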

Johanna Wilroth, E. Alickovic, Martin A. Skoglund, Carine Signoret, J. Rönnberg, Martin Enqvist

Hearing impairment (HI) disrupts social interaction by hindering the ability to follow conversations in noisy environments. While hearing aids (HAs) with noise reduction (NR) partially address this, the “cocktail-party problem” persists, where individuals struggle to attend to specific voices amidst background noise. This study investigated how NR and an advanced signal processing method for compensating for nonlinearities in electroencephalography (EEG) signals can improve neural speech processing in HI listeners. Participants wore HAs with NR either activated or deactivated while focusing on target speech amidst competing masker speech and background noise. Analysis focused on temporal response functions (TRFs) to assess neural tracking of relevant target and masker speech. Results revealed enhanced neural responses (N1 and P2) to target speech, particularly in frontal and central scalp regions, when NR was activated. Additionally, a novel method compensated for nonlinearities in the EEG data, improving the signal-to-noise ratio (SNR) and potentially revealing more precise neural tracking of relevant speech; this effect was most prominent in the left-frontal scalp region. Importantly, NR activation significantly improved the effectiveness of this method, leading to stronger responses and reduced variance in the EEG data. This study provides valuable insights into the neural mechanisms underlying NR benefits and introduces a promising EEG analysis approach sensitive to NR effects, paving the way for potential improvements in HAs.
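
For readers unfamiliar with TRFs, the sketch below shows the standard forward-estimation idea the analysis builds on: regress EEG on time-lagged copies of the speech envelope with ridge regularisation, so that peaks in the resulting filter (e.g., N1, P2) can be read off as neural responses. The lag range, regularisation strength, and names are illustrative assumptions, not the study's exact pipeline.

```python
import numpy as np

def lagged_design(stim, lags):
    """Stack time-lagged copies of the stimulus envelope as columns."""
    n = len(stim)
    X = np.zeros((n, len(lags)))
    for j, lag in enumerate(lags):
        if lag >= 0:
            X[lag:, j] = stim[:n - lag]
        else:
            X[:lag, j] = stim[-lag:]
    return X

def fit_trf(stim, eeg_channel, lags, lam=1e2):
    """Ridge solution w = (X'X + lam*I)^{-1} X'y for one EEG channel;
    w sampled over the lags is the TRF waveform."""
    X = lagged_design(stim, lags)
    return np.linalg.solve(X.T @ X + lam * np.eye(len(lags)), X.T @ eeg_channel)

fs = 64                                            # Hz, illustrative
lags = np.arange(int(-0.1 * fs), int(0.4 * fs))    # -100..400 ms window
```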

Zlatan Ajanović, Hamza Merzić, Suad Krilasević, Eldar Kurtić, Bakir Kudić, Rialda Spahić, E. Alickovic, Aida Brankovic, Kenan Sehic et al.

In this paper, we analyze examples of research institutes that stand out in scientific excellence and social impact. We define key practices for evaluating research results, economic conditions, and the selection of specific research topics. Special focus is placed on small countries and the field of artificial intelligence. The aim is to identify components that enable institutes to achieve a high level of innovation, self-sustainability, and social benefits.

Sara Carta, E. Alickovic, Johannes Zaar, Alejandro López Valdés, Giovanni M. Di Liberto

Hearing impairment alters the sound input received by the human auditory system, reducing speech comprehension in noisy multi-talker auditory scenes. Despite such difficulties, neural signals were shown to encode the attended speech envelope more reliably than the envelope of ignored sounds, reflecting the intention of listeners with hearing impairment (HI). This result raises an important question: What speech-processing stage could reflect the difficulty in attentional selection, if not envelope tracking? Here, we use scalp electroencephalography (EEG) to test the hypothesis that the neural encoding of phonological information (i.e., phonetic boundaries and phonological categories) is affected by HI. In a cocktail-party scenario, such phonological difficulty might be reflected in an overrepresentation of phonological information for both attended and ignored speech sounds, with detrimental effects on the ability to effectively focus on the speaker of interest. To investigate this question, we carried out a re-analysis of an existing dataset where EEG signals were recorded as participants with HI, fitted with hearing aids, attended to one speaker (target) while ignoring a competing speaker (masker) and spatialised multi-talker background noise. Multivariate temporal response function (TRF) analyses indicated a stronger phonological information encoding for target than masker speech streams. Follow-up analyses aimed at disentangling the encoding of phonological categories and phonetic boundaries (phoneme onsets) revealed that neural signals encoded the phoneme onsets for both target and masker streams, in contrast with previously published findings with normal hearing (NH) participants and in line with our hypothesis that speech comprehension difficulties emerge due to a robust phonological encoding of both target and masker. Finally, the neural encoding of phoneme onsets was stronger for the masker speech, pointing to a possible neural basis for the higher distractibility experienced by individuals with HI.
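
The phonological-encoding question above is typically operationalised by comparing nested multivariate TRF models. The sketch below is a minimal illustration under assumed variable names, not the study's pipeline: it builds a phoneme-onset impulse train and indicates how the gain of a full model (onsets plus phonetic features) over an onsets-only model isolates the encoding of phonological categories.

```python
import numpy as np

def phoneme_onset_train(onset_times_s, n_samples, fs):
    """Impulse train with a unit spike at each phoneme onset (seconds)."""
    train = np.zeros(n_samples)
    idx = np.round(np.asarray(onset_times_s) * fs).astype(int)
    train[idx[(idx >= 0) & (idx < n_samples)]] = 1.0
    return train

# Nested-model comparison (illustrative):
#   X_onsets = phoneme_onset_train(onsets, n, fs)[:, None]
#   X_full   = np.column_stack([X_onsets, phonetic_feature_matrix])
# Fit a TRF with each design, predict held-out EEG, and take
#   gain = r_full - r_onsets
# as the contribution of phonological categories beyond phonetic boundaries.
```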

Heidi B. Borges, Johannes Zaar, E. Alickovic, C. B. Christensen, P. Kidmose

This study investigates the potential of speech-reception-threshold (SRT) estimation through electroencephalography (EEG) based envelope reconstruction techniques with continuous speech. Additionally, we investigate the influence of the stimuli’s signal-to-noise ratio (SNR) on the temporal response function (TRF). Twenty young normal-hearing participants listened to audiobook excerpts with varying background noise levels while EEG was recorded. A linear decoder was trained to reconstruct the speech envelope from the EEG data. The reconstruction accuracy was calculated as the Pearson’s correlation between the reconstructed and actual speech envelopes. An EEG SRT estimate (SRTneuro) was obtained as the midpoint of a sigmoid function fitted to the reconstruction accuracy versus SNR data points. Additionally, the TRF was estimated at each SNR level, followed by a statistical analysis to reveal significant effects of SNR levels on the latencies and amplitudes of the most prominent components. The SRTneuro was within 3 dB of the behavioral SRT for all participants. The TRF analysis showed a significant latency decrease for N1 and P2 and a significant amplitude magnitude increase for N1 and P2 with increasing SNR. The results suggest that both envelope reconstruction accuracy and the TRF components are influenced by changes in SNR, indicating they may be linked to the same underlying neural process.
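
A minimal sketch of the SRTneuro estimate described above: fit a sigmoid to reconstruction accuracy as a function of SNR and read off its midpoint. The data points and starting values below are synthetic placeholders, not values from the study.

```python
import numpy as np
from scipy.optimize import curve_fit

def sigmoid(snr_db, midpoint, slope, floor, ceiling):
    """Four-parameter logistic: reconstruction accuracy vs. SNR (dB)."""
    return floor + (ceiling - floor) / (1.0 + np.exp(-slope * (snr_db - midpoint)))

snr_db = np.array([-12.0, -8.0, -4.0, 0.0, 4.0, 8.0])     # placeholder SNRs
acc    = np.array([0.02, 0.03, 0.06, 0.10, 0.12, 0.13])   # placeholder accuracies

params, _ = curve_fit(sigmoid, snr_db, acc, p0=[-4.0, 1.0, 0.02, 0.13])
srt_neuro = params[0]   # the midpoint is the EEG-based SRT estimate
print(f"SRTneuro = {srt_neuro:.1f} dB")
```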

Oskar Keding, E. Alickovic, Martin A. Skoglund, Maria Sandsten

In the literature, auditory attention is explored through neural speech tracking, primarily entailing modeling and analyzing electroencephalography (EEG) responses to natural speech via linear filtering. Our study takes a novel approach, introducing an enhanced coherence estimation technique to assess the strength of neural speech tracking. This enables effective discrimination between attended and ignored speech. To mitigate the impact of colored noise in EEG, we address two biases: overall coherence-level bias and spectral peak-shifting bias. In a listening study involving 32 participants with hearing impairment, tasked with attending to competing talkers in background noise, our coherence-based method effectively discerns EEG representations of attended and ignored speech. We comprehensively analyze frequency bands, individual frequencies, and EEG channels. The delta, theta, and alpha bands prove to be the most informative, and the central EEG channels are the most important. Lastly, we showcase coherence differences across different noise reduction settings implemented in hearing aids (HAs), underscoring our method's potential to objectively assess auditory attention and enhance HA efficacy.
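
The coherence-based tracking measure can be illustrated with scipy's standard Welch-segment coherence estimator; the bias corrections that are the paper's actual contribution are not reproduced in this toy sketch, and the signals are synthetic.

```python
import numpy as np
from scipy.signal import coherence

fs = 64                                               # Hz, illustrative
rng = np.random.default_rng(0)
envelope = rng.standard_normal(fs * 60)               # 60 s toy speech envelope
eeg = 0.3 * envelope + rng.standard_normal(fs * 60)   # toy EEG channel

f, Cxy = coherence(envelope, eeg, fs=fs, nperseg=4 * fs)
low_band = Cxy[(f >= 1) & (f <= 12)]   # delta/theta/alpha, found informative above
print(f"mean coherence 1-12 Hz: {low_band.mean():.3f}")
```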

Oskar Keding, Johanna Wilroth, Martin A. Skoglund, E. Alickovic

Effective preprocessing of electroencephalography (EEG) data is fundamental for deriving meaningful insights. Independent component analysis (ICA) serves as an important step in this process by aiming to eliminate undesirable artifacts from EEG data. However, the decision on which and how many components to remove remains somewhat arbitrary, despite the availability of both automatic and manual ICA-based artifact rejection methods. This study investigates the influence of different ICA-based artifact rejection strategies on EEG-based auditory attention decoding (AAD) analysis. We employ multiple ICA-based artifact rejection approaches, ranging from manual to automatic versions, and assess their effects on conventional AAD methods. The comparison aims to uncover potential variations in analysis results due to different artifact rejection choices within pipelines, and whether such variations differ across AAD methods. Although our study finds no large difference in the performance of linear AAD models between artifact rejection methods, two exceptions were found. When predicting EEG responses, the manual artifact rejection method appeared to perform better in frontal channel groups. Conversely, when reconstructing speech envelopes from EEG, not using artifact rejection outperformed the other approaches.
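
The sketch below shows one ICA-based rejection pipeline of the kind compared in the study, using MNE-Python. The file name, component count, and the use of EOG correlation for the automatic variant are illustrative assumptions, not the study's exact settings.

```python
import mne
from mne.preprocessing import ICA

# hypothetical recording; a 1 Hz high-pass helps ICA converge
raw = mne.io.read_raw_fif("subject01_raw.fif", preload=True)
raw.filter(l_freq=1.0, h_freq=None)

ica = ICA(n_components=20, random_state=42)
ica.fit(raw)

# automatic variant: flag components that correlate with an EOG channel;
# a manual variant would instead inspect ica.plot_components() by eye
eog_inds, _ = ica.find_bads_eog(raw)
ica.exclude = eog_inds
raw_clean = ica.apply(raw.copy())
```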

Joshua P. Kulasingham, H. Innes-Brown, Martin Enqvist, E. Alickovic

The auditory brainstem response (ABR) is a measure of subcortical activity in response to auditory stimuli. The wave V peak of the ABR depends on the stimulus intensity level and has been widely used for clinical hearing assessment. Conventional methods estimate the ABR by averaging electroencephalography (EEG) responses to short, unnatural stimuli such as clicks. Recent work has moved toward more ecologically relevant continuous speech stimuli using linear deconvolution models called temporal response functions (TRFs). Investigating whether the TRF waveform changes with stimulus intensity is a crucial step toward the use of natural speech stimuli for hearing assessments involving subcortical responses. Here, we develop methods to estimate level-dependent subcortical TRFs using EEG data collected from 21 participants listening to continuous speech presented at 4 different intensity levels. We find that level-dependent changes can be detected in the wave V peak of the subcortical TRF for almost all participants and are consistent with level-dependent changes in click-ABR wave V. We also investigate the most suitable peripheral auditory model to generate predictors for level-dependent subcortical TRFs and find that simple gammatone filterbanks perform best. Additionally, around 6 min of data may be sufficient for detecting level-dependent effects and wave V peaks above the noise floor for speech segments with higher intensity. Finally, we show a proof of concept that level-dependent subcortical TRFs can be detected even from the inherent intensity fluctuations in natural continuous speech.
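
The gammatone-filterbank predictor found to work best can be sketched as follows: band-pass the speech with gammatone filters, half-wave rectify each band, and sum, yielding a predictor for the subcortical TRF. The centre frequencies, band count, and sampling rate are illustrative assumptions.

```python
import numpy as np
from scipy.signal import gammatone, lfilter

def gammatone_predictor(audio, fs, cfs):
    """Rectified, band-summed gammatone filterbank output."""
    bands = []
    for cf in cfs:
        b, a = gammatone(cf, "iir", fs=fs)     # 4th-order IIR gammatone
        band = lfilter(b, a, audio)
        bands.append(np.maximum(band, 0.0))    # half-wave rectification
    return np.sum(bands, axis=0)

fs = 16000
cfs = np.geomspace(80, 7000, num=31)           # log-spaced centre frequencies
# predictor = gammatone_predictor(speech, fs, cfs), then fit the TRF as usual
```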

M. A. Tanveer, Martin A. Skoglund, Bo Bernhardsson, E. Alickovic

Objective. This study develops a deep learning (DL) method for fast auditory attention decoding (AAD) using electroencephalography (EEG) from listeners with hearing impairment (HI). It addresses three classification tasks: differentiating noise from speech-in-noise, classifying the direction of attended speech (left vs. right), and identifying the activation status of hearing aid noise reduction algorithms (OFF vs. ON). These tasks contribute to our understanding of how hearing technology influences auditory processing in the hearing-impaired population. Approach. Deep convolutional neural network (DCNN) models were designed for each task. Two training strategies were employed to clarify the impact of data splitting on AAD tasks: inter-trial, where the testing set used classification windows from trials the training set had not seen, and intra-trial, where the testing set used unseen classification windows from trials whose other segments were seen during training. The models were evaluated on EEG data from 31 participants with HI listening to competing talkers amidst background noise. Main results. Using 1 s classification windows, the DCNN models achieved accuracy (ACC) of 69.8%, 73.3%, and 82.9% and area under the curve (AUC) of 77.2%, 80.6%, and 92.1% for the three tasks, respectively, with the inter-trial strategy. With the intra-trial strategy, they achieved ACC of 87.9%, 80.1%, and 97.5%, along with AUC of 94.6%, 89.1%, and 99.8%. Our DCNN models show good performance on short 1 s EEG samples, making them suitable for real-world applications. Conclusion. Our DCNN models successfully addressed three tasks with short 1 s EEG windows from participants with HI, showcasing their potential. While the inter-trial strategy demonstrated promise for assessing AAD, the intra-trial approach yielded inflated results, underscoring the important role of proper data splitting in EEG-based AAD tasks. Significance. Our findings showcase the promising potential of EEG-based tools for assessing auditory attention in clinical contexts and advancing hearing technology, while also promoting further exploration of alternative DL architectures and their potential constraints.
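
The inter-trial/intra-trial distinction that drives the inflated intra-trial results can be made concrete in a few lines; the window counts and test fraction below are illustrative.

```python
import numpy as np

def inter_trial_split(trial_ids, test_trials):
    """Hold out every window belonging to the given trials."""
    test = np.isin(trial_ids, test_trials)
    return ~test, test

def intra_trial_split(n_windows, test_frac=0.2, seed=0):
    """Hold out a random subset of windows regardless of trial, so the
    model sees other windows from the same trial during training."""
    rng = np.random.default_rng(seed)
    test = rng.random(n_windows) < test_frac
    return ~test, test

trial_ids = np.repeat(np.arange(20), 60)       # e.g. 20 trials x 60 windows
train_mask, test_mask = inter_trial_split(trial_ids, test_trials=[18, 19])
```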

...
...
...
