ABSTRACT Everyday communication is dynamic and multisensory, often involving shifting attention, overlapping speech, and visual cues. Yet, most neural attention tracking studies are still limited to highly controlled lab settings, using clean, often audio‐only stimuli and requiring sustained attention to a single talker. This work addresses that gap by introducing a novel dataset from 24 normal‐hearing participants. We used a wearable electroencephalography (EEG) system (44 scalp electrodes and 20 cEEGrid electrodes) in an audiovisual (AV) paradigm with three conditions: sustained attention to a single talker in a two‐talker environment, attention switching between two talkers, and unscripted two‐talker conversations with a competing single talker. Analysis included temporal response function (TRF) modeling, optimal lag analysis, selective attention classification with decision windows ranging from 1.1 to 35 s, and comparisons of TRFs for attention to AV conversations versus side audio‐only talkers. Key findings show significant differences in the attention‐related P2 peak between attended and ignored speech across conditions for scalp EEG. Interestingly, our results revealed strong cross‐condition generalization, with models trained in one condition maintaining good performance when evaluated on the other two. No significant change in performance between switching and sustained attention suggests robustness to attention switches. Optimal lag analysis revealed a narrower peak for conversation compared to single‐talker AV stimuli, reflecting the additional complexity of multi‐talker processing. Classification of selective attention was consistently above chance (55%–70% accuracy) for scalp EEG, whereas cEEGrid data yielded lower correlations, highlighting the need for further methodological improvements. These results demonstrate that wearable EEG can reliably track selective attention in dynamic, multisensory listening scenarios and provide guidance for designing future AV paradigms and real‐world attention tracking applications.
This study examines how the signal‐to‐noise‐interference ratio (SNIR) influences auditory performance and neural responses associated with listening effort (LE). A new dataset was collected from individuals with moderate hearing loss, all fitted with hearing aids (HAs). Participants listened to two competing audiobooks presented via front‐facing loudspeakers, while 16‐talker babble noise was delivered from background speakers. Six SNIR levels (5.47, −3.55, −2.13, −1.19, −0.64, and −0.27 dB) were tested. Participants were instructed to attend to one audiobook while ignoring the competing speech and background noise and were subsequently assessed on content of the attended speech and perceived LE. The performance results revealed a significant linear effect of SNIR on subjective ratings of LE and a primarily quadratic effect on comprehension questionnaire accuracy, suggesting that perceived effort decreases steadily with improving SNIR, while comprehension questionnaire performance exhibits a plateau at higher SNIR levels. The EEG analyses demonstrated a significant relationship between SNIR and local connectivity, specifically in the parietal electrodes and in the alpha frequency band. Further analysis confirmed that parietal local connectivity correlates linearly with subjective LE ratings. Moreover, spectral power analysis showed that parietal alpha power is not significantly related to SNIR, indicating that local connectivity may serve as a more sensitive neural marker. While local connectivity and alpha power may share some neural underpinnings, they offer complementary, yet non‐identical insights. These findings highlight the potential of local EEG connectivity as a reliable estimate of LE in acoustically challenging environments.
Everyday communication is dynamic and multisensory, often involving shifting attention, overlapping speech and visual cues. Yet, most neural attention tracking studies are still limited to highly controlled lab settings, using clean, often audio-only stimuli and requiring sustained attention to a single talker. This work addresses that gap by introducing a novel dataset from 24 normal-hearing participants. We used a mobile electroencephalography (EEG) system (44 scalp electrodes and 20 cEEGrid electrodes) in an audiovisual (AV) paradigm with three conditions: sustained attention to a single talker in a two-talker environment, attention switching between two talkers, and unscripted two-talker conversations with a competing single talker. Analysis included temporal response function (TRF) modeling, optimal lag analysis, selective attention classification with decision windows ranging from 1.1 s to 35 s, and comparisons of TRFs for attention to AV conversations versus side audio-only talkers. Key findings show significant differences in the attention-related P2 peak between attended and ignored speech across conditions for scalp EEG. No significant change in performance between switching and sustained attention suggests robustness to attention switches. Optimal lag analysis revealed a narrower peak for conversation compared to single-talker AV stimuli, reflecting the additional complexity of multi-talker processing. Classification of selective attention was consistently above chance (55–70% accuracy) for scalp EEG, while cEEGrid data yielded lower correlations, highlighting the need for further methodological improvements. These results demonstrate that mobile EEG can reliably track selective attention in dynamic, multisensory listening scenarios and provide guidance for designing future AV paradigms and real-world attention tracking applications.
Objective: Previous studies have demonstrated that the speech reception threshold (SRT) can be estimated using scalp electroencephalography (EEG), referred to as SRTneuro. The present study assesses the feasibility of using ear-EEG, which allows for discreet measurement of neural activity from in and around the ear, to estimate the SRTneuro. Approach: Twenty young normal-hearing participants listened to audiobook excerpts at varying signal-to-noise ratios (SNRs) whilst wearing a 66-channel EEG cap and 12 ear-EEG electrodes. A linear decoder was trained on different electrode configurations to estimate the envelope of the audio excerpts from the EEG recordings. The reconstruction accuracy was determined by calculating the Pearson's correlation between the actual and the estimated envelope. A sigmoid function was then fitted to the reconstruction-accuracy-vs-SNR data points, with the midpoint of the sigmoid serving as the SRTneuro estimate for each participant. Main results: Using only in-ear electrodes, the estimated SRTneuro was within 3 dB of the behaviorally measured SRT (SRTbeh) for 6 out of 20 participants (30%). With electrodes placed both in and around the ear, the SRTneuro was within 3 dB of the SRTbeh for 19 out of 20 participants (95%) and thus on par with the reference estimate obtained from full-scalp EEG. Using only electrodes in and around the ear from the right side of the head, the SRTneuro remained within 3 dB of the SRTbeh for 19 out of 20 participants.
Previous studies have demonstrated the feasibility of estimating the speech reception threshold (SRT) based on electroencephalography (EEG), termed SRTneuro, in younger normal-hearing (YNH) participants. This method may support speech perception in hearing-aid users through continuous adaptation of noise-reduction algorithms. The prevalence of hearing impairment and thereby hearing-aid use increases with age. The SRTneuro estimation is based on envelope reconstruction accuracy, which has also been shown to increase with age, possibly due to excitatory/inhibitory imbalance or recruitment of additional cortical regions. This could affect the estimated SRTneuro. This study investigated the age-related changes in the temporal response function (TRF) and the feasibility of SRTneuro estimation across age. Twenty YNH and 22 older normal-hearing (ONH) participants listened to audiobook excerpts at various signal-to-noise ratios (SNRs) while EEG was recorded using 66 scalp electrodes and 12 in-ear-EEG electrodes. A linear decoder reconstructed the speech envelope, and the Pearson's correlation was calculated between the reconstructed and speech-stimulus envelopes. A sigmoid function was fitted to the reconstruction-accuracy-versus-SNR data points, and the midpoint was used as the estimated SRTneuro. The results show that the SRTneuro can be estimated with similar precision in both age groups, whether using all scalp electrodes or only those in and around the ear. This consistency across age groups was observed despite physiological differences, with the ONH participants showing higher reconstruction accuracies and greater TRF amplitudes. Overall, these findings demonstrate the robustness of the SRTneuro method in older individuals and highlight its potential for applications in age-related hearing loss and hearing-aid technology.
Hearing aid (HA) users often experience increased listening effort, particularly in noisy environments. While noise reduction (NR) algorithms aim to alleviate this, traditional electroencephalography (EEG) methods based on power analysis have had limited success in assessing listening effort in this population. This study proposes a novel method using a whole-head synchronization map analysis that uses local connectivity, a measure of statistical dependencies within localized brain regions. For each EEG electrode, we define a region consisting of the surrounding electrodes in its first-order neighborhood. This approach was tested using EEG data from 22 HA users with active or inactive NR engaged in a continuous speech-in-noise (SiN) task at low (3 dB) and high (8 dB) signal-to-noise ratio (SNR) levels. Whole-head synchronization was quantified using circular omega complexity (COC), a multivariate phase synchrony measure. Results showed increased local connectivity in the alpha band (8–12 Hz) within frontal and occipital regions during the SiN condition compared to the background noise-only (NO) condition. Furthermore, NR activation impacted the synchronization map differently at the two SNRs of the experiment, with a greater effect observed at low SNR, primarily in the left parietal region and alpha band. This behavior is in line with that of existing measures for listening effort, and therefore suggests that EEG local connectivity analysis holds promise as a tool for objectively assessing listening effort in HA users, especially in challenging listening environments.
In this study, we investigate integrating eye tracking with auditory attention decoding (AAD) using portable EEG devices, specifically a mobile EEG cap and cEEGrid, in a preliminary analysis with a single participant. A novel audiovisual dataset was collected using a mobile EEG system designed to simulate real-life listening environments. Our study has two main objectives: (1) to use eye tracking data to automatically infer the labels of attended and unattended speech streams, and (2) to train an AAD model using these estimated labels, evaluating its performance through speech reconstruction accuracy. The results demonstrate the feasibility of using eye tracking data to estimate attended speech labels, which were then used to train speech reconstruction models. We validated our models with varying amounts of training data and a second dataset from the same participant to assess generalization. Additionally, we examined the impact of mislabeling on AAD accuracy. These findings provide preliminary evidence that eye tracking can be used to infer speech labels, offering a potential pathway for brain-controlled hearing aids, where true labels are unknown.
This study investigates the neural encoding of speech features in hearing aid users using electroencephalography (EEG) during a simulated cocktail party scenario. The objective was to investigate neural tracking of various acoustic and linguistic features and how hearing aid noise reduction influenced this tracking. The features analyzed included the acoustic envelope, phonetic features, word onset, and word surprisal, the latter derived from GPT-2. Temporal response functions (TRFs) were used to relate these features to EEG signals, revealing how the brain tracks attended (target) versus unattended (masker) speech. TRFs were estimated using a boosting algorithm, with speech features as predictors and EEG signals as responses. Results revealed a significant distinction between target and masker speech. The acoustic envelope showed the strongest correlation with EEG responses. Distinct tracking patterns were observed: the acoustic envelope and phonetic features correlated with early processing stages, while word onset and word surprisal were linked to later stages. Noise reduction further influenced the tracking of these features. These findings improve our understanding of how hearing aid users process speech and provide insight for developing hearing aids that adapt to individual neural responses.
Objective. This study aimed to investigate the potential of contrastive learning to improve auditory attention decoding (AAD) using electroencephalography (EEG) data in challenging cocktail-party scenarios with competing speech and background noise. Approach. Three different models were implemented for comparison: a baseline linear model (LM), a non-LM without contrastive learning (NLM), and a non-LM with contrastive learning (NLMwCL). The EEG data and speech envelopes were used to train these models. The NLMwCL model used SigLIP, a variant of CLIP loss, to embed the data. The speech envelopes were reconstructed from the models and compared with the attended and ignored speech envelopes to assess reconstruction accuracy, measured as the correlation between the reconstructed and actual speech envelopes. These reconstruction accuracies were then compared to classify attention. All models were evaluated in 34 listeners with hearing impairment. Results. The reconstruction accuracy for attended and ignored speech, along with attention classification accuracy, was calculated for each model across various time windows. The NLMwCL consistently outperformed the other models in both speech reconstruction and attention classification. For a 3-second time window, the NLMwCL model achieved a mean attended speech reconstruction accuracy of 0.105 and a mean attention classification accuracy of 68.0%, while the NLM model scored 0.096 and 64.4%, and the LM achieved 0.084 and 62.6%, respectively. Significance. These findings demonstrate the promise of contrastive learning in improving AAD and highlight the potential of EEG-based tools for clinical applications, and progress in hearing technology, particularly in the design of new neuro-steered signal processing algorithms.
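The classification step described in the abstract above (comparing the reconstruction's correlation with the attended and with the ignored envelope in each time window) can be illustrated with a minimal sketch. This is not the authors' pipeline: the synthetic envelopes, the 64 Hz envelope rate, the noise level, and the helper names `pearson` and `classify_windows` are all assumptions made for illustration, and the "reconstructed" signal is simply a noisy copy of one envelope rather than the output of a trained model.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def classify_windows(reconstructed, envelope_a, envelope_b, window):
    """Label each non-overlapping window 'A' or 'B' by the higher correlation."""
    labels = []
    for start in range(0, len(reconstructed) - window + 1, window):
        seg = slice(start, start + window)
        r_a = pearson(reconstructed[seg], envelope_a[seg])
        r_b = pearson(reconstructed[seg], envelope_b[seg])
        labels.append("A" if r_a > r_b else "B")
    return labels


# Synthetic stand-ins (assumption): the listener attends talker A, and the
# decoder output is modelled as a noisy copy of talker A's envelope.
rng = random.Random(0)
sample_rate = 64                 # hypothetical envelope sampling rate in Hz
n_samples = sample_rate * 60     # 60 s of data
envelope_a = [rng.random() for _ in range(n_samples)]
envelope_b = [rng.random() for _ in range(n_samples)]
reconstructed = [a + rng.gauss(0.0, 0.5) for a in envelope_a]

# 3-second decision windows, matching the window length quoted in the abstract.
labels = classify_windows(reconstructed, envelope_a, envelope_b,
                          window=sample_rate * 3)
accuracy = labels.count("A") / len(labels)
print(f"{len(labels)} windows, fraction labelled as talker A: {accuracy:.2f}")
```

Shorter windows give noisier correlation estimates, which is why classification accuracy is typically reported across a range of window lengths.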
Hearing impairment (HI) disrupts social interaction by hindering the ability to follow conversations in noisy environments. While hearing aids (HAs) with noise reduction (NR) partially address this, the “cocktail-party problem” persists, where individuals struggle to attend to specific voices amidst background noise. This study investigated how NR and an advanced signal processing method for compensating for nonlinearities in electroencephalography (EEG) signals can improve neural speech processing in HI listeners. Participants wore HAs with NR, either activated or deactivated, while focusing on target speech amidst competing masker speech and background noise. Analysis focused on temporal response functions to assess neural tracking of relevant target and masker speech. Results revealed enhanced neural responses (N1 and P2) to target speech, particularly in frontal and central scalp regions, when NR was activated. Additionally, a novel method compensated for nonlinearities in EEG data, leading to improved signal-to-noise ratio (SNR) and potentially revealing more precise neural tracking of relevant speech. This effect was most prominent in the left-frontal scalp region. Importantly, NR activation significantly improved the effectiveness of this method, leading to stronger responses and reduced variance in EEG data. This study provides valuable insights into the neural mechanisms underlying NR benefits and introduces a promising EEG analysis approach sensitive to NR effects, paving the way for potential improvements in HAs.
In this paper, we analyze examples of research institutes that stand out in scientific excellence and social impact. We define key practices for evaluating research results, economic conditions, and the selection of specific research topics. Special focus is placed on small countries and the field of artificial intelligence. The aim is to identify components that enable institutes to achieve a high level of innovation, self-sustainability, and social benefits.
Objective. Previous studies have demonstrated that the speech reception threshold (SRT) can be estimated using scalp electroencephalography (EEG), referred to as SRTneuro. The present study assesses the feasibility of using ear-EEG, which allows for discreet measurement of neural activity from in and around the ear, to estimate the SRTneuro. Approach. Twenty young normal-hearing participants listened to audiobook excerpts at varying signal-to-noise ratios (SNRs) whilst wearing a 66-channel EEG cap and 12 ear-EEG electrodes. A linear decoder was trained on different electrode configurations to estimate the envelope of the audio excerpts from the EEG recordings. The reconstruction accuracy was determined by calculating the Pearson’s correlation between the actual and the estimated envelope. A sigmoid function was then fitted to the reconstruction-accuracy-vs-SNR data points, with the midpoint of the sigmoid serving as the SRTneuro estimate for each participant. Main results. Using only in-ear electrodes, the estimated SRTneuro was within 3 dB of the behaviorally measured SRT (SRTbeh) for 6 out of 20 participants (30%). With electrodes placed both in and around the ear, the SRTneuro was within 3 dB of the SRTbeh for 19 out of 20 participants (95%) and thus on par with the reference estimate obtained from full-scalp EEG. Using only electrodes in and around the ear from the right side of the head, the SRTneuro remained within 3 dB of the SRTbeh for 19 out of 20 participants. Significance. These findings suggest that the SRTneuro can be reliably estimated using ear-EEG, especially when combining in-ear electrodes and around-the-ear electrodes. Such an estimate can be highly useful e.g. for continuously adjusting noise-reduction algorithms in hearing aids or for logging the SRT in the user’s natural environment.
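As a rough illustration of the sigmoid-midpoint step described in the abstract above, the sketch below fits a four-parameter sigmoid to reconstruction-accuracy-versus-SNR points and reads off the midpoint. The abstract does not state how the fit was performed, so everything here is an assumption: the data are synthetic, the floor and ceiling are fixed to the observed minimum and maximum, and a plain grid search stands in for whatever optimizer the authors used (a nonlinear least-squares routine such as `scipy.optimize.curve_fit` would be a common alternative).

```python
import math


def sigmoid(snr, midpoint, slope, floor, ceiling):
    """Sigmoid rising from `floor` to `ceiling`, centred on `midpoint` (dB)."""
    return floor + (ceiling - floor) / (1.0 + math.exp(-slope * (snr - midpoint)))


def fit_midpoint(snrs, accuracies):
    """Least-squares grid search over midpoint and slope.

    Simplifying assumption: floor and ceiling are fixed to the minimum and
    maximum observed accuracy instead of being fitted.
    """
    floor, ceiling = min(accuracies), max(accuracies)
    best = None
    for tenth in range(-200, 201):          # midpoints from -20 to 20 dB, 0.1 dB steps
        midpoint = tenth / 10
        for slope in (0.25, 0.5, 1.0, 2.0, 4.0):
            err = sum(
                (sigmoid(s, midpoint, slope, floor, ceiling) - a) ** 2
                for s, a in zip(snrs, accuracies)
            )
            if best is None or err < best[0]:
                best = (err, midpoint, slope)
    return best[1], best[2]


# Synthetic participant (assumption): true midpoint at -6 dB SNR.
snrs = [-12.5, -9.5, -6.5, -3.5, -0.5, 2.5]
accuracies = [sigmoid(s, -6.0, 1.0, 0.02, 0.15) for s in snrs]

est_midpoint, est_slope = fit_midpoint(snrs, accuracies)
print(f"estimated SRTneuro (sigmoid midpoint): {est_midpoint:.1f} dB SNR")
```

With real data the accuracies are noisy and the number of SNR points is small, so the midpoint estimate depends noticeably on how the floor, ceiling, and slope are constrained.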