Paper accepted at the BIBM conference: DyBrainFormer, decoding dynamic brain semantics with a hierarchical Transformer for brain-multimedia association

The paper “DyBrainFormer: Decoding Dynamic Brain Semantics with Hierarchical Transformer for Brain-Multimedia Association” has been accepted at the BIBM 2025 conference.

Date: 2025-11-10

Keywords: brain imaging artificial intelligence, paper acceptance

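The research paper “DyBrainFormer: Decoding Dynamic Brain Semantics with Hierarchical Transformer for Brain-Multimedia Association” by PhD student Sigang Yu (喻四刚) was recently accepted by IEEE BIBM (Bioinformatics and Biomedicine), an international conference in the field of bioinformatics and biomedicine. The paper will be published in the BIBM workshop on Embodied Intelligence for Context-Aware Perception and Adaptive Reasoning in Human-Centered Healthcare. The corresponding author is Professor Shu Zhang (张枢).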

Exploring the association between high-level semantic brain responses and multimedia features is crucial for understanding the human semantic processing mechanism. However, a significant “semantic gap” persists between abstract brain representations captured by functional Magnetic Resonance Imaging (fMRI) and concrete multimedia features, remaining both unclear and challenging to quantify. To address this, we introduce DyBrainFormer, a novel Transformer-based Brain Dynamics Decoder for Brain-Multimedia Association. Inspired by the topological structure and dynamic properties of the human brain, DyBrainFormer uniquely integrates Graph Convolutional Networks (GCNs) and Hierarchical Temporal Transformer (HTT). It first encodes each sequenced dynamic brain graph using GCNs to capture spatial dependencies and derive brain temporal node attention. Subsequently, these temporal graph representations are fed into the HTT, which excels at modeling complex dynamic changes and long-range temporal dependencies within brain networks. The learned temporal weights from HTT serve as interpretable semantic descriptors, forming a quantifiable bridge that links high-level brain semantics to dynamic multimedia features. Evaluated on the Healthy Brain Network naturalistic fMRI dataset, DyBrainFormer effectively learns distinguishable brain dynamics, achieving ~83% classification accuracy in differentiating between children and adolescents. Our analysis further identifies distinct age-related patterns in semantic processing, demonstrating that children emphasize perceptual features while adolescents focus on higher-level conceptual elements. This work provides important references for bridging the semantic gap by establishing a robust and interpretable link between high-level semantic features and multimedia features, offering a novel perspective to uncover the human semantic understanding mechanism.
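The pipeline described in the abstract (encode each brain graph with a GCN, then model the resulting sequence with temporal attention and read out per-time-step weights) can be illustrated with a small NumPy sketch. This is not the authors' implementation. It assumes a single GCN layer, mean pooling over nodes, and one non-hierarchical self-attention layer standing in for the HTT. All function names, dimensions, and the random inputs are hypothetical and chosen only for illustration.

```python
import numpy as np


def normalize_adjacency(adj):
    """Symmetric GCN normalization: D^{-1/2} (A + I) D^{-1/2}."""
    a_hat = adj + np.eye(adj.shape[0])      # add self-loops, so every degree >= 1
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(adj, x, w):
    """One graph convolution with ReLU: relu(A_norm @ X @ W)."""
    return np.maximum(normalize_adjacency(adj) @ x @ w, 0.0)


def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def temporal_self_attention(h, wq, wk, wv):
    """Single-head scaled dot-product attention over the time axis."""
    q, k, v = h @ wq, h @ wk, h @ wv
    attn = softmax(q @ k.T / np.sqrt(k.shape[1]), axis=-1)  # shape (T, T)
    return attn @ v, attn


def decode_sequence(adjs, feats, params):
    """Encode each graph with the GCN, then attend across time steps."""
    # Mean-pool node embeddings into one vector per time step (assumed readout).
    graph_embs = np.stack([
        gcn_layer(a, x, params["w_gcn"]).mean(axis=0)
        for a, x in zip(adjs, feats)
    ])
    out, attn = temporal_self_attention(
        graph_embs, params["wq"], params["wk"], params["wv"]
    )
    # Average attention each time step receives; a stand-in for "temporal weights".
    temporal_weights = attn.mean(axis=0)
    return out, temporal_weights


# Hypothetical toy input: T graphs, N nodes, F input features, H hidden units.
rng = np.random.default_rng(0)
T, N, F, H = 6, 10, 4, 8
adjs = []
for _ in range(T):
    a = (rng.random((N, N)) > 0.7).astype(float)
    a = np.maximum(a, a.T)                   # make the graph undirected
    np.fill_diagonal(a, 0.0)
    adjs.append(a)
feats = rng.standard_normal((T, N, F))
params = {
    "w_gcn": rng.standard_normal((F, H)) * 0.1,
    "wq": rng.standard_normal((H, H)) * 0.1,
    "wk": rng.standard_normal((H, H)) * 0.1,
    "wv": rng.standard_normal((H, H)) * 0.1,
}

out, temporal_weights = decode_sequence(adjs, feats, params)
print(out.shape, temporal_weights.shape)
```

Because each row of the attention matrix sums to one, the averaged column weights form a distribution over time steps. That property is what would allow such weights to be compared against time-aligned multimedia features, although the paper's actual weighting scheme may differ.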

Figure 1: Framework of the paper.