Journal of East China Normal University (Natural Science), 2023, Vol. 2023, Issue (4): 52-64. doi: 10.3969/j.issn.1000-5641.2023.04.006

• Computer Science •

Multimodal-based prediction model for acute kidney injury

Wei DENG, Fang ZHOU*

  1. School of Data Science and Engineering, East China Normal University, Shanghai 200062, China
  • Received: 2022-07-09 Online: 2023-07-25 Published: 2023-07-25
  • Contact: Fang ZHOU E-mail: fzhou@dase.ecnu.edu.cn
  • Supported by:
    National Natural Science Foundation of China (61902127)

Abstract:

Acute kidney injury is a clinical condition with high morbidity, and identifying potential patients early helps physicians intervene and reduce morbidity and mortality. In recent years, predicting a patient's potential health risks from electronic health records has attracted growing attention. Most existing models handle the sparsity and irregularity of physiological-variable data by aggregating the data or imputing missing values, but they ignore the patient health status implied by the missing information. Moreover, existing acute kidney injury prediction models do not consider the characteristics of the individual modalities or the correlations between them. To address these issues, we present a multimodal prediction model for acute kidney injury. The proposed model considers physiological variables, disease data, and demographic data. A new mask- and time-span-based long short-term memory (LSTM) network is designed to learn the time span and missing information of each physiological variable, capturing changes in both its values and its measurement frequency. A multi-head self-attention mechanism is introduced to promote mutual learning among the modality representations. Experiments on real-world data for acute kidney injury prediction and mortality risk prediction demonstrate the effectiveness and rationality of the proposed model.

Key words: acute kidney injury, electronic health records, multimodal, time series data classification
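
As a rough illustration of the "mask" and "time span" inputs mentioned in the abstract, the sketch below derives both from an irregularly sampled record. It is an assumption-laden example, not the authors' implementation: the function name `mask_and_time_gap`, the use of NaN to mark missing measurements, and the rule for accumulating gaps (the convention commonly used in missingness-aware recurrent models) are all hypothetical choices made here for illustration.

```python
import numpy as np


def mask_and_time_gap(timestamps, values):
    """Build missingness masks and per-variable time gaps (illustrative only).

    timestamps: shape (T,), observation times in any consistent unit.
    values:     shape (T, D), with np.nan marking a variable not measured.
    Returns (mask, delta), both of shape (T, D):
      mask[t, d]  is 1.0 if variable d was measured at step t, else 0.0
      delta[t, d] is the time elapsed since variable d was last measured
                  (counted from the first step if it has not been measured yet)
    """
    timestamps = np.asarray(timestamps, dtype=float)
    values = np.asarray(values, dtype=float)
    mask = (~np.isnan(values)).astype(float)
    delta = np.zeros_like(values)
    for t in range(1, len(timestamps)):
        step = timestamps[t] - timestamps[t - 1]
        # If the variable was measured at the previous step, the gap restarts
        # at `step`; otherwise the previous gap keeps accumulating.
        delta[t] = step + (1.0 - mask[t - 1]) * delta[t - 1]
    return mask, delta


# Hypothetical record: two variables measured at hours 0, 1, 3 and 6.
times = [0.0, 1.0, 3.0, 6.0]
record = [[1.0, np.nan], [np.nan, 2.0], [np.nan, np.nan], [4.0, 5.0]]
mask, delta = mask_and_time_gap(times, record)
```

Feeding `mask` and `delta` to a recurrent network alongside the raw values is one way a model could see both how a variable's value changes and how often it is measured; how the paper's LSTM actually consumes this information is not specified in the abstract.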

CLC Number: