Keiji AI

Our Research Publications

Explore our contributions to AI in clinical research, published in leading journals and conferences.

Benchmark

TrialPanorama: Developing Large Language Models for Clinical Research Using One Million Clinical Trials

Benchmark
Clinical trials
Systematic Review
Trial Design
Database

Source Preprint

Authors Zifeng Wang, Qiao Jin, Jiacheng Lin, Junyi Gao, Jathurshan Pradeepkumar, Pengcheng Jiang, Benjamin Danek, Zhiyong Lu, Jimeng Sun

Can Large Language Models Replace Data Scientists in Biomedical Research?

LLM
Data Science
Biomedical Research
AI Assistant
Benchmark

Source Preprint

Authors Zifeng Wang, Benjamin Danek, Ziwei Yang, Zheng Chen, Jimeng Sun

BIODSA-1K: Benchmarking Data Science Agents for Biomedical Research

Benchmark
Data Science
Biomedical Research
AI Agent
LLM

Source Preprint

Authors Zifeng Wang, Benjamin Danek, Jimeng Sun

Clinical Trial Foundation Model

Panacea: A foundation model for clinical trial search, summarization, design, and recruitment

LLM
Clinical trials
Foundation model

Source Preprint

Authors Jiacheng Lin, Hanwen Xu, Zifeng Wang, Sheng Wang, Jimeng Sun

Systematic Literature Review

Accelerating clinical evidence synthesis with large language models

LLM
Medical Literature Mining
Systematic Review

Source npj Digital Medicine

Authors Zifeng Wang, Lang Cao, Benjamin Danek, Qiao Jin, Zhiyong Lu, Jimeng Sun

A foundation model for human-AI collaboration in medical literature mining

LLM
Medical Literature Mining
Systematic Review
Human-AI Collaboration

Source Nature Communications

Authors Zifeng Wang, Lang Cao, Qiao Jin, Joey Chan, Nicholas Wan, Behdad Afzali, Hyun-Jin Cho, Chang-In Choi, Mehdi Emamverdi, Manjot K. Gill, Sun-Hyung Kim, Yijia Li, Yi Liu, Hanley Ong, Justin Rousseau, Irfan Sheikh, Jenny J. Wei, Ziyang Xu, Christopher M. Zallek, Kyungsang Kim, Yifan Peng, Zhiyong Lu, Jimeng Sun

Patient Recruitment

Matching patients to clinical trials with large language models

LLM
Prompting
Patient-Trial Matching

Source Nature Communications

Authors Qiao Jin, Zifeng Wang, Charalampos S. Floudas, Fangyuan Chen, Changlin Gong, Dara Bracken-Clarke, Elisabetta Xue, Yifan Yang, Jimeng Sun, Zhiyong Lu

COMPOSE: Cross-modal pseudo-siamese network for patient trial matching

Patient-Trial Matching
EHR

Source KDD'20

Authors Junyi Gao, Cao Xiao, Lucas M. Glass, Jimeng Sun

Doctor2Vec: Dynamic Doctor Representation Learning for Clinical Trial Recruitment

Patient-Trial Matching
EHR

Source AAAI'20

Authors Junyi Gao, Cao Xiao, Lucas M. Glass, Jimeng Sun

Trial Design

AutoTrial: Prompting Language Models for Clinical Trial Design

LLM
Instruction Tuning
Eligibility Criteria

Source EMNLP'23

Authors Zifeng Wang, Cao Xiao, Jimeng Sun

Trial2Vec: Zero-Shot Clinical Trial Document Similarity Search using Self-Supervision

Trial Search
Contrastive Learning
Dense Retrieval

Source EMNLP'22

Authors Zifeng Wang, Jimeng Sun

SPOT: Sequential Predictive Modeling of Clinical Trial Outcome with Meta-Learning

Trial Outcome
Sequential Learning

Source ACM-BCB'23

Authors Zifeng Wang, Jimeng Sun

HINT: Hierarchical Interaction Network for Clinical Trial Outcome Predictions

Trial Outcome
Graph Neural Network

Source Patterns

Authors Tianfan Fu, Kexin Huang, Cao Xiao, Lucas M. Glass, Jimeng Sun

Multimodal Model

BioBridge: Bridging Biomedical Foundation Models via Knowledge Graph

Foundation Model
Multimodal AI
Biomedical AI
Knowledge Graph

Source ICLR'24

Authors Zifeng Wang, Zichen Wang, Balasubramaniam Srinivasan, Vassilis N. Ioannidis, Huzefa Rangwala, Rishita Anubhai

MedCLIP: Contrastive Learning from Unpaired Medical Images and Text

Vision-Language Model
Multimodal AI
Clinical Note
X-Ray

Source EMNLP'22

Authors Zifeng Wang, Jimeng Sun

Clinical Predictive Model

UniPredict: Large Language Models are Universal Tabular Predictors

Tabular Learning
Patient Outcome
LLM
Instruction Tuning

Source Preprint

Authors Ruiyu Wang, Zifeng Wang, Jimeng Sun

MediTab: Scaling Medical Tabular Data Predictors via Data Consolidation, Enrichment, and Refinement

Tabular Learning
Patient Outcome
LLM
Data-Centric AI

Source IJCAI'24

Authors Zifeng Wang, Chufan Gao, Cao Xiao, Jimeng Sun

TransTab: Learning Transferable Tabular Transformers Across Tables

Tabular Learning
Patient Outcome
Transfer Learning

Source NeurIPS'22

Authors Zifeng Wang, Jimeng Sun

STAN: Spatio-Temporal Attention Network for Pandemic Prediction using Real-World Evidence

Pandemic Prediction
Graph Neural Network
Real-World Evidence

Source JAMIA'21

Authors Junyi Gao, Rakshith Sharma, Cheng Qian, Lucas M. Glass, Jeffrey Spaeder, Justin Romberg, Jimeng Sun, Cao Xiao

Evidence-driven spatiotemporal COVID-19 hospitalization prediction with Ising dynamics

Pandemic Prediction
Spatio-temporal Prediction

Source Nature Communications

Authors Junyi Gao, Joerg Heintz, Christina Mack, Lucas Glass, Adam Cross, Jimeng Sun

PopNet: Real-Time Population-Level Disease Prediction with Data Latency

Population Health Prediction
Graph Neural Network
Spatio-temporal Prediction

Source WWW'22

Authors Junyi Gao, Cao Xiao, Lucas M. Glass, Jimeng Sun

Improving medical machine learning models with generative balancing for equity and excellence

Predictive Modeling
Data Augmentation
Data Synthesis

Source npj Digital Medicine'25

Authors Brandon Theodorou, Benjamin Danek, Venkat Tummala, Shivam Pankaj Kumar, Bradley Malin, Jimeng Sun

Synthetic Patient Generation

TWIN: Personalized Clinical Trial Digital Twin Generation

Digital Twin
Variational Autoencoder

Source KDD'23

Authors Trisha Das, Zifeng Wang, Jimeng Sun

Synthesize high-dimensional longitudinal electronic health records via hierarchical autoregressive language model

Synthetic Data
EHR
Language Model
Longitudinal Data

Source Nature Communications

Authors Brandon Theodorou, Cao Xiao, Jimeng Sun