Published Research & Validation Studies

Peer-reviewed evidence demonstrating our AI's clinical accuracy and real-world impact.

Retina Society Abstract 2025

Feasibility of AI-Assisted Screening for Clinical Trials in Wet Age-Related Macular Degeneration

Authors: Louis Cai¹, Amr Dessouki¹, Riya Fukui², Allen Ho³, Jason Hsu³, Richard Kaiser³, Carl Regillo³, Meera Sivalingam³, William Xie², David Xu³, Yoshihiro Yonekawa³, Ajay Kuriyan³

¹ Retinal Diagnostic Center, Campbell, California, USA
² Cosign AI, San Francisco, California, USA
³ Department of Ophthalmology, Wills Eye Hospital, Thomas Jefferson University, Philadelphia, Pennsylvania, USA

Purpose

This study evaluated the feasibility and accuracy of using artificial intelligence (AI) language agents for initial patient screening in clinical trials for neovascular age-related macular degeneration (nAMD). We assessed the ability of AI agents to identify eligible candidates for three retinal trials: LUGANO/LUCIA (EyePoint), ASCENT (REGENXBIO), and CONSTANCE (Genentech).

Methods

We developed an AI-based screening system using large language models to review 728 consecutive patient records (1456 eyes) from six retina specialists against each trial's eligibility criteria. Criteria that could not be assessed from the record, such as willingness to participate, were excluded. The system was validated on 226 patients (31%) and tested on the remaining 502 patients (69%). Experts independently assessed trial eligibility to provide reference labels. Performance was assessed by sensitivity, specificity, and time to eligibility determination. Biases and misclassification patterns were also analyzed.
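For readers less familiar with the performance measures named above, the following is a minimal sketch of how sensitivity, specificity, and accuracy can be computed when AI eligibility calls are compared against expert calls. It is an illustration only, not the study's code: the function name `screening_metrics`, the `ScreeningMetrics` container, and the toy labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ScreeningMetrics:
    sensitivity: float  # share of expert-eligible cases the AI also flagged
    specificity: float  # share of expert-ineligible cases the AI also rejected
    accuracy: float     # share of all cases where AI and expert agree


def screening_metrics(ai_eligible, expert_eligible):
    """Compare AI eligibility calls with expert calls (treated as the reference)."""
    if len(ai_eligible) != len(expert_eligible):
        raise ValueError("label lists must have the same length")
    pairs = list(zip(ai_eligible, expert_eligible))
    tp = sum(1 for ai, ex in pairs if ai and ex)          # both say eligible
    tn = sum(1 for ai, ex in pairs if not ai and not ex)  # both say ineligible
    fp = sum(1 for ai, ex in pairs if ai and not ex)      # AI eligible, expert not
    fn = sum(1 for ai, ex in pairs if not ai and ex)      # AI missed an eligible case
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one eligible and one ineligible reference label")
    return ScreeningMetrics(
        sensitivity=tp / (tp + fn),
        specificity=tn / (tn + fp),
        accuracy=(tp + tn) / len(pairs),
    )


# Made-up toy labels for ten screening decisions; these are NOT study data.
toy_ai = [True, False, False, True, False, False, False, False, True, False]
toy_expert = [True, True, False, False, False, False, False, False, True, False]
toy = screening_metrics(toy_ai, toy_expert)
print(toy)
```

Because eligible patients are typically a small minority in trial screening, accuracy alone can look high even when eligible patients are missed, which is why sensitivity and specificity are reported separately.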

Results

nAMD was diagnosed in 168 patients (23.1%), with only 4 patients (2.4%) being treatment-naive. On the test set of 502 patients, the AI system showed 97.1% overall accuracy, 82.5% sensitivity, and 97.6% specificity. It performed best on CONSTANCE (accuracy: 98.8%, sensitivity: 100%, specificity: 98.8%), followed by ASCENT (accuracy: 96.6%, sensitivity: 83.3%, specificity: 97.3%), and LUGANO (accuracy: 95.8%, sensitivity: 80.7%, specificity: 96.8%). The system spent an average of 167 seconds per patient. Documentation analysis showed increased misclassifications for one retina specialist’s notes (OR = 0.967, 95% CI: 0.937–0.998). Error analysis revealed common issues: missing external context (25.6%), medical reasoning mistakes (20.5%), and expert labeling errors (20.5%). Correcting labeling errors improved performance (accuracy: 97.6%, sensitivity: 89.3%, specificity: 97.9%).
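The abstract does not report how many screening decisions were labeled eligible, but the reported figures can be cross-checked with the identity accuracy = sensitivity × p + specificity × (1 − p), where p is the fraction of decisions labeled eligible. The sketch below solves that identity for p using the rounded percentages above. It is a back-of-the-envelope consistency check under the assumption that each set of three figures was computed over the same pool of decisions; the function name `implied_eligible_fraction` is hypothetical, and the outputs are rough estimates, not study results.

```python
def implied_eligible_fraction(accuracy, sensitivity, specificity):
    """Solve accuracy = sens * p + spec * (1 - p) for p, the eligible fraction."""
    if specificity == sensitivity:
        raise ValueError("p is not identifiable when sensitivity equals specificity")
    return (specificity - accuracy) / (specificity - sensitivity)


# Rounded test-set figures as reported in the Results paragraph:
# (accuracy, sensitivity, specificity).
reported = {
    "overall": (0.971, 0.825, 0.976),
    "ASCENT": (0.966, 0.833, 0.973),
    "LUGANO": (0.958, 0.807, 0.968),
}

implied = {name: implied_eligible_fraction(*vals) for name, vals in reported.items()}
for name, p in implied.items():
    # Inputs are rounded to 0.1 percentage points, so treat p as approximate.
    print(f"{name}: roughly {p:.1%} of decisions implied eligible")
```

Under these assumptions the implied eligible fraction comes out at a few percent, which is consistent with accuracy tracking specificity so closely. CONSTANCE is left out because its reported accuracy and specificity coincide after rounding, so the identity cannot resolve a nonzero fraction for it.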

Conclusions

AI-assisted screening is a feasible tool for automating patient eligibility assessments in wet AMD clinical trials. It can be run simultaneously across all patients and trials, which could substantially reduce the burden of manual chart review. Future work should address reasoning mistakes, incorporate external context (e.g., image processing), and validate the system in real-world clinical trial workflows.