Retinal Specialist versus Artificial Intelligence Detection of Retinal Fluid from OCT: Age-Related Eye Disease Study 2: 10-Year Follow-On Study.
PURPOSE: To evaluate the performance of retinal specialists in detecting retinal fluid presence in spectral domain OCT (SD-OCT) scans from eyes with age-related macular degeneration (AMD) and compare performance with an artificial intelligence algorithm.
DESIGN: Prospective comparison of retinal fluid grades from human retinal specialists and the Notal OCT Analyzer (NOA) on SD-OCT scans from 2 common devices.
PARTICIPANTS: A total of 1127 eyes of 651 Age-Related Eye Disease Study 2 10-year Follow-On Study (AREDS2-10Y) participants with SD-OCT scans graded by reading center graders (as the ground truth).
METHODS: The AREDS2-10Y investigators (retinal specialists) graded each SD-OCT scan for the presence or absence of intraretinal and subretinal fluid. Separately, the same scans were graded by the NOA.
MAIN OUTCOME MEASURES: Accuracy (primary), sensitivity, specificity, precision, and F1-score.
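The five outcome measures listed above are all standard functions of a 2x2 confusion matrix, with the reading center grade as the reference. The following is a minimal sketch of those definitions, not the study's analysis code; the function name `classification_metrics` and the example counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute binary classification metrics from confusion-matrix counts.

    tp: fluid present in the reference grade and detected by the grader
    fp: fluid absent in the reference grade but reported by the grader
    fn: fluid present in the reference grade but missed by the grader
    tn: fluid absent in the reference grade and not reported by the grader
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # also called recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)     # also called positive predictive value
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts for illustration only (not study data).
example = classification_metrics(tp=30, fp=5, fn=20, tn=45)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

The sketch assumes every denominator is nonzero; a production version would need to handle a grader who reports no positives, or a sample with no fluid-positive eyes.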
RESULTS: Of the 1127 eyes, retinal fluid was present in 32.8%. For detecting retinal fluid, the investigators had an accuracy of 0.805 (95% confidence interval [CI], 0.780-0.828), a sensitivity of 0.468 (95% CI, 0.416-0.520), and a specificity of 0.970 (95% CI, 0.955-0.981). The corresponding NOA metrics were 0.851 (95% CI, 0.829-0.871), 0.822 (95% CI, 0.779-0.859), and 0.865 (95% CI, 0.839-0.889), respectively. For detecting intraretinal fluid, the investigator accuracy, sensitivity, and specificity were 0.815 (95% CI, 0.792-0.837), 0.403 (95% CI, 0.349-0.459), and 0.978 (95% CI, 0.966-0.987); the NOA metrics were 0.877 (95% CI, 0.857-0.896), 0.763 (95% CI, 0.713-0.808), and 0.922 (95% CI, 0.902-0.940), respectively. For detecting subretinal fluid, the investigator accuracy, sensitivity, and specificity were 0.946 (95% CI, 0.931-0.958), 0.583 (95% CI, 0.471-0.690), and 0.973 (95% CI, 0.962-0.982); the NOA metrics were 0.863 (95% CI, 0.842-0.882), 0.940 (95% CI, 0.867-0.980), and 0.857 (95% CI, 0.835-0.877), respectively.
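The reported figures for any retinal fluid can be checked for internal consistency, because accuracy is a prevalence-weighted average of sensitivity and specificity. The sketch below applies that identity to the values reported above. It assumes the 32.8% prevalence refers to the reading center reference grade; the helper name `accuracy_from_rates` is hypothetical.

```python
def accuracy_from_rates(sensitivity, specificity, prevalence):
    """Accuracy implied by sensitivity, specificity, and disease prevalence.

    accuracy = sensitivity * prevalence + specificity * (1 - prevalence)
    """
    return sensitivity * prevalence + specificity * (1.0 - prevalence)


# Prevalence of any retinal fluid, as reported (assumed to be the reference grade).
PREVALENCE = 0.328

# (sensitivity, specificity, reported accuracy) for any retinal fluid.
reported = {
    "investigators": (0.468, 0.970, 0.805),
    "NOA": (0.822, 0.865, 0.851),
}

for grader, (sens, spec, acc) in reported.items():
    implied = accuracy_from_rates(sens, spec, PREVALENCE)
    print(f"{grader}: implied accuracy {implied:.3f}, reported {acc:.3f}")

# investigators: implied accuracy 0.805, reported 0.805
# NOA: implied accuracy 0.851, reported 0.851
```

The same identity shows why the investigators' accuracy stays fairly high despite low sensitivity: about two thirds of eyes had no fluid, so the high specificity dominates the weighted average.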
CONCLUSIONS: In this large and challenging sample of SD-OCT scans obtained with 2 common devices, retinal specialists had imperfect accuracy and low sensitivity in detecting retinal fluid. This was particularly true for intraretinal fluid and difficult cases (with lower fluid volumes appearing on fewer B-scans). Artificial intelligence-based detection achieved a higher level of accuracy. This software tool could assist physicians in detecting retinal fluid, which is important for diagnostic, re-treatment, and prognostic tasks.