Task for AI 👩‍💻👨‍💻

The LUNA25: AI Study aims to evaluate the performance of modern state-of-the-art AI algorithms at lesion-level malignancy risk estimation of lung nodules in low-dose chest CT scans. The objective of the developed AI algorithms is to provide a lesion-level risk score, similar to the approach used in the PanCan model (McWilliams et al., 2013).

Organizers will provide baseline code that participating teams can use to develop AI algorithms. Participants are nevertheless encouraged to adapt these algorithms, implement more advanced AI models, explore alternative preprocessing techniques, and apply ensemble learning methods. More information on the baseline algorithms and how to get started with them can be found at Algorithm Submissions.

Evaluation 📊

AI algorithms will be evaluated with a receiver operating characteristic (ROC) curve analysis, and the area under the curve (AUC) will serve as the final metric. The AUC will be derived from the continuous malignancy risk scores, ranging from 0 to 100, that the AI algorithms provide. To compare the sensitivity and specificity of the AI algorithms, the AI malignancy risk score will be binarized at several clinically relevant operating points. However, sensitivity and specificity will not be used for the ranking on the public leaderboard.
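
As a rough illustration of the two quantities described above, the following minimal sketch computes an AUC from continuous risk scores and then binarizes the same scores at a single threshold to obtain sensitivity and specificity. It is not the official evaluation code. The function names, the toy labels and scores, and the threshold of 50 are all assumptions made for this example; the actual operating points are not specified here.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (malignant, benign) pairs in which the
    malignant nodule receives the higher score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one malignant and one benign nodule")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold):
    """Binarize scores (score >= threshold means 'called malignant')
    and return (sensitivity, specificity)."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: 1 = malignant, 0 = benign; scores on a 0-100 scale.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [92.0, 40.0, 75.0, 10.0, 55.0, 5.0, 30.0]

auc = roc_auc(labels, scores)  # 11 of 12 pairs ranked correctly
# The threshold of 50 is an arbitrary illustrative operating point.
sens, spec = sensitivity_specificity(labels, scores, threshold=50.0)
```

Because the AUC depends only on how the scores rank malignant nodules relative to benign ones, it is unaffected by the choice of threshold, whereas sensitivity and specificity change with every operating point.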

Performance evaluation utilities for lung nodule malignancy risk estimation in low-dose chest CT: https://github.com/DIAGNijmegen/luna25-evaluation-public