


A research article entitled "Design and Evaluation of an
Automated Pediatric Acute Lymphoblastic Leukemia Registry from Clinical Data
Warehouses" has been published in BMC Medical Informatics and Decision
Making (2026). The journal publishes research on the development,
implementation, and evaluation of medical informatics, clinical decision-making
systems, and health information technologies. Our study presents a
comprehensive framework for constructing a high-quality, scalable pediatric
acute lymphoblastic leukemia (ALL) registry through automated extraction of electronic
medical record (EMR)-based real-world data (RWD) from clinical data warehouses
(CDWs).
The study developed PeARL (Pediatric ALL
Registry using eLectronic medical record-based RWD) by integrating standardized
mapping, multivariate transformations, and rule-based natural language
processing (NLP) to capture clinically relevant variables from two
university-affiliated tertiary-care hospitals in South Korea: Seoul National
University Hospital (SNUH) and the Catholic Medical Center (CMC). A total of
1,609 pediatric patients with ALL (663 from SNUH and 946 from CMC) were
included. Data quality was systematically evaluated using 228 rules across five
dimensions (completeness, validity, accuracy, uniqueness, and consistency).
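To illustrate what rule-based quality evaluation of this kind can look like, here is a minimal Python sketch. It is not the PeARL rule set: the record fields (patient_id, birth_date, diagnosis_date, sex), the sample values, and the four rules are hypothetical and chosen only for illustration. The accuracy dimension is omitted because it requires comparison against a reference source such as chart review.

```python
from collections import Counter
from datetime import date

# Hypothetical registry records; field names and values are illustrative only.
records = [
    {"patient_id": "P001", "birth_date": date(2015, 3, 1),
     "diagnosis_date": date(2019, 6, 10), "sex": "F"},
    {"patient_id": "P002", "birth_date": date(2016, 1, 5),
     "diagnosis_date": date(2014, 2, 2), "sex": "M"},
    {"patient_id": "P002", "birth_date": date(2016, 1, 5),
     "diagnosis_date": None, "sex": "X"},
]


def check_completeness(rec):
    """Completeness: the diagnosis date must be present."""
    return rec["diagnosis_date"] is not None


def check_validity(rec):
    """Validity: sex must come from an allowed code list (assumed here)."""
    return rec["sex"] in {"F", "M"}


def check_consistency(rec):
    """Consistency: diagnosis cannot precede birth (skipped if date missing)."""
    diagnosed = rec["diagnosis_date"]
    return diagnosed is None or diagnosed >= rec["birth_date"]


# Per-record rules, keyed by the quality dimension they belong to.
RECORD_RULES = {
    "completeness": check_completeness,
    "validity": check_validity,
    "consistency": check_consistency,
}


def find_duplicate_ids(recs):
    """Uniqueness: return patient IDs that appear more than once."""
    counts = Counter(rec["patient_id"] for rec in recs)
    return {pid for pid, n in counts.items() if n > 1}


def run_checks(recs):
    """Return a list of (record_index, dimension) pairs for every violation."""
    violations = []
    duplicate_ids = find_duplicate_ids(recs)
    for index, rec in enumerate(recs):
        for dimension, rule in RECORD_RULES.items():
            if not rule(rec):
                violations.append((index, dimension))
        if rec["patient_id"] in duplicate_ids:
            violations.append((index, "uniqueness"))
    return violations


for index, dimension in run_checks(records):
    print(f"record {index}: {dimension} violation")
```

Keeping each rule as a small named function tagged with its dimension makes it straightforward to report violations per dimension and to review the rules across sites.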
The proportion of data elements extracted automatically
reached 89.7% at SNUH and 75.0% at CMC, with most
elements processed through single-field transformations. After applying the
quality management process, the overall error rate decreased from 1.858% to
0.001% at SNUH and from 0.129% to 0.001% at CMC, corresponding to an estimated
reliability of 99%. Moreover, the registry demonstrated robust multicenter applicability
through standardized table specifications and cross-site review, despite
structural differences between institutional CDWs.
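The error-rate figures above can be put in perspective with a small Python sketch. The exact definition of error rate is not given in this summary, so the helper below assumes it is the share of failed rule evaluations among all rule evaluations; that definition and the function names are assumptions. The only numbers taken from the text are the reported before and after percentages.

```python
def error_rate_percent(failed_checks, total_checks):
    """Error rate as a percentage, assuming failed / total rule evaluations."""
    if total_checks <= 0:
        raise ValueError("total_checks must be positive")
    return 100.0 * failed_checks / total_checks


def relative_reduction_percent(before, after):
    """Relative drop from `before` to `after`, as a percentage of `before`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return 100.0 * (before - after) / before


# Reported overall error rates (%) before and after quality management.
reported = {"SNUH": (1.858, 0.001), "CMC": (0.129, 0.001)}

for site, (before, after) in reported.items():
    reduction = relative_reduction_percent(before, after)
    print(f"{site}: {before}% -> {after}% ({reduction:.2f}% relative reduction)")
```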
Overall, the study highlights the potential of a clinically guided, standardized framework as a reproducible and scalable solution for registry construction. By combining structured mapping, multivariate transformation, and rule-based NLP, the proposed approach offers a practical pathway for multicenter research and regulatory-grade real-world evidence generation.