CCADD is breaking new ground in biomedical natural language processing (NLP)!
On January 22, 2021, CCADD was selected by the Ministry of Food and Drug Safety to develop an NLP model that extracts core clinical information from unstructured data in the Korea Adverse Event Reporting System (KAERS). The total research funding amounts to 200 million Korean won (approximately 180,000 U.S. dollars). The project will be carried out in collaboration with the Machine Intelligence Lab in the Department of Electrical and Computer Engineering, Seoul National University (Prof. Gyo-Min Jung).
To start, CCADD will build a manually annotated dataset, which is indispensable for training and refining any machine learning algorithm, including NLP models. Next, CCADD plans to build the KAERS-BERT model, an NLP model specifically tuned to handle adverse drug events. The KAERS-BERT model will be pre-trained on KAERS text data and then deployed to extract medical information from free text, such as hand-written records of adverse events. Once complete, the KAERS-BERT model is expected to resolve many of the language ambiguities caused by the mixed use of English and Korean in medical domains that are already peppered with technical and medical jargon. The model is also expected to significantly improve the performance of NLP tasks that extract information from clinical notes written in English and Korean.
Prof. Howard Lee, the PI of the grant, congratulated the team, saying, "The grant awarding is timely and important since it will help CCADD enter a new area of biomedical NLP." By the conclusion of this project, CCADD expects to have once again secured its place as the leading lab in convergence science spanning medicine, drug development, and machine learning, including NLP. "This certainly provides CCADD with a momentum to transform into a frontrunner in the field," Prof. Lee added.