서울대학교 CCADD 연구실

HOME

SEARCH

[NEWS]

The First Patent Was Granted

[NEWS]

CCADD joined KOSMI 2019

[NEWS]

CCADD members visited G-Epi to share research outcomes

[NEWS]

Prof. Lee and Yoomin Jeon at the EMNLP 2018 in Brussels, Belgium from October 31 through November 4

Prof. Howard Lee and Yoomin Jeon attended the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP 2018) held at the Square Meeting Center in Brussels from October 31^st through November 4^th, 2018. EMNLP is one of the top-tier conferences in the area of natural language processing organized by SIGDAT, the Association for Computational Linguistics (ACL) special interest group on linguistic data and corpus-based approaches to NLP. With the current artificial intelligence and machine learning boom from a variety of disciplines, this year's conference had a record-breaking number of submissions (2,231) and 2,500+ attendees, a 48% increase compared with EMNLP 2017. Of all the long and short papers submitted, only 549 papers were accepted, which can be found in the EMNLP proceedings. The conference included two days of workshops, followed by three days of main conference sessions including long and short presentations, tutorials, and demos. Researchers and practitioners from around the world gathered to share their latest research findings and emerging trends in machine learning for NLP, machine translation, language models, text mining, information extraction, and many more.

In the 'Writing Code for NLP Research' tutorial, researchers from Allen AI Institute shared their best practices learned from designing the development of the Allen NLP toolkit, a PyTorch-based open-source library for deep learning NLP research. It was an excellent tutorial to learn practical advice on writing research code and to recognize the importance of having good quality codes. In the 'Deep Latent Variable Models of Natural Language' tutorial, deep latent variable models were covered, which are emerging as a useful tool in NLP applications. Mainly, three types of latent variable models (discrete, continuous, and structured) and inference strategies (exact gradient, sampling, and conjugacy) over the latent variables were examined.

There were a number of presentations/papers in the clinical domain. One of the interesting papers, titled 'emrQA: A Large Corpus for Question Answering on Electronic Medical Records', proposed a novel semi-automated framework to generate domain-specific large-scale question answering (QA) corpus by leveraging expert annotations on clinical notes for various NLP tasks from the i2b2 challenge datasets. For medical QA on EMR, they released 400K+ question-answer evidence pairs and 1M question-logical forms, a representation useful for generating corpus. Generating patient-specific QA from an EMR is an important NLP task to process a natural language question and provide a human level accuracy answer. Among many datasets that were presented at the conference, a clinical domain dataset MedNLI was included. MedNLI is a Natural Language Inference (NLI) dataset on the medical history of patients annotated by physicians, which we can later refer in our research.

In a paper presented by Google Research, titled 'Self-Governing Neural Networks for On-Device Short Text Classification', researchers described on-device Self-Governing Neural Networks (SGNNs) that achieve state-of-the-art results in dialog related tasks. SGNN uses locality sensitive hashing, a technique for dimensionality reduction in data for clustering. As a result, the embedding layer for their on-device architecture has dropped to 300K parameters compared to other methods having millions of parameters. The proposed method is useful for text classification applications that significantly save storage cost and improve computational performance.

As newcomers to the field focusing on data-driven approaches to NLP, this conference was challenging but worthwhile to see cutting-edge research in NLP. We sure took some inspirations from the conference for our new research to develop a dimensionality reduction model for assessing clinical trials feasibility. We hope to attend the next year's EMNLP conference in Hong Kong!

Okay, below are two cuts of the so-called "Beauty and the Beast." Funny, isn't it?

[NEWS]

CCADD presence at Korean Society of Medical Informatics (KOSMI) conference.

[NEWS]

Prof. Howard Lee received a grant of $223,000 from NRF

[NEWS]

Four CCADD members attended the SNU-Hokkaido University joint symposium held in Sapporo, Japan

Four CCADD members attended the SNU-Hokkaido University joint symposium held in Sapporo, Japan. On the second day of the symposium, professors and students of the Graduate School of Convergence Science and Technology of Seoul National University met members of the Graduate School of Information Science and Technology of Hokkaido University and shared their research ideas, experiences, and results. The participants from the two universities introduced their research progress on a variety of topics, varying from computer-human interaction study to the development of new drug entities using machine learning technology.

Dr. Soohyun Kim of CCADD gave an oral presentation, where she discussed if therapeutic drug monitoring (TDM) of vancomycin is cost-effective in elderly patients. According to Dr. Kim, the cost-effectiveness of vancomycin in this population might not be so cost-effective as was previously anticipated, but could be cost-effective in patients admitted to the intensive care unit. In the following poster session, Dr. Jeong-An Gim presented the results of a pharmacoepigenomic research (Difference in DNA methylation and pharmacokinetics parameters of tacrolimus between two periods of a bioequivalence test with tacrolimus). Dr. Kim tried to provide a basis for tacrolimus pharmacoepigenomics and may provide a clue to the different dose of tacrolimus through his research. Yoomin Jeon, a PhD student at CCADD, introduced a new research project, entitled "A Dimensionality Reduction Model to Increase the Efficiency and Accuracy of Clinical Trial Feasibility Assessment Using Electronic Medical Records." Yoomin’s poster was associated with a new grant CCADD received recently from the National Research Foundation (PI: Prof. Howard Lee). Another PhD student, Siun Kim, presented his research with Dr. Soohuyn Kim about the validity of the equivalence margin in the phase 3 trial with CT-P13, an Infliximab biosimilar. He had an opportunity to discuss the implication and the future plan of the study with researchers of Hokkaido University.

Those attending the symposium from CCADD were able to meet with many young Japanese professionals and researchers in various fields. They look forward to seeing them again at a joint symposium at Seoul National University next year.

[RESEARCH]

Use of Artificial Intelligence for Clinical Drug Development

Artificial Intelligence (AI) is a discipline of computer science that studies the ways to mimic and reproduce human intelligence processes such as learning, knowledge representation, decision-making and reasoning by machines. Several AI-based approaches have been applied to drug discovery and development to increase the efficiency while reducing the time and cost, resulting in a mix of success and failure.

Clinical trials are an important research tool to determine the safety and efficacy of the drug in humans, which s an indispensable knowledge for physicians and patients for the best and optimal care. With the advent of digital health care technology, the fragmented nature of clinical data capture in the previous era is about to change and the way that clinical trials are performed can be also revolutionized. In particular, digital healthcare technology implemented in wearable devices and smart phones has enabled uninterrupted continuous data collection from patients in a real-life setting.

Furthermore, there has been a growing demand for customized, user-friendly digital healthcare platforms that serve each patient’s specialized needs to identify and locate a clinical trial [s]he may be eligible for. Patient recruitment is one of the most expensive, time-consuming, and inefficient steps in any clinical trials. What makes the matter worse is that eligibility assessment is manually conducted by humans (i.e., physicians or study coordinators) d on eyeballing of large amount of patient records and related information. To overcome this drawback, AI-based deep learning algorithms can help match eligible patients to a specific clinical trial and recruiting them into it. Therefore, AI can turn lengthy, laborious, and complex procedures of patient eligibility assessment into several quick and easy clicks on a machine-learning system.

Likewise, it is almost impossible for an average patient, who is lacking the domain knowledge, to find a list of potential clinical trials that [s]he might be eligible for without his/her physician’s help. A simpler, but more efficient, way may be to leverage the utility of AI to identify an appropriate list of clinical trials d on patient’s diagnosis, disease stage, severity and conditions, geographical location, and even personal preferences. Chatbot and the voice-activated assistant can play an important role too in this process.

CCADD has focused on the application of AI technology to clinical drug development, particularly the operational aspects of clinical trials such as eligibility assessment and patient recruit. With the advance of information and communication technology, AI-approach can increase the efficiency of patient recruitment process by providing automated eligibility screening. To analyze heterogeneous patient data, a variety of machine learning algorithms can be used to assess whether a certain patient meets the eligibility criteria, on his/her age, gender, stage of disease, medical history, and clinical conditions. Additionally, dynamic deep neural network can be also used to select clinical features from the electronic medical records (EMR) for eligibility screening. Topic modeling is another helpful tool.

To enhance EMR usage for clinical trials, the two most important clinical trials' resources - information on the potential pool and clinical trials being conducted (or have been conducted) - should be fully integrated. By the year 2018, CCADD has successfully developed a clinical trial resource integration system named 'AI-based Clinical Trial Resource Information System' or ACTRiS. What makes ACTRiS unique is its active employment of state-of-the-art user interface (UI) & user experience (UX) technologies, and implementation fo AI technologies. This integrated system will provide a sound basis to design clinical trials that are feasible and practical to perform.

CCADD is currently working on a research project named 'A Dimensionality Reduction Model to increase the efficiency and accuracy of clinical trial feasibility assessment using electronic medical records'. This research aimed to develop a machine learning-basedimensionality reduction model to select the discriminant subset of eligibility features, which

adequately returns a sufficient number of eligible patients. This algorithm has the potential to significantly increase the efficiency and performance standard of the traditional approach for patient eligibility screening, contributing to better and more economic conduct of clinical trials.

Also, since spring 2019, CCADD has participated in the three-year project named 'Development an AI-model to predict and evaluate drug-basedrug interactions (DDIs)'. In this project, CCADD is responsible for collecting and curating drug-food interaction information (DFI) from publicly available research papers. Now, we are developing NLP models that recognize name entities of drug and food and classify whether a sentence in an abstract of scientific literature contains a valid DFI information or not. Also, we plan to validate a developed system by verifying whether predicted drug-basedrug pairs as having DDI cause some meaningful change of safety and efficacy of victim drug using a Common Data Model (CDM) of SNUH.

[NEWS]

CCADD presence @KOSMI conference

Many of CCADD members have attended the Spring Conference of The Korean Society of Medical Informatics (KOSMI), held at Samsung Medical Comprehensive Cancer Center on June 14-15, 2018. Reflecting the recent drastic increase in the demand and importance of data science in health sciences, the theme of the conference was 'Evolving Data for Better Health'. More than 600 researchers, practitioners, and vendors in health informatics gathered together to share their knowledge, experience, results of research, and expertise through a series of keynote speeches, symposia, oral/poster presentations, and exhibitions.

Four abstracts by CCADD were accepted for poster presentation. Yoomin Jeon proposed a new approach to transform clinical trials eligibility criteria (EC, free-text) into structured data using the OMOP-CDM(Feasibility of Using the OMOP Common Data Model to Assess Clinical Trial Feasibility). She found that >75% of the EC could be mapped by the proposed approach, and the most important data group was condition.

Serin Lee, a student intern, gave her poster presentation entitled 'Dimensionality Reduction Model for Common Eligibility Criteria to effectively use Electronic Medical Record: A Pilot Study.' She proposed a dimensionality reduction model for common eligibility features (CEFs) based on ontology and semantic features of EC to reasonably compose the concept sets.

Dr. Soohyun Kim proposed a new type of command data model for clinical trials (Clinical Trial Common Data Model, CT-CDM) to support decision-making processes in clinical trials, entitled 'CDISC-based Common Data Model for Developing an Integrated Decision-Support System in Clinical Trials'. This command data model fully complies with the standards of Clinical Data Interchange Standards Consortium(CDISC) as a global, platform-independent data standard guaranteeing system interoperability.

Lastly, but not the least, Dr. Jeong-An Gim shared his research findings: Application of Machine Learning Technology to Classify Healthy Subjects into Different Pharmacokinetic Exposure Groups to Tacrolimus. Using the decision-tree approach employed in R, Dr. Gim aimed to identify SNPs that are associated with the exposure to tacrolimus. The rs776746 in the CYP3A5 gene was the most influential mutation.

More detailed research findings can be downloaded below.

Much to everyone's delighted surprise, Serin Lee won an external solid-state drive of 1TB at the raffle draw! Since Serin will graduate in August, the prize was certainly a nice present.

CCADD

Center for Convergence Approaches in Drug Development, Graduate School of Convergence Science and Technology, Seoul National University

Room C-208, 145 Gwanggyo-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do, 16229, SOUTH KOREA (Gwanggyo)
Room 406, Building 17, Seoul National University College of Medicine, 103 Daehak-ro, Jongno-gu, Seoul, SOUTH KOREA (Yeon-gun)

Tel: +82-31-888-9189 (Gwanggyo); +82-2-3668-7381 (Yeon-gun)

Fax: +82-31-888-9575

Email: ccadd.snu@gmail.com

KOREAN

ABOUT

PRINCIPAL INVESTIGATOR

PEOPLE

RESEARCH

PUBLICATIONS

NEWS

SEARCH

SEARCH

CCADD

Center for Convergence Approaches in Drug Development, Graduate School of Convergence Science and Technology, Seoul National University

Room C-208, 145 Gwanggyo-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do, 16229, SOUTH KOREA (Gwanggyo)
Room 406, Building 17, Seoul National University College of Medicine, 103 Daehak-ro, Jongno-gu, Seoul, SOUTH KOREA (Yeon-gun)

SEARCH

CCADD

Center for Convergence Approaches in Drug Development, Graduate School of Convergence Science and Technology, Seoul National University

Room C-208, 145 Gwanggyo-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do, 16229, SOUTH KOREA (Gwanggyo)Room 406, Building 17, Seoul National University College of Medicine, 103 Daehak-ro, Jongno-gu, Seoul, SOUTH KOREA (Yeon-gun)

Room C-208, 145 Gwanggyo-ro, Yeongtong-gu, Suwon-si, Gyeonggi-do, 16229, SOUTH KOREA (Gwanggyo)
Room 406, Building 17, Seoul National University College of Medicine, 103 Daehak-ro, Jongno-gu, Seoul, SOUTH KOREA (Yeon-gun)