Building high-quality annotated clinical corpora is necessary for developing statistical natural language processing (NLP) models to unlock information embedded in clinical text, but it is also time consuming and expensive. Consequently, it is important to identify factors that may affect annotation time, such as the syntactic complexity of the text to be annotated and the vagaries of individual user behavior. However, limited work has been done to understand annotation of clinical text. In this study, we aimed to investigate how factors inherent to the text affect annotation time for a named entity recognition (NER) task. We recruited 9 users to annotate a clinical corpus and recorded annotation time for each sample. We then defined a set of factors that we hypothesized might affect annotation time and entered them as predictors in a linear regression model of annotation time. The model achieved an R² of 0.611 and revealed eight time-associated factors, including characteristics of sentences, individual users, and annotation order, with implications for the practice of annotation and for the development of cost models for active learning research.

With the wide implementation of electronic health record (EHR) systems, the accumulation of clinical data, including unstructured textual documents, has become a valuable resource for clinical research and practice [1]. Clinical natural language processing (NLP) technologies, which can unlock information embedded in clinical text, have been extensively investigated [1]. Consequently, advanced NLP methods, various end-to-end NLP systems, and diverse NLP applications have been reported in this domain [2–4]. Although early clinical NLP systems were primarily rule-based, recent statistical NLP approaches (e.g., machine learning methods) have demonstrated superior performance on clinical information extraction tasks such as named entity recognition (NER) and relation extraction [5, 6]. Most of these machine learning-based approaches, however, rely on high-quality annotated clinical corpora. Manual annotation by clinical experts is both time consuming and expensive. Therefore, it is important to understand issues related to the annotation process, to ensure the quality and improve the efficiency of corpus development for both research and operational machine-learning purposes. Previous efforts to understand the challenges of annotation have focused on three considerations of this task: 1) how to annotate (e.g., guidelines directing the annotation process); 2) whose expertise is needed for annotation (e.g.
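The regression analysis mentioned above (predicting per-sample annotation time from a set of candidate factors and reporting R²) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the three features (sentence length in tokens, number of entities, annotation order), the timing values, and the function names are hypothetical and are not the factors, data, or code of the study. The fit uses ordinary least squares via the normal equations, which may differ from the estimation procedure the authors used.

```python
# Illustrative sketch only: feature names and numbers are invented,
# not taken from the study described in the text.

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so row operations apply to both sides at once.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(features, times):
    """Ordinary least squares with an intercept, via the normal equations."""
    rows = [[1.0] + list(f) for f in features]  # prepend intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, times)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefs, feature_row):
    """Predicted annotation time: intercept plus weighted sum of features."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], feature_row))


def r_squared(coefs, features, times):
    """Share of variance in annotation time explained by the fitted model."""
    mean_t = sum(times) / len(times)
    ss_res = sum((t - predict(coefs, f)) ** 2 for f, t in zip(features, times))
    ss_tot = sum((t - mean_t) ** 2 for t in times)
    return 1.0 - ss_res / ss_tot


# Hypothetical samples: (tokens in sentence, entities in sentence, annotation order)
features = [(8, 1, 1), (15, 2, 2), (22, 3, 3), (10, 0, 4),
            (30, 4, 5), (18, 1, 6), (25, 2, 7), (12, 1, 8)]
# Hypothetical annotation times in seconds for the samples above.
times = [9.5, 16.0, 24.5, 7.0, 29.0, 13.5, 19.0, 10.0]

coefs = fit_ols(features, times)
print("coefficients (intercept first):", [round(c, 3) for c in coefs])
print("R^2:", round(r_squared(coefs, features, times), 3))
```

In a sketch like this, the sign and size of each coefficient indicate how a factor is associated with annotation time, and R² summarizes how much of the variation in time the chosen factors account for. Categorical factors such as the individual user would need to be encoded (for example as indicator variables) before fitting.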