Keywords: coronavirus disease; COVID-19; openEHR; archetype; template; knowledge modeling; clinical guidelines
The coronavirus disease (COVID-19) is a severe infectious disease that has been confirmed to spread by human-to-human transmission since December 2019 [[
Although most cases of the disease initially occurred in mainland China, other areas have also confirmed cases, and the number of cases continues to increase. The World Health Organization (WHO) has declared the 2019-20 coronavirus outbreak to be a Public Health Emergency of International Concern [[
Symptoms include fever, cough, and shortness of breath, with pneumonia, multi-organ failure, and death in the most severe cases. According to epidemiological investigation, the latent period ranges from 1 to 14 days and is typically between 3 and 7 days. Worse still, some patients may be asymptomatic at the beginning, so some cases go undetected [[
Considering the rapid spread of the disease, it is important to transfer knowledge of its diagnosis and treatment, together with newly updated research findings, especially from areas where the epidemic situation has improved, such as mainland China, where the epidemic began. Although some efforts have been made through teleconsultation and medical staff assistance, they remain limited by a lack of experts. Decision support tools are an efficient way to transfer the knowledge of experts to clinical practice. Epic (Epic Systems), a health care software company that provides electronic medical record software, has sent out an update to its customers to detect potential cases of COVID-19 [[
Many studies on terminology standardization have been conducted to improve interoperability. SNOMED International has issued an interim release to promote analysis with the most up-to-date terminology [[
An open, semantic-sharing, and collaborative-modeling framework is needed to meet dynamically changing data requirements. As a multilevel modeling framework, the openEHR specifications can be used to create standards and build information and interoperability solutions for health care [[
Our goal was to develop an openEHR template that promotes interoperability among clinical systems for the diagnosis and treatment of COVID-19. The remainder of this paper is organized as follows: the Methods section introduces the knowledge source and the methodology we used to develop and review the template, the Results section presents our results step by step according to the proposed methodology, and the Discussion section discusses the contributions and limitations of this paper.
Given that the outbreak of the disease happened within a short time frame, the available knowledge is limited. To ensure that the knowledge used was authoritative and credible, the Guideline for Diagnosis and Treatment of COVID-19 released by the National Health Commission of the People's Republic of China was adopted as the knowledge source. At present, the guideline has evolved to the 7th edition [[
Our method for developing an openEHR template for COVID-19 consists of six steps: collecting data items, organizing domain concepts, searching for corresponding archetypes, developing an openEHR template, reviewing the template, and releasing the template (see Figure 1).
Graph: Figure 1. The method of developing an openEHR template about COVID-19. CKM: Clinical Knowledge Manager; COVID-19: coronavirus disease.
In this step, data items related to diagnosis and treatment were extracted from sections 4-9, section 11, and part of section 10 of the guideline, and organized in Excel (Microsoft Corporation) in three columns. The first and second columns corresponded to the sections and subsections of the guideline, and the third column contained the data items extracted from each subsection. The extracted data items, originally in Chinese, were translated into English. Although the data items were extracted manually, two principles were followed to lower extraction bias and reduce errors.
First, the guideline is hierarchical and segmented, and this structure is itself knowledge about how data items are grouped. The extracted data items were therefore organized in the same hierarchical structure as the guideline, which not only lays the foundation for the later organization of domain concepts but also makes it easier for reviewers to trace each item back to its source when verifying the extraction.
Second, two members of our team extracted the data items independently. After the extraction, they exchanged and reviewed each other's results. Results acknowledged by both were included in the Excel file directly. Results acknowledged by only one of them were reviewed by a third member, who confirmed the final result. Results that were acknowledged by both but needed refinement were re-extracted by both team members.
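The reconciliation rules of the double extraction can be summarized in a short sketch. The function name, the enumeration, and the boolean flags below are hypothetical and only illustrate the three outcomes described above; the case in which neither extractor acknowledges an item is not described in our procedure and is treated here as an error by assumption.

```python
from enum import Enum


class Outcome(Enum):
    INCLUDE = "include in the Excel file directly"
    THIRD_REVIEW = "review by a third team member"
    RE_EXTRACT = "re-extract by both team members"


def reconcile(acknowledged_by_a: bool, acknowledged_by_b: bool,
              needs_refinement: bool = False) -> Outcome:
    """Decide how one extracted data item is handled after cross-review."""
    if acknowledged_by_a and acknowledged_by_b:
        # Both extractors agree; refine the item first if either asked for it.
        return Outcome.RE_EXTRACT if needs_refinement else Outcome.INCLUDE
    if acknowledged_by_a or acknowledged_by_b:
        # Only one extractor agrees; a third member confirms the final result.
        return Outcome.THIRD_REVIEW
    # Assumption: an item nobody acknowledges should never reach this step.
    raise ValueError("item was acknowledged by neither extractor")
```

In practice this logic was applied by hand to rows of the spreadsheet; the sketch only makes the decision table explicit.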
The organization of domain concepts is the basis for the development of archetypes and templates. Five steps were performed to organize domain concepts from the extracted data items.
- If data items from different subsections were semantically the same, they were merged into a single data item. For example, coagulopathy and blood coagulation disorder can be merged into blood coagulation disorder.
- If data items from different subsections belonged to the same domain concept, they were regrouped into a more suitable group that did not follow the sections or subsections. For example, symptoms such as fever and difficulties in breathing found in different sections were grouped together.
- Drawing on clinical decision support practice and input from participating clinicians, commonly encountered medical concepts were selected and organized as a supplement. For example, operations such as pneumonectomy or splenectomy are generally recorded during diagnosis and treatment, but such concepts are not mentioned in the guideline, so this step is a significant supplement to the knowledge extracted from it.
- All domain concepts were then organized into a tree structure according to the inherent correlations among them and represented as a mind map using XMind (XMind Ltd). An extracted data item can be either a data element itself or one value in the value set of a data element within a domain concept. For example, respiratory failure and blood coagulation disorder are each treated as a single value within the value set of diagnoses.
- Finally, the domain concepts were classified into three categories according to the stages of the clinical diagnosis and treatment process: Instruction, Evaluation, and Observation.
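The merging, regrouping, and classification steps above can be illustrated with a minimal sketch. The lookup tables are hand-written assumptions for this example only: the synonym pair and the symptom and diagnosis values come from the examples in the text, whereas the concept names, the assignment of concepts to categories, and the function name `organize` are hypothetical.

```python
from collections import defaultdict

# Assumed lookup tables, maintained by hand for this illustration.
SYNONYMS = {"coagulopathy": "blood coagulation disorder"}
CONCEPT_OF = {
    "fever": "Symptom",
    "difficulties in breathing": "Symptom",
    "respiratory failure": "Diagnosis",
    "blood coagulation disorder": "Diagnosis",
}
# Assumption: which category each concept belongs to is illustrative only.
CATEGORY_OF = {"Symptom": "Observation", "Diagnosis": "Evaluation"}


def organize(rows):
    """Turn (section, subsection, data item) rows into
    {category: {domain concept: sorted list of values}}."""
    tree = defaultdict(lambda: defaultdict(set))
    for _section, _subsection, item in rows:
        item = item.strip().lower()
        item = SYNONYMS.get(item, item)      # merge semantically equal items
        concept = CONCEPT_OF[item]           # regroup across subsections
        tree[CATEGORY_OF[concept]][concept].add(item)
    return {category: {concept: sorted(values)
                       for concept, values in concepts.items()}
            for category, concepts in tree.items()}
```

An item that is missing from `CONCEPT_OF` raises a `KeyError`, which in this sketch stands in for the manual decision a modeler would have to make for an unclassified item.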
To avoid developing archetypes repeatedly and to facilitate semantic interoperability, it is important to adopt existing archetypes. The openEHR Foundation provides a website called Clinical Knowledge Manager (CKM) [[
This step mainly consisted of searching the repository for archetypes with similar semantics. The names of domain concepts and data items were used as keywords to identify candidate archetypes. Because of polysemy and synonymy, extra manual work was needed to find the related archetypes. Some archetypes can be used directly, which means that the data elements can be represented exactly in these archetypes and there is no semantic difference between them.
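The keyword search with synonym expansion can be sketched as follows. This is not the CKM search interface: the catalogue is assumed to be a small, locally maintained dictionary of archetype names and keywords, and the entries, the synonym table, and the function name are hypothetical. Matching is a plain substring test, so the hits are only candidates that still need manual review.

```python
def search_archetypes(term, catalogue, synonyms):
    """Return the names of catalogue entries that mention the term
    or one of its known synonyms (case-insensitive substring match)."""
    term = term.lower()
    queries = {term, *(s.lower() for s in synonyms.get(term, []))}
    hits = []
    for name, keywords in catalogue.items():
        searchable = " ".join([name, *keywords]).lower()
        if any(query in searchable for query in queries):
            hits.append(name)
    return sorted(hits)


# Hypothetical local catalogue and synonym table for illustration.
catalogue = {
    "Problem/Diagnosis": ["diagnosis", "condition"],
    "Body temperature": ["temperature", "fever"],
    "Symptom/Sign": ["symptom", "sign", "complaint"],
}
synonyms = {"pyrexia": ["fever"]}
candidates = search_archetypes("pyrexia", catalogue, synonyms)
```

Expanding the query with synonyms addresses synonymy; polysemy cannot be resolved by string matching and remains a manual step.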
If no corresponding archetypes exist or existing archetypes cannot represent the data elements fully, developing new archetypes or extending existing archetypes is necessary according to the syntax of openEHR Archetype Definition Language [[
Once the required archetypes have been found or developed, the template can be built from them. This task can be performed with the support of Ocean Template Editor [[
A review process is necessary to achieve a high-quality template. Two aspects of the template were reviewed. First, the representation of the domain knowledge was reviewed, including the correctness of the semantics, the classification of data elements, and the logical structure of the archetypes. Second, the template was reviewed from an informatics perspective, including the data types of data elements and the relationships among archetypes. Two review phases were designed to achieve this goal.
- Internal review phase: the internal review group consisted of four persons: one who is familiar with the COVID-19 guideline, two who are developers of clinical data repositories (CDRs) and decision support tools, and one who is familiar with the openEHR specifications.
- Outreach review phase: two domain experts from outside the research team took part in this phase. One has expertise in openEHR modeling and the other has expertise in medical informatics.
After the review, the template and the used archetypes have been uploaded and shared in the Healthcare Modeling Collaboration [[
We designed a test case to verify the feasibility of the template. The case was conducted in a hospital in Wuhan that has already implemented an openEHR-based CDR and has admitted a large number of patients with COVID-19. The CDR in the hospital was built based on the solution we have proposed, which can be found in [[
The test case was designed to include two steps: (
Graph: Figure 2. The interaction diagram between CDR and decision support tool. CDR: clinical data repository; CDSS: clinical decision support system.
Based on the methodology described above, 203 data items were extracted from 8 sections and 15 subsections of the guideline released in China (see Multimedia Appendix 1). After these data items were classified and merged, 16 domain concepts (16 leaf nodes in the mind map) were organized for the diagnosis and treatment of COVID-19. The results of this step are illustrated in the mind map in Figure 3 (full results can be found in Multimedia Appendix 2).
Among these domain concepts, only 2 were classified into Instruction and 3 into Evaluation; Observation includes 11. A total of 22 archetypes have been developed to represent all data elements about COVID-19, and all of them can be referred to from the CKM directly. The archetypes found in the CKM, which are adapted to our requirements, are shown in Textbox 1. Finally, a template was developed by constraining these archetypes, as shown in Figure 4.
In addition, the template has been deployed in a hospital that has admitted many patients with COVID-19, to support data sharing between CDRs and clinical decision support systems (CDSS). Because the CDR is developed based on openEHR, the storage structure is consistent with the template. Although there exist many storage implementations of openEHR [[
In the end, the template (COVID-19 Pneumonia Diagnosis and Treatment [7th edition]) has been uploaded into the CKM [[
Graph: Figure 3. Domain concepts about COVID-19. COVID-19: coronavirus disease.
Graph: Textbox 1. Domain concepts and their archetypes found in the Clinical Knowledge Manager.
Graph: Figure 4. The developed template in Ocean Template Editor. COVID-19: coronavirus disease; CT: computed tomography; ESR: erythrocyte sedimentation rate; SARS: severe acute respiratory syndrome.
Graph: Figure 5. The data view of COVID-19 Diagnosis and Treatment CDSS. CDSS: clinical decision support system; COVID-19: coronavirus disease; ICU: intensive care unit; rRT-PCR: real time reverse transcription-polymerase chain reaction; WBC: white blood cell.
Graph: Textbox 2. The part of the content of data in representational state transfer application programming interface from clinical data repositories.
The openEHR template developed in our study covers the contents of the latest guideline related to clinical characteristics, diagnosis criteria, clinical classification, warning signs for severe and critical cases, differential diagnosis, diagnosis of suspected cases, treatment, and discharge. It could therefore be used for data exchange among systems in different clinical scenarios: screening patients in outpatient clinics, where the diagnosis of suspected cases is the main focus; routine ward rounds, where the diagnosis and the warning signs for severe or critical cases are more important; and the intensive care unit, where treatment recommendations are most necessary. Although some hospital information system vendors [[
Furthermore, the results of our study can be used for purposes other than the diagnosis and treatment of COVID-19. They can help to develop severity scales at different levels and can be used for risk assessment of COVID-19. They are also significant for the prevention and control of the disease in the community: questionnaires can be designed for people who are under closed management to monitor their physical condition.
COVID-19 is a new and threatening infectious disease that has put great pressure on medical systems around the world, with limited previous knowledge in the domain. Since the outbreak, the methods of diagnosis and treatment have been updated rapidly to reflect the latest research, which poses a challenge for data exchange among systems. The openEHR modeling approach meets this requirement well because multilevel modeling is especially suitable for evolving knowledge. In the openEHR ecosystem, when the knowledge of diagnosis and treatment is updated, only the template needs to be updated and the apps can remain unchanged. This enables the latest knowledge to be applied to clinical practice as quickly as possible.
In our study, once the latest guideline has been released, the new knowledge can be incorporated into the existing template according to the flowchart shown in Figure 6. Compared with starting from scratch as in the first round, only a few steps need to be performed when the knowledge has been updated, which is shown in the red box (Figure 6).
Graph: Figure 6. The updating process of the template. CKM: Clinical Knowledge Manager; COVID-19: coronavirus disease.
The purpose of the template needs to be refined before modeling because it largely affects the final results. First, although the purpose of the study is to develop a template for diagnosis and treatment, it still needs to be refined according to whether the template is for rule-based decision support tools only or for general decision support. For example, the guideline only mentions that pregnancy status may affect the intervention without specifying exact rules, so pregnancy status does not need to be modeled if the template serves only a rule-based decision support tool, although it is still useful information for professionals making a decision. Second, the refined purpose should also clarify whether the template is used for exchanging the original data from the electronic medical record (EMR) or the condition points for the final decision support. Data items extracted from the guideline usually do not exist in the CDR as such, so they need to be abstracted from the existing data. For instance, the guideline describes "Two consecutive negative nucleic acid tests using respiratory tract samples (taken at least 24 hours apart)" as a condition point for discharge, but this has to be calculated from two data items of the nucleic acid test.
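As a minimal sketch of how such a condition point could be derived from the original data, the function below takes nucleic acid test records and checks for two consecutive negative results at least 24 hours apart. The record layout (sampling time, result string) and the function name are assumptions, not the structure of the CDR. The sketch also assumes one particular reading of "consecutive": a run of negative results that is not interrupted by any other result and whose first and latest samples are at least 24 hours apart.

```python
from datetime import datetime, timedelta


def has_two_consecutive_negatives(tests, min_gap=timedelta(hours=24)):
    """tests: iterable of (sampled_at, result) for respiratory tract samples,
    where result is a string such as "negative" or "positive"."""
    run_start = None  # sampling time of the first negative in the current run
    for sampled_at, result in sorted(tests, key=lambda test: test[0]):
        if result != "negative":
            run_start = None          # any other result breaks the run
        elif run_start is None:
            run_start = sampled_at    # first negative of a new run
        elif sampled_at - run_start >= min_gap:
            return True               # second negative, far enough apart
    return False
```

A stricter reading that only compares directly adjacent tests would reject a sequence such as negative, negative 12 hours later, negative 30 hours later; which reading applies should be confirmed with clinicians before the rule is used.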
This study has used the guideline released by the National Health Commission of the People's Republic of China as the only knowledge source for modeling since the authoritative knowledge was limited at the beginning of the outbreak. However, with the ever-increasing research results and experiences of diagnosis and treatment, there are and will be more knowledge sources, such as the handbook developed by the First Affiliated Hospital, Zhejiang University School of Medicine jointly sponsored by the Jack Ma Foundation and Alibaba Foundation [[
Although the template has been reviewed and verified in our study, it still has limitations. First, because most experienced medical professionals were prioritizing the clinical care of patients with COVID-19, it was difficult to have the template reviewed by professionals. Second, because only a few cases have been conducted, some points may not be appropriate for specific cases (eg, a patient may need an extra intervention that is not represented in the template). Therefore, further case studies and reviews are necessary to improve the template.
The template can be easily deployed in an openEHR-based CDR, as shown in this study. However, not all institutions' systems are based on openEHR, which limits the use of the template in such scenarios. Nevertheless, since the infrastructure of openEHR has been designed to be compatible with other existing industry standards, the template can be easily transferred to other widely accepted industry standards such as JavaScript Object Notation (JSON), XML, and Health Level 7. Taking JSON as an example, the template can be expressed in this format with the support of the JSON schema [[
This paper developed and released an openEHR template based on the latest guideline for COVID-19 in China. Most of the archetypes used in the template can be covered by existing archetypes in the CKM. This study showed that the openEHR approach has advantages in modeling a new medical application field and in meeting the requirements of rapidly updating knowledge. By improving data exchange among applications, the template developed in this study could help transfer the experience and knowledge gained in China to other countries and regions as soon as possible to defeat COVID-19.
This study was funded by the Chinese National Science and Technology Major Project (grant number 2016YFC0901703).
None declared.
Multimedia Appendix 1
Data items extracted from the latest clinical guideline released by China.
XLSX File (Microsoft Excel File), 15 KB
Multimedia Appendix 2
The details of domain concepts presented in mind map.
PDF File (Adobe PDF File), 224 KB
Edited by G Eysenbach; submitted 14.05.20; peer-reviewed by S Kobayashi, I McNicoll, N Deng; comments to author 27.05.20; revised version received 03.06.20; accepted 03.06.20; published 10.06.20
By Mengyang Li; Heather Leslie; Bin Qi; Shan Nan; Hongshuo Feng; Hailing Cai; Xudong Lu and Huilong Duan