This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on https://medinform.jmir.org/, as well as this copyright and license information must be included.
Medical research and machine learning for health care depend on high-quality data. Electronic data capture (EDC) systems have been widely adopted for metadata-driven digital data collection. However, many systems use proprietary and incompatible formats that inhibit clinical data exchange and metadata reuse. In addition, the configuration and financial requirements of typical EDC systems frequently prevent small-scale studies from benefiting from their inherent advantages.
The aim of this study is to develop and publish an open-source EDC system that addresses these issues. We aim to design a system that is applicable to a wide range of research projects.
We conducted a literature-based requirements analysis to identify the academic and regulatory demands for digital data collection. After designing and implementing OpenEDC, we performed a usability evaluation to obtain feedback from users.
We identified 20 frequently stated requirements for EDC. According to the International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) 25010 norm, we categorized the requirements into functional suitability, availability, compatibility, usability, and security. We developed OpenEDC based on the regulatory-compliant Clinical Data Interchange Standards Consortium Operational Data Model (CDISC ODM) standard. Mobile device support enables the collection of patient-reported outcomes. OpenEDC is publicly available and released under the MIT open-source license.
Adopting an established standard without modifications supports metadata reuse and clinical data exchange, but it limits item layouts. OpenEDC is a stand-alone web app that can be used without a setup or configuration. This should foster compatibility between medical research and open science. OpenEDC is targeted at observational and translational research studies by clinicians.
High-quality data are crucial for obtaining medical research results [
Data exchange and data compatibility are two of the most important concerns in medical research. However, proprietary or customized data formats used by EDC vendors make both difficult to achieve. As a result, incompatible electronic case report form (eCRF) data structures impede data integration and analysis across different sources and, hence, keep the captured information from reaching its full potential [
In this study, we describe the development process of OpenEDC to address the aforementioned issues. OpenEDC is an EDC system based on the results of a systematic requirements analysis. To the best of our knowledge, this study makes two unique contributions. First, OpenEDC is entirely based on the regulatory-compliant and internationally accepted Clinical Data Interchange Standards Consortium (CDISC) Operational Data Model (ODM) standard [
The remainder of this paper is structured as follows: the Methods section outlines the requirements analysis and evaluation process of OpenEDC. The Results section gives an overview of the identified requirements, the resulting software, and its evaluation outcomes. The contributions, limitations, and future work are discussed in the Discussion section.
OpenEDC was developed within the context of a large-scale medical register project for chronic diseases. For an intended period of more than 10 years, most German university hospitals were to collect patient-reported outcomes and medical routine data with tablet and desktop computers. During the system selection process, however, the shortcomings of the existing EDC systems became apparent. Given the register’s long-term nature, an ideal system would be open source so that it could be maintained in the future without manufacturer dependency or uncertain licensing conditions. Being open source would also reduce the risk of unaffordable expenses once the funding of the register expires. In addition, standardized metadata import was requested because most of our eCRFs were already available in the standardized CDISC ODM format. This would allow us to use these forms without time-consuming and error-prone manual transcription. A standardized system would also allow us to export metadata or captured clinical data in a reusable, interoperable, and nonproprietary format in the future. Finally, easy-to-use and network-independent support for mobile devices was necessary for data collection at the participating sites.
In addition to the project-specific demands, we performed a literature-based requirements analysis to ensure the applicability of OpenEDC in a wide range of research projects. This analysis included the following three steps: first, a literature search revealed the EDC requirements stated by both academics and public bodies. Keywords for searching in the academic repositories PubMed and ScienceDirect were
Main requirements and subrequirements of OpenEDC. Subrequirements are based on commonly stated electronic data capture requirements in the literature. The main categorizing requirements and their definitions originate from the ISO/IEC 25010 norm [
Requirement | Definition | Subrequirements |
Functional suitability | Product or system provides functions that meet stated and implied needs when used under specified conditions | Design [ Capture and store clinical data [ Form completion tracking [ Field validations (edit checks) [ Conditional fields (skip patterns) [ Multicentric (multisite) studies [ Longitudinal studies (with defined events) [ Multilingual forms [ |
Availability | System, product, or component is operational and accessible when required for use | Open source [ Minimal setup and configuration [ Distributed (near) real-time access [ Cross-platform (mobile device support) [ Offline-capable [ |
Compatibility | Product, system, or component can exchange information with other products, systems, or components | Standard-compliant import and export of metadata and clinical data [ Semantic annotation (medical coding) of items [ |
Usability | Product or system can be used by specified users to achieve specified goals with effectiveness, efficiency, and satisfaction in a specified context of use | Ease of use (user-friendly) [ Medical staff and patient accessibility [ |
Security | Product or system protects information and data so that persons or other products or systems have the degree of data access appropriate to their types and levels of authorization | Authentication and authorization (user rights and roles) [ Encrypted data storage and transmission [ Audit trail [ |
An iterative waterfall model was used to implement the identified requirements [
OpenEDC was evaluated against the identified requirements. However, whereas most of these requirements can be evaluated qualitatively in an absolute sense, that is, achieved or not achieved, usability is perceived subjectively and is difficult to generalize [
Before the actual task, a video was shown to the users to explain the main functionalities of OpenEDC. This video is openly available and can be consulted by prospective users as well [
A fundamental requirement for EDC systems is the support for metadata design [
Availability is frequently stated as an important property or, if absent, the reason for the limited dissemination of EDC systems [
Standardized data formats and coding of data elements can foster data compatibility [
It was reported that “the lack of a simple, intuitive, and user-friendly EDC system is noteworthy” [
Regulatory bodies frequently address the data protection and privacy measures of computerized systems in clinical trials. The General Data Protection Regulation (GDPR) of the European Union, for example, became enforceable in all European Union member states in May 2018 [
OpenEDC [
The CDISC ODM provides the groundwork for achieving the following system requirements. From the metadata perspective, events are at the highest hierarchical level with subordinate forms to allow the representation of longitudinal studies. Descriptive or interrogative texts can be defined as multiple translations for multilingual projects. Most frequently, these texts are assigned to data items for which data are to be collected. Data items have data types and may also have specified value ranges to enable real-time field validation. Moreover, items can be dynamically hidden to support conditional fields. Item definitions can be further referenced and reused in other locations. To complement the data schema for the remaining
All of the specifications were implemented and internally used by OpenEDC. This results in fully standard-compliant imports and exports of both metadata and clinical research data. In addition, the CDISC ODM enables the annotation of data items with an arbitrary number of semantic codes. These features constitute the
The
We implemented a simple user interface to address the
User interface of the metadata design mode. The hierarchical order of metadata elements is represented by the centered column view (1). By means of a referencing system, electronic case report forms (eCRFs) can be reused entirely or partially (2). The language of eCRFs can be changed with the drop-down at the top left (3).
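The hierarchy and referencing system shown in this view can be illustrated with a small script. The following is a minimal sketch, not OpenEDC's actual code: it assumes the ODM 1.3 XML namespace, and every OID, name, and question text in the snippet is invented for illustration. It resolves the references from events to forms, item groups, and items, and selects the question text for a chosen language.

```python
import xml.etree.ElementTree as ET

# Assumption: ODM 1.3 namespace. All OIDs and texts below are hypothetical.
NS = {"odm": "http://www.cdisc.org/ns/odm/v1.3"}
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

ODM_SNIPPET = """<ODM xmlns="http://www.cdisc.org/ns/odm/v1.3">
  <Study OID="S.1">
    <MetaDataVersion OID="MDV.1" Name="Example">
      <StudyEventDef OID="SE.BASELINE" Name="Baseline" Repeating="No" Type="Scheduled">
        <FormRef FormOID="F.VITALS" Mandatory="Yes"/>
      </StudyEventDef>
      <FormDef OID="F.VITALS" Name="Vital signs" Repeating="No">
        <ItemGroupRef ItemGroupOID="IG.VITALS" Mandatory="Yes"/>
      </FormDef>
      <ItemGroupDef OID="IG.VITALS" Name="Measurements" Repeating="No">
        <ItemRef ItemOID="I.WEIGHT" Mandatory="Yes"/>
      </ItemGroupDef>
      <ItemDef OID="I.WEIGHT" Name="Weight" DataType="float">
        <Question>
          <TranslatedText xml:lang="en">Body weight (kg)</TranslatedText>
          <TranslatedText xml:lang="de">Körpergewicht (kg)</TranslatedText>
        </Question>
      </ItemDef>
    </MetaDataVersion>
  </Study>
</ODM>"""


def question_text(item_def, lang):
    """Return the question text of an item in the requested language, if any."""
    for text in item_def.findall("odm:Question/odm:TranslatedText", NS):
        if text.get(XML_LANG) == lang:
            return text.text
    return None


def walk(root, lang="en"):
    """List (event, form, item group, question) tuples by resolving OID references."""
    mdv = root.find("odm:Study/odm:MetaDataVersion", NS)

    def by_oid(tag):
        return {el.get("OID"): el for el in mdv.findall(f"odm:{tag}", NS)}

    forms, groups, items = by_oid("FormDef"), by_oid("ItemGroupDef"), by_oid("ItemDef")
    rows = []
    for event in mdv.findall("odm:StudyEventDef", NS):
        for form_ref in event.findall("odm:FormRef", NS):
            form = forms[form_ref.get("FormOID")]
            for group_ref in form.findall("odm:ItemGroupRef", NS):
                group = groups[group_ref.get("ItemGroupOID")]
                for item_ref in group.findall("odm:ItemRef", NS):
                    item = items[item_ref.get("ItemOID")]
                    rows.append((event.get("Name"), form.get("Name"),
                                 group.get("Name"), question_text(item, lang)))
    return rows


for row in walk(ET.fromstring(ODM_SNIPPET), lang="de"):
    print(" > ".join(row))
```

Because definitions are looked up by OID rather than nested, the same form or item definition can be referenced from several places, which is what makes partial or complete reuse of eCRFs possible.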
User interface of the clinical data capture mode. Subjects can be managed with the left column where an audit trail can be accessed as well (4). Filled or empty circles in the 2 center columns indicate whether an event or form has been completed (5). A survey view button within the right electronic case report form column switches to a mode for patient-reported outcomes (6).
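During data capture, the value ranges attached to data items can drive real-time field validation. The sketch below shows one way this could work, assuming ODM 1.3 `RangeCheck` elements with the comparators `LT`, `LE`, `GT`, `GE`, `EQ`, and `NE`. The item definition and its limits are hypothetical, and the function is an illustration rather than OpenEDC's implementation.

```python
import operator
import xml.etree.ElementTree as ET

# Assumption: ODM 1.3 namespace; the item and its limits are invented.
NS = {"odm": "http://www.cdisc.org/ns/odm/v1.3"}

# Numeric comparators defined for ODM range checks (IN and NOTIN are omitted here).
COMPARATORS = {
    "LT": operator.lt, "LE": operator.le,
    "GT": operator.gt, "GE": operator.ge,
    "EQ": operator.eq, "NE": operator.ne,
}

ITEM_DEF = """<ItemDef xmlns="http://www.cdisc.org/ns/odm/v1.3"
                       OID="I.WEIGHT" Name="Weight" DataType="float">
  <RangeCheck Comparator="GE" SoftHard="Hard"><CheckValue>1</CheckValue></RangeCheck>
  <RangeCheck Comparator="LE" SoftHard="Hard"><CheckValue>500</CheckValue></RangeCheck>
</ItemDef>"""


def validate(item_def, raw_value):
    """Return the violated checks for an entered value (empty list means valid)."""
    try:
        value = float(raw_value)
    except ValueError:
        return ["not a number"]
    violations = []
    for check in item_def.findall("odm:RangeCheck", NS):
        comparator = check.get("Comparator")
        limit = float(check.findtext("odm:CheckValue", namespaces=NS))
        if not COMPARATORS[comparator](value, limit):
            violations.append(f"{comparator} {limit:g}")
    return violations


item = ET.fromstring(ITEM_DEF)
for entered in ("72.5", "0", "700", "abc"):
    print(entered, validate(item, entered))
```

Keeping the limits in the metadata rather than in application code means that an imported eCRF brings its own edit checks with it.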
To accomplish the
Sequence diagram of a typical use scenario with OpenEDC. In this example, the stand-alone OpenEDC web application is used to design electronic case report forms and capture data. A Clinical Data Interchange Standards Consortium Operational Data Model file can be uploaded to reuse metadata or import clinical data. Optionally, the user can initialize an empty OpenEDC server with locally stored data. This enables the user to set up a multiuser system and conduct multicentric research studies. EDC: electronic data capture; ODM: operational data model.
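The export step in this scenario can be sketched as follows. This is a simplified illustration under the assumption of the ODM 1.3 `ClinicalData` structure; the function name, OIDs, and values are hypothetical, and the root attributes that a complete ODM file carries (for example, file identifier and creation time) are left out for brevity.

```python
import xml.etree.ElementTree as ET

# Assumption: ODM 1.3 namespace.
ODM_NS = "http://www.cdisc.org/ns/odm/v1.3"


def build_clinical_data(study_oid, mdv_oid, subject_key,
                        event_oid, form_oid, group_oid, values):
    """Build an ODM tree holding one form's captured values for one subject."""
    ET.register_namespace("", ODM_NS)  # serialize ODM as the default namespace

    def q(tag):
        return f"{{{ODM_NS}}}{tag}"

    # Note: required root attributes of a full ODM file are omitted in this sketch.
    odm = ET.Element(q("ODM"))
    clinical = ET.SubElement(odm, q("ClinicalData"),
                             StudyOID=study_oid, MetaDataVersionOID=mdv_oid)
    subject = ET.SubElement(clinical, q("SubjectData"), SubjectKey=subject_key)
    event = ET.SubElement(subject, q("StudyEventData"), StudyEventOID=event_oid)
    form = ET.SubElement(event, q("FormData"), FormOID=form_oid)
    group = ET.SubElement(form, q("ItemGroupData"), ItemGroupOID=group_oid)
    for item_oid, value in values.items():
        ET.SubElement(group, q("ItemData"), ItemOID=item_oid, Value=str(value))
    return odm


# Hypothetical identifiers and values, for illustration only.
odm = build_clinical_data("S.1", "MDV.1", "P001", "SE.BASELINE",
                          "F.VITALS", "IG.VITALS", {"I.WEIGHT": 72.5})
print(ET.tostring(odm, encoding="unicode"))
```

The clinical data mirror the metadata hierarchy (event, form, item group, item) and point back to the definitions by OID, so a receiving system that has the same metadata can interpret the values without a proprietary mapping.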
The server provides authentication and authorization services to address the remaining
Usability is a subjectively perceived characteristic of a particular context and user [
OpenEDC achieved a mean usability score of 83.1 (SD 9.6) out of 100. Men rated it slightly lower than women with an average score of 82.5 (SD 11.7) compared with 83.8 (SD 7.6). Two additional open questions were answered by 75% (12/16) of the participants. They provided very heterogeneous suggestions for improvement, with most being related to the user interface and few to functionality. Interface-related suggestions were shortcut buttons for frequently used functions, more noticeable highlighting of inputs with implausible data, and a larger visual difference between the metadata and clinical data view. Introducing simple statistics for data completeness and patient enrollment, labeling conditionally unavailable items in the CSV export, and improving support for older browsers were suggestions related to functionality. Most participants stated that they liked the clear user interface and the performance of the system.
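For readers unfamiliar with how a usability score out of 100 is derived, the following sketch applies the standard system usability scale (SUS) scoring rule: 10 items rated from 1 to 5, where odd-numbered items contribute the response minus 1, even-numbered items contribute 5 minus the response, and the sum is multiplied by 2.5. It assumes that the reported score follows this rule and that the SD is the sample standard deviation. The responses are hypothetical and are not the study data.

```python
from statistics import mean, stdev


def sus_score(responses):
    """Compute the SUS score (0-100) from ten responses on a 1-5 scale."""
    if len(responses) != 10 or any(r not in range(1, 6) for r in responses):
        raise ValueError("SUS needs exactly ten responses, each between 1 and 5")
    total = 0
    for index, response in enumerate(responses, start=1):
        # Odd items are positively worded, even items negatively worded.
        total += (response - 1) if index % 2 == 1 else (5 - response)
    return total * 2.5


# Hypothetical questionnaires from three participants (not the study data).
questionnaires = [
    [5, 1, 5, 1, 5, 1, 5, 1, 5, 1],
    [4, 2, 4, 2, 4, 2, 4, 2, 4, 2],
    [3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
]
scores = [sus_score(q) for q in questionnaires]
print(scores, mean(scores), stdev(scores))
```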
This paper describes the implementation process of OpenEDC, an open-source and standard-compliant EDC system for medical research. We conducted a requirements analysis to identify the academic and regulatory demands for digital data collection. After implementation, we performed a usability evaluation to obtain feedback from the users. OpenEDC achieved a mean usability score of 83.1, which can be considered user-friendly [
OpenEDC is based on the CDISC ODM standard, yielding several advantages. Metadata and clinical research data can be imported and exported without constraints in a nonproprietary format and without vendor lock-in effects. Investigators may also download eCRFs from public metadata registries, such as the Portal of Medical Data Models [
Other standards exist for exchanging metadata and clinical research data. For example, Fast Healthcare Interoperability Resources (FHIR) from Health Level 7 (HL7) is increasingly adopted to exchange electronic health records and other information in the medical domain [
OpenEDC is publicly available for the creation of local studies. The app is delivered via the web to desktop and mobile devices, whereas data are stored locally and in encrypted form. This architecture allows researchers to benefit from metadata-driven digital data collection without an information technology department, web server configuration issues, or device constraints. In addition, it leaves data sovereignty with the investigator rather than with a third-party infrastructure or server provider. While this approach offers advantages in terms of flexibility, it also has some drawbacks. It is generally helpful to have a dedicated computer scientist who can make educated decisions about data security, data backup, and metadata design. Moreover, it may be beneficial for a study’s sustainability to have a contact person for technical problems. However, an information technology specialist can still be involved when using OpenEDC. In particular, when an OpenEDC server must be configured, for example, for projects with multiple users and sites, knowledge of setting up a web server is important. In our opinion, OpenEDC’s architecture is particularly useful for investigator-initiated studies and enables researchers to set up and test databases before information technology support and infrastructure investments have to be made.
Other EDC systems also exist. One of the most frequently used EDC systems is REDCap [
Future work is necessary. The main objective was to ensure the applicability of OpenEDC to a wide range of research projects. However, the literature-based requirements analysis was influenced by the demands of a large-scale medical register. Rarely mentioned requirements were not included unless they were required by the internal project. Examples of rarely mentioned and therefore deferred demands are integrated query management as well as document storage and report functionalities. In addition, although OpenEDC complies with relevant laws and regulations, including 21 Code of Federal Regulations Part 11 and GDPR, a computer system validation required for interventional trials has not yet been conducted. Validating an EDC system is also trial-specific and requires activities by the investigator or sponsor. Currently, we see OpenEDC’s distinct advantages for observational and translational research studies by clinicians rather than commercial clinical trials. We hope it is a valuable first step toward an openly available, standard-compliant, and mobile EDC system. We plan to develop OpenEDC further and use it in prospective studies. To expand the support for varying study protocols, the currently unavailable functions mentioned above should be added. We hope for contributions from the research community, as we have published OpenEDC under the MIT open-source license.
We showed that it is possible to develop an EDC system that can be used without upfront investment while preserving data sovereignty. The primary focus was on standard compliance to foster metadata reuse, interoperable research data, and open science. Future work is necessary to extend the system’s functionality and prove its robustness in large-scale studies. OpenEDC is publicly available and released under the MIT open-source license.
CDISC: Clinical Data Interchange Standards Consortium
eCRF: electronic case report form
EDC: electronic data capture
FHIR: Fast Healthcare Interoperability Resources
GDPR: General Data Protection Regulation
HL7: Health Level 7
ISO/IEC: International Organization for Standardization/International Electrotechnical Commission
ODK: Open Data Kit
ODM: operational data model
REDCap: Research Electronic Data Capture
SDC: structured data capture
SUS: system usability scale
LG designed and implemented OpenEDC and wrote the manuscript. SH conceived and reviewed the manuscript. MD designed the overall concept, supervised the work, and reviewed the manuscript.
None declared.