Information Chaos
Beasley and colleagues noted that information in an EHR needed for optimal care may be unavailable, inadequate, scattered, conflicting, lost, or inaccurate, a condition they term information chaos.5 Smith and colleagues reported that decision making in 1 of 7 primary care visits was impaired by missing critical information. Surveyed HCPs estimated that 44% of patients with missing information may receive compromised care as a result, including delayed or erroneous diagnosis and increased costs due to duplication of diagnostic testing.6
Even when patient-specific data needed for accurate diagnosis are technically available, their usability is compromised if the HCP cannot find them. In most systems, data storage paradigms mirror database design rather than provider cognitive models. Ultimately, the design of current EHR interaction paradigms squanders precious cognitive resources and time, particularly during patient encounters, leaving little available for the cognitive tasks necessary for accurate diagnosis and treatment decisions.1,3,4,7
VA Corporate Data Warehouse
VistA was implemented as a decentralized system with 130 instances, each of which is a freestanding EHR. However, as all systems share common data structures, the data can be combined from multiple instances when needed. The VA established a CDW more than 15 years ago in order to collate information from multiple sites to support operations as well as to seek new insights. The CDW currently updates nightly from all 130 EHR instances and is the only location in which patient information from all treating sites is combined. Voogle can access the CDW through the Veterans Informatics and Computing Infrastructure (VINCI), which is a mirror of the CDW databases and was established as a secure research environment.
The CDW contains information on 25 million veterans, with about 15 terabytes of text data. Approximately 4 billion data points, including 1 million text notes, are accrued nightly. The Integrated Control Number (ICN), a unique patient identifier, is assigned to each CDW record and is cross-indexed in the master patient index. All CDW data are tied to the ICN, facilitating access to and attribution of all patient data from all VA sites. Voogle relies on this identifier to build indexed files, or domains (which are document collections), of requested specific patient information to support its search algorithm.
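The role of a shared patient identifier in building a per-patient document collection can be illustrated with a minimal Python sketch. This is not Voogle's implementation: the record layout, field names, site labels, and ICN values below are made up for illustration, and the inverted index is a generic stand-in for whatever indexing the production system performs.

```python
from collections import defaultdict
import re

# Hypothetical records from several sites; all field names and values are invented.
RECORDS = [
    {"icn": "ICN-0001", "site": "A", "text": "Echocardiogram shows reduced ejection fraction."},
    {"icn": "ICN-0001", "site": "B", "text": "Patient reports dyspnea; echocardiogram ordered."},
    {"icn": "ICN-0002", "site": "A", "text": "Annual exam, no complaints."},
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_domain(records, icn):
    """Collect one patient's documents from all sites and build an inverted index."""
    docs = [r for r in records if r["icn"] == icn]
    index = defaultdict(set)
    for doc_id, doc in enumerate(docs):
        for token in tokenize(doc["text"]):
            index[token].add(doc_id)
    return docs, index


def search(docs, index, term):
    """Return the patient's documents that contain the term, in collection order."""
    hits = sorted(index.get(term.lower(), set()))
    return [docs[i] for i in hits]


docs, index = build_domain(RECORDS, "ICN-0001")
hits = search(docs, index, "echocardiogram")
print([h["site"] for h in hits])  # prints ['A', 'B']
```

Because every record carries the same identifier regardless of treating site, a single filter on that key is enough to assemble the collection; no site-by-site matching of names or dates is needed.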
Structured Data
Most of the data accrued in an EHR are structured data (such as laboratory test results and vital signs) that are stored in a defined database framework. Voogle uses iFind (InterSystems Inc, Cambridge, MA) to index, count, and then search for requested information within structured data fields.
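The index, count, and search sequence can be sketched generically in Python. The sketch does not use or reproduce the iFind API; the laboratory rows, field names, and helper functions are hypothetical and serve only to show the three steps applied to one structured field.

```python
from collections import Counter, defaultdict

# Hypothetical structured rows; test names, values, and dates are invented.
LABS = [
    {"test": "Hemoglobin A1c", "value": 7.9, "unit": "%", "date": "2019-03-02"},
    {"test": "Creatinine", "value": 1.4, "unit": "mg/dL", "date": "2019-03-02"},
    {"test": "Hemoglobin A1c", "value": 8.3, "unit": "%", "date": "2018-09-14"},
]


def index_rows(rows, field):
    """Index: map each lowercased field value to the positions of its rows."""
    index = defaultdict(list)
    for pos, row in enumerate(rows):
        index[str(row[field]).lower()].append(pos)
    return index


def count_values(index):
    """Count: number of rows recorded under each indexed value."""
    return Counter({key: len(positions) for key, positions in index.items()})


def search_rows(rows, index, query):
    """Search: return rows whose indexed value contains the query string."""
    q = query.lower()
    positions = sorted(p for key, ps in index.items() if q in key for p in ps)
    return [rows[p] for p in positions]


test_index = index_rows(LABS, "test")
counts = count_values(test_index)
a1c = search_rows(LABS, test_index, "a1c")
print(counts["hemoglobin a1c"], [r["value"] for r in a1c])  # prints 2 [7.9, 8.3]
```

Counting from the index rather than rescanning the rows is the design point here: once the field is indexed, both the tally and the lookup are served from the same structure.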
Unstructured Text
In contrast to structured data, text notes are stored as documents that are retrievable by patient, author, date, and clinic, as well as numerous other fields. Unstructured (free) text notes are more information rich than either structured data or templated notes since their narrative format more closely parallels providers’ cognitive processes.1,7 The value of the narrative becomes even more critical in understanding complex clinical scenarios with multiple interacting disease processes. Narratives emphasize important details, reducing cognitive overload by lowering the salience of detail the author deems to be less critical. Narrative notes simultaneously assure availability through the use of unstandardized language, often including specialty and disease-specific abbreviations.1 Information needed for decision making in the illustrative case in this report was present only in HCP-entered free-text notes, as the structured data from which the free text was derived were not available.