Research Highlights
Study will boost role of electronic records in care, research
August 3, 2009
Humans are unique in their ability to use language. But machines are gaining ground.
Think of search engines such as Google or Yahoo. Type in even a careless entry such as "chicogp Movie theters" and, amazingly, the computer figures out what you want.
Such is the power of "natural language processing" (NLP). Search engines demonstrate only one aspect of this branch of computer science, which involves teaching computers to recognize and make sense of free text or speech.
Now, a new VA research project seeks to harness NLP's power to squeeze more value out of electronic health records for researchers, clinicians and managers.
The project is called the Consortium for Healthcare Informatics Research (CHIR). According to lead investigator Matt Samore, MD, the main goal is to change free text in the electronic medical record—doctors' notes, for example—into structured data. That will make the information more useful for research and other purposes.
"It's basically a matter of converting information of one form into information of a different form that has all kinds of new uses," says Samore, a clinician and epidemiologist at the Salt Lake City VA Medical Center.
Webmasters and database architects typically use checkboxes, pull-down menus and other tools to structure the information they want to collect. From their perspective, the less free text, the better. Free text (for example, a couple of sentences about why you visited a store) can't be easily analyzed by computers.
Electronic health records are a bit different. They use structured data and templates wherever possible, but there's also an unavoidable need for free text. VA's electronic records, first implemented in the 1990s and still considered state-of-the-art, were designed to give doctors ample opportunity to record their thoughts and decisions.
"There's a real limitation to asking clinicians to input only structured data when they are evaluating patients, recording those evaluations, describing what's happening with the patient, documenting their decisions," says Samore. "There's a richness to free text, a communication benefit. It allows people to express themselves." Merry Ward, PhD, who is overseeing CHIR for VA's Office of Research and Development, agrees: "That narrative is very important for health care providers. Radio buttons, pull-downs, yesno and other forced choices can only go so far in describing the patient's condition. It's much more complex than that."
Samore and colleagues—a consortium involving experts within VA and at several universities—want to unleash the wealth of free-text information within VA's electronic medical records, in a secure manner, so VA can use it to improve veterans' care.
"If you can convert narrative text into structured data," says Samore, "you can improve your measurement of quality, improve surveillance [of infectious diseases and adverse drug events], create new decision-support systems, and help clinicians improve documentation of problems in the medical record. There are a huge number of applications."
One key application, says Samore, is research. For instance, doctors often enter free-text notes about why they are prescribing certain drugs or how patients are responding. Today, the only way for researchers to study those notes is to review each chart by hand. In studies involving thousands of charts, that would consume countless hours. The effort would be laborious, inefficient and perhaps inaccurate.
With NLP, computers could scan tens of thousands of patient records, find and extract the relevant notes, and convert the information into a structured format. Researchers could use computers to analyze it in huge batches. This could reveal new insights about the drugs that would otherwise remain hidden.
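To make the idea concrete, here is a minimal sketch in Python of turning free-text notes into structured rows. It is not CHIR's method, which the article does not describe. It is a toy keyword matcher, and the drug list, the response cues, the sample notes and the function names are all invented for illustration. Real clinical NLP has to handle negation, abbreviations, misspellings and context that spans sentences, none of which this attempts.

```python
import re

# Hypothetical vocabulary, invented for this sketch only.
DRUGS = ["metformin", "lisinopril", "warfarin"]
RESPONSE_CUES = {
    "improved": ["improved", "responding well", "tolerating well"],
    "adverse": ["rash", "nausea", "dizziness", "discontinued"],
}


def extract(note_id, text):
    """Turn one free-text note into a list of structured rows (dicts)."""
    rows = []
    # Rough sentence split: break on whitespace that follows ., ! or ?
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        lowered = sentence.lower()
        for drug in DRUGS:
            if not re.search(r"\b" + re.escape(drug) + r"\b", lowered):
                continue
            # Label the response by the first cue group found in the
            # same sentence; simple substring matching, no negation handling.
            response = "unknown"
            for label, cues in RESPONSE_CUES.items():
                if any(cue in lowered for cue in cues):
                    response = label
                    break
            rows.append({"note_id": note_id, "drug": drug, "response": response})
    return rows


def extract_all(notes):
    """Run the extractor over a mapping of note id to note text."""
    return [row for note_id, text in notes.items() for row in extract(note_id, text)]


# Made-up notes; no real patient data.
NOTES = {
    "n1": "Started metformin for type 2 diabetes; patient tolerating well.",
    "n2": "Lisinopril discontinued after patient reported dizziness.",
    "n3": "Continue warfarin. Metformin dose increased; glucose improved.",
}

if __name__ == "__main__":
    for row in extract_all(NOTES):
        print(row)
```

Each row has the same three fields, so the results can be counted, filtered or joined with other data in bulk, which is the practical payoff of structured data. The "unknown" label for warfarin in note n3 shows the limits of so simple an approach: the note mentions the drug but says nothing the matcher recognizes about how the patient responded.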
According to Ward, such studies could reveal nuances—as expressed by doctors in free text—about the comparative benefits of one treatment over another. "If we can look at large numbers of patients, large amounts of data, we might be able to get a better sense of what is it about some patients that made them respond better to one treatment over another."
The technology would also boost genomic research—studies that link patients' genetic information with their health risks, needs or outcomes. Ward says CHIR will allow the electronic medical record to be a "very powerful" tool in this regard—for example, by enabling researchers to analyze free text about clinical traits tied to certain genes.
Along with these applications, Samore says CHIR will also develop new ways to "de-identify" patient charts so researchers can access clinically relevant information but not patient names or other identifiers. He says protecting veterans' privacy is a critical, overarching goal of the project.
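A rough sense of what "de-identifying" a note involves can be given in a few lines of Python. The sketch below is an assumption-laden illustration, not the approach CHIR is developing: it masks a handful of identifier patterns (Social Security numbers, phone numbers, slash-formatted dates) with regular expressions and blanks out names supplied by the caller. The patterns, the placeholder tags, the function name and the sample note are all hypothetical, and a production system would need to catch far more than this.

```python
import re

# Hypothetical identifier patterns; real de-identification covers many more.
PATTERNS = [
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
    (re.compile(r"\b\d{3}-\d{3}-\d{4}\b"), "[PHONE]"),
    (re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"), "[DATE]"),
]


def deidentify(text, known_names=()):
    """Replace pattern-based identifiers and known names with placeholder tags."""
    for pattern, tag in PATTERNS:
        text = pattern.sub(tag, text)
    # Replace longer names first so "John Doe" is masked as a unit
    # before the shorter "Doe" is considered.
    for name in sorted(known_names, key=len, reverse=True):
        text = re.sub(r"\b" + re.escape(name) + r"\b", "[NAME]", text,
                      flags=re.IGNORECASE)
    return text


# Made-up note; no real patient data.
NOTE = ("John Doe (SSN 123-45-6789) seen 7/14/2009. "
        "Call 801-555-0100 with labs. Mr. Doe reports less knee pain.")

if __name__ == "__main__":
    print(deidentify(NOTE, known_names=["John Doe", "Doe"]))
```

The clinically relevant part of the note (less knee pain) survives while the identifiers do not. In this sketch the names would have to come from a structured field elsewhere in the record; spotting names that appear only in the free text is a much harder problem.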
While CHIR will increase the number of data fields within the electronic health record that are available for research, a related project will greatly expand the number of patient records available for any given study.
As of now, the records from each of VA's more than 1,400 care sites are gathered together only at the regional level. Most studies based on patient records use charts only from a facility-wide or region-wide sample—say, from the Southwest. Some administrative data are pulled out and made available in national databases, and VA researchers have been using these for years, but this represents only part of the data available in the actual records.
That will change, thanks to an initiative called Veterans' Informatics, Information, and Computing Infrastructure. Known by the acronym VINCI, the project will "roll up" electronic medical records from VA sites nationwide into one secure, centralized data repository.
"VINCI will make all the data available for researchers in a highly secure fashion," explains Samore. "The data will never be taken off the VINCI servers. And the tools to do natural language processing will be installed and available for use within the VINCI environment."
Adds Ward, "Not only will the data that researchers have access to be much richer [because of the ability to analyze free text], but they'll be able to include veterans everywhere."
She notes that currently, veterans seen at medical centers with no research program are far less likely to be included in database studies by VA investigators at other sites. The new paradigm will allow for more representative sampling of veterans nationwide.
Between CHIR and VINCI, experts expect that VA researchers will be in a position to conduct large, nationwide studies—in some cases covering up to two decades of patients' medical care, as documented in their electronic records—with unprecedented efficiency and precision. The end beneficiaries of the knowledge gained, notes Samore, will be veterans.
"VA research is always tied to care of veterans and efforts to improve care," he says. "It's not just research for research's sake."
This article originally appeared in the August 2009 issue of VA Research Currents.
