Large amounts of patient data from clinical and research settings could allow for predictive medicine and more effective treatments, but the integration work is just beginning
Big Data can affect human well-being in many ways. In healthcare, Big Data encompasses everything from a patient's electronic medical record (EMR), imaging, lab tests, and –omic data (e.g. genomics, proteomics, metabolomics) that could lead to better diagnoses and treatments, to healthcare cost data. Meanwhile, all types of devices are generating continuous health-monitoring data that needs to be processed.
Biomedical Big Data can also be sorted into several pools: one is research and experimental data, where scientists examine molecular profiling –omic data; another is the hospital environment, where lab tests and imaging data of patients flow in; in addition, there is data from the healthcare industry and insurance payers.
Integrating these different pools of biomedical data to provide evidence for health decision making is fraught with technical and non-technical challenges. On the research side, one major hurdle is the lack of standards and of effective ways to integrate advanced medical research data across institutions to improve clinical practice. On the industry and health-system side, a pharmaceutical company may protect a longitudinal study of patients because it is using that data in a competitive environment to market a drug or to be first to obtain FDA clearance. Hospitals and insurance companies, meanwhile, are so concerned with patient privacy that they assume a conservative posture, locking data up tightly and making it difficult for others to access those data sets.
Only a few hospitals have attempted to integrate MRI, CT, and PET images, and advanced –omic testing, into the traditional EMR. This is a missed opportunity, because we cannot leverage the knowledge sitting on hard drives in disparate hospitals around the world to support smarter decision making and improve patient outcomes. Further complicating matters, data for the same physiological target is generated by multiple sources and vendors in different formats with different security protections, some of it residing in distributed networks.
The condition of the data must also be taken into account. Big Data is not only about volume and variety; it is also about velocity, veracity, and value. Velocity matters because some data changes rarely or not at all, while other data changes frequently. For example, heart rate may change on the scale of minutes, while a patient's genomic profile may not change for months. Likewise, if a patient stays in the intensive care unit for a few days, many physiological parameters will change, meaning data is measured at a completely different velocity than at an annual checkup.
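To make the velocity mismatch concrete, here is a minimal sketch (with invented timestamps and values) of one common way to align a fast-changing stream, such as heart rate, with a slow-changing record, such as a lab or genomic result: carry the most recent slow value forward to each fast reading.

```python
from bisect import bisect_right

def carry_forward_join(fast, slow):
    """For each (timestamp, value) in `fast`, attach the latest `slow`
    entry at or before that timestamp. Both lists are sorted by time."""
    slow_times = [t for t, _ in slow]
    joined = []
    for t, v in fast:
        i = bisect_right(slow_times, t) - 1
        slow_v = slow[i][1] if i >= 0 else None  # no slow reading yet
        joined.append((t, v, slow_v))
    return joined

# Hypothetical data: heart rate every few minutes vs. a rarely updated result.
heart_rate = [(1, 72), (2, 75), (3, 90), (10, 88)]  # (minute, bpm)
lab_result = [(0, "A"), (5, "B")]                   # changes rarely

print(carry_forward_join(heart_rate, lab_result))
# → [(1, 72, 'A'), (2, 75, 'A'), (3, 90, 'A'), (10, 88, 'B')]
```

Real systems would of course work with proper timestamps and streaming windows, but the alignment problem is the same.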
On veracity, there has been a push toward EMR adoption, which by 2015 had reached more than 80% of clinics and hospitals in the United States. That is a huge accomplishment, but EMR systems still lack interoperability and common standards. Hospital databases are also rarely perfect: they are often unstructured, and the data isn't "clean," in the sense that values are missing or incorrect due to human or system errors. Thus, there is an urgent need for tools that can tease out bad data to ensure overall integrity, and it is critical for all stakeholders in healthcare to invest in and develop standards and interoperability.
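A simple form of such a tool is rule-based screening: flag records with missing fields or physiologically implausible values rather than silently using them. The field names and plausible ranges below are assumptions for illustration only.

```python
# Assumed field names and plausibility bounds, for illustration only.
PLAUSIBLE_RANGES = {
    "heart_rate_bpm": (20, 250),
    "temp_celsius": (30.0, 43.0),
}

def screen_record(record):
    """Return a list of problems found in one patient record (a dict):
    missing fields and out-of-range values."""
    problems = []
    for field, (lo, hi) in PLAUSIBLE_RANGES.items():
        value = record.get(field)
        if value is None:
            problems.append(f"missing:{field}")
        elif not (lo <= value <= hi):
            problems.append(f"out_of_range:{field}={value}")
    return problems

print(screen_record({"heart_rate_bpm": 400, "temp_celsius": 36.6}))
# → ['out_of_range:heart_rate_bpm=400']
```

Production systems add cross-field consistency checks and statistical outlier detection on top of such rules, but the principle of surfacing bad data instead of trusting it is the same.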
To get value out of Big Data, in addition to creating standards, there is a need to develop data analytics. This effort is driven not only by the federal government but also by a number of health communities, spanning cancer care, cardiology, and neurology, among others, that all require integrating different modalities of data to make decisions. However, combining multiple modalities of data from multiple patients, where veracity and velocity vary widely, is challenging.
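The first step of such integration can be sketched as joining records of different modalities on a shared patient identifier. This is a minimal illustration with an invented schema; real integration must also reconcile conflicting formats, codes, and timestamps.

```python
from collections import defaultdict

def merge_modalities(*sources):
    """Combine lists of records (dicts with a 'patient_id' key) from
    different modalities into one view per patient. Later sources
    overwrite earlier ones on field collisions."""
    merged = defaultdict(dict)
    for source in sources:
        for record in source:
            merged[record["patient_id"]].update(record)
    return dict(merged)

# Hypothetical records from three modalities.
emr     = [{"patient_id": "p1", "diagnosis": "hypertension"}]
imaging = [{"patient_id": "p1", "mri": "scan-001"}]
labs    = [{"patient_id": "p2", "hba1c": 6.1}]

print(merge_modalities(emr, imaging, labs))
```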
Many big technology players are jumping into the health and wellness area to develop biomedical Big Data infrastructure and analytics, while players such as insurance companies are using Big Data analytics for reimbursement. For example, an insurance company could take data from a single hospital or a cluster of hospitals to determine best practices and outcomes in light of financial reimbursements, and over time expand the initiative across an entire state. Data sets from various hospitals could be aggregated to develop a health decision support system, with the ultimate goal of treating patients more effectively while minimizing costs.
As standards and Big Data analytics work progresses, it will open doors to many opportunities to leverage Big Data in healthcare, where tools such as data visualization and machine learning can be applied to the variety of available data sets and lead to more effective treatments as well as more efficient healthcare delivery. For more information on Big Data, please visit http://bigdata.ieee.org/.