By J. Daniel Janzen
J. Daniel Janzen is a writer in Brooklyn, New York focusing on industry and technology trends.
How many innovations remain stranded on the drawing board because their processing requirements were simply too intensive to be practical? Imagine, however, if vast reserves of processing power were available for computing massively parallel data problems. Suddenly, algorithms previously dismissed as pipe dreams move decisively into the realm of the possible.
Recent advances are making it simple for programmers to harness the massively parallel core machine within high-end graphics processing units (GPUs) for general purpose computing, providing the processing power to enable a broad spectrum of new applications, from life-saving medical imaging to oil and gas exploration to electromagnetic simulation.
Medical imaging demands push traditional CPUs to the limit
The Breast Imaging Division in the Department of Radiology at Massachusetts General Hospital (MGH) has been a longtime leader in breast cancer detection and diagnosis. MGH’s technological innovation and clinical research have helped doctors catch tumors earlier in their growth cycle. While the overall breast cancer rate in the U.S. is increasing, widespread and regular screening has reduced the death rate by 25% since 1990.
Management at this division was instrumental in winning FDA approval for full field digital mammography, which captures and displays X-ray images electronically rather than on film. However, while digital mammography is more effective than film mammography in some cases, it still shares many of its shortcomings. The two-dimensional images used in traditional mammography can be difficult to read, especially with small tumors or when breast tissue is particularly dense or overlapping due to compression. As a result, 25% of the “suspicious” mammograms that lead to a callback turn out to be false alarms.
In a collaborative effort, Nvidia, Mercury Computer Systems, and the Breast Imaging Division of the Department of Radiology at Massachusetts General Hospital developed a 3D cancer screening device based on a technique called Digital Breast Tomosynthesis. Processing its images in minutes rather than hours required a powerful graphics processing computer. Made possible by a grant from the United States Army and the engineering expertise of the General Electric Company, a tomosynthesis scanner was built and installed at MGH.
To improve the effectiveness of early breast cancer screening, doctors applied a new technique called Digital Breast Tomosynthesis (DBT)—essentially, a way to construct a 3D map of the breast to help radiologists see tumors that might be obscured on 2D scans. Instead of the two X-ray views per breast used for traditional mammography, DBT integrates up to 25 views per breast, each taken from a different vantage point along an arc, while exposing the patient to no more radiation than in traditional mammography. Based on these views, a computer estimates the location of structures throughout the breast using Maximum Likelihood Expectation Maximization, an iterative reconstruction algorithm co-developed by Brandeis University and MGH.
“The computer estimates where things are within each breast, and compares it with the information acquired,” explained Dr. Daniel Kopans, director of Breast Imaging at MGH. “If it’s off by a certain amount, the computer adjusts its computations to get closer to the truth, based on multiple scans. Through multiple iterations, we find the maximum likelihood that what’s in the breast is what we’re seeing.” Through research, MGH determined that eight iterations delivered a result acceptable to radiologists.
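The estimate-compare-adjust loop Dr. Kopans describes is the core of Maximum Likelihood Expectation Maximization. The following is a toy one-dimensional sketch in Python, not MGH's implementation: the system matrix, data, and sizes are invented for illustration, and a real tomosynthesis reconstruction operates on full 3D volumes.

```python
# Toy 1-D MLEM sketch (illustrative only; matrix and data are made up).
# Each iteration: forward-project the current estimate, compare with the
# acquired data, backproject the ratio, and apply a multiplicative update.

def mlem(A, y, n_iter=8):
    """A: list of projection rows (m x n); y: measured data (length m).
    Returns the estimate x (length n) after n_iter updates."""
    m, n = len(A), len(A[0])
    x = [1.0] * n  # flat initial estimate of the object
    # Sensitivity (column sums) used to normalize the backprojection.
    sens = [sum(A[i][j] for i in range(m)) for j in range(n)]
    for _ in range(n_iter):  # MGH found 8 iterations acceptable
        # Forward-project: what the detector would see for estimate x.
        proj = [sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]
        # Compare with what was actually acquired.
        ratio = [y[i] / proj[i] if proj[i] > 0 else 0.0 for i in range(m)]
        # Backproject the ratio and update the estimate multiplicatively.
        for j in range(n):
            back = sum(A[i][j] * ratio[i] for i in range(m))
            x[j] *= back / sens[j]
    return x
```

With consistent data, each pass pulls the estimate closer to values whose projections match the measurements, which is the "maximum likelihood" convergence Dr. Kopans describes.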
“DBT lets us use those 11 or 15 images to synthesize slices of the entire breast at a single millimeter of separation,” continued Dr. Kopans, “so the clinician can look at each individual image without confusion.” However, while DBT improves detection, the computing demands are intensive, hindering clinical adoption.
Tomosynthesis as a general concept dates back to the 1960s as a possible technique for looking at the brain and other parts of the body. Its applications for the breast were recognized in the late 1970s, but at that time clinicians lacked both the digital detectors and the processing power to make it practical. Each of the 11 to 15 images used in the typical DBT scan comprises a 1,800 by 2,304 array of pixels, each only 100 microns in size—all of which must be read out in a third of a second to minimize patient movement. While the sensors needed to capture such a vast amount of data so quickly had finally arrived by the 1990s, computing remained an obstacle. MGH’s original attempts to synthesize DBT data with a standard PC took all night to process a single breast. MGH’s Dr. Moore brought this time down to 15 to 20 minutes by using a parallel processing system of 34 PCs, but real clinical utility would require still greater speed and efficiency.
Unlocking the power of graphics processing technology
To make DBT practical, MGH turned to Mercury Computer Systems, Inc., a specialist in transforming sensor data to information for analysis and interpretation, particularly in the context of medical diagnostic imaging devices including MRI, CT, PET, and digital X-ray. Although Mercury’s image processing expertise and mathematical optimizations of the Maximum Likelihood Expectation Maximization algorithm made it possible to reduce processing time while preserving clarity, more processing power was still needed. Nvidia Quadro professional graphics processing units (GPUs) provided the answer.
The Digital Breast Tomosynthesis system provides doctors greater assurance that they see problems in an X-ray and not just dense tissue.
Although more commonly known in the video gaming market, Nvidia has been active in life sciences since its inception in the mid 1990s. Said Robert Murphy, director of marketing for Mercury Computer Systems’ Life Sciences Group (LSG), “It costs billions of dollars to develop these GPUs; we could never have done it ourselves. Nvidia has been producing them in bulk for years. In a sense, we have all those gamers to thank.”
Mercury began by mapping the algorithm to a GPU based on Nvidia Quadro professional graphics processor technology, which provides a unique programmable rendering pipeline. The software port had to be constructed carefully to overcome memory, bandwidth, and instruction set limitations while maximizing run-time performance, but this was only part of the challenge.
At the time, the only way to harness the GPU was through the OpenGL cross-platform application programming interface, an approach that required quite a bit of skill. With no interface for mathematics, the GPU could be awakened only by a graphics command, forcing engineers to think in graphics analogies. “Our engineers had to trick OpenGL into doing the reconstruction in the right way to get results—basically, to manipulate something that usually focuses purely on graphics to handle non-graphics tasks,” said Murphy. Instead of simply multiplying two arrays, engineers told the GPU to draw a triangle in a given location, which would kick off the computation, providing a result in the form of a value stored in a pixel location.
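The workaround Murphy describes—phrasing arithmetic as a drawing command—can be mimicked in plain Python. This is a conceptual sketch of the graphics-pipeline model, not actual OpenGL: the function names and structure are invented for illustration. The inputs play the role of textures, the "draw" call rasterizes a primitive covering the output pixels, and a per-pixel "fragment shader" does the math, leaving results in the framebuffer instead of colors.

```python
# Conceptual sketch of GPGPU-via-graphics (not real OpenGL code).
# The draw call invokes the shader once per covered pixel, just as the
# GPU would when rasterizing a primitive over the render target.

def draw(width, fragment_shader, framebuffer):
    """Rasterize a 1-pixel-high quad of the given width, running the
    fragment shader once per pixel and storing each result."""
    for x in range(width):  # on a GPU these run in parallel
        framebuffer[x] = fragment_shader(x)

def multiply_arrays(a, b):
    """Element-wise multiply expressed as a render pass."""
    tex_a, tex_b = a, b          # inputs bound as "textures"
    fb = [0.0] * len(a)          # the render target
    draw(len(a), lambda x: tex_a[x] * tex_b[x], fb)
    return fb                    # results read back from pixel values
```

The multiply itself is trivial; the point is the indirection engineers had to accept: the computation only happens as a side effect of telling the GPU to draw something.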
While there was no performance penalty with this indirect approach—the GPU had gigaflops of power to spare—a new bottleneck was created on the human side. Not only did it take longer to write the program; few programmers had both the graphics capabilities and the needed medical industry expertise. Still, the results were compelling enough to inspire further efforts to make the GPU’s power more broadly accessible.
For their painstaking work, Mercury’s engineers were rewarded with an acceleration of more than 60 times over Dr. Moore’s original single-system solution; one Nvidia GPU processes a DBT scan in under five minutes. Subsequent improvements yielded another 50% reduction in processing time. With the key technical challenges solved, MGH developed specifications for a DBT device to be built by General Electric, the first of its kind in the world that can be used to evaluate the entire breast using 3D mammography.
“From a physician’s point of view, we want to catch cancers, and we think DBT will help us find more tumors more easily, though this remains to be proven,” said Dr. Kopans. A pilot study is now being conducted with the National Institutes of Health (NIH); initial results suggest a dramatically lower callback rate, as false positives due to overlapping tissues are eliminated.
Following approval for general practice by the FDA, which is anticipated within the coming year, Dr. Kopans expects DBT to begin to replace traditional mammography as the medical standard. “Radiologists accustomed to traditional mammography will have little difficulty adjusting to DBT, and will quickly understand what those images are showing. It’s not a completely new thing to learn, just a better mammogram.”
During this time, growing interest in the promise of graphics hardware for general purpose computing has led to other methods for translating C code into graphics language. For its part, Nvidia developed the Compute Unified Device Architecture (CUDA), a technology that enables programmers to code for the GPU directly in C and gain unfettered access to its native instruction set and memory.
Said Nvidia product manager Sanford Russell, “Our goal isn’t to produce a niche product or an application-specific microprocessor; there are a finite number of people who understand graphics programming, but everyone knows C. By allowing people to program in C on our next-generation architecture, we can turn the pyramid upside down and open the marketplace for anyone who has a problem they want to run on these massively parallel GPUs without touching OpenGL or graphics.” Since its release, the company has continued to work with thousands of researchers to learn how they are using CUDA for applications such as seismic processing, financial modeling and general high-performance computing, and to gather feedback for further enhancements.
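The shift Russell describes is from graphics analogies to a plain thread-per-element model. Since CUDA itself is C, the following is only a Python sketch of that model, with invented names: every "thread" runs the same kernel on one data element, addressed by its index in the launch, and no drawing commands are involved.

```python
# Conceptual sketch of the CUDA-style programming model (CUDA itself is
# C; this Python stand-in is illustrative only). Each thread executes
# the same kernel on the element selected by its thread index.

def launch(kernel, n_threads, *args):
    """Simulate launching n_threads copies of a kernel; on a real GPU
    these would execute in parallel across the device's cores."""
    for tid in range(n_threads):
        kernel(tid, *args)

def saxpy_kernel(tid, alpha, x, y, out):
    """One thread's work: a single element of alpha*x + y."""
    out[tid] = alpha * x[tid] + y[tid]
```

Compared with the triangle trick, the computation is stated directly as arithmetic on indexed data, which is exactly why, as Russell notes, anyone who knows C can use it.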
: Design World :