High Performance Fourier Volume Rendering on Graphics Processing Units (GPUs)
Academic Graduation Year
2011 / 2012 - Systems & Biomedical Engineering Department, Faculty of Engineering
Cairo University, Egypt
Assoc. Prof. Dr. Ayman I. Eldeib
Assoc. Prof. Dr. Amr A. Shaarawi
Prof. Dr. Yasser M. Kadah, Cairo University, Egypt
Prof. Dr. Mohamed I. Eladawy, Helwan University, Egypt
Assoc. Prof. Dr. Ayman I. Eldeib, Cairo University, Egypt
Assoc. Prof. Dr. Amr A. Shaarawi, Cairo University, Egypt
The past several years have seen tremendous advances in volume visualization techniques, which are now used broadly in medical imaging. Volume rendering in particular has received considerable attention in this area. Although spatial-domain volume rendering has gained wide acceptance among scientists and physicians, this category of rendering techniques is constrained by its O(N^3) time complexity for a volume of N^3 voxels, which limits its usability in several respects. Fourier Volume Rendering (FVR) is an alternative technique that operates on the frequency spectrum of the volume and, relying on the projection-slice theorem, has a lower time complexity of O(N^2 log N). This technique generates attenuation-only renderings, or projections, of volumetric data that look like X-ray radiographs.
The technique has been used extensively in digital radiography. In this work, a high-performance, pure GPU-accelerated implementation of the Fourier volume rendering pipeline is proposed. By mapping the entire pipeline onto the GPU, it achieves a 30x speedup over a hybrid implementation.
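As a rough illustration of the gap between the two complexity orders quoted above, consider a volume with N = 256. Constant factors are ignored and a base-2 logarithm is assumed; both are simplifications made here for illustration, not figures taken from the thesis.

```latex
\[
N^3 = 256^3 = 16{,}777{,}216, \qquad
N^2 \log_2 N = 65{,}536 \times 8 = 524{,}288, \qquad
\frac{N^3}{N^2 \log_2 N} = \frac{N}{\log_2 N} = 32 .
\]
```

Under these assumptions the per-view operation count differs by a factor of N / log N, which grows with the volume size. The estimate covers only the per-view work and leaves out the one-time 3D Fourier transform of the volume.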
Keywords: Fourier Volume Rendering, Medical Image Reconstruction, Projection-Slice Theory, GPU Computing, CUDA.
In this work, an in-depth investigation was carried out to achieve a high-performance implementation of the Fourier volume rendering pipeline on the GPU. It considers in particular CUDA-enabled GPUs as high-performance computing architectures that can boost the performance of data-parallel algorithms, a class that suits our problem well. Chapter 1, Introduction, presents the volume visualization techniques that are widely used in the medical arena. It concentrates mainly on volume rendering as a scientific tool for exploring the internal structures of volumetric objects. It then focuses on frequency-domain volume rendering as an alternative to spatial-domain algorithms that reduces the rendering time complexity to O(N^2 log N). Afterwards, we summarize the previous work in this area and our contribution.
Chapter 2, Theory Behind Frequency Domain Volume Rendering, provides a gentle introduction to the theory relevant to frequency-domain volume rendering. Sampling theory, the Fourier transform, the Hartley transform, and the projection-slice theorem are briefly discussed to set the stage for the chapters that follow.
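The projection-slice theorem mentioned above can be checked numerically on a tiny volume. The following is a minimal sketch, not the thesis implementation: it uses a naive pure-Python DFT instead of an FFT, handles only the axis-aligned view along z (so the slice is simply the k_z = 0 plane and no interpolation is needed), and all function names and the test volume are hypothetical choices made for illustration.

```python
import cmath


def dft(values, inverse=False):
    """Naive O(n^2) 1D DFT; a real pipeline would use an FFT instead."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t, v in enumerate(values):
            acc += v * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def dft3(volume):
    """Separable forward 3D DFT of a cubic volume indexed as volume[z][y][x]."""
    n = len(volume)
    # Transform along x (innermost lists).
    a = [[dft(row) for row in plane] for plane in volume]
    # Transform along y.
    b = [[[0j] * n for _ in range(n)] for _ in range(n)]
    for z in range(n):
        for x in range(n):
            col = dft([a[z][y][x] for y in range(n)])
            for y in range(n):
                b[z][y][x] = col[y]
    # Transform along z.
    c = [[[0j] * n for _ in range(n)] for _ in range(n)]
    for y in range(n):
        for x in range(n):
            col = dft([b[z][y][x] for z in range(n)])
            for z in range(n):
                c[z][y][x] = col[z]
    return c


def idft2(plane):
    """Separable inverse 2D DFT of a square plane indexed as plane[y][x]."""
    n = len(plane)
    rows = [dft(row, inverse=True) for row in plane]
    out = [[0j] * n for _ in range(n)]
    for x in range(n):
        col = dft([rows[y][x] for y in range(n)], inverse=True)
        for y in range(n):
            out[y][x] = col[y]
    return out


def project_spatial(volume):
    """Attenuation-only projection along z: sum the voxels on each ray."""
    n = len(volume)
    return [[sum(volume[z][y][x] for z in range(n)) for x in range(n)]
            for y in range(n)]


def project_fourier(volume):
    """Same projection obtained from the central (k_z = 0) spectrum slice."""
    spectrum = dft3(volume)      # in practice computed once per volume
    central_slice = spectrum[0]  # plane through the origin, normal to z
    return [[v.real for v in row] for row in idft2(central_slice)]


def max_abs_difference(a, b):
    """Largest element-wise absolute difference between two 2D lists."""
    return max(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def make_test_volume(n):
    """Hypothetical deterministic test data, not a dataset from the thesis."""
    return [[[float((x + 2 * y + 3 * z * z) % 5) for x in range(n)]
             for y in range(n)] for z in range(n)]


if __name__ == "__main__":
    vol = make_test_volume(4)
    diff = max_abs_difference(project_spatial(vol), project_fourier(vol))
    print("max difference between the two projections:", diff)
```

The two projections should agree up to floating-point rounding. For an arbitrary viewing direction the slice is oblique to the sampling grid, so its samples have to be interpolated from the spectrum; that step is not covered by this sketch.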
High Performance Computing, as we understand it, deals with the implementation of an algorithm and the hardware it runs on; as a research tool, however, it demands at least a basic understanding of several disciplines, concepts, and methodologies, ranging from algorithms and computer programming to software and hardware architectures. In Chapter 3, High Performance Computing on Graphics Processing Units, we explain how the evolution of GPUs has turned them into high-performance platforms thanks to their massively parallel architecture. The CUDA architecture is given special treatment. We tried to keep this chapter comprehensive yet concise, but the temptation to cover everything is overwhelming, so the reader is assumed to have some familiarity with programming and high-level computer architecture.
In Chapter 4, Algorithm & Implementation, the Fourier volume rendering algorithm is presented and demystified for the reader. This chapter attempts to summarize the Fourier volume rendering pipeline. It starts with a general description at a level independent of any specific architecture and then moves towards the particular strategy adopted to improve the performance of the GPU-accelerated implementation. It is the author’s conviction that a good understanding of the implementation aspects of this algorithm will convey the significance of the achieved results.
In Chapter 5, Results, we discuss the reconstruction and performance benchmarking results of both the naive implementation and our proposed implementation, which executes entirely on the GPU.
In Chapter 6, Conclusion & Future Work, we wrap up and summarize what has been presented in this thesis, followed by some future work that might be undertaken either by us or by other researchers working in the same area.
All the datasets used in this thesis are available on GitHub in .img/.hdr format.