
Chapter 18

Real-Time Onboard Hyperspectral Image Processing Using Programmable Graphics Hardware

Javier Setoain, Complutense University of Madrid, Spain
Manuel Prieto, Complutense University of Madrid, Spain
Christian Tenllado, Complutense University of Madrid, Spain
Francisco Tirado, Complutense University of Madrid, Spain

Contents

18.1 Introduction  412
18.2 Architecture of Modern GPUs  414
    18.2.1 The Graphics Pipeline  414
    18.2.2 State-of-the-Art GPUs: An Overview  417
18.3 General Purpose Computing on GPUs  420
    18.3.1 Stream Programming Model  420
        18.3.1.1 Kernel Recognition  421
        18.3.1.2 Platform-Dependent Transformations  422
        18.3.1.3 The 2D-DWT in the Stream Programming Model  426
    18.3.2 Stream Management and Kernel Invocation  426
        18.3.2.1 Mapping Streams to 2D Textures  427
        18.3.2.2 Orchestrating Memory Transfers and Kernel Calls  428
    18.3.3 GPGPU Framework  428
        18.3.3.1 The Operating System and the Graphics Hardware  429
        18.3.3.2 The GPGPU Framework  431
18.4 Automatic Morphological Endmember Extraction on GPUs  434
    18.4.1 AMEE  434
    18.4.2 GPU-Based AMEE Implementation  436
18.5 Experimental Results  441
    18.5.1 GPU Architectures  441
    18.5.2 Hyperspectral Data  442
    18.5.3 Performance Evaluation  443
18.6 Conclusions  449
18.7 Acknowledgment  449
References  449

From High-Performance Computing in Remote Sensing. © 2008 by Taylor & Francis Group, LLC.

This chapter focuses on mapping hyperspectral imaging algorithms to graphics processing units (GPUs). The performance and parallel processing capabilities of these units, coupled with their compact size and relatively low cost, make them appealing for onboard data processing. We begin by giving a short review of GPU architectures. We then outline a methodology for mapping image processing algorithms to these architectures, and illustrate the key code transformations and algorithm trade-offs involved in this process. To make this methodology precise, we conclude with an example in which we map a hyperspectral endmember extraction algorithm to a modern GPU.

18.1 Introduction

Domain-specific systems built on custom-designed processors have been extensively used during the last decade in order to meet the computational demands of image and multimedia processing. However, the difficulties that arise in adapting specific designs to the rapid evolution of applications have hastened their decline in favor of other architectures. Programmability is now a key requirement for versatile platform designs to follow new generations of applications and standards.

At the other extreme of the design spectrum we find general-purpose architectures. The increasing importance of media applications in desktop computing has promoted the extension of their cores with multimedia enhancements, such as SIMD instruction sets (Intel's MMX/SSE in the Pentium family and IBM-Motorola's AltiVec are well-known examples). Unfortunately, the cost of delivering instructions to the ALUs poses a serious bottleneck in these architectures and makes them still unsuited to meet more stringent (real-time) multimedia demands.
Graphics processing units (GPUs) seem to have taken the best from both worlds. Initially designed as expensive application-specific units with control and communication structures that enable the effective use of many ALUs and hide latencies in the memory accesses, they have evolved into highly parallel, multipipelined processors with enough flexibility to allow a (limited) programming model. Their numbers are impressive. Today's fastest GPU can deliver a peak performance on the order of 360 Gflops, more than seven times the performance of the fastest x86 dual-core processor (around 50 Gflops) [11]. Moreover, they evolve faster than more specialized platforms, such as field programmable gate arrays (FPGAs) [23], since the high-volume game market fuels their development.

Obviously, GPUs are optimized for the demands of 3D scene rendering, which makes software development of other applications a complicated task. In fact, their astonishing performance has captured the attention of many researchers in different areas, who are using GPUs to speed up their own applications [1]. Most of the research activity in general-purpose computing on GPUs (GPGPU) works towards finding efficient methodologies and techniques to map algorithms to these architectures. Generally speaking, this involves developing new implementation strategies following a stream programming model, in which the available data parallelism is explicitly uncovered so that it can be exploited by the hardware. This adaptation presents numerous implementation challenges, and GPGPU developers must be proficient not only in the target application domain but also in parallel computing and 3D graphics programming.

The new hyperspectral image analysis techniques, which naturally integrate both the spatial and spectral information, are excellent candidates to benefit from these kinds of platforms.
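As a first taste of the stream programming model mentioned above, the following is a minimal CPU-side sketch (hypothetical code, not from this chapter; the kernel and stream contents are invented for illustration): a kernel is a side-effect-free function applied independently to every element of a stream, which is exactly the property that exposes data parallelism to the hardware.

```python
# Hypothetical sketch of the stream programming model: a "kernel" is a
# side-effect-free function applied independently to each element of an
# input stream, so every application could, in principle, run in
# parallel on the GPU's fragment processors.

def map_kernel(kernel, stream):
    """Apply 'kernel' to every stream element; order-independent."""
    return [kernel(x) for x in stream]

# A toy kernel: scale and clamp a value, much as a fragment shader might.
def scale_clamp(x, gain=2.0, lo=0.0, hi=1.0):
    return min(max(x * gain, lo), hi)

out = map_kernel(scale_clamp, [0.1, 0.4, 0.9])  # each element is independent
```

Because no kernel invocation depends on another, the runtime is free to evaluate them in any order, or all at once, which is what the GPU hardware does in practice.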
These algorithms, which treat a hyperspectral image as an image cube made up of spatially arranged pixel vectors [18, 22, 12] (see Figure 18.1), exhibit regular data access patterns and inherent data parallelism across both pixel vectors (coarse-grained pixel-level parallelism) and spectral information (fine-grained spectral-level parallelism). As a result, they map nicely to massively parallel systems made up of commodity CPUs (e.g., Beowulf clusters) [20]. Unfortunately, these systems are generally expensive and difficult to adapt to onboard remote sensing data processing scenarios, in which low-weight integrated components are essential to reduce mission payload. Conversely, their compact size and relatively low cost are what make modern GPUs appealing for onboard data processing.

Figure 18.1 A hyperspectral image as a cube made up of spatially arranged pixel vectors.

The rest of this chapter is organized as follows. Section 18.2 begins with an overview of the traditional rendering pipeline and eventually goes over the structure of modern GPUs in detail. Section 18.3, in turn, covers the GPU programming model. First, it introduces an abstract stream programming model that simplifies the mapping of image processing applications to the GPU. Then it focuses on describing the essential code transformations and algorithm trade-offs involved in this mapping process. After this comprehensive introduction, Section 18.4 describes the Automatic Morphological Endmember Extraction (AMEE) algorithm and its mapping to a modern GPU. Section 18.5 evaluates the proposed GPU-based implementation from the viewpoint of both endmember extraction accuracy (compared to other standard approaches) and parallel performance. Section 18.6 concludes with some remarks and provides hints at plausible future research.

18.2 Architecture of Modern GPUs

This section provides background on the architecture of modern GPUs.
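Before turning to the hardware, the two levels of parallelism just described for hyperspectral algorithms can be made concrete with a small sketch (hypothetical code and data; the spectral-angle measure is used here only as an illustrative per-pixel operation, not as the chapter's algorithm):

```python
import math

# Hypothetical illustration of the two parallelism levels in a
# hyperspectral cube: each pixel vector is processed independently
# (coarse-grained pixel-level parallelism), while the reductions over
# bands inside spectral_angle expose fine-grained spectral parallelism.

def spectral_angle(p, q):
    """Angle between two pixel vectors (an illustrative per-pixel op)."""
    dot = sum(a * b for a, b in zip(p, q))
    norms = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return math.acos(max(-1.0, min(1.0, dot / norms)))

def angle_map(cube, reference):
    # Pixels are mutually independent: this double loop is the
    # dimension a parallel machine can distribute freely.
    return [[spectral_angle(pixel, reference) for pixel in row]
            for row in cube]

cube = [[[1.0, 0.0], [1.0, 1.0]],     # a toy 2x2 image with 2 bands
        [[0.0, 1.0], [2.0, 2.0]]]
angles = angle_map(cube, [1.0, 0.0])
```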
For this introduction, it is useful to begin with a description of the traditional rendering pipeline [8, 16], in order to understand the basic graphics operations that have to be performed. Subsection 18.2.1 starts at the top of this pipeline, where data are fed from the CPU to the GPU, and works its way down through multiple processing stages until a pixel is finally drawn on the screen. It then shows how this logical pipeline translates into the actual hardware of a modern GPU and describes some specific details of the different graphics cards manufactured by the two major GPU makers, NVIDIA and ATI/AMD. Finally, Subsection 18.2.2 outlines recent trends in GPU design.

18.2.1 The Graphics Pipeline

Figure 18.2 shows a rough description of the traditional 3D rendering pipeline. It consists of several stages, but the bulk of the work is performed by four of them: vertex processing (vertex shading), geometry, rasterization, and fragment processing (fragment shading). The rendering process begins with the CPU sending a stream of vertices from a 3D polygonal mesh, together with a virtual camera viewpoint, to the GPU using graphics API commands. The final output is a 2D array of pixels to be displayed on the screen.

In the vertex stage, the 3D coordinates of each vertex from the input mesh are transformed (projected) onto a 2D screen position, also applying lighting to determine their colors. Once transformed, vertices are grouped into rendering primitives, such as triangles, and scan-converted by the rasterizer into a stream of pixel fragments. These fragments are discrete portions of the triangle surface that correspond to the pixels of the rendered image. The vertex attributes, such as texture coordinates, are then interpolated across the primitive surface, storing the interpolated values at each fragment. In the fragment stage, the color of each fragment is computed.
This computation usually depends on the interpolated attributes and the information retrieved from texture memory.

Figure 18.2 The 3D graphics pipeline: vertex stream → vertex stage → projected vertex stream → rasterization → fragment stream → fragment stage → colored fragment stream → memory.
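The vertex and rasterization stages described above can be sketched in a few lines of scalar code (a deliberately simplified, hypothetical model: one triangle, a trivial perspective projection, and barycentric interpolation of a single attribute; real GPUs also perform clipping, perspective-correct interpolation, and depth testing):

```python
# Deliberately simplified sketch of two pipeline stages (hypothetical
# model, not the chapter's code): the vertex stage projects 3D vertices
# to 2D screen positions; the rasterizer then interpolates per-vertex
# attributes at each fragment using barycentric weights.

def project(v, d=1.0):
    """Trivial perspective projection of (x, y, z) onto the plane z = d."""
    x, y, z = v
    return (d * x / z, d * y / z)

def interpolate(attrs, bary):
    """Barycentric interpolation of one per-vertex attribute."""
    return sum(a * w for a, w in zip(attrs, bary))

triangle = [(0.0, 0.0, 2.0), (2.0, 0.0, 2.0), (0.0, 2.0, 2.0)]
screen = [project(v) for v in triangle]                   # vertex stage
color = interpolate([0.0, 1.0, 0.5], (0.25, 0.25, 0.5))   # one fragment
```

Every vertex and every fragment is processed by the same function with no cross-element dependences, which is why both stages map onto the stream model discussed in Section 18.3.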