Game changer in signal processing

Situational awareness is key on the battlefield. Today’s sensors can pack hundreds of imagers on a platform and provide 360-degree coverage. But how can that ocean of data be sifted for relevant information within tactical timelines?

Enter the Graphics Processing Unit (GPU). This specialized silicon combines large numbers of arithmetic logic units that can execute thousands of fairly simple math instructions simultaneously and repetitively. This feat is essential for tasks like wide area sensing in real time.

Although spawned by the $80 billion video gaming industry, GPUs play a vital role in advanced signal processing applications, where they are used to deconstruct masses of real-world, 3D data into targets and threats rather than to render an imaginary 3D world on a computer screen.

These sensor processing applications are known as General-Purpose computing on Graphics Processing Units (GPGPU) applications because they involve math operations that traditionally would have required a Central Processing Unit (CPU). CPUs are still necessary in GPGPU applications to set up the transfer between a sensor back end and the GPU and to retrieve the GPU’s final result, but the graphics processor does the heavy lifting. Today’s GPU architectures, with their scale, memory techniques, and power efficiency, make it possible to execute large-scale sensor processing tasks at rates that would be impractical for conventional processors.

GPUs are all about data reduction. In an image processing system, for example, one or more GPUs might take in sensor data at a rate of 1 GBps and boil that down to a couple of targets. GPUs are found in GPGPU applications such as target recognition, ground moving target indication, and Synthetic Aperture Radar (SAR).

How they do it

One GPU architecture on the market today is Kepler, designed by NVIDIA for the video game market. Although the Kepler family currently scales up to 1,344 cores, embedded applications use GPUs that burn less power. The GK107, for example, features 384 cores and burns only 50 W with a power efficiency of about 15 GFLOPS per watt.
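As a rough check on those figures, the quoted power draw and efficiency imply a peak throughput that can be worked out directly. The sketch below uses only the numbers stated above (384 cores, 50 W, about 15 GFLOPS per watt). The variable names are illustrative, and the results are back-of-envelope estimates rather than vendor specifications.

```python
# Back-of-envelope arithmetic from the figures quoted in the text.
# All names here are illustrative; the results are estimates, not vendor specs.

CORES = 384                # GK107 core count, as quoted
POWER_WATTS = 50.0         # power draw, as quoted
GFLOPS_PER_WATT = 15.0     # approximate efficiency, as quoted

# Efficiency times power gives the implied peak throughput.
peak_gflops = POWER_WATTS * GFLOPS_PER_WATT

# Spreading that evenly over the cores gives a per-core share.
gflops_per_core = peak_gflops / CORES

print(f"Implied peak throughput: {peak_gflops:.0f} GFLOPS")
print(f"Implied per-core share:  {gflops_per_core:.2f} GFLOPS")
```

On these assumptions the chip would peak at roughly 750 GFLOPS, or a little under 2 GFLOPS per core.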

These masses of relatively simple cores are orchestrated internally by core aggregates called Streaming Multiprocessors (SMPs). The GK107 chip, for example, features two 192-core SMPs. Sharing cache and registers, the SMPs work together on a similar task. SMPs solve the challenging problem of keeping all of the cores busy all of the time by assuming the scheduling function and telling the cores what to do.

But the SMPs are crucially aided in this task by GPUDirect, a new Direct Memory Access (DMA) technology that allows applications to stream digitized sensor data directly to the GPU. Previously, all of this data first had to land in CPU memory, and the GPU then had to copy it from there. That extra hop was a chokepoint: the data could not reach the GPU fast enough.

A revolutionary mil/aero Kepler implementation, and the flagship of a larger family, is GE Intelligent Platforms’ GRA112 – a ruggedized 3U VPX card that will use individually soldered NVIDIA subcomponents, a unique advantage that allows developers to maximize ruggedization and cooling performance (Figure 1).

Figure 1: The GE Intelligent Platforms GRA112 is a ruggedized 3U VPX card that takes advantage of the GPGPU capabilities of NVIDIA’s Kepler technology.

Secret weapon

Kepler uses NVIDIA’s Compute Unified Device Architecture (CUDA), a parallel computing model that enables programmers to easily access GPU cores’ resources and map a sensor processing task onto the parallel platform. Using the CUDA model, a software developer can create an application that launches millions of threads, each corresponding, say, to a single pixel in a sensor image. Each core can multiplex between up to four threads at a time, so that hundreds of threads execute simultaneously.
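The one-thread-per-pixel idea can be sketched without a GPU. The plain-Python emulation below walks a grid of thread blocks sequentially and applies the global-index arithmetic a CUDA kernel typically uses (block index times block size, plus thread index). It is not CUDA code: the function names, the threshold test, and the tiny 4 x 6 image are hypothetical stand-ins chosen for illustration, and a real kernel would run these iterations in parallel on the GPU.

```python
# Sequential emulation of a one-thread-per-pixel launch (not real CUDA).
# On a GPU, the (block, thread) pairs below would run concurrently, and
# collecting hits would need a thread-safe mechanism rather than a list.

def pixel_kernel(block_idx, block_dim, thread_idx, image, width, threshold, hits):
    """Work done by one hypothetical thread: test a single pixel."""
    # Global index: same arithmetic as blockIdx.x * blockDim.x + threadIdx.x.
    i = block_idx * block_dim + thread_idx
    if i >= len(image):          # guard: the last block may overshoot the image
        return
    if image[i] > threshold:     # illustrative detection test
        hits.append((i // width, i % width))  # (row, column) of the pixel


def launch(image, width, threshold, block_dim=16):
    """Emulate a kernel launch with enough blocks to cover every pixel."""
    num_blocks = (len(image) + block_dim - 1) // block_dim  # ceiling division
    hits = []
    for block_idx in range(num_blocks):
        for thread_idx in range(block_dim):
            pixel_kernel(block_idx, block_dim, thread_idx,
                         image, width, threshold, hits)
    return hits


# A made-up 4 x 6 image stored row by row, with two bright pixels.
WIDTH, HEIGHT = 6, 4
image = [0] * (WIDTH * HEIGHT)
image[1 * WIDTH + 2] = 200   # row 1, column 2
image[3 * WIDTH + 5] = 255   # row 3, column 5

targets = launch(image, WIDTH, threshold=128)
print(targets)
```

Here 24 pixels shrink to two (row, column) hits, a miniature of the data reduction described earlier. The bounds guard matters because the number of launched threads is rounded up to a whole number of blocks.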

The result is that GPUs outperform CPUs in compute-intensive sensor applications that can exploit the GPUs’ parallelism. A SAR task, for example, can run 225 times faster on a GPU than in a CPU implementation. Tasks that took minutes can be executed in milliseconds.

A further advantage of Kepler’s GPUDirect technology, compared to the DMA function in earlier architectures, is that it’s free and open and doesn’t require any special code to use. GPUDirect is supported in NVIDIA’s standard CUDA programming environment, available from the company’s developer website. NVIDIA GPUs are also backward-compatible, so older applications will run faster on newer chips.

defense.ge-ip.com