Visual computing made easy
We are at the dawn of a new and exciting era in computing called “visual computing.” It has the potential to fundamentally change the way data is processed and visualized in designing and engineering modern military systems. With the availability of Jacket, a new programming tool, it is surprisingly easy to make use of this new technology in complex computer simulations.
Recent developments associated with the emergence of multicore architectures are transforming the field of High Performance Computing (HPC) in a disruptive way. The forerunner technology leading this transformation is the Graphics Processing Unit (GPU). These massively parallel processors were originally developed to perform complex graphics calculations, but they can also be used for general-purpose computations. GPUs offer up to 240 computational cores, in contrast to the 4 cores currently available on CPUs. Due to this performance discrepancy, a new era of “visual computing” is emerging in which GPUs are being used to increase application productivity and quality through the merger of computation with visualization.
These recent developments in the HPC market are not going unnoticed by the defense industry, which has always been an early adopter of emerging technologies. Major companies and governmental organizations in the industry have already started working with GPUs and are discovering the potential for significant acceleration in applications ranging from signal and image processing to distributed control of multi-agent systems. However, as is common with new technologies, the development tools for GPU computing have not yet matured. GPU manufacturers and several third-party providers now offer a variety of Application Programming Interfaces (APIs) for programming these devices. Although these tools improve access to the new computing platforms, they still inherit the fundamental difficulties of parallel programming: the shift from a sequential to a parallel way of writing programs, as well as the steep learning curve of the available software development tools.
One programming language with the potential to bridge this gap between parallelizable applications and GPUs is the M-language. It is available in MATLAB, a very popular Integrated Development Environment (IDE) for prototyping algorithms and complex simulations. Available since earlier this year, Jacket, a new add-on toolbox to MATLAB, enables standard M code to run on GPU-based accelerators from within MATLAB. A user can get started with GPU computing almost instantly, with a negligible learning curve. The authors explore the nuances of the M-language, then present a volumetric control simulation example to illustrate the ease of use and code reusability afforded by Jacket.
Visual computing and new challenges
Visual computing is the union of visualization and mathematics. It allows users to interact with data and visually explore results. Visual computing is important to numerous fields, ranging from defense and industrial applications to games, sports, and more (Figure 1). Engineering problems increasingly depend on larger amounts of data. Consequently, new challenges arise both in the computational power required to process that data in reasonable time and in the effective visualization of results. In computer simulations of systems and processes, results are generally visualized only after the computation has completed, to avoid the considerable computational overhead of visualization code. Now, with the GPU as the computational workhorse, seamless integration of visualization and computation becomes possible: users can visually explore data and results in real time as they are computed in a simulation. Large-scale visualization is therefore an ideal fit for GPU computing, for several reasons:
- Visualization is a data-intensive application, and GPUs are well suited for data-intensive tasks.
- Visualization computations exhibit substantial parallelism.
- Visualization tasks should be closely coupled to the graphics system.
- Integrating computations with visualization on the GPU eliminates huge overhead of memory transfer between CPU and GPU, permitting entire applications to run faster.
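The size of that memory-transfer overhead is easy to estimate. As a back-of-the-envelope sketch (the PCIe bandwidth figure below is an illustrative assumption, not a measurement), copying a single 1000 x 1000 single-precision frame back to CPU memory per iteration costs on the order of a millisecond:

```python
# Rough cost of copying a GPU result back to CPU memory just to display it.
# The PCIe bandwidth figure is an illustrative assumption, not a measurement.
def transfer_time_ms(elements, bytes_per_element=4, bandwidth_gb_s=4.0):
    """Time (ms) to move an array across a link at the given bandwidth."""
    total_bytes = elements * bytes_per_element
    return total_bytes / (bandwidth_gb_s * 1e9) * 1e3

per_frame_ms = transfer_time_ms(1000 * 1000)  # one 1000 x 1000 float frame
```

Over a 100-iteration loop, that is already a tenth of a second spent purely on copies, which computing and displaying in place on the GPU avoids entirely.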
GPU manufacturers are focused on developing programming tools, such as NVIDIA’s CUDA (Compute Unified Device Architecture) and AMD’s CTM (Close To the Metal), to harness the incredible power of these processors. These tools, though powerful, require learning new APIs as well as mastering the ability to write parallel code. Moreover, optimizing code for performance is nontrivial, requiring deeper knowledge of the underlying hardware architectures. Therefore, despite significant performance benefits reported in the research literature, adoption of GPU computing has fallen short of expectations.
M-language to the rescue
The M-language allows complex systems to be represented concisely and provides far more sophisticated visualization and analysis functions than C/C++ with OpenGL visualization. The key to its growing popularity is that nonprogrammers find it relatively easy to learn. It is a data-parallel language and offers scientists working with vectors and matrices an easy way to express element-wise parallelism on aggregate data. This feature, combined with the ubiquity of MATLAB, makes it an ideal candidate for mapping to fine-grained, data-parallel systems like GPUs. Moreover, from a scientist’s perspective, minimal effort is needed to achieve basic visualizations. For example, the following code snippet shows M code that visualizes a signal as a surface while it is being manipulated within a loop.
A = zeros(1000,1000); % Allocate CPU memory
for t = 1:100
A = exp(-t*i) * A; % manipulate
B = ifft(A); % inverse FFT
surf(abs(B)); % visualize (transfer to video memory)
end
In contrast, the same functionality written in C/C++ with OpenGL visualization would span hundreds of lines at a minimum. Though concise and easy to learn, the M-language also has restrictions that quickly become noticeable as computational speeds slow with growing data sizes. Distributed and cluster computing offer a way around this, but their cost and complexity are hard to justify for most applications.
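For readers who think in other array languages, the same element-wise, aggregate-data style can be sketched in Python with NumPy (an analogy for comparison only; the plotting call is omitted, and the array starts from zeros exactly as in the M snippet, so the values remain zero — the point is the expression style, not the output):

```python
import numpy as np

A = np.zeros((1000, 1000), dtype=np.complex64)  # allocate CPU memory
for t in range(1, 101):
    A = np.exp(-1j * t) * A   # manipulate: element-wise scale of the array
    B = np.fft.ifft(A)        # inverse FFT (here along the last axis)
# abs(B) would then be handed to a surface plot, as surf() does in M
```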
Jacket, an add-on toolbox to MATLAB, can make up for these deficiencies by providing a trivial way to offload computations to the GPU, resulting in code speedups of 10-100x for many applications. Users can simply mark data, via casting operations, to indicate that computations should occur on the GPU. The following code shows how simple it is to achieve this.
A = gzeros(1000,1000); % Allocate GPU memory
for t = gsingle(1:100) % Casting of iterator vector
A = exp(-t*i) * A; % manipulate
B = ifft(A); % inverse FFT
surf(abs(B)); % visualize (no memory transfer)
end
Jacket also includes a Graphics Toolbox that enables seamless integration of computation with visualization, making difficult-to-program, multithreaded, and real-time graphical displays simple to achieve. For example, by placing a single visualization command at the end of a loop as shown in the code snippet above, data may be viewed as it is processed in place on the GPU. Load-balancing decisions are automatically made to optimally use GPU resources for compute as well as display. Further, the Graphics Toolbox exposes the entire OpenGL API and allows for interactive scene creation and rapid prototyping.
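Jacket’s cast-once idiom, where identical M code runs on either processor depending only on how the data was created, has parallels in other ecosystems. As an illustration (this is a generic Python pattern, not Jacket’s implementation), the same effect can be sketched by writing the algorithm against an array module and swapping the module, e.g. NumPy for the CPU or a NumPy-compatible GPU library such as CuPy:

```python
import numpy as np

def simulate(A, xp, steps=100):
    """Backend-agnostic loop: xp is the array module. Passing numpy runs
    on the CPU; a module with the same interface (e.g. cupy) would run
    the identical code on the GPU."""
    for t in range(1, steps + 1):
        A = xp.exp(-1j * t) * A   # manipulate
        A = xp.fft.ifft(A)        # inverse FFT
    return A

out = simulate(np.ones((64, 64), dtype=np.complex64), np)  # CPU run
```

Only the array creation and the module choice change between backends; the algorithm body is reused verbatim, which is the property Jacket’s casting provides within MATLAB.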
A volumetric control simulation example
Although the applications are numerous, the code reusability and overall ease-of-use benefits of Jacket for a complex simulation scenario can be appreciated through an example from volumetric control. “Volumetric control” means monitoring, detecting, tracking, reporting, and responding to environmental conditions within a specified physical region. This is done in a distributed manner by deploying numerous vehicles, each carrying one or more sensors, to collect, aggregate, and fuse distributed data into a tactical assessment. Examples include surveillance and perimeter defense using multiple agents. Jacket is used in the code segment shown in Figure 2 to offload the computation and visualization to the GPU while maximally reusing existing CPU M code.
Notice that a simple casting operation (outlined in red) is the only change between the two code segments. The result was a speedup of 8x on an NVIDIA Tesla C1060 GPU compared to a 2.2 GHz dual-core Intel CPU for a swarm (group of agents) size of 400. Another point to note is that the function discrete_swarm(), which models the swarm dynamics, is reused in the GPU version with no change at all. Moreover, with Jacket the simulation updates can be visualized live without incurring the huge overhead of memory transfers to video RAM, as happens with the CPU code.
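The article does not reproduce discrete_swarm() itself. Purely to illustrate why such a function ports unchanged, the sketch below shows the kind of element-wise, whole-array update a discrete-time swarm model typically performs (the function name, dynamics, and gains here are hypothetical, not the authors’ code):

```python
import numpy as np

def discrete_swarm_step(pos, vel, goal, dt=0.1, k_goal=1.0, k_damp=0.5):
    """Hypothetical discrete-time swarm update: each agent accelerates
    toward a goal point with velocity damping. pos and vel are
    (n_agents, 2) arrays; every operation below acts element-wise on the
    whole swarm, which is why such code maps directly to a GPU."""
    acc = k_goal * (goal - pos) - k_damp * vel   # spring-damper dynamics
    vel = vel + dt * acc                         # semi-implicit Euler step
    pos = pos + dt * vel
    return pos, vel

rng = np.random.default_rng(0)
pos = rng.standard_normal((400, 2))   # 400 agents, as in the benchmark
vel = np.zeros((400, 2))
for _ in range(100):
    pos, vel = discrete_swarm_step(pos, vel, goal=np.zeros(2))
```

Because nothing in the update depends on scalar indexing, the same function body can run on GPU arrays after a single cast of its inputs, which is exactly the reuse property highlighted above.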
Simple software, powerful visual computing
Application developers designing high-performance computational systems are increasingly embracing GPU computing to enhance application speed and data visualization. Visualization is a critical step in transforming data into knowledge. An integrated visualization capability alongside compute gives scientists and engineers a crucial tool when running complex simulations. Jacket makes this capability available to the technical computing community already familiar with the MATLAB programming language, without the pain of rewriting code or learning new APIs. For widespread multicore adoption, it is imperative that scientists and engineers who do not want to become computer scientists can continue to use familiar languages and constructs and focus on their algorithms rather than on implementation details.