Software: King of SWaP

Size, weight, and power (SWaP) considerations have always been important for embedded electronics but will become even more critical in the future. Driving this trend in embedded system design is the plethora of small platforms, such as unmanned aerial vehicles (UAVs), entering the inventory. Developers of the processing systems that will be deployed on this growing cohort of small, intelligent, sometimes battery-operated platforms at the tactical edge will have to scrutinize hardware and software components to ensure the most efficient use of resources.

Hardware issues

SWaP optimization involves both hardware and software issues. System developers often start the design process with a list of hardware specifications: the size of a box, its processing throughput, the power budget, and cooling requirements. Other decisions involve the form factor of the board or boards, the memory space (which may be limited), the I/O fabric, and the processing chips.

Special-purpose graphics processing units (GPUs), for example, might improve throughput by such a wide margin that they become the go-to silicon. GPUs burn a lot of power, but if one GPU can do the work of four or five general-purpose processors, it can still reduce overall system size, weight, and power. Small platforms need just enough performance to do the job – that is, not too much. Program managers want room for growth but do not want to leave unused MIPS or FLOPS – and associated power penalties – on the table.
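The GPU trade-off above comes down to simple arithmetic, sketched below. The wattage figures and the four-to-one replacement ratio are hypothetical placeholders chosen only for illustration; they do not describe any real part, and a real budget would use measured numbers for the candidate devices.

```python
def total_power_w(unit_count, watts_per_unit):
    """Total power draw for a number of identical processing units."""
    return unit_count * watts_per_unit


# Hypothetical figures for illustration only; not taken from any datasheet.
CPU_WATTS = 35.0     # assumed draw of one general-purpose processor
GPU_WATTS = 75.0     # assumed draw of one GPU
CPUS_REPLACED = 4    # assumed number of CPUs whose work one GPU can absorb

cpu_option_w = total_power_w(CPUS_REPLACED, CPU_WATTS)
gpu_option_w = total_power_w(1, GPU_WATTS)
savings_w = cpu_option_w - gpu_option_w

# The GPU wins on power only while it draws less than the CPUs it replaces.
gpu_is_cheaper = GPU_WATTS < CPUS_REPLACED * CPU_WATTS

print(f"CPU option: {cpu_option_w} W, GPU option: {gpu_option_w} W, "
      f"difference: {savings_w} W")
```

The same comparison can be repeated for board count and weight. The point is only that a power-hungry part can still lower the system total when it replaces several smaller ones.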

To maximize collaboration between software and hardware, application developers require real-time, dynamic, and highly granular insights into hardware metrics such as CPU utilization and timing issues, CPU events such as cache and pipeline stalls, number of interrupts, memory bottlenecks, and inefficiencies at every step in the execution of an algorithm.

Software to optimize SWaP

Software-development tools are critical to SWaP optimization on target hardware. Good software-development tools provide windows into the software’s potential and actual interaction with the hardware, allowing the application to be tuned to execute with the fewest penalties. Among the most important tools at the developer’s disposal are middleware libraries; math libraries; and algorithm optimization, profiling, and visualization software.

Middleware is key to application performance and power efficiency. Middleware has many roles, one of the most important of which – in embedded applications – is to enable the most effective distribution of data between hardware nodes. Modern CPUs typically are designed as complex multiprocessors on a chip, where each core can run multiple threads simultaneously. The smaller the embedded system, the more critical efficient data flows become.

A popular middleware choice is MPI (Message Passing Interface), which is available in open-source and proprietary implementations. Because of its origins in processor-rich environments such as data centers, some implementations of MPI are more focused on low latency than on power efficiency. A typical example of a middleware-induced inefficiency is the “spin loop,” in which a processor repeatedly polls for incoming data, consuming cycles, burning power, and dissipating heat without accomplishing anything. A better approach for miserly embedded systems is to let the cores either sleep or perform other tasks while waiting for data to arrive.
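The difference between a spin loop and a sleeping wait can be illustrated with a minimal sketch. It uses Python's standard threading module as a stand-in and is not MPI code. The helper names (`measure_wait`, `spin_wait`, `blocking_wait`) and the 0.2-second delay are assumptions made for illustration, with a timer thread simulating the arrival of data.

```python
import threading
import time


def spin_wait(flag):
    """Busy-wait: poll the flag in a tight loop, burning CPU until it is set."""
    while not flag.is_set():
        pass


def blocking_wait(flag):
    """Sleeping wait: block until the flag is set, leaving the core idle."""
    flag.wait()


def measure_wait(wait_fn, delay_s=0.2):
    """Return the process CPU time consumed while waiting for simulated data."""
    flag = threading.Event()
    timer = threading.Timer(delay_s, flag.set)  # simulated data arrival
    cpu_start = time.process_time()
    timer.start()
    wait_fn(flag)
    cpu_used = time.process_time() - cpu_start
    timer.join()
    return cpu_used


spin_cpu = measure_wait(spin_wait)
block_cpu = measure_wait(blocking_wait)
print(f"spin loop CPU time:     {spin_cpu:.3f} s")
print(f"blocking wait CPU time: {block_cpu:.3f} s")
```

Both waits take about the same wall-clock time, but the spin loop charges nearly the whole delay to the CPU while the blocking wait uses almost none. On a power-constrained target, that idle time is what allows a core to sleep or take on other work.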

Another way to tune applications is to optimize the algorithms themselves, the core logic that instructs the computer how to execute its functions. The more efficiently the algorithm is coded, the faster the application runs.
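As a small illustration of algorithm-level tuning, the sketch below computes the same moving average two ways. The moving-average filter is a hypothetical example chosen here, not one discussed in the article. The first version re-sums every window; the second keeps a running sum, so the work per output sample no longer depends on the window length.

```python
def moving_average_naive(samples, window):
    """Re-sum each window from scratch: roughly len(samples) * window additions."""
    if window <= 0 or window > len(samples):
        return []
    return [sum(samples[i:i + window]) / window
            for i in range(len(samples) - window + 1)]


def moving_average_running(samples, window):
    """Keep a running sum: one addition and one subtraction per new sample."""
    if window <= 0 or window > len(samples):
        return []
    running = sum(samples[:window])
    averages = [running / window]
    for i in range(window, len(samples)):
        # Slide the window: add the newest sample, drop the oldest one.
        running += samples[i] - samples[i - window]
        averages.append(running / window)
    return averages


readings = [1, 2, 3, 4, 5, 6]
print(moving_average_naive(readings, 3))
print(moving_average_running(readings, 3))
```

Both functions return the same values for these inputs. With floating-point samples, the running-sum version can accumulate small rounding differences, which is a typical trade-off to check when restructuring numerical code.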

Software analysis, however, requires insight not only into the structure of the code but into how well the program executes on the target hardware. Especially where deterministic performance is required, developers must identify bottlenecks and scrutinize functions that may be consuming excessive processor time or memory space. Much of this information can be gleaned by visualizing timing issues as the code is running via an intuitive graphical user interface (GUI) that highlights performance issues across multiple cores and their distributed threads.
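A minimal sketch of this kind of bottleneck hunting, using Python's standard cProfile and pstats modules as a stand-in for a target-specific profiler, is shown below. The functions `cheap_step`, `costly_step`, and `pipeline` are hypothetical workloads invented for the example. A real embedded profiler would also expose hardware events such as cache and pipeline stalls, which this sketch does not.

```python
import cProfile
import pstats


def cheap_step(n):
    """A light stage of the hypothetical pipeline."""
    return sum(range(n))


def costly_step(n):
    """A deliberately heavy stage that should dominate the profile."""
    total = 0
    for i in range(n):
        for j in range(50):
            total += i * j
    return total


def pipeline():
    cheap_step(1_000)
    costly_step(20_000)


profiler = cProfile.Profile()
profiler.runcall(pipeline)

stats = pstats.Stats(profiler)
# stats.stats maps (file, line, function name) to
# (primitive calls, total calls, own time, cumulative time, callers).
own_time = {key[2]: entry[2] for key, entry in stats.stats.items()}
hottest = max(own_time, key=own_time.get)
print(f"function with the most own time: {hottest}")
```

Ranking functions by the time spent in their own code, rather than in the functions they call, points directly at the code worth optimizing first.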

An example of such technology is Abaco Systems’ AXIS Pro toolkit, which includes an integrated GUI, optimized MPI libraries, extensive math function libraries, and the EventView profiling and analysis program (Figure 1). All of the tools are aimed at reducing the complexity, time, and cost of developing and debugging multiprocessor embedded applications without getting lost in the hardware weeds.

Software is the star of embedded systems, but it is accelerated or constrained by its hardware base. Accurate, immediate, and detailed insight into the interaction between the software and the hardware is the best recipe for a successful, SWaP-optimized system.

Figure 1: Abaco’s AXIS software-development environment can help identify how well code runs on the target hardware, which enables designers to optimize SWaP up front.