Justifiably taboo: Avoiding malloc-free APIs in military/aerospace embedded code


February 17, 2011

Steve Graves

McObject

Want to start a lively - even contentious - discussion among programmers? Just ask, "Is it safe to use dynamic memory allocation?"

Popularized in C/C++, dynamic allocation eases development by doling out system memory to application processes as needed at runtime and retrieving the memory when it is no longer needed.

But dynamic allocation is widely considered taboo in safety-critical embedded software. The use of the C runtime library’s malloc() and free() APIs, which do the grunt work of dynamic allocation, can introduce disastrous side effects such as memory leaks or fragmentation. Further, malloc() can exhibit wildly unpredictable performance and become a bottleneck in multithreaded programs on multicore systems. Due to its risk, dynamic memory allocation is forbidden, under the DO-178B standard, in safety-critical embedded avionics code.

Developers across the embedded industry seem to react viscerally to the topic. In a recent Internet technical group discussion, the question, “Do you use dynamic memory allocation in your embedded design?” garnered an astonishing 77 responses, typified by “generally, it is considered a violation of best practices” for fault-tolerant systems, and “if the requirements include ‘five-nines’ (99.999 percent uptime) reliability, hard real-time, or small memory footprint, the answer is ‘never.’” Job seekers, take note: One consulting engineer’s interview strategy “is to gently probe the prospective employee on their dynamic memory allocation usage in real-time apps. If they have no problem with it, they are not going to be hired.”

A better strategy – both for code safety and job interview success – is to replace the standard (default) allocators in safety-critical code with customized memory allocation functions that more closely match specific allocation scenarios. The following discussion describes two such custom memory managers: a stack-based allocator and a thread-local allocator. Another way to rid an application of malloc() and free() – and thus gain better performance, stability, and predictability – is to replace code based on standard allocation functions with off-the-shelf software that incorporates custom allocators. Use of an In-Memory Database System (IMDS) is discussed as an example of this “buy rather than build” approach.

It’s standard, but is it the best?

Why are standard (dynamic) memory managers a poor choice for mission-critical code? Typically they are based on list allocator algorithms that organize the memory pool into contiguous locations (free holes) in a singly linked list. The allocator then “walks” this chain looking for the appropriate hole to meet a request. List allocators are the quintessential general-purpose function: They do a pretty good job of allocating and deallocating memory across a wide range of situations – but “pretty good” is not good enough in mission- or safety-critical systems.

Stack-based algorithm: Allocate and rewind memory

Certain application scenarios call for allocating many short-lived objects, then freeing them all at once. A stack-based allocator (not to be confused with the application call stack) is one type of custom allocator that works well here. With this algorithm, each allocation returns the address of the current position of the stack pointer and advances the pointer by the amount of the request (Figure 1). When memory is no longer needed, the stack pointer is rewound. Processing overhead is reduced because there is no chain of pointers to manage, nor are there any allocation sizes or free holes to track. This approach is safer, too: A memory leak can’t be accidentally introduced through improper deallocation because the application does not have to track specific allocations.

Figure 1: A custom stack-based allocator


The overhead eliminated by using a stack-based allocator versus a standard list allocator increases as the application continues to run. When memory is deallocated in random order, freed chunks end up scattered among chunks still in use (this is called fragmentation), and the list allocator often needs to add both a pointer and a size value to its chain for each resulting hole, so that pointers and size values represent an ever-larger percentage of total heap size. So the list allocator’s overhead (the amount of metadata that must be managed and the likelihood of having to walk further to find a suitable free hole) grows as the application continues to run. (With the stack-based allocator, all chunks allocated from a point in time are returned to the heap in one action, avoiding fragmentation.)
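The allocate-and-rewind scheme described above can be sketched in a few lines of C. This is a minimal illustration, not the allocator of any particular product: the names stack_pool_t, stack_alloc(), stack_mark(), and stack_rewind() are hypothetical, and the sketch assumes a C11 compiler and a caller-supplied buffer that is already suitably aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; assumes C11 (_Alignof, max_align_t). */
typedef struct {
    uint8_t *base;  /* start of the buffer set aside for this task   */
    size_t   size;  /* total bytes in the buffer                     */
    size_t   top;   /* offset of the allocator's "stack pointer"     */
} stack_pool_t;

typedef size_t stack_mark_t;  /* a saved stack-pointer position */

/* Attach the allocator to a buffer (assumed suitably aligned). */
static void stack_init(stack_pool_t *p, void *buffer, size_t size) {
    assert(p != NULL && buffer != NULL);
    p->base = buffer;
    p->size = size;
    p->top  = 0;
}

/* Return the current stack-pointer address and advance it by the
 * request, rounded up so the next allocation stays aligned.
 * Returns NULL if the dedicated buffer cannot satisfy the request. */
static void *stack_alloc(stack_pool_t *p, size_t n) {
    const size_t a = _Alignof(max_align_t);
    if (n > SIZE_MAX - (a - 1)) return NULL;       /* rounding would overflow */
    size_t need = (n + a - 1) & ~(a - 1);
    if (need > p->size - p->top) return NULL;      /* buffer exhausted */
    void *out = p->base + p->top;
    p->top += need;
    return out;
}

/* Remember the current position so it can be restored later. */
static stack_mark_t stack_mark(const stack_pool_t *p) {
    return p->top;
}

/* Release everything allocated since the mark in one action. */
static void stack_rewind(stack_pool_t *p, stack_mark_t m) {
    assert(m <= p->top);
    p->top = m;
}
```

Note what is absent: no per-allocation header, no free list, and no individual free call. A task takes a mark, allocates as many short-lived objects as it needs, and rewinds to the mark when the work is done.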

The multithreaded, multicore allocation challenge

The default malloc() and free() functions, which typically serialize access to the heap with a mutex, are often to blame when multithreaded applications bog down on multiprocessor hardware. Threads using these allocators can cause locking conflicts, and the OS resolves these in part via performance-draining context switches. A custom thread-local allocator avoids conflicts by assigning a specific memory pool to each thread. The thread’s allocation is performed from this block without interfering with other threads’ requests, thus enhancing performance and predictability. When a thread allocator runs out of memory, some other allocator can assign it another block if the system allows it. The thread-local allocator uses a Pending Request List (PRL) for each thread to coordinate the release of memory blocks that are freed by a thread other than the one that performed the original allocation. Memory that is allocated and deallocated by the same thread requires no coordination, and therefore no lock conflicts occur.
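A rough sketch of the per-thread pool and its Pending Request List follows. It is one possible design under stated assumptions, not a description of how any shipping allocator implements the idea: all names (tl_pool_t, tl_alloc(), tl_free(), tl_self()) are hypothetical, blocks are a single fixed size for brevity, the PRL is a lock-free list built on C11 atomics, and a pool is assumed to outlive every block carved from it. Unlike the scheme described above, this sketch simply returns NULL when a pool is exhausted rather than asking another allocator for more memory.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define TL_PAYLOAD 64  /* hypothetical fixed block size, in bytes */
#define TL_BLOCKS  8   /* hypothetical number of blocks per thread */

typedef struct tl_pool tl_pool_t;

typedef struct tl_block {
    tl_pool_t       *owner;  /* pool this block was carved from */
    struct tl_block *next;   /* link for the free list or the PRL */
    _Alignas(max_align_t) unsigned char payload[TL_PAYLOAD];
} tl_block_t;

struct tl_pool {
    tl_block_t            blocks[TL_BLOCKS];
    tl_block_t           *free_list; /* touched by the owner thread only */
    _Atomic(tl_block_t *) pending;   /* PRL: blocks freed by other threads */
    int                   ready;
};

static void tl_pool_init(tl_pool_t *p) {
    p->free_list = NULL;
    for (int i = 0; i < TL_BLOCKS; i++) {
        p->blocks[i].owner = p;
        p->blocks[i].next  = p->free_list;
        p->free_list       = &p->blocks[i];
    }
    atomic_init(&p->pending, NULL);
    p->ready = 1;
}

/* The calling thread's own pool, set up on first use. */
static tl_pool_t *tl_self(void) {
    static _Thread_local tl_pool_t pool;
    if (!pool.ready) tl_pool_init(&pool);
    return &pool;
}

/* Allocate from the caller's pool; no lock is taken on this path. */
static void *tl_alloc(tl_pool_t *self) {
    if (self->free_list == NULL) {
        /* Local list is empty: reclaim the whole PRL in one atomic step. */
        self->free_list = atomic_exchange(&self->pending, NULL);
    }
    tl_block_t *b = self->free_list;
    if (b == NULL) return NULL;  /* pool exhausted */
    self->free_list = b->next;
    return b->payload;
}

/* Free a block; 'self' is the pool of the thread doing the freeing. */
static void tl_free(tl_pool_t *self, void *ptr) {
    tl_block_t *b = (tl_block_t *)((unsigned char *)ptr
                                   - offsetof(tl_block_t, payload));
    tl_pool_t *owner = b->owner;
    if (owner == self) {         /* same thread: no coordination needed */
        b->next = self->free_list;
        self->free_list = b;
        return;
    }
    /* Different thread: push onto the owner's PRL. */
    tl_block_t *head = atomic_load(&owner->pending);
    do {
        b->next = head;
    } while (!atomic_compare_exchange_weak(&owner->pending, &head, b));
}
```

In real use each thread would call tl_alloc(tl_self()) and tl_free(tl_self(), ptr). The same-thread path touches only thread-private data, which is where the predictability comes from; only a cross-thread free touches shared state, and then only the owner's PRL.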

In short, problems are avoided in safety-critical code by removing memory management responsibility from malloc() and free() and assigning it to the application, which uses custom allocators that mesh with specific application tasks. The custom allocator sets aside a buffer for the exclusive use of that task, usually during boot-up, and satisfies memory allocation requests from it. If the buffer memory runs low, the application is informed and can free up memory inside the buffer. Or it can find more memory elsewhere to devote to the task. Exhausting the memory in this dedicated pool has no impact on other parts of the system. Custom allocators that might be chosen include the ones discussed, as well as bitmap allocators, block allocators, and others.
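To make the dedicated-buffer idea concrete, here is a minimal sketch of a bitmap allocator, one of the other allocator types just mentioned. The names (bitmap_pool_t, bitmap_alloc(), bitmap_available()) and the chunk size and count are assumptions chosen for illustration. One bit records whether each fixed-size chunk is in use, and the application can ask how many chunks remain so that it can react before the buffer runs out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CHUNK_SIZE 32  /* hypothetical chunk size, in bytes */
#define N_CHUNKS   32  /* one bit per chunk in a 32-bit map */

typedef struct {
    _Alignas(max_align_t) unsigned char mem[N_CHUNKS * CHUNK_SIZE];
    uint32_t used;     /* bit i set means chunk i is allocated */
} bitmap_pool_t;

static void bitmap_init(bitmap_pool_t *p) {
    p->used = 0;
}

/* Hand out the lowest-numbered free chunk, or NULL if none is left.
 * Exhaustion affects only this pool, never the rest of the system. */
static void *bitmap_alloc(bitmap_pool_t *p) {
    for (unsigned i = 0; i < N_CHUNKS; i++) {
        uint32_t bit = UINT32_C(1) << i;
        if ((p->used & bit) == 0) {
            p->used |= bit;
            return p->mem + (size_t)i * CHUNK_SIZE;
        }
    }
    return NULL;
}

/* Return a chunk to the pool by clearing its bit. */
static void bitmap_free(bitmap_pool_t *p, void *ptr) {
    size_t off = (size_t)((unsigned char *)ptr - p->mem);
    assert(off < sizeof p->mem && off % CHUNK_SIZE == 0);
    uint32_t bit = UINT32_C(1) << (off / CHUNK_SIZE);
    assert((p->used & bit) != 0);  /* catches a double free */
    p->used &= ~bit;
}

/* Number of chunks still available, so the application can detect
 * that the buffer is running low and decide what to do about it. */
static unsigned bitmap_available(const bitmap_pool_t *p) {
    unsigned n = 0;
    for (unsigned i = 0; i < N_CHUNKS; i++) {
        if ((p->used & (UINT32_C(1) << i)) == 0) n++;
    }
    return n;
}
```

The pool would typically be a static object, so its memory is reserved at build time or during boot-up rather than requested from a general-purpose heap while the system is running.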

Allocation via third-party application

The benefits of custom memory allocators can also be harnessed by integrating third-party software that uses them. IMDSs are a good candidate to benefit from custom allocators, because they are designed expressly to manage application objects in RAM. Figure 2 illustrates allocation/deallocation using malloc() and free(). Figure 3 shows the same process using McObject’s eXtremeDB, an IMDS that incorporates custom allocators, including stack-based and thread-local. At the start of Figure 2, a C program defines a structure, declares a pointer to an instance of that structure, and allocates memory for it via malloc().

Figure 2: Memory allocation using malloc() and free()


Figure 3: Memory allocation using an in-memory database system


The programmer using the IMDS defines classes in a database schema file, which is processed (via a special compiler) to produce a .C file, as well as a .H file that contains type definitions and function prototypes.

If the program that uses malloc/free is multithreaded and threads will share the Sensor object, the developer must implement concurrency control. With an IMDS, concurrency is managed automatically via transactions. Figure 3 shows how a transaction begins (mco_trans_start) and gets a transaction handle.

Calling Sensor_new() claims some of the memory pool dedicated to the IMDS for a new Sensor object. (In a military/aerospace application, a sensor object could represent anything from optical sensors for tracking missile targets to biosensors for defense in chemical warfare or motion sensors to aid in navigating an aircraft.) Sensor_new() returns a handle to the database object, through which the object’s values can be written and/or read. In contrast, the C program works directly with the structure’s fields, creating the need for concurrent access controls in a multithreaded application.

When the C program finishes using the Sensor structure, free() returns memory to the heap. When the code with the IMDS finishes, the space in the database is relinquished, the transaction ended, and the memory used for the sensor object returned to the dedicated memory pool.
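For readers without the figures at hand, the malloc()/free() pattern described for Figure 2 looks roughly like the following. This is a reconstruction of the general pattern only, not the figure's actual listing: the Sensor fields and the function name sample_sensor() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical fields; the article does not list the real ones. */
typedef struct {
    uint32_t id;
    double   reading;
} Sensor;

/* Allocate a Sensor on the heap, work directly with its fields,
 * then return the memory. Returns 0 on success, -1 if malloc() fails. */
static int sample_sensor(uint32_t id, double reading, double *out) {
    assert(out != NULL);
    Sensor *s = malloc(sizeof *s);  /* may fail, and may fragment the heap */
    if (s == NULL) return -1;
    s->id      = id;                /* direct field access: no built-in   */
    s->reading = reading;           /* protection if threads share 's'    */
    *out = s->reading;
    free(s);                        /* forgetting this call leaks memory  */
    return 0;
}
```

Every hazard the article attributes to this style is visible in the comments: the allocation can fail at an arbitrary point at runtime, the fields are unprotected if the object is shared across threads, and correct cleanup depends on the programmer remembering the free() call on every path.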

The eXtremeDB IMDS can run low on memory, but this would generate a “database full” error message that can be dealt with by the application. In contrast, memory fragmentation and leakage caused by malloc() and free() could destabilize the entire system. The IMDS works “behind the scenes” to allocate and free memory, gaining efficiency and flexibility by using multiple underlying allocator types and avoiding the riskiness inherent in malloc() and free().

Custom memory managers, though, are not particular to an IMDS. For example, off-the-shelf code to manage sensor networks is well-suited to a type of custom memory manager called a block allocator. While discrete sensor values are not known in advance, these values’ size is fixed and known (like a 4-byte timestamp and an 8-byte value), and the block allocator excels in parceling out memory chunks of a predefined size. The stack-based allocator is useful for any computing requirement that can be divided into a first stage in which all the memory is needed, and a second stage in which all the memory is no longer needed. Any program that has to parse some input stream fits this description. For example, a communication surveillance program might parse a stream of text (spoken words), build a tree of tokens (words or phrases), then perform some post-processing on it. That post-processing could be deciding whether a given word or phrase is relevant within its context.
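The block allocator mentioned above for sensor readings can be sketched as follows. This is an illustration under stated assumptions rather than any product's implementation: the names (reading_t, block_pool_t, block_alloc(), block_free()) and the slot count are hypothetical, and the record layout simply follows the example of a 4-byte timestamp and an 8-byte value. Because every slot is the same size, a free slot can store the link to the next free slot inside itself, so no separate bookkeeping memory is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One fixed-size record: a 4-byte timestamp and an 8-byte value. */
typedef struct {
    uint32_t timestamp;
    double   value;
} reading_t;

/* A slot holds either a live reading or, while free, a link to the
 * next free slot. */
typedef union slot {
    reading_t   reading;
    union slot *next_free;
} slot_t;

#define N_SLOTS 16  /* hypothetical pool capacity */

typedef struct {
    slot_t  slots[N_SLOTS];
    slot_t *free_head;
    size_t  in_use;
} block_pool_t;

/* Thread every slot onto the free list, lowest index first. */
static void block_init(block_pool_t *p) {
    p->free_head = NULL;
    for (size_t i = N_SLOTS; i-- > 0; ) {
        p->slots[i].next_free = p->free_head;
        p->free_head = &p->slots[i];
    }
    p->in_use = 0;
}

/* Constant-time allocation: pop the head of the free list. */
static reading_t *block_alloc(block_pool_t *p) {
    slot_t *s = p->free_head;
    if (s == NULL) return NULL;  /* pool exhausted */
    p->free_head = s->next_free;
    p->in_use++;
    return &s->reading;
}

/* Constant-time release: push the slot back onto the free list. */
static void block_free(block_pool_t *p, reading_t *r) {
    slot_t *s = (slot_t *)r;
    assert(s >= p->slots && s < p->slots + N_SLOTS);
    s->next_free = p->free_head;
    p->free_head = s;
    p->in_use--;
}
```

Both operations take the same small number of steps no matter how long the system has been running, and since all slots are interchangeable, releasing them in random order cannot fragment the pool.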

In fact, it is hard to think of an application type that would not benefit from memory management that is geared to its specific allocation patterns and challenges. Of course, customizing memory management adds another consideration to the already complex task of software development. But software engineers entering the safety-critical arena know the demands and stakes are higher than in consumer or business application development. Writing code that avoids dynamic memory allocation and instead uses one or more custom memory managers is less convenient. But it adds safety and stability, and that’s a trade-off engineers of safety-critical systems should embrace.

Steve Graves is cofounder and CEO of McObject, a provider of embedded Database Management System (DBMS) software. He can be contacted at steve.graves@mcobject.com.

McObject 425-888-8505 www.mcobject.com