Military Embedded Systems

Data-centric architectural best practices: Using DDS to integrate real-world distributed systems


September 04, 2013

Rose Wahlin

Real Time Innovations (RTI)

More and more real-world, complex distributed systems are integrated using a Data-Centric Publish-Subscribe approach, specifically the programming model defined by the Object Management Group (OMG) and known as the Data Distribution Service (DDS) specification. The DDS Publish-Subscribe approach meets many challenging requirements, supporting large-scale, high-performance, and constrained-bandwidth systems on both powerful machines and embedded platforms. DDS has been used across a wide variety of applications in the defense, robotics, transportation, medical, and financial industries. The following discussion provides a set of architectural "best practices" guidelines that should be applied when using DDS to integrate complex, real-world systems.

Teams often implement systems using a variety of technologies, programming languages, and operating systems. As a result, integrating and evolving these systems becomes complex. Traditional approaches rely on low-level messaging technologies, delegating much of the message interpretation and information management to application logic. This complicates system integration because different applications may end up with inconsistent interpretations and implementations of information-management services such as detecting component presence, managing state, ensuring reliability and availability of information, and handling component failures.

Integrating modern systems requires a new, modular network-centric approach that avoids these historic problems by relying on standard Application Programming Interfaces (APIs) and protocols that provide stronger information-management services. One such approach is the programming model defined by the Object Management Group (OMG) and known as the Data Distribution Service (DDS).

DDS simplifies the creation of distributed systems, but to use it effectively, it helps to follow best practices. These guidelines can help build a system that is scalable, maintainable, testable, and high-performing. Best practices for DDS fall into three broad categories:

·        Architectural: Best practices for architecting a distributed system with DDS for scalability, maintainability, and performance

·        Application design: Best practices for designing and implementing individual applications with DDS

·        Network configuration and QoS: Best practices for tuning systems for optimal scale and performance

 

 

This discussion focuses on architectural best practices, and how they can be used for optimal performance and scalability.

Creating a data model

When considering implementing DDS middleware, it is important first and foremost for system architects or developers to determine which data need to be sent between applications within the distributed system. This becomes the conceptual data model that will be mapped to data streams.

In the first phase, it is important to focus on the data that needs to be sent rather than the mechanism for sending it. For example, if the system is monitoring a fleet of Army trucks, the first thing to consider is: Which data are important to monitor about this fleet of trucks? For example, the system may need to monitor the trucks’ positions, manifests, oil levels, and maintenance information. Defining the data that must be sent should be done before mapping the data to messages or middleware technologies.

The next step is to map from this conceptual data model to a model that takes into account the network concerns of when, where, and how the data should be sent. To do this, the system architect needs to design network data streams based on the structure of those data and their delivery characteristics. The following best practices should be considered when mapping from a conceptual data model to a network data model:

Best Practice #1: Create a network data model based on related data and delivery characteristics

To ensure future scalability and performance, an architect should group data items that are logically related and have similar delivery characteristics together when forming the network data model.

For example, consider a data set representing the state of a fleet of Army trucks (Figure 1). The data being sent include the license plate information, GPS information, manifest information, the oil level in each truck, and the maintenance information. Each of these types of data is updated at a different rate, or in some cases aperiodically. For example, the maintenance information may change after a certain number of miles rather than after a certain amount of time. The architect must take into account the different types of data being sent, and their varying rates, when mapping them to data streams on the network.

 

Figure 1: A fleet of trucks has fuel data, GPS data, manifest data, and position data. The first step when building a distributed system is to define which data are being sent.


The system architect maps those data streams, in DDS terminology, to Topics.

To give an example of data that are logically related with similar delivery characteristics, the state of an Army truck may include:

·        Latitude

·        Longitude

·        Speed

·        Direction

 

 

This is all related and changes at the same rate, so it can be mapped to a single Topic called “TruckGPS.”
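To make this concrete, the sketch below shows what such a Topic might look like in code. It uses OMG IDL (in a comment) for the type and the DDS modern C++ API (DDS-PSM-Cxx, as implemented by products such as RTI Connext) for creating the Topic and writing a sample. The TruckGPS name comes from the example above; the field types, the generated header name, and the accessor style are illustrative assumptions, not code from the article.

    // IDL definition, compiled to C++ by a vendor code generator (for
    // example, rtiddsgen) -- field types are illustrative:
    //
    //   struct TruckGPS {
    //       double latitude;
    //       double longitude;
    //       double speed;
    //       double direction;
    //   };

    #include <dds/dds.hpp>    // modern C++ DDS API (DDS-PSM-Cxx)
    #include "TruckGPS.hpp"   // hypothetical generated type header

    int main() {
        // Join DDS domain 0; typically one participant per application.
        dds::domain::DomainParticipant participant(0);

        // A Topic pairs the stream name with its data type.
        dds::topic::Topic<TruckGPS> topic(participant, "TruckGPS");

        // All four fields travel together in one update, because they
        // are logically related and change at the same rate.
        dds::pub::DataWriter<TruckGPS> writer(
            dds::pub::Publisher(participant), topic);

        TruckGPS sample;
        sample.latitude(37.40);     // accessor style is illustrative
        sample.longitude(-122.01);
        sample.speed(55.0);
        sample.direction(270.0);
        writer.write(sample);
        return 0;
    }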

Issues occur when architects or developers haven't thought through the data model and have instead put everything into one data stream. In the truck example, this would mean that all truck data would be sent at the same time: the oil level, manifest, maintenance information, and all other truck data would be sent at the high rate of the GPS information. This may work with a limited amount of data, but even at a small scale it wastes bandwidth and CPU capacity. In a system with more data, it becomes unworkable. By applying Best Practice #1 up front, the architect can prevent scalability and performance problems down the line.

Best Practice #2: Use typed data

Giving data an actual structure or type, instead of sending opaque binary data around the network (unless it truly is binary data such as images and video), can be very beneficial. One reason is that the middleware handles the discovery and distribution of type information.

With typed data, one can automatically plug in other applications and services, such as data visualization tools or distributed logging tools. This allows developers and system integrators to plug in COTS or custom-built tools to visualize data, to record data, to translate real-time data into a relational database, or to perform additional functions on the data that are not known in advance.

When applications use typed data, it also enables capabilities beyond what was originally designed. New applications can be developed that choreograph, aggregate, split, or mediate data without having to worry about protocol-level details. This allows data to evolve in reasonable, well-organized ways, and systems to be maintained and upgraded gracefully.
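For example, a visualization or logging tool could subscribe to the same typed stream with no protocol-level code of its own. A minimal sketch, reusing the illustrative TruckGPS type from above:

    #include <iostream>
    #include <dds/dds.hpp>
    #include "TruckGPS.hpp"   // same hypothetical generated type as above

    int main() {
        dds::domain::DomainParticipant participant(0);
        dds::topic::Topic<TruckGPS> topic(participant, "TruckGPS");
        dds::sub::DataReader<TruckGPS> reader(
            dds::sub::Subscriber(participant), topic);

        // The middleware has already discovered the type and handled
        // deserialization and endianness; the tool sees plain objects.
        dds::sub::LoanedSamples<TruckGPS> samples = reader.take();
        for (const auto& sample : samples) {
            if (sample.info().valid()) {
                std::cout << "lat=" << sample.data().latitude()
                          << " lon=" << sample.data().longitude() << "\n";
            }
        }
        return 0;
    }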

In contrast, if an application is designed with data that are opaque to the middleware, the application developer must write the logic to convert the data to and from a network form, to support the required programming languages, and to handle endianness. This adds logic that must be tested and maintained, and it prevents COTS tools from interacting with the data. Ultimately, this can increase the cost of evolving and upgrading the system.

Best Practice #3: Use keyed data

If multiple real-world objects are being represented in a system, the system architect should use key fields to inform the middleware what those objects are. Key fields are fields in your data type that form a unique identifier of a real-world object. If data are keyed, the middleware will recognize that each unique value of key fields represents a unique real-world object (Figure 2).

 

Figure 2: In this diagram, the user has told the middleware that it is representing actual trucks. The field is annotated to distinguish that it is a key field. Multiple key fields of different types are possible.


Circling back to the fleet of Army trucks example, a developer who wants to represent multiple trucks inside a single data stream would do so by adding a VIN (vehicle identification number) field to each Topic that was previously designed, and then marking that VIN as a key field in DDS. Now, if the middleware sees a VIN it hasn't seen before, it recognizes that this is a different, previously unknown truck object. The DDS term for these unique real-world objects is Instances.
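In code, this amounts to one annotation on the identifying field plus, optionally, instance-aware write calls. A sketch under the same assumptions as the earlier examples (the @key annotation is standard in newer IDL; older IDL compilers use a trailing //@key comment, and the VIN value is made up):

    // IDL with the VIN marked as the key field:
    //
    //   struct TruckGPS {
    //       @key string vin;    // unique per real-world truck
    //       double latitude;
    //       double longitude;
    //       double speed;
    //       double direction;
    //   };

    #include <dds/dds.hpp>
    #include "TruckGPS.hpp"   // hypothetical type regenerated with the key

    int main() {
        dds::domain::DomainParticipant participant(0);
        dds::topic::Topic<TruckGPS> topic(participant, "TruckGPS");
        dds::pub::DataWriter<TruckGPS> writer(
            dds::pub::Publisher(participant), topic);

        TruckGPS sample;
        sample.vin("1FTSW21P34ED12345");   // each distinct VIN is an Instance

        // Registering the instance is optional but lets the middleware
        // pre-compute the key and return a handle for faster writes.
        dds::core::InstanceHandle truck = writer.register_instance(sample);

        sample.latitude(37.40);
        sample.longitude(-122.01);
        writer.write(sample, truck);       // update for this truck only

        writer.dispose_instance(truck);    // e.g., truck retired from fleet
        return 0;
    }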

The first benefit of letting the middleware know there are unique objects in the data stream is that it can keep track of life-cycle information for those instances. For example, it can alert the user when an update arrives from a previously unknown Army truck. If a new truck has been added to the fleet and publishes the "TruckGPS" Topic, the subscribing application is notified of the position and also that this is the first time it has received an update about this truck.

The middleware can also keep track of life-cycle information about whether this instance is alive or not. For example, perhaps the user has stopped receiving updates about a particular Army truck. The user can be notified that this particular truck instance has become not alive. Applications can monitor these life-cycle events to detect problems like application errors or network disconnections.
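Both notifications surface as metadata on each received sample. A sketch of how a subscriber might check them, using the view-state and instance-state names from the DDS specification (the reader setup is the same as in the earlier sketches):

    // 'reader' is a dds::sub::DataReader<TruckGPS> as created earlier.
    dds::sub::LoanedSamples<TruckGPS> samples = reader.take();
    for (const auto& sample : samples) {
        const auto& state = sample.info().state();

        if (state.view_state() == dds::sub::status::ViewState::new_view()) {
            // First update ever received for this VIN: a new truck.
        }
        if (state.instance_state() !=
                dds::sub::status::InstanceState::alive()) {
            // Updates have stopped (the instance was disposed or its
            // writers were lost): this truck is no longer alive.
        }
    }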

In addition, the middleware can use Quality of Service (QoS) policies to control behavior per instance. A few examples, sketched in code after this list, include:

·        The middleware can alert the user if an update for a particular instance is delayed, using the deadline QoS. For example, if the middleware is configured to expect GPS updates from each truck every two seconds, the application is notified when it does not receive an update from a particular truck within that time.

·        The middleware can allocate a cache per instance using the history QoS. For example, the user can tell the middleware they would like to keep the last five updates per truck. The middleware will then keep a maximum of five updates per truck, and prevent updates about one truck instance from overwriting updates for another.

·        The middleware can be configured to provide a failover mechanism per instance using the ownership QoS. For example, redundant sensors can be set up to publish updates about a particular truck and fail over gracefully. One sensor might be the owner (primary writer) of the TruckGPS data for a particular truck; if it fails, another sensor automatically takes its place as the owner of that truck's TruckGPS data. These redundancies can be set up for each individual object being monitored.
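A sketch of how these three policies might be expressed with the modern C++ QoS API is shown below. The two-second deadline and five-sample history depth come from the examples above; in practice such settings are often kept in XML QoS profiles rather than in code.

    #include <dds/dds.hpp>

    void configure_per_instance_qos() {
        // Reader side: deadline and history are enforced per truck instance.
        dds::sub::qos::DataReaderQos reader_qos;
        reader_qos << dds::core::policy::Deadline(dds::core::Duration(2))
                   << dds::core::policy::History::KeepLast(5)
                   << dds::core::policy::Ownership::Exclusive();

        // Writer side: with exclusive ownership, the strongest live
        // writer owns each instance; if it fails, the next-strongest
        // writer's updates are delivered instead (automatic failover).
        dds::pub::qos::DataWriterQos writer_qos;
        writer_qos << dds::core::policy::Deadline(dds::core::Duration(2))
                   << dds::core::policy::Ownership::Exclusive()
                   << dds::core::policy::OwnershipStrength(10);
    }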

 

 

Alternatively, if the system architect does not design the data to be keyed, the middleware will not understand that multiple objects are being represented. The application developer must then write logic that duplicates what is already in the middleware: detecting object life cycles, detecting delayed object updates, and providing failover between writers updating an object. In addition, if the middleware is not aware that the application is representing multiple objects, it cannot keep a separate logical queue per object. This forces the application to use longer queues to ensure that updates for a particular object are not lost.

Real-world applications

The discussed architectural “best practices” guidelines, gleaned from extensive experience with hundreds of DDS-based applications, should be considered when using DDS to integrate real-world systems. The reason: Real-world systems must operate continuously and interact directly with real-world objects. They must perform within the constraints and timing imposed by the physical world. In practice, this means they must be able to handle the information as it arrives and be robust to changes in the operating environment.

Such distributed systems are increasingly being integrated using a data-centric, publish-subscribe DDS approach. The main benefit of DDS is its ability to map the application data model directly into application code. DDS is the only middleware standard that covers the programming-language APIs (ensuring portability between implementations), the wire protocol (ensuring interoperability between components that use different implementations of the middleware), and QoS. It also supports multiple programming languages, such as C, C++, Java, .NET, and Ada.

Rose Wahlin is a Principal Software Engineer at RTI. She led the development team for the RTI implementation of a small-footprint DDS prototype for edge connectivity. She has been involved in RTI projects developing capabilities for static discovery, specialized network transports, RTI tools, and a range of customer architecture projects. She can be contacted at [email protected].

RTI

www.rti.com

 
