Mars, methodologies, and mastery of embedded development
Embedded software in military, defense, and aerospace programs is on an exponential complexity curve. Yet fail-safe reliability, durability, and security have always been paramount for the military and aerospace sectors, which excel at building systems that include significant hardware and software components. This discipline sent us to the moon, to Mars, and beyond. How can we leverage these defense industry lessons, with their focus on leading interdisciplinary communications and methodologies, augmented by today's new system and software development technologies? A JPL engineer reviews the Mars Viking 1976 program and explores the latest virtual platform technologies for embedded systems and software reliability and accelerated development.
Long ago, in a galaxy far, far away ... the U.S. space program did some special work. It wasn’t fantasy, it wasn’t Hollywood, and it was within our solar system. I was fortunate to intern at the Jet Propulsion Laboratory (JPL) the summer that Viking landed on Mars. I was there at 5:12 a.m. PDT on July 20, 1976, celebrating with the rest of the Viking team when the landing confirmation, transmitted from Viking minutes earlier, was received at JPL. Yes, I was just an intern, but I was still part of a team that had put together an extraordinary system, one that functioned with incredible versatility, reliability, and durability tens of millions of miles away. (Figure 1.)
Fast forward to the 1980s: As industry in Silicon Valley was taking off, real and apocryphal stories popped up about the cost and overhead involved in the defense and aerospace arena. Having spent the first nine years of my professional career in this industry, I can attest that there’s a grain of truth to those stories. Silicon Valley and the semiconductor industry rejected – from a cultural and business perspective – the military-aerospace industry. With innovation and time to market as its drivers, Silicon Valley boomed.
The markets that drove Silicon Valley, personal computers and mobile phones, had much lower reliability and durability requirements than the military/aerospace segments. What the military/aerospace industry did well – and continues to do well – is build high-reliability, high-durability systems with significant hardware and software components. Military/defense/aerospace understood, and still understands, system design and how to get software and hardware working together.
Now, with automotive applications and the Internet of Things (IoT) driving expected growth in the semiconductor industry over the next few years, and with both depending heavily on embedded systems, people are struggling to get the systems right: reliable, safe, and secure. It’s not exactly “rocket science,” and yet it’s really hard. It takes engineers who understand the system, and people in the various disciplines communicating early, often, and in great detail across those engineering boundaries.
Specifications, separation, systems, software, and hardware
Begin with specifications: Design engineers are waiting on them to start their designs. Specifications come in a variety of formats, which raises two questions: What does the specification look like? How is it shared? The key issue we see is that teams are more successful when the specification is a living document; that is, living not just with the architectural team, but with the design and verification teams as well. The more interdisciplinary the development is, the better.
For example, one semiconductor company we know keeps a fair bit of separation between the different teams, with little cross-functional cooperation or feedback. As a result, in one case, relatively late in the process, they had to defeature their devices because some of the advanced features the hardware team was adding were not going to be supported by the OS team. If the architecture team had communicated with the OS team six months earlier, there were some pretty cool features that could have been supported. But the communication came too late. Too many people are not used to thinking about software up front, at least in the semiconductor world.
In contrast, we know an embedded systems customer that starts its process by considering the complete system it is developing, including a heavy software component. The company knows that a lot of the value lies in the software, as well as in the hardware. (Whereas for a system-on-chip coming from a semiconductor vendor, even though the vendor has to deliver a software stack, the perception remains that all the value is in the silicon.) Because this embedded systems company has the end-product perspective, there is much more focus on software, on how to partition software and hardware, and on how to implement the overall design. Interdisciplinary communication is central to how the company works.
Speaking strictly on the semiconductor side, without good communication between architecture and design teams, potential problems obviously include respins, or specifying a chip that is really complex to implement or to verify, or cutting back on features. These risks are common and the costs are recognizably huge.
What people often don’t realize is that there are the same risks and financial tragedies on the software side; it’s just not as glaringly obvious. Having to do a hardware respin that costs $5 million or $10 million or more is like stepping off a cliff. The software side is a much more slippery slope: If your system has bugs, is more complex to support, or you need to defeature capabilities, you have to do release after release after release. This does have an associated cost. Often, management just rationalizes that more software releases are always necessary, so they just keep doing more releases, but it is difficult, affects the bottom line, and absolutely cuts into profits. It is the same vertical fall as the hardware respin cliff, but more like rolling down a grassy hill. You actually go down (cost- and profit-wise) just as far; if you look at the numbers they are close to the same over a one- or two-year period of a project. It’s just that when you step off a cliff you notice it more; even if you’re rolling down that grassy hill, it’s still a long way down.
A Rosetta Stone for software, hardware, and systems developers
What is the solution? One approach is a virtual platform methodology, which comprises a high-level software model of the system, including IP, hardware, and embedded software, along with a high-performance simulator. The virtual platform enables the entire system to be simulated, facilitating early architectural exploration and analysis of tradeoffs between hardware and software. It also serves as an “executable specification,” unifying the various interdisciplinary teams for detailed common understanding to enhance cooperation. It facilitates testing by executing code even before hardware is available and delivering the controllability and observability needed for rapid debug.
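To make the idea concrete, here is a deliberately tiny sketch of the concept in Python. It is not the API of any commercial or open source virtual platform product; the `Bus`, `Uart`, and `firmware_hello` names and the `UART_BASE` memory map are hypothetical, chosen only for illustration. It shows the three ingredients described above: a software model of a hardware block, embedded code that runs against that model before any silicon exists, and a trace that gives the observability needed for debug.

```python
class Uart:
    """Hypothetical UART model with a single transmit register at offset 0x0."""
    TX = 0x0

    def __init__(self):
        self.sent = bytearray()  # everything the "hardware" has transmitted

    def read(self, offset):
        return 0  # this toy model has no readable state

    def write(self, offset, value):
        if offset == self.TX:
            self.sent.append(value & 0xFF)


class Bus:
    """Routes addresses to device models and records every access."""

    def __init__(self):
        self.regions = []  # list of (base, size, device)
        self.trace = []    # list of (op, address, value): the observability hook

    def attach(self, base, size, device):
        self.regions.append((base, size, device))

    def _find(self, address):
        for base, size, device in self.regions:
            if base <= address < base + size:
                return device, address - base
        raise ValueError(f"bus fault at {address:#x}")

    def write(self, address, value):
        device, offset = self._find(address)
        self.trace.append(("W", address, value))
        device.write(offset, value)

    def read(self, address):
        device, offset = self._find(address)
        value = device.read(offset)
        self.trace.append(("R", address, value))
        return value


UART_BASE = 0x4000_0000  # assumed memory map, for illustration only


def firmware_hello(bus):
    """Stand-in for embedded software: a driver that writes bytes to the UART."""
    for byte in b"OK\n":
        bus.write(UART_BASE + Uart.TX, byte)


def build_platform():
    """Assemble the toy platform: one bus with one UART attached."""
    bus = Bus()
    uart = Uart()
    bus.attach(UART_BASE, 0x100, uart)
    return bus, uart


if __name__ == "__main__":
    bus, uart = build_platform()
    firmware_hello(bus)
    print(uart.sent.decode(), end="")
    print(len(bus.trace), "bus accesses recorded")
```

A real virtual platform models processors, memories, and peripherals at far greater fidelity and runs unmodified target binaries. The point of the sketch is only that the memory map and register behavior live in one executable place that hardware, software, and verification teams can all run and inspect.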
Although system-level codesign is challenging, virtual prototyping is a useful approach for collaboration between architects, designers, verification engineers, hardware engineers, and software engineers. It’s a methodology that can enforce best practices in interdisciplinary communication. It can also support an Agile methodology, with its short development, test, and evaluation sprints.
Today, developers, architects, and testers often no longer sit next to each other and talk to one another about what works and what doesn’t. Interdisciplinary system-level activity and communication are key, and they must be automated and institutionalized at a high level.
Back to Mars: Leveraging defense industry lessons
Interdisciplinary communications and methodologies to ensure reliability have always been a focus of the defense industry; this discipline goes back a long way in the industry and has worked well, getting us to the moon, Mars, and more.
The key issue that separated semiconductor industry practices from proven defense industry development best practices had to do with accelerating time to market (TTM). Companies wanted to slash overhead, move fast, get products out quickly, and capitalize on innovation. In the process, we lost some of that methodology and forgot the reasons behind what the defense industry had been doing for years. The fact that we have landed on the moon and Mars, and have put systems into orbit that still function years later, is not down to luck; it’s due to careful engineering work that brought a lot of people together to focus on systems that last and are reliable.
Today many companies are caught up in TTM and innovation, but perhaps we need to come back to center and consider reliability, safety, and quality, focusing much more on systems engineering, interdisciplinary approaches, and knowledge. In this, the military and defense markets have always led the way. It’s also true that automotive companies have often stayed closer to the defense model because of the importance of reliability and the consequences of failure.
In my generation, graduating and spending our first years in defense working for a Hughes, Boeing, Douglas, Raytheon, or Lockheed was common. You would work three or four years for one of those companies and learn a lot about how projects were run and what it means to be a real-world engineer. Then you would go off to do something new. Many coming out of school today haven’t gotten that exposure; they may even go straight to a startup. But both large and small companies could be doing better with a systems methodology focus.
We do see that some universities recognize the need for a systems and team approach, which is a way forward. For example, the Imperas University Program encompasses 34 universities, and more than 7,000 students and academics from more than 1,000 universities subscribe to the Open Virtual Platforms website for freely available models, virtual platforms, and simulators. Another example of a university using the systems approach is the SystemX Alliance at Stanford.
Trends: safety, security, and extra-functional features
As we move forward, there is increased emphasis on what are called extra-functional features, such as power, timing, and security: important, even critical, issues. Certainly we’ve seen individual discussions of these features, but to some degree it’s like discussing the IoT: Yes, there’s an IoT, but what exactly is it? Can it be defined? How do people build businesses and profit with it? For security, everyone knows more is needed, but what is it exactly? For power, when do you start looking at it? At the very beginning? How can you actually do realistic power analysis? Given the vast range of system scenarios and operational usage, this is impossible on a spreadsheet. Even the best power analysis tools at the gate level cannot run all the system scenarios.
To this end, new methodologies and tools must be adopted to address these issues, the ones that occur where the software meets the hardware: the hardware-dependent software layer. One initiative, with the prpl Foundation, is benchmarking hypervisors. Hypervisors are useful in security and safety, but they add overhead. So, on the timing side, do they add latency, which would be unacceptable in a real-time device? Are they adding power consumption? How can users ensure a secure boot?
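As a rough illustration of the timing question, the sketch below compares interrupt-latency samples taken with and without a hypervisor and checks the worst case against a real-time deadline. The function names, the sample values, and the 18-microsecond deadline are all made up for illustration; they are not results from the prpl Foundation work or from any real device.

```python
def added_latency(baseline_us, virtualized_us):
    """Return (mean overhead, worst-case overhead) in microseconds."""
    mean_base = sum(baseline_us) / len(baseline_us)
    mean_virt = sum(virtualized_us) / len(virtualized_us)
    # Real-time systems care most about the worst case, not the average.
    worst_over = max(virtualized_us) - max(baseline_us)
    return mean_virt - mean_base, worst_over


def meets_deadline(samples_us, deadline_us):
    """A hard real-time budget is met only if every sample is within it."""
    return max(samples_us) <= deadline_us


if __name__ == "__main__":
    # Hypothetical measurements, in microseconds.
    baseline = [10, 12, 11, 13]
    virtualized = [14, 15, 19, 16]
    deadline = 18

    mean_over, worst_over = added_latency(baseline, virtualized)
    print("mean overhead (us):", mean_over)
    print("worst-case overhead (us):", worst_over)
    print("bare metal meets deadline:", meets_deadline(baseline, deadline))
    print("virtualized meets deadline:", meets_deadline(virtualized, deadline))
```

With these invented numbers the average overhead looks modest, yet a single slow sample breaks the deadline, which is why a benchmark of this kind has to report the worst case across many system scenarios and not just the mean.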
There will be an interesting evolution in embedded systems development over the next three to five years as we start seeing these things come together out of necessity.
In summary, let’s truly consider revitalizing the military/defense/aerospace focus on quality and reliability that got us to Mars (Figure 2), across more arenas such as automotive, IoT, and other rapidly evolving markets, with all their TTM opportunities. Let’s continue to lead in defense by adopting the latest methodologies, such as virtual platforms, for interdisciplinary communications that contribute even more to systems development, reliability, and security.
Imperas Software www.imperas.com