Autonomous ISR software reduces operator workload - An interview with executives from Kitware
Open source software company Kitware is building “computer vision” into its military video surveillance programs, bringing only events of interest to the attention of human video analysts.
Editor’s Note: European citizens are familiar with ubiquitous cameras monitoring high-traffic public places and roadways. In most cases, video feeds back to operations centers where operators look for specific events. On today’s digital battlefield, ISR sensor data and video from UAS platforms, satellites, aircraft, and ground forces feed into command centers and TOCs where military analysts perform similar functions to their civilian counterparts, but with deadly seriousness. One missed event could mean death by IED for a warfighter. Yet how can a human watch and correctly identify threats within hundreds of hours of reconnaissance data? Open source software company Kitware, with some help from DARPA, has the answer. I sat down with executives from Kitware to learn about leading-edge autonomous software. Edited excerpts follow. – Chris A. Ciufo, Editor
Let’s start by talking about Kitware.
SCHROEDER: We’ve been around since 1998. We’re an open source company, and we develop technologies in the scientific computing arena. In most cases, we give that technology away in the form of open source code available from our website.
How do you license it, and how do you make any money?
SCHROEDER: It’s a very permissive BSD licensing agreement, which allows people to use it without necessarily giving anything back to us. We make money as a service company: By giving this stuff away, people start using it and they need help using it, integrating it into their systems.
Does your company maintain any intellectual property?
SCHROEDER: It’s minimal. The majority of our “intellectual property” is customer relationships and customer contact lists. We have a small number of patents and some patents pending. There are proprietary software systems we’ve developed for our customers, but we can’t necessarily sell those. We have a couple proprietary products, too. The value in our company is the employees and the knowledge that travels the open source world.
Which software technology areas do you focus on?
SCHROEDER: We have five key areas. We’re a software company, so we focus a lot on software quality and software process. Around that software core, we have four technology areas. One is computer vision, which Anthony [Hoogs] can talk more about.
HOOGS: Sure. Most of our funding comes from DARPA, which is mostly interested in aerial video. So primarily what we do is have a computer examine the video and determine whether the video contains anything of interest to a video analyst. The overall goal is really to make this enormous amount of video being collected by the military indexable and accessible so that you can know what’s in it without having humans look at it – until there’s an event that requires operator intervention.
Let’s set the stage. Are these super high-res cameras? Are there ever image quality issues?
HOOGS: The military uses consumer cameras that are high resolution and create lots of data. Sometimes image quality issues can arise, but they are not because of poor cameras; rather, they are because the scene is miles away from the UAV, such that atmospheric disturbances and imaging conditions have an effect. Additionally, if the camera’s on a moving sensor or platform, … you have issues of parallax: 3D objects in the scene, like a building rising above the ground plane, show apparent motion.
So the intention is to make this poor-quality video into high-quality video while discerning people and events.
HOOGS: Right. [A computer-flagged event] might be someone implanting IEDs; the computer flags things like that. For example, there might be a Predator sweeping along a road because the analyst is following a vehicle of interest. The camera might sweep right over somebody digging a hole off the side of the road. If the analyst is watching that vehicle, he may not even notice that person digging a hole. However, the computer doesn’t get distracted by watching the vehicle versus something on the side of the road.
Is a second or two of video enough to save that portion of the image?
HOOGS: The amount of time required to recognize a certain type of event or action depends on the event. It does vary quite a bit, but typically it’s just a few seconds. We don’t have any kind of control feedback in the systems now, but that’s certainly conceivable.
How does the software report the flagged events?
HOOGS: The core of the system is the ability to catalog everything happening in the video. What you do with that information can vary quite a bit. If you’re getting a video stream in real time, then the analyst can set up a standing alert that says, “Tell me whenever you see anything like this.” Maybe it’s digging, maybe it’s somebody loading a vehicle. The analyst can program their own set of alerts that can be at a given location; they can be contextual, in this type of data or at this time of day. Then whenever any of the computed information matches the alert criteria, the analyst is notified.
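The standing-alert idea Hoogs describes can be illustrated with a small sketch. This is not VIRAT code: the `Event` and `StandingAlert` classes, their field names, and the rectangular-region and hour-window conventions are all assumptions made for illustration. It only shows the general pattern of matching catalogued events against analyst-defined criteria with optional location and time-of-day context.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Event:
    """One catalogued detection (all field names are hypothetical)."""
    activity: str   # e.g. "digging", "loading_vehicle"
    x: float        # ground position, arbitrary units
    y: float
    hour: int       # local hour of day, 0-23


@dataclass(frozen=True)
class StandingAlert:
    """An analyst-defined rule: an activity plus optional context."""
    activity: str
    region: Optional[Tuple[float, float, float, float]] = None  # (xmin, ymin, xmax, ymax)
    hours: Optional[Tuple[int, int]] = None  # [start, end); may wrap past midnight

    def matches(self, event: Event) -> bool:
        if event.activity != self.activity:
            return False
        if self.region is not None:
            xmin, ymin, xmax, ymax = self.region
            if not (xmin <= event.x <= xmax and ymin <= event.y <= ymax):
                return False
        if self.hours is not None:
            start, end = self.hours
            if start <= end:
                in_window = start <= event.hour < end
            else:  # window wraps midnight, e.g. 20:00 to 05:00
                in_window = event.hour >= start or event.hour < end
            if not in_window:
                return False
        return True


def notifications(alerts: List[StandingAlert], events: List[Event]):
    """Return every (alert, event) pair where the event meets the alert criteria."""
    return [(a, e) for e in events for a in alerts if a.matches(e)]


# "Tell me whenever you see digging in this area at night."
alerts = [StandingAlert("digging", region=(0, 0, 100, 100), hours=(20, 5))]
events = [
    Event("digging", 40, 60, 23),          # in region, at night
    Event("digging", 40, 60, 12),          # same place, midday
    Event("digging", 400, 60, 23),         # outside the region
    Event("loading_vehicle", 40, 60, 23),  # different activity
]
hits = notifications(alerts, events)
```

Only the first event produces a notification here; the others fail the time, location, or activity criterion.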
OK, back to you, Will. Now that we’ve delved into Kitware’s 1) software infrastructure and 2) computer vision, what are the other three Kitware core areas?
SCHROEDER: One of the other three areas is 3) scientific visualization. When I say visualization, most people think about 2D graphics. But scientific visualization involves large-scale 3D, 4D [time] graphics, and visualization where we take the output of an MRI scan, or an engineering simulation, or some large nuclear physics or oil and gas simulation or analysis.
And then our last two core areas are 4) medical imaging and 5) data management (Figure 1). Relative to scientific data management, we found that a lot of our customers have large data sets such as computer vision video streams or large biomedical MRI, CT, or confocal microscopy, and so on. And they’re not doing a very good job of managing these very large and complicated things. Traditional database methods don’t really manage scientific data very well, so we help them with data management, viewing, processing, and so on.
Scientific data management doesn’t necessarily have to be imagery data though, correct?
SCHROEDER: That’s right. When we say scientific visualization, that actually is producing video or images, but the input to that process is large data sets that might come from a supercomputing simulation. So the data actually can be a range of things, from input decks for simulation codes all the way to the output, which might be graphics. The sensors are becoming more accurate and resolution is increasing very quickly. So now the data sizes are no longer 640 x 480 video streams. We’re talking HDTV multiplied by 100. And in microscopy, confocal or electron, resolution is down to 5 nm in electron microscopy, which means you end up with images that are something like 100,000 x 100,000 pixels, and in depth there might be 40,000 of these 100,000 x 100,000 images.
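To put those figures in perspective, here is a back-of-the-envelope calculation. It is only a sketch under stated assumptions: "HDTV" is taken to mean 1920 x 1080, pixels are assumed to be stored uncompressed at one byte each, and the helper names are made up for this example.

```python
BYTES_PER_PIXEL = 1  # assumption: 8-bit grayscale, uncompressed


def pixels(width: int, height: int, count: int = 1) -> int:
    """Total pixel count for `count` images of the given size."""
    return width * height * count


def terabytes(n_pixels: int) -> float:
    """Storage in decimal terabytes under the one-byte-per-pixel assumption."""
    return n_pixels * BYTES_PER_PIXEL / 1e12


vga_frame = pixels(640, 480)                    # 307,200 pixels
hd_times_100 = pixels(1920, 1080, 100)          # 207,360,000 pixels
one_slice = pixels(100_000, 100_000)            # 10,000,000,000 pixels
full_stack = pixels(100_000, 100_000, 40_000)   # 400,000,000,000,000 pixels

print(f"one slice:  {terabytes(one_slice) * 1000:.0f} GB")
print(f"full stack: {terabytes(full_stack):.0f} TB")
```

Under these assumptions a single microscopy slice is about 10 GB and a 40,000-slice stack is about 400 TB, which is the scale at which conventional data handling starts to break down.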
Do your algorithms and software work with sensor images such as radar?
HOOGS: Often it’s less rich in information, which is part of the issue. So some of the algorithms do work because we develop them to mostly work off trajectory-level information. As long as you have a track you can reason about that track. Is it starting or stopping? Are two vehicles coming together, and so on. And that is agnostic to how the tracks were created. So if you want to create tracks from radar, GMTI, or even radar imagery, then, in principle, it would work.
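Trajectory-level reasoning of the kind Hoogs mentions can be sketched without reference to how the tracks were produced. The example below is an illustration only, not Kitware's algorithms: the `(time, x, y)` track format, the function names, and the speed thresholds are all assumptions. It checks whether a track is coming to a stop and whether two tracks are closing on each other.

```python
import math
from typing import List, Tuple

# A track is a time-ordered list of (time_s, x_m, y_m) points. The sensor that
# produced it (video, GMTI, radar imagery) is irrelevant to the reasoning below.
Track = List[Tuple[float, float, float]]


def speeds(track: Track) -> List[float]:
    """Speed in m/s over each consecutive pair of track points."""
    return [
        math.hypot(x1 - x0, y1 - y0) / (t1 - t0)
        for (t0, x0, y0), (t1, x1, y1) in zip(track, track[1:])
    ]


def is_stopping(track: Track, moving: float = 1.0, stopped: float = 0.2) -> bool:
    """True if the track starts out moving and ends (nearly) stationary.

    The thresholds are illustrative assumptions, not tuned values.
    """
    s = speeds(track)
    return len(s) >= 2 and s[0] > moving and s[-1] < stopped


def are_converging(a: Track, b: Track) -> bool:
    """True if the separation between two tracks shrinks at every step.

    Assumes both tracks are sampled at the same timestamps.
    """
    gaps = [math.hypot(xa - xb, ya - yb) for (_, xa, ya), (_, xb, yb) in zip(a, b)]
    return len(gaps) >= 2 and all(later < earlier for earlier, later in zip(gaps, gaps[1:]))


vehicle = [(0, 0.0, 0.0), (1, 5.0, 0.0), (2, 7.0, 0.0), (3, 7.1, 0.0)]
car_a = [(0, 0.0, 0.0), (1, 1.0, 0.0), (2, 2.0, 0.0)]
car_b = [(0, 10.0, 0.0), (1, 8.0, 0.0), (2, 6.0, 0.0)]

stopping = is_stopping(vehicle)            # slows from 5 m/s to about 0.1 m/s
converging = are_converging(car_a, car_b)  # gap shrinks 10 m, 7 m, 4 m
```

Because the functions see only positions and times, the same checks apply to any tracker's output, which is the source-agnostic property described above.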
What about your software itself? Tell me about it.
HOOGS: It’s mostly C++, and our core framework is C++. But we work with a lot of universities. They work in MATLAB and sometimes Java, but the ability to integrate across languages is increasingly mature. So on our VIRAT program, which is a technology we’re talking about here, we define an API that allows MATLAB modules to be dropped into the C++ system. And most of it runs on Linux and Windows, too.
VIRAT is source code. Are the MATLAB models also part of the source code?
HOOGS: No, that’s just the research software version. A deployed version wouldn’t have MATLAB.
VIRAT is a series of algorithmic modules. It’s generally pipeline software that’s typically at least six or eight processing stages, starting with video pixels and going through a bunch of different things, ending up in a database. So we take all this content in the video that we’re computing and we put it in a database and index it. We had to develop special customized indexing software because these descriptors, and the approximate matching functions needed to look them up, are not really suitable for typical relational database systems.
How would you characterize your database?
HOOGS: It’s really a set of databases. We have different algorithms to describe the content in the video, and each has a set of what you might think of as tables in the database. (It doesn’t actually map that way, but that’s a good approximation.) A typical relational database has these structured fields. A given table might have 5 fields or 10 fields. And a field you can think of as a dimension. A field might contain an integer, or a string or something. And we have some descriptors, which are really mathematical ways of representing a piece of the videos, like “keep track of an object and how its appearance changes over time.” Some of these descriptors end up needing thousands of fields, thousands of floating-point numbers, to represent them on every frame of the video, or for every track.
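To see why such descriptors sit awkwardly in relational tables, consider a toy version of approximate lookup. This sketch is a stand-in, not the indexing software described in the interview: the descriptor length, the track IDs, the random data, and the use of brute-force cosine similarity are all assumptions chosen for illustration. The point is that a query returns the closest descriptors by similarity, not rows whose fields match exactly.

```python
import math
import random
from typing import Dict, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two descriptor vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest_tracks(query: List[float], index: Dict[str, List[float]], k: int = 3) -> List[str]:
    """Brute-force approximate match: the k track IDs most similar to the query."""
    scored = sorted(
        ((cosine_similarity(query, descriptor), track_id)
         for track_id, descriptor in index.items()),
        reverse=True,
    )
    return [track_id for _, track_id in scored[:k]]


DIMENSIONS = 2000  # assumption: "thousands of floating-point numbers" per track
rng = random.Random(0)

# Hypothetical index: one synthetic descriptor per track.
index = {
    f"track-{i}": [rng.gauss(0.0, 1.0) for _ in range(DIMENSIONS)]
    for i in range(5)
}

# A noisy re-observation of track-3: no field matches exactly, yet it should
# still be retrieved as the closest descriptor.
query = [value + rng.gauss(0.0, 0.05) for value in index["track-3"]]
best = nearest_tracks(query, index, k=1)
```

An exact-match `WHERE` clause over 2,000 float columns would find nothing for this query, while a similarity search recovers `track-3`. A linear scan like this does not scale to large archives, which is presumably why customized indexing was needed.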
Is VIRAT open source?
HOOGS: At this point, none of the code for the VIRAT system is open source. Many subcontractors and universities have contributed to this, but none of the companies are open source companies like Kitware. DARPA might not necessarily want it to be open source for security reasons. However, some components are good candidates to become open source, and we hope to get approval for them at some point.
Which of your company’s products are open source, then?
HOOGS: Well, in general, Kitware is built on an open source foundation. There’s the Visualization Toolkit, the Insight Toolkit, these big C++ systems for visualization, medical image analysis, and scientific data visualization and management. Some of those are used on our computer vision programs, and through those we are slurping in a lot of open source. We’re also developing elements for those toolkits, which will go back into the open source parts as we get approval. But the core vision stuff right now is not out in those open source toolkits.
Tell me about the processing requirements to execute the code you’re talking about.
HOOGS: It’s all on desktop PCs. Typically we like x86-based quad cores, or at least two or three cores. Our system can use as many cores as are thrown in its pipeline.
What are Kitware’s future technologies?
SCHROEDER: One thing we’re starting to put more work into is computational chemistry, which is becoming extremely important in the DoD.
Another area we’ve been growing is informatics, also known as information visualization, which is an extremely important field for homeland security issues. And so instead of looking at data that’s spatial-temporal – like a CT scan or a particle physics simulation – you’re looking at data that’s not related to space and time. And we’re working on automatic analysis of wide-area video, mostly on a DARPA program called PerSEAS. On PerSEAS we’re developing algorithms to automatically detect threats and insurgent activities in city-wide video.
Kitware 518-371-3971 www.kitware.com