U.S. Army's AI facial recognition works in the dark
A group of U.S. Army Research Laboratory (ARL) researchers has developed an artificial intelligence (AI) and machine-learning technique capable of performing automatic facial recognition from a thermal image of a person’s face captured in low-light conditions or at night.
Soldiers conducting covert operations at night may benefit from this new form of facial-recognition technology in the not-too-distant future. The development is also expected to lead to enhanced real-time biometrics and post-mission forensic analysis.
Thermal cameras and sensors – such as forward-looking infrared (FLIR) – are already actively deployed on aerial and ground vehicles, in watchtowers, and at checkpoints for surveillance and security purposes, and are now starting to make their way into body-worn gear.
The ability to perform automatic facial recognition at night via thermal camera is becoming increasingly important for soldiers who need to identify individuals of interest or those on a watch list. ARL researchers – Benjamin S. Riggan, Nathaniel J. Short, and Shuowen “Sean” Hu – developed technology to enhance both automatic and human-matching capabilities.
Their technology “enables matching between thermal face images and existing biometric face databases and watch lists that only contain visible face imagery,” says Riggan, an ARL research scientist. It’s “a way for humans to visually compare visible and thermal facial imagery through thermal-to-visible face synthesis.”
Why thermal cameras? Low-light and night conditions, Riggan explains, provide insufficient light for conventional cameras to capture facial imagery for recognition without active illumination such as a flash or spotlight, which tends to give away the position of such surveillance cameras. Thermal cameras, by contrast, capture the heat signature that emanates from living skin tissue, making them ideal for low-light or dark conditions.
The main challenge of using thermal cameras for facial imagery “is that the captured thermal image must be matched against a watch list or gallery that only contains conventional visible imagery from known persons of interest,” Riggan says. “Therefore, the problem becomes what is referred to as ‘cross-spectrum’ or ‘heterogeneous face recognition.’ In this case, facial probe imagery acquired in one modality is matched against a gallery database acquired using a different imaging modality.”
The group’s approach leverages advanced domain adaptation techniques based on deep neural networks. It’s composed of two key parts: a nonlinear regression model that maps a given thermal image into a corresponding visible latent representation, and an optimization problem that projects the latent projection back into the image space.
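The two-stage idea can be illustrated with a minimal toy sketch – an assumed simplification, not ARL’s published model. A learned regression maps the thermal input to a “visible” latent code, and a separate optimization then recovers an image whose features match that code. Real systems use deep convolutional features; here the feature extractor `Phi` is a fixed random linear map so the sketch stays self-contained.

```python
import numpy as np

rng = np.random.default_rng(0)
D_IMG, D_LAT = 64, 16          # flattened toy "image" size, latent size

# Stand-in feature extractor (a deep network in practice).
Phi = rng.standard_normal((D_LAT, D_IMG)) / np.sqrt(D_IMG)

def regress_to_visible_latent(thermal_vec, W):
    """Stage 1: nonlinear regression thermal -> visible latent (one tanh layer)."""
    return np.tanh(W @ thermal_vec)

def synthesize_image(latent, steps=2000, lr=0.1):
    """Stage 2: gradient descent on x to minimize ||Phi @ x - latent||^2,
    i.e., project the latent representation back into image space."""
    x = np.zeros(D_IMG)
    for _ in range(steps):
        grad = 2 * Phi.T @ (Phi @ x - latent)
        x -= lr * grad
    return x

W = rng.standard_normal((D_LAT, D_IMG)) * 0.1   # stand-in for trained weights
thermal = rng.standard_normal(D_IMG)            # stand-in thermal input
z = regress_to_visible_latent(thermal, W)
x_hat = synthesize_image(z)
print(np.linalg.norm(Phi @ x_hat - z))          # residual shrinks toward zero
```

The point of the sketch is the division of labor: the regression handles the cross-spectrum mapping, while the optimization handles image synthesis.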
By combining global information, such as features from across the entire face, with local information – features from the “discriminative fiducial regions” of the eyes, nose, and mouth – the researchers demonstrated that their technology enhances the discriminability of the synthesized imagery. They also showed how the thermal-to-visible mapped representations from both global and local regions within the thermal face signature can be used to synthesize a refined visible image of a face.
The optimization problem for synthesizing an image attempts to jointly preserve the shape of the entire face and appearance of the local fiducial details. By using the synthesized thermal-to-visible imagery and existing visible gallery imagery, the researchers performed face-verification experiments using an open-source deep neural network architecture for face recognition. The architecture used is designed for visible-based facial recognition.
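A joint objective of this kind can be sketched as a weighted sum of a global term, which keeps the overall face shape, and per-region local terms on the fiducial details. The linear feature maps, region layout, and weight `lam` below are illustrative assumptions, not the paper’s actual formulation.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 36                                    # flattened toy "face"
A_global = rng.standard_normal((12, N)) / np.sqrt(N)
regions = {"eyes": slice(0, 12), "nose": slice(12, 24), "mouth": slice(24, 36)}
A_local = {k: rng.standard_normal((6, 12)) / np.sqrt(12) for k in regions}

def joint_loss(x, z_global, z_local, lam=1.0):
    """Global shape term plus weighted local fiducial-detail terms."""
    loss = np.sum((A_global @ x - z_global) ** 2)
    for name, sl in regions.items():
        loss += lam * np.sum((A_local[name] @ x[sl] - z_local[name]) ** 2)
    return loss

# Toy targets (in practice these come from the thermal-to-visible regression).
z_g = rng.standard_normal(12)
z_l = {k: rng.standard_normal(6) for k in regions}
x = rng.standard_normal(N)
print(joint_loss(x, z_g, z_l))
```

Minimizing such a loss over `x` trades off whole-face consistency against fidelity in the eye, nose, and mouth regions, which is the balance the researchers describe.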
Surprisingly, according to the ARL group, their approach achieved even better verification performance than a generative adversarial network (GAN)-based approach, which had previously shown photorealistic properties. GANs are artificial intelligence algorithms used in “unsupervised machine learning,” implemented as a system of two neural networks that “contest with each other” within a zero-sum game framework, say the researchers.
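For context, the standard GAN value function alluded to here can be written as V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))], which the discriminator D maximizes and the generator G minimizes. The toy numerical sketch below evaluates that value for fixed, hand-picked D and G on Gaussian data; it is purely illustrative and unrelated to the researchers’ model.

```python
import numpy as np

rng = np.random.default_rng(2)

def D(x):
    """Toy discriminator: sigmoid of the raw score."""
    return 1.0 / (1.0 + np.exp(-x))

def G(z):
    """Toy generator: shifts input noise toward the data distribution."""
    return z + 0.5

real = rng.normal(loc=2.0, scale=1.0, size=10_000)   # "real" samples
noise = rng.normal(loc=0.0, scale=1.0, size=10_000)  # generator input

# V(D, G) = E[log D(real)] + E[log(1 - D(G(noise)))]
value = np.mean(np.log(D(real))) + np.mean(np.log(1 - D(G(noise))))
print(value)
```

Because the objective rewards samples that merely fool D, nothing in it explicitly preserves the identity of the input face, which is the shortcoming Riggan points to below.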
Riggan attributes this result to the fact that the game-theoretic objective for GANs immediately seeks to generate imagery that is sufficiently similar in dynamic range and photo-like appearance to the training imagery – while sometimes neglecting to preserve identifying characteristics. In contrast, ARL’s approach preserves identity information to enhance discriminability, resulting, for example, in increased recognition accuracy for both automatic facial-recognition algorithms and human adjudication.
As the ARL group recently demonstrated during a proof of concept at IEEE’s Winter Conference on Applications of Computer Vision – which included the use of a FLIR Boson 320 thermal camera and a laptop running the algorithm – a captured thermal image of a person can be used to produce a synthesized visible image in situ.
Riggan and colleagues will continue exploring and expanding this research under the sponsorship of the Defense Forensics and Biometrics Agency to develop a robust nighttime facial-recognition capability for soldiers.