DOD SBIR 24.2 Annual

Active: No
Status: Open
Release Date: April 17th, 2024
Open Date: May 15th, 2024
Due Date(s): June 12th, 2024
Close Date: June 12th, 2024
Topic No.: OSD242-D001

Topic

KLV-Enhanced Orthomosaic Image Generation from Full Motion Video (FMV)

Agency

Department of Defense

Program

Type: SBIR
Phase: BOTH
Year: 2024

Summary

The Department of Defense (DOD) is seeking proposals for an algorithm that builds accurate georegistered orthomosaic images from full motion video (FMV) by combining the video stream with the Key-Length-Value (KLV) metadata stream. FMV is a critical component of intelligence, surveillance, and reconnaissance (ISR) operations, and the objective is to improve the accuracy of the orthomosaics generated from it. The algorithm must handle typical FMV collection use cases, including changing zoom levels, changing imaging modalities, rapid slewing, on-screen display obscuring pixels, and occasional data corruption. It should fuse information from the KLV metadata, on-screen metadata, and video sources to construct the most accurate georegistered orthomosaic. When KLV metadata is absent but on-screen metadata is available, Optical Character Recognition (OCR) is recommended to extract metadata values directly from the images. The resulting mosaic should provide a distinct output layer for each modality in the input FMV. To orthorectify over nonplanar terrain, offerors may need to derive or update a higher-resolution Digital Elevation Model (DEM) directly from the video stream. This is a Direct-to-Phase II topic: a Phase I award is not required, but the proposal must document a Phase I-like feasibility effort and demonstrate existing capabilities in KLV metadata handling, video mosaic construction, OCR applied to imagery, and orthorectification. Phase II implements a prototype system. The technology applies across government and commercial sectors, including national security, targeting, intelligence, urban planning, environmental monitoring, and search and rescue. The proposal submission deadline is June 12, 2024.

Description

OUSD (R&E) CRITICAL TECHNOLOGY AREA(S): Advanced Computing and Software

 

OBJECTIVE: Develop an algorithm to build accurate georegistered orthomosaic images from full motion video (FMV) by leveraging a combination of the video stream and Key-Length-Value (KLV) metadata stream.  The method must work under typical FMV collection use cases including changing zoom levels, changing imaging modalities, rapid slewing, on-screen display obscuring pixels, and occasional data corruption.

 

DESCRIPTION: FMV, as a critical component of intelligence, surveillance, and reconnaissance (ISR) operations, generates video feeds with embedded KLV metadata containing essential geospatial and temporal information.  The KLV metadata typically conforms to the MISB 0601 standard [1]. This SBIR topic focuses on harnessing this metadata to improve the accuracy of orthomosaic images [2].  Recognizing that neither the video stream nor the metadata stream alone suffices for accurate orthomosaic computations, this SBIR effort specifically addresses the challenges associated with the fusion of video and KLV metadata to build orthomosaics under typical operating conditions.
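
For orientation, the Key-Length-Value layout can be sketched as follows. This is a minimal illustration, not a MISB-conformant decoder: it assumes a fixed-size leading key, BER-encoded lengths, and a local-set payload with single-byte tags. The function names are hypothetical, and tag meanings and value encodings must be taken from the standard itself [1]. Malformed input raises ValueError so that a caller can skip a corrupted packet and continue.

```python
from typing import Dict, Tuple


def read_ber_length(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a BER length field at pos; return (length, position after it)."""
    if pos >= len(buf):
        raise ValueError("missing length field")
    first = buf[pos]
    if first < 0x80:
        # Short form: the byte itself is the length.
        return first, pos + 1
    # Long form: the low 7 bits give the number of length bytes that follow.
    count = first & 0x7F
    if count == 0 or pos + 1 + count > len(buf):
        raise ValueError("unsupported or truncated BER length")
    return int.from_bytes(buf[pos + 1:pos + 1 + count], "big"), pos + 1 + count


def parse_local_set(payload: bytes) -> Dict[int, bytes]:
    """Split a local-set payload into {tag: raw value bytes}.

    Assumption: every tag fits in one byte (value below 128).
    """
    items: Dict[int, bytes] = {}
    pos = 0
    while pos < len(payload):
        tag = payload[pos]
        if tag >= 0x80:
            raise ValueError("multi-byte tags are not handled in this sketch")
        length, pos = read_ber_length(payload, pos + 1)
        if pos + length > len(payload):
            raise ValueError("truncated value")
        items[tag] = payload[pos:pos + length]
        pos += length
    return items


def parse_klv_packet(packet: bytes, key_len: int = 16) -> Tuple[bytes, Dict[int, bytes]]:
    """Return (key, items) for one packet with a key_len-byte leading key."""
    if len(packet) < key_len:
        raise ValueError("packet shorter than key")
    length, pos = read_ber_length(packet, key_len)
    if pos + length > len(packet):
        raise ValueError("truncated packet")
    return packet[:key_len], parse_local_set(packet[pos:pos + length])
```

The values are returned as raw bytes on purpose: converting them to timestamps, angles, or coordinates depends on the per-tag encodings defined in the standard, which this sketch does not reproduce.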

 

The KLV metadata stream offers crucial platform and camera pose information that guides construction of the georegistered mosaic. However, its accuracy falls short of what pixel-level alignment requires. Image-based mosaicking achieves subpixel alignment accuracy, but it often lacks a holistic geospatial context and proves inadequate in scenarios involving rapid camera movements, zooming, or changes in image modality (e.g., electro-optical (EO) to infrared (IR)). The primary objective of this SBIR topic is to effectively fuse information from the KLV metadata, on-screen metadata, and video sources to construct the most accurate georegistered orthomosaic.
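
One simple way to picture this fusion is a gate: use the image-derived alignment when it agrees with the metadata-derived prediction, and fall back to the metadata when image registration fails or drifts (for example during a rapid slew or a modality switch). The sketch below is illustrative only and is not a method prescribed by this topic. It assumes both estimates are 3x3 homographies mapping frame pixels to mosaic pixels; the function names and the 50-pixel gate are hypothetical.

```python
from math import hypot
from typing import Optional, Sequence, Tuple

Matrix = Sequence[Sequence[float]]
Point = Tuple[float, float]


def apply_homography(h: Matrix, pt: Point) -> Point:
    """Map a frame pixel into mosaic coordinates with a 3x3 homography."""
    x, y = pt
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


def max_corner_disagreement(h_a: Matrix, h_b: Matrix,
                            width: float, height: float) -> float:
    """Largest distance between the two mappings of the four frame corners."""
    corners = [(0.0, 0.0), (width, 0.0), (width, height), (0.0, height)]
    worst = 0.0
    for corner in corners:
        ax, ay = apply_homography(h_a, corner)
        bx, by = apply_homography(h_b, corner)
        worst = max(worst, hypot(ax - bx, ay - by))
    return worst


def choose_transform(h_meta: Matrix, h_image: Optional[Matrix],
                     width: float, height: float,
                     gate_px: float = 50.0) -> Tuple[Matrix, str]:
    """Pick the transform for one frame and report which source was used.

    h_meta comes from metadata-derived pose; h_image comes from image
    registration and is None when registration failed outright.
    """
    if h_image is None:
        return h_meta, "metadata"
    if max_corner_disagreement(h_meta, h_image, width, height) > gate_px:
        # The image estimate contradicts the geospatial prior: distrust it.
        return h_meta, "metadata"
    return h_image, "image"
```

A full system would more likely weight the two sources by their uncertainties inside a joint optimization; the hard gate here only shows how the metadata supplies global context while image registration supplies local precision.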

 

In addition to KLV, metadata is frequently visually presented as on-screen text for FMV operators. In situations where KLV metadata is absent but on-screen metadata is available, the use of Optical Character Recognition (OCR) algorithms is highly recommended to extract metadata values directly from the images.  Furthermore, algorithms developed under this SBIR initiative should ensure that on-screen text does not corrupt the appearance of the constructed mosaic [3].
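
To illustrate how on-screen text can be kept out of the mosaic, the sketch below averages overlapping frames while skipping pixels flagged as overlay. It assumes the frames have already been warped into mosaic coordinates and that a boolean overlay mask is available for each frame (for example from a heads-up display detector such as [3]); the function names and the list-of-lists image representation are hypothetical simplifications.

```python
from typing import List

Image = List[List[float]]
Mask = List[List[bool]]


def new_accumulator(rows: int, cols: int) -> Image:
    """Create a zero-filled grid used for running sums or counts."""
    return [[0.0] * cols for _ in range(rows)]


def accumulate(total: Image, weight: Image, frame: Image, overlay: Mask) -> None:
    """Add one warped frame to the running sums, skipping overlay pixels."""
    for r, row in enumerate(frame):
        for c, value in enumerate(row):
            if overlay[r][c]:
                continue  # on-screen text or symbology: contributes nothing
            total[r][c] += value
            weight[r][c] += 1.0


def finalize(total: Image, weight: Image, fill: float = 0.0) -> Image:
    """Average the contributions; pixels never observed get the fill value."""
    return [[t / w if w > 0 else fill for t, w in zip(t_row, w_row)]
            for t_row, w_row in zip(total, weight)]
```

Because the overlay usually sits at fixed screen positions while the scene moves underneath it, a ground location hidden in one frame is often visible in another, so the mask rarely leaves permanent holes.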

 

Given that FMV operators often switch between multiple sensing modalities (e.g., EO and IR), the mosaicking algorithm should operate seamlessly across the dominant modalities. The resulting mosaic should feature distinct output layers for each modality within the input FMV, avoiding the blending of images across modalities.
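
The bookkeeping behind per-modality layers can be sketched as below. The example assumes each frame already carries a modality label such as "EO" or "IR" (how that label is obtained is outside this sketch), and the function names are hypothetical. Frames are first split into contiguous same-modality runs, so that image registration is never chained across a modality switch, and then each run is assigned to its modality's layer.

```python
from itertools import groupby
from typing import Dict, List, Sequence, Tuple


def split_into_runs(labels: Sequence[str]) -> List[Tuple[str, List[int]]]:
    """Group frame indices into contiguous runs that share one modality."""
    runs: List[Tuple[str, List[int]]] = []
    for modality, group in groupby(enumerate(labels), key=lambda item: item[1]):
        runs.append((modality, [index for index, _ in group]))
    return runs


def assign_layers(labels: Sequence[str]) -> Dict[str, List[int]]:
    """Map each modality to the frame indices that feed its output layer."""
    layers: Dict[str, List[int]] = {}
    for modality, indices in split_into_runs(labels):
        layers.setdefault(modality, []).extend(indices)
    return layers
```

Run boundaries are also the places where the metadata-derived pose has to bridge the gap, since frame-to-frame image matching across EO and IR is unreliable.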

 

Using the constructed mosaic in mensuration workflows requires the mosaic to be orthorectified. For accurate orthorectification in nonplanar terrain, a Digital Elevation Model (DEM) is needed [4]. Recognizing that accurate and up-to-date DEMs may not exist at the resolution of the FMV, offerors may need to derive or update a higher resolution DEM directly from the video stream to ensure improved accuracy in orthorectification processes. This SBIR effort aims to address these challenges comprehensively, advancing the capabilities of orthomosaic generation in the context of dynamic FMV scenarios.
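
The role of the DEM can be illustrated with a ray-terrain intersection: each pixel's viewing ray is followed from the camera until it meets the terrain surface, and the pixel is placed at that ground location. The sketch below is a simplified illustration, assuming a local East-North-Up frame in meters and a hypothetical `elevation(x, y)` callable standing in for a DEM lookup; the fixed step size and maximum range are arbitrary choices.

```python
from math import sqrt
from typing import Callable, Optional, Tuple

Vec3 = Tuple[float, float, float]


def intersect_ray_with_dem(origin: Vec3, direction: Vec3,
                           elevation: Callable[[float, float], float],
                           step: float = 1.0,
                           max_range: float = 20000.0) -> Optional[Vec3]:
    """March along a viewing ray and return where it first meets the terrain.

    Returns None if the camera starts at or below the terrain or if no
    crossing is found within max_range.
    """
    ox, oy, oz = origin
    dx, dy, dz = direction
    norm = sqrt(dx * dx + dy * dy + dz * dz)
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    dx, dy, dz = dx / norm, dy / norm, dz / norm

    prev_t = 0.0
    prev_gap = oz - elevation(ox, oy)   # height of the ray above the terrain
    if prev_gap <= 0.0:
        return None
    t = step
    while t <= max_range:
        gap = (oz + t * dz) - elevation(ox + t * dx, oy + t * dy)
        if gap <= 0.0:
            # The crossing lies between prev_t and t: interpolate linearly.
            frac = prev_gap / (prev_gap - gap)
            t_hit = prev_t + frac * (t - prev_t)
            return (ox + t_hit * dx, oy + t_hit * dy, oz + t_hit * dz)
        prev_t, prev_gap = t, gap
        t += step
    return None
```

For a camera 1100 m above the datum looking 45 degrees below the horizon, terrain at 100 m elevation moves the ground hit point 100 m closer than a zero-elevation plane would predict, which is the kind of relief displacement that an inaccurate or coarse DEM leaves uncorrected.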

 

PHASE I: This topic is intended for technology proven ready to move directly into a Phase II. Therefore, a Phase I award is not required. The offeror is required to provide detail and documentation in the Direct-to-Phase II proposal, which demonstrates accomplishment of a Phase I-like effort, including a feasibility study. This includes a review of the scientific and technical merit and feasibility of proposed ideas. The offeror should be able to demonstrate existing capabilities for working with KLV metadata encoded in FMV and video mosaic construction.  The offeror should also demonstrate experience with OCR applied to imagery and orthorectification.

 

PHASE II: Implement a prototype system for constructing georegistered orthomosaic images from FMV input that combines KLV metadata and video stream image registration. Demonstrate improved accuracy and reliability over using either metadata or image registration alone.  Demonstrate robustness on EO-modality FMV with short IR interruptions (insertions), fast slewing or zooming cameras, FMV with on-screen metadata but no KLV, intermittent data corruption, and complex scene terrain not captured in available DEMs. All prototype development is considered Controlled Unclassified Information (CUI), subject to DFARS Clause 252.204-7012 and DoDI 5200.48. Phase II may require classified (SECRET) work to evaluate algorithms on operational data.

 

PHASE III DUAL USE APPLICATIONS: Technology enabling accurate orthomosaic generation from aerial video would be widely applicable across the government and commercial sectors. The technology may be adapted to other airborne or satellite sensors providing both imagery and metadata. Military applications include national security, targeting, and intelligence. Commercially, it will apply to urban planning, environmental monitoring, search and rescue, and all other domains that benefit from orthomosaics from video.

 

REFERENCES:

[1] Motion Imagery Standards Board (MISB). https://nsgreg.nga.mil/misb.jsp
[2] Zhang, Jiguang, et al. "Aerial orthoimage generation for UAV remote sensing." Information Fusion 89 (2023): 91-120.
[3] Dawkins, Matthew, Amitha Perera, and Anthony Hoogs. "Real-time heads-up display detection in video." 2014 11th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS). IEEE, 2014.
[4] Bannari, Abderrazak, et al. "Multi-scale analysis of DEMS derived from unmanned aerial vehicle (UAV) in precision agriculture context." 2021 IEEE International Geoscience and Remote Sensing Symposium IGARSS. IEEE, 2021.

 

KEYWORDS: Orthomosaic Generation, Full Motion Video (FMV), KLV Metadata, Mosaicking, On-Screen Text Removal, OCR, Geospatial Analysis.