DOD SBIR 24.1 BAA

Active: No
Status: Closed
Release Date: November 29th, 2023
Open Date: January 3rd, 2024
Due Date(s): February 21st, 2024
Close Date: February 21st, 2024
Topic No.: AF241-D004

Topic

Explainable Reinforcement Learning (XRL) for Command and Control (C2)

Agency

Department of Defense
N/A

Program

Type: SBIR
Phase: BOTH
Year: 2024

Summary

The Department of Defense (DOD) is seeking proposals for the topic "Explainable Reinforcement Learning (XRL) for Command and Control (C2)" in its SBIR 24.1 BAA solicitation. The objective of this topic is to develop a prototype that enables practical applications of Reinforcement Learning (RL) to be explained for interpretability, trust, performance-explanation trade-off, accountability, safety, and human-AI collaboration. The technology is restricted under export control laws. Proposals should demonstrate feasibility and prior experience in XRL. Phase II of the project will focus on developing explainable AI/ML solutions using reinforcement learning for command and control applications. Phase III will involve transitioning the technology to a commercial or warfighter solution. The offeror will be responsible for seeking funding opportunities for Phase III.

Description

OUSD (R&E) CRITICAL TECHNOLOGY AREA(S): Trusted AI and Autonomy

 

The technology within this topic is restricted under the International Traffic in Arms Regulation (ITAR), 22 CFR Parts 120-130, which controls the export and import of defense-related material and services, including export of sensitive technical data, or the Export Administration Regulation (EAR), 15 CFR Parts 730-774, which controls dual use items. Offerors must disclose any proposed use of foreign nationals (FNs), their country(ies) of origin, the type of visa or work permit possessed, and the statement of work (SOW) tasks intended for accomplishment by the FN(s) in accordance with the Announcement. Offerors are advised foreign nationals proposed to perform on this topic may be restricted due to the technical data under US Export Control Laws.

 

OBJECTIVE: The objective of this topic is to develop an effective (SBIR Phase II) prototype that enables practical application(s) of Reinforcement Learning (RL) to be explained for interpretability (i.e., generating explanations that are intuitive and understandable to humans), trust (i.e., verifying an agent’s behavior), performance-explanation trade-off (i.e., striking a balance between the performance of the RL agent and the quality of the explanations it provides), accountability and safety (i.e., holding RL agents accountable for their actions so that potential risks and errors in an agent’s behavior can be identified and rectified), and, finally, human-AI collaboration (i.e., working together through effective communication). This topic addresses the following operational imperatives:
• Operational Imperatives
  o II - Achieving Operationally Optimized Advanced Battle Management Systems (ABMS) / Air Force Joint All-Domain Command & Control (AF JADC2)
  o V - Defining optimized resilient basing, sustainment, and communications in a contested environment

 

DESCRIPTION: RL represents a groundbreaking technology with the ability to perform long-term decision-making in complex and dynamic domains at a level surpassing human capabilities [1]. Leveraging this capability holds immense strategic significance for the United States Department of Defense (DoD), given that RL-enabled systems have the potential to outperform even the most exceptional human minds in a wide range of tasks [2]. Despite remarkable improvements, the adoption of RL in high-risk real-world domains such as military applications has been limited by the challenges of explaining RL agent decisions and establishing user trust in these agents. For instance, while the AI AlphaStar competes against highly skilled StarCraft II players, comprehending its inner workings necessitates extensive and impractical empirical investigations [3]. This substantial and inhibitory constraint arises because current Explainable Reinforcement Learning (XRL) methods do not adequately account for the fact that autonomous decision-making agents alter future data observations through their actions and reason about long-term objectives aligned with their mission. It is therefore imperative to develop effective XRL approaches that overcome these limitations and unlock the widespread use of RL's capabilities. Accordingly, we seek proposals for effective and efficient XRL models intended for the US Air Force’s direct operational use.
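As an illustration only, the kind of explanation described above can be sketched on a toy problem. The example below is not part of the solicitation and does not represent any Air Force or offeror method. It assumes a hypothetical five-state corridor in which moving left yields a small immediate reward and moving right yields a larger delayed one, uses exact Q-value iteration as a stand-in for a trained RL agent, and produces a contrastive explanation ("why this action rather than that one") from the gap in expected discounted return and the path each choice leads to. All names (`step`, `q_value_iteration`, `rollout`, `explain`) and all reward values are assumptions made for this sketch.

```python
# Toy corridor (hypothetical): states 0..4; states 0 and 4 are terminal.
# Entering state 0 pays +1 (small, immediate); entering state 4 pays +10 (large, delayed).
N_STATES = 5
ACTIONS = {"left": -1, "right": +1}
REWARD_ON_ENTRY = {0: 1.0, 4: 10.0}
TERMINAL = {0, 4}
GAMMA = 0.9  # discount factor


def step(state, action):
    """Deterministic transition: returns (next_state, reward)."""
    nxt = state + ACTIONS[action]
    return nxt, REWARD_ON_ENTRY.get(nxt, 0.0)


def q_value_iteration(sweeps=50):
    """Repeated Bellman backups over every non-terminal (state, action) pair."""
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(sweeps):
        for s in range(N_STATES):
            if s in TERMINAL:
                continue
            for a in ACTIONS:
                nxt, r = step(s, a)
                future = 0.0 if nxt in TERMINAL else max(q[(nxt, b)] for b in ACTIONS)
                q[(s, a)] = r + GAMMA * future
    return q


def rollout(q, state, first_action, max_steps=10):
    """Take first_action, then act greedily; return visited states and rewards."""
    path, rewards, action = [state], [], first_action
    for _ in range(max_steps):
        state, r = step(state, action)
        path.append(state)
        rewards.append(r)
        if state in TERMINAL:
            break
        action = max(ACTIONS, key=lambda a: q[(state, a)])
    return path, rewards


def explain(q, state):
    """Contrastive explanation: why the greedy action rather than each alternative."""
    best = max(ACTIONS, key=lambda a: q[(state, a)])
    best_path, _ = rollout(q, state, best)
    lines = []
    for alt in ACTIONS:
        if alt == best:
            continue
        alt_path, _ = rollout(q, state, alt)
        gap = q[(state, best)] - q[(state, alt)]
        lines.append(
            f"In state {state}, '{best}' (expected return {q[(state, best)]:.2f}, "
            f"path {best_path}) is preferred over '{alt}' "
            f"(expected return {q[(state, alt)]:.2f}, path {alt_path}) by {gap:.2f}."
        )
    return best, lines


q = q_value_iteration()
best_action, explanation = explain(q, state=1)
print("\n".join(explanation))
```

In this sketch the agent in state 1 forgoes the immediate reward on its left because the discounted value of the delayed reward (0.9 x 0.9 x 10 = 8.1) exceeds it. The explanation exposes both the numeric gap and the future states each choice leads to, which is the long-horizon reasoning a human reviewer would need to see. Real C2 settings are stochastic, high-dimensional, and use learned function approximators, so this is a conceptual aid rather than a template.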

 

PHASE I: As this is a Direct-to-Phase-II (D2P2) topic, no Phase I awards will be made as a result of this topic. To qualify for this D2P2 topic, the Government expects the Offeror to demonstrate feasibility by means of a prior “Phase I-type” effort that does not constitute work undertaken as part of a prior SBIR/STTR funding agreement. The Offeror is required to provide detail and documentation in the D2P2 proposal demonstrating accomplishment of a “Phase I-type” effort: a case study or prototype in which the Offeror performed explainable reinforcement learning for a practical application and provided intuitive and understandable explanations to humans, based on its AI/ML inference findings, to verify an agent’s behavior.

 

PHASE II: This Phase II topic seeks 6.2 explainable AI/ML solutions using reinforcement learning for command and control applications. Proposals should include development, installation, integration, demonstration, and test and evaluation of the proposed prototype system. The prototype should verify an agent’s behavior and address the performance trade-off, trust, and explanation quality, ultimately giving humans an intuitive understanding of how the agent arrived at its decision.

 

PHASE III DUAL USE APPLICATIONS: Phase III efforts will focus on transitioning the developed technology to a working commercial or warfighter solution. The offeror will identify the transition partners. The technology will meet a minimum of TRL 6 and will be mature and operationally ready. The solution will be configured, tailored, and further developed to match the customer requirements and the specific environment configuration for deployment. A transition plan must be developed and delivered. Phase III efforts are not competed; it is therefore the responsibility of the offeror to seek funding opportunities.

 

REFERENCES:

[1] V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski and S. Petersen, "Human-level control through deep reinforcement learning," Nature, vol. 518, pp. 529-533, February 2015.
[2] Select Committee on Artificial Intelligence of the National Science & Technology Council, "The National Artificial Intelligence Research and Development Strategic Plan: 2019 Update," June 2019. https://www.nitrd.gov/pubs/National-AI-RD-Strategy-2019.pdf

 

KEYWORDS: Reinforcement Learning interpretability; Reinforcement Learning explanations;

Similar Opportunities

DOD SBIR 24.4 Annual
Department of Defense
The Department of Defense (DOD) is seeking proposals for an open topic on persistent experimentation. The U.S. Army, under the Office of the Under Secretary of Defense for Research and Engineering (OUSD (R&E)), is specifically interested in novel, disruptive concepts and technology solutions with dual-use capabilities. The goal is to address the Army's current needs and future concepts by experimenting, refining, and advancing technology solutions in operationally relevant environments. The Army encourages participation in its persistent experimentation events to mature and test the technology. Proposals should align with specific experimentation events and demonstrate potential for commercial applications. Only Direct to Phase II (DP2) proposals will be accepted; they should provide documentation of scientific and technical merit, feasibility, and potential commercial applications. DP2 awardees are expected to produce a prototype solution ready for field demonstration and deliver a technology transition and commercialization plan. Phase III focuses on the maturation of the technology to TRL 6/7 and further development and commercialization. The keywords for this solicitation include Human-Machine Integration (HMI), autonomy, artificial intelligence (AI), logistics, ground systems, air systems, robotics, sensors, and electromagnetic warfare (EW). The solicitation is open until March 31, 2025. For more information, visit the [solicitation link](https://www.sbir.gov/node/2603059).
DOD SBIR 24.4 Annual
Department of Defense
The Department of Defense (DOD) is seeking proposals for the xTechScalable AI 2 topic. This solicitation focuses on two main areas:
1. Scalable Tools for Automated AI Risk Management and Algorithmic Analysis: The Army is looking for automated tools to evaluate and mitigate risk against an AI Risk Management Framework (RMF). The tools should be able to evaluate multiple dimensions of AI risk, classify and quantify AI risk, and propose mitigation options. The Army is particularly interested in tools that can accept risk-related inputs from multiple data sources and modalities and have standardized evaluation methods and mitigation strategies.
2. Scalable Techniques for Robust Testing and Evaluation (T&E) of AI Operations Pipelines: The Army needs a robust and automated T&E approach for AI Operations Pipelines. This includes evaluating data integrity, data labeling, and model training. The Army is interested in tools that can identify and evaluate data integrity, assess the quality and accuracy of data labels, and evaluate model performance in terms of resource consumption, robustness, scalability, and privacy and security.
Phase I proposals can receive up to $250,000 for a 6-month period, while Direct to Phase II proposals can receive up to $2,000,000 for an 18-month period. Phase II involves producing prototype solutions that will be evaluated by soldiers, and Phase III focuses on maturing the technology and producing prototypes for further development and commercialization. The xTechScalable AI 2 prize competition will be used to identify small businesses eligible to submit proposals under this topic. The full solicitation can be found at the following link: [solicitation_agency_url].
DOD SBIR 24.4 Annual
Department of Defense
The Department of Defense (DOD) is seeking proposals for the xTech Search 8 SBIR Finalist Open Topic Competition. The objective of this solicitation is to find novel and disruptive concepts and technology solutions with dual-use capabilities that can address the Army's current needs and apply to current Army concepts. The technology areas of interest include Electronics, Human Systems, and Sensors. The Army is particularly interested in technologies related to Artificial Intelligence/Machine Learning, Advanced Materials, Advanced Manufacturing, Autonomy, Cyber, Human Performance, Immersive, Network Technologies, Position, Navigation and Timing (PNT), Power, Software Modernization, and Sensors. Phase I requires a feasibility study and concept plans, while Phase II involves producing prototype solutions that can be easily operated by soldiers. Phase III focuses on the maturation of the technology and its transition to TRL 6/7, as well as further development and commercialization. The solicitation is open until March 31, 2025. For more information, visit the [solicitation agency website](https://www.defensesbirsttr.mil/SBIR-STTR/Opportunities/).