Microsoft releases ASSERT tool for AI behavior testing

Original: New Microsoft tool lets devs spin up AI behavior tests using text descriptions

Why This Matters

Addresses growing need for specialized AI testing as models integrate into products

Microsoft launched ASSERT, an open-source framework that converts natural language descriptions into AI behavior tests. The tool generates test cases, scores results, and monitors compliance for application-specific AI systems.

Microsoft unveiled ASSERT (Adaptive Spec-driven Scoring for Evaluation and Regression Testing), an open-source framework designed to simplify AI behavior evaluation for developers. The tool converts plain-language descriptions of expected AI behavior and policies into structured tests with scoring capabilities. ASSERT generates problem scenarios, runs them against target systems, and records AI decision paths including intermediate actions. Developers can specify system context, tools, and constraints for customized evaluations. For example, a document research AI agent could be tested to ensure it doesn't email external contacts and limits confidential information to C-level executives. Microsoft's Sarah Bird, chief product officer of Responsible AI, emphasized the need for application-specific evaluations beyond general assessments. The framework addresses gaps in current AI evaluation methods as models become more capable and require repeatable testing protocols.

Source

techcrunch.com — Read original →