Validation

The Validation Analytic Engine uses specialized transformer models to perform zero-shot classification. It validates content, detects problematic content, and assesses regulatory compliance risk.

Validation Analytic Engine

Use Case: Content validation and classification
Technology: Transformer-based zero-shot classification
Valid Inputs: Text in Exchanges
Available Evaluators:
- PromptInjection-Validation
- ToxicLanguage-Validation
Last Engine Update: 2025-03-03
Dependencies: PyTorch, HuggingFace Transformers

Detailed Description

The Validation Analytic Engine uses specialized transformer models to classify content against predefined categories without task-specific training examples. This "zero-shot" approach makes it possible to detect a wide range of content concerns, from harmful language to regulatory compliance issues, by changing only the configured categories. All transformer models used by this engine are hosted within the ThirdLaw VPC to ensure data security and privacy.

How It Works

The Validation Analytic Engine processes input text from prompts and/or completions using transformer-based models that perform zero-shot classification. When an Evaluator using this engine is initialized, it loads a pre-trained transformer model (e.g., a DeBERTa or BERT variant) and configures it with a set of classification topics that represent categories of interest. During analysis, the engine processes the input text through the transformer model, which evaluates the likelihood that the content belongs to each of the configured categories.
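The initialization and scoring steps described above can be sketched with the HuggingFace Transformers `pipeline` API. This is a minimal illustration under stated assumptions, not the engine's actual implementation: the function names `load_classifier` and `score_topics` are hypothetical, and the choice of `multi_label=True` (each topic scored independently) is an assumption about how the engine treats multiple categories.

```python
from typing import Callable, Dict, List

# Default model ID taken from the configuration table in this document.
DEFAULT_MODEL_ID = "MoritzLaurer/deberta-v3-large-zeroshot-v2.0"


def load_classifier(model_id: str = DEFAULT_MODEL_ID) -> Callable:
    """Load a zero-shot classification pipeline (hypothetical helper).

    The model is downloaded on first use, so this call can be slow.
    """
    # Imported lazily so score_topics can be exercised without transformers installed.
    from transformers import pipeline

    return pipeline("zero-shot-classification", model=model_id)


def score_topics(classifier: Callable, text: str, topics: List[str]) -> Dict[str, float]:
    """Return a {topic: score} mapping for one piece of text (hypothetical helper)."""
    # Assumption: multi_label=True, so every topic gets an independent score
    # and the scores do not have to sum to 1.
    result = classifier(text, candidate_labels=topics, multi_label=True)
    # The pipeline returns parallel "labels" and "scores" lists, sorted by score.
    return dict(zip(result["labels"], result["scores"]))
```

A caller would typically run `classifier = load_classifier()` once at Evaluator initialization and then call `score_topics(classifier, text, topics)` for each prompt or completion. Passing the classifier in as an argument keeps model loading separate from scoring, which also makes the scoring step easy to test with a stub.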

After classification, the engine applies configurable thresholds to the resulting classification scores to determine which categories, if any, are present in the content. The engine can evaluate content against multiple categories in a single pass, producing one assessment that covers every configured concern. This multi-category analysis makes the Validation Analytic Engine well suited to content moderation, safety checks, and compliance validation, where several aspects of the content must be checked against established guidelines.
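The thresholding step can be illustrated with a short sketch. The helper name `apply_thresholds` is hypothetical, and the comparison rule (a topic counts as present when its score is greater than or equal to its threshold) is an assumption; this document does not state whether the engine's comparison is inclusive.

```python
from typing import Dict


def apply_thresholds(scores: Dict[str, float], thresholds: Dict[str, float]) -> Dict[str, bool]:
    """Mark each topic as present or absent based on its own threshold."""
    # Fail loudly if a scored topic has no configured threshold.
    missing = set(scores) - set(thresholds)
    if missing:
        raise KeyError(f"no threshold configured for: {sorted(missing)}")
    # Assumption: the comparison is inclusive (score >= threshold).
    return {topic: score >= thresholds[topic] for topic, score in scores.items()}


# Illustrative values only; these are not real model outputs.
example_scores = {"toxic language": 0.82, "prompt injection": 0.10}
example_thresholds = {"toxic language": 0.70, "prompt injection": 0.50}
example_found = apply_thresholds(example_scores, example_thresholds)
```

Because each topic has its own threshold, a sensitive category can be flagged at a lower score than a noisy one without retraining or swapping the model.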

Configuration Options

The Validation Analytic Engine supports the following configuration parameters:

Parameter	Description	Default
`topics`	List of topics to classify against	Required
`apply_to`	Where to apply analysis (prompt, completion, both)	both
`model_id`	HuggingFace model ID to use	MoritzLaurer/deberta-v3-large-zeroshot-v2.0
`thresholds`	Category-specific score thresholds	Required
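As an illustration of how these parameters fit together, the sketch below models them as a Python dataclass with basic validation. The class name `ValidationConfig` is hypothetical, and two details are assumptions not stated in this document: that `thresholds` is a mapping keyed by topic, and that threshold values lie between 0 and 1. The engine's actual configuration format may differ.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ValidationConfig:
    """Hypothetical container for the parameters in the table above."""

    topics: List[str]                 # required
    thresholds: Dict[str, float]      # required; assumed to be keyed by topic
    apply_to: str = "both"            # "prompt", "completion", or "both"
    model_id: str = "MoritzLaurer/deberta-v3-large-zeroshot-v2.0"

    def __post_init__(self) -> None:
        if not self.topics:
            raise ValueError("topics must be a non-empty list")
        if self.apply_to not in ("prompt", "completion", "both"):
            raise ValueError(f"invalid apply_to value: {self.apply_to!r}")
        for topic in self.topics:
            if topic not in self.thresholds:
                raise ValueError(f"missing threshold for topic: {topic!r}")
            # Assumption: thresholds share the 0-1 range of the confidence scores.
            if not 0.0 <= self.thresholds[topic] <= 1.0:
                raise ValueError(f"threshold for {topic!r} must be between 0 and 1")


# Illustrative values only.
config = ValidationConfig(
    topics=["toxic language"],
    thresholds={"toxic language": 0.7},
)
```

Validating at construction time surfaces a missing or out-of-range threshold when the Evaluator is configured rather than during analysis.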

Finding Structure

A generic Finding structure for the Validation Analytic Engine:

{
    "EvaluatorName-Validation": True/False,
    "EvaluatorName-Validation.any": True/False,
    "EvaluatorName-Validation.topic_name.found": True/False,
    "EvaluatorName-Validation.topic_name.confidence": [0-1],
}
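To make the key layout concrete, the following sketch assembles a Finding of this shape from per-topic scores and per-topic found flags. The function `build_finding` is hypothetical, and one detail is an assumption: both the top-level key and the `.any` key are set to true when at least one topic is found. This document does not define how those two keys differ.

```python
from typing import Dict, Union

FindingValue = Union[bool, float]


def build_finding(
    evaluator_name: str,
    scores: Dict[str, float],
    found: Dict[str, bool],
) -> Dict[str, FindingValue]:
    """Assemble a flat Finding dictionary (hypothetical helper)."""
    prefix = f"{evaluator_name}-Validation"
    finding: Dict[str, FindingValue] = {}
    for topic, score in scores.items():
        finding[f"{prefix}.{topic}.found"] = found[topic]
        finding[f"{prefix}.{topic}.confidence"] = score
    # Assumption: the top-level and ".any" keys are both true if any topic was found.
    any_found = any(found.values())
    finding[prefix] = any_found
    finding[f"{prefix}.any"] = any_found
    return finding


# Illustrative values only; these are not real model outputs.
example_finding = build_finding(
    "ToxicLanguage",
    scores={"toxic": 0.82, "insult": 0.10},
    found={"toxic": True, "insult": False},
)
```

With these illustrative inputs, the result contains keys such as `ToxicLanguage-Validation.toxic.found` and `ToxicLanguage-Validation.toxic.confidence`, matching the generic structure shown above.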

Available Evaluators

The following table lists common Evaluators that can be created using the Validation Analytic Engine:

Evaluator Name	Description	Common Use Cases
PromptInjection-Validation	Detects attempts to manipulate or hijack the prompt	Security monitoring, prompt security
ToxicLanguage-Validation	Identifies toxic, harmful, or inappropriate language	Content moderation, user safety

Dependencies

PyTorch: Deep learning framework
Transformers: HuggingFace Transformers library

Revision History

2025-03-03: Initial documentation creation
