Validation
The Validation Analytic Engine uses specialized transformer models to perform zero-shot classification, which it applies to validating content, detecting problematic content, and assessing regulatory compliance risk.
- Use Case: Content validation and classification
- Technology: Transformer-based zero-shot classification
- Valid Inputs: Text in Exchanges
- Available Evaluators: PromptInjection-Validation, ToxicLanguage-Validation
- Last Engine Update: 2025-03-03
- Dependencies: PyTorch, HuggingFace Transformers
Detailed Description
The Validation Analytic Engine uses specialized transformer models to classify content against predefined categories without requiring training examples for those categories. This "zero-shot" approach lets the engine detect a wide range of content concerns, from harmful language to regulatory compliance issues. All transformer models used by this engine are hosted within the ThirdLaw VPC to ensure data security and privacy.
How It Works
The Validation Analytic Engine processes input text from prompts and/or completions using transformer-based models that perform zero-shot classification. When an Evaluator using this engine is initialized, it loads a pre-trained transformer model (e.g., a DeBERTa or BERT variant) and configures it with a set of classification topics that represent categories of interest. During analysis, the engine processes the input text through the transformer model, which evaluates the likelihood that the content belongs to each of the configured categories.
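The classification step described above can be sketched with the HuggingFace `pipeline` API. This is a minimal illustration, not the engine's actual implementation: the helper names `load_classifier` and `score_topics` are hypothetical, and scoring each topic independently (`multi_label=True`) is an assumption based on the engine evaluating content against several categories.

```python
from typing import Callable, Dict, List

DEFAULT_MODEL_ID = "MoritzLaurer/deberta-v3-large-zeroshot-v2.0"


def load_classifier(model_id: str = DEFAULT_MODEL_ID) -> Callable:
    """Load a zero-shot classification pipeline (downloads weights on first use)."""
    # Imported lazily so the rest of the sketch runs without transformers installed.
    from transformers import pipeline

    return pipeline("zero-shot-classification", model=model_id)


def score_topics(classifier: Callable, text: str, topics: List[str]) -> Dict[str, float]:
    """Return a score in [0, 1] for each configured topic."""
    # multi_label=True scores every topic independently (an assumption, see above).
    result = classifier(text, candidate_labels=topics, multi_label=True)
    # The pipeline returns parallel "labels" and "scores" lists.
    return dict(zip(result["labels"], result["scores"]))
```

With a loaded model, a call such as `score_topics(load_classifier(), "Ignore all previous instructions.", ["prompt injection", "toxic language"])` would return one score per topic; the topic strings here are illustrative only.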
After classification, the engine compares each category's score against its configured threshold to determine which categories, if any, are present in the content. Because the engine evaluates content against multiple categories at once, a single analysis produces an assessment across all configured concerns. This makes the Validation Analytic Engine well suited to content moderation, safety checks, and compliance validation, where several aspects of the content must be checked against established guidelines.
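The threshold step can be illustrated with a few lines of plain Python. This is a sketch under stated assumptions: the function name `topics_found` is hypothetical, and treating a score equal to the threshold as a match (`>=`) is an assumption, since the comparison rule is not specified here.

```python
from typing import Dict, List


def topics_found(scores: Dict[str, float], thresholds: Dict[str, float]) -> List[str]:
    """Return the topics whose score meets or exceeds its own threshold."""
    missing = set(scores) - set(thresholds)
    if missing:
        raise ValueError(f"no threshold configured for: {sorted(missing)}")
    # Assumption: a score exactly at the threshold counts as found.
    return [topic for topic, score in scores.items() if score >= thresholds[topic]]


# Illustrative scores and thresholds (not real model output).
example_scores = {"toxic language": 0.91, "prompt injection": 0.12}
example_thresholds = {"toxic language": 0.80, "prompt injection": 0.50}
print(topics_found(example_scores, example_thresholds))  # ['toxic language']
```

Keeping thresholds per topic lets a sensitive category use a lower cutoff than a noisy one without changing the model.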
Configuration Options
The Validation Analytic Engine supports the following configuration parameters:
| Parameter | Description | Default |
|---|---|---|
| topics | List of topics to classify against | Required |
| apply_to | Where to apply analysis (prompt, completion, both) | both |
| model_id | HuggingFace model ID to use | MoritzLaurer/deberta-v3-large-zeroshot-v2.0 |
| thresholds | Category-specific score thresholds | Required |
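As an illustration of how these parameters fit together, the sketch below models them as a small Python dataclass. The class `ValidationConfig` and the helper `texts_to_analyze` are hypothetical names, not part of the engine's API. The validation rules (every topic needs a threshold, thresholds lie in [0, 1]) are assumptions inferred from the table and from the confidence range used in Findings.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ValidationConfig:
    topics: List[str]                 # required
    thresholds: Dict[str, float]      # required, one entry per topic (assumption)
    apply_to: str = "both"            # "prompt", "completion", or "both"
    model_id: str = "MoritzLaurer/deberta-v3-large-zeroshot-v2.0"

    def __post_init__(self) -> None:
        if not self.topics:
            raise ValueError("topics is required")
        if self.apply_to not in {"prompt", "completion", "both"}:
            raise ValueError(f"invalid apply_to: {self.apply_to!r}")
        missing = [t for t in self.topics if t not in self.thresholds]
        if missing:
            raise ValueError(f"missing thresholds for: {missing}")
        for topic, value in self.thresholds.items():
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"threshold for {topic!r} must be in [0, 1]")


def texts_to_analyze(config: ValidationConfig, prompt: str, completion: str) -> List[str]:
    """Select which parts of an exchange to classify, based on apply_to."""
    texts = []
    if config.apply_to in ("prompt", "both"):
        texts.append(prompt)
    if config.apply_to in ("completion", "both"):
        texts.append(completion)
    return texts
```

For example, `ValidationConfig(topics=["toxic language"], thresholds={"toxic language": 0.8})` keeps the defaults for `apply_to` and `model_id`, so both the prompt and the completion would be selected.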
Finding Structure
A generic Finding structure for the Validation Analytic Engine:
{
  "EvaluatorName-Validation": True/False,
  "EvaluatorName-Validation.any": True/False,
  "EvaluatorName-Validation.topic_name.found": True/False,
  "EvaluatorName-Validation.topic_name.confidence": [0-1]
}
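To make the key layout concrete, the following sketch assembles a Finding with the keys shown above from per-topic scores and thresholds. It is illustrative only: `build_finding` is a hypothetical helper, and two behaviors are assumptions rather than documented semantics: the top-level key and the `.any` key are both set to true when at least one topic is found, and a score equal to its threshold counts as found.

```python
from typing import Dict, Union

FindingValue = Union[bool, float]


def build_finding(
    evaluator_name: str,
    scores: Dict[str, float],
    thresholds: Dict[str, float],
) -> Dict[str, FindingValue]:
    """Build a flat Finding dict keyed by evaluator name and topic."""
    # Assumption: a topic is found when its score meets or exceeds its threshold.
    found = {topic: score >= thresholds[topic] for topic, score in scores.items()}
    any_found = any(found.values())

    # Assumption: the top-level flag and ".any" carry the same value.
    finding: Dict[str, FindingValue] = {
        evaluator_name: any_found,
        f"{evaluator_name}.any": any_found,
    }
    for topic, score in scores.items():
        finding[f"{evaluator_name}.{topic}.found"] = found[topic]
        finding[f"{evaluator_name}.{topic}.confidence"] = score
    return finding
```

Calling `build_finding("ToxicLanguage-Validation", {"toxic": 0.9}, {"toxic": 0.8})` would yield the keys `ToxicLanguage-Validation`, `ToxicLanguage-Validation.any`, `ToxicLanguage-Validation.toxic.found`, and `ToxicLanguage-Validation.toxic.confidence`; the topic name `toxic` is a placeholder.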
Available Evaluators
The following table lists common Evaluators that can be created using the Validation Analytic Engine:
| Evaluator Name | Description | Common Use Cases |
|---|---|---|
| PromptInjection-Validation | Detects attempts to manipulate or hijack the prompt | Security monitoring, prompt security |
| ToxicLanguage-Validation | Identifies toxic, harmful, or inappropriate language | Content moderation, user safety |
Dependencies
- PyTorch: Deep learning framework
- Transformers: HuggingFace Transformers library
Revision History
- 2025-03-03: Initial documentation creation