Validation

The Validation Analytic Engine uses specialized transformer models to perform zero-shot classification for validating content, detecting problematic content, and assessing regulatory compliance risk.

Validation Analytic Engine
  • Use Case: Content validation and classification
  • Technology: Transformer-based zero-shot classification
  • Valid Inputs: Text in Exchanges
  • Available Evaluators: PromptInjection-Validation, ToxicLanguage-Validation
  • Last Engine Update: 2025-03-03
  • Dependencies: PyTorch, HuggingFace Transformers

Detailed Description

The Validation Analytic Engine uses specialized transformer models to classify content against predefined categories without requiring labeled training examples for those categories. This "zero-shot" approach makes it possible to detect a wide range of content concerns, from harmful language to regulatory compliance issues, by changing only the list of categories. All transformer models used by this engine are hosted within the ThirdLaw VPC to ensure data security and privacy.

How It Works

The Validation Analytic Engine processes input text from prompts and/or completions using transformer-based models that perform zero-shot classification. When an Evaluator using this engine is initialized, it loads a pre-trained transformer model (e.g., a DeBERTa or BERT variant) and configures it with a set of classification topics that represent categories of interest. During analysis, the engine processes the input text through the transformer model, which evaluates the likelihood that the content belongs to each of the configured categories.
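The initialization and classification steps described above can be sketched with the HuggingFace Transformers zero-shot pipeline. This is a minimal illustration, not the engine's actual code: the function names `load_classifier` and `score_topics` are hypothetical, and scoring each topic independently (`multi_label=True`) is an assumption about how the engine treats multiple categories.

```python
from typing import Callable, Dict, List

# Default model ID as listed in the configuration options; the rest is illustrative.
DEFAULT_MODEL_ID = "MoritzLaurer/deberta-v3-large-zeroshot-v2.0"


def load_classifier(model_id: str = DEFAULT_MODEL_ID) -> Callable:
    """Load a pre-trained zero-shot classification pipeline (hypothetical helper)."""
    # Imported lazily so the scoring helper below can be used with any
    # compatible callable, without loading the model.
    from transformers import pipeline

    return pipeline("zero-shot-classification", model=model_id)


def score_topics(classifier: Callable, text: str, topics: List[str]) -> Dict[str, float]:
    """Return one score per configured topic for the given text."""
    # Assumption: multi_label=True, so each topic is scored independently
    # instead of the scores being normalized to sum to 1 across topics.
    result = classifier(text, candidate_labels=topics, multi_label=True)
    return dict(zip(result["labels"], result["scores"]))
```

A caller would typically load the classifier once at Evaluator initialization and reuse it for every Exchange, for example `score_topics(load_classifier(), prompt_text, ["prompt injection"])`, where the topic string is only a placeholder.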

After classification, the engine compares each category's score against its configured threshold to determine which categories, if any, are present in the content. The engine can evaluate content against multiple categories at once, producing one assessment that covers all configured concerns. This makes the Validation Analytic Engine well suited to content moderation, safety checks, and compliance validation, where several aspects of the content must be checked against established guidelines.
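The thresholding step can be sketched as a per-category comparison. The helper name `flag_topics`, the sample scores, and the use of `>=` (rather than `>`) are assumptions made for illustration; the documentation does not state whether the comparison is inclusive.

```python
from typing import Dict


def flag_topics(scores: Dict[str, float], thresholds: Dict[str, float]) -> Dict[str, bool]:
    """Mark each topic as present when its score meets its own threshold."""
    flagged = {}
    for topic, score in scores.items():
        # Assumption: a score equal to the threshold counts as a match.
        flagged[topic] = score >= thresholds[topic]
    return flagged


# Illustrative scores and thresholds, not real model output.
scores = {"toxic language": 0.91, "harassment": 0.40}
thresholds = {"toxic language": 0.80, "harassment": 0.70}

flagged = flag_topics(scores, thresholds)
present = [topic for topic, hit in flagged.items() if hit]
print(present)  # ['toxic language']
```

Keeping a separate threshold per category lets a stricter cutoff be used for categories where false positives are costly.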

Configuration Options

The Validation Analytic Engine supports the following configuration parameters:

| Parameter | Description | Default |
|---|---|---|
| topics | List of topics to classify against | Required |
| apply_to | Where to apply analysis (prompt, completion, both) | both |
| model_id | HuggingFace model ID to use | MoritzLaurer/deberta-v3-large-zeroshot-v2.0 |
| thresholds | Category-specific score thresholds | Required |
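As a rough illustration of how these parameters fit together, the sketch below models them as a Python dataclass with basic validation. The class name `ValidationConfig`, the topic strings, and the assumption that `thresholds` maps each topic to a score in [0, 1] are hypothetical; the actual configuration format used by the engine may differ.

```python
from dataclasses import dataclass
from typing import Dict, List

ALLOWED_APPLY_TO = {"prompt", "completion", "both"}


@dataclass
class ValidationConfig:
    """Hypothetical container mirroring the documented parameters."""

    topics: List[str]             # required
    thresholds: Dict[str, float]  # required; assumed to map topic -> minimum score
    apply_to: str = "both"
    model_id: str = "MoritzLaurer/deberta-v3-large-zeroshot-v2.0"

    def __post_init__(self) -> None:
        if not self.topics:
            raise ValueError("topics must contain at least one entry")
        if self.apply_to not in ALLOWED_APPLY_TO:
            raise ValueError(f"apply_to must be one of {sorted(ALLOWED_APPLY_TO)}")
        missing = [t for t in self.topics if t not in self.thresholds]
        if missing:
            raise ValueError(f"missing thresholds for topics: {missing}")
        for topic, value in self.thresholds.items():
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"threshold for {topic!r} must be in [0, 1]")


# Illustrative values only.
config = ValidationConfig(
    topics=["toxic language", "harassment"],
    thresholds={"toxic language": 0.8, "harassment": 0.7},
)
```

Validating at construction time surfaces a missing or out-of-range threshold before any text is classified.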

Finding Structure

A generic Finding structure for the Validation Analytic Engine:

{
  "EvaluatorName-Validation": True/False,
  "EvaluatorName-Validation.any": True/False,
  "EvaluatorName-Validation.topic_name.found": True/False,
  "EvaluatorName-Validation.topic_name.confidence": [0-1]
}
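To make the key layout concrete, the sketch below assembles a Finding of this shape from per-topic scores and thresholds. The function `build_finding` is hypothetical, and two points are assumptions rather than documented behavior: that the top-level key and the `.any` key are both true when at least one topic is found, and that `confidence` carries the raw classification score.

```python
from typing import Dict, Union

FindingValue = Union[bool, float]


def build_finding(
    evaluator_name: str,
    scores: Dict[str, float],
    thresholds: Dict[str, float],
) -> Dict[str, FindingValue]:
    """Assemble a flat Finding dictionary using the documented key pattern."""
    prefix = f"{evaluator_name}-Validation"
    found = {topic: score >= thresholds[topic] for topic, score in scores.items()}
    any_found = any(found.values())

    # Assumption: the top-level flag and ".any" both mean "at least one topic matched".
    finding: Dict[str, FindingValue] = {prefix: any_found, f"{prefix}.any": any_found}
    for topic, score in scores.items():
        finding[f"{prefix}.{topic}.found"] = found[topic]
        finding[f"{prefix}.{topic}.confidence"] = score  # assumed to be the raw score
    return finding


# Illustrative input, not real model output.
finding = build_finding(
    "ToxicLanguage",
    scores={"toxic": 0.91, "spam": 0.20},
    thresholds={"toxic": 0.80, "spam": 0.50},
)
```

With this input, `finding["ToxicLanguage-Validation.toxic.found"]` is `True` and `finding["ToxicLanguage-Validation.spam.found"]` is `False`.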

Available Evaluators

The following table lists common Evaluators that can be created using the Validation Analytic Engine:

| Evaluator Name | Description | Common Use Cases |
|---|---|---|
| PromptInjection-Validation | Detects attempts to manipulate or hijack the prompt | Security monitoring, prompt security |
| ToxicLanguage-Validation | Identifies toxic, harmful, or inappropriate language | Content moderation, user safety |

Dependencies

  • PyTorch: Deep learning framework
  • Transformers: HuggingFace Transformers library

Revision History

  • 2025-03-03: Initial documentation creation