Model Routing Controls

Control which models can touch which investigative work

Approve public, sovereign, and on-prem lanes by policy. Keep routine extraction cheap, keep sensitive material inside the boundary, and log every routed decision back to the case.

Review AI Workflow | Review Deployment Options

Routing posture

Approved model lanes: public, sovereign, and on-prem options in one control surface
Routine workload savings: 82-97% when extraction and classification stay on controlled lanes
Trace coverage: 100% of prompts, outputs, reviewers, and model versions logged
Fallback posture: multi-lane, so operators keep working if one provider or region becomes unavailable

Approved model lanes

One operator workflow, multiple approved model lanes

Analysts stay in the same case workflow while policy decides whether a task can use an external provider or must stay private.

Use approved external providers for complex reasoning, multimodal review, and long-context synthesis when the workload is cleared to leave your controlled environment.

Keep CJI, classified material, and routine high-volume work on private infrastructure or agency-hosted environments where data control is non-negotiable.

The routing layer evaluates sensitivity, complexity, and required review before choosing a model lane.

Linked case context

Model output should resolve into entities, links, and case decisions

Summaries and extracted findings stay attached to the investigation graph so analysts can review, merge, reject, or promote them inside the live case workflow.

Approved provider: Google Gemini
Capabilities: Multimodal review, long context, image reasoning
Best for: Photo, video, and long-form document analysis cleared for external handling
Compliance: FedRAMP High path

Approved provider: OpenAI GPT
Capabilities: Deep reasoning, structured extraction, case synthesis
Best for: Cross-file reasoning, entity extraction, and analyst drafting outside controlled data lanes
Compliance: FedRAMP High path, CJIS-ready deployment patterns

Approved provider: Amazon Bedrock
Capabilities: Multi-model access, GovCloud alignment, enterprise scale
Best for: GovCloud-aligned deployments, partner environments, and approved multi-model routing
Compliance: FedRAMP High, DoD IL4/IL5 path, CJIS-ready deployment patterns

Approved provider: Anthropic Claude
Capabilities: Long context, guardrailed drafting, review support
Best for: Long briefings, case review, and cautious drafting where operators need strong review discipline
Compliance: FedRAMP High path

Approved provider: xAI Grok
Capabilities: Current public data, pattern triage
Best for: Open-source and social-media context where current public signals matter more than controlled records
Compliance: Contract-specific approval

Routing Control

See the routing decision before the prompt leaves the boundary

Input Prompt: Extract names, locations, and organizations from this arrest report.

Routing Review

Sensitivity: Low
Complexity: Low
Decision: Private

Routine structured extraction stays in the cheapest controlled lane so volume work does not spill to premium providers.

Selected Model: Llama 3.1 8B
Cost: $0.00002
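
A routing review like the one above can be pictured as a small policy function. The following is a minimal sketch only, not the product's routing engine: the `Level`, `Task`, and `Decision` types, the `cleared_for_external` flag, and the order of the rules are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    """Hypothetical three-step scale for sensitivity and complexity."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class Task:
    task_type: str
    sensitivity: Level
    complexity: Level
    # Assumed policy flag: has this workload been cleared to leave
    # the controlled environment?
    cleared_for_external: bool = False


@dataclass(frozen=True)
class Decision:
    lane: str    # "private" or "external"
    reason: str  # recorded alongside the routed operation


def route(task: Task) -> Decision:
    """Pick a lane from sensitivity, clearance, and complexity (illustrative rules)."""
    # Anything sensitive, or not explicitly cleared, stays inside the boundary.
    if task.sensitivity is not Level.LOW or not task.cleared_for_external:
        return Decision("private", "sensitive or not cleared to leave the controlled environment")
    # Routine work stays on the cheapest controlled lane even when cleared.
    if task.complexity is Level.LOW:
        return Decision("private", "routine work stays in the cheapest controlled lane")
    # Only cleared, non-routine work reaches an approved external provider.
    return Decision("external", "complex work cleared for an approved external provider")


arrest_report = Task("entity_extraction", Level.LOW, Level.LOW)
print(route(arrest_report))
```

Under these assumed rules the arrest-report extraction resolves to the private lane, matching the decision shown in the review.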

Cost Control

Estimate routed workload spend

Compare a premium public-only posture with routed workloads that keep routine casework on cheaper controlled lanes.

Monthly AI Operations: 100,000 prompts/month

Task Weighting

Adjust the workload weights. The calculator normalizes them to 100% of monthly volume automatically.

Entity Extraction: 40
Classification: 30
Summarization: 20
Complex Analysis: 10

Current weight total: 100

Monthly Cost Comparison

Public-only routing: $289.00
With policy-based routing: $158.60
Monthly savings: $130.40 (45%)
Annualized savings: $1,564.80
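
The arithmetic behind the comparison can be checked with a short sketch. It assumes nothing about per-model pricing, which the page does not state. It only normalizes the task weights to shares of monthly volume and derives the savings figures from the two monthly totals shown. The function names `normalize_weights` and `savings` are illustrative, not part of any product API.

```python
from decimal import Decimal


def normalize_weights(weights: dict[str, float]) -> dict[str, float]:
    """Scale arbitrary positive weights so they sum to 1.0 (100% of volume)."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return {task: weight / total for task, weight in weights.items()}


def savings(public_only: Decimal, routed: Decimal) -> dict[str, Decimal]:
    """Derive monthly, percentage, and annualized savings from two monthly totals."""
    monthly = public_only - routed
    return {
        "monthly": monthly,
        "percent": (monthly / public_only * 100).quantize(Decimal("1")),
        "annual": monthly * 12,
    }


# Weights and monthly volume as shown in the calculator.
weights = {
    "entity_extraction": 40,
    "classification": 30,
    "summarization": 20,
    "complex_analysis": 10,
}
shares = normalize_weights(weights)
prompts_per_task = {task: round(share * 100_000) for task, share in shares.items()}

# Monthly totals as shown in the comparison.
result = savings(Decimal("289.00"), Decimal("158.60"))
print(prompts_per_task)
print(result)
```

With the figures on the page this gives monthly savings of 130.40, a rounded 45 percent, and annualized savings of 1564.80, which agrees with the comparison.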

Operating Advantages

Why routed multi-model control matters in operations

The point is not model novelty. It is keeping provider choice, cost control, and evidentiary discipline inside one repeatable case workflow.

fallback paths

Policy insulation

Provider terms and approved-use rules change. A routed architecture preserves continuity when one provider becomes unavailable or restricted.

approved lanes

Capability matching

Extraction, long-context review, and multimodal analysis do not belong on one default model. Route the job instead of forcing a compromise.

single-vendor dependence

Negotiating leverage

Avoid provider lock-in and keep procurement leverage by proving that operators can work across more than one approved lane.

Continuous lane refresh

Continuous refresh

Add, replace, or retire providers without retraining operators on a new interface or breaking the evidence trail.
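
The policy-insulation point, continuity when one provider becomes unavailable or restricted, amounts to trying approved lanes in order. The sketch below is one way to express that, under stated assumptions: the `LaneUnavailable` exception and the `run_with_fallback` helper are hypothetical names, and each lane is modelled as a plain callable rather than a real provider client.

```python
from typing import Callable, Sequence


class LaneUnavailable(Exception):
    """Raised by a lane callable when its provider or region cannot serve the request."""


def run_with_fallback(
    prompt: str,
    lanes: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each approved lane in order and return (lane_name, output) from the first that answers."""
    failures = []
    for name, call in lanes:
        try:
            return name, call(prompt)
        except LaneUnavailable as exc:
            # Keep the failure so the final error explains what was tried.
            failures.append(f"{name}: {exc}")
    raise RuntimeError("no approved lane available: " + "; ".join(failures))
```

Returning the lane name together with the output keeps the answer traceable to the lane that actually produced it, so a fallback does not break the evidence trail.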

Governance Posture

Every routed prompt needs case-grade traceability

Each routed operation preserves who initiated it, why it was sent to that lane, which model answered, and how the result moved into casework.

Model-assisted work that survives audit, policy review, and disclosure

Prompts, outputs, reviewer actions, and routing reasons stay tied to the investigation record instead of disappearing into a chat transcript.

FIPS 140-3 encryption

Model version tracking

Prompt and response logging

Chain of custody support

CJIS

Controls for criminal justice information and controlled investigative work

Ready

FIPS 140-3 encryption
Role-based access control
3+ year audit retention
US-only data residency

FedRAMP

Deployment lanes aligned to federal hosting and continuous monitoring expectations

Ready

Moderate baseline (Cloudflare)
High via GovCloud partners
Continuous monitoring
Third-party assessment

Evidentiary

Traceability needed for disclosure, challenge, and court review

Ready

Complete audit trails
Model version tracking
Prompt and response logging
Chain of custody support

Sample Audit Record

{
  "operation_id": "op_7f8a9b2c",
  "timestamp": "2024-12-08T14:32:17.842Z",
  "user_id": "det_martinez",
  "organization": "metro_pd",
  "model": "llama-3.1-8b-instruct",
  "tier": "private",
  "task_type": "entity_extraction",
  "input_tokens": 847,
  "output_tokens": 156,
  "cost_usd": 0.00002,
  "sensitivity_classification": "CJI",
  "routing_reason": "CJIS data - private tier required",
  "prompt_hash": "sha256:e3b0c442...",
  "response_hash": "sha256:5d41402a...",
  "latency_ms": 183
}
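
A record of this shape could be assembled as follows. This is a minimal sketch under stated assumptions: `sha256_tag` and `build_audit_record` are hypothetical helpers, the field set covers only part of the sample above, and hashing the UTF-8 text of the prompt and response is an assumption about how the hash fields are derived. Only the Python standard library is used.

```python
import hashlib
import json
from datetime import datetime, timezone


def sha256_tag(text: str) -> str:
    """Return a 'sha256:<hex>' digest of the UTF-8 encoded text."""
    return "sha256:" + hashlib.sha256(text.encode("utf-8")).hexdigest()


def build_audit_record(
    *,
    operation_id: str,
    user_id: str,
    organization: str,
    model: str,
    tier: str,
    task_type: str,
    sensitivity_classification: str,
    routing_reason: str,
    prompt: str,
    response: str,
) -> dict:
    """Assemble an audit record that stores hashes, not the raw prompt or response."""
    now = datetime.now(timezone.utc)
    return {
        "operation_id": operation_id,
        # ISO 8601 UTC timestamp with millisecond precision and a 'Z' suffix.
        "timestamp": now.isoformat(timespec="milliseconds").replace("+00:00", "Z"),
        "user_id": user_id,
        "organization": organization,
        "model": model,
        "tier": tier,
        "task_type": task_type,
        "sensitivity_classification": sensitivity_classification,
        "routing_reason": routing_reason,
        "prompt_hash": sha256_tag(prompt),
        "response_hash": sha256_tag(response),
    }


record = build_audit_record(
    operation_id="op_example",
    user_id="det_martinez",
    organization="metro_pd",
    model="llama-3.1-8b-instruct",
    tier="private",
    task_type="entity_extraction",
    sensitivity_classification="CJI",
    routing_reason="CJIS data - private tier required",
    prompt="Extract names, locations, and organizations from this arrest report.",
    response="(model output)",
)
print(json.dumps(record, indent=2))
```

Storing digests rather than raw text lets a reviewer later confirm that a retained prompt or response matches what was logged, without the audit record itself carrying case material.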

Review model routing against your policy constraints

Walk through sensitivity rules, provider choices, and review controls on your own workload instead of watching a generic AI demo.