
Testing Agentic Units (AUs)

Version: 1.0 Status: AU Testing Guide Last Updated: 2025-11-17

This document provides comprehensive testing guidelines for Agentic Units in the AGEniX ecosystem.


Table of Contents

  1. Overview
  2. Test Strategy
  3. Unit Testing
  4. Integration Testing
  5. Contract Testing
  6. Security Testing
  7. Performance Testing
  8. Test Organization
  9. CI/CD Integration
  10. Testing Checklist

1. Overview

Why Test AUs?

AUs execute in production worker environments with:

  • Untrusted user inputs
  • Zero-trust security model
  • Limited resources
  • Strict contracts

Comprehensive testing ensures:

  • Contract compliance (--describe, stdin/stdout, JSON)
  • Security (no command injection, path traversal)
  • Reliability (handles malformed input gracefully)
  • Performance (bounded memory, reasonable speed)

Testing Philosophy

  • Test-Driven Development (TDD): Write tests first
  • 100% Contract Coverage: All required behaviors tested
  • Security-First: Test attack scenarios explicitly
  • Realistic Inputs: Use real-world examples

2. Test Strategy

2.1 Test Pyramid

     ┌──────────────┐
     │  E2E Tests   │  (Few)
     │  (via AGX)   │
     └──────────────┘
   ┌──────────────────┐
   │   Integration    │  (Some)
   │      Tests       │
   └──────────────────┘
  ┌──────────────────────┐
  │    Contract Tests    │  (Many)
  │      (--describe,    │
  │      stdin/stdout)   │
  └──────────────────────┘
 ┌──────────────────────────┐
 │        Unit Tests        │  (Most)
 │       (Core logic)       │
 └──────────────────────────┘

2.2 Test Categories

| Test Type   | Purpose                       | Count | Duration     |
|-------------|-------------------------------|-------|--------------|
| Unit        | Test individual functions     | 100s  | Milliseconds |
| Contract    | Verify AU contract compliance | 10s   | Seconds      |
| Integration | Test with real inputs         | 10s   | Seconds      |
| Security    | Test attack vectors           | 10s   | Seconds      |
| Performance | Benchmark speed/memory        | Few   | Minutes      |
| E2E         | Test via AGX planner          | Few   | Minutes      |

3. Unit Testing

3.1 Core Logic Tests

Test pure functions independently:

// src/ocr.rs
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_parse_regions() {
        let engine_output = vec![
            EngineRegion {
                text: "Hello".to_string(),
                confidence: 0.98,
                bbox: [10.0, 20.0, 100.0, 50.0],
            }
        ];

        let regions = convert_to_au_regions(&engine_output);

        assert_eq!(regions.len(), 1);
        assert_eq!(regions[0].text, "Hello");
        assert_eq!(regions[0].confidence, 0.98);
    }

    #[test]
    fn test_combine_text() {
        let regions = vec![
            OcrRegion { text: "Hello".into(), confidence: 0.98, bbox: [...] },
            OcrRegion { text: "World".into(), confidence: 0.95, bbox: [...] },
        ];

        let combined = combine_region_text(&regions);
        assert_eq!(combined, "Hello World");
    }
}

3.2 Model Configuration Tests

// src/model.rs
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_model_config_from_path() {
        let config = ModelConfig::from_cli(Some(PathBuf::from("/models/test.gguf")));
        assert!(config.is_ok());
    }

    #[test]
    fn test_model_config_requires_path() {
        let config = ModelConfig::from_cli(None);
        assert!(config.is_err());
        assert!(config.unwrap_err().to_string().contains("MODEL_PATH"));
    }

    #[test]
    fn test_model_config_rejects_nonexistent() {
        let config = ModelConfig::from_cli(Some(PathBuf::from("/nonexistent.gguf")));
        assert!(config.is_err());
    }
}

4. Integration Testing

4.1 End-to-End Binary Execution

Test the AU as a black box:

// tests/integration_test.rs
use std::io::Write;
use std::process::{Command, Stdio};

#[test]
fn test_ocr_basic_png() {
    // Load test image
    let image_bytes = include_bytes!("fixtures/hello_world.png");

    // Run AU
    let mut child = Command::new("cargo")
        .args(&["run", "--", "--model-path", "/models/test.gguf"])
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .stderr(Stdio::piped())
        .spawn()
        .unwrap();

    // Write image to stdin; take() drops the handle so the child sees EOF
    child.stdin.take().unwrap().write_all(image_bytes).unwrap();

    // Get output
    let output = child.wait_with_output().unwrap();

    // Verify success
    assert!(output.status.success());

    // Parse JSON output
    let result: serde_json::Value = serde_json::from_slice(&output.stdout).unwrap();
    assert_eq!(result["text"], "Hello World");
}

4.2 Testing with Real Files

#[test]
fn test_ocr_invoice() {
    let image = std::fs::read("tests/fixtures/invoice.png").unwrap();

    let mut child = Command::new("cargo")
        .args(&["run", "--", "--model-path", "/models/ocr.gguf"])
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .spawn()
        .unwrap();

    child.stdin.take().unwrap().write_all(&image).unwrap();

    let output = child.wait_with_output().unwrap();
    let result: serde_json::Value = serde_json::from_slice(&output.stdout).unwrap();

    // Verify invoice fields extracted
    assert!(result["text"].as_str().unwrap().contains("INVOICE"));
    assert!(result["text"].as_str().unwrap().contains("$"));
}

4.3 Test Fixtures

Create tests/fixtures/ directory with sample inputs:

tests/
├── fixtures/
│   ├── hello_world.png   # Simple text
│   ├── multi_line.png    # Multiple lines
│   ├── rotated.png       # Rotated text
│   ├── low_quality.jpg   # Low resolution
│   ├── corrupt.png       # Malformed PNG
│   └── empty.png         # Blank image
└── integration_test.rs
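Some fixtures can be generated on the fly rather than checked in. A sketch using only coreutils; the corrupt.png recipe here (a PNG signature followed by junk bytes) is an assumption for illustration, not the project's actual fixture:

```shell
mkdir -p tests/fixtures

# PNG signature (octal escapes for portability) followed by random
# junk: looks like a PNG just long enough to exercise the
# malformed-input error path.
printf '\211PNG\r\n\032\n' > tests/fixtures/corrupt.png
head -c 64 /dev/urandom >> tests/fixtures/corrupt.png
```

Generated fixtures keep the repository small, but anything with an exact expected output (like hello_world.png) should stay checked in so results are reproducible.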

5. Contract Testing

5.1 --describe Output

#[test]
fn test_describe_output_valid_json() {
    let output = Command::new("cargo")
        .args(&["run", "--", "--describe"])
        .output()
        .unwrap();

    assert!(output.status.success());

    // Must be valid JSON
    let card: serde_json::Value = serde_json::from_slice(&output.stdout)
        .expect("--describe output must be valid JSON");

    // Required fields
    assert!(card.get("name").is_some());
    assert!(card.get("version").is_some());
    assert!(card.get("description").is_some());
    assert!(card.get("capabilities").is_some());
}

5.2 Schema Validation

use jsonschema::JSONSchema;

#[test]
fn test_describe_matches_schema() {
    let output = Command::new("cargo")
        .args(&["run", "--", "--describe"])
        .output()
        .unwrap();

    let card: serde_json::Value = serde_json::from_slice(&output.stdout).unwrap();

    // Load schema
    let schema_json = std::fs::read_to_string("../../agenix/specs/describe.schema.json").unwrap();
    let schema: serde_json::Value = serde_json::from_str(&schema_json).unwrap();

    let compiled = JSONSchema::compile(&schema).unwrap();

    // Validate
    if let Err(errors) = compiled.validate(&card) {
        for error in errors {
            eprintln!("Validation error: {}", error);
        }
        panic!("--describe output does not match schema");
    }
}

5.3 stdin/stdout Contract

#[test]
fn test_stdout_is_valid_json() {
    let image = include_bytes!("fixtures/hello_world.png");

    let output = run_au(image);

    // stdout must be parseable JSON
    let json: serde_json::Value = serde_json::from_slice(&output.stdout)
        .expect("stdout must be valid JSON");

    // Must have expected fields
    assert!(json.get("text").is_some());
    assert!(json.get("regions").is_some());
}

#[test]
fn test_errors_go_to_stderr() {
    let corrupt_image = b"not an image";

    let output = run_au(corrupt_image);

    // Should fail
    assert!(!output.status.success());

    // stdout should be empty or invalid JSON
    assert!(serde_json::from_slice::<serde_json::Value>(&output.stdout).is_err());

    // stderr should contain error message
    let stderr = String::from_utf8_lossy(&output.stderr);
    assert!(stderr.contains("Error") || stderr.contains("error"));
}
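These tests call a run_au helper that is not shown. A minimal sketch, assuming the release binary lives at ./target/release/agx-ocr and a test model at /models/test.gguf (both placeholders for your build):

```rust
use std::io::Write;
use std::process::{Command, Output, Stdio};

/// Spawn a binary, feed `input` on stdin, and collect its exit status,
/// stdout, and stderr.
fn run_binary(bin: &str, args: &[&str], input: &[u8]) -> Output {
    let mut child = Command::new(bin)
        .args(args)
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .stderr(Stdio::piped())
        .spawn()
        .expect("failed to spawn binary");

    // take() moves the stdin handle out so it is dropped (closed) once
    // the write finishes; otherwise the child could block awaiting EOF.
    child
        .stdin
        .take()
        .expect("stdin was piped")
        .write_all(input)
        .expect("failed to write stdin");

    child.wait_with_output().expect("failed to wait for child")
}

/// Hypothetical paths: adjust to wherever your build and model live.
fn run_au(input: &[u8]) -> Output {
    run_binary(
        "./target/release/agx-ocr",
        &["--model-path", "/models/test.gguf"],
        input,
    )
}
```

Running the release binary directly (instead of `cargo run`) keeps contract tests fast and avoids cargo's own output interleaving with the AU's.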

5.4 Exit Codes

#[test]
fn test_exit_code_success() {
    let output = run_au(include_bytes!("fixtures/hello_world.png"));
    assert_eq!(output.status.code(), Some(0));
}

#[test]
fn test_exit_code_invalid_input() {
    let output = run_au(b"not an image");
    assert_eq!(output.status.code(), Some(2)); // Invalid input
}

#[test]
fn test_exit_code_missing_model() {
    let output = Command::new("cargo")
        .args(&["run"]) // No --model-path
        .stdin(Stdio::piped())
        .output()
        .unwrap();

    assert_ne!(output.status.code(), Some(0));
}
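On the AU side, the 0/1/2 convention is easiest to keep correct if every exit path flows through one function. A sketch, with a hypothetical AuError type (not from the source):

```rust
/// Hypothetical error taxonomy mirroring the convention tested above:
/// exit 0 = success, 1 = internal failure, 2 = invalid input.
#[derive(Debug)]
enum AuError {
    InvalidInput(String),
    Internal(String),
}

/// Emit the result per the AU contract (JSON on stdout, errors on
/// stderr only) and return the exit code for main to pass on.
fn finish(result: Result<String, AuError>) -> i32 {
    match result {
        Ok(json) => {
            println!("{}", json);
            0
        }
        Err(AuError::InvalidInput(msg)) => {
            eprintln!("Error: invalid input: {}", msg);
            2
        }
        Err(AuError::Internal(msg)) => {
            eprintln!("Error: {}", msg);
            1
        }
    }
}
```

main then becomes something like `std::process::exit(finish(run()))`, so a new error variant cannot silently bypass the contract.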

6. Security Testing

6.1 Input Validation

#[test]
fn test_rejects_oversized_input() {
    // Create 100MB input
    let huge_input = vec![0u8; 100 * 1024 * 1024];

    let output = run_au(&huge_input);

    // Should fail gracefully
    assert!(!output.status.success());

    let stderr = String::from_utf8_lossy(&output.stderr);
    assert!(stderr.contains("too large") || stderr.contains("size"));
}

#[test]
fn test_rejects_malformed_png() {
    let bad_png = b"\x89PNG\x00\x00\x00\x00"; // Invalid PNG

    let output = run_au(bad_png);

    assert!(!output.status.success());

    let stderr = String::from_utf8_lossy(&output.stderr);
    assert!(stderr.to_lowercase().contains("invalid") ||
            stderr.to_lowercase().contains("corrupt"));
}

6.2 Path Traversal

#[test]
fn test_rejects_path_traversal() {
    let output = Command::new("cargo")
        .args(&["run", "--", "--model-path", "../../../etc/passwd"])
        .stdin(Stdio::piped())
        .output()
        .unwrap();

    assert!(!output.status.success());

    let stderr = String::from_utf8_lossy(&output.stderr);
    assert!(stderr.contains("not found") || stderr.contains("traversal"));
}

#[test]
fn test_rejects_absolute_paths_outside_allowed() {
    let output = Command::new("cargo")
        .args(&["run", "--", "--model-path", "/etc/passwd"])
        .stdin(Stdio::piped())
        .output()
        .unwrap();

    assert!(!output.status.success());
}
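The defense these tests exercise can be implemented by canonicalizing the requested path and checking it stays under an allowed root. A sketch, assuming models must live in a single allowed directory; the function name is illustrative:

```rust
use std::path::{Path, PathBuf};

/// Resolve `requested` relative to `allowed_root` and reject anything
/// that escapes it. canonicalize() follows symlinks and collapses
/// `..`, so `../../etc/passwd` either fails to resolve or resolves
/// outside the root and is refused.
fn resolve_model_path(allowed_root: &Path, requested: &Path) -> Result<PathBuf, String> {
    let root = allowed_root
        .canonicalize()
        .map_err(|e| format!("bad model root: {}", e))?;
    let resolved = root
        .join(requested)
        .canonicalize()
        .map_err(|_| "model not found".to_string())?;
    if !resolved.starts_with(&root) {
        return Err("path escapes allowed model directory".to_string());
    }
    Ok(resolved)
}
```

Note the prefix check runs on the canonicalized path, not the raw string: string-level checks for `..` are easy to bypass with symlinks.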

6.3 Command Injection

#[test]
fn test_no_shell_execution() {
    // Attempt command injection via model path
    let output = Command::new("cargo")
        .args(&["run", "--", "--model-path", "model.gguf; rm -rf /"])
        .stdin(Stdio::piped())
        .output()
        .unwrap();

    // Should fail (model not found), NOT execute a shell command
    assert!(!output.status.success());

    // Either stderr never mentions "rm", or it appears only because the
    // "not found" message quotes the bogus filename verbatim
    let stderr = String::from_utf8_lossy(&output.stderr);
    assert!(!stderr.contains("rm") || stderr.contains("not found"));
}

7. Performance Testing

7.1 Benchmarks

use criterion::{black_box, criterion_group, criterion_main, Criterion};

fn bench_ocr(c: &mut Criterion) {
    let image = include_bytes!("../tests/fixtures/hello_world.png");
    let config = ModelConfig::from_cli(Some(PathBuf::from("/models/test.gguf"))).unwrap();

    c.bench_function("ocr_hello_world", |b| {
        b.iter(|| run_ocr(black_box(image), &config, None));
    });
}

criterion_group!(benches, bench_ocr);
criterion_main!(benches);

7.2 Memory Profiling

# Run with memory profiling
cargo build --release
valgrind --tool=massif --massif-out-file=massif.out \
./target/release/agx-ocr --model-path /models/test.gguf < test.png

# Analyze memory usage
ms_print massif.out

7.3 Load Testing

#[test]
#[ignore] // Long-running test
fn test_1000_images_no_leak() {
    let image = include_bytes!("fixtures/hello_world.png");

    for i in 0..1000 {
        let output = run_au(image);
        assert!(output.status.success(), "Failed on iteration {}", i);
    }

    // If we get here, no memory leak or panic
}

8. Test Organization

8.1 Directory Structure

agx-ocr/
├── src/
│   ├── main.rs
│   ├── ocr.rs
│   │   └── #[cfg(test)] mod tests { ... }
│   ├── model.rs
│   │   └── #[cfg(test)] mod tests { ... }
│   └── types.rs
├── tests/
│   ├── fixtures/
│   │   ├── hello_world.png
│   │   ├── invoice.png
│   │   └── corrupt.png
│   ├── integration_test.rs
│   ├── contract_test.rs
│   └── security_test.rs
└── benches/
    └── ocr_benchmark.rs

8.2 Test Naming

// Unit tests: test_<function>_<scenario>
#[test]
fn test_parse_regions_empty() { }

#[test]
fn test_parse_regions_multiple() { }

// Integration tests: test_<feature>_<case>
#[test]
fn test_ocr_hello_world() { }

#[test]
fn test_ocr_rotated_text() { }

// Security tests: test_<attack>_<defense>
#[test]
fn test_command_injection_rejected() { }

#[test]
fn test_path_traversal_blocked() { }

9. CI/CD Integration

9.1 GitHub Actions

name: Test AU

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Install Rust
        uses: actions-rs/toolchain@v1
        with:
          toolchain: stable

      - name: Run unit tests
        run: cargo test --lib

      - name: Run integration tests
        run: cargo test --test '*'

      - name: Run contract tests
        run: |
          cargo build --release
          ./tests/validate_contract.sh

      - name: Check --describe schema
        run: |
          cargo run -- --describe > describe.json
          jsonschema -i describe.json ../agenix/specs/describe.schema.json

      - name: Security audit
        run: cargo audit

      - name: Coverage
        run: |
          cargo install cargo-tarpaulin
          cargo tarpaulin --fail-under 80

9.2 Contract Validation Script

#!/bin/bash
# tests/validate_contract.sh

set -e

AU_BIN="./target/release/agx-ocr"

echo "Testing --describe..."
$AU_BIN --describe > /tmp/describe.json
jq empty /tmp/describe.json || (echo "Invalid JSON" && exit 1)

echo "Testing stdin/stdout..."
cat tests/fixtures/hello_world.png | $AU_BIN --model-path /models/test.gguf > /tmp/out.json
jq empty /tmp/out.json || (echo "Invalid output JSON" && exit 1)

echo "Testing exit codes..."
echo "invalid" | $AU_BIN --model-path /models/test.gguf > /dev/null 2>&1 || [ $? -eq 2 ]

echo "All contract tests passed!"

10. Testing Checklist

Before merging AU code, verify:

Core Contract

  • --describe outputs valid JSON matching schema
  • stdin → JSON stdout works for valid input
  • Errors go to stderr only
  • Exit codes correct (0, 1, 2)
  • Binary stdin handled correctly

Unit Tests

  • All core functions have unit tests
  • Edge cases tested (empty, null, max values)
  • Error paths tested

Integration Tests

  • At least 3 real-world test fixtures
  • Happy path: valid input → expected output
  • Sad path: invalid input → graceful error

Security Tests

  • Oversized input rejected
  • Malformed input handled gracefully
  • Path traversal blocked
  • No command injection vulnerabilities
  • No secrets in logs

Performance Tests

  • Typical input completes in < 10s
  • Memory usage bounded
  • No memory leaks (1000 iterations)

CI/CD

  • Tests run automatically on push
  • Contract validation script passes
  • Coverage > 80%
  • Security audit clean


Maintained by: AGX Core Team
Review cycle: Quarterly
Questions? Open an issue in the AU repository