# TDD Workflow
This document describes the Test-Driven Development (TDD) workflow for the caro project, including test watch setup, development cycles, agent collaboration, and project-specific standards.
## Table of Contents

- Quick Start
- The TDD Cycle
- Watch Management
- Agent Collaboration
- Project-Specific Standards
- Common Workflows
## Quick Start

### Prerequisites
Section titled “Prerequisites”Before starting TDD development, ensure Rust and cargo are properly configured:
```bash
# Verify cargo is in PATH
which cargo

# If not found, load Rust environment
. "$HOME/.cargo/env"
```

### Installing cargo-watch
cargo-watch provides continuous test execution on file changes:
. "$HOME/.cargo/env" && cargo install cargo-watchStarting the Test Watch
Launch cargo-watch in the background for continuous feedback:
. "$HOME/.cargo/env" && cargo watch -x testThis command:
- Watches all Rust source files
- Runs
cargo teston any change - Provides immediate feedback on test status
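For a tighter loop, cargo-watch can also clear the screen between runs and chain commands; two optional variations (flags from cargo-watch itself, adjust to taste):

```bash
# Clear the screen and re-run a single suite on each change
cargo watch -c -x "test --test cache_contract"

# Type-check first for fast feedback, then run the full test suite
cargo watch -x check -x test
```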
### Checking Running Watches

To see all background shells (including test watchers):

```
/bashes
```

Or use the BashOutput tool with the shell ID to inspect specific output.
## The TDD Cycle

caro follows strict Red-Green-Refactor methodology aligned with spec-driven development.
### Phase 1: RED - Write Failing Contract Tests

Goal: Express desired behavior through a failing test before any implementation.
- Review the specification from `specs/[feature-id]/spec.md`
- Identify the contract from `specs/[feature-id]/contracts/`
- Write the test in the appropriate `tests/` subdirectory:
  - `tests/contract/` - Module public API tests
  - `tests/integration/` - Cross-module workflow tests
  - `tests/property/` - Property-based invariant tests
Example Contract Test (from Feature 003):

```rust
#[tokio::test]
async fn test_cache_manager_retrieves_model() {
    let cache = CacheManager::new().expect("cache creation failed");
    let result = cache.get_model("test-model").await;
    assert!(result.is_ok());
}
```

- Verify the test fails by checking cargo-watch output
- Confirm failure reason is correct (not a compilation error)
### Phase 2: GREEN - Implement Minimal Code

Goal: Make the test pass with the simplest possible implementation.
- Review the failing test output from cargo-watch
- Identify the minimal change needed
- Implement only what’s required to pass the test
- Observe cargo-watch turn green
- Verify all tests still pass (no regressions)
Example Implementation:

```rust
impl CacheManager {
    pub async fn get_model(&self, model_id: &str) -> Result<PathBuf, CacheError> {
        // Minimal implementation to pass the test
        if self.is_cached(model_id) {
            // Return the cached path (cached_path() is an illustrative helper)
            Ok(self.cached_path(model_id))
        } else {
            self.download_model(model_id).await
        }
    }
}
```

### Phase 3: REFACTOR - Improve Code Quality
Goal: Enhance code structure, readability, and performance while keeping tests green.
- Identify improvement opportunities:
  - Extract duplicated code
  - Simplify complex logic
  - Improve naming
  - Add documentation
  - Optimize performance
- Make incremental changes while watching tests
- Ensure tests stay green after each refactor
- Document public APIs with rustdoc comments
Refactoring Principles:
- Never refactor on red (always start with green tests)
- Make one change at a time
- Run tests after each change
- Prefer clarity over cleverness
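A minimal sketch of these principles in action, assuming `CacheManager` stores its cache location in a `root: PathBuf` field (an assumption for illustration): path construction that was duplicated across methods moves into one named helper, and the public API, and therefore the tests, stay unchanged:

```rust
use std::path::PathBuf;

impl CacheManager {
    /// Illustrative refactor: centralize path construction previously
    /// duplicated in get_model() and is_cached(). `root` is an assumed field.
    fn model_dir(&self, model_id: &str) -> PathBuf {
        self.root.join("models").join(model_id)
    }
}
```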
## Watch Management

### Inspecting Test Output

Using BashOutput tool (when shell ID is known):

```
BashOutput with bash_id: 6d7f77
```

Using /bashes command (to see all shells):

```
/bashes
```

Output shows:
- Shell ID and command
- Current status (running/failed)
- Recent stdout/stderr
- Compilation errors
- Test failures with file:line references
### Understanding Watch Output

Green (All tests passing):

```
[Running 'cargo test']
[Finished running. Exit status: 0]
```

Red (Tests failing):

```
error[E0599]: no function or associated item named `new` found for struct `CacheManager`
  --> tests/cache_contract.rs:10:26
   |
10 |     let cache = CacheManager::new().expect("cache creation failed");
   |                               ^^^ function or associated item not found
```

Warnings (should be addressed but don’t block tests):

```
warning: unused variable: `manifest`
  --> src/cache/manifest.rs:161:17
```

### Filtering Test Output
Run specific test suites:

```bash
# Contract tests only
cargo test --test cache_contract

# Integration tests only
cargo test --test integration

# Specific test by name
cargo test test_cache_manager_retrieves_model
```

### Stopping the Watch
To stop cargo-watch:

- Find the shell ID: `/bashes`
- Kill it: `KillShell` with the shell_id

Or manually:

```bash
kill $(pgrep -f "cargo watch")
```

## Agent Collaboration
caro uses specialized agents for TDD workflows. Choose the right agent based on your development phase.
### tdd-rust-watcher Agent

When to use:
- Active TDD development session
- Working through Red-Green-Refactor cycles
- Debugging failing tests
- Need real-time test feedback
Key behaviors:
- Maintains continuous test watch
- Guides through strict Red→Green→Refactor
- Provides minimal, incremental fixes
- Never runs ad-hoc `cargo test` commands
Example invocation:
User: "I need to implement cache validation"Assistant: Uses tdd-rust-watcher agent to guide through:1. Write failing test2. Observe red output3. Implement minimal fix4. Verify green5. Optional refactortdd-rust-engineer Agent
When to use:
- Designing new features from scratch
- Implementing complete modules
- Need broader architectural guidance
- Starting new test suites
Key behaviors:
- Emphasizes contract-first design
- Focuses on library-first architecture
- Ensures comprehensive test coverage
- Applies Rust best practices
Example invocation:
User: "Implement the logging module"Assistant: Uses tdd-rust-engineer agent to:1. Review logging contract (specs/003/contracts/logging-api.md)2. Design module structure3. Write contract tests4. Implement with TDD cyclesAgent Coordination
For complex features, agents may work together:
- Planning phase: `spec-driven-dev-guide` creates specification
- Architecture phase: `rust-cli-architect` designs module structure
- Implementation phase: `tdd-rust-engineer` writes contracts
- Development phase: `tdd-rust-watcher` guides Red-Green-Refactor
- Quality phase: `qa-testing-expert` validates coverage
## Project-Specific Standards

### Contract-First Testing

Based on caro’s spec-driven development (see `specs/003-implement-core-infrastructure/plan.md`):
- Specifications drive contracts: Each feature has a spec in `specs/[feature-id]/`
- Contracts define APIs: Public APIs documented in `specs/[feature-id]/contracts/`
- Tests express contracts: Contract tests in `tests/contract/` validate APIs
- Implementation satisfies tests: Code in `src/` makes contract tests pass
Test Organization:

```
tests/
├── contract/     # Module public API tests (must align with contracts/)
├── integration/  # Cross-module scenarios (from quickstart.md)
└── property/     # Invariant validation (proptest-based)
```
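As a sketch of the property tier, a minimal proptest example; `contains_dangerous_pattern` is a hypothetical helper standing in for whatever invariant the module actually exposes:

```rust
use proptest::prelude::*;

proptest! {
    /// Illustrative invariant: pattern checks are deterministic -
    /// the same command text always yields the same verdict.
    #[test]
    fn pattern_check_is_deterministic(cmd in ".*") {
        // contains_dangerous_pattern is hypothetical; substitute the real API
        let first = contains_dangerous_pattern(&cmd);
        let second = contains_dangerous_pattern(&cmd);
        prop_assert_eq!(first, second);
    }
}
```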
### Library-First Architecture

All modules must be:
- Exported via lib.rs: Public API accessible programmatically
- Self-contained: Clear single responsibility
- Reusable: Usable beyond CLI context
- Testable: Contract tests validate public API
Example lib.rs structure:
```rust
pub mod cache;
pub mod config;
pub mod execution;
pub mod logging;
pub mod models;
pub mod safety;
```
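Because everything is exported through lib.rs, the modules can be driven from any Rust program, not just the CLI. A hedged sketch, assuming the crate name is `caro` and the signatures match the contract tests above:

```rust
use caro::cache::CacheManager;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Drive the cache module programmatically, outside the CLI
    let cache = CacheManager::new()?;
    let path = cache.get_model("test-model").await?;
    println!("model available at {}", path.display());
    Ok(())
}
```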
### Performance Validation

Tests must validate NFR requirements:
- Startup time: CLI initialization < 100ms
- Validation time: Safety checks < 50ms per command
- Inference time: MLX backend < 2s on M1 Mac
- Cache operations: Model retrieval < 5s for <1GB models
Example performance test:
```rust
use std::time::{Duration, Instant};

#[tokio::test]
async fn test_safety_validation_performance() {
    let validator = SafetyValidator::new();
    let start = Instant::now();

    let result = validator.validate("ls -la").await;

    assert!(start.elapsed() < Duration::from_millis(50));
    assert!(result.is_ok());
}
```

### Cross-Platform Testing
Support required for:
- Primary: macOS, Linux
- Secondary: Windows
Tests should:
- Use `PathBuf` for cross-platform paths
- Detect shell type (`ShellType::Bash` vs `ShellType::Cmd`)
- Handle platform-specific dangerous patterns
- Use `tempfile` for filesystem tests
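A minimal sketch combining the `PathBuf` and `tempfile` practices (test name and directory layout are illustrative):

```rust
use std::path::PathBuf;

#[test]
fn test_model_paths_are_platform_neutral() {
    // tempfile creates an isolated directory that is cleaned up on drop
    let dir = tempfile::tempdir().expect("tempdir creation failed");

    // PathBuf::join picks the correct separator on every platform
    let model_path: PathBuf = dir.path().join("models").join("test-model");
    std::fs::create_dir_all(&model_path).expect("create model dir");

    assert!(model_path.starts_with(dir.path()));
}
```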
### Error Handling Standards

- No panics in production: Use `Result` types
- User-friendly errors: Clear messages with actionable context
- Error chains: Use `thiserror` for library errors, `anyhow` for the binary
- Graceful degradation: Continue with defaults on non-critical failures
Example error definition:
```rust
#[derive(Debug, thiserror::Error)]
pub enum CacheError {
    #[error("Failed to download model: {0}")]
    DownloadFailed(String),

    #[error("Model not found: {0}")]
    ModelNotFound(String),

    #[error("I/O error: {0}")]
    IoError(#[from] std::io::Error),
}
```
## Common Workflows

### Workflow 1: Fixing Compilation Errors

Scenario: cargo-watch shows compilation errors
- Read the error message from watch output
- Identify the issue:
  - Missing import?
  - Type mismatch?
  - Function signature change?
- Make minimal fix to resolve compilation
- Verify tests run (even if failing)
- Address test failures using TDD cycle
Example:
```
error[E0061]: this function takes 1 argument but 0 arguments were supplied
   --> src/execution/shell.rs:122:21
    |
122 |     let shell = ShellDetector::detect();
    |                 ^^^^^^^^^^^^^^^^^^^^^-- argument #1 of type `&self` is missing
```

Fix: Change to instance method call:

```rust
let detector = ShellDetector::new();
let shell = detector.detect();
```

### Workflow 2: Addressing Failing Contract Tests
Scenario: Contract test fails after GREEN phase implementation
- Review test expectations in `tests/contract/`
- Check implementation in corresponding `src/` module
- Identify mismatch:
  - Wrong return type?
  - Missing validation?
  - Incorrect error handling?
- Update implementation to satisfy contract
- Verify all contract tests pass
Example from Feature 003:
```rust
// Contract expects builder pattern
#[test]
fn test_config_builder() {
    let config = LogConfig::builder()
        .level(LogLevel::Debug)
        .output(LogOutput::Stdout)
        .build();
    assert_eq!(config.level, LogLevel::Debug);
}

// Implementation must provide builder
impl LogConfig {
    pub fn builder() -> LogConfigBuilder {
        LogConfigBuilder::default()
    }
}
```

### Workflow 3: Running Specific Test Suites
Scenario: Need to focus on a particular module or feature
Contract tests only:
```bash
cargo test --test cache_contract
cargo test --test config_contract
cargo test --test execution_contract
cargo test --test logging_contract
```

Integration tests only:

```bash
cargo test --test infrastructure_integration
```

Property tests only:

```bash
cargo test --test property_tests
```

Specific test by name:

```bash
cargo test test_cache_manager_retrieves_model -- --show-output
```

All tests with verbose output:

```bash
cargo test -- --nocapture --test-threads=1
```

### Workflow 4: Integration Test Scenarios
Scenario: Validate cross-module workflows from quickstart.md
- Review scenario in `specs/[feature-id]/quickstart.md`
- Write integration test in `tests/integration/`
- Use multiple modules together (cache + config + logging)
- Assert end-to-end behavior
- Verify performance requirements
Example integration test:
```rust
#[tokio::test]
async fn test_first_time_user_experience() {
    // Scenario 1 from quickstart.md
    let config = ConfigManager::new().expect("config init");
    let cache = CacheManager::new().expect("cache init");
    let context = ExecutionContext::capture().expect("context capture");

    // User runs caro for the first time
    assert!(!config.file_exists());
    assert_eq!(cache.stats().total_models, 0);
    assert!(context.shell_type != ShellType::Unknown);
}
```

### Workflow 5: Performance Validation
Scenario: Ensure NFR requirements are met
- Identify performance target from spec
- Add timing instrumentation to test
- Run test multiple times for consistency
- Profile if needed with `cargo flamegraph` or `perf`
- Optimize hot paths if below target
Example:
```rust
use std::time::{Duration, Instant};

#[tokio::test]
async fn test_cli_startup_performance() {
    let iterations = 10;
    let mut timings = Vec::new();

    for _ in 0..iterations {
        let start = Instant::now();
        let _app = CliApp::new().expect("cli init");
        timings.push(start.elapsed());
    }

    let avg = timings.iter().sum::<Duration>() / iterations as u32;
    assert!(avg < Duration::from_millis(100), "Startup too slow: {:?}", avg);
}
```

## Additional Resources
- Project guidance: See `CLAUDE.md` for architecture and development commands
- Repository guidelines: See `AGENTS.md` for coding standards and commit conventions
- Feature specifications: See `specs/[feature-id]/spec.md` for requirements
- API contracts: See `specs/[feature-id]/contracts/` for expected behaviors
- Agent documentation: See `.claude/agents/` for specialized agent guidance
## Quick Reference

Start watch:

```bash
. "$HOME/.cargo/env" && cargo watch -x test
```

Check watch status:

```
/bashes
```

Run specific tests:

```bash
cargo test --test cache_contract
cargo test test_specific_function -- --show-output
```

Format and lint:

```bash
cargo fmt --all
cargo clippy -- -D warnings
```

Full validation (before PR):

```bash
cargo test && cargo fmt --check && cargo clippy -- -D warnings
```

Last Updated: 2025-10-03
Related: CLAUDE.md, AGENTS.md, specs/002-implement-tdd-green/spec.md, .claude/agents/tdd-rust-watcher.md