Sweep All Tiers

Find dead code after changes -- diff-aware reverse impact analysis

When you rename a function, delete a class, or refactor a module, the old code doesn't clean itself up. Dead imports, orphaned tests, stale helpers, and unreachable exports all pass CI because nothing calls them. They accumulate until the codebase feels like an archaeological dig.

bpsai-pair sweep solves this. Given a diff (branch, commit range, or working tree), it identifies what the change made unnecessary and reports actionable findings grouped by category and confidence.

Works on Any Stack

Sweep ships with a Generic provider that works on any programming language via regex heuristics (~80% accuracy), plus a precise Python provider using the ast module (~95% accuracy). No configuration needed -- language is detected from file extensions.
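As a rough illustration of extension-based detection, the sketch below maps a file path to a provider name. The provider names come from the Language Support table; the `PRECISE_PROVIDERS` table and the `select_provider` function are hypothetical and are not taken from the tool's source.

```python
from pathlib import Path

# Hypothetical lookup table: only Python has a precise provider in this sketch.
PRECISE_PROVIDERS = {".py": "PythonProvider"}


def select_provider(path: str) -> str:
    """Pick a provider from the file extension, falling back to the generic one."""
    suffix = Path(path).suffix.lower()
    return PRECISE_PROVIDERS.get(suffix, "GenericProvider")


print(select_provider("src/billing.py"))  # PythonProvider
print(select_provider("web/app.tsx"))     # GenericProvider
```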

Quick Start

# Sweep current branch against its merge base
bpsai-pair sweep

# Sweep staged changes only
bpsai-pair sweep --staged

# Sweep against a specific ref
bpsai-pair sweep --since v2.23.0

# JSON output for CI integration
bpsai-pair sweep --json

How It Works

Sweep runs three phases on every invocation:

| Phase | What It Does |
|-------|--------------|
| 1. Diff Analysis | Parses the git diff and extracts changed, deleted, and renamed symbols at the function/class/variable level -- not just line-level changes |
| 2. Reference Scan | Searches the codebase for remaining references to old symbol names using word-boundary matching. Filters definition sites, comments, and ignored directories |
| 3. Classification | Categorizes each finding by type and confidence, with suggested actions for each |
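The first two phases can be approximated in a few lines of Python. This is a minimal sketch and not the tool's implementation: it assumes the old and new versions of a file are available as strings, compares module-level symbols with the `ast` module, and then scans other files with a word-boundary regex while skipping comment lines and definition sites. The function names `top_level_symbols`, `removed_symbols`, and `find_references` are made up for this example.

```python
import ast
import re


def top_level_symbols(source: str) -> set[str]:
    """Collect module-level function, class, and simple variable names."""
    names = set()
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            names.add(node.name)
        elif isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    names.add(target.id)
    return names


def removed_symbols(old_source: str, new_source: str) -> set[str]:
    """Symbols present before the change but missing afterwards."""
    return top_level_symbols(old_source) - top_level_symbols(new_source)


def find_references(symbol: str, files: dict[str, str]) -> list[tuple[str, int]]:
    """Return (path, line number) pairs where the symbol appears as a whole word."""
    name = re.escape(symbol)
    pattern = re.compile(rf"\b{name}\b")
    definition = re.compile(rf"^\s*(?:async\s+def|def|class)\s+{name}\b")
    hits = []
    for path, text in files.items():
        for number, line in enumerate(text.splitlines(), start=1):
            if line.lstrip().startswith("#") or definition.match(line):
                continue  # skip comment lines and definition sites
            if pattern.search(line):
                hits.append((path, number))
    return hits


old = "def calculate_total(items):\n    return sum(items)\n"
new = "def compute_line_items(items):\n    return sum(items)\n"
files = {
    "src/billing.py": "from pricing import calculate_total\n# calculate_total is gone\n",
    "src/report.py": "def recalculate_total_now():\n    pass\n",
}
for symbol in sorted(removed_symbols(old, new)):
    print(symbol, find_references(symbol, files))
# calculate_total [('src/billing.py', 1)]
```

The word-boundary match is what keeps `recalculate_total_now` from being reported as a reference to `calculate_total`.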

Finding Categories

| Category | What It Means | Suggested Action |
|----------|---------------|------------------|
| dead_import | A module imports a symbol that was deleted or renamed | Remove the import |
| orphaned_test | A test references a function or class that no longer exists | Delete or update the test |
| stale_helper | A helper function wraps or calls a deleted function | Remove the helper |
| unreachable_export | A symbol is exported in `__init__.py` or `__all__` but no longer exists | Remove from exports |
| stale_reference | A deleted symbol is mentioned in comments, docstrings, or config files | Update the reference |

Confidence Levels

Each finding is assigned a confidence level based on how certain the classification is:

  • High -- Exact match in an import statement or test function name. Safe to act on.
  • Medium -- Contextual match (function call, variable reference). Review before acting.
  • Low -- Found in a comment, string literal, or config file. May be intentional.

Command Reference

| Flag | Description | Default |
|------|-------------|---------|
| `--since <ref>` | Diff against a specific git ref (commit, tag, or branch) | Auto-detect merge base (dev or main) |
| `--staged` | Only sweep staged changes | false |
| `--working` | Sweep uncommitted working tree changes | false |
| `--json` | Structured JSON output for CI, MCP, or agent consumption | false |
| `--category <cat>` | Filter findings by category (dead_import, orphaned_test, etc.) | All categories |
| `--confidence <level>` | Minimum confidence threshold (high, medium, low) | low (show all) |
| `--fix` | Auto-remove high-confidence dead imports (safe subset only) | false |
| `--deep` | Request deep analysis from Amunet via A2A (when available) | false |
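To make the semantics of `--category` and `--confidence` concrete, here is a small sketch of how a minimum-confidence threshold and a category filter might combine. The `Finding` dataclass and `filter_findings` function are hypothetical; only the category names, confidence levels, and the "minimum threshold" behavior come from the table above.

```python
from dataclasses import dataclass

# Ordering used to compare confidence levels against a minimum threshold.
LEVELS = {"low": 0, "medium": 1, "high": 2}


@dataclass(frozen=True)
class Finding:
    category: str
    path: str
    line: int
    confidence: str


def filter_findings(findings, category=None, min_confidence="low"):
    """Keep findings in the given category at or above the confidence threshold."""
    threshold = LEVELS[min_confidence]
    return [
        f for f in findings
        if (category is None or f.category == category)
        and LEVELS[f.confidence] >= threshold
    ]


findings = [
    Finding("dead_import", "src/billing.py", 3, "high"),
    Finding("orphaned_test", "tests/test_billing.py", 45, "high"),
    Finding("stale_reference", "docs/api.md", 12, "medium"),
]
print(len(filter_findings(findings)))                             # 3
print(len(filter_findings(findings, min_confidence="high")))      # 2
print(len(filter_findings(findings, category="orphaned_test")))   # 1
```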

Exit Codes

  • 0 -- No high-confidence findings (or no findings at all)
  • 1 -- High-confidence findings exist. Use this in CI to fail builds that leave dead code behind.
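The exit-code rule reduces to a single check. The sketch below states it in Python, assuming findings are represented only by their confidence strings; `sweep_exit_code` is an illustrative name, not a function exposed by the tool.

```python
def sweep_exit_code(confidences):
    """Return 1 when any finding is high-confidence, otherwise 0."""
    return 1 if any(level == "high" for level in confidences) else 0


print(sweep_exit_code(["high", "medium"]))  # 1
print(sweep_exit_code(["medium", "low"]))   # 0
print(sweep_exit_code([]))                  # 0
```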

Language Support

Sweep auto-detects the language from file extensions and selects the best available provider:

| Language | Provider | Accuracy | Method |
|----------|----------|----------|--------|
| Python | PythonProvider | ~95% | AST module -- full function, class, import, constant extraction |
| JavaScript, TypeScript, React | GenericProvider | ~80% | Regex heuristics for function, class, import patterns |
| Go, Rust, Java, Ruby, C# | GenericProvider | ~80% | Regex heuristics for common definition patterns |

Deep Analysis with Amunet

When Amunet is registered on your A2A network, --deep sends the diff for full dependency graph analysis. Amunet traces reverse impacts through the actual import/call graph rather than grep, catching transitive dead code that local analysis misses. Results merge with local findings automatically.
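To show what "transitive dead code" means here, the sketch below walks a reverse call graph to a fixed point: a symbol is marked dead once every one of its callers is dead. This is only an illustration of the idea, not Amunet's algorithm; the `callers` mapping, the `entry_points` set, and the function name are all assumptions.

```python
def transitive_dead(callers, deleted, entry_points):
    """Return symbols that become unreachable once the deleted symbols are gone.

    callers maps each symbol to the set of symbols that call it
    (as the graph looked before the change).
    """
    dead = set(deleted)
    changed = True
    while changed:
        changed = False
        for symbol, used_by in callers.items():
            if symbol in dead or symbol in entry_points:
                continue
            # Dead only if it had callers and all of them are now dead.
            if used_by and used_by <= dead:
                dead.add(symbol)
                changed = True
    return dead - set(deleted)


callers = {
    "format_total": {"calculate_total"},
    "round_cents": {"format_total"},
    "load_items": {"calculate_total", "main"},
}
print(sorted(transitive_dead(callers, {"calculate_total"}, {"main"})))
# ['format_total', 'round_cents']
```

A plain text search for `calculate_total` would not surface `round_cents`, because it never mentions the deleted name; it only becomes dead through `format_total`.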

Post-Engage Integration

Sweep runs automatically after each bpsai-pair engage sprint. Findings appear in the PR body under a "Cleanup Opportunities" section. This is advisory only -- it does not block PR creation.

## Cleanup Opportunities

| Category | File | Line | Confidence | Suggestion |
|----------|------|------|------------|------------|
| dead_import | src/utils.py | 3 | high | Remove import of deleted `calculate_total` |
| orphaned_test | tests/test_calc.py | 45 | high | Test references deleted `calculate_total` |
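A section like the one above is straightforward to generate from a list of findings. The sketch below is a hypothetical renderer, assuming each finding is a plain tuple in column order; it is not the code `bpsai-pair` uses to build the PR body.

```python
def render_cleanup_section(rows):
    """Render findings as a Markdown section with one table row per finding."""
    lines = [
        "## Cleanup Opportunities",
        "",
        "| Category | File | Line | Confidence | Suggestion |",
        "|----------|------|------|------------|------------|",
    ]
    for category, path, line, confidence, suggestion in rows:
        lines.append(f"| {category} | {path} | {line} | {confidence} | {suggestion} |")
    return "\n".join(lines)


rows = [
    ("dead_import", "src/utils.py", 3, "high", "Remove import of deleted `calculate_total`"),
    ("orphaned_test", "tests/test_calc.py", 45, "high", "Test references deleted `calculate_total`"),
]
print(render_cleanup_section(rows))
```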

Ignoring Files

Create a .sweepignore file in your project root to exclude paths from reference scanning. It uses the same format as .gitignore:

# .sweepignore
vendor/
generated/
*.pb.go
*_generated.ts
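For intuition about how the patterns above apply, here is a deliberately simplified matcher. It handles only comments, blank lines, directory names with a trailing slash, and file-name globs; the full .gitignore format also supports negation, anchoring, and `**`, which this sketch ignores. The `load_patterns` and `is_ignored` functions are hypothetical helpers.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def load_patterns(text):
    """Drop blank lines and comments from .sweepignore content."""
    stripped = (line.strip() for line in text.splitlines())
    return [line for line in stripped if line and not line.startswith("#")]


def is_ignored(path, patterns):
    """Check a POSIX-style relative path against the simplified pattern subset."""
    parts = PurePosixPath(path).parts
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: matches when any parent directory has that name.
            name = pattern.rstrip("/")
            if any(fnmatchcase(part, name) for part in parts[:-1]):
                return True
        elif fnmatchcase(parts[-1], pattern):
            # File pattern: matched against the file name only.
            return True
    return False


patterns = load_patterns("# .sweepignore\nvendor/\ngenerated/\n*.pb.go\n*_generated.ts\n")
print(is_ignored("vendor/lib/helper.py", patterns))  # True
print(is_ignored("api/service.pb.go", patterns))     # True
print(is_ignored("src/app.ts", patterns))            # False
```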

CI Integration

Add sweep to your CI pipeline to catch dead code before it merges:

# GitHub Actions example
- name: Check for dead code
  run: bpsai-pair sweep --since origin/main --confidence high --json

The --json output includes structured findings that can be parsed by other tools or posted as PR comments.

Examples

After a refactor

# You just renamed calculate_total to compute_line_items
bpsai-pair sweep

# Found 3 dead references across 2 files (2 high, 1 medium confidence)
#
#   dead_import      src/billing.py:3          high    Remove import of `calculate_total`
#   orphaned_test    tests/test_billing.py:45  high    Test `test_calculate_total` references deleted symbol
#   stale_reference  docs/api.md:12            medium  Mentions `calculate_total` in documentation

Auto-fix dead imports

# Remove high-confidence dead imports automatically
bpsai-pair sweep --fix

# Fixed 2 dead imports in 2 files

Filter by category

# Only show orphaned tests
bpsai-pair sweep --category orphaned_test