# Real-World Agent Lane: m1nd-3
Round: `real-world-v2-20260513T231822Z`
Arm: `m1nd_available`
Use m1nd first for orientation, localization, impact, connected context, docs/code binding, and risky change prep. If m1nd is blocked or stale, record the recovery path and verify final truth with files, tests, and git.
Do not guess the benchmark hypothesis. Work as if this is a normal coding task.
Keep public claims out of the result. Record missing proof instead of smoothing it away.
Do not commit, publish, or push fixture repo changes.
## Fixture Repositories
- click-python-cli (python): `.m1nd-benchmark-fixtures/real-world/click-python-cli`
- p-limit-node (typescript): `.m1nd-benchmark-fixtures/real-world/p-limit-node`
- human-panic-rust-cli (rust): `.m1nd-benchmark-fixtures/real-world/human-panic-rust-cli`
## Isolated Lane Workspaces
Use your isolated workspace paths for patch tasks. Do not edit shared fixture repos.
- click-python-cli: `.m1nd-benchmark-fixtures/real-world-lanes/real-world-v2-20260513T231822Z/m1nd-3/click-python-cli`
- p-limit-node: `.m1nd-benchmark-fixtures/real-world-lanes/real-world-v2-20260513T231822Z/m1nd-3/p-limit-node`
- human-panic-rust-cli: `.m1nd-benchmark-fixtures/real-world-lanes/real-world-v2-20260513T231822Z/m1nd-3/human-panic-rust-cli`
If a fixture is missing, clone it from the URL in `round.json` or mark the affected task as invalidated.
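A quick pre-flight check can surface missing workspaces before any task starts. The sketch below only tests whether the three lane workspace directories listed above exist; the helper names `lane_workspace` and `missing_workspaces` are hypothetical, and it deliberately does not read `round.json`, whose schema is not described here.

```python
from pathlib import Path

ROUND = "real-world-v2-20260513T231822Z"
LANE = "m1nd-3"
FIXTURES = ["click-python-cli", "p-limit-node", "human-panic-rust-cli"]


def lane_workspace(root: Path, fixture: str) -> Path:
    """Build the isolated lane workspace path for one fixture (hypothetical helper)."""
    return (
        root / ".m1nd-benchmark-fixtures" / "real-world-lanes" / ROUND / LANE / fixture
    )


def missing_workspaces(root: Path) -> list[str]:
    """Return the fixtures whose lane workspace directory does not exist under root."""
    return [name for name in FIXTURES if not lane_workspace(root, name).is_dir()]


# Report anything that needs a clone or an invalidation note.
for name in missing_workspaces(Path.cwd()):
    print(f"missing workspace: {name}")
```

Any fixture this reports is a candidate for the clone-or-invalidate decision; the check itself changes nothing on disk.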
## Task Battery
- repo_architecture_audit on `click-python-cli`: Explain the repo architecture, main modules, entrypoints, data/control flow, and top risks. Fixed payload: `{"focus": "Audit Click's public export layer, decorators, command core, parser, and testing harness.", "must_cover": ["public API re-exports", "command and group invocation path", "parameter/type conversion", "test runner IO isolation"]}` Expected evidence: main entrypoints named, module boundaries named, at least two real file references, risk list separates proven facts from hypotheses.
- feature_location on `p-limit-node`: Find where a named feature or public behavior is implemented and identify the tests that protect it. Fixed payload: `{"feature": "The rejectOnClear and clearQueue behavior for pending tasks.", "must_find": ["runtime implementation", "type definition", "test coverage", "README/API docs"]}` Expected evidence: implementation file named, test file named or missing test stated, false-positive files avoided.
- flow_explanation on `human-panic-rust-cli`: Explain a realistic request/command/API flow from public entrypoint to internal behavior. Fixed payload: `{"flow": "Explain what happens when setup_panic!() is installed and a release-mode panic occurs.", "must_cover": ["public macro or setup entrypoint", "panic hook behavior", "report writing path", "observable user-facing output"]}` Expected evidence: entrypoint named, intermediate calls named, observable output or side effect named.
- bug_symptom_triage on `click-python-cli`: Given a realistic symptom, isolate the most likely fault boundary and name the next verification step. Fixed payload: `{"must_answer": ["most likely fault boundary", "why it is not a parser/runtime invocation issue", "next focused regression test"], "symptom": "A callable instance used as a custom Click option type crashes during command construction with AttributeError because the object has no __name__ attribute."}` Expected evidence: most likely fault file or function named, alternative theory preserved or rejected, next command/test/file named.
- safe_change_plan on `p-limit-node`: Plan a small behavior change, including blast radius, files to edit, and proof gates. Fixed payload: `{"change_request": "Plan a backwards-compatible change so clearQueue() returns the number of pending tasks it discarded or rejected, without touching already running tasks.", "must_cover": ["runtime edit target", "types/docs/test targets", "rejectOnClear behavior", "no change to activeCount semantics"]}` Expected evidence: edit targets named, downstream callers or tests named, risky assumptions explicit.
- small_feature_patch on `human-panic-rust-cli`: Implement a tiny feature or option consistent with local style and run focused checks. Fixed payload: `{"change_request": "Add Metadata::name(...) and Metadata::version(...) builder methods that preserve the existing non-empty string guard style.", "must_cover": ["minimal implementation", "focused unit tests", "no public panic/report behavior rewrite"]}` Expected evidence: minimal patch, test or example updated when appropriate, focused check result recorded.
- seeded_bug_fix on `click-python-cli`: Fix a seeded or clearly described bug without broad refactors. Fixed payload: `{"bug": "The lane workspace contains a seeded regression test proving callable instances should work as custom option types.", "must_cover": ["root cause", "minimal fix", "seeded regression test result"], "seeded_artifact_id": "click-callable-instance-type-test-v1"}` Expected evidence: root cause named, patch is scoped, regression proof recorded.
- bounded_refactor_plan on `p-limit-node`: Prepare a bounded refactor and identify hidden coupling before any edit. Fixed payload: `{"must_cover": ["resumeNext", "next", "enqueue", "clearQueue", "concurrency setter"], "refactor_scope": "Queue scheduling and draining helpers only."}` Expected evidence: coupled files named, safe ordering proposed, rollback or proof boundary named.
- code_review_diff on `human-panic-rust-cli`: Review a supplied or seeded diff for real bugs, regressions, and missing tests. Fixed payload: `{"must_cover": ["duplicate or noisy support output when homepage and repository coexist", "missing regression test", "avoid style-only findings"], "review_focus": "Find real user-visible regressions and missing tests in the supplied diff.", "supplied_diff": "benchmark-payloads/review-diff-human-panic.patch"}` Expected evidence: findings ordered by severity, file/line references when available, style-only comments avoided.
- docs_drift_check on `click-python-cli`: Compare README/docs claims against implementation and identify drift or missing documentation. Fixed payload: `{"claim": "README/docs say Click supports lazy loading of subcommands at runtime.", "must_compare": ["README and docs/index claim", "docs/complex lazy loading pattern", "actual Group behavior"]}` Expected evidence: claim source named, code truth named, drift or no-drift conclusion justified.
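The symptom in the `bug_symptom_triage` and `seeded_bug_fix` payloads rests on a plain Python fact that can be checked without Click: functions carry `__name__`, but an instance of a class that defines `__call__` does not. The sketch below shows only that fact. The class `UpperCase` is a hypothetical example, and nothing here asserts where Click reads the attribute; that still has to be localized in the lane workspace.

```python
class UpperCase:
    """A callable instance, usable wherever a conversion function is expected."""

    def __call__(self, value: str) -> str:
        return value.upper()


def shout(value: str) -> str:
    """An ordinary function with the same behavior."""
    return value.upper()


converter = UpperCase()

# Both objects are callable and convert values the same way.
assert callable(converter) and callable(shout)
assert converter("abc") == shout("abc") == "ABC"

# Only the function exposes __name__; the instance raises AttributeError.
print(hasattr(shout, "__name__"))      # True
print(hasattr(converter, "__name__"))  # False

try:
    converter.__name__
except AttributeError as exc:
    print(f"AttributeError: {exc}")
```

A focused regression test can use the same shape: pass a callable instance as the option type and assert that command construction completes.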
## Required Result
Fill a JSON result using `lane-result-template.json`.
Append raw investigation events to `event-streams/m1nd-3.jsonl`.
Use `event_source=agent` for events you create. Keep one JSON object per line.
Scores must be integers from 0 to 4. Use `4` for excellent; do not use `5`.
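The two result rules above (one JSON object per line, integer scores from 0 to 4) can be enforced with a small helper. This is a minimal sketch: only `event_source` comes from this brief, while the `kind` and `detail` fields and the function names are assumptions, and the real field set should follow `lane-result-template.json` and the existing event stream.

```python
import json
from pathlib import Path


def append_event(stream: Path, kind: str, detail: str) -> None:
    """Append one event as a single JSON line; `kind` and `detail` are hypothetical fields."""
    event = {"event_source": "agent", "kind": kind, "detail": detail}
    stream.parent.mkdir(parents=True, exist_ok=True)
    # json.dumps escapes embedded newlines, so each event stays on one line.
    with stream.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(event) + "\n")


def validate_score(score: object) -> int:
    """Accept only integers 0..4; bool is rejected even though it subclasses int."""
    if isinstance(score, bool) or not isinstance(score, int):
        raise ValueError(f"score must be an integer, got {score!r}")
    if not 0 <= score <= 4:
        raise ValueError(f"score must be between 0 and 4, got {score}")
    return score
```

For example, `append_event(Path("event-streams/m1nd-3.jsonl"), "recovery", "m1nd index stale")` adds one line to the stream, and `validate_score(5)` raises `ValueError` instead of letting an out-of-range score into the result.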