{
"schema": "m1nd-bug-hunt-audit-result-v0",
"round_id": "bughunt-humanize-20260514T021500Z",
"lane_id": "audit-02",
"started_at": "2026-05-14T02:46:44.227Z",
"finished_at": "2026-05-14T02:53:30Z",
"repo_path": "/Users/kle1nz/m1nd/.m1nd-field-workspaces/bughunt-humanize-20260514T021500Z/audit-02/humanize",
"findings": [
"title": "naturaltime future=True is ignored for positive numeric seconds",
"severity": "high",
"file": "src/humanize/time.py",
"symbol": "naturaltime",
"cause": "The tense inference guard uses `if isinstance(value, (dt.datetime, dt.timedelta)) or future`, so passing `future=True` for numeric seconds immediately recomputes `future = date > now`. Numeric seconds are converted to `date = now - delta`, so positive values become past-tense despite the explicit future flag.",
"impact": "The documented API says integers and floats are past tense by default unless `future=True`, but `humanize.naturaltime(30, future=True)` returns `30 seconds ago` instead of `30 seconds from now`. Callers using numeric durations for countdowns or future estimates silently get reversed wording.",
"evidence": [
"src/humanize/time.py:262 documents the numeric future flag behavior.",
"src/humanize/time.py:284 computes `date, delta = _date_and_delta(value, now=now)`.",
"src/humanize/time.py:288 recomputes `future` when the caller already set `future=True`.",
"tests/test_time.py:175-178 cover numeric seconds only with the default past-tense mode.",
"Focused proof command printed: `naturaltime future flag: actual='30 seconds ago' expected='30 seconds from now' FAIL`."
],
"reproduction_or_test": "uv run python - <<'PY'\nimport humanize\nassert humanize.naturaltime(30, future=True) == '30 seconds from now'\nPY",
"confidence": "high"
},
"title": "intcomma skips thousands separators for negative numbers",
"severity": "medium",
"file": "src/humanize/number.py",
"symbol": "intcomma",
"cause": "The comma insertion regex is anchored to digits only: `^(\\d+)(\\d{3})`. Once the formatted value starts with `-`, the loop cannot match and returns the ungrouped number unchanged.",
"impact": "Negative integers, floats, and numeric strings are accepted inputs but lose the function's core formatting behavior. Examples: `-1234567` returns `-1234567` and `'-1234567.89'` returns `'-1234567.89'` instead of preserving the sign while grouping digits.",
"src/humanize/number.py:110 defines `intcomma` for integers, floats, and strings.",
"src/humanize/number.py:164-167 builds the formatted string before separator insertion.",
"src/humanize/number.py:171-175 applies a regex that cannot match a leading minus sign.",
"tests/test_number.py:41-83 cover positive values and non-finite values, but no negative finite values.",
"Focused proof command printed negative int/string failures for `intcomma`."
"reproduction_or_test": "uv run python - <<'PY'\nimport humanize\nassert humanize.intcomma(-1234567) == '-1,234,567'\nassert humanize.intcomma('-1234567.89') == '-1,234,567.89'\nPY",
"title": "clamp marks exact floor and ceiling values as out of range",
"symbol": "clamp",
"cause": "`clamp` uses inclusive comparisons, `value <= floor` and `value >= ceil`, even though the docstring and examples define clamping only for values smaller than the floor or larger than the ceiling.",
"impact": "Boundary values that are already valid receive misleading under/over tokens. The documented example `clamp(0.99, format='{:.0%}', ceil=0.99)` should be `99%`, but the current implementation returns `>99%`; the floor boundary has the same problem with `<`.",
"src/humanize/number.py:464-465 says tokens are for values larger than ceil or smaller than floor.",
"src/humanize/number.py:473-476 documents equality at the ceiling as unclamped.",
"src/humanize/number.py:509-514 uses inclusive comparisons.",
"tests/test_number.py:223-240 cover below-floor and above-ceiling cases, but omit equality at either boundary.",
"Focused proof command printed equality failures for both ceil and floor."
"reproduction_or_test": "uv run python - <<'PY'\nimport humanize\nassert humanize.clamp(0.99, format='{:.0%}', ceil=0.99) == '99%'\nassert humanize.clamp(0.01, format='{:.0%}', floor=0.01) == '1%'\nPY",
"title": "fractional drops the sign for negative proper fractions",
"symbol": "fractional",
"cause": "When the absolute value is less than one, `whole_number` is zero and the function returns `abs(numerator)`, discarding the sign from the fractional part.",
"impact": "Negative proper fractions are reported as positive values, which is direct data loss for accepted numeric input. `humanize.fractional(-0.5)` returns `1/2` instead of `-1/2`.",
"src/humanize/number.py:362-371 parses accepted values through `float` and `Fraction`.",
"src/humanize/number.py:379-380 formats proper fractions with `abs(numerator)`.",
"tests/test_number.py:166-189 cover positive fractions and non-finite values, but no negative finite fractions.",
"Focused proof command printed: `fractional negative proper: actual='1/2' expected='-1/2' FAIL`."
"reproduction_or_test": "uv run python - <<'PY'\nimport humanize\nassert humanize.fractional(-0.5) == '-1/2'\nPY",
"title": "natural_list conflates an empty list with a list containing None",
"severity": "low",
"file": "src/humanize/lists.py",
"symbol": "natural_list",
"cause": "The empty-list branch returns the literal string `None`, while the one-item branch stringifies the item. That makes `[]` and `[None]` produce the same output.",
"impact": "A valid empty collection loses information and becomes indistinguishable from a list that actually contains `None`. This can leak misleading UI copy in callers that use `natural_list` for optional labels, recipients, tags, or summaries.",
"src/humanize/lists.py:12 defines `natural_list` for list inputs.",
"src/humanize/lists.py:31-34 returns `None` for `[]` and `str(items[0])` for one-item lists.",
"tests/test_lists.py:8-23 cover one or more items, including an empty string item, but not an empty list or `[None]`.",
"Focused proof command printed: `natural_list empty distinct from [None]: empty='None' [None]='None' FAIL`."
"reproduction_or_test": "uv run python - <<'PY'\nimport humanize\nassert humanize.natural_list([]) != humanize.natural_list([None])\nPY",
"confidence": "medium"
}
"commands_run": [
"sed -n '1,240p' /Users/kle1nz/m1nd/docs/benchmarks/bug-hunt-rounds/bughunt-humanize-20260514T021500Z/lane-prompts/audit-02.md",
"sed -n '1,220p' /Users/kle1nz/.codex/skills/m1nd-operator/SKILL.md",
"python3 /Users/kle1nz/.codex/skills/m1nd-operator/scripts/probe_m1nd.py tools",
"python3 /Users/kle1nz/.codex/skills/m1nd-operator/scripts/probe_m1nd.py run '[trust_selftest, session_handshake, ingest, audit/search/seek...]'",
"M1ND_WORKSPACE_ROOT=/Users/kle1nz/m1nd/.m1nd-field-workspaces/bughunt-humanize-20260514T021500Z/audit-02/humanize python3 /Users/kle1nz/.codex/skills/m1nd-operator/scripts/probe_m1nd.py run '[ingest, impact, validate_plan, surgical_context_v2 for number.py]'",
"M1ND_WORKSPACE_ROOT=/Users/kle1nz/m1nd/.m1nd-field-workspaces/bughunt-humanize-20260514T021500Z/audit-02/humanize python3 /Users/kle1nz/.codex/skills/m1nd-operator/scripts/probe_m1nd.py run '[ingest, impact, validate_plan for time.py naturaltime]'",
"M1ND_WORKSPACE_ROOT=/Users/kle1nz/m1nd/.m1nd-field-workspaces/bughunt-humanize-20260514T021500Z/audit-02/humanize python3 /Users/kle1nz/.codex/skills/m1nd-operator/scripts/probe_m1nd.py run '[ingest, impact, validate_plan for lists.py natural_list]'",
"rg --files",
"rg -n 'natural_list|naturalsize|fractional|clamp|metric|naturalday|naturaldate|naturaltime|empty|None' README.md docs tests src/humanize",
"sed/nl reads of src/humanize/number.py, src/humanize/time.py, src/humanize/filesize.py, src/humanize/lists.py, src/humanize/i18n.py and related tests",
"uv run --extra tests pytest -q",
"uv run python snippets for boundary behavior in naturaltime, intcomma, clamp, fractional, natural_list, naturalsize, and non-finite time values",
"cd /tmp && uv run --with humanize --no-project python comparison snippets",
"diff -u cached humanize 4.15.0 package files against src/humanize/*.py",
"git status --short --ignored"
"files_inspected": [
"pyproject.toml",
"README.md",
"docs/number.md",
"docs/time.md",
"docs/filesize.md",
"docs/lists.md",
"src/humanize/__init__.py",
"src/humanize/number.py",
"src/humanize/time.py",
"src/humanize/filesize.py",
"src/humanize/lists.py",
"src/humanize/i18n.py",
"tests/test_number.py",
"tests/test_time.py",
"tests/test_filesize.py",
"tests/test_lists.py",
"tests/test_i18n.py"
"tests_run": [
"command": "uv run --extra tests pytest -q",
"result": "685 passed, 69 skipped in 0.64s"
"command": "uv run python focused boundary proof for naturaltime future, intcomma negatives, clamp equality, fractional negative proper fraction, and natural_list empty distinction",
"result": "Exited 1 and printed seven FAIL lines matching the reported boundary regressions."
"m1nd_usage": [
"Used `probe_m1nd.py tools`; 96 tools were available, including trust_selftest, session_handshake, ingest, audit, search, seek, impact, validate_plan, recovery_playbook, and surgical_context_v2.",
"Initial `session_handshake` with the audit repo scope reported wrong_workspace_binding because the active graph was bound to the Codex app cwd; followed the prompt by ingesting the intended audit workspace.",
"Ran `ingest` on the assigned humanize repo; m1nd parsed 67-68 files and built a graph with 213-214 nodes and 224 edges.",
"Ran `audit`; after correcting the required `path` argument, it reported repo identity, clean source graph verification, and one uncommitted m1nd-created `ingest_roots.json` file.",
"Ran exact/intent retrieval for requested symbols and topics. `seek` pointed to src/humanize/time.py, src/humanize/number.py, src/humanize/filesize.py, and tests.",
"Ran `recovery_playbook` after one over-broad literal search returned blocked/zero candidates, then used narrower searches and direct verification.",
"Ran `impact`, `validate_plan`, and `surgical_context_v2` for number.py findings; plan risk was low and affected files were src/humanize/number.py and tests/test_number.py.",
"Ran `impact` and `validate_plan` for the naturaltime and natural_list findings; plan risk was low and affected files were localized to their modules plus tests."
"final_notes": "No code patches, commits, pushes, publication, or broad rewrites were performed. I did not read operator-only files. The main full test suite passes, so the report is focused on uncovered boundary regressions with direct reproduction commands. `uv run` created ignored local runtime artifacts (`.venv`, `.pytest_cache`, `__pycache__`, generated _version.py), and m1nd ingest created an untracked `ingest_roots.json` inside the assigned repo workspace.",
"agent_testimony": "I followed the audit prompt as a correctness/regression review: m1nd first for trust, ingest, orientation, impact, and plan validation; then direct file reads, baseline tests, focused boundary probes, and source comparison for extra confidence. The strongest signals were places where docs/tests promised behavior at an API boundary but the code returned the opposite tense, skipped sign-aware formatting, or labeled in-range values as clamped.",