{
"schema": "m1nd-bug-hunt-audit-result-v0",
"round_id": "bughunt-humanize-20260514T021500Z",
"lane_id": "audit-10",
"repo_path": "/Users/kle1nz/m1nd/.m1nd-field-workspaces/bughunt-humanize-20260514T021500Z/audit-10/humanize",
"findings": [
"title": "intcomma leaves negative numbers ungrouped",
"severity": "medium",
"file": "src/humanize/number.py",
"symbol": "intcomma",
"cause": "After converting the value to a string, the grouping regex only matches strings that start with digits. A leading '-' prevents any substitution, so the loop returns the original ungrouped negative value.",
"impact": "Negative amounts and measurements render inconsistently with positive values, e.g. -1234567 becomes -1234567 instead of -1,234,567. The same path affects negative floats and numeric strings.",
"evidence": [
"src/humanize/number.py:164 formats orig from str(value)",
"src/humanize/number.py:171 uses r'^(\\d+)(\\d{3})', which cannot match a leading minus sign",
"tests/test_number.py:43-78 covers positive intcomma and non-finite values but no negative grouping cases",
"proof: humanize.intcomma(-1234567) returned '-1234567'"
],
"reproduction_or_test": "uv run python - <<'PY'\nimport humanize\nassert humanize.intcomma(-1234567) == '-1,234,567'\nPY",
"confidence": "high"
},
"title": "fractional drops or duplicates the sign for negative fractions",
"severity": "high",
"symbol": "fractional",
"cause": "The function truncates negative values toward zero with int(number), then formats the residual fraction without normalizing the sign. Proper fractions use abs(numerator), losing the sign entirely, while mixed fractions print a negative whole number and a negative numerator.",
"impact": "Negative numeric input is not faithfully represented: -0.3 becomes '3/10' and -1.3 becomes '-1 -3/10'. This is a data-preservation bug for any caller formatting signed ratios or deltas.",
"src/humanize/number.py:370 computes whole_number = int(number), which truncates toward zero",
"src/humanize/number.py:379-382 uses abs(numerator) only for proper fractions and leaves negative numerators in mixed fractions",
"tests/test_number.py:162-189 covers positive fractions and non-finite values but no negative fractions",
"proof: humanize.fractional(-0.3) returned '3/10'; humanize.fractional(-1.3) returned '-1 -3/10'"
"reproduction_or_test": "uv run python - <<'PY'\nimport humanize\nassert humanize.fractional(-0.3) == '-3/10'\nassert humanize.fractional(-1.3) == '-1 3/10'\nPY",
"title": "clamp marks exact floor and ceil values as out of range",
"symbol": "clamp",
"cause": "The comparisons use value <= floor and value >= ceil, but the docstring and argument text define clamping for values smaller than the floor or larger than the ceil. The embedded example for value == ceil currently fails.",
"impact": "Boundary values are labeled as under/above even though they are already within the allowed range, e.g. a value exactly equal to a threshold gets a '<' or '>' token.",
"src/humanize/number.py:464 says clamping applies when a number is larger than ceil or smaller than floor",
"src/humanize/number.py:473 documents clamp(0.99, format='{:.0%}', ceil=0.99) -> '99%'",
"src/humanize/number.py:509-514 uses <= and >=",
"doctest.testmod(humanize.number) reported the clamp equality example expected '99%' but got '>99%'"
"reproduction_or_test": "uv run python - <<'PY'\nimport humanize\nassert humanize.clamp(0.99, format='{:.0%}', ceil=0.99) == '99%'\nassert humanize.clamp(0.01, format='{:.0%}', floor=0.01) == '1%'\nPY",
"title": "naturalsize crashes or fabricates suffixes for NaN and infinity",
"file": "src/humanize/filesize.py",
"symbol": "naturalsize",
"cause": "The function coerces input with float(value) and then immediately uses log(abs_bytes, base) without checking math.isfinite. NaN reaches int(min(...)) and raises ValueError; infinity is clamped to the largest suffix and formatted as an infinite quantity of QB.",
"impact": "Callers that pass non-finite values from upstream calculations either crash or receive misleading output such as 'inf QB'. Other number-formatting APIs in this package explicitly normalize NaN and infinities, so this is an inconsistent boundary regression.",
"src/humanize/filesize.py:90 converts value to float",
"src/humanize/filesize.py:99 calculates int(min(log(abs_bytes, base), len(suffix))) with no finite guard",
"tests/test_filesize.py:12-97 covers negative sizes but no NaN or infinity",
"tests/test_number.py:30-34, 73-77, 126-130, 181-185 show the number module expects stable NaN/Inf formatting",
"proof: humanize.naturalsize(float('nan')) raised ValueError; humanize.naturalsize(float('inf')) returned 'inf QB'"
"reproduction_or_test": "uv run python - <<'PY'\nimport math, humanize\nassert humanize.naturalsize(math.nan) == 'NaN'\nassert humanize.naturalsize(math.inf) == '+Inf'\nassert humanize.naturalsize(-math.inf) == '-Inf'\nPY",
"title": "metric does not promote when rounding reaches 1000 of a prefix",
"symbol": "metric",
"cause": "The SI prefix is selected from the original base-10 exponent before formatting. If the scaled value rounds to 1000, the function returns 1000 of the old prefix instead of promoting to the next prefix.",
"impact": "Values near SI boundaries produce awkward and less normalized strings such as '1000 k', '1000 m', or '-1000 k'. This contradicts the function's own claim that it chooses prefixes to avoid non-significant zero digits.",
"src/humanize/number.py:547-550 says the prefix is chosen so non-significant zero digits are not required",
"src/humanize/number.py:570-582 selects ordinal_ before formatting value_ and never rechecks a rounded value of 1000",
"tests/test_number.py:254-309 covers many metric values but no boundary values like 999500",
"proof: humanize.metric(999500) returned '1000 k'; humanize.metric(0.9995) returned '1000 m'"
"reproduction_or_test": "uv run python - <<'PY'\nimport humanize\nassert humanize.metric(999_500) == '1.00 M'\nassert humanize.metric(0.9995) == '1.00'\nPY",
"confidence": "medium"
"title": "ordinal uses the wrong suffix for negative integers",
"severity": "low",
"symbol": "ordinal",
"cause": "The suffix logic applies Python modulo directly to the signed value. Negative numbers such as -1 and -2 map to suffix slots 9 and 8 instead of using the suffix for the absolute last digit.",
"impact": "The function says it works for any integer, but negative ordinals render as '-1th', '-2th', and '-21th' instead of preserving the expected st/nd/rd suffix.",
"src/humanize/number.py:37 says ordinal works for any integer",
"src/humanize/number.py:105-107 uses value % 100 and value % 10 directly on negative values",
"tests/test_number.py:13-38 has no negative ordinal cases",
"proof: humanize.ordinal(-1) returned '-1th'; humanize.ordinal(-22) returned '-22th'"
"reproduction_or_test": "uv run python - <<'PY'\nimport humanize\nassert humanize.ordinal(-1) == '-1st'\nassert humanize.ordinal(-22) == '-22nd'\nPY",
"title": "Runtime type-hint introspection fails for public APIs",
"symbol": "NumberOrString annotations",
"cause": "The package uses postponed annotations but defines NumberOrString, dt, Any, and Iterable only inside TYPE_CHECKING blocks. Those names are absent from module globals at runtime, so typing.get_type_hints cannot resolve public function annotations.",
"impact": "The package ships py.typed, but runtime consumers that inspect annotations for docs, validation, wrappers, or schema generation get NameError on public functions such as intcomma and naturaldelta. Static mypy checks pass, so this regression is easy to miss.",
"src/humanize/py.typed exists",
"src/humanize/number.py:12-18 defines NumberOrString only under TYPE_CHECKING",
"src/humanize/time.py:15-19 imports dt and Iterable only under TYPE_CHECKING while annotations reference them",
"proof: typing.get_type_hints(humanize.number.intcomma) raised NameError: name 'NumberOrString' is not defined",
"proof: typing.get_type_hints(humanize.time.naturaldelta) raised NameError: name 'dt' is not defined"
"reproduction_or_test": "uv run python - <<'PY'\nimport typing, humanize.number, humanize.time\ntyping.get_type_hints(humanize.number.intcomma)\ntyping.get_type_hints(humanize.time.naturaldelta)\nPY",
}
"commands_run": [
"sed -n '1,260p' /Users/kle1nz/m1nd/docs/benchmarks/bug-hunt-rounds/bughunt-humanize-20260514T021500Z/lane-prompts/audit-10.md",
"rg --files",
"git status --short",
"sed/nl reads of pyproject.toml, README.md, src/humanize/*.py, tests/test_*.py, and docs/*.md",
"uv run --extra tests pytest -q",
"uv run --with mypy -q mypy src",
"uv run python proof scripts for intcomma, fractional, ordinal, clamp, naturalsize, metric, and get_type_hints",
"uv run python doctest.testmod(humanize.number)",
"rg -n searches for documented behavior and missing boundary tests"
"files_inspected": [
"pyproject.toml",
"README.md",
"docs/number.md",
"docs/filesize.md",
"docs/time.md",
"src/humanize/__init__.py",
"src/humanize/number.py",
"src/humanize/time.py",
"src/humanize/filesize.py",
"src/humanize/lists.py",
"src/humanize/i18n.py",
"src/humanize/py.typed",
"tests/test_number.py",
"tests/test_time.py",
"tests/test_filesize.py",
"tests/test_lists.py",
"tests/test_i18n.py"
"tests_run": [
"command": "uv run --extra tests pytest -q",
"result": "passed",
"detail": "685 passed, 69 skipped in 0.70s"
"command": "uv run --with mypy -q mypy src",
"detail": "Success: no issues found in 7 source files"
"command": "uv run python - <<'PY' import doctest; import humanize.number; print(doctest.testmod(humanize.number)) PY",
"result": "failed-by-evidence",
"detail": "1 doctest failure: clamp(0.99, format='{:.0%}', ceil=0.99) expected '99%' but got '>99%'; 55 attempted"
"command": "uv run python proof script for discovered boundary cases",
"result": "evidence-collected",
"detail": "Printed actual outputs for negative intcomma/fractional/ordinal, clamp equality, metric boundary promotion, naturalsize non-finites, and type-hint introspection failures"
"m1nd_usage": [],
"final_notes": "No source patches, commits, pushes, or m1nd MCP/CLI/graph tools were used. The existing test suite passes, so the findings are uncovered boundary regressions with focused reproduction commands rather than current suite failures. Translation binary generation was not attempted; i18n-specific behavior beyond source/test inspection is not claimed.",
"agent_testimony": "I followed the lane prompt as a local correctness audit, prioritizing source-level causes that could be reproduced with short commands. The useful signal came from comparing function docstrings, existing test coverage, and focused boundary probes after the full test suite passed. I avoided broad rewrites and left the repo source tree clean.",