stacknil · Jun 30, 2026
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -25,6 +25,8 @@ All notable user-visible changes should be recorded here.
 
 ### Docs
 
+- Added a rule-by-rule false-positive taxonomy for NAT, bastion, internal scanner,
+  lab replay, scheduled admin task, and shared-account contexts.
 - Expanded the parser conformance matrix with explicit Ubuntu / Debian
   `auth.log`, RHEL-family `secure`, `journalctl --output=short-full`, `sshd`,
   `sudo`, `pam_unix`, `pam_faillock`, and `pam_sss` style coverage.

diff --git a/README.md b/README.md
@@ -29,7 +29,7 @@ A compact finding summary is a bounded triage signal, not attribution:
 
 LogLens is an MVP / early release. The repository is stable enough for public review, local experimentation, and extension, but the parser and detection coverage are intentionally narrow.
 
-Reviewing the project quickly? Start with [`docs/reviewer-path.md`](./docs/reviewer-path.md), [`docs/reviewer-brief.md`](./docs/reviewer-brief.md), and the [`quality gates map`](./docs/quality-gates.md). For detection reasoning, read the forensic-style [`Linux auth brute-force case study`](./docs/case-study-linux-auth-bruteforce.md) and the [`rule catalog`](./docs/rule-catalog.md). For local scale expectations, see the [`performance envelope`](./docs/performance-envelope.md).
+Reviewing the project quickly? Start with [`docs/reviewer-path.md`](./docs/reviewer-path.md), [`docs/reviewer-brief.md`](./docs/reviewer-brief.md), and the [`quality gates map`](./docs/quality-gates.md). For detection reasoning, read the forensic-style [`Linux auth brute-force case study`](./docs/case-study-linux-auth-bruteforce.md), the [`rule catalog`](./docs/rule-catalog.md), and the [`false-positive taxonomy`](./docs/false-positive-taxonomy.md). For local scale expectations, see the [`performance envelope`](./docs/performance-envelope.md).
 
 ## Why This Project Exists
 
@@ -113,7 +113,7 @@ classes: `unknown_timestamp`, `unknown_program`,
 `known_program_unknown_message`, `malformed_source_ip`, and
 `unsupported_pam_variant`.
 
-For rule-by-rule semantics and signal boundaries, see [`docs/rule-catalog.md`](./docs/rule-catalog.md). For a forensic-style evidence walkthrough, see [`docs/case-study-linux-auth-bruteforce.md`](./docs/case-study-linux-auth-bruteforce.md). For the parser behavior contract, supported modes, and fixture map, see [`docs/parser-contract.md`](./docs/parser-contract.md). For the deliberately noisy parser-coverage sample, see [`docs/parser-coverage-notes.md`](./docs/parser-coverage-notes.md).
+For rule-by-rule semantics and signal boundaries, see [`docs/rule-catalog.md`](./docs/rule-catalog.md). For benign-context hypotheses and the evidence needed to support them, see [`docs/false-positive-taxonomy.md`](./docs/false-positive-taxonomy.md). For a forensic-style evidence walkthrough, see [`docs/case-study-linux-auth-bruteforce.md`](./docs/case-study-linux-auth-bruteforce.md). For the parser behavior contract, supported modes, and fixture map, see [`docs/parser-contract.md`](./docs/parser-contract.md). For the deliberately noisy parser-coverage sample, see [`docs/parser-coverage-notes.md`](./docs/parser-coverage-notes.md).
 
 LogLens does not currently detect:
 

diff --git a/docs/case-study-linux-auth-bruteforce.md b/docs/case-study-linux-auth-bruteforce.md
@@ -119,7 +119,7 @@ These warnings are useful because they prevent silent overconfidence. A reviewer
 
 ## False-positive boundary
 
-The findings should be read as triage statements and checked against the rule-by-rule taxonomy in [`rule-catalog.md`](./rule-catalog.md):
+The findings should be read as triage statements and checked against the rule semantics in [`rule-catalog.md`](./rule-catalog.md) and the evidence-review matrices in [`false-positive-taxonomy.md`](./false-positive-taxonomy.md):
 
 - `203.0.113.10` is a documentation-range placeholder; in a real case, the same pattern could be an external scanner, shared gateway, internal test, or replayed lab traffic.
 - Username spread supports a probing interpretation, but intent is not observable from these lines alone.

diff --git a/docs/false-positive-taxonomy.md b/docs/false-positive-taxonomy.md
@@ -0,0 +1,76 @@
+# False-Positive Taxonomy
+
+This document records benign or ambiguous contexts that can satisfy a LogLens rule threshold. A matching context does not erase the finding: it changes how a reviewer should interpret the normalized evidence and what external records are needed before disposition.
+
+The taxonomy is not an allow-list, suppression policy, or incident verdict. LogLens reports the rule match, its evidence window, and its `verdict_boundary`; authorization, intent, compromise, and attribution remain outside the tool.
+
+## Taxonomy Sources
+
+| Source | Meaning in this taxonomy |
+| --- | --- |
+| NAT | Several clients are represented by one network address, weakening source-IP identity assumptions. |
+| bastion | Administrative traffic is concentrated through an approved jump host or access gateway. |
+| internal scanner | Authorized assessment, compliance, or account-audit tooling deliberately generates authentication activity. |
+| lab replay | Training, test, demonstration, or pipeline validation data reproduces a finding-shaped sequence. |
+| scheduled admin task | A recurring operational job produces repeated authentication failures or privileged commands. |
+| shared account | Several operators or services use one account, weakening individual attribution and concentrating activity. |
+
+`bastion` and `shared account` are separate hypotheses. A bastion explains host or network concentration; a shared account explains identity concentration. Either can exist without the other.
+
+## Brute Force
+
+Rule evidence: at least 5 terminal SSH failure signals grouped by `source_ip` within 10 minutes by default.
+
+Verdict boundary: `triage_signal_not_compromise_or_attribution`.
+
+| Source | Why the threshold can match | Evidence that supports the explanation | Residual uncertainty |
+| --- | --- | --- | --- |
+| NAT | Independent users or services behind one egress address contribute failures to the same `source_ip` group. | VPN, proxy, firewall, or DHCP records map the source address and window to multiple internal clients. | Aggregation can explain volume but does not establish that every attempt was authorized. |
+| bastion | Multiple administrators or automation jobs originate from one approved jump host. | Bastion inventory, session audit records, and operator mappings cover the finding window and evidence event IDs. | An approved bastion can still carry stale credentials, misuse, or a compromised session. |
+| internal scanner | An authorized scanner tests SSH exposure or credential controls and produces terminal failures by design. | Scanner ownership, target scope, source-address inventory, and a matching scan schedule or change record. | Scanner identity supports authorization but does not validate target scope or configuration. |
+| lab replay | A fixture, demonstration, or validation job replays a concentrated failure sequence. | Ingestion provenance, replay job logs, fixture hashes, or known synthetic timestamps match the evidence. | Replayed data in a production evidence path is still a provenance or pipeline-quality issue. |
+| scheduled admin task | A recurring job repeatedly uses an expired, rotated, or mistyped credential. | Scheduler logs, service ownership, credential-rotation history, and matching execution timestamps. | A job explanation does not prove the credential failures are harmless or properly contained. |
+| shared account | Several operators or services retry the same shared credential from one source. | Account ownership records, approved-use policy, bastion or session logs, and change-window context. | The shared identity prevents reliable attribution to an individual operator. |
+
+## Multi-User Probing
+
+Rule evidence: at least 3 distinct usernames in attempt-evidence signals grouped by `source_ip` within 15 minutes by default.
+
+Verdict boundary: `triage_signal_not_intent_or_attribution`.
+
+| Source | Why the threshold can match | Evidence that supports the explanation | Residual uncertainty |
+| --- | --- | --- | --- |
+| NAT | Separate legitimate users behind one egress address attempt their own usernames during the same window. | Network translation, VPN, proxy, or DHCP records map the grouped address to distinct clients and expected users. | NAT explains source aggregation but not whether every attempted username was expected. |
+| bastion | An access gateway handles sessions for several named administrators or service accounts. | Bastion session records map each attempted username and timestamp to approved operators or workflows. | Missing session attribution leaves the username spread unexplained. |
+| internal scanner | Account-audit or exposure tooling tries a configured username set to validate controls. | Scanner configuration, approved account list, target scope, and execution schedule match the finding. | A broad or outdated username list may still represent a control or scope problem. |
+| lab replay | Synthetic data preserves username diversity to exercise parser or detector behavior. | Fixture provenance, replay logs, and expected username lists match the evidence event IDs. | Synthetic data must still be separated from operational evidence before conclusions are drawn. |
+| scheduled admin task | Migration, monitoring, or account-validation automation cycles through several service identities. | Job definition, account inventory, owner confirmation, and scheduler timestamps match the rule window. | Unexpected usernames or executions outside the approved window remain unexplained. |
+| shared account | Operators or tooling fall back across several shared or service accounts from one source. | Account-use policy, workflow configuration, and session logs explain the full observed username set. | One shared account alone does not create distinct-username spread; the explanation requires evidence of multiple accounts being tried. |
+
+## Sudo Burst
+
+Rule evidence: at least 3 `sudo_command` signals grouped by `username` within 5 minutes by default.
+
+Verdict boundary: `triage_signal_not_maliciousness_or_authorization`.
+
+| Source | Why the threshold can match | Evidence that supports the explanation | Residual uncertainty |
+| --- | --- | --- | --- |
+| NAT | NAT does not directly affect the count for this username-grouped rule, but it can confuse attempts to correlate the finding with nearby source-IP findings. | Session records and host-local audit context link the sudo commands to a specific login independently of the network address. | Without session linkage, network proximity is not evidence that SSH and sudo findings share an actor. |
+| bastion | An approved administrator reaches the host through a jump path and executes several maintenance commands quickly. | Bastion session records, target-host login records, and a change ticket align with the sudo evidence window. | A valid access path does not establish that each command was authorized. |
+| internal scanner | Compliance, inventory, or endpoint assessment tooling executes a short privileged command sequence. | Agent identity, scanner policy, command allow-list, and execution logs match the reported commands and timestamps. | Unexpected commands or host scope remain reviewable even when the tool is authorized. |
+| lab replay | Demonstration or test evidence contains a compact sudo sequence. | Dataset provenance, replay job records, and known synthetic account or host values match the finding. | Replayed privileged activity mixed into operational logs still weakens evidence provenance. |
+| scheduled admin task | Package updates, service repair, backup, or maintenance automation runs several sudo commands in one window. | Scheduler records, automation definitions, change windows, and command text match the evidence event IDs. | Execution outside schedule or divergence from the expected command set remains unexplained. |
+| shared account | Several administrators use one account, or automation and humans share the same identity, concentrating commands under one `username`. | Session attribution, privileged access management records, operator rosters, and command ownership cover the complete window. | The account model prevents reliable individual attribution and may itself be a control weakness. |
+
+## Cross-Rule Interpretation
+
+- `brute_force` and `multi_user_probing` findings over the same source and window are two views of overlapping evidence, not automatically two independent actors or incidents.
+- A nearby `sudo_burst` finding is not causally linked to an SSH finding unless external session evidence establishes that relationship.
+- `evidence_event_ids`, `window_start`, and `window_end` define exactly what LogLens counted. Review those records before applying contextual explanations.
+- Parser warnings and unsupported lines describe evidence completeness. They do not count toward findings, but a high unsupported-line rate weakens claims that an activity is absent.
+
+## Evidence Integrity Boundary
+
+Duplicate recognized lines, replayed collections, or merged log exports can inflate a rule count even when every line parses successfully. That is an evidence-provenance question, distinct from parser warnings about unsupported lines. Review ingestion history and source hashes when replay or duplication is plausible.
+
+The appropriate conclusion is therefore bounded: a taxonomy source may explain why a threshold was met, but only corroborating records can support a benign disposition. LogLens does not make that disposition automatically.
diff --git a/docs/reviewer-path.md b/docs/reviewer-path.md
@@ -10,6 +10,7 @@ This path is for reviewers who want to understand LogLens quickly without readin
 | What log formats are supported? | [`docs/parser-contract.md`](./parser-contract.md) | Can name `syslog_legacy` and `journalctl_short_full` behavior |
 | What artifacts does it produce? | [`docs/report-artifacts.md`](./report-artifacts.md) and report-contract fixtures | Can inspect Markdown, JSON, and optional CSV outputs |
 | How do rules use evidence? | [`docs/rule-catalog.md`](./rule-catalog.md) | Can explain grouping keys, windows, thresholds, and unsupported-evidence boundaries |
+| What benign context can match a rule? | [`docs/false-positive-taxonomy.md`](./false-positive-taxonomy.md) | Can distinguish rule-true evidence from compromise, intent, attribution, or authorization claims |
 | Can the parser behavior be trusted? | Parser contract, fixture matrix, and [`assets/mixed_auth_parser_coverage.json`](../assets/mixed_auth_parser_coverage.json) | Can see known, unknown, and malformed line handling |
 | What proves the main claims? | [`docs/quality-gates.md`](./quality-gates.md) | Can map claims to tests, fixtures, docs, and repeatable commands |
 | How should a finding be interpreted? | [`docs/case-study-linux-auth-bruteforce.md`](./case-study-linux-auth-bruteforce.md) | Can trace raw evidence to normalized events, findings, warnings, and non-goals |
@@ -46,6 +47,7 @@ Inspect:
 - [`assets/mixed_auth_parser_coverage.json`](../assets/mixed_auth_parser_coverage.json)
 - [`docs/quality-gates.md`](./quality-gates.md)
 - [`docs/rule-catalog.md`](./rule-catalog.md)
+- [`docs/false-positive-taxonomy.md`](./false-positive-taxonomy.md)
 - [`docs/case-study-linux-auth-bruteforce.md`](./case-study-linux-auth-bruteforce.md)
 
 Look for the evidence route:
@@ -54,6 +56,7 @@ Look for the evidence route:
 - normalized event
 - signal mapping boundary
 - rule grouping, window, and threshold
+- false-positive hypotheses and required corroborating context
 - report finding or parser warning
 
 Look for parser coverage fields:

diff --git a/docs/rule-catalog.md b/docs/rule-catalog.md
@@ -50,16 +50,16 @@ Current `verdict_boundary` values are:
 
 ## False-Positive Taxonomy
 
-The taxonomy names benign or ambiguous explanations a reviewer should consider before interpreting a finding. It is not an allow-list, suppression policy, or automatic disposition.
+The taxonomy names benign or ambiguous explanations a reviewer should consider before interpreting a finding. It is not an allow-list, suppression policy, or automatic disposition. The detailed evidence-review matrices are in [`false-positive-taxonomy.md`](./false-positive-taxonomy.md).
 
 Each rule uses the same review buckets:
 
 - NAT
+- bastion
 - internal scanner
 - lab replay
-- shared bastion
 - scheduled admin task
-- malformed log replay
+- shared account
 
 ## Brute Force
 
@@ -112,11 +112,13 @@ The finding is a triage signal. It is not a compromise verdict, attribution clai
 | Bucket | Review interpretation |
 | --- | --- |
 | NAT | Multiple legitimate clients behind one egress address can collapse into one `source_ip`. |
+| bastion | An approved jump host can concentrate many operators or jobs under one source address. |
 | internal scanner | Authorized credential auditing or exposure scanning can intentionally generate repeated failures. |
 | lab replay | Sanitized sample data, training fixtures, or repeated demos can preserve concentrated failure patterns. |
-| shared bastion | A managed jump host or administrative relay can make many failed attempts appear to come from one source. |
 | scheduled admin task | A recurring job with stale credentials can fail repeatedly inside the rule window. |
-| malformed log replay | Duplicated or replayed log material can inflate apparent volume; unsupported malformed lines remain warnings and are not counted. |
+| shared account | Several operators or services can retry one shared credential from the grouped source. |
+
+See the [brute-force review matrix](./false-positive-taxonomy.md#brute-force) for corroborating evidence and residual uncertainty.
 
 ### Why unsupported evidence is not counted
 
@@ -180,11 +182,13 @@ The rule does not infer intent. It only states that one source IP produced attem
 | Bucket | Review interpretation |
 | --- | --- |
 | NAT | Different users behind one egress address can look like one source probing multiple accounts. |
+| bastion | A shared administrative entry point can originate expected attempts for several accounts. |
 | internal scanner | Authorized username-enumeration tests or account-audit tooling can touch many usernames by design. |
 | lab replay | Replayed lab logs can preserve synthetic username spread without representing live probing. |
-| shared bastion | Shared administrative entry points can produce attempts for several accounts from one source IP. |
 | scheduled admin task | Account validation, migration, or monitoring jobs can try multiple service or user accounts in one window. |
-| malformed log replay | Replayed or partially malformed evidence can duplicate username variety; unsupported records remain parser warnings and do not add usernames. |
+| shared account | Shared-account workflows can include fallback attempts across several shared or service identities. |
+
+See the [multi-user probing review matrix](./false-positive-taxonomy.md#multi-user-probing) for corroborating evidence and residual uncertainty.
 
 ### Why unsupported evidence is not counted
 
@@ -243,11 +247,13 @@ The finding is strongest when reviewed with session context, change windows, hos
 | Bucket | Review interpretation |
 | --- | --- |
 | NAT | Usually not a primary explanation because this rule groups by `username`, but it may matter when reviewed alongside source-IP findings. |
+| bastion | Approved jump-host workflows can precede a compact sequence of privileged maintenance commands. |
 | internal scanner | Endpoint assessment, compliance checks, or privileged inventory tooling can run several sudo commands quickly. |
 | lab replay | Demo or training logs can replay a compact privileged-command sequence. |
-| shared bastion | Shared administrative accounts or jump-host workflows can concentrate privileged commands under one username. |
 | scheduled admin task | Maintenance windows, package updates, service repair, or scripted operations can produce bursty sudo activity. |
-| malformed log replay | Duplicated sudo lines or replayed command logs can inflate the command count; unsupported malformed sudo-like lines stay out of rule input. |
+| shared account | Several administrators or services can concentrate commands under one username. |
+
+See the [sudo-burst review matrix](./false-positive-taxonomy.md#sudo-burst) for corroborating evidence and residual uncertainty.
 
 ### Why unsupported evidence is not counted