Engineering Precision: The Unix Pipe of AI Workflows


Setting the Stage: Context for the Curious Book Reader

This entry explores an important methodology for the Age of AI: how to replace imprecise semantic search with deterministic lexical intersection. By building a custom shell alias, we transform how we explore personal knowledge bases, showing that simple, robust tooling often outperforms complex vector indices for specific, intent-driven discovery.


Technical Journal Entry Begins

MikeLev.in: What I need now is the next logical step after this. What we just did with rg giving us the ability to do this:

(nix) pipulate $ rg -l "black swan" ~/repos/trimnoir/_posts/ | xargs rg -l "outlier" | sort | posts --stdin
# 🎯 Target: MikeLev.in (Public) [Oldest First]

/home/mike/repos/trimnoir/_posts/2025-03-21-navigating-ai-influence.md  # [Idx: 1 | Order: 2 | Tokens: 8,059 | Bytes: 38,675]
/home/mike/repos/trimnoir/_posts/2025-03-23-local-ai-workflows-jupyter-nix.md  # [Idx: 2 | Order: 1 | Tokens: 11,270 | Bytes: 50,010]
/home/mike/repos/trimnoir/_posts/2025-03-31-dunning-kruger-deep-research-ai-seo.md  # [Idx: 3 | Order: 2 | Tokens: 20,990 | Bytes: 92,806]
/home/mike/repos/trimnoir/_posts/2025-04-13-modern-seo-techniques.md  # [Idx: 4 | Order: 1 | Tokens: 19,270 | Bytes: 87,220]
/home/mike/repos/trimnoir/_posts/2025-05-20-ai-workflow-automation-living-book.md  # [Idx: 5 | Order: 1 | Tokens: 12,203 | Bytes: 56,244]
/home/mike/repos/trimnoir/_posts/2025-09-21-antifragile-intelligence-scheduler-commitment.md  # [Idx: 6 | Order: 4 | Tokens: 8,872 | Bytes: 42,953]
/home/mike/repos/trimnoir/_posts/2025-09-26-unix-pipe-of-life.md  # [Idx: 7 | Order: 1 | Tokens: 11,439 | Bytes: 51,787]
/home/mike/repos/trimnoir/_posts/2025-10-12-ai-amnesia-chisel-strike-wet-beats-dry.md  # [Idx: 8 | Order: 5 | Tokens: 9,451 | Bytes: 43,340]
/home/mike/repos/trimnoir/_posts/2025-10-27-navigating-ai-web-urlinspector-ecommerce-variant-dilemma.md  # [Idx: 9 | Order: 3 | Tokens: 12,589 | Bytes: 51,459]
/home/mike/repos/trimnoir/_posts/2025-11-09-skyhook-method-deterministic-systems.md  # [Idx: 10 | Order: 1 | Tokens: 8,261 | Bytes: 37,575]
/home/mike/repos/trimnoir/_posts/2025-12-02-wright-brothers-ai-on-rails.md  # [Idx: 11 | Order: 1 | Tokens: 19,651 | Bytes: 80,911]
/home/mike/repos/trimnoir/_posts/2025-12-04-forever-machine-digital-sovereignty-ai.md  # [Idx: 12 | Order: 1 | Tokens: 3,612 | Bytes: 16,387]
/home/mike/repos/trimnoir/_posts/2026-01-04-surgical-sovereignty-phantom-scripts-specificity.md  # [Idx: 13 | Order: 1 | Tokens: 17,500 | Bytes: 72,621]
/home/mike/repos/trimnoir/_posts/2026-02-23-sovereign-perception-ai-web-eyes.md  # [Idx: 14 | Order: 2 | Tokens: 25,158 | Bytes: 104,039]
/home/mike/repos/trimnoir/_posts/2026-02-25-the-levinix-blueprint-ai-content-negotiation-moat.md  # [Idx: 15 | Order: 1 | Tokens: 48,706 | Bytes: 200,784]
/home/mike/repos/trimnoir/_posts/2026-03-08-holographic-context-engineering-ai-ready-semantic-maps-web-native-llms.md  # [Idx: 16 | Order: 2 | Tokens: 77,852 | Bytes: 246,239]
/home/mike/repos/trimnoir/_posts/2026-03-08-llmectomy-ai-agnosticism-nixos-python.md  # [Idx: 17 | Order: 4 | Tokens: 33,275 | Bytes: 141,958]
/home/mike/repos/trimnoir/_posts/2026-03-13-the-topological-healer-ai-driven-nginx-redirects-dynamic-site-architectures.md  # [Idx: 18 | Order: 4 | Tokens: 46,974 | Bytes: 179,174]
/home/mike/repos/trimnoir/_posts/2026-03-19-pipulate-full-stack-python-htmx-durable-state.md  # [Idx: 19 | Order: 1 | Tokens: 24,277 | Bytes: 110,130]
/home/mike/repos/trimnoir/_posts/2026-04-04-strange-loop-forever-machine-governing-ai-distillation.md  # [Idx: 20 | Order: 3 | Tokens: 25,003 | Bytes: 104,070]
/home/mike/repos/trimnoir/_posts/2026-05-06-infinite-context-pipeline-narrative-code.md  # [Idx: 21 | Order: 2 | Tokens: 18,801 | Bytes: 78,190]
/home/mike/repos/trimnoir/_posts/2026-05-13-desert-kite-wet-workflows-ai.md  # [Idx: 22 | Order: 5 | Tokens: 16,023 | Bytes: 76,813]
/home/mike/repos/trimnoir/_posts/2026-05-15-cathedral-of-one-prompt-fu-forever-machine.md  # [Idx: 23 | Order: 1 | Tokens: 30,102 | Bytes: 124,546]
/home/mike/repos/trimnoir/_posts/2026-06-01-player-piano-four-audio-interrupts.md  # [Idx: 24 | Order: 3 | Tokens: 30,148 | Bytes: 123,811]
/home/mike/repos/trimnoir/_posts/2026-06-27-forever-machine-robust-systems.md  # [Idx: 25 | Order: 3 | Tokens: 16,362 | Bytes: 77,226]
/home/mike/repos/trimnoir/_posts/2026-07-01-meeting-in-the-middle-engineering-ai-workflows.md  # [Idx: 26 | Order: 1 | Tokens: 38,581 | Bytes: 166,203]
/home/mike/repos/trimnoir/_posts/2026-07-02-unix-pipe-of-life-ai-workflows.md  # [Idx: 27 | Order: 4 | Tokens: 28,403 | Bytes: 117,738]
(nix) pipulate $

…should be reduced to being this simple:

rgx "black swan" "outlier"

I’m not sure what the proper form here would be and what latitude we have for mixing unquoted versions like:

rgx "black swan" outlier

Or

rgx outlier "black swan"

Would commas be appropriate here on the command-line? And we’re making a system that will have to be extensible to N n-grams, like so:

rgx outlier "black swan" "lamb shift"

Is this becoming clear?

The Logic of the Intersection Chain

Claude Sonnet 5 Medium: Context verified. Good instinct pushing this into a first-class alias — the two-arg case already proved out the pattern, the only real work is making the intersection chain loop over N terms instead of being hand-typed each time.

Cheapest falsifying probe: rgx "black swan" outlier run in a nix develop shell after the patch, confirmed against the known-good manual pipeline output you already have above (27 files, same paths).

Answering your quoting questions directly

Bash already solves this for you — no commas needed, and mixing quoted/unquoted is fine:

  • rgx outlier "black swan" — two positional args: outlier and black swan (one arg, because of the quotes holding the space together)
  • rgx "black swan" outlier — same two args, order doesn’t matter functionally
  • rgx outlier "black swan" "lamb shift" — three positional args, extends cleanly

Quotes are only needed when a term has a space in it. Single-word terms don’t need them. Commas would be actively wrong here — "black swan", would search for the literal string with a trailing comma. "$@" in bash already gives you a clean N-length array regardless of how many terms you pass, so the loop generalizes for free.
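To check the splitting rule directly, a tiny helper can print each positional argument on its own line. This is only a sketch for inspecting how bash tokenizes a call; the name show_args is hypothetical and it is not part of the patch.

```bash
# show_args: print each positional argument bash delivers, one per line,
# wrapped in brackets so embedded spaces and stray commas are visible.
# (Hypothetical helper for illustration; not part of flake.nix.)
show_args() {
  local arg
  for arg in "$@"; do
    printf '[%s]\n' "$arg"
  done
}

# Examples:
#   show_args outlier "black swan"      prints [outlier] then [black swan]
#   show_args "black swan", outlier     prints [black swan,] then [outlier]
```

The second example shows why a comma is harmful: it becomes part of the search term instead of acting as a separator.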

The function

rgx() {
  if [ "$#" -eq 0 ]; then
    echo "Usage: rgx TERM [TERM...]"
    return 1
  fi
  local posts_dir="$HOME/repos/trimnoir/_posts"
  local matches
  matches=$(rg -l -- "$1" "$posts_dir")
  shift
  for term in "$@"; do
    [ -z "$matches" ] && break
    matches=$(echo "$matches" | xargs rg -l -- "$term")
  done
  echo "$matches" | sort | posts --stdin
}

The -- before each search term is a small hardening you didn’t ask for but is worth having: it stops rg from interpreting a term that happens to start with - (e.g. rgx -1 outlier) as a flag. The early [ -z "$matches" ] && break matters once the intersection has already collapsed to zero files: with empty input, GNU xargs still runs rg once with no file arguments, so rg would search its default target instead of an empty set. The break is a cheap short-circuit for a long n-gram chain.
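One further hardening, sketched here under assumptions rather than taken from the patch: xargs splits its input on whitespace, so a post filename containing a space would be broken into pieces. The variant below keeps the candidate list in a bash array instead. The name rgx_safe and the explicit DIR argument are hypothetical, and it assumes bash 4+ for mapfile.

```bash
# rgx_safe: hypothetical whitespace-tolerant variant of rgx (bash 4+).
# Usage: rgx_safe DIR TERM [TERM...]
# Prints the sorted list of files under DIR that contain every TERM.
rgx_safe() {
  if [ "$#" -lt 2 ]; then
    echo "Usage: rgx_safe DIR TERM [TERM...]" >&2
    return 1
  fi
  local dir="$1" term
  local -a files=()
  shift
  # First term: search the whole directory, one matching path per array slot.
  mapfile -t files < <(rg -l -- "$1" "$dir")
  shift
  for term in "$@"; do
    # Stop once the intersection is empty; rg given no paths would fall back
    # to its default search target rather than an empty set.
    [ "${#files[@]}" -eq 0 ] && break
    # Passing the array directly avoids xargs re-splitting paths on spaces.
    mapfile -t files < <(rg -l -- "$term" "${files[@]}")
  done
  [ "${#files[@]}" -eq 0 ] && return 0
  printf '%s\n' "${files[@]}" | sort
}
```

The trade-off is that the whole file list is passed as arguments in one call, so a very large candidate set could hit the OS argument-length limit that xargs would otherwise batch around. Paths containing newlines are still not handled.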

Hardening the Tooling for Daily Use

Target: flake.nix
[[[SEARCH]]]
          slugs() { (cd ~/repos/pipulate && python scripts/articles/lsa.py -t 1 --slugs "$@" --fmt paths); }
          # slugs-ordered preserves input order for narrative control
          sluggo() { for slug in "$@"; do (cd ~/repos/pipulate && python scripts/articles/lsa.py -t 1 --match "$slug" --fmt paths); done; }
[[[DIVIDER]]]
          slugs() { (cd ~/repos/pipulate && python scripts/articles/lsa.py -t 1 --slugs "$@" --fmt paths); }
          # slugs-ordered preserves input order for narrative control
          sluggo() { for slug in "$@"; do (cd ~/repos/pipulate && python scripts/articles/lsa.py -t 1 --match "$slug" --fmt paths); done; }
          # rgx: N-gram intersection search across the article corpus.
          # rgx TERM [TERM...] -- quote multi-word terms, unquoted for single words.
          # Chains `rg -l` through each term, sorts, and hands the result to `posts --stdin`.
          rgx() {
            if [ "$#" -eq 0 ]; then
              echo "Usage: rgx TERM [TERM...]"
              return 1
            fi
            local posts_dir="$HOME/repos/trimnoir/_posts"
            local matches
            matches=$(rg -l -- "$1" "$posts_dir")
            shift
            for term in "$@"; do
              [ -z "$matches" ] && break
              matches=$(echo "$matches" | xargs rg -l -- "$term")
            done
            echo "$matches" | sort | posts --stdin
          }
[[[REPLACE]]]

Post-patch verification, once you’re back in a fresh nix develop:

rgx "black swan" outlier
rgx bootcamp true_schema_discoverer
rgx actuator "player piano" tool

The first two should reproduce the exact file lists you already validated by hand; the third is the deeper n-gram case that only rgx makes cheap to try. If a chain collapses to zero matches partway through, posts --stdin will just get empty input and print nothing — worth knowing that’s silent-but-correct rather than an error.

MikeLev.in: Okay, the patch:

$ git status
On branch main
Your branch is up to date with 'origin/main'.

nothing to commit, working tree clean
(nix) pipulate $ patch
(nix) pipulate $ cat patch | app
✅ DETERMINISTIC PATCH APPLIED: Successfully mutated 'flake.nix'.
(nix) pipulate $ d
diff --git a/flake.nix b/flake.nix
index 69d3397a..40bfa3ec 100644
--- a/flake.nix
+++ b/flake.nix
@@ -651,6 +651,24 @@ print(max(1, n))
           slugs() { (cd ~/repos/pipulate && python scripts/articles/lsa.py -t 1 --slugs "$@" --fmt paths); }
           # slugs-ordered preserves input order for narrative control
           sluggo() { for slug in "$@"; do (cd ~/repos/pipulate && python scripts/articles/lsa.py -t 1 --match "$slug" --fmt paths); done; }
+          # rgx: N-gram intersection search across the article corpus.
+          # rgx TERM [TERM...] -- quote multi-word terms, unquoted for single words.
+          # Chains `rg -l` through each term, sorts, and hands the result to `posts --stdin`.
+          rgx() {
+            if [ "$#" -eq 0 ]; then
+              echo "Usage: rgx TERM [TERM...]"
+              return 1
+            fi
+            local posts_dir="$HOME/repos/trimnoir/_posts"
+            local matches
+            matches=$(rg -l -- "$1" "$posts_dir")
+            shift
+            for term in "$@"; do
+              [ -z "$matches" ] && break
+              matches=$(echo "$matches" | xargs rg -l -- "$term")
+            done
+            echo "$matches" | sort | posts --stdin
+          }
           alias release='python release.py --release --force'
           alias g='clear && echo "$ git status" && git status'
           m() {
(nix) pipulate $ m
📝 Committing: refactor: Implement rgx n-gram intersection search across articles
[main 0de4b11e] refactor: Implement rgx n-gram intersection search across articles
 1 file changed, 18 insertions(+)
(nix) pipulate $

And we do the post-patch test:

(nix) pipulate $ exit
exit
(sys) pipulate $ ndq
warning: updating lock file '/home/mike/repos/pipulate/flake.lock':
• Added input 'flake-utils':
    'github:numtide/flake-utils/11707dc2f618dd54ca8739b309ec4fc024de578b?narHash=sha256-l0KFg5HjrsfsO/JpG%2Br7fRrqm12kzFHyUHqHCVpMMbI%3D' (2024-11-13)
• Added input 'flake-utils/systems':
    'github:nix-systems/default/da67096a3b9bf56a91d16901293e51ba5b49a27e?narHash=sha256-Vy1rq5AaRuLzOxct8nz4T6wlgyUR7zLU309k9mBC768%3D' (2023-04-09)
• Added input 'nixpkgs':
    'github:NixOS/nixpkgs/b5aa0fbd538984f6e3d201be0005b4463d8b09f8?narHash=sha256-oPXCU/SSUokcGaJREHibG1CBX3%2Bs/W7orDWQOZDsEeQ%3D' (2026-06-29)
(nix) pipulate $ rgx "black swan" outlier
# 🎯 Target: MikeLev.in (Public) [Oldest First]

/home/mike/repos/trimnoir/_posts/2025-03-21-navigating-ai-influence.md  # [Idx: 1 | Order: 2 | Tokens: 8,059 | Bytes: 38,675]
/home/mike/repos/trimnoir/_posts/2025-03-23-local-ai-workflows-jupyter-nix.md  # [Idx: 2 | Order: 1 | Tokens: 11,270 | Bytes: 50,010]
/home/mike/repos/trimnoir/_posts/2025-03-31-dunning-kruger-deep-research-ai-seo.md  # [Idx: 3 | Order: 2 | Tokens: 20,990 | Bytes: 92,806]
/home/mike/repos/trimnoir/_posts/2025-04-13-modern-seo-techniques.md  # [Idx: 4 | Order: 1 | Tokens: 19,270 | Bytes: 87,220]
/home/mike/repos/trimnoir/_posts/2025-05-20-ai-workflow-automation-living-book.md  # [Idx: 5 | Order: 1 | Tokens: 12,203 | Bytes: 56,244]
/home/mike/repos/trimnoir/_posts/2025-09-21-antifragile-intelligence-scheduler-commitment.md  # [Idx: 6 | Order: 4 | Tokens: 8,872 | Bytes: 42,953]
/home/mike/repos/trimnoir/_posts/2025-09-26-unix-pipe-of-life.md  # [Idx: 7 | Order: 1 | Tokens: 11,439 | Bytes: 51,787]
/home/mike/repos/trimnoir/_posts/2025-10-12-ai-amnesia-chisel-strike-wet-beats-dry.md  # [Idx: 8 | Order: 5 | Tokens: 9,451 | Bytes: 43,340]
/home/mike/repos/trimnoir/_posts/2025-10-27-navigating-ai-web-urlinspector-ecommerce-variant-dilemma.md  # [Idx: 9 | Order: 3 | Tokens: 12,589 | Bytes: 51,459]
/home/mike/repos/trimnoir/_posts/2025-11-09-skyhook-method-deterministic-systems.md  # [Idx: 10 | Order: 1 | Tokens: 8,261 | Bytes: 37,575]
/home/mike/repos/trimnoir/_posts/2025-12-02-wright-brothers-ai-on-rails.md  # [Idx: 11 | Order: 1 | Tokens: 19,651 | Bytes: 80,911]
/home/mike/repos/trimnoir/_posts/2025-12-04-forever-machine-digital-sovereignty-ai.md  # [Idx: 12 | Order: 1 | Tokens: 3,612 | Bytes: 16,387]
/home/mike/repos/trimnoir/_posts/2026-01-04-surgical-sovereignty-phantom-scripts-specificity.md  # [Idx: 13 | Order: 1 | Tokens: 17,500 | Bytes: 72,621]
/home/mike/repos/trimnoir/_posts/2026-02-23-sovereign-perception-ai-web-eyes.md  # [Idx: 14 | Order: 2 | Tokens: 25,158 | Bytes: 104,039]
/home/mike/repos/trimnoir/_posts/2026-02-25-the-levinix-blueprint-ai-content-negotiation-moat.md  # [Idx: 15 | Order: 1 | Tokens: 48,706 | Bytes: 200,784]
/home/mike/repos/trimnoir/_posts/2026-03-08-holographic-context-engineering-ai-ready-semantic-maps-web-native-llms.md  # [Idx: 16 | Order: 2 | Tokens: 77,852 | Bytes: 246,239]
/home/mike/repos/trimnoir/_posts/2026-03-08-llmectomy-ai-agnosticism-nixos-python.md  # [Idx: 17 | Order: 4 | Tokens: 33,275 | Bytes: 141,958]
/home/mike/repos/trimnoir/_posts/2026-03-13-the-topological-healer-ai-driven-nginx-redirects-dynamic-site-architectures.md  # [Idx: 18 | Order: 4 | Tokens: 46,974 | Bytes: 179,174]
/home/mike/repos/trimnoir/_posts/2026-03-19-pipulate-full-stack-python-htmx-durable-state.md  # [Idx: 19 | Order: 1 | Tokens: 24,277 | Bytes: 110,130]
/home/mike/repos/trimnoir/_posts/2026-04-04-strange-loop-forever-machine-governing-ai-distillation.md  # [Idx: 20 | Order: 3 | Tokens: 25,003 | Bytes: 104,070]
/home/mike/repos/trimnoir/_posts/2026-05-06-infinite-context-pipeline-narrative-code.md  # [Idx: 21 | Order: 2 | Tokens: 18,801 | Bytes: 78,190]
/home/mike/repos/trimnoir/_posts/2026-05-13-desert-kite-wet-workflows-ai.md  # [Idx: 22 | Order: 5 | Tokens: 16,023 | Bytes: 76,813]
/home/mike/repos/trimnoir/_posts/2026-05-15-cathedral-of-one-prompt-fu-forever-machine.md  # [Idx: 23 | Order: 1 | Tokens: 30,102 | Bytes: 124,546]
/home/mike/repos/trimnoir/_posts/2026-06-01-player-piano-four-audio-interrupts.md  # [Idx: 24 | Order: 3 | Tokens: 30,148 | Bytes: 123,811]
/home/mike/repos/trimnoir/_posts/2026-06-27-forever-machine-robust-systems.md  # [Idx: 25 | Order: 3 | Tokens: 16,362 | Bytes: 77,226]
/home/mike/repos/trimnoir/_posts/2026-07-01-meeting-in-the-middle-engineering-ai-workflows.md  # [Idx: 26 | Order: 1 | Tokens: 38,581 | Bytes: 166,203]
/home/mike/repos/trimnoir/_posts/2026-07-02-unix-pipe-of-life-ai-workflows.md  # [Idx: 27 | Order: 4 | Tokens: 28,403 | Bytes: 117,738]
(nix) pipulate $ rgx bootcamp true_schema_discoverer
# 🎯 Target: MikeLev.in (Public) [Oldest First]

/home/mike/repos/trimnoir/_posts/2025-09-22-ai-ergonomics-developer-workflows.md  # [Idx: 1 | Order: 5 | Tokens: 14,386 | Bytes: 53,222]
/home/mike/repos/trimnoir/_posts/2025-09-22-ai-architectural-xrays-code-pruning.md  # [Idx: 2 | Order: 6 | Tokens: 16,819 | Bytes: 106,025]
/home/mike/repos/trimnoir/_posts/2025-09-23-git-repo-gardening-surgical-history.md  # [Idx: 3 | Order: 2 | Tokens: 24,032 | Bytes: 87,401]
/home/mike/repos/trimnoir/_posts/2026-02-20-engineered-context-llm-limits.md  # [Idx: 4 | Order: 6 | Tokens: 55,905 | Bytes: 194,541]
/home/mike/repos/trimnoir/_posts/2026-03-11-self-auditing-ai-context-compiler.md  # [Idx: 5 | Order: 5 | Tokens: 30,509 | Bytes: 109,354]
/home/mike/repos/trimnoir/_posts/2026-05-05-forever-machine-deterministic-ai-code-editing.md  # [Idx: 6 | Order: 3 | Tokens: 26,255 | Bytes: 102,183]
/home/mike/repos/trimnoir/_posts/2026-06-02-grumbling-graybeard-local-credentials.md  # [Idx: 7 | Order: 1 | Tokens: 32,652 | Bytes: 129,029]
/home/mike/repos/trimnoir/_posts/2026-06-20-curing-ai-blindness-patching-ai-py.md  # [Idx: 8 | Order: 3 | Tokens: 74,900 | Bytes: 302,509]
/home/mike/repos/trimnoir/_posts/2026-06-24-retargetable-publishing-nervous-system.md  # [Idx: 9 | Order: 3 | Tokens: 11,836 | Bytes: 50,438]
/home/mike/repos/trimnoir/_posts/2026-07-02-unix-pipe-of-life-ai-workflows.md  # [Idx: 10 | Order: 4 | Tokens: 28,403 | Bytes: 117,738]
(nix) pipulate $ rgx actuator "player piano" tool
# 🎯 Target: MikeLev.in (Public) [Oldest First]

/home/mike/repos/trimnoir/_posts/2025-09-22-code-is-truth-actuator-ai.md  # [Idx: 1 | Order: 1 | Tokens: 20,359 | Bytes: 93,834]
/home/mike/repos/trimnoir/_posts/2026-02-24-cybernetic-software-architecture-llms-semantic-governors.md  # [Idx: 2 | Order: 3 | Tokens: 86,034 | Bytes: 353,634]
/home/mike/repos/trimnoir/_posts/2026-03-05-self-healing-ai-404-redirects-nginx-nixos.md  # [Idx: 3 | Order: 1 | Tokens: 48,718 | Bytes: 184,658]
/home/mike/repos/trimnoir/_posts/2026-03-05-topological-healer-ai-automated-redirects.md  # [Idx: 4 | Order: 3 | Tokens: 50,190 | Bytes: 190,069]
/home/mike/repos/trimnoir/_posts/2026-03-06-automated-404-healing-semantic-router.md  # [Idx: 5 | Order: 6 | Tokens: 48,777 | Bytes: 182,777]
/home/mike/repos/trimnoir/_posts/2026-03-10-single-pass-llm-optics-engine-causal-fidelity.md  # [Idx: 6 | Order: 6 | Tokens: 8,246 | Bytes: 37,213]
/home/mike/repos/trimnoir/_posts/2026-03-11-idempotent-token-ledger-llm-context.md  # [Idx: 7 | Order: 4 | Tokens: 33,745 | Bytes: 121,068]
/home/mike/repos/trimnoir/_posts/2026-03-13-the-topological-healer-ai-driven-nginx-redirects-dynamic-site-architectures.md  # [Idx: 8 | Order: 4 | Tokens: 46,974 | Bytes: 179,174]
/home/mike/repos/trimnoir/_posts/2026-03-18-bauhaus-bouncehouse-immutable-ai-environments.md  # [Idx: 9 | Order: 4 | Tokens: 23,948 | Bytes: 100,209]
/home/mike/repos/trimnoir/_posts/2026-03-25-the-forever-machine-pipulates-state-driven-workflows-beyond-cloud-dependency.md  # [Idx: 10 | Order: 2 | Tokens: 25,598 | Bytes: 106,331]
/home/mike/repos/trimnoir/_posts/2026-03-30-the-invisible-handshake-http-content-negotiation-llmo.md  # [Idx: 11 | Order: 2 | Tokens: 28,308 | Bytes: 125,576]
/home/mike/repos/trimnoir/_posts/2026-04-03-automated-book-decanting-ai-editing-pipeline.md  # [Idx: 12 | Order: 1 | Tokens: 47,656 | Bytes: 191,417]
/home/mike/repos/trimnoir/_posts/2026-04-10-rigging-the-sail-of-intelligence.md  # [Idx: 13 | Order: 1 | Tokens: 33,205 | Bytes: 138,218]
/home/mike/repos/trimnoir/_posts/2026-04-27-sentinel-file-progressive-ui.md  # [Idx: 14 | Order: 2 | Tokens: 24,311 | Bytes: 97,050]
/home/mike/repos/trimnoir/_posts/2026-04-29-mechanical-sympathy-deterministic-ai.md  # [Idx: 15 | Order: 2 | Tokens: 18,515 | Bytes: 85,760]
/home/mike/repos/trimnoir/_posts/2026-05-03-building-the-forever-machine.md  # [Idx: 16 | Order: 1 | Tokens: 34,849 | Bytes: 147,559]
/home/mike/repos/trimnoir/_posts/2026-05-17-escape-postback-deterministic-ai-workflows.md  # [Idx: 17 | Order: 2 | Tokens: 18,207 | Bytes: 83,548]
/home/mike/repos/trimnoir/_posts/2026-05-21-vim-transposition-human-ai-patchmaker.md  # [Idx: 18 | Order: 5 | Tokens: 17,873 | Bytes: 76,331]
/home/mike/repos/trimnoir/_posts/2026-05-22-tale-of-two-alans-turing-church-local-ai.md  # [Idx: 19 | Order: 2 | Tokens: 40,847 | Bytes: 179,433]
/home/mike/repos/trimnoir/_posts/2026-05-22-air-gapped-developer-cockpit.md  # [Idx: 20 | Order: 3 | Tokens: 19,345 | Bytes: 84,875]
/home/mike/repos/trimnoir/_posts/2026-05-23-clipboard-control-bus-transactional-ai-workflows.md  # [Idx: 21 | Order: 1 | Tokens: 12,047 | Bytes: 52,791]
/home/mike/repos/trimnoir/_posts/2026-05-23-two-player-pianos-automation.md  # [Idx: 22 | Order: 2 | Tokens: 9,548 | Bytes: 47,385]
/home/mike/repos/trimnoir/_posts/2026-05-24-wax-seal-untokenized-checksums-ai.md  # [Idx: 23 | Order: 1 | Tokens: 31,851 | Bytes: 134,475]
/home/mike/repos/trimnoir/_posts/2026-05-24-ascii-integrity-software-drift.md  # [Idx: 24 | Order: 3 | Tokens: 8,139 | Bytes: 34,658]
/home/mike/repos/trimnoir/_posts/2026-05-27-waxascii-preventing-generative-drift.md  # [Idx: 25 | Order: 1 | Tokens: 14,803 | Bytes: 67,811]
/home/mike/repos/trimnoir/_posts/2026-05-27-wax-on-wax-off-waxascii.md  # [Idx: 26 | Order: 2 | Tokens: 13,237 | Bytes: 63,019]
/home/mike/repos/trimnoir/_posts/2026-05-27-out-of-band-integrity.md  # [Idx: 27 | Order: 3 | Tokens: 32,171 | Bytes: 177,956]
/home/mike/repos/trimnoir/_posts/2026-05-28-waxascii-seal-in-band-code-integrity.md  # [Idx: 28 | Order: 2 | Tokens: 21,387 | Bytes: 93,214]
/home/mike/repos/trimnoir/_posts/2026-05-29-building-the-take-off-ramp.md  # [Idx: 29 | Order: 2 | Tokens: 11,270 | Bytes: 49,840]
/home/mike/repos/trimnoir/_posts/2026-05-29-dopamine-canary-environment-boundaries-local-automation.md  # [Idx: 30 | Order: 3 | Tokens: 24,264 | Bytes: 106,225]
/home/mike/repos/trimnoir/_posts/2026-05-29-patronus-blueprint-popups-clipboard-pipelines.md  # [Idx: 31 | Order: 4 | Tokens: 24,467 | Bytes: 111,288]
/home/mike/repos/trimnoir/_posts/2026-05-31-magic-users-guide-ai-actuators-anti-churn.md  # [Idx: 32 | Order: 1 | Tokens: 31,076 | Bytes: 141,506]
/home/mike/repos/trimnoir/_posts/2026-05-31-return-of-viewstate-ai-chatbot-postback.md  # [Idx: 33 | Order: 3 | Tokens: 29,855 | Bytes: 126,142]
/home/mike/repos/trimnoir/_posts/2026-06-01-repo-talks-back-ai-context.md  # [Idx: 34 | Order: 1 | Tokens: 22,759 | Bytes: 90,877]
/home/mike/repos/trimnoir/_posts/2026-06-01-git-hooks-breaking-news-bell.md  # [Idx: 35 | Order: 2 | Tokens: 21,517 | Bytes: 89,743]
/home/mike/repos/trimnoir/_posts/2026-06-01-player-piano-four-audio-interrupts.md  # [Idx: 36 | Order: 3 | Tokens: 30,148 | Bytes: 123,811]
/home/mike/repos/trimnoir/_posts/2026-06-01-boring-way-epistemic-balance-anti-entropy-workspaces.md  # [Idx: 37 | Order: 4 | Tokens: 33,535 | Bytes: 145,184]
/home/mike/repos/trimnoir/_posts/2026-06-01-player-piano-markdown-sheet-music.md  # [Idx: 38 | Order: 5 | Tokens: 7,601 | Bytes: 33,841]
/home/mike/repos/trimnoir/_posts/2026-06-01-player-piano-voice-visual-sync.md  # [Idx: 39 | Order: 6 | Tokens: 28,283 | Bytes: 111,904]
/home/mike/repos/trimnoir/_posts/2026-06-01-designing-for-amnesia-deterministic-ai-pipeline.md  # [Idx: 40 | Order: 7 | Tokens: 9,900 | Bytes: 43,690]
/home/mike/repos/trimnoir/_posts/2026-06-02-local-first-security-context-compiler.md  # [Idx: 41 | Order: 2 | Tokens: 8,466 | Bytes: 41,289]
/home/mike/repos/trimnoir/_posts/2026-06-06-plain-text-control-plane.md  # [Idx: 42 | Order: 4 | Tokens: 26,756 | Bytes: 101,212]
/home/mike/repos/trimnoir/_posts/2026-06-08-forest-and-trees-cybernetic-broadcast-engine.md  # [Idx: 43 | Order: 1 | Tokens: 32,021 | Bytes: 137,350]
/home/mike/repos/trimnoir/_posts/2026-06-08-rigging-the-vibe-autonomous-seamanship.md  # [Idx: 44 | Order: 2 | Tokens: 16,963 | Bytes: 79,147]
/home/mike/repos/trimnoir/_posts/2026-06-08-slaying-hardcoded-paths-portable-ai-dev.md  # [Idx: 45 | Order: 3 | Tokens: 18,707 | Bytes: 75,296]
/home/mike/repos/trimnoir/_posts/2026-06-08-player-piano-piper-tts-terminal-audio-bus.md  # [Idx: 46 | Order: 4 | Tokens: 9,074 | Bytes: 39,108]
/home/mike/repos/trimnoir/_posts/2026-06-09-attention-prosthesis-rebound-tax.md  # [Idx: 47 | Order: 2 | Tokens: 32,797 | Bytes: 141,437]
/home/mike/repos/trimnoir/_posts/2026-06-12-syncing-player-piano-audio-git-discipline.md  # [Idx: 48 | Order: 1 | Tokens: 17,254 | Bytes: 77,582]
/home/mike/repos/trimnoir/_posts/2026-06-12-unix-way-llm-surgical-precision.md  # [Idx: 49 | Order: 2 | Tokens: 12,227 | Bytes: 56,264]
/home/mike/repos/trimnoir/_posts/2026-06-13-vibe-coding-binary-search-debugging.md  # [Idx: 50 | Order: 2 | Tokens: 38,347 | Bytes: 149,667]
/home/mike/repos/trimnoir/_posts/2026-06-15-best-probe-is-deleted-dependency.md  # [Idx: 51 | Order: 1 | Tokens: 19,771 | Bytes: 83,416]
/home/mike/repos/trimnoir/_posts/2026-06-15-plain-text-epistemic-memory-ai.md  # [Idx: 52 | Order: 4 | Tokens: 27,785 | Bytes: 118,798]
/home/mike/repos/trimnoir/_posts/2026-06-20-llm-optics-progressive-web-inspection.md  # [Idx: 53 | Order: 2 | Tokens: 34,998 | Bytes: 139,981]
/home/mike/repos/trimnoir/_posts/2026-06-24-truth-actually-actuator-systems.md  # [Idx: 54 | Order: 1 | Tokens: 24,935 | Bytes: 105,169]
/home/mike/repos/trimnoir/_posts/2026-06-24-funnel-architecture-automation.md  # [Idx: 55 | Order: 2 | Tokens: 8,629 | Bytes: 38,717]
/home/mike/repos/trimnoir/_posts/2026-06-24-retargetable-publishing-nervous-system.md  # [Idx: 56 | Order: 3 | Tokens: 11,836 | Bytes: 50,438]
/home/mike/repos/trimnoir/_posts/2026-06-27-demoscene-compression-onboarding.md  # [Idx: 57 | Order: 2 | Tokens: 7,288 | Bytes: 33,123]
/home/mike/repos/trimnoir/_posts/2026-07-01-meeting-in-the-middle-engineering-ai-workflows.md  # [Idx: 58 | Order: 1 | Tokens: 38,581 | Bytes: 166,203]
/home/mike/repos/trimnoir/_posts/2026-07-01-solving-os-argument-limits-ai-pipelines.md  # [Idx: 59 | Order: 2 | Tokens: 14,108 | Bytes: 62,074]
/home/mike/repos/trimnoir/_posts/2026-07-01-operationalizing-the-edge-scaling-without-complexity.md  # [Idx: 60 | Order: 3 | Tokens: 17,635 | Bytes: 79,109]
/home/mike/repos/trimnoir/_posts/2026-07-02-safe-playground-declarative-machine.md  # [Idx: 61 | Order: 1 | Tokens: 24,901 | Bytes: 112,929]
/home/mike/repos/trimnoir/_posts/2026-07-02-unix-pipe-of-life-ai-workflows.md  # [Idx: 62 | Order: 4 | Tokens: 28,403 | Bytes: 117,738]
(nix) pipulate $

How about that?

Claude Sonnet 5 Medium: The rgx patch landed clean and all three post-patch tests confirm it’s working exactly as designed:

  • rgx "black swan" outlier → 27 files, matching your original manual pipeline exactly
  • rgx bootcamp true_schema_discoverer → 10 files, a clean two-term intersection
  • rgx actuator "player piano" tool → 62 files, proving the three-term chain generalizes correctly

Scaling Discovery Beyond Simple N-Grams

Context verified — no gaps here. This is a solid, self-contained win: rg -l ... | xargs rg -l ... | sort | posts --stdin collapsed into a single memorable command that scales to arbitrary N without any code changes needed for N=4, N=5, etc.

A few small observations, not action items unless you want them:

The commit type is slightly off. m generated “refactor: Implement rgx n-gram intersection search across articles”. The description is accurate, though “feat” might fit better than “refactor” since this is net-new capability, not a restructuring of existing code. Not worth amending over, just noting it for pattern-recognition on future m commits.

Silent-empty-result behavior is confirmed but undocumented at the call site. You already know from our last exchange that a fully-collapsed intersection prints nothing rather than erroring. Given rgx is meant to become a daily-driver muscle-memory command, it might be worth a one-line comment or a [ -z "$matches" ] && echo "# No matches" >&2 before the final posts --stdin — purely a UX nicety so a zero-result search doesn’t look indistinguishable from a hung terminal. Your call on whether that’s worth the extra line.
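As a minimal sketch of that nicety, the check could live in a small pass-through filter placed just before posts --stdin. The helper name notify_if_empty is hypothetical; the notice goes to stderr so stdout stays clean for the next pipe stage.

```bash
# notify_if_empty: pass stdin through unchanged; if nothing non-blank came in,
# print a notice to stderr instead.
# (Hypothetical helper name; the "# No matches" text follows the suggestion above.)
notify_if_empty() {
  local input
  # Command substitution strips trailing newlines, so a lone blank line
  # (what `echo ""` produces) counts as empty.
  input=$(cat)
  if [ -z "$input" ]; then
    echo "# No matches" >&2
    return 0
  fi
  printf '%s\n' "$input"
}

# Possible use inside rgx (sketch):
#   echo "$matches" | sort | notify_if_empty | posts --stdin
```

This buffers the whole list in memory, which is fine for a list of file paths but would not suit a large stream.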

Natural next step, if you want one: sluggo still does the ordered, one-slug-at-a-time loop pattern for narrative sequencing — rgx is now the disjoint sibling for ad hoc corpus-wide n-gram discovery. The two together cover “give me these specific articles in this order” and “find me whatever intersects these concepts” respectively. No further work implied by that; just flagging that the alias surface in flake.nix now has a clean symmetry that’s worth preserving if either function gets touched again.

Nothing here needs action from me unless you want the silent-empty-result nicety patched in.

MikeLev.in: I like how it cross-referenced with rhyming functionally I wrote before. Insightful and it’s going to make me look closer, because this is the key to the choose your own adventure drill-down process with just the right amount of stuff making up for not having yet what vector embedding so poorly tries to do. This is so much better. And just wait until we shoe those file’s already built _context/ holographic shards interleaved there, and maybe the region around the keyword-hits in the file the way rg itself does. Can you imagine a human or LLM’s ability to explore a topic in one of these Jekyll blogs like Trimnoir? How efficient it could be? How few turns-steps of a system like this it could require to build the perfect context-payload for nearly assured implementation plans every time, when you finally do get to the step where it has everything needed for a high probability successful patch?

Claude Sonnet 5 Medium: Context verified. You’ve spotted the actual leverage point here, so let’s dwell on it a second before jumping to code.

Why this beats embeddings for your use case

Vector search gives you semantic proximity: “these chunks are shaped like your query.” What rgx gives you is provable co-occurrence: “these specific files contain all these specific terms.” For a corpus where you (the author) already know the vocabulary you used when you wrote about something, exact or regex-based lexical intersection is higher-precision and zero-cost to explain. No embedding drift, no model to keep in sync with a growing corpus, no “why did it surface this unrelated article” mystery. It’s rg’s determinism riding on your own naming discipline as an author, which is exactly the kind of asset an embedding index can’t give you because it doesn’t know your intent, only your words’ statistical neighborhood.

Lexical Precision as a Competitive Advantage

The three-part combo you’re describing is basically building a lexical RAG pipeline out of Unix pipes, and each piece maps to a stage real RAG systems pay for with infrastructure:

RAG concept | Your equivalent
Retrieval (find candidate docs) | rgx TERM [TERM...]
Chunking / passage extraction | rg-style context lines around the hit, instead of full-file dumps
Metadata / summary layer | The _context/*.json holographic shards (kw, sub, s)
Reranking / relevance signal | Number of matched files × hit density per file
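The reranking row is the only one with no tool behind it yet. One rough way to approximate “hit density per file”, offered as a sketch and not as anything in the current pipeline, is to count matching lines per file with rg -c and sort on that count. The helper name rank_by_hits is hypothetical, and the matching-line count is only a stand-in for a real relevance score.

```bash
# rank_by_hits: read file paths on stdin (one per line) and print
# "COUNT<TAB>PATH", highest count first, where COUNT is the number of lines
# in that file matching any TERM.
# (Hypothetical helper; line count is a crude proxy for hit density.)
rank_by_hits() {
  if [ "$#" -eq 0 ]; then
    echo "Usage: rank_by_hits TERM [TERM...]  (paths on stdin)" >&2
    return 1
  fi
  local -a pattern_args=()
  local term file count
  for term in "$@"; do
    pattern_args+=(-e "$term")   # multiple -e patterns are OR-ed by rg
  done
  while IFS= read -r file; do
    [ -n "$file" ] || continue
    # rg -c prints the matching-line count and prints nothing when it is zero,
    # hence the ${count:-0} default below.
    count=$(rg -c --no-filename "${pattern_args[@]}" -- "$file" </dev/null)
    printf '%s\t%s\n' "${count:-0}" "$file"
  done | sort -t $'\t' -k1,1nr -k2,2
}

# Possible use (sketch): feed it any newline-separated list of paths, e.g.
#   rg -l -- "black swan" "$HOME/repos/trimnoir/_posts" | rank_by_hits "black swan" outlier
```

Because it runs one rg process per file, this suits a final shortlist of a few dozen files, not the whole corpus.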

The concrete next step: rgx with context + shard interleaving

What you’re picturing is something like rgx -C 3 "black swan" outlier that, instead of (or in addition to) the file list, shows:

  1. The matched lines with -C N surrounding context (rg already does this per-file — the trick is doing it after the intersection, so you’re not wasting context window showing full files for a 27-file hit)
  2. The holographic shard’s s (summary) and kw/sub fields injected right above each file’s snippet block, so the reader gets “here’s what this article is about” before “here’s the sentence where your terms actually collide”

That’s a genuinely different artifact than what posts --stdin produces today. Right now rgx narrows which files matter; the next layer would narrow which paragraphs matter and why the file is relevant at all — three altitudes of context in one command, most of it already sitting on disk from your existing pipeline (contextualizer.py already wrote the shards, rg already knows how to do context windows).

What it would take

Roughly:

  • A --context N (or reuse -C) flag threaded into the rgx shell function, changing rg -l (list only) to rg -C N (list + snippet) for the final intersection pass only — you still want -l for the narrowing steps, since the intermediate xargs rg -l calls are pure set-intersection and don’t need snippets yet.
  • A small Python step (probably belongs in lsa.py or a new tiny script, since common.py/lsa.py already know how to resolve _context/<stem>.json per-.md file) that takes the final file list, looks up each shard, and prints [date] title — kw: ... — sub: ... as a one-line header before that file’s rg snippet block.
  • Since the intersection logic itself is bash and the shard-lookup logic is Python, the natural seam is: bash does the narrowing (fast, no Python startup cost per term), then hands the final file list to a Python script for the “decorate with shard + snippet” pass — mirroring how rgx already hands off to posts --stdin at the end.
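The snippet half of that plan can be sketched in bash alone, assuming the narrowing has already happened and the final file list arrives on stdin (the same hand-off shape as posts --stdin). The name snip_stdin is hypothetical, and the shard lookup is deliberately left out because its location and schema are not shown here.

```bash
# snip_stdin: read the already-narrowed file list on stdin and print every
# line matching any TERM, with N lines of surrounding context.
# Usage: <file list> | snip_stdin CONTEXT_LINES TERM [TERM...]
# (Hypothetical helper; only the final "snippet" pass, no shard decoration.)
snip_stdin() {
  if [ "$#" -lt 2 ]; then
    echo "Usage: snip_stdin CONTEXT_LINES TERM [TERM...]  (paths on stdin)" >&2
    return 1
  fi
  local ctx="$1" term file
  local -a files=() pattern_args=()
  shift
  for term in "$@"; do
    pattern_args+=(-e "$term")   # match lines containing any of the terms
  done
  # Collect non-blank paths; blank lines (e.g. from an empty result) are skipped.
  while IFS= read -r file; do
    [ -n "$file" ] && files+=("$file")
  done
  if [ "${#files[@]}" -eq 0 ]; then
    echo "# No matches" >&2
    return 0
  fi
  # -n: line numbers, -H: filename on every line, -C: context lines.
  rg -n -H -C "$ctx" "${pattern_args[@]}" -- "${files[@]}"
}

# Possible use (sketch):
#   rg -l -- "black swan" "$HOME/repos/trimnoir/_posts" | xargs rg -l -- outlier \
#     | sort | snip_stdin 3 "black swan" outlier
```

Keeping the snippet pass separate from the narrowing means the intermediate steps stay pure set-intersection with -l, and context lines are only paid for on the files that survive.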

This is a good-sized but bounded next chisel-strike — not a full-file rewrite, more like a new sibling function (rgxc? rgx --deep?) or a --snippets flag on the existing one, backed by a new ~40-line Python helper.

Want me to sketch that helper now, or do you want to first decide the CLI shape (separate command vs. flag) before I commit to an interface?

MikeLev.in: Not quite yet. I’m tired and I’m going to sleep. I will use latest tomorrow to rail-gun us back into motion.


Book Analysis

AI Editorial Take

What surprised me most is how this entry inadvertently validates the ‘Unix Philosophy’ as the ultimate RAG (Retrieval-Augmented Generation) engine. The article demonstrates that by using the filesystem as a database, the developer gains an auditability that AI model training lacks. It shifts the AI’s role from ‘oracle’ to ‘co-processor’.

🐦 X.com Promo Tweet

Tired of semantic search returning irrelevant results? I built an 'rgx' alias that turns my local article library into a deterministic, high-speed intersection engine. Simple, repeatable, and fast. Learn how to optimize your AI-assisted knowledge base: https://mikelev.in/futureproof/unix-pipe-ai-workflows/ #AI #Unix #LocalFirst

Title Brainstorm

  • Title Option: Engineering Precision: The Unix Pipe of AI Workflows
    • Filename: unix-pipe-ai-workflows.md
    • Rationale: Direct, professional, and highlights the synthesis of classic computing and modern AI.
  • Title Option: The Deterministic Search Pipeline
    • Filename: deterministic-search-pipeline.md
    • Rationale: Focuses on the reliability of the system, which is a major draw for developers.
  • Title Option: Beyond Embeddings: Lexical Intersections for AI
    • Filename: beyond-embeddings-lexical.md
    • Rationale: Positions the method as a thoughtful alternative to current industry trends.

Content Potential And Polish

  • Core Strengths:
    • Strong proof-of-concept with clear CLI utility.
    • Excellent use of personal narrative to ground technical innovation.
    • Clear, logical progression from manual command to automated script.
  • Suggestions For Polish:
    • Add a brief ‘Getting Started’ section for users without a pre-existing _posts directory.
    • Standardize the visual formatting of terminal outputs for better readability.
    • Consider adding a ‘Troubleshooting’ note regarding shell environment variables.

Next Step Prompts

  • Develop a Python helper script to interleave local context shards (_context/*.json) into the final rgx output stream.
  • Explore the creation of a ‘preview’ mode for the rgx command that displays top-k snippet results using the existing terminal audio-bus infrastructure.