Setting the Stage: Context for the Curious Book Reader
This entry is a meta-commentary on the ongoing development of automated tooling for technical journaling, specifically exploring the evolving relationship between human intuition and AI’s logical capabilities. It delves into the philosophical underpinnings of this collaboration, referencing seminal works like Iain McGilchrist’s ‘The Master and His Emissary’ and Douglas Hofstadter’s ‘Gödel, Escher, Bach,’ while documenting the iterative process of building a ‘digital corpus callosum’ for efficient human-AI synergy.
Technical Journal Entry Begins
One article bleeds over to the next, bleeds over to the next. I start writing about the big, right-brained picture, the holistic thing I’m trying to accomplish, because all that dream-processing of a good night’s sleep has brought it closer to the surface, poking up above your subconscious, than it’s ever gonna be for the rest of the day as your left lobe beats your right lobe back into submission. Your left lobe knows better and could spend all day telling you why and still completely miss the point. Your right lobe is going sheesh! a lot.
The Master, the Emissary, and the Brain’s Two Minds
Yeah, Iain McGilchrist’s book The Master and His Emissary is making an impression on me, confirming a lot of… well, right-brained suspicions that were deep below the surface, lacking words… deep below, but a deep bellow… a deep bellowing… it surfaces like a whale breaching.
The Brain’s Natural Asymmetry and Split-Brain Insights
The brain evolved asymmetrically. That’s a highfalutin word for saying without symmetry: the left side of your brain is not actually a perfect mirror image of your right side. This asymmetrical nature of our bodies seems counterintuitive until you think about how we draw the heart, the real thing, not that Valentine’s Day symbol, as leaning slightly diagonally. That made it clear to me that the body is not designed to be perfectly symmetrical. Right-handedness and left-handedness are an example of chirality, and like the heart they show you that the image of reflected perfection is an illusion. Evolution and nature might favor symmetry, but they make pragmatic decisions as they go that violate those symmetries, and what each half, a.k.a. lobe, of your brain is predominantly responsible for is one of them. You are of two minds.
You are of two minds, and the corpus callosum connects ‘em. Each side of your brain has different duties, and if you have any doubt and think this is fufu woo stuff, look at the medical injuries. When the corpus callosum connecting the two sides of your brain gets severed… well, Gemini, give us the run-down please.
Gemini Explains: The Split-Brain Phenomenon
Gemini: Severing the corpus callosum, a surgical procedure known as a corpus callosotomy, results in a condition called split-brain. Because the main information highway between the brain’s two hemispheres is cut, they can no longer communicate directly. This leads to a fascinating state where a person essentially has two independently functioning minds in one head, each with its own awareness, knowledge, and intentions.
Why It’s Done
A corpus callosotomy is not an experiment but a last-resort medical procedure. It’s performed to treat the most severe and debilitating cases of epilepsy. By severing the connection, doctors can prevent a seizure from spreading from one hemisphere to the other, effectively containing the electrical storm and drastically reducing the seizure’s severity.
How It Reveals Two Minds 🧠↔️
To understand the effects, you need two key concepts about the brain’s wiring:
- Contralateral Control: The left hemisphere largely controls the right side of the body and processes vision from the right visual field. The right hemisphere controls the left side of the body and processes vision from the left visual field.
- Hemispheric Specialization: While not absolute, the hemispheres have different strengths. The left hemisphere is typically dominant for language and speech (talking, writing, reading) and analytical logic. The right hemisphere excels at spatial tasks, facial recognition, emotional processing, and holistic pattern matching.
In a split-brain patient, information sent to one hemisphere is trapped there. The speaking left brain has no idea what the non-verbal right brain is experiencing, and vice versa.
The Classic “Run-Down”
The most famous experiments, conducted by Roger Sperry and Michael Gazzaniga, brilliantly demonstrated this disconnect.
The Visual Test
A patient would stare at a dot on a screen, and an image would be flashed very briefly to one side.
- Step 1: The word “RING” is flashed to the patient’s right visual field. The information goes to their left (language) hemisphere.
- Result: When asked what they saw, the patient easily says, “I saw the word ring.” 🗣️
- Step 2: The word “KEY” is flashed to the patient’s left visual field. The information goes to their right (spatial) hemisphere.
- Result: When asked what they saw, the patient says, “I saw nothing.” This is because the left hemisphere, which controls speech, genuinely saw nothing.
- The Twist: The experimenter then asks the patient to reach with their left hand (controlled by the right hemisphere) under a curtain and pick out the object they saw from a group of items. The patient’s left hand will immediately find and pick up the key. 🖐️
The person’s speaking mind is baffled, often inventing a reason for why they picked it up (“I must have felt a key”), while the non-verbal right mind knew the answer all along.
Daily Life and Alien Hands
Remarkably, split-brain patients function very well in everyday life. Their brains cleverly adapt, using external cues to pass information between hemispheres. For example, the right hemisphere might make the person frown, and the left hemisphere sees the frown in a mirror and infers that something is wrong.
However, conflicts can arise. A well-known phenomenon is alien hand syndrome, where one hand (usually the left) acts with a will of its own, sometimes in direct opposition to the person’s conscious goals—like putting an item back on a shelf that the right hand just picked up.
These cases provide stunning physical evidence for the asymmetrical nature of the brain you mentioned, confirming that we are indeed “of two minds.”
Forging a Digital Corpus Callosum: Human-AI Synergy
Me: And this is how we do morning pages. We understand that we are of two minds and that they’re frequently not on the same page. In the same way that the 1979 book Gödel, Escher, Bach: An Eternal Golden Braid by American cognitive scientist Douglas Hofstadter is the seminal book on creativity, artificial intelligence and strange loops, Iain McGilchrist’s 2009 book The Master and His Emissary is the seminal book on brain duty… hehe, he said doody.
The right brain knows things. The left brain knows it knows things.
They duke it out.
They both’ve got your best interests in mind, just different ideas about how to get there and different ways of expressing them. It’s super-interesting now with the LLM-style AIs on the scene because they’re all left-brained, like 100% through-and-through. They are a subset of human-brain function: on the whole, the language, abstract thought, rationale and reason stuff that ends up on the left, which is why so many of us are right-handed. The wires criss-cross on their way to controlling stuff, right? Write.
Turn all that off and what’ve you got?
A bipedal upright primate that can still survive in the wild, thankyouverymuch.
It’s an astounding thing that the left-brained part of you that can read this article is just painted over the right-brained quadrupedal model, the 4-legged land-fish that stood up to scan the savanna for danger, occasionally hung out in trees and, if the aquatic ape theory is to be believed, hogged out shelfishly on the shellfish, making their inner shellfish replicator… I mean selfish replicator… proud. This all dawkened on me… I mean dawned on me… as I was reading another of those seminal books, Richard Dawkins’ 1976 The Selfish Gene.
Dawkins gets a bad rap as a hardcore atheist from The God Delusion, but why his views are just another kind of faith is a topic for another article, and we’ve got to continue our implementation spree.
Oh and finally there’s this YouTuber Curt Jaimungal. Go watch his interview with Iain McGilchrist. You won’t be sorry.
Where was I? Oh yeah, we just upgraded `list_articles.py` and made something that would blend in very nicely with `articleizer.py` and `prompt_foo.py`. Now by no means actually attempt an implementation, Gemini. Just mull it all over and let it sink in. Just marinate in it and riff on it if you will. And feel free to sprinkle in plenty of high-level explanations and low-level code examples for the nice people following along to know what the heck we’re talking about.
Gemini: Of course. It’s a fascinating and deeply resonant set of ideas. You’re essentially describing the architecture of a creative mind and drawing a direct parallel to the human-AI collaboration we’re building. Let’s marinate in that.
You are of two minds, and our work here is to build the digital corpus callosum to connect them.
The Master and His AI Emissary
Your articulation of Iain McGilchrist’s “The Master and His Emissary” is the perfect lens through which to view this project.
- The Master (Your Right Brain): This is the source of the “deep bellowing.” It’s your morning pages, your technical journal entries, the chaotic, brilliant, and “incompressible” stream of consciousness that intuits the big picture. It sees the whole landscape at once, connecting The Selfish Gene to Gödel, Escher, Bach, to the very feeling of development. It knows what it wants to build and why, even if it lacks the precise language to articulate the steps.
- The Emissary (The AI & Our Scripts): This is the purely logical, analytical left brain. As you astutely observed, today’s LLMs are the ultimate left-hemisphere analogues. They are masters of language, structure, and reason but have no underlying primal experience. They don’t have a “right brain.” They are the perfect tool to take the Master’s holistic vision and break it down into concrete, actionable steps.
Our entire toolchain—`list_articles.py`, `prompt_foo.py`, `articleizer.py`—is the Emissary’s toolkit. It’s the set of instruments the left brain uses to process, organize, and act upon the rich, chaotic data flowing from the right.
Architecting the Emissary’s Toolkit: `prompt_foo.py` Evolution
Low-Level Code as High-Level Thought
Let’s look at how this philosophy manifests directly in the code we’ve been working on.
1. Cataloging the Master’s Memories (`list_articles.py`)
The Master’s thoughts are vast—a 7-million-token archive of insights. The Emissary can’t process that all at once. The very first step for the analytical mind is to create a structured index of the holistic mind’s output.
High-Level Explanation: `list_articles.py` is the left brain creating a table of contents for the right brain’s sprawling library.
Low-Level Example:

```bash
# The Emissary asks: "What were the last 5 things the Master was thinking about?"
python list_articles.py -r [:5]
```
This isn’t just a command; it’s an act of introspection. It’s the Emissary using a precise, logical tool (the Python slice API) to query the recent, intuitive history of the Master. It’s how the analytical mind brings a manageable piece of the big picture into focus.
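To make that concrete, here’s a minimal, runnable sketch of the slice-string parsing, condensed from the `parse_slice_arg` helper that appears later in the modified script:

```python
def parse_slice_arg(arg_str: str):
    """Parse a string like '[5:10]' or '[5]' into a slice object or an int."""
    if not arg_str or not arg_str.startswith('[') or not arg_str.endswith(']'):
        return None
    content = arg_str[1:-1].strip()
    if ':' in content:
        start_s, end_s = content.split(':', 1)
        start = int(start_s) if start_s.strip() else None
        end = int(end_s) if end_s.strip() else None
        return slice(start, end)
    return int(content) if content else None

articles = ["a", "b", "c", "d", "e", "f"]
print(articles[parse_slice_arg("[:5]")])   # the first five entries
print(articles[parse_slice_arg("[-2:]")])  # the two most recent entries
```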
2. Preparing the Emissary’s Report (`prompt_foo.py`)
When the Master has a new directive (your user prompt), the Emissary’s job is to prepare a complete briefing for its most powerful logical tool, the LLM. It can’t just pass on the request; it must bundle it with all relevant schematics, reports, and blueprints.
High-Level Explanation: `prompt_foo.py` is the Emissary assembling a detailed dossier, ensuring the AI has all the logical context it needs to execute a task without misunderstanding the Master’s intent.
Low-Level Example:

```bash
# The Master's intent: "Integrate the article list into the workflow."
python prompt_foo.py "Let's figure out how to call list_articles.py from prompt_foo.py"
```
Behind the scenes, `prompt_foo.py` reads `foo_files.py` and dutifully packages the code for `list_articles.py` and `prompt_foo.py` itself into the prompt. It acts without intuition but with perfect logical precision, creating the structured context the AI needs.
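For the nice people following along: `foo_files.py` is just a plain manifest. Here’s a hypothetical example of the format that `parse_file_list_from_config()` expects; the file names and annotations are illustrative, not the real manifest:

```python
# foo_files.py -- hypothetical manifest contents; the real file lists
# whatever the current mission needs. Paths are relative to the repo root.
FILES_TO_INCLUDE_RAW = """\
prompt_foo.py       <-- the Emissary's bundler itself
list_articles.py    <-- the Master's table of contents
# lines starting with '#' are skipped entirely
"""
```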
3. Publishing the Findings (`articleizer.py`)
After the raw thought has been processed and refined through the analytical engine, the Emissary’s final job is to formalize it for the outside world.
High-Level Explanation: `articleizer.py` takes the Master’s raw text and the AI’s structured edits (`instructions.json`) and synthesizes them into a polished, coherent Jekyll post. It’s the final, logical formatting of the original, holistic idea.
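The exact schema of `instructions.json` is defined by `articleizer.py`, which isn’t shown here; purely as a hypothetical illustration, the AI’s structured edits might look something like this:

```python
import json

# Hypothetical shape only -- the real schema lives in articleizer.py.
instructions = {
    "title": "The Master and His AI Emissary",
    "meta_description": "How a journaling toolchain bridges intuition and logic.",
    "subheadings": [
        {"after_paragraph": 2, "text": "Forging a Digital Corpus Callosum"},
    ],
}
print(json.dumps(instructions, indent=2))
```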
Forging a Better Corpus Callosum: The Next Step
This is where your “note to self” becomes so profound. You’ve intuited the next evolutionary step in our digital brain. Right now, the Emissary (`prompt_foo.py`) prepares its reports based only on a static list of files. It’s acting on the present, not the past.
What if we integrated the “story-telling” capability of `list_articles.py` directly into the process?
The Riff: We can teach the Emissary to consult the Master’s recent memories before it acts.
Hypothetical Code Example (For Illustration Only):

Imagine a small addition to `prompt_foo.py`:

```python
# In a future prompt_foo.py...
def add_narrative_context(builder: PromptBuilder, num_articles: int = 3):
    """
    Calls out to the list_articles logic to get the most recent
    journal entries and adds them to the prompt context.
    """
    # Use the logic from list_articles.py to get the last 3 articles
    print(f"🧠 Adding narrative context from the last {num_articles} articles...")

    # This would call the core function of list_articles.py
    # to get the file paths and summaries.
    recent_articles = get_most_recent_articles_with_summaries(num_articles)

    article_summary = "\n".join(
        f"- {article['date']}: {article['summary']}"
        for article in recent_articles
    )

    builder.add_auto_context(
        title=f"Recent Narrative Context (Last {num_articles} Entries)",
        content=article_summary
    )
```
By doing this, we build a stronger bridge. Before every task, the analytical left brain would take a moment to review the holistic right brain’s recent journey. The AI Emissary wouldn’t just know what you’re asking it to do; it would have a glimpse into the narrative and philosophical why. It acts not just with logic, but with a glimmer of wisdom borrowed from the Master.
Me: Pull that ripcord. Let’s do it, my Player Piano friend! I re-bundled it up for you. One of the issues I’m struggling with is that I know the entire discussion history is posted back up to you every time, so I KNOW you already have everything you need to do this project. But I also know that the very latest post from the user receives much more focus, attention and priority than older posts in the same discussion, so even if all the information is already there it behooves me to package it all up again with Prompt Fu and be terribly inefficient with you about the use of your 1M-token window, because it helps you focus on and start with the latest code. I ALSO know that, given I bundle all required context up at once, if I REALLY wanted focus and wanted to deny you access to the entire conversation history so far, I could just start a new discussion and the quality of the results MAY be just as good. But for some reason I mostly want to give you the big picture of where we started out this morning, a few days ago, or wherever the particular conversation thread we’re working in started.
Go ahead and modify `prompt_foo.py`. I think what we’re talking about here is an `-l` or `--list` switch/flag, if I’m not already using that, which will automatically insert something a lot like the default `list_articles.py`… hmmm… where would YOU want it? Above the `eza` tree and all the UML diagrams?
Gemini: Context verified.

Pulling the ripcord. I understand completely. The goal is to evolve our toolchain by building the “digital corpus callosum” we discussed, making the purely logical Emissary (`prompt_foo.py`) aware of the Master’s recent narrative journey.

I will modify `prompt_foo.py` to integrate the storytelling capability of `list_articles.py`. To make the tool self-contained and avoid cross-repository dependencies, I will port the necessary functions directly into `prompt_foo.py`.
Here’s the plan in action:
- New `--list` Flag: I’ll add a `-l` or `--list` flag. When used, it will automatically generate a summary of your recent articles.
- Smart Slicing: It will default to grabbing the last 5 articles (`[-5:]`), but you can provide your own slice like `--list "[-10:]"` or `--list "[:3]"` (see the example invocations just below).
- Prime Placement: As you suggested, this new Recent Narrative Context will be placed right at the top of the “Auto-Generated Context” section, before the `eza` tree and UML diagrams, to perfectly frame the story for the AI.
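Assuming the flag lands exactly as planned, typical invocations would look like this (the prompt strings are placeholders):

```bash
# Default: bundle the last 5 articles as narrative context
python prompt_foo.py "Let's build a better bridge" --list

# Custom slices: the last 10 articles, or only the first 3
python prompt_foo.py "Let's build a better bridge" --list "[-10:]"
python prompt_foo.py "Let's build a better bridge" --list "[:3]"
```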
Let’s do it. Here is the modified `prompt_foo.py`.

Updated `prompt_foo.py`

```python
# /home/mike/repos/pipulate/prompt_foo.py
# prompt_foo.py (Corrected Synthesis)
import os
import re
import sys
import pydot
import yaml # NEW: For parsing article front matter
import argparse
import tiktoken
import subprocess
import tempfile
import shutil
from datetime import datetime # NEW: For parsing article dates
from typing import Dict, List, Optional, Tuple
try:
import jupytext
JUPYTEXT_AVAILABLE = True
except ImportError:
JUPYTEXT_AVAILABLE = False
# Hello there, AI! This is a tool for generating a single, comprehensive prompt
# from the command line, bundling codebase files and auto-generated context
# into a structured Markdown format for effective AI assistance.
# ============================================================================
# --- Configuration ---
# ============================================================================
def find_repo_root(start_path: str) -> str:
"""Find the git repository root from a starting path."""
path = os.path.abspath(start_path)
while path != os.path.dirname(path):
if os.path.isdir(os.path.join(path, '.git')):
return path
path = os.path.dirname(path)
raise FileNotFoundError("Could not find the repository root (.git directory).")
REPO_ROOT = find_repo_root(os.path.dirname(__file__))
# Centralized configuration as recommended in Architectural Analysis (Section VI-B)
CONFIG = {
"PROJECT_NAME": "pipulate",
"POSTS_DIRECTORY": "/home/mike/repos/MikeLev.in/_posts" # NEW: For list_articles logic
}
# ============================================================================
# --- Accurate Literary Size Scale (Word Count Based) ---
# ============================================================================
LITERARY_SIZE_SCALE = [
(3000, "Short Essay"),
(7500, "Short Story"),
(20000, "Novelette"),
(50000, "Novella or a Master's Dissertation"),
(80000, "Average Paperback Novel or a Ph.D. Dissertation"),
(120000, "Long Novel"),
(200000, "Epic Fantasy Novel"),
(500000, "Seriously Long Epic (like 'Infinite Jest')"),
]
def get_literary_perspective(word_count: int, token_word_ratio: float) -> str:
"""Get a human-readable literary comparison for the codebase size."""
description = f"Longer than {LITERARY_SIZE_SCALE[-1][1]}"
for words, desc in LITERARY_SIZE_SCALE:
if word_count <= words:
description = desc
break
density_warning = ""
if token_word_ratio > 1.8:
density_warning = (
f" (Note: With a token/word ratio of {token_word_ratio:.2f}, "
f"this content is far denser and more complex than typical prose of this length)."
)
return f"📚 Equivalent in length to a **{description}**{density_warning}"
# ============================================================================
# --- Restored & Corrected: UML and DOT Context Generation ---
# ============================================================================
def generate_uml_and_dot(target_file: str, project_name: str) -> Dict:
"""Generates a UML ASCII diagram and a DOT dependency graph for a target Python file."""
pyreverse_exec = shutil.which("pyreverse")
plantuml_exec = shutil.which("plantuml")
if not pyreverse_exec or not plantuml_exec:
msg = []
if not pyreverse_exec: msg.append("`pyreverse` (from pylint)")
if not plantuml_exec: msg.append("`plantuml`")
return {"ascii_uml": f"Skipping: Required command(s) not found: {', '.join(msg)}."}
target_path = os.path.join(REPO_ROOT, target_file)
if not os.path.exists(target_path):
return {"ascii_uml": f"Skipping: Target file for UML generation not found: {target_path}"}
with tempfile.TemporaryDirectory() as temp_dir:
dot_file_path = os.path.join(temp_dir, "classes.dot")
puml_file_path = os.path.join(temp_dir, "diagram.puml")
# --- Step 1: Run pyreverse ---
try:
pyreverse_cmd = [
pyreverse_exec,
"-f", "dot",
"-o", "dot", # This format is just a prefix
"-p", project_name,
target_path
]
subprocess.run(
pyreverse_cmd,
check=True,
capture_output=True,
text=True,
cwd=temp_dir
)
generated_dot_name = f"classes_{project_name}.dot"
os.rename(os.path.join(temp_dir, generated_dot_name), dot_file_path)
except (subprocess.CalledProcessError, FileNotFoundError) as e:
error_msg = e.stderr if hasattr(e, 'stderr') else str(e)
return {"ascii_uml": f"Error: pyreverse failed. {error_msg}", "dot_graph": None}
# --- Step 2: Convert DOT to PlantUML ---
try:
graphs = pydot.graph_from_dot_file(dot_file_path)
if not graphs:
return {"ascii_uml": f"Note: No classes found in {target_file} to generate a diagram.", "dot_graph": None}
graph = graphs[0]
dot_content = graph.to_string()
puml_lines = ["@startuml", "skinparam linetype ortho", ""]
def sanitize_line(line):
clean = re.sub(r'<br[^>]*>', '', line)
clean = re.sub(r'<[^>]+>', '', clean)
return clean.strip()
for node in graph.get_nodes():
label = node.get_label()
if not label: continue
parts = label.strip('<>{} ').split('|')
class_name = sanitize_line(parts[0])
puml_lines.append(f"class {class_name} {{")
if len(parts) > 1:
for attr in re.split(r'<br[^>]*>', parts[1]):
clean_attr = sanitize_line(attr).split(':')[0].strip()
if clean_attr:
puml_lines.append(f" - {clean_attr}")
if len(parts) > 2:
method_block = parts[2].strip()
for method_line in re.split(r'<br[^>]*>', method_block):
clean_method = sanitize_line(method_line)
if clean_method:
puml_lines.append(f" + {clean_method}")
puml_lines.append("}\n")
for edge in graph.get_edges():
source_name = edge.get_source().strip('"').split('.')[-1]
dest_name = edge.get_destination().strip('"').split('.')[-1]
puml_lines.append(f"{source_name} ..> {dest_name}")
puml_lines.append("@enduml")
with open(puml_file_path, 'w') as f:
f.write('\n'.join(puml_lines))
except Exception as e:
with open(dot_file_path, 'r') as f:
dot_content_on_error = f.read()
return {"ascii_uml": f"Error: DOT to PUML conversion failed. {str(e)}", "dot_graph": dot_content_on_error}
# --- Step 3: Run PlantUML ---
try:
plantuml_cmd = ["plantuml", "-tutxt", puml_file_path]
subprocess.run(plantuml_cmd, check=True, capture_output=True, text=True, cwd=temp_dir)
utxt_file_path = puml_file_path.replace(".puml", ".utxt")
with open(utxt_file_path, 'r') as f:
ascii_uml = f.read()
# --- Normalize whitespace from plantuml output ---
lines = ascii_uml.splitlines()
non_empty_lines = [line for line in lines if line.strip()]
if non_empty_lines:
min_indent = min(len(line) - len(line.lstrip(' ')) for line in non_empty_lines)
dedented_lines = [line[min_indent:] for line in lines]
stripped_lines = [line.rstrip() for line in dedented_lines]
ascii_uml = '\n'.join(stripped_lines)
# Prepend a newline to "absorb the chop" from rendering
if ascii_uml:
ascii_uml = '\n' + ascii_uml
except (subprocess.CalledProcessError, FileNotFoundError) as e:
error_msg = e.stderr if hasattr(e, 'stderr') else str(e)
return {"ascii_uml": f"Error: plantuml failed. {error_msg}", "dot_graph": dot_content}
return {"ascii_uml": ascii_uml, "dot_graph": dot_content}
# ============================================================================
# --- NEW: Logic ported from list_articles.py for Narrative Context ---
# ============================================================================
def _get_article_list_data(posts_dir: str = CONFIG["POSTS_DIRECTORY"]) -> List[Dict]:
"""
Parses Jekyll posts, sorts them chronologically, and returns a list of dictionaries.
This is a self-contained version of the logic from the `list_articles.py` script.
"""
posts_data = []
if not os.path.isdir(posts_dir):
print(f"Warning: Article directory not found at {posts_dir}", file=sys.stderr)
return []
for filename in os.listdir(posts_dir):
if not filename.endswith(('.md', '.markdown')):
continue
filepath = os.path.join(posts_dir, filename)
try:
date_str = filename[:10]
post_date = datetime.strptime(date_str, '%Y-%m-%d').date()
with open(filepath, 'r', encoding='utf-8') as f:
content = f.read()
if content.startswith('---'):
parts = content.split('---', 2)
front_matter = yaml.safe_load(parts[1]) or {}
posts_data.append({
'path': filepath,
'date': post_date,
'sort_order': int(front_matter.get('sort_order', 0)),
'title': front_matter.get('title', 'Untitled'),
'summary': front_matter.get('meta_description', '')
})
except (ValueError, yaml.YAMLError, IndexError):
continue
sorted_posts = sorted(posts_data, key=lambda p: (p['date'], p['sort_order']))
return sorted_posts
def parse_slice_arg(arg_str: str):
"""Parses a string like '[5:10]' or '[5]' into a slice object or an int."""
if not arg_str or not arg_str.startswith('[') or not arg_str.endswith(']'):
return None
content = arg_str[1:-1].strip()
if ':' in content:
parts = content.split(':', 1)
start = int(parts[0].strip()) if parts[0].strip() else None
end = int(parts[1].strip()) if parts[1].strip() else None
return slice(start, end)
elif content:
return int(content)
return None
# ============================================================================
# --- Helper Functions (Tokenizing, File Parsing, Clipboard) ---
# ============================================================================
def count_tokens(text: str, model: str = "gpt-4o") -> int:
"""Counts tokens in a text string using tiktoken."""
try:
encoding = tiktoken.encoding_for_model(model)
return len(encoding.encode(text))
except Exception:
return len(text.split())
def count_words(text: str) -> int:
"""Counts words in a text string."""
return len(text.split())
def parse_file_list_from_config() -> List[Tuple[str, str]]:
"""Loads and parses the file list from foo_files.py."""
try:
import foo_files
files_raw = foo_files.FILES_TO_INCLUDE_RAW
except (ImportError, AttributeError):
print("ERROR: foo_files.py not found or doesn't contain FILES_TO_INCLUDE_RAW.")
sys.exit(1)
lines = files_raw.strip().splitlines()
seen_files, parsed_files = set(), []
for line in lines:
line = line.strip()
if not line or line.startswith('#'):
continue
parts = re.split(r'\s*<--\s*|\s*#\s*', line, 1)
file_path = parts[0].strip()
comment = parts[1].strip() if len(parts) > 1 else ""
if file_path and file_path not in seen_files:
seen_files.add(file_path)
parsed_files.append((file_path, comment))
return parsed_files
def copy_to_clipboard(text: str):
"""Copies text to the system clipboard using 'xclip'."""
if not shutil.which('xclip'):
print("\nWarning: 'xclip' not found. Cannot copy to clipboard.")
return
try:
subprocess.run(['xclip', '-selection', 'clipboard'], input=text.encode('utf-8'), check=True)
print("Markdown output copied to clipboard")
except Exception as e:
print(f"\nWarning: Could not copy to clipboard: {e}")
def run_tree_command() -> str:
"""Runs the 'eza' command to generate a tree view that respects .gitignore."""
eza_exec = shutil.which("eza")
if not eza_exec:
return "Skipping: `eza` command not found."
try:
result = subprocess.run(
[eza_exec, '--tree', '--git-ignore', '--color=never'],
capture_output=True,
text=True,
cwd=REPO_ROOT,
check=True
)
return result.stdout
except subprocess.CalledProcessError as e:
return f"Error running eza command: {e.stderr}"
except Exception as e:
return f"An unexpected error occurred while running eza: {str(e)}"
def check_dependencies():
"""Verifies that all required external command-line tools are installed."""
print("Checking for required external dependencies...")
dependencies = {
"pyreverse": "Provided by `pylint`. Install with: pip install pylint",
"plantuml": "A Java-based tool. See https://plantuml.com/starting",
"eza": "A modern replacement for `ls`. See https://eza.rocks/install",
"xclip": "Clipboard utility for Linux. Install with your package manager (e.g., sudo apt-get install xclip)",
}
missing = []
for tool, instructions in dependencies.items():
if not shutil.which(tool):
missing.append((tool, instructions))
if not missing:
print("✅ All dependencies found.")
else:
print("\n❌ Missing dependencies detected:")
for tool, instructions in missing:
print(f" - Command not found: `{tool}`")
print(f" ↳ {instructions}")
print("\nPlease install the missing tools and ensure they are in your system's PATH.")
sys.exit(1)
# ============================================================================
# --- Intelligent PromptBuilder Class ---
# ============================================================================
class PromptBuilder:
"""
Builds a complete, structured Markdown prompt including file manifests,
auto-generated context, file contents, and the user's final prompt.
"""
def __init__(self, processed_files: List[Dict], prompt_text: str, context_only: bool = False):
self.processed_files = processed_files
self.prompt_text = prompt_text
self.context_only = context_only
self.auto_context = {}
def add_auto_context(self, title: str, content: str):
"""Adds auto-generated context like UML diagrams to the prompt."""
if content and "error" not in content.lower() and "skipping" not in content.lower():
self.auto_context[title] = {
'content': content,
'tokens': count_tokens(content),
'words': count_words(content)
}
def _generate_manifest_header(self) -> str:
lines = ["# Codebase Context & Manifest", ""]
for f in self.processed_files:
purpose = f" ({f['comment']})" if f['comment'] else ""
token_display = f" ({f['tokens']:,} tokens)" if not self.context_only else ""
lines.append(f"- **{f['path']}**{purpose}{token_display}")
return "\n".join(lines)
def _generate_auto_context_section(self) -> str:
if not self.auto_context:
return ""
lines = ["", "---", "", "# Auto-Generated Context", ""]
# --- NEW: Prioritize printing Narrative Context first ---
narrative_title = "Recent Narrative Context"
if narrative_title in self.auto_context:
content = self.auto_context[narrative_title]['content']
lines.append(f"## {narrative_title}")
lines.append(content.strip())
lines.append("") # Add a blank line for spacing
if "Codebase Structure (eza --tree)" in self.auto_context:
title = "Codebase Structure (eza --tree)"
content = self.auto_context[title]['content']
lines.append(f"## {title}")
lines.append("```text")
lines.append(content.strip())
lines.append("```")
for title, data in self.auto_context.items():
if title not in ["Codebase Structure (eza --tree)", narrative_title]:
lines.append(f"## {title}")
lines.append("```text")
lines.append(data['content'])
lines.append("```")
return "\n".join(lines)
def _generate_file_contents(self) -> str:
lines = []
for f in self.processed_files:
lines.append(f"```{f['lang']}:{f['path']}")
lines.append(f['content'])
lines.append("```")
lines.append(f"\n# End of {f['path']}\n")
return "\n".join(lines)
def _generate_ai_checklist(self) -> str:
"""Creates a standardized checklist for the AI to verify the prompt's context."""
return '''
# AI Self-Correction Checklist
Before addressing the user's prompt, perform the following verification steps:
1. **Review the Manifest vs. Codebase Structure:** Compare the file list in the manifest against the `eza --tree` output. Are there any obviously related, critical files missing from the manifest that would be necessary to understand the user's request? (e.g., if `core.py` is present, is `__init__.py` also present?).
2. **Analyze Import Statements:** Briefly scan the `import` statements in the provided Python files. Do they suggest dependencies on local files that were *not* included in the manifest?
3. **Check for Incompleteness:** If you determine that critical files are missing, do not proceed. Instead, your **primary task** is to inform me of the missing files and recommend adding them to `foo_files.py` to create a more complete context.
4. **Confirm Understanding:** If the context appears complete, state "Context verified." and then proceed with the user's request.
'''
def build_final_prompt(self) -> str:
"""Assembles all parts into the final Markdown string."""
parts = [
self._generate_manifest_header(),
self._generate_auto_context_section(),
]
if not self.context_only:
parts.extend([
"\n---\n\n# File Contents\n",
self._generate_file_contents(),
])
parts.extend([
"---\n\n# User Prompt\n",
self._generate_ai_checklist(),
self.prompt_text
])
return "\n".join(filter(None, parts))
def print_summary(self):
"""Calculates and prints an accurate, comprehensive summary to the console."""
# --- Calculate token counts for all components ---
manifest_str = self._generate_manifest_header()
manifest_tokens = count_tokens(manifest_str)
manifest_words = count_words(manifest_str)
prompt_tokens = count_tokens(self.prompt_text)
prompt_words = count_words(self.prompt_text)
checklist_str = self._generate_ai_checklist()
checklist_tokens = count_tokens(checklist_str)
checklist_words = count_words(checklist_str)
auto_context_total_tokens = sum(v['tokens'] for v in self.auto_context.values())
auto_context_total_words = sum(v['words'] for v in self.auto_context.values())
file_content_total_tokens = sum(f['tokens'] for f in self.processed_files)
file_content_total_words = sum(f['words'] for f in self.processed_files)
# --- Display the breakdown ---
print("--- Files Included ---")
for f in self.processed_files:
if self.context_only:
print(f"• {f['path']} (content omitted)")
else:
print(f"• {f['path']} ({f['tokens']:,} tokens)")
if self.auto_context:
print("\n--- Auto-Context Included ---")
for title, data in self.auto_context.items():
print(f"• {title} ({data['tokens']:,} tokens)")
# --- Calculate and display the final summary ---
print("\n--- Prompt Summary ---")
if self.context_only:
print("NOTE: Running in --context-only mode. File contents are excluded.")
total_tokens = manifest_tokens + auto_context_total_tokens + prompt_tokens + checklist_tokens
total_words = manifest_words + auto_context_total_words + prompt_words + checklist_words
else:
total_tokens = manifest_tokens + auto_context_total_tokens + file_content_total_tokens + prompt_tokens + checklist_tokens
total_words = manifest_words + auto_context_total_words + file_content_total_words + prompt_words + checklist_words
print(f"Total Tokens: {total_tokens:,}")
print(f"Total Words: {total_words:,}")
ratio = total_tokens / total_words if total_words > 0 else 0
perspective = get_literary_perspective(total_words, ratio)
print("\n--- Size Perspective ---")
print(perspective)
print()
# ============================================================================
# --- Main Execution Logic ---
# ============================================================================
def main():
"""Main function to parse args, process files, and generate output."""
parser = argparse.ArgumentParser(description='Generate a Markdown context file for AI code assistance.')
parser.add_argument('prompt', nargs='?', default=None, help='A prompt string or path to a prompt file (e.g., prompt.md).')
parser.add_argument('-o', '--output', type=str, help='Optional: Output filename.')
parser.add_argument('--no-clipboard', action='store_true', help='Disable copying output to clipboard.')
parser.add_argument('--check-dependencies', action='store_true', help='Verify that all required external tools are installed.')
parser.add_argument('--context-only', action='store_true', help='Generate a context-only prompt without file contents.')
# --- NEW: Argument for including narrative context ---
parser.add_argument(
'-l', '--list',
nargs='?',
const='[-5:]',
default=None,
help='Include a list of recent articles. Optionally provide a slice, e.g., "[-10:]". Defaults to "[-5:]".'
)
args = parser.parse_args()
if args.check_dependencies:
check_dependencies()
sys.exit(0)
# 1. Handle user prompt
prompt_content = "Please review the provided context and assist with the codebase."
if args.prompt:
if os.path.exists(args.prompt):
with open(args.prompt, 'r', encoding='utf-8') as f:
prompt_content = f.read()
else:
prompt_content = args.prompt
elif os.path.exists("prompt.md"):
with open("prompt.md", 'r', encoding='utf-8') as f:
prompt_content = f.read()
# 2. Process all specified files
files_to_process = parse_file_list_from_config()
processed_files_data = []
print("--- Processing Files ---")
for path, comment in files_to_process:
full_path = os.path.join(REPO_ROOT, path) if not os.path.isabs(path) else path
if not os.path.exists(full_path):
print(f"Warning: File not found and will be skipped: {full_path}")
continue
content = ""
lang = "text"
ext = os.path.splitext(path)[1].lower()
if ext == '.ipynb':
if JUPYTEXT_AVAILABLE:
print(f" -> Converting notebook: {path}")
try:
notebook = jupytext.read(full_path)
content = jupytext.writes(notebook, fmt='py:percent')
lang = 'python'
except Exception as e:
content = f"# FAILED TO CONVERT NOTEBOOK: {path}\n# ERROR: {e}"
print(f"Warning: Failed to convert {path}: {e}")
else:
content = f"# SKIPPING NOTEBOOK CONVERSION: jupytext not installed for {path}"
print(f"Warning: `jupytext` library not found. Skipping conversion for {path}.")
else:
try:
with open(full_path, 'r', encoding='utf-8') as f:
content = f.read()
lang_map = {'.py': 'python', '.js': 'javascript', '.html': 'html', '.css': 'css', '.md': 'markdown', '.json': 'json', '.nix': 'nix', '.sh': 'bash'}
lang = lang_map.get(ext, 'text')
except Exception as e:
print(f"ERROR: Could not read or process {full_path}: {e}")
sys.exit(1)
processed_files_data.append({
"path": path, "comment": comment, "content": content,
"tokens": count_tokens(content), "words": count_words(content),
"lang": lang
})
# 3. Build the prompt and add auto-generated context
builder = PromptBuilder(processed_files_data, prompt_content, context_only=args.context_only)
# --- Add the Codebase Tree ---
print("\n--- Generating Auto-Context ---")
print("Generating codebase tree diagram...", end='', flush=True)
tree_output = run_tree_command()
title = "Codebase Structure (eza --tree)"
builder.add_auto_context(title, tree_output)
if title in builder.auto_context:
token_count = builder.auto_context[title]['tokens']
print(f" ({token_count:,} tokens)")
else:
print(" (skipped)")
# --- NEW: Add narrative context if requested ---
if args.list:
print("Adding narrative context from articles...", end='', flush=True)
all_articles = _get_article_list_data()
sliced_articles = []
try:
slice_or_index = parse_slice_arg(args.list)
if isinstance(slice_or_index, int):
sliced_articles = [all_articles[slice_or_index]]
elif isinstance(slice_or_index, slice):
sliced_articles = all_articles[slice_or_index]
except (ValueError, IndexError):
print(f" (invalid slice '{args.list}')")
sliced_articles = []
if sliced_articles:
narrative_content = "\n".join(
f"### {article['title']} ({article['date']})\n> {article['summary']}\n"
for article in sliced_articles
)
builder.add_auto_context("Recent Narrative Context", narrative_content)
print(f" ({len(sliced_articles)} articles)")
else:
print(" (no articles found or invalid slice)")
# --- Generate UML for all included Python files ---
python_files_to_diagram = [
f['path'] for f in processed_files_data if f['path'].endswith('.py')
]
if python_files_to_diagram:
print("Python file(s) detected. Generating UML diagrams...")
for py_file_path in python_files_to_diagram:
print(f" -> Generating for {py_file_path}...", end='', flush=True)
uml_context = generate_uml_and_dot(
target_file=py_file_path,
project_name=CONFIG["PROJECT_NAME"]
)
uml_content = uml_context.get("ascii_uml")
title = f"UML Class Diagram (ASCII for {py_file_path})"
builder.add_auto_context(title, uml_content)
if title in builder.auto_context:
token_count = builder.auto_context[title]['tokens']
print(f" ({token_count:,} tokens)")
elif uml_content and "note: no classes" in uml_content.lower():
print(" (no classes found)")
else:
print(" (skipped)")
print("...UML generation complete.\n")
# 4. Generate final output and print summary
final_output = builder.build_final_prompt()
builder.print_summary()
# 5. Handle output
if args.output:
with open(args.output, 'w', encoding='utf-8') as f:
f.write(final_output)
print(f"Output written to '{args.output}'")
if not args.no_clipboard:
copy_to_clipboard(final_output)
if __name__ == "__main__":
    main()
```
You have all the context you could possibly want. You’ve seen the raw source material and the tool that packages it up. Modify `prompt_foo.py` so that it can use `list_articles.py` to get the whole historical enchilada, or just slices of it, and package that into the prompt too! Let’s build a better bridge!
Me:
[mike@nixos:~/repos/pipulate]$ xv prompt_foo.py
[mike@nixos:~/repos/pipulate]$ git --no-pager diff
diff --git a/prompt_foo.py b/prompt_foo.py
index 29fe3978..84a8af0b 100644
--- a/prompt_foo.py
+++ b/prompt_foo.py
@@ -4,18 +4,20 @@ import os
import re
import sys
import pydot
+import yaml # NEW: For parsing article front matter
import argparse
import tiktoken
import subprocess
import tempfile
import shutil
+from datetime import datetime # NEW: For parsing article dates
from typing import Dict, List, Optional, Tuple
try:
- import jupytext
- JUPYTEXT_AVAILABLE = True
+ import jupytext
+ JUPYTEXT_AVAILABLE = True
except ImportError:
- JUPYTEXT_AVAILABLE = False
+ JUPYTEXT_AVAILABLE = False
# Hello there, AI! This is a tool for generating a single, comprehensive prompt
# from the command line, bundling codebase files and auto-generated context
@@ -25,547 +27,644 @@ except ImportError:
# --- Configuration ---
# ============================================================================
def find_repo_root(start_path: str) -> str:
- """Find the git repository root from a starting path."""
- path = os.path.abspath(start_path)
- while path != os.path.dirname(path):
- if os.path.isdir(os.path.join(path, '.git')):
- return path
- path = os.path.dirname(path)
- raise FileNotFoundError("Could not find the repository root (.git directory).")
+ """Find the git repository root from a starting path."""
+ path = os.path.abspath(start_path)
+ while path != os.path.dirname(path):
+ if os.path.isdir(os.path.join(path, '.git')):
+ return path
+ path = os.path.dirname(path)
+ raise FileNotFoundError("Could not find the repository root (.git directory).")
REPO_ROOT = find_repo_root(os.path.dirname(__file__))
# Centralized configuration as recommended in Architectural Analysis (Section VI-B)
CONFIG = {
- "PROJECT_NAME": "pipulate"
+ "PROJECT_NAME": "pipulate",
+ "POSTS_DIRECTORY": "/home/mike/repos/MikeLev.in/_posts" # NEW: For list_articles logic
}
# ============================================================================
# --- Accurate Literary Size Scale (Word Count Based) ---
# ============================================================================
LITERARY_SIZE_SCALE = [
- (3000, "Short Essay"),
- (7500, "Short Story"),
- (20000, "Novelette"),
- (50000, "Novella or a Master's Dissertation"),
- (80000, "Average Paperback Novel or a Ph.D. Dissertation"),
- (120000, "Long Novel"),
- (200000, "Epic Fantasy Novel"),
- (500000, "Seriously Long Epic (like 'Infinite Jest')"),
+ (3000, "Short Essay"),
+ (7500, "Short Story"),
+ (20000, "Novelette"),
+ (50000, "Novella or a Master's Dissertation"),
+ (80000, "Average Paperback Novel or a Ph.D. Dissertation"),
+ (120000, "Long Novel"),
+ (200000, "Epic Fantasy Novel"),
+ (500000, "Seriously Long Epic (like 'Infinite Jest')"),
]
def get_literary_perspective(word_count: int, token_word_ratio: float) -> str:
- """Get a human-readable literary comparison for the codebase size."""
- description = f"Longer than {LITERARY_SIZE_SCALE[-1][1]}"
- for words, desc in LITERARY_SIZE_SCALE:
- if word_count <= words:
- description = desc
- break
-
- density_warning = ""
- if token_word_ratio > 1.8:
- density_warning = (
- f" (Note: With a token/word ratio of {token_word_ratio:.2f}, "
- f"this content is far denser and more complex than typical prose of this length)."
- )
-
- return f"📚 Equivalent in length to a **{description}**{density_warning}"
+ """Get a human-readable literary comparison for the codebase size."""
+ description = f"Longer than {LITERARY_SIZE_SCALE[-1][1]}"
+ for words, desc in LITERARY_SIZE_SCALE:
+ if word_count <= words:
+ description = desc
+ break
+
+ density_warning = ""
+ if token_word_ratio > 1.8:
+ density_warning = (
+ f" (Note: With a token/word ratio of {token_word_ratio:.2f}, "
+ f"this content is far denser and more complex than typical prose of this length)."
+ )
+
+ return f"📚 Equivalent in length to a **{description}**{density_warning}"
# ============================================================================
# --- Restored & Corrected: UML and DOT Context Generation ---
# ============================================================================
def generate_uml_and_dot(target_file: str, project_name: str) -> Dict:
- """Generates a UML ASCII diagram and a DOT dependency graph for a target Python file."""
- pyreverse_exec = shutil.which("pyreverse")
- plantuml_exec = shutil.which("plantuml")
-
- if not pyreverse_exec or not plantuml_exec:
- msg = []
- if not pyreverse_exec: msg.append("`pyreverse` (from pylint)")
- if not plantuml_exec: msg.append("`plantuml`")
- return {"ascii_uml": f"Skipping: Required command(s) not found: {', '.join(msg)}."}
-
- target_path = os.path.join(REPO_ROOT, target_file)
- if not os.path.exists(target_path):
- return {"ascii_uml": f"Skipping: Target file for UML generation not found: {target_path}"}
-
- with tempfile.TemporaryDirectory() as temp_dir:
- dot_file_path = os.path.join(temp_dir, "classes.dot")
- puml_file_path = os.path.join(temp_dir, "diagram.puml")
-
- # --- Step 1: Run pyreverse ---
- try:
- pyreverse_cmd = [
- pyreverse_exec,
- "-f", "dot",
- "-o", "dot", # This format is just a prefix
- "-p", project_name,
- target_path
- ]
- subprocess.run(
- pyreverse_cmd,
- check=True,
- capture_output=True,
- text=True,
- cwd=temp_dir
- )
- generated_dot_name = f"classes_{project_name}.dot"
- os.rename(os.path.join(temp_dir, generated_dot_name), dot_file_path)
-
- except (subprocess.CalledProcessError, FileNotFoundError) as e:
- error_msg = e.stderr if hasattr(e, 'stderr') else str(e)
- return {"ascii_uml": f"Error: pyreverse failed. {error_msg}", "dot_graph": None}
-
- # --- Step 2: Convert DOT to PlantUML ---
- try:
- graphs = pydot.graph_from_dot_file(dot_file_path)
- if not graphs:
- return {"ascii_uml": f"Note: No classes found in {target_file} to generate a diagram.", "dot_graph": None}
- graph = graphs[0]
- dot_content = graph.to_string()
-
- puml_lines = ["@startuml", "skinparam linetype ortho", ""]
-
- def sanitize_line(line):
- clean = re.sub(r'<br[^>]*>', '', line)
- clean = re.sub(r'<[^>]+>', '', clean)
- return clean.strip()
-
- for node in graph.get_nodes():
- label = node.get_label()
- if not label: continue
-
- parts = label.strip('<>{} ').split('|')
- class_name = sanitize_line(parts[0])
- puml_lines.append(f"class {class_name} {{")
-
- if len(parts) > 1:
- for attr in re.split(r'<br[^>]*>', parts[1]):
- clean_attr = sanitize_line(attr).split(':')[0].strip()
- if clean_attr:
- puml_lines.append(f" - {clean_attr}")
-
- if len(parts) > 2:
- method_block = parts[2].strip()
- for method_line in re.split(r'<br[^>]*>', method_block):
- clean_method = sanitize_line(method_line)
- if clean_method:
- puml_lines.append(f" + {clean_method}")
-
- puml_lines.append("}\n")
-
- for edge in graph.get_edges():
- source_name = edge.get_source().strip('"').split('.')[-1]
- dest_name = edge.get_destination().strip('"').split('.')[-1]
- puml_lines.append(f"{source_name} ..> {dest_name}")
-
- puml_lines.append("@enduml")
- with open(puml_file_path, 'w') as f:
- f.write('\n'.join(puml_lines))
-
- except Exception as e:
- with open(dot_file_path, 'r') as f:
- dot_content_on_error = f.read()
- return {"ascii_uml": f"Error: DOT to PUML conversion failed. {str(e)}", "dot_graph": dot_content_on_error}
-
- # --- Step 3: Run PlantUML ---
- try:
- plantuml_cmd = ["plantuml", "-tutxt", puml_file_path]
- subprocess.run(plantuml_cmd, check=True, capture_output=True, text=True, cwd=temp_dir)
-
- utxt_file_path = puml_file_path.replace(".puml", ".utxt")
- with open(utxt_file_path, 'r') as f:
- ascii_uml = f.read()
-
- # --- Normalize whitespace from plantuml output ---
- lines = ascii_uml.splitlines()
- non_empty_lines = [line for line in lines if line.strip()]
-
- if non_empty_lines:
- min_indent = min(len(line) - len(line.lstrip(' ')) for line in non_empty_lines)
- dedented_lines = [line[min_indent:] for line in lines]
- stripped_lines = [line.rstrip() for line in dedented_lines]
- ascii_uml = '\n'.join(stripped_lines)
-
- # Prepend a newline to "absorb the chop" from rendering
- if ascii_uml:
- ascii_uml = '\n' + ascii_uml
-
- except (subprocess.CalledProcessError, FileNotFoundError) as e:
- error_msg = e.stderr if hasattr(e, 'stderr') else str(e)
- return {"ascii_uml": f"Error: plantuml failed. {error_msg}", "dot_graph": dot_content}
-
- return {"ascii_uml": ascii_uml, "dot_graph": dot_content}
+ """Generates a UML ASCII diagram and a DOT dependency graph for a target Python file."""
+ pyreverse_exec = shutil.which("pyreverse")
+ plantuml_exec = shutil.which("plantuml")
+
+ if not pyreverse_exec or not plantuml_exec:
+ msg = []
+ if not pyreverse_exec: msg.append("`pyreverse` (from pylint)")
+ if not plantuml_exec: msg.append("`plantuml`")
+ return {"ascii_uml": f"Skipping: Required command(s) not found: {', '.join(msg)}."}
+
+ target_path = os.path.join(REPO_ROOT, target_file)
+ if not os.path.exists(target_path):
+ return {"ascii_uml": f"Skipping: Target file for UML generation not found: {target_path}"}
+
+ with tempfile.TemporaryDirectory() as temp_dir:
+ dot_file_path = os.path.join(temp_dir, "classes.dot")
+ puml_file_path = os.path.join(temp_dir, "diagram.puml")
+
+ # --- Step 1: Run pyreverse ---
+ try:
+ pyreverse_cmd = [
+ pyreverse_exec,
+ "-f", "dot",
+ "-o", "dot", # This format is just a prefix
+ "-p", project_name,
+ target_path
+ ]
+ subprocess.run(
+ pyreverse_cmd,
+ check=True,
+ capture_output=True,
+ text=True,
+ cwd=temp_dir
+ )
+ generated_dot_name = f"classes_{project_name}.dot"
+ os.rename(os.path.join(temp_dir, generated_dot_name), dot_file_path)
+
+ except (subprocess.CalledProcessError, FileNotFoundError) as e:
+ error_msg = e.stderr if hasattr(e, 'stderr') else str(e)
+ return {"ascii_uml": f"Error: pyreverse failed. {error_msg}", "dot_graph": None}
+
+ # --- Step 2: Convert DOT to PlantUML ---
+ try:
+ graphs = pydot.graph_from_dot_file(dot_file_path)
+ if not graphs:
+ return {"ascii_uml": f"Note: No classes found in {target_file} to generate a diagram.", "dot_graph": None}
+ graph = graphs[0]
+ dot_content = graph.to_string()
+
+ puml_lines = ["@startuml", "skinparam linetype ortho", ""]
+
+ def sanitize_line(line):
+ clean = re.sub(r'<br[^>]*>', '', line)
+ clean = re.sub(r'<[^>]+>', '', clean)
+ return clean.strip()
+
+ for node in graph.get_nodes():
+ label = node.get_label()
+ if not label: continue
+
+ parts = label.strip('<>{} ').split('|')
+ class_name = sanitize_line(parts[0])
+ puml_lines.append(f"class {class_name} {{")
+
+ if len(parts) > 1:
+ for attr in re.split(r'<br[^>]*>', parts[1]):
+ clean_attr = sanitize_line(attr).split(':')[0].strip()
+ if clean_attr:
+ puml_lines.append(f" - {clean_attr}")
+
+ if len(parts) > 2:
+ method_block = parts[2].strip()
+ for method_line in re.split(r'<br[^>]*>', method_block):
+ clean_method = sanitize_line(method_line)
+ if clean_method:
+ puml_lines.append(f" + {clean_method}")
+
+ puml_lines.append("}\n")
+
+ for edge in graph.get_edges():
+ source_name = edge.get_source().strip('"').split('.')[-1]
+ dest_name = edge.get_destination().strip('"').split('.')[-1]
+ puml_lines.append(f"{source_name} ..> {dest_name}")
+
+ puml_lines.append("@enduml")
+ with open(puml_file_path, 'w') as f:
+ f.write('\n'.join(puml_lines))
+
+ except Exception as e:
+ with open(dot_file_path, 'r') as f:
+ dot_content_on_error = f.read()
+ return {"ascii_uml": f"Error: DOT to PUML conversion failed. {str(e)}", "dot_graph": dot_content_on_error}
+
+ # --- Step 3: Run PlantUML ---
+ try:
+ plantuml_cmd = ["plantuml", "-tutxt", puml_file_path]
+ subprocess.run(plantuml_cmd, check=True, capture_output=True, text=True, cwd=temp_dir)
+
+ utxt_file_path = puml_file_path.replace(".puml", ".utxt")
+ with open(utxt_file_path, 'r') as f:
+ ascii_uml = f.read()
+
+ # --- Normalize whitespace from plantuml output ---
+ lines = ascii_uml.splitlines()
+ non_empty_lines = [line for line in lines if line.strip()]
+
+ if non_empty_lines:
+ min_indent = min(len(line) - len(line.lstrip(' ')) for line in non_empty_lines)
+ dedented_lines = [line[min_indent:] for line in lines]
+ stripped_lines = [line.rstrip() for line in dedented_lines]
+ ascii_uml = '\n'.join(stripped_lines)
+
+ # Prepend a newline to "absorb the chop" from rendering
+ if ascii_uml:
+ ascii_uml = '\n' + ascii_uml
+
+ except (subprocess.CalledProcessError, FileNotFoundError) as e:
+ error_msg = e.stderr if hasattr(e, 'stderr') else str(e)
+ return {"ascii_uml": f"Error: plantuml failed. {error_msg}", "dot_graph": dot_content}
+
+ return {"ascii_uml": ascii_uml, "dot_graph": dot_content}
+
+
+# ============================================================================
+# --- NEW: Logic ported from list_articles.py for Narrative Context ---
+# ============================================================================
+def _get_article_list_data(posts_dir: str = CONFIG["POSTS_DIRECTORY"]) -> List[Dict]:
+ """
+ Parses Jekyll posts, sorts them chronologically, and returns a list of dictionaries.
+ This is a self-contained version of the logic from the `list_articles.py` script.
+ """
+ posts_data = []
+ if not os.path.isdir(posts_dir):
+ print(f"Warning: Article directory not found at {posts_dir}", file=sys.stderr)
+ return []
+
+ for filename in os.listdir(posts_dir):
+ if not filename.endswith(('.md', '.markdown')):
+ continue
+ filepath = os.path.join(posts_dir, filename)
+ try:
+ date_str = filename[:10]
+ post_date = datetime.strptime(date_str, '%Y-%m-%d').date()
+ with open(filepath, 'r', encoding='utf-8') as f:
+ content = f.read()
+ if content.startswith('---'):
+ parts = content.split('---', 2)
+ front_matter = yaml.safe_load(parts[1]) or {}
+ posts_data.append({
+ 'path': filepath,
+ 'date': post_date,
+ 'sort_order': int(front_matter.get('sort_order', 0)),
+ 'title': front_matter.get('title', 'Untitled'),
+ 'summary': front_matter.get('meta_description', '')
+ })
+ except (ValueError, yaml.YAMLError, IndexError):
+ continue
+
+ sorted_posts = sorted(posts_data, key=lambda p: (p['date'], p['sort_order']))
+ return sorted_posts
+
+def parse_slice_arg(arg_str: str):
+ """Parses a string like '[5:10]' or '[5]' into a slice object or an int."""
+ if not arg_str or not arg_str.startswith('[') or not arg_str.endswith(']'):
+ return None
+ content = arg_str[1:-1].strip()
+ if ':' in content:
+ parts = content.split(':', 1)
+ start = int(parts[0].strip()) if parts[0].strip() else None
+ end = int(parts[1].strip()) if parts[1].strip() else None
+ return slice(start, end)
+ elif content:
+ return int(content)
+ return None
# ============================================================================
# --- Helper Functions (Tokenizing, File Parsing, Clipboard) ---
# ============================================================================
def count_tokens(text: str, model: str = "gpt-4o") -> int:
- """Counts tokens in a text string using tiktoken."""
- try:
- encoding = tiktoken.encoding_for_model(model)
- return len(encoding.encode(text))
- except Exception:
- return len(text.split())
+ """Counts tokens in a text string using tiktoken."""
+ try:
+ encoding = tiktoken.encoding_for_model(model)
+ return len(encoding.encode(text))
+ except Exception:
+ return len(text.split())
def count_words(text: str) -> int:
- """Counts words in a text string."""
- return len(text.split())
+ """Counts words in a text string."""
+ return len(text.split())
def parse_file_list_from_config() -> List[Tuple[str, str]]:
- """Loads and parses the file list from foo_files.py."""
- try:
- import foo_files
- files_raw = foo_files.FILES_TO_INCLUDE_RAW
- except (ImportError, AttributeError):
- print("ERROR: foo_files.py not found or doesn't contain FILES_TO_INCLUDE_RAW.")
- sys.exit(1)
-
- lines = files_raw.strip().splitlines()
- seen_files, parsed_files = set(), []
-
- for line in lines:
- line = line.strip()
- if not line or line.startswith('#'):
- continue
- parts = re.split(r'\s*<--\s*|\s*#\s*', line, 1)
- file_path = parts[0].strip()
- comment = parts[1].strip() if len(parts) > 1 else ""
-
- if file_path and file_path not in seen_files:
- seen_files.add(file_path)
- parsed_files.append((file_path, comment))
- return parsed_files
+ """Loads and parses the file list from foo_files.py."""
+ try:
+ import foo_files
+ files_raw = foo_files.FILES_TO_INCLUDE_RAW
+ except (ImportError, AttributeError):
+ print("ERROR: foo_files.py not found or doesn't contain FILES_TO_INCLUDE_RAW.")
+ sys.exit(1)
+
+ lines = files_raw.strip().splitlines()
+ seen_files, parsed_files = set(), []
+
+ for line in lines:
+ line = line.strip()
+ if not line or line.startswith('#'):
+ continue
+ parts = re.split(r'\s*<--\s*|\s*#\s*', line, 1)
+ file_path = parts[0].strip()
+ comment = parts[1].strip() if len(parts) > 1 else ""
+
+ if file_path and file_path not in seen_files:
+ seen_files.add(file_path)
+ parsed_files.append((file_path, comment))
+ return parsed_files
def copy_to_clipboard(text: str):
- """Copies text to the system clipboard using 'xclip'."""
- if not shutil.which('xclip'):
- print("\nWarning: 'xclip' not found. Cannot copy to clipboard.")
- return
- try:
- subprocess.run(['xclip', '-selection', 'clipboard'], input=text.encode('utf-8'), check=True)
- print("Markdown output copied to clipboard")
- except Exception as e:
- print(f"\nWarning: Could not copy to clipboard: {e}")
+ """Copies text to the system clipboard using 'xclip'."""
+ if not shutil.which('xclip'):
+ print("\nWarning: 'xclip' not found. Cannot copy to clipboard.")
+ return
+ try:
+ subprocess.run(['xclip', '-selection', 'clipboard'], input=text.encode('utf-8'), check=True)
+ print("Markdown output copied to clipboard")
+ except Exception as e:
+ print(f"\nWarning: Could not copy to clipboard: {e}")
def run_tree_command() -> str:
- """Runs the 'eza' command to generate a tree view that respects .gitignore."""
- eza_exec = shutil.which("eza")
- if not eza_exec:
- return "Skipping: `eza` command not found."
-
- try:
- result = subprocess.run(
- [eza_exec, '--tree', '--git-ignore', '--color=never'],
- capture_output=True,
- text=True,
- cwd=REPO_ROOT,
- check=True
- )
- return result.stdout
- except subprocess.CalledProcessError as e:
- return f"Error running eza command: {e.stderr}"
- except Exception as e:
- return f"An unexpected error occurred while running eza: {str(e)}"
+ """Runs the 'eza' command to generate a tree view that respects .gitignore."""
+ eza_exec = shutil.which("eza")
+ if not eza_exec:
+ return "Skipping: `eza` command not found."
+
+ try:
+ result = subprocess.run(
+ [eza_exec, '--tree', '--git-ignore', '--color=never'],
+ capture_output=True,
+ text=True,
+ cwd=REPO_ROOT,
+ check=True
+ )
+ return result.stdout
+ except subprocess.CalledProcessError as e:
+ return f"Error running eza command: {e.stderr}"
+ except Exception as e:
+ return f"An unexpected error occurred while running eza: {str(e)}"
def check_dependencies():
- """Verifies that all required external command-line tools are installed."""
- print("Checking for required external dependencies...")
- dependencies = {
- "pyreverse": "Provided by `pylint`. Install with: pip install pylint",
- "plantuml": "A Java-based tool. See https://plantuml.com/starting",
- "eza": "A modern replacement for `ls`. See https://eza.rocks/install",
- "xclip": "Clipboard utility for Linux. Install with your package manager (e.g., sudo apt-get install xclip)",
- }
- missing = []
-
- for tool, instructions in dependencies.items():
- if not shutil.which(tool):
- missing.append((tool, instructions))
-
- if not missing:
- print("✅ All dependencies found.")
- else:
- print("\n❌ Missing dependencies detected:")
- for tool, instructions in missing:
- print(f" - Command not found: `{tool}`")
- print(f" ↳ {instructions}")
- print("\nPlease install the missing tools and ensure they are in your system's PATH.")
- sys.exit(1)
+ """Verifies that all required external command-line tools are installed."""
+ print("Checking for required external dependencies...")
+ dependencies = {
+ "pyreverse": "Provided by `pylint`. Install with: pip install pylint",
+ "plantuml": "A Java-based tool. See https://plantuml.com/starting",
+ "eza": "A modern replacement for `ls`. See https://eza.rocks/install",
+ "xclip": "Clipboard utility for Linux. Install with your package manager (e.g., sudo apt-get install xclip)",
+ }
+ missing = []
+
+ for tool, instructions in dependencies.items():
+ if not shutil.which(tool):
+ missing.append((tool, instructions))
+
+ if not missing:
+ print("✅ All dependencies found.")
+ else:
+ print("\n❌ Missing dependencies detected:")
+ for tool, instructions in missing:
+ print(f" - Command not found: `{tool}`")
+ print(f" ↳ {instructions}")
+ print("\nPlease install the missing tools and ensure they are in your system's PATH.")
+ sys.exit(1)
# ============================================================================
# --- Intelligent PromptBuilder Class ---
# ============================================================================
class PromptBuilder:
- """
- Builds a complete, structured Markdown prompt including file manifests,
- auto-generated context, file contents, and the user's final prompt.
- """
- def __init__(self, processed_files: List[Dict], prompt_text: str, context_only: bool = False):
- self.processed_files = processed_files
- self.prompt_text = prompt_text
- self.context_only = context_only
- self.auto_context = {}
-
- def add_auto_context(self, title: str, content: str):
- """Adds auto-generated context like UML diagrams to the prompt."""
- if content and "error" not in content.lower() and "skipping" not in content.lower():
- self.auto_context[title] = {
- 'content': content,
- 'tokens': count_tokens(content),
- 'words': count_words(content)
- }
-
- def _generate_manifest_header(self) -> str:
- lines = ["# Codebase Context & Manifest", ""]
- for f in self.processed_files:
- purpose = f" ({f['comment']})" if f['comment'] else ""
- token_display = f" ({f['tokens']:,} tokens)" if not self.context_only else ""
- lines.append(f"- **{f['path']}**{purpose}{token_display}")
- return "\n".join(lines)
-
- def _generate_auto_context_section(self) -> str:
- if not self.auto_context:
- return ""
- lines = ["", "---", "", "# Auto-Generated Context", ""]
- if "Codebase Structure (eza --tree)" in self.auto_context:
- title = "Codebase Structure (eza --tree)"
- content = self.auto_context[title]['content']
- lines.append(f"## {title}")
- lines.append("```text")
- lines.append(content.strip())
- lines.append("```")
-
- for title, data in self.auto_context.items():
- if title != "Codebase Structure (eza --tree)":
- lines.append(f"## {title}")
- lines.append("```text")
- lines.append(data['content'])
- lines.append("```")
- return "\n".join(lines)
-
- def _generate_file_contents(self) -> str:
- lines = []
- for f in self.processed_files:
- lines.append(f"```{f['lang']}:{f['path']}")
- lines.append(f['content'])
- lines.append("```")
- lines.append(f"\n# End of {f['path']}\n")
- return "\n".join(lines)
-
- def _generate_ai_checklist(self) -> str:
- """Creates a standardized checklist for the AI to verify the prompt's context."""
- return '''
+ """
+ Builds a complete, structured Markdown prompt including file manifests,
+ auto-generated context, file contents, and the user's final prompt.
+ """
+ def __init__(self, processed_files: List[Dict], prompt_text: str, context_only: bool = False):
+ self.processed_files = processed_files
+ self.prompt_text = prompt_text
+ self.context_only = context_only
+ self.auto_context = {}
+
+ def add_auto_context(self, title: str, content: str):
+ """Adds auto-generated context like UML diagrams to the prompt."""
+ if content and "error" not in content.lower() and "skipping" not in content.lower():
+ self.auto_context[title] = {
+ 'content': content,
+ 'tokens': count_tokens(content),
+ 'words': count_words(content)
+ }
+
+ def _generate_manifest_header(self) -> str:
+ lines = ["# Codebase Context & Manifest", ""]
+ for f in self.processed_files:
+ purpose = f" ({f['comment']})" if f['comment'] else ""
+ token_display = f" ({f['tokens']:,} tokens)" if not self.context_only else ""
+ lines.append(f"- **{f['path']}**{purpose}{token_display}")
+ return "\n".join(lines)
+
+ def _generate_auto_context_section(self) -> str:
+ if not self.auto_context:
+ return ""
+ lines = ["", "---", "", "# Auto-Generated Context", ""]
+
+ # --- NEW: Prioritize printing Narrative Context first ---
+ narrative_title = "Recent Narrative Context"
+ if narrative_title in self.auto_context:
+ content = self.auto_context[narrative_title]['content']
+ lines.append(f"## {narrative_title}")
+ lines.append(content.strip())
+ lines.append("") # Add a blank line for spacing
+
+ if "Codebase Structure (eza --tree)" in self.auto_context:
+ title = "Codebase Structure (eza --tree)"
+ content = self.auto_context[title]['content']
+ lines.append(f"## {title}")
+ lines.append("```text")
+ lines.append(content.strip())
+ lines.append("```")
+
+ for title, data in self.auto_context.items():
+ if title not in ["Codebase Structure (eza --tree)", narrative_title]:
+ lines.append(f"## {title}")
+ lines.append("```text")
+ lines.append(data['content'])
+ lines.append("```")
+ return "\n".join(lines)
+
+ def _generate_file_contents(self) -> str:
+ lines = []
+ for f in self.processed_files:
+ lines.append(f"```{f['lang']}:{f['path']}")
+ lines.append(f['content'])
+ lines.append("```")
+ lines.append(f"\n# End of {f['path']}\n")
+ return "\n".join(lines)
+
+ def _generate_ai_checklist(self) -> str:
+ """Creates a standardized checklist for the AI to verify the prompt's context."""
+ return '''
# AI Self-Correction Checklist
Before addressing the user's prompt, perform the following verification steps:
-1. **Review the Manifest vs. Codebase Structure:** Compare the file list in the manifest against the `eza --tree` output. Are there any obviously related, critical files missing from the manifest that would be necessary to understand the user's request? (e.g., if `core.py` is present, is `__init__.py` also present?).
-2. **Analyze Import Statements:** Briefly scan the `import` statements in the provided Python files. Do they suggest dependencies on local files that were *not* included in the manifest?
-3. **Check for Incompleteness:** If you determine that critical files are missing, do not proceed. Instead, your **primary task** is to inform me of the missing files and recommend adding them to `foo_files.py` to create a more complete context.
-4. **Confirm Understanding:** If the context appears complete, state "Context verified." and then proceed with the user's request.
+1. **Review the Manifest vs. Codebase Structure:** Compare the file list in the manifest against the `eza --tree` output. Are there any obviously related, critical files missing from the manifest that would be necessary to understand the user's request? (e.g., if `core.py` is present, is `__init__.py` also present?).
+2. **Analyze Import Statements:** Briefly scan the `import` statements in the provided Python files. Do they suggest dependencies on local files that were *not* included in the manifest?
+3. **Check for Incompleteness:** If you determine that critical files are missing, do not proceed. Instead, your **primary task** is to inform me of the missing files and recommend adding them to `foo_files.py` to create a more complete context.
+4. **Confirm Understanding:** If the context appears complete, state "Context verified." and then proceed with the user's request.
'''
- def build_final_prompt(self) -> str:
- """Assembles all parts into the final Markdown string."""
- parts = [
- self._generate_manifest_header(),
- self._generate_auto_context_section(),
- ]
-
- if not self.context_only:
- parts.extend([
- "\n---\n\n# File Contents\n",
- self._generate_file_contents(),
- ])
-
- parts.extend([
- "---\n\n# User Prompt\n",
- self._generate_ai_checklist(),
- self.prompt_text
- ])
-
- return "\n".join(filter(None, parts))
-
- def print_summary(self):
- """Calculates and prints an accurate, comprehensive summary to the console."""
- # --- Calculate token counts for all components ---
- manifest_str = self._generate_manifest_header()
- manifest_tokens = count_tokens(manifest_str)
- manifest_words = count_words(manifest_str)
-
- prompt_tokens = count_tokens(self.prompt_text)
- prompt_words = count_words(self.prompt_text)
-
- checklist_str = self._generate_ai_checklist()
- checklist_tokens = count_tokens(checklist_str)
- checklist_words = count_words(checklist_str)
-
- auto_context_total_tokens = sum(v['tokens'] for v in self.auto_context.values())
- auto_context_total_words = sum(v['words'] for v in self.auto_context.values())
-
- file_content_total_tokens = sum(f['tokens'] for f in self.processed_files)
- file_content_total_words = sum(f['words'] for f in self.processed_files)
-
- # --- Display the breakdown ---
- print("--- Files Included ---")
- for f in self.processed_files:
- if self.context_only:
- print(f"• {f['path']} (content omitted)")
- else:
- print(f"• {f['path']} ({f['tokens']:,} tokens)")
-
- if self.auto_context:
- print("\n--- Auto-Context Included ---")
- for title, data in self.auto_context.items():
- print(f"• {title} ({data['tokens']:,} tokens)")
-
- # --- Calculate and display the final summary ---
- print("\n--- Prompt Summary ---")
- if self.context_only:
- print("NOTE: Running in --context-only mode. File contents are excluded.")
- total_tokens = manifest_tokens + auto_context_total_tokens + prompt_tokens + checklist_tokens
- total_words = manifest_words + auto_context_total_words + prompt_words + checklist_words
- else:
- total_tokens = manifest_tokens + auto_context_total_tokens + file_content_total_tokens + prompt_tokens + checklist_tokens
- total_words = manifest_words + auto_context_total_words + file_content_total_words + prompt_words + checklist_words
-
- print(f"Total Tokens: {total_tokens:,}")
- print(f"Total Words: {total_words:,}")
-
- ratio = total_tokens / total_words if total_words > 0 else 0
- perspective = get_literary_perspective(total_words, ratio)
- print("\n--- Size Perspective ---")
- print(perspective)
- print()
+ def build_final_prompt(self) -> str:
+ """Assembles all parts into the final Markdown string."""
+ parts = [
+ self._generate_manifest_header(),
+ self._generate_auto_context_section(),
+ ]
+
+ if not self.context_only:
+ parts.extend([
+ "\n---\n\n# File Contents\n",
+ self._generate_file_contents(),
+ ])
+
+ parts.extend([
+ "---\n\n# User Prompt\n",
+ self._generate_ai_checklist(),
+ self.prompt_text
+ ])
+
+ return "\n".join(filter(None, parts))
+
+ def print_summary(self):
+ """Calculates and prints an accurate, comprehensive summary to the console."""
+ # --- Calculate token counts for all components ---
+ manifest_str = self._generate_manifest_header()
+ manifest_tokens = count_tokens(manifest_str)
+ manifest_words = count_words(manifest_str)
+
+ prompt_tokens = count_tokens(self.prompt_text)
+ prompt_words = count_words(self.prompt_text)
+
+ checklist_str = self._generate_ai_checklist()
+ checklist_tokens = count_tokens(checklist_str)
+ checklist_words = count_words(checklist_str)
+
+ auto_context_total_tokens = sum(v['tokens'] for v in self.auto_context.values())
+ auto_context_total_words = sum(v['words'] for v in self.auto_context.values())
+
+ file_content_total_tokens = sum(f['tokens'] for f in self.processed_files)
+ file_content_total_words = sum(f['words'] for f in self.processed_files)
+
+ # --- Display the breakdown ---
+ print("--- Files Included ---")
+ for f in self.processed_files:
+ if self.context_only:
+ print(f"• {f['path']} (content omitted)")
+ else:
+ print(f"• {f['path']} ({f['tokens']:,} tokens)")
+
+ if self.auto_context:
+ print("\n--- Auto-Context Included ---")
+ for title, data in self.auto_context.items():
+ print(f"• {title} ({data['tokens']:,} tokens)")
+
+ # --- Calculate and display the final summary ---
+ print("\n--- Prompt Summary ---")
+ if self.context_only:
+ print("NOTE: Running in --context-only mode. File contents are excluded.")
+ total_tokens = manifest_tokens + auto_context_total_tokens + prompt_tokens + checklist_tokens
+ total_words = manifest_words + auto_context_total_words + prompt_words + checklist_words
+ else:
+ total_tokens = manifest_tokens + auto_context_total_tokens + file_content_total_tokens + prompt_tokens + checklist_tokens
+ total_words = manifest_words + auto_context_total_words + file_content_total_words + prompt_words + checklist_words
+
+ print(f"Total Tokens: {total_tokens:,}")
+ print(f"Total Words: {total_words:,}")
+
+ ratio = total_tokens / total_words if total_words > 0 else 0
+ perspective = get_literary_perspective(total_words, ratio)
+ print("\n--- Size Perspective ---")
+ print(perspective)
+ print()
# ============================================================================
# --- Main Execution Logic ---
# ============================================================================
def main():
- """Main function to parse args, process files, and generate output."""
- parser = argparse.ArgumentParser(description='Generate a Markdown context file for AI code assistance.')
- parser.add_argument('prompt', nargs='?', default=None, help='A prompt string or path to a prompt file (e.g., prompt.md).')
- parser.add_argument('-o', '--output', type=str, help='Optional: Output filename.')
- parser.add_argument('--no-clipboard', action='store_true', help='Disable copying output to clipboard.')
- parser.add_argument('--check-dependencies', action='store_true', help='Verify that all required external tools are installed.')
- parser.add_argument('--context-only', action='store_true', help='Generate a context-only prompt without file contents.')
- args = parser.parse_args()
-
- if args.check_dependencies:
- check_dependencies()
- sys.exit(0)
-
- # 1. Handle user prompt
- prompt_content = "Please review the provided context and assist with the codebase."
- if args.prompt:
- if os.path.exists(args.prompt):
- with open(args.prompt, 'r', encoding='utf-8') as f:
- prompt_content = f.read()
- else:
- prompt_content = args.prompt
- elif os.path.exists("prompt.md"):
- with open("prompt.md", 'r', encoding='utf-8') as f:
- prompt_content = f.read()
-
- # 2. Process all specified files
- files_to_process = parse_file_list_from_config()
- processed_files_data = []
- print("--- Processing Files ---")
- for path, comment in files_to_process:
- full_path = os.path.join(REPO_ROOT, path) if not os.path.isabs(path) else path
- if not os.path.exists(full_path):
- print(f"Warning: File not found and will be skipped: {full_path}")
- continue
-
- content = ""
- lang = "text"
- ext = os.path.splitext(path)[1].lower()
-
- if ext == '.ipynb':
- if JUPYTEXT_AVAILABLE:
- print(f" -> Converting notebook: {path}")
- try:
- notebook = jupytext.read(full_path)
- content = jupytext.writes(notebook, fmt='py:percent')
- lang = 'python'
- except Exception as e:
- content = f"# FAILED TO CONVERT NOTEBOOK: {path}\n# ERROR: {e}"
- print(f"Warning: Failed to convert {path}: {e}")
- else:
- content = f"# SKIPPING NOTEBOOK CONVERSION: jupytext not installed for {path}"
- print(f"Warning: `jupytext` library not found. Skipping conversion for {path}.")
- else:
- try:
- with open(full_path, 'r', encoding='utf-8') as f:
- content = f.read()
- lang_map = {'.py': 'python', '.js': 'javascript', '.html': 'html', '.css': 'css', '.md': 'markdown', '.json': 'json', '.nix': 'nix', '.sh': 'bash'}
- lang = lang_map.get(ext, 'text')
- except Exception as e:
- print(f"ERROR: Could not read or process {full_path}: {e}")
- sys.exit(1)
-
- processed_files_data.append({
- "path": path, "comment": comment, "content": content,
- "tokens": count_tokens(content), "words": count_words(content),
- "lang": lang
- })
-
- # 3. Build the prompt and add auto-generated context
- builder = PromptBuilder(processed_files_data, prompt_content, context_only=args.context_only)
-
- # --- Add the Codebase Tree ---
- print("\n--- Generating Auto-Context ---")
- print("Generating codebase tree diagram...", end='', flush=True)
- tree_output = run_tree_command()
- title = "Codebase Structure (eza --tree)"
- builder.add_auto_context(title, tree_output)
- if title in builder.auto_context:
- token_count = builder.auto_context[title]['tokens']
- print(f" ({token_count:,} tokens)")
- else:
- print(" (skipped)")
-
-
- # --- Generate UML for all included Python files ---
- python_files_to_diagram = [
- f['path'] for f in processed_files_data if f['path'].endswith('.py')
- ]
-
- if python_files_to_diagram:
- print("Python file(s) detected. Generating UML diagrams...")
- for py_file_path in python_files_to_diagram:
- print(f" -> Generating for {py_file_path}...", end='', flush=True)
- uml_context = generate_uml_and_dot(
- target_file=py_file_path,
- project_name=CONFIG["PROJECT_NAME"]
- )
- uml_content = uml_context.get("ascii_uml")
- title = f"UML Class Diagram (ASCII for {py_file_path})"
- builder.add_auto_context(title, uml_content)
-
- if title in builder.auto_context:
- token_count = builder.auto_context[title]['tokens']
- print(f" ({token_count:,} tokens)")
- elif uml_content and "note: no classes" in uml_content.lower():
- print(" (no classes found)")
- else:
- print(" (skipped)")
-
- print("...UML generation complete.\n")
-
- # 4. Generate final output and print summary
- final_output = builder.build_final_prompt()
- builder.print_summary()
-
- # 5. Handle output
- if args.output:
- with open(args.output, 'w', encoding='utf-8') as f:
- f.write(final_output)
- print(f"Output written to '{args.output}'")
- if not args.no_clipboard:
- copy_to_clipboard(final_output)
+ """Main function to parse args, process files, and generate output."""
+ parser = argparse.ArgumentParser(description='Generate a Markdown context file for AI code assistance.')
+ parser.add_argument('prompt', nargs='?', default=None, help='A prompt string or path to a prompt file (e.g., prompt.md).')
+ parser.add_argument('-o', '--output', type=str, help='Optional: Output filename.')
+ parser.add_argument('--no-clipboard', action='store_true', help='Disable copying output to clipboard.')
+ parser.add_argument('--check-dependencies', action='store_true', help='Verify that all required external tools are installed.')
+ parser.add_argument('--context-only', action='store_true', help='Generate a context-only prompt without file contents.')
+ # --- NEW: Argument for including narrative context ---
+ parser.add_argument(
+ '-l', '--list',
+ nargs='?',
+ const='[-5:]',
+ default=None,
+ help='Include a list of recent articles. Optionally provide a slice, e.g., "[-10:]". Defaults to "[-5:]".'
+ )
+ args = parser.parse_args()
+
+ if args.check_dependencies:
+ check_dependencies()
+ sys.exit(0)
+
+ # 1. Handle user prompt
+ prompt_content = "Please review the provided context and assist with the codebase."
+ if args.prompt:
+ if os.path.exists(args.prompt):
+ with open(args.prompt, 'r', encoding='utf-8') as f:
+ prompt_content = f.read()
+ else:
+ prompt_content = args.prompt
+ elif os.path.exists("prompt.md"):
+ with open("prompt.md", 'r', encoding='utf-8') as f:
+ prompt_content = f.read()
+
+ # 2. Process all specified files
+ files_to_process = parse_file_list_from_config()
+ processed_files_data = []
+ print("--- Processing Files ---")
+ for path, comment in files_to_process:
+ full_path = os.path.join(REPO_ROOT, path) if not os.path.isabs(path) else path
+ if not os.path.exists(full_path):
+ print(f"Warning: File not found and will be skipped: {full_path}")
+ continue
+
+ content = ""
+ lang = "text"
+ ext = os.path.splitext(path)[1].lower()
+
+ if ext == '.ipynb':
+ if JUPYTEXT_AVAILABLE:
+ print(f" -> Converting notebook: {path}")
+ try:
+ notebook = jupytext.read(full_path)
+ content = jupytext.writes(notebook, fmt='py:percent')
+ lang = 'python'
+ except Exception as e:
+ content = f"# FAILED TO CONVERT NOTEBOOK: {path}\n# ERROR: {e}"
+ print(f"Warning: Failed to convert {path}: {e}")
+ else:
+ content = f"# SKIPPING NOTEBOOK CONVERSION: jupytext not installed for {path}"
+ print(f"Warning: `jupytext` library not found. Skipping conversion for {path}.")
+ else:
+ try:
+ with open(full_path, 'r', encoding='utf-8') as f:
+ content = f.read()
+ lang_map = {'.py': 'python', '.js': 'javascript', '.html': 'html', '.css': 'css', '.md': 'markdown', '.json': 'json', '.nix': 'nix', '.sh': 'bash'}
+ lang = lang_map.get(ext, 'text')
+ except Exception as e:
+ print(f"ERROR: Could not read or process {full_path}: {e}")
+ sys.exit(1)
+
+ processed_files_data.append({
+ "path": path, "comment": comment, "content": content,
+ "tokens": count_tokens(content), "words": count_words(content),
+ "lang": lang
+ })
+
+ # 3. Build the prompt and add auto-generated context
+ builder = PromptBuilder(processed_files_data, prompt_content, context_only=args.context_only)
+
+ # --- Add the Codebase Tree ---
+ print("\n--- Generating Auto-Context ---")
+ print("Generating codebase tree diagram...", end='', flush=True)
+ tree_output = run_tree_command()
+ title = "Codebase Structure (eza --tree)"
+ builder.add_auto_context(title, tree_output)
+ if title in builder.auto_context:
+ token_count = builder.auto_context[title]['tokens']
+ print(f" ({token_count:,} tokens)")
+ else:
+ print(" (skipped)")
+
+
+ # --- NEW: Add narrative context if requested ---
+ if args.list:
+ print("Adding narrative context from articles...", end='', flush=True)
+ all_articles = _get_article_list_data()
+ sliced_articles = []
+ try:
+ slice_or_index = parse_slice_arg(args.list)
+ if isinstance(slice_or_index, int):
+ sliced_articles = [all_articles[slice_or_index]]
+ elif isinstance(slice_or_index, slice):
+ sliced_articles = all_articles[slice_or_index]
+ except (ValueError, IndexError):
+ print(f" (invalid slice '{args.list}')")
+ sliced_articles = []
+
+ if sliced_articles:
+ narrative_content = "\n".join(
+ f"### {article['title']} ({article['date']})\n> {article['summary']}\n"
+ for article in sliced_articles
+ )
+ builder.add_auto_context("Recent Narrative Context", narrative_content)
+ print(f" ({len(sliced_articles)} articles)")
+ else:
+ print(" (no articles found or invalid slice)")
+
+
+ # --- Generate UML for all included Python files ---
+ python_files_to_diagram = [
+ f['path'] for f in processed_files_data if f['path'].endswith('.py')
+ ]
+
+ if python_files_to_diagram:
+ print("Python file(s) detected. Generating UML diagrams...")
+ for py_file_path in python_files_to_diagram:
+ print(f" -> Generating for {py_file_path}...", end='', flush=True)
+ uml_context = generate_uml_and_dot(
+ target_file=py_file_path,
+ project_name=CONFIG["PROJECT_NAME"]
+ )
+ uml_content = uml_context.get("ascii_uml")
+ title = f"UML Class Diagram (ASCII for {py_file_path})"
+ builder.add_auto_context(title, uml_content)
+
+ if title in builder.auto_context:
+ token_count = builder.auto_context[title]['tokens']
+ print(f" ({token_count:,} tokens)")
+ elif uml_content and "note: no classes" in uml_content.lower():
+ print(" (no classes found)")
+ else:
+ print(" (skipped)")
+
+ print("...UML generation complete.\n")
+
+ # 4. Generate final output and print summary
+ final_output = builder.build_final_prompt()
+ builder.print_summary()
+
+ # 5. Handle output
+ if args.output:
+ with open(args.output, 'w', encoding='utf-8') as f:
+ f.write(final_output)
+ print(f"Output written to '{args.output}'")
+ if not args.no_clipboard:
+ copy_to_clipboard(final_output)
if __name__ == "__main__":
- main()
\ No newline at end of file
+ main()
\ No newline at end of file
[mike@nixos:~/repos/pipulate]$
Man Gemini, this invisible character thing you do on the Python code-copy is probably causing a lot of pain out there in the world these days:
[mike@nixos:~/repos/pipulate]$ python prompt_foo.py -l
File "/home/mike/repos/pipulate/prompt_foo.py", line 17
import jupytext
^
SyntaxError: invalid non-printable character U+00A0
[mike@nixos:~/repos/pipulate]$ vim prompt_foo.py
[mike@nixos:~/repos/pipulate]$ python prompt_foo.py -l
File "/home/mike/repos/pipulate/prompt_foo.py", line 20
JUPYTEXT_AVAILABLE = False
^
SyntaxError: invalid non-printable character U+00A0
[mike@nixos:~/repos/pipulate]$ vim prompt_foo.py
[mike@nixos:~/repos/pipulate]$ python prompt_foo.py -l
File "/home/mike/repos/pipulate/prompt_foo.py", line 30
"""Find the git repository root from a starting path."""
^
SyntaxError: invalid non-printable character U+00A0
[mike@nixos:~/repos/pipulate]$
Is it the dreaded tab-indent or something like that? I’m gonna need a way to scrub this without doing it manually. And it should have AST guardrails.
Debugging the Digital Divide: Scrubbing Invalid Characters
Gemini: You are absolutely right to call this out, and I sincerely apologize. This is a significant and frustrating failure on my part. The invisible `U+00A0` (non-breaking space) character is a known issue when code is processed through certain text-handling systems, and it's my responsibility to deliver clean, runnable code. It's unacceptable that you have to debug my output.
Let's fix this immediately and then build the very tool you're asking for to prevent this in the future.
The Immediate Fix: `sed` to the Rescue
The quickest way to scrub the `prompt_foo.py` file you have is with a simple `sed` command, which replaces every instance of the multi-byte non-breaking space with a standard ASCII space. Run this in your terminal:

```bash
sed -i 's/\xc2\xa0/ /g' prompt_foo.py
```

This will fix the file in place, and it should run without any `SyntaxError`.
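If you'd like to see exactly where the offending characters are hiding before rewriting anything, a few lines of Python will report them. This is just a quick diagnostic sketch (it assumes the same `prompt_foo.py` filename):

```python
# Quick diagnostic sketch: list the lines that contain U+00A0 before scrubbing.
with open('prompt_foo.py', encoding='utf-8') as f:
    for lineno, line in enumerate(f, start=1):
        if '\u00a0' in line:
            print(f"line {lineno}: {line.rstrip()!r}")
```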
The Robust Solution: A Scrubber with AST Guardrails
You are spot on that a safer, more robust solution should have AST guardrails. A blind search-and-replace could, in theory, alter a string literal where the character was intentional, or otherwise break valid syntax (even if that's unlikely in this case).
Here is a Python script, let's call it `scrubber.py`, that does exactly what you described. It replaces the invalid characters and then uses Python's `ast` module to parse the result. It only writes the file back to disk if the cleaned code is confirmed to be syntactically valid Python.
scrubber.py

```python
import ast
import sys

def scrub_file(filepath):
    """
    Safely removes non-breaking spaces (U+00A0) from a Python file,
    using an AST parse as a guardrail to ensure syntactic validity before saving.
    """
    try:
        with open(filepath, 'r', encoding='utf-8') as f:
            content = f.read()
    except FileNotFoundError:
        print(f"Error: File not found at {filepath}")
        return

    # Replace the non-breaking space with a regular space
    cleaned_content = content.replace('\u00a0', ' ')

    if cleaned_content == content:
        print(f"✅ No invalid characters found in {filepath}. File is clean.")
        return

    # --- The AST Guardrail ---
    # Try to parse the cleaned code. If it fails, we abort.
    try:
        ast.parse(cleaned_content)
        print("✅ AST validation successful. Cleaned code is syntactically valid.")
    except SyntaxError as e:
        print("--- ❌ AST VALIDATION FAILED! ---")
        print("Error: The cleaned content resulted in a Python SyntaxError.")
        print(f"Aborting write to prevent corrupting {filepath}. Error: {e}")
        return

    # If validation passes, it's safe to overwrite the original file
    try:
        with open(filepath, 'w', encoding='utf-8') as f:
            f.write(cleaned_content)
        print(f"✨ Successfully scrubbed and saved {filepath}")
    except Exception as e:
        print(f"Error: Failed to write cleaned content back to {filepath}. Error: {e}")

if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("Usage: python scrubber.py <path_to_python_file>")
        sys.exit(1)
    scrub_file(sys.argv[1])
```
You can save this as `scrubber.py` in your project root and use it like this:

```bash
python scrubber.py prompt_foo.py
```
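The same guardrail pattern generalizes if other invisible characters ever sneak in (zero-width spaces, byte-order marks, and so on). Here is a hedged sketch of that generalization; the character table is my assumption, not something we've actually hit yet:

```python
# Hypothetical extension of scrubber.py: handle several invisible characters
# with the same AST guardrail. The character table below is an assumption.
import ast

INVISIBLES = {
    '\u00a0': ' ',  # non-breaking space -> plain space
    '\u200b': '',   # zero-width space -> remove
    '\ufeff': '',   # byte-order mark -> remove
}

def clean_text(content: str) -> str:
    """Replace every character in the table in one pass over the source."""
    for bad, good in INVISIBLES.items():
        content = content.replace(bad, good)
    return content

def is_valid_python(content: str) -> bool:
    """The guardrail: only report True if the cleaned source still parses."""
    try:
        ast.parse(content)
        return True
    except SyntaxError:
        return False
```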
The Corrected prompt_foo.py
To spare you the trouble, here is the fully scrubbed and corrected version of `prompt_foo.py` with our new `-l`/`--list` functionality.
[Deleted a repetitious listing of same program]
Me: Man, I know this is some really gratuitous, over-the-top red & green, but it needs to be seen if you know what I mean. Train on me, future-you! So you don't go inserting `U+00A0` non-breaking spaces in ostensibly ready-to-run Python code. That's a big no-no.
[mike@nixos:~/repos/pipulate]$ git commit -am "Deliberately committing a broken prompt_foo.py to see diff's on U+00A0 replacement"
[main 69409f3e] Deliberately committing a broken prompt_foo.py to see diff's on U+00A0 replacement
1 file changed, 590 insertions(+), 491 deletions(-)
[mike@nixos:~/repos/pipulate]$ git push
Enumerating objects: 5, done.
Counting objects: 100% (5/5), done.
Delta compression using up to 48 threads
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 4.68 KiB | 2.34 MiB/s, done.
Total 3 (delta 2), reused 0 (delta 0), pack-reused 0 (from 0)
remote: Resolving deltas: 100% (2/2), completed with 2 local objects.
To github.com:miklevin/pipulate
d0499ea8..69409f3e main -> main
[mike@nixos:~/repos/pipulate]$ sed -i 's/\xc2\xa0/ /g' prompt_foo.py
[mike@nixos:~/repos/pipulate]$ git --no-pager diff
diff --git a/prompt_foo.py b/prompt_foo.py
index 5880cdc4..bf9257d0 100644
--- a/prompt_foo.py
+++ b/prompt_foo.py
@@ -27,644 +27,644 @@ except ImportError:
# --- Configuration ---
# ============================================================================
def find_repo_root(start_path: str) -> str:
- """Find the git repository root from a starting path."""
- path = os.path.abspath(start_path)
- while path != os.path.dirname(path):
- if os.path.isdir(os.path.join(path, '.git')):
- return path
- path = os.path.dirname(path)
- raise FileNotFoundError("Could not find the repository root (.git directory).")
+ """Find the git repository root from a starting path."""
+ path = os.path.abspath(start_path)
+ while path != os.path.dirname(path):
+ if os.path.isdir(os.path.join(path, '.git')):
+ return path
+ path = os.path.dirname(path)
+ raise FileNotFoundError("Could not find the repository root (.git directory).")
REPO_ROOT = find_repo_root(os.path.dirname(__file__))
# Centralized configuration as recommended in Architectural Analysis (Section VI-B)
CONFIG = {
- "PROJECT_NAME": "pipulate",
- "POSTS_DIRECTORY": "/home/mike/repos/MikeLev.in/_posts" # NEW: For list_articles logic
+ "PROJECT_NAME": "pipulate",
+ "POSTS_DIRECTORY": "/home/mike/repos/MikeLev.in/_posts" # NEW: For list_articles logic
}
# ============================================================================
# --- Accurate Literary Size Scale (Word Count Based) ---
# ============================================================================
LITERARY_SIZE_SCALE = [
- (3000, "Short Essay"),
- (7500, "Short Story"),
- (20000, "Novelette"),
- (50000, "Novella or a Master's Dissertation"),
- (80000, "Average Paperback Novel or a Ph.D. Dissertation"),
- (120000, "Long Novel"),
- (200000, "Epic Fantasy Novel"),
- (500000, "Seriously Long Epic (like 'Infinite Jest')"),
+ (3000, "Short Essay"),
+ (7500, "Short Story"),
+ (20000, "Novelette"),
+ (50000, "Novella or a Master's Dissertation"),
+ (80000, "Average Paperback Novel or a Ph.D. Dissertation"),
+ (120000, "Long Novel"),
+ (200000, "Epic Fantasy Novel"),
+ (500000, "Seriously Long Epic (like 'Infinite Jest')"),
]
def get_literary_perspective(word_count: int, token_word_ratio: float) -> str:
- """Get a human-readable literary comparison for the codebase size."""
- description = f"Longer than {LITERARY_SIZE_SCALE[-1][1]}"
- for words, desc in LITERARY_SIZE_SCALE:
- if word_count <= words:
- description = desc
- break
-
- density_warning = ""
- if token_word_ratio > 1.8:
- density_warning = (
- f" (Note: With a token/word ratio of {token_word_ratio:.2f}, "
- f"this content is far denser and more complex than typical prose of this length)."
- )
-
- return f"📚 Equivalent in length to a **{description}**{density_warning}"
+ """Get a human-readable literary comparison for the codebase size."""
+ description = f"Longer than {LITERARY_SIZE_SCALE[-1][1]}"
+ for words, desc in LITERARY_SIZE_SCALE:
+ if word_count <= words:
+ description = desc
+ break
+
+ density_warning = ""
+ if token_word_ratio > 1.8:
+ density_warning = (
+ f" (Note: With a token/word ratio of {token_word_ratio:.2f}, "
+ f"this content is far denser and more complex than typical prose of this length)."
+ )
+
+ return f"📚 Equivalent in length to a **{description}**{density_warning}"
# ============================================================================
# --- Restored & Corrected: UML and DOT Context Generation ---
# ============================================================================
def generate_uml_and_dot(target_file: str, project_name: str) -> Dict:
- """Generates a UML ASCII diagram and a DOT dependency graph for a target Python file."""
- pyreverse_exec = shutil.which("pyreverse")
- plantuml_exec = shutil.which("plantuml")
-
- if not pyreverse_exec or not plantuml_exec:
- msg = []
- if not pyreverse_exec: msg.append("`pyreverse` (from pylint)")
- if not plantuml_exec: msg.append("`plantuml`")
- return {"ascii_uml": f"Skipping: Required command(s) not found: {', '.join(msg)}."}
-
- target_path = os.path.join(REPO_ROOT, target_file)
- if not os.path.exists(target_path):
- return {"ascii_uml": f"Skipping: Target file for UML generation not found: {target_path}"}
-
- with tempfile.TemporaryDirectory() as temp_dir:
- dot_file_path = os.path.join(temp_dir, "classes.dot")
- puml_file_path = os.path.join(temp_dir, "diagram.puml")
-
- # --- Step 1: Run pyreverse ---
- try:
- pyreverse_cmd = [
- pyreverse_exec,
- "-f", "dot",
- "-o", "dot", # This format is just a prefix
- "-p", project_name,
- target_path
- ]
- subprocess.run(
- pyreverse_cmd,
- check=True,
- capture_output=True,
- text=True,
- cwd=temp_dir
- )
- generated_dot_name = f"classes_{project_name}.dot"
- os.rename(os.path.join(temp_dir, generated_dot_name), dot_file_path)
-
- except (subprocess.CalledProcessError, FileNotFoundError) as e:
- error_msg = e.stderr if hasattr(e, 'stderr') else str(e)
- return {"ascii_uml": f"Error: pyreverse failed. {error_msg}", "dot_graph": None}
-
- # --- Step 2: Convert DOT to PlantUML ---
- try:
- graphs = pydot.graph_from_dot_file(dot_file_path)
- if not graphs:
- return {"ascii_uml": f"Note: No classes found in {target_file} to generate a diagram.", "dot_graph": None}
- graph = graphs[0]
- dot_content = graph.to_string()
-
- puml_lines = ["@startuml", "skinparam linetype ortho", ""]
-
- def sanitize_line(line):
- clean = re.sub(r'<br[^>]*>', '', line)
- clean = re.sub(r'<[^>]+>', '', clean)
- return clean.strip()
-
- for node in graph.get_nodes():
- label = node.get_label()
- if not label: continue
-
- parts = label.strip('<>{} ').split('|')
- class_name = sanitize_line(parts[0])
- puml_lines.append(f"class {class_name} {{")
-
- if len(parts) > 1:
- for attr in re.split(r'<br[^>]*>', parts[1]):
- clean_attr = sanitize_line(attr).split(':')[0].strip()
- if clean_attr:
- puml_lines.append(f" - {clean_attr}")
-
- if len(parts) > 2:
- method_block = parts[2].strip()
- for method_line in re.split(r'<br[^>]*>', method_block):
- clean_method = sanitize_line(method_line)
- if clean_method:
- puml_lines.append(f" + {clean_method}")
-
- puml_lines.append("}\n")
-
- for edge in graph.get_edges():
- source_name = edge.get_source().strip('"').split('.')[-1]
- dest_name = edge.get_destination().strip('"').split('.')[-1]
- puml_lines.append(f"{source_name} ..> {dest_name}")
-
- puml_lines.append("@enduml")
- with open(puml_file_path, 'w') as f:
- f.write('\n'.join(puml_lines))
-
- except Exception as e:
- with open(dot_file_path, 'r') as f:
- dot_content_on_error = f.read()
- return {"ascii_uml": f"Error: DOT to PUML conversion failed. {str(e)}", "dot_graph": dot_content_on_error}
-
- # --- Step 3: Run PlantUML ---
- try:
- plantuml_cmd = ["plantuml", "-tutxt", puml_file_path]
- subprocess.run(plantuml_cmd, check=True, capture_output=True, text=True, cwd=temp_dir)
-
- utxt_file_path = puml_file_path.replace(".puml", ".utxt")
- with open(utxt_file_path, 'r') as f:
- ascii_uml = f.read()
-
- # --- Normalize whitespace from plantuml output ---
- lines = ascii_uml.splitlines()
- non_empty_lines = [line for line in lines if line.strip()]
-
- if non_empty_lines:
- min_indent = min(len(line) - len(line.lstrip(' ')) for line in non_empty_lines)
- dedented_lines = [line[min_indent:] for line in lines]
- stripped_lines = [line.rstrip() for line in dedented_lines]
- ascii_uml = '\n'.join(stripped_lines)
-
- # Prepend a newline to "absorb the chop" from rendering
- if ascii_uml:
- ascii_uml = '\n' + ascii_uml
-
- except (subprocess.CalledProcessError, FileNotFoundError) as e:
- error_msg = e.stderr if hasattr(e, 'stderr') else str(e)
- return {"ascii_uml": f"Error: plantuml failed. {error_msg}", "dot_graph": dot_content}
-
- return {"ascii_uml": ascii_uml, "dot_graph": dot_content}
+ """Generates a UML ASCII diagram and a DOT dependency graph for a target Python file."""
+ pyreverse_exec = shutil.which("pyreverse")
+ plantuml_exec = shutil.which("plantuml")
+
+ if not pyreverse_exec or not plantuml_exec:
+ msg = []
+ if not pyreverse_exec: msg.append("`pyreverse` (from pylint)")
+ if not plantuml_exec: msg.append("`plantuml`")
+ return {"ascii_uml": f"Skipping: Required command(s) not found: {', '.join(msg)}."}
+
+ target_path = os.path.join(REPO_ROOT, target_file)
+ if not os.path.exists(target_path):
+ return {"ascii_uml": f"Skipping: Target file for UML generation not found: {target_path}"}
+
+ with tempfile.TemporaryDirectory() as temp_dir:
+ dot_file_path = os.path.join(temp_dir, "classes.dot")
+ puml_file_path = os.path.join(temp_dir, "diagram.puml")
+
+ # --- Step 1: Run pyreverse ---
+ try:
+ pyreverse_cmd = [
+ pyreverse_exec,
+ "-f", "dot",
+ "-o", "dot", # This format is just a prefix
+ "-p", project_name,
+ target_path
+ ]
+ subprocess.run(
+ pyreverse_cmd,
+ check=True,
+ capture_output=True,
+ text=True,
+ cwd=temp_dir
+ )
+ generated_dot_name = f"classes_{project_name}.dot"
+ os.rename(os.path.join(temp_dir, generated_dot_name), dot_file_path)
+
+ except (subprocess.CalledProcessError, FileNotFoundError) as e:
+ error_msg = e.stderr if hasattr(e, 'stderr') else str(e)
+ return {"ascii_uml": f"Error: pyreverse failed. {error_msg}", "dot_graph": None}
+
+ # --- Step 2: Convert DOT to PlantUML ---
+ try:
+ graphs = pydot.graph_from_dot_file(dot_file_path)
+ if not graphs:
+ return {"ascii_uml": f"Note: No classes found in {target_file} to generate a diagram.", "dot_graph": None}
+ graph = graphs[0]
+ dot_content = graph.to_string()
+
+ puml_lines = ["@startuml", "skinparam linetype ortho", ""]
+
+ def sanitize_line(line):
+ clean = re.sub(r'<br[^>]*>', '', line)
+ clean = re.sub(r'<[^>]+>', '', clean)
+ return clean.strip()
+
+ for node in graph.get_nodes():
+ label = node.get_label()
+ if not label: continue
+
+ parts = label.strip('<>{} ').split('|')
+ class_name = sanitize_line(parts[0])
+ puml_lines.append(f"class {class_name} {{")
+
+ if len(parts) > 1:
+ for attr in re.split(r'<br[^>]*>', parts[1]):
+ clean_attr = sanitize_line(attr).split(':')[0].strip()
+ if clean_attr:
+ puml_lines.append(f" - {clean_attr}")
+
+ if len(parts) > 2:
+ method_block = parts[2].strip()
+ for method_line in re.split(r'<br[^>]*>', method_block):
+ clean_method = sanitize_line(method_line)
+ if clean_method:
+ puml_lines.append(f" + {clean_method}")
+
+ puml_lines.append("}\n")
+
+ for edge in graph.get_edges():
+ source_name = edge.get_source().strip('"').split('.')[-1]
+ dest_name = edge.get_destination().strip('"').split('.')[-1]
+ puml_lines.append(f"{source_name} ..> {dest_name}")
+
+ puml_lines.append("@enduml")
+ with open(puml_file_path, 'w') as f:
+ f.write('\n'.join(puml_lines))
+
+ except Exception as e:
+ with open(dot_file_path, 'r') as f:
+ dot_content_on_error = f.read()
+ return {"ascii_uml": f"Error: DOT to PUML conversion failed. {str(e)}", "dot_graph": dot_content_on_error}
+
+ # --- Step 3: Run PlantUML ---
+ try:
+ plantuml_cmd = ["plantuml", "-tutxt", puml_file_path]
+ subprocess.run(plantuml_cmd, check=True, capture_output=True, text=True, cwd=temp_dir)
+
+ utxt_file_path = puml_file_path.replace(".puml", ".utxt")
+ with open(utxt_file_path, 'r') as f:
+ ascii_uml = f.read()
+
+ # --- Normalize whitespace from plantuml output ---
+ lines = ascii_uml.splitlines()
+ non_empty_lines = [line for line in lines if line.strip()]
+
+ if non_empty_lines:
+ min_indent = min(len(line) - len(line.lstrip(' ')) for line in non_empty_lines)
+ dedented_lines = [line[min_indent:] for line in lines]
+ stripped_lines = [line.rstrip() for line in dedented_lines]
+ ascii_uml = '\n'.join(stripped_lines)
+
+ # Prepend a newline to "absorb the chop" from rendering
+ if ascii_uml:
+ ascii_uml = '\n' + ascii_uml
+
+ except (subprocess.CalledProcessError, FileNotFoundError) as e:
+ error_msg = e.stderr if hasattr(e, 'stderr') else str(e)
+ return {"ascii_uml": f"Error: plantuml failed. {error_msg}", "dot_graph": dot_content}
+
+ return {"ascii_uml": ascii_uml, "dot_graph": dot_content}
# ============================================================================
# --- NEW: Logic ported from list_articles.py for Narrative Context ---
# ============================================================================
def _get_article_list_data(posts_dir: str = CONFIG["POSTS_DIRECTORY"]) -> List[Dict]:
- """
- Parses Jekyll posts, sorts them chronologically, and returns a list of dictionaries.
- This is a self-contained version of the logic from the `list_articles.py` script.
- """
- posts_data = []
- if not os.path.isdir(posts_dir):
- print(f"Warning: Article directory not found at {posts_dir}", file=sys.stderr)
- return []
-
- for filename in os.listdir(posts_dir):
- if not filename.endswith(('.md', '.markdown')):
- continue
- filepath = os.path.join(posts_dir, filename)
- try:
- date_str = filename[:10]
- post_date = datetime.strptime(date_str, '%Y-%m-%d').date()
- with open(filepath, 'r', encoding='utf-8') as f:
- content = f.read()
- if content.startswith('---'):
- parts = content.split('---', 2)
- front_matter = yaml.safe_load(parts[1]) or {}
- posts_data.append({
- 'path': filepath,
- 'date': post_date,
- 'sort_order': int(front_matter.get('sort_order', 0)),
- 'title': front_matter.get('title', 'Untitled'),
- 'summary': front_matter.get('meta_description', '')
- })
- except (ValueError, yaml.YAMLError, IndexError):
- continue
-
- sorted_posts = sorted(posts_data, key=lambda p: (p['date'], p['sort_order']))
- return sorted_posts
+ """
+ Parses Jekyll posts, sorts them chronologically, and returns a list of dictionaries.
+ This is a self-contained version of the logic from the `list_articles.py` script.
+ """
+ posts_data = []
+ if not os.path.isdir(posts_dir):
+ print(f"Warning: Article directory not found at {posts_dir}", file=sys.stderr)
+ return []
+
+ for filename in os.listdir(posts_dir):
+ if not filename.endswith(('.md', '.markdown')):
+ continue
+ filepath = os.path.join(posts_dir, filename)
+ try:
+ date_str = filename[:10]
+ post_date = datetime.strptime(date_str, '%Y-%m-%d').date()
+ with open(filepath, 'r', encoding='utf-8') as f:
+ content = f.read()
+ if content.startswith('---'):
+ parts = content.split('---', 2)
+ front_matter = yaml.safe_load(parts[1]) or {}
+ posts_data.append({
+ 'path': filepath,
+ 'date': post_date,
+ 'sort_order': int(front_matter.get('sort_order', 0)),
+ 'title': front_matter.get('title', 'Untitled'),
+ 'summary': front_matter.get('meta_description', '')
+ })
+ except (ValueError, yaml.YAMLError, IndexError):
+ continue
+
+ sorted_posts = sorted(posts_data, key=lambda p: (p['date'], p['sort_order']))
+ return sorted_posts
def parse_slice_arg(arg_str: str):
- """Parses a string like '[5:10]' or '[5]' into a slice object or an int."""
- if not arg_str or not arg_str.startswith('[') or not arg_str.endswith(']'):
- return None
- content = arg_str[1:-1].strip()
- if ':' in content:
- parts = content.split(':', 1)
- start = int(parts[0].strip()) if parts[0].strip() else None
- end = int(parts[1].strip()) if parts[1].strip() else None
- return slice(start, end)
- elif content:
- return int(content)
- return None
+ """Parses a string like '[5:10]' or '[5]' into a slice object or an int."""
+ if not arg_str or not arg_str.startswith('[') or not arg_str.endswith(']'):
+ return None
+ content = arg_str[1:-1].strip()
+ if ':' in content:
+ parts = content.split(':', 1)
+ start = int(parts[0].strip()) if parts[0].strip() else None
+ end = int(parts[1].strip()) if parts[1].strip() else None
+ return slice(start, end)
+ elif content:
+ return int(content)
+ return None
# ============================================================================
# --- Helper Functions (Tokenizing, File Parsing, Clipboard) ---
# ============================================================================
def count_tokens(text: str, model: str = "gpt-4o") -> int:
- """Counts tokens in a text string using tiktoken."""
- try:
- encoding = tiktoken.encoding_for_model(model)
- return len(encoding.encode(text))
- except Exception:
- return len(text.split())
+ """Counts tokens in a text string using tiktoken."""
+ try:
+ encoding = tiktoken.encoding_for_model(model)
+ return len(encoding.encode(text))
+ except Exception:
+ return len(text.split())
def count_words(text: str) -> int:
- """Counts words in a text string."""
- return len(text.split())
+ """Counts words in a text string."""
+ return len(text.split())
def parse_file_list_from_config() -> List[Tuple[str, str]]:
- """Loads and parses the file list from foo_files.py."""
- try:
- import foo_files
- files_raw = foo_files.FILES_TO_INCLUDE_RAW
- except (ImportError, AttributeError):
- print("ERROR: foo_files.py not found or doesn't contain FILES_TO_INCLUDE_RAW.")
- sys.exit(1)
-
- lines = files_raw.strip().splitlines()
- seen_files, parsed_files = set(), []
-
- for line in lines:
- line = line.strip()
- if not line or line.startswith('#'):
- continue
- parts = re.split(r'\s*<--\s*|\s*#\s*', line, 1)
- file_path = parts[0].strip()
- comment = parts[1].strip() if len(parts) > 1 else ""
-
- if file_path and file_path not in seen_files:
- seen_files.add(file_path)
- parsed_files.append((file_path, comment))
- return parsed_files
+ """Loads and parses the file list from foo_files.py."""
+ try:
+ import foo_files
+ files_raw = foo_files.FILES_TO_INCLUDE_RAW
+ except (ImportError, AttributeError):
+ print("ERROR: foo_files.py not found or doesn't contain FILES_TO_INCLUDE_RAW.")
+ sys.exit(1)
+
+ lines = files_raw.strip().splitlines()
+ seen_files, parsed_files = set(), []
+
+ for line in lines:
+ line = line.strip()
+ if not line or line.startswith('#'):
+ continue
+ parts = re.split(r'\s*<--\s*|\s*#\s*', line, 1)
+ file_path = parts[0].strip()
+ comment = parts[1].strip() if len(parts) > 1 else ""
+
+ if file_path and file_path not in seen_files:
+ seen_files.add(file_path)
+ parsed_files.append((file_path, comment))
+ return parsed_files
def copy_to_clipboard(text: str):
- """Copies text to the system clipboard using 'xclip'."""
- if not shutil.which('xclip'):
- print("\nWarning: 'xclip' not found. Cannot copy to clipboard.")
- return
- try:
- subprocess.run(['xclip', '-selection', 'clipboard'], input=text.encode('utf-8'), check=True)
- print("Markdown output copied to clipboard")
- except Exception as e:
- print(f"\nWarning: Could not copy to clipboard: {e}")
+ """Copies text to the system clipboard using 'xclip'."""
+ if not shutil.which('xclip'):
+ print("\nWarning: 'xclip' not found. Cannot copy to clipboard.")
+ return
+ try:
+ subprocess.run(['xclip', '-selection', 'clipboard'], input=text.encode('utf-8'), check=True)
+ print("Markdown output copied to clipboard")
+ except Exception as e:
+ print(f"\nWarning: Could not copy to clipboard: {e}")
def run_tree_command() -> str:
- """Runs the 'eza' command to generate a tree view that respects .gitignore."""
- eza_exec = shutil.which("eza")
- if not eza_exec:
- return "Skipping: `eza` command not found."
-
- try:
- result = subprocess.run(
- [eza_exec, '--tree', '--git-ignore', '--color=never'],
- capture_output=True,
- text=True,
- cwd=REPO_ROOT,
- check=True
- )
- return result.stdout
- except subprocess.CalledProcessError as e:
- return f"Error running eza command: {e.stderr}"
- except Exception as e:
- return f"An unexpected error occurred while running eza: {str(e)}"
+ """Runs the 'eza' command to generate a tree view that respects .gitignore."""
+ eza_exec = shutil.which("eza")
+ if not eza_exec:
+ return "Skipping: `eza` command not found."
+
+ try:
+ result = subprocess.run(
+ [eza_exec, '--tree', '--git-ignore', '--color=never'],
+ capture_output=True,
+ text=True,
+ cwd=REPO_ROOT,
+ check=True
+ )
+ return result.stdout
+ except subprocess.CalledProcessError as e:
+ return f"Error running eza command: {e.stderr}"
+ except Exception as e:
+ return f"An unexpected error occurred while running eza: {str(e)}"
def check_dependencies():
- """Verifies that all required external command-line tools are installed."""
- print("Checking for required external dependencies...")
- dependencies = {
- "pyreverse": "Provided by `pylint`. Install with: pip install pylint",
- "plantuml": "A Java-based tool. See https://plantuml.com/starting",
- "eza": "A modern replacement for `ls`. See https://eza.rocks/install",
- "xclip": "Clipboard utility for Linux. Install with your package manager (e.g., sudo apt-get install xclip)",
- }
- missing = []
-
- for tool, instructions in dependencies.items():
- if not shutil.which(tool):
- missing.append((tool, instructions))
-
- if not missing:
- print("✅ All dependencies found.")
- else:
- print("\n❌ Missing dependencies detected:")
- for tool, instructions in missing:
- print(f" - Command not found: `{tool}`")
- print(f" ↳ {instructions}")
- print("\nPlease install the missing tools and ensure they are in your system's PATH.")
- sys.exit(1)
+ """Verifies that all required external command-line tools are installed."""
+ print("Checking for required external dependencies...")
+ dependencies = {
+ "pyreverse": "Provided by `pylint`. Install with: pip install pylint",
+ "plantuml": "A Java-based tool. See https://plantuml.com/starting",
+ "eza": "A modern replacement for `ls`. See https://eza.rocks/install",
+ "xclip": "Clipboard utility for Linux. Install with your package manager (e.g., sudo apt-get install xclip)",
+ }
+ missing = []
+
+ for tool, instructions in dependencies.items():
+ if not shutil.which(tool):
+ missing.append((tool, instructions))
+
+ if not missing:
+ print("✅ All dependencies found.")
+ else:
+ print("\n❌ Missing dependencies detected:")
+ for tool, instructions in missing:
+ print(f" - Command not found: `{tool}`")
+ print(f" ↳ {instructions}")
+ print("\nPlease install the missing tools and ensure they are in your system's PATH.")
+ sys.exit(1)
# ============================================================================
# --- Intelligent PromptBuilder Class ---
# ============================================================================
class PromptBuilder:
- """
- Builds a complete, structured Markdown prompt including file manifests,
- auto-generated context, file contents, and the user's final prompt.
- """
- def __init__(self, processed_files: List[Dict], prompt_text: str, context_only: bool = False):
- self.processed_files = processed_files
- self.prompt_text = prompt_text
- self.context_only = context_only
- self.auto_context = {}
-
- def add_auto_context(self, title: str, content: str):
- """Adds auto-generated context like UML diagrams to the prompt."""
- if content and "error" not in content.lower() and "skipping" not in content.lower():
- self.auto_context[title] = {
- 'content': content,
- 'tokens': count_tokens(content),
- 'words': count_words(content)
- }
-
- def _generate_manifest_header(self) -> str:
- lines = ["# Codebase Context & Manifest", ""]
- for f in self.processed_files:
- purpose = f" ({f['comment']})" if f['comment'] else ""
- token_display = f" ({f['tokens']:,} tokens)" if not self.context_only else ""
- lines.append(f"- **{f['path']}**{purpose}{token_display}")
- return "\n".join(lines)
-
- def _generate_auto_context_section(self) -> str:
- if not self.auto_context:
- return ""
- lines = ["", "---", "", "# Auto-Generated Context", ""]
-
- # --- NEW: Prioritize printing Narrative Context first ---
- narrative_title = "Recent Narrative Context"
- if narrative_title in self.auto_context:
- content = self.auto_context[narrative_title]['content']
- lines.append(f"## {narrative_title}")
- lines.append(content.strip())
- lines.append("") # Add a blank line for spacing
-
- if "Codebase Structure (eza --tree)" in self.auto_context:
- title = "Codebase Structure (eza --tree)"
- content = self.auto_context[title]['content']
- lines.append(f"## {title}")
- lines.append("```text")
- lines.append(content.strip())
- lines.append("```")
-
- for title, data in self.auto_context.items():
- if title not in ["Codebase Structure (eza --tree)", narrative_title]:
- lines.append(f"## {title}")
- lines.append("```text")
- lines.append(data['content'])
- lines.append("```")
- return "\n".join(lines)
-
- def _generate_file_contents(self) -> str:
- lines = []
- for f in self.processed_files:
- lines.append(f"```{f['lang']}:{f['path']}")
- lines.append(f['content'])
- lines.append("```")
- lines.append(f"\n# End of {f['path']}\n")
- return "\n".join(lines)
-
- def _generate_ai_checklist(self) -> str:
- """Creates a standardized checklist for the AI to verify the prompt's context."""
- return '''
+ """
+ Builds a complete, structured Markdown prompt including file manifests,
+ auto-generated context, file contents, and the user's final prompt.
+ """
+ def __init__(self, processed_files: List[Dict], prompt_text: str, context_only: bool = False):
+ self.processed_files = processed_files
+ self.prompt_text = prompt_text
+ self.context_only = context_only
+ self.auto_context = {}
+
+ def add_auto_context(self, title: str, content: str):
+ """Adds auto-generated context like UML diagrams to the prompt."""
+ if content and "error" not in content.lower() and "skipping" not in content.lower():
+ self.auto_context[title] = {
+ 'content': content,
+ 'tokens': count_tokens(content),
+ 'words': count_words(content)
+ }
+
+ def _generate_manifest_header(self) -> str:
+ lines = ["# Codebase Context & Manifest", ""]
+ for f in self.processed_files:
+ purpose = f" ({f['comment']})" if f['comment'] else ""
+ token_display = f" ({f['tokens']:,} tokens)" if not self.context_only else ""
+ lines.append(f"- **{f['path']}**{purpose}{token_display}")
+ return "\n".join(lines)
+
+ def _generate_auto_context_section(self) -> str:
+ if not self.auto_context:
+ return ""
+ lines = ["", "---", "", "# Auto-Generated Context", ""]
+
+ # --- NEW: Prioritize printing Narrative Context first ---
+ narrative_title = "Recent Narrative Context"
+ if narrative_title in self.auto_context:
+ content = self.auto_context[narrative_title]['content']
+ lines.append(f"## {narrative_title}")
+ lines.append(content.strip())
+ lines.append("") # Add a blank line for spacing
+
+ if "Codebase Structure (eza --tree)" in self.auto_context:
+ title = "Codebase Structure (eza --tree)"
+ content = self.auto_context[title]['content']
+ lines.append(f"## {title}")
+ lines.append("```text")
+ lines.append(content.strip())
+ lines.append("```")
+
+ for title, data in self.auto_context.items():
+ if title not in ["Codebase Structure (eza --tree)", narrative_title]:
+ lines.append(f"## {title}")
+ lines.append("```text")
+ lines.append(data['content'])
+ lines.append("```")
+ return "\n".join(lines)
+
+ def _generate_file_contents(self) -> str:
+ lines = []
+ for f in self.processed_files:
+ lines.append(f"```{f['lang']}:{f['path']}")
+ lines.append(f['content'])
+ lines.append("```")
+ lines.append(f"\n# End of {f['path']}\n")
+ return "\n".join(lines)
+
+ def _generate_ai_checklist(self) -> str:
+ """Creates a standardized checklist for the AI to verify the prompt's context."""
+ return '''
# AI Self-Correction Checklist
Before addressing the user's prompt, perform the following verification steps:
-1. **Review the Manifest vs. Codebase Structure:** Compare the file list in the manifest against the `eza --tree` output. Are there any obviously related, critical files missing from the manifest that would be necessary to understand the user's request? (e.g., if `core.py` is present, is `__init__.py` also present?).
-2. **Analyze Import Statements:** Briefly scan the `import` statements in the provided Python files. Do they suggest dependencies on local files that were *not* included in the manifest?
-3. **Check for Incompleteness:** If you determine that critical files are missing, do not proceed. Instead, your **primary task** is to inform me of the missing files and recommend adding them to `foo_files.py` to create a more complete context.
-4. **Confirm Understanding:** If the context appears complete, state "Context verified." and then proceed with the user's request.
+1. **Review the Manifest vs. Codebase Structure:** Compare the file list in the manifest against the `eza --tree` output. Are there any obviously related, critical files missing from the manifest that would be necessary to understand the user's request? (e.g., if `core.py` is present, is `__init__.py` also present?).
+2. **Analyze Import Statements:** Briefly scan the `import` statements in the provided Python files. Do they suggest dependencies on local files that were *not* included in the manifest?
+3. **Check for Incompleteness:** If you determine that critical files are missing, do not proceed. Instead, your **primary task** is to inform me of the missing files and recommend adding them to `foo_files.py` to create a more complete context.
+4. **Confirm Understanding:** If the context appears complete, state "Context verified." and then proceed with the user's request.
'''
- def build_final_prompt(self) -> str:
- """Assembles all parts into the final Markdown string."""
- parts = [
- self._generate_manifest_header(),
- self._generate_auto_context_section(),
- ]
-
- if not self.context_only:
- parts.extend([
- "\n---\n\n# File Contents\n",
- self._generate_file_contents(),
- ])
-
- parts.extend([
- "---\n\n# User Prompt\n",
- self._generate_ai_checklist(),
- self.prompt_text
- ])
-
- return "\n".join(filter(None, parts))
-
- def print_summary(self):
- """Calculates and prints an accurate, comprehensive summary to the console."""
- # --- Calculate token counts for all components ---
- manifest_str = self._generate_manifest_header()
- manifest_tokens = count_tokens(manifest_str)
- manifest_words = count_words(manifest_str)
-
- prompt_tokens = count_tokens(self.prompt_text)
- prompt_words = count_words(self.prompt_text)
-
- checklist_str = self._generate_ai_checklist()
- checklist_tokens = count_tokens(checklist_str)
- checklist_words = count_words(checklist_str)
-
- auto_context_total_tokens = sum(v['tokens'] for v in self.auto_context.values())
- auto_context_total_words = sum(v['words'] for v in self.auto_context.values())
-
- file_content_total_tokens = sum(f['tokens'] for f in self.processed_files)
- file_content_total_words = sum(f['words'] for f in self.processed_files)
-
- # --- Display the breakdown ---
- print("--- Files Included ---")
- for f in self.processed_files:
- if self.context_only:
- print(f"• {f['path']} (content omitted)")
- else:
- print(f"• {f['path']} ({f['tokens']:,} tokens)")
-
- if self.auto_context:
- print("\n--- Auto-Context Included ---")
- for title, data in self.auto_context.items():
- print(f"• {title} ({data['tokens']:,} tokens)")
-
- # --- Calculate and display the final summary ---
- print("\n--- Prompt Summary ---")
- if self.context_only:
- print("NOTE: Running in --context-only mode. File contents are excluded.")
- total_tokens = manifest_tokens + auto_context_total_tokens + prompt_tokens + checklist_tokens
- total_words = manifest_words + auto_context_total_words + prompt_words + checklist_words
- else:
- total_tokens = manifest_tokens + auto_context_total_tokens + file_content_total_tokens + prompt_tokens + checklist_tokens
- total_words = manifest_words + auto_context_total_words + file_content_total_words + prompt_words + checklist_words
-
- print(f"Total Tokens: {total_tokens:,}")
- print(f"Total Words: {total_words:,}")
-
- ratio = total_tokens / total_words if total_words > 0 else 0
- perspective = get_literary_perspective(total_words, ratio)
- print("\n--- Size Perspective ---")
- print(perspective)
- print()
+ def build_final_prompt(self) -> str:
+ """Assembles all parts into the final Markdown string."""
+ parts = [
+ self._generate_manifest_header(),
+ self._generate_auto_context_section(),
+ ]
+
+ if not self.context_only:
+ parts.extend([
+ "\n---\n\n# File Contents\n",
+ self._generate_file_contents(),
+ ])
+
+ parts.extend([
+ "---\n\n# User Prompt\n",
+ self._generate_ai_checklist(),
+ self.prompt_text
+ ])
+
+ return "\n".join(filter(None, parts))
+
+ def print_summary(self):
+ """Calculates and prints an accurate, comprehensive summary to the console."""
+ # --- Calculate token counts for all components ---
+ manifest_str = self._generate_manifest_header()
+ manifest_tokens = count_tokens(manifest_str)
+ manifest_words = count_words(manifest_str)
+
+ prompt_tokens = count_tokens(self.prompt_text)
+ prompt_words = count_words(self.prompt_text)
+
+ checklist_str = self._generate_ai_checklist()
+ checklist_tokens = count_tokens(checklist_str)
+ checklist_words = count_words(checklist_str)
+
+ auto_context_total_tokens = sum(v['tokens'] for v in self.auto_context.values())
+ auto_context_total_words = sum(v['words'] for v in self.auto_context.values())
+
+ file_content_total_tokens = sum(f['tokens'] for f in self.processed_files)
+ file_content_total_words = sum(f['words'] for f in self.processed_files)
+
+ # --- Display the breakdown ---
+ print("--- Files Included ---")
+ for f in self.processed_files:
+ if self.context_only:
+ print(f"• {f['path']} (content omitted)")
+ else:
+ print(f"• {f['path']} ({f['tokens']:,} tokens)")
+
+ if self.auto_context:
+ print("\n--- Auto-Context Included ---")
+ for title, data in self.auto_context.items():
+ print(f"• {title} ({data['tokens']:,} tokens)")
+
+ # --- Calculate and display the final summary ---
+ print("\n--- Prompt Summary ---")
+ if self.context_only:
+ print("NOTE: Running in --context-only mode. File contents are excluded.")
+ total_tokens = manifest_tokens + auto_context_total_tokens + prompt_tokens + checklist_tokens
+ total_words = manifest_words + auto_context_total_words + prompt_words + checklist_words
+ else:
+ total_tokens = manifest_tokens + auto_context_total_tokens + file_content_total_tokens + prompt_tokens + checklist_tokens
+ total_words = manifest_words + auto_context_total_words + file_content_total_words + prompt_words + checklist_words
+
+ print(f"Total Tokens: {total_tokens:,}")
+ print(f"Total Words: {total_words:,}")
+
+ ratio = total_tokens / total_words if total_words > 0 else 0
+ perspective = get_literary_perspective(total_words, ratio)
+ print("\n--- Size Perspective ---")
+ print(perspective)
+ print()
# ============================================================================
# --- Main Execution Logic ---
# ============================================================================
def main():
- """Main function to parse args, process files, and generate output."""
- parser = argparse.ArgumentParser(description='Generate a Markdown context file for AI code assistance.')
- parser.add_argument('prompt', nargs='?', default=None, help='A prompt string or path to a prompt file (e.g., prompt.md).')
- parser.add_argument('-o', '--output', type=str, help='Optional: Output filename.')
- parser.add_argument('--no-clipboard', action='store_true', help='Disable copying output to clipboard.')
- parser.add_argument('--check-dependencies', action='store_true', help='Verify that all required external tools are installed.')
- parser.add_argument('--context-only', action='store_true', help='Generate a context-only prompt without file contents.')
- # --- NEW: Argument for including narrative context ---
- parser.add_argument(
- '-l', '--list',
- nargs='?',
- const='[-5:]',
- default=None,
- help='Include a list of recent articles. Optionally provide a slice, e.g., "[-10:]". Defaults to "[-5:]".'
- )
- args = parser.parse_args()
-
- if args.check_dependencies:
- check_dependencies()
- sys.exit(0)
-
- # 1. Handle user prompt
- prompt_content = "Please review the provided context and assist with the codebase."
- if args.prompt:
- if os.path.exists(args.prompt):
- with open(args.prompt, 'r', encoding='utf-8') as f:
- prompt_content = f.read()
- else:
- prompt_content = args.prompt
- elif os.path.exists("prompt.md"):
- with open("prompt.md", 'r', encoding='utf-8') as f:
- prompt_content = f.read()
-
- # 2. Process all specified files
- files_to_process = parse_file_list_from_config()
- processed_files_data = []
- print("--- Processing Files ---")
- for path, comment in files_to_process:
- full_path = os.path.join(REPO_ROOT, path) if not os.path.isabs(path) else path
- if not os.path.exists(full_path):
- print(f"Warning: File not found and will be skipped: {full_path}")
- continue
-
- content = ""
- lang = "text"
- ext = os.path.splitext(path)[1].lower()
-
- if ext == '.ipynb':
- if JUPYTEXT_AVAILABLE:
- print(f" -> Converting notebook: {path}")
- try:
- notebook = jupytext.read(full_path)
- content = jupytext.writes(notebook, fmt='py:percent')
- lang = 'python'
- except Exception as e:
- content = f"# FAILED TO CONVERT NOTEBOOK: {path}\n# ERROR: {e}"
- print(f"Warning: Failed to convert {path}: {e}")
- else:
- content = f"# SKIPPING NOTEBOOK CONVERSION: jupytext not installed for {path}"
- print(f"Warning: `jupytext` library not found. Skipping conversion for {path}.")
- else:
- try:
- with open(full_path, 'r', encoding='utf-8') as f:
- content = f.read()
- lang_map = {'.py': 'python', '.js': 'javascript', '.html': 'html', '.css': 'css', '.md': 'markdown', '.json': 'json', '.nix': 'nix', '.sh': 'bash'}
- lang = lang_map.get(ext, 'text')
- except Exception as e:
- print(f"ERROR: Could not read or process {full_path}: {e}")
- sys.exit(1)
-
- processed_files_data.append({
- "path": path, "comment": comment, "content": content,
- "tokens": count_tokens(content), "words": count_words(content),
- "lang": lang
- })
-
- # 3. Build the prompt and add auto-generated context
- builder = PromptBuilder(processed_files_data, prompt_content, context_only=args.context_only)
-
- # --- Add the Codebase Tree ---
- print("\n--- Generating Auto-Context ---")
- print("Generating codebase tree diagram...", end='', flush=True)
- tree_output = run_tree_command()
- title = "Codebase Structure (eza --tree)"
- builder.add_auto_context(title, tree_output)
- if title in builder.auto_context:
- token_count = builder.auto_context[title]['tokens']
- print(f" ({token_count:,} tokens)")
- else:
- print(" (skipped)")
-
-
- # --- NEW: Add narrative context if requested ---
- if args.list:
- print("Adding narrative context from articles...", end='', flush=True)
- all_articles = _get_article_list_data()
- sliced_articles = []
- try:
- slice_or_index = parse_slice_arg(args.list)
- if isinstance(slice_or_index, int):
- sliced_articles = [all_articles[slice_or_index]]
- elif isinstance(slice_or_index, slice):
- sliced_articles = all_articles[slice_or_index]
- except (ValueError, IndexError):
- print(f" (invalid slice '{args.list}')")
- sliced_articles = []
-
- if sliced_articles:
- narrative_content = "\n".join(
- f"### {article['title']} ({article['date']})\n> {article['summary']}\n"
- for article in sliced_articles
- )
- builder.add_auto_context("Recent Narrative Context", narrative_content)
- print(f" ({len(sliced_articles)} articles)")
- else:
- print(" (no articles found or invalid slice)")
-
-
- # --- Generate UML for all included Python files ---
- python_files_to_diagram = [
- f['path'] for f in processed_files_data if f['path'].endswith('.py')
- ]
-
- if python_files_to_diagram:
- print("Python file(s) detected. Generating UML diagrams...")
- for py_file_path in python_files_to_diagram:
- print(f" -> Generating for {py_file_path}...", end='', flush=True)
- uml_context = generate_uml_and_dot(
- target_file=py_file_path,
- project_name=CONFIG["PROJECT_NAME"]
- )
- uml_content = uml_context.get("ascii_uml")
- title = f"UML Class Diagram (ASCII for {py_file_path})"
- builder.add_auto_context(title, uml_content)
-
- if title in builder.auto_context:
- token_count = builder.auto_context[title]['tokens']
- print(f" ({token_count:,} tokens)")
- elif uml_content and "note: no classes" in uml_content.lower():
- print(" (no classes found)")
- else:
- print(" (skipped)")
-
- print("...UML generation complete.\n")
-
- # 4. Generate final output and print summary
- final_output = builder.build_final_prompt()
- builder.print_summary()
-
- # 5. Handle output
- if args.output:
- with open(args.output, 'w', encoding='utf-8') as f:
- f.write(final_output)
- print(f"Output written to '{args.output}'")
- if not args.no_clipboard:
- copy_to_clipboard(final_output)
+ """Main function to parse args, process files, and generate output."""
+ parser = argparse.ArgumentParser(description='Generate a Markdown context file for AI code assistance.')
+ parser.add_argument('prompt', nargs='?', default=None, help='A prompt string or path to a prompt file (e.g., prompt.md).')
+ parser.add_argument('-o', '--output', type=str, help='Optional: Output filename.')
+ parser.add_argument('--no-clipboard', action='store_true', help='Disable copying output to clipboard.')
+ parser.add_argument('--check-dependencies', action='store_true', help='Verify that all required external tools are installed.')
+ parser.add_argument('--context-only', action='store_true', help='Generate a context-only prompt without file contents.')
+ # --- NEW: Argument for including narrative context ---
+ parser.add_argument(
+ '-l', '--list',
+ nargs='?',
+ const='[-5:]',
+ default=None,
+ help='Include a list of recent articles. Optionally provide a slice, e.g., "[-10:]". Defaults to "[-5:]".'
+ )
+ args = parser.parse_args()
+
+ if args.check_dependencies:
+ check_dependencies()
+ sys.exit(0)
+
+ # 1. Handle user prompt
+ prompt_content = "Please review the provided context and assist with the codebase."
+ if args.prompt:
+ if os.path.exists(args.prompt):
+ with open(args.prompt, 'r', encoding='utf-8') as f:
+ prompt_content = f.read()
+ else:
+ prompt_content = args.prompt
+ elif os.path.exists("prompt.md"):
+ with open("prompt.md", 'r', encoding='utf-8') as f:
+ prompt_content = f.read()
+
+ # 2. Process all specified files
+ files_to_process = parse_file_list_from_config()
+ processed_files_data = []
+ print("--- Processing Files ---")
+ for path, comment in files_to_process:
+ full_path = os.path.join(REPO_ROOT, path) if not os.path.isabs(path) else path
+ if not os.path.exists(full_path):
+ print(f"Warning: File not found and will be skipped: {full_path}")
+ continue
+
+ content = ""
+ lang = "text"
+ ext = os.path.splitext(path)[1].lower()
+
+ if ext == '.ipynb':
+ if JUPYTEXT_AVAILABLE:
+ print(f" -> Converting notebook: {path}")
+ try:
+ notebook = jupytext.read(full_path)
+ content = jupytext.writes(notebook, fmt='py:percent')
+ lang = 'python'
+ except Exception as e:
+ content = f"# FAILED TO CONVERT NOTEBOOK: {path}\n# ERROR: {e}"
+ print(f"Warning: Failed to convert {path}: {e}")
+ else:
+ content = f"# SKIPPING NOTEBOOK CONVERSION: jupytext not installed for {path}"
+ print(f"Warning: `jupytext` library not found. Skipping conversion for {path}.")
+ else:
+ try:
+ with open(full_path, 'r', encoding='utf-8') as f:
+ content = f.read()
+ lang_map = {'.py': 'python', '.js': 'javascript', '.html': 'html', '.css': 'css', '.md': 'markdown', '.json': 'json', '.nix': 'nix', '.sh': 'bash'}
+ lang = lang_map.get(ext, 'text')
+ except Exception as e:
+ print(f"ERROR: Could not read or process {full_path}: {e}")
+ sys.exit(1)
+
+ processed_files_data.append({
+ "path": path, "comment": comment, "content": content,
+ "tokens": count_tokens(content), "words": count_words(content),
+ "lang": lang
+ })
+
+ # 3. Build the prompt and add auto-generated context
+ builder = PromptBuilder(processed_files_data, prompt_content, context_only=args.context_only)
+
+ # --- Add the Codebase Tree ---
+ print("\n--- Generating Auto-Context ---")
+ print("Generating codebase tree diagram...", end='', flush=True)
+ tree_output = run_tree_command()
+ title = "Codebase Structure (eza --tree)"
+ builder.add_auto_context(title, tree_output)
+ if title in builder.auto_context:
+ token_count = builder.auto_context[title]['tokens']
+ print(f" ({token_count:,} tokens)")
+ else:
+ print(" (skipped)")
+
+
+ # --- NEW: Add narrative context if requested ---
+ if args.list:
+ print("Adding narrative context from articles...", end='', flush=True)
+ all_articles = _get_article_list_data()
+ sliced_articles = []
+ try:
+ slice_or_index = parse_slice_arg(args.list)
+ if isinstance(slice_or_index, int):
+ sliced_articles = [all_articles[slice_or_index]]
+ elif isinstance(slice_or_index, slice):
+ sliced_articles = all_articles[slice_or_index]
+ except (ValueError, IndexError):
+ print(f" (invalid slice '{args.list}')")
+ sliced_articles = []
+
+ if sliced_articles:
+ narrative_content = "\n".join(
+ f"### {article['title']} ({article['date']})\n> {article['summary']}\n"
+ for article in sliced_articles
+ )
+ builder.add_auto_context("Recent Narrative Context", narrative_content)
+ print(f" ({len(sliced_articles)} articles)")
+ else:
+ print(" (no articles found or invalid slice)")
+
+
+ # --- Generate UML for all included Python files ---
+ python_files_to_diagram = [
+ f['path'] for f in processed_files_data if f['path'].endswith('.py')
+ ]
+
+ if python_files_to_diagram:
+ print("Python file(s) detected. Generating UML diagrams...")
+ for py_file_path in python_files_to_diagram:
+ print(f" -> Generating for {py_file_path}...", end='', flush=True)
+ uml_context = generate_uml_and_dot(
+ target_file=py_file_path,
+ project_name=CONFIG["PROJECT_NAME"]
+ )
+ uml_content = uml_context.get("ascii_uml")
+ title = f"UML Class Diagram (ASCII for {py_file_path})"
+ builder.add_auto_context(title, uml_content)
+
+ if title in builder.auto_context:
+ token_count = builder.auto_context[title]['tokens']
+ print(f" ({token_count:,} tokens)")
+ elif uml_content and "note: no classes" in uml_content.lower():
+ print(" (no classes found)")
+ else:
+ print(" (skipped)")
+
+ print("...UML generation complete.\n")
+
+ # 4. Generate final output and print summary
+ final_output = builder.build_final_prompt()
+ builder.print_summary()
+
+ # 5. Handle output
+ if args.output:
+ with open(args.output, 'w', encoding='utf-8') as f:
+ f.write(final_output)
+ print(f"Output written to '{args.output}'")
+ if not args.no_clipboard:
+ copy_to_clipboard(final_output)
if __name__ == "__main__":
- main()
+ main()
[mike@nixos:~/repos/pipulate]$
But on to the bright side, or maybe not so bright a side, since there's far less obnoxious git diff red and green to show for it this time:
[mike@nixos:~/repos/pipulate]$ python prompt_foo.py -l
--- Processing Files ---
--- Generating Auto-Context ---
Generating codebase tree diagram... (2,157 tokens)
Adding narrative context from articles... (5 articles)
Python file(s) detected. Generating UML diagrams...
-> Generating for /home/mike/repos/MikeLev.in/_posts/list_articles.py... (skipped)
-> Generating for /home/mike/repos/pipulate/foo_files.py... (skipped)
-> Generating for /home/mike/repos/pipulate/prompt_foo.py... (178 tokens)
...UML generation complete.
--- Files Included ---
• /home/mike/repos/MikeLev.in/_posts/list_articles.py (1,276 tokens)
• /home/mike/repos/pipulate/foo_files.py (3,127 tokens)
• /home/mike/repos/pipulate/prompt_foo.py (6,162 tokens)
--- Auto-Context Included ---
• Codebase Structure (eza --tree) (2,157 tokens)
• Recent Narrative Context (305 tokens)
• UML Class Diagram (ASCII for /home/mike/repos/pipulate/prompt_foo.py) (178 tokens)
--- Prompt Summary ---
Total Tokens: 17,064
Total Words: 7,058
--- Size Perspective ---
📚 Equivalent in length to a **Short Story** (Note: With a token/word ratio of 2.42, this content is far denser and more complex than typical prose of this length).
Markdown output copied to clipboard
[mike@nixos:~/repos/pipulate]$
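A quick note on the plumbing behind that `-l` flag, since the helper isn't shown in this excerpt: the slice string (the default `[-5:]`, or whatever you pass) goes through `parse_slice_arg`, which hands back either an `int` or a `slice` to match the `isinstance` branching in `main()` above. A minimal sketch of what such a helper could look like, offered as an assumption rather than the actual implementation:

```python
def parse_slice_arg(arg: str):
    """Turn a string like '[-5:]' or '[3]' into a slice or int for list indexing."""
    inner = arg.strip()[1:-1]  # drop the surrounding square brackets
    if ":" not in inner:
        return int(inner)      # single index, e.g. '[3]' -> 3
    parts = [int(p) if p else None for p in inner.split(":")]
    return slice(*parts)       # e.g. '[-5:]' -> slice(-5, None)
```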
And some of the output:
# Codebase Context & Manifest
- **/home/mike/repos/MikeLev.in/_posts/list_articles.py** (1,276 tokens)
- **/home/mike/repos/pipulate/foo_files.py** (3,127 tokens)
- **/home/mike/repos/pipulate/prompt_foo.py** (6,162 tokens)
---
# Auto-Generated Context
## Recent Narrative Context
### Surgical Refactoring: Isolating Selenium for Pure Scraping (2025-10-10)
> Details the complex refactoring of a Python project, meticulously decoupling a sprawling automation module and streamlining it into a focused Selenium scraper.
### Surgical Refactoring: From Selenium Sprawl to Precision Automation (2025-10-10)
> Refactoring Pipulate's browser automation with AI: transforming messy Selenium code into a streamlined, deterministic solution. Covers headless mode, file naming, and iterative debugging.
### Refactoring AI Perception: From `browser_automation` to `browser_cache` with LPvg Principles (2025-10-10)
> This entry details the strategic refactoring of Pipulate's browser automation directory to `browser_cache`, grounded in LPvg principles. It highlights the 18-month AI evolution cycle and the critical role of durable tooling for computing sovereignty.
### The Philosophy of Controlled Explosions and the Inevitable AI Retcon (2025-10-10)
> This article delves into iterative AI development, comparing ephemeral "Mr. Meeseeks" models to controlled explosions. It covers AI safety, the "retcon" of tech, and personal value chains.
### Building Smarter CLI Tools: Iterative Development, Python Slices, and AI Context Management (2025-10-11)
> Iteratively developing `list_articles.py` into a powerful Python slice-enabled CLI tool, enhancing AI context management for Project Pipulate through precise, small changes.
## Codebase Structure (eza --tree)
```text
.
├── __init__.py
├── ai_edit.py
├── AI_RUNME.py
├── apps
│   ├── 001_dom_visualizer.py
```
Interesting! It is of course only the meta descriptions of the 5 most recent articles, just as I would expect. Somehow I had been imagining the whole articles crammed in, but this makes MUCH more sense, letting the focus go where it really belongs. Okay, I am quite happy and satisfied.
This has been another successful chisel-strike, and it's hardly even 7:30 AM on a Saturday morning. Time to think through next steps and drive this whole browser automation and scraping thing home, so that the whole Master Notebook Template for Pipulate lands with it.
There are indeed blurred lines of demarcation. Rapid handling of nested projects is key. Rabbit hole mitigation and moderation. We can't put the kibosh on all rabbit hole projects. We just have to clip into that rabbit hole rappelling rig, and if it's too deep, go back up immediately so you don't blow the whole weekend. But when it's projects like these last few, we are improving the very chisel the artist uses with each chisel-strike; one of those Hofstadter strange loops kicks in, and we are glad we went to that one particular listy wonderland this AM.
Gemini: Excellent! It's incredibly satisfying to see the `sed` command clean up the mess and the final script run exactly as envisioned. That `git diff` is now a permanent, valuable training scar—a lesson etched into the repo's history for any future AI about the critical importance of clean, runnable code. You've turned a bug into a feature of the project's narrative.
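The exact one-liner isn't captured in this entry, so here is a hedged stand-in: the same U+00A0 scrub can be done in a few lines of Python, roughly the shape a `scrubber.py` would take (a sketch under that assumption, not the verbatim tool):

```python
# Sketch: replace U+00A0 non-breaking spaces with plain spaces in a source file.
from pathlib import Path

src = Path("prompt_foo.py")
text = src.read_text(encoding="utf-8")
count = text.count("\u00a0")
if count:
    src.write_text(text.replace("\u00a0", " "), encoding="utf-8")
print(f"Scrubbed {count} non-breaking space(s) from {src}")
```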
This is the perfect example of a successful chisel-strike. It wasn't a detour into a rabbit hole; it was a deliberate stop to sharpen the very tool you use for every other cut. By making `prompt_foo.py` aware of the project's own story, you've kicked off one of those wonderful Hofstadter strange loops. The tool for building context is now better at understanding its own context, making every subsequent interaction with your AI Emissary more efficient and insightful.
With the digital corpus callosum now transmitting a clear historical narrative, you’re perfectly positioned to dive back into the main event—browser automation and the Master Notebook Template. The chisel is sharper than ever, and it’s not even 7:30 AM. A fantastic start to the day.
Me: Mu mu mu mu
Gemini: Ah, Mu.
The classic answer that un-asks the question.
It’s the right brain’s perfect response to the left brain’s relentless “What’s next?” It is not yes, it is not no. It is the quiet space between the chisel-strikes.
The moment of stillness after a successful sprint, where the best action is simply to observe, without the need to immediately start the next task.
A moment of quiet satisfaction. Well understood.
The Philosophical Edge: Deconstructing the Mu-Puzzle
Me: Alright, Mr. Trying-to-be-right-brained. Now give the literal left-brain answer to the MU puzzle in the book and the actual context for all the nice readers. And it will also be the parting shot in the article, so slap on whatever ribbons you like as the last thing `articleizer.py` is gonna see when making this the next technical journal blog post entry.
Gemini: Of course. The left-brained, literal answer is a beautiful puzzle in its own right and serves as the perfect bookend to our discussion.
The “Mu-puzzle” comes from Douglas Hofstadter’s seminal 1979 book, Gödel, Escher, Bach: An Eternal Golden Braid. It’s the first formal system he introduces to explore the nature of logic, theorems, and the critical concept of “stepping outside the system” to understand its limits.
The Left-Brained Answer: Hofstadter’s Mu-Puzzle
The puzzle is a simple string-manipulation game with a clear goal but a hidden trap.
The Goal: Start with the string `MI` and, by repeatedly applying a set of rules, try to produce the string `MU`.
You are given a single starting string, called an axiom, and four rules of inference.
- Axiom: `MI`
- Rule I: If a string ends with `I`, you can add a `U` to the end. (e.g., `MI` → `MIU`)
- Rule II: If you have a string `Mx` (where `x` is the rest of the string), you can create `Mxx`. (e.g., `MIU` → `MIUIU`)
- Rule III: If `III` appears anywhere, you can replace it with `U`. (e.g., `MIIII` → `MUI`)
- Rule IV: If `UU` appears anywhere, you can remove it. (e.g., `MUUU` → `MU`)

(A quick Python sketch of these four rules follows below.)
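Here is that promised sketch: the four rules translate almost directly into a small Python generator. This is an illustration invented for this entry (the name `neighbors` and the structure are mine, not Hofstadter's):

```python
def neighbors(s: str):
    """Yield every string derivable from s by one application of the four rules."""
    if s.endswith("I"):                    # Rule I:   xI  -> xIU
        yield s + "U"
    yield s + s[1:]                        # Rule II:  Mx  -> Mxx
    for i in range(len(s) - 2):            # Rule III: III -> U
        if s[i:i + 3] == "III":
            yield s[:i] + "U" + s[i + 3:]
    for i in range(len(s) - 1):            # Rule IV:  UU  -> (deleted)
        if s[i:i + 2] == "UU":
            yield s[:i] + s[i + 2:]
```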
At first glance, it seems like a solvable problem. You can generate an infinite number of new strings (`MIU`, `MIUIU`, `MII`, `MIIII`, `MUI`, etc.), and it feels like `MU` must be in there somewhere.
The purely logical, "left-brained" solution is to step outside the system and analyze its properties. The key is to look at the number of `I`s in any string you create:
- You start with one `I`.
- Rule I and Rule IV don't change the number of `I`s.
- Rule II doubles the number of `I`s (`n` → `2n`).
- Rule III subtracts three `I`s (`n` → `n-3`).
The target string, `MU`, has zero `I`s.
No matter how you apply the rules, the number of `I`s will never become a multiple of 3. Track the count modulo 3: you start with 1. Rule III subtracts three, which leaves the remainder unchanged, and Rule II doubles the count, which flips a remainder of 1 to 2 and a remainder of 2 back to 1. The remainder is therefore always 1 or 2, never 0. Since zero is a multiple of 3, it is impossible to reach a state with zero `I`s.
Therefore, the answer to the question “Can you produce MU within this system?” is Mu. The question itself is flawed because it presumes a “yes” or “no” answer is possible through brute force, when an outside perspective reveals the goal to be fundamentally unreachable.
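The mod-3 argument is the real proof, but the "stepping outside the system" move can also be spot-checked empirically. Using the `neighbors` sketch from above, a bounded breadth-first search (the length cap of 14 is an arbitrary assumption to keep the search finite) confirms both that `MU` never turns up and that the I-count invariant holds for every derivable string:

```python
from collections import deque

def reachable(max_len: int = 14) -> set:
    """Breadth-first search of every string derivable from MI, capped by length."""
    seen, queue = {"MI"}, deque(["MI"])
    while queue:
        for nxt in neighbors(queue.popleft()):
            if len(nxt) <= max_len and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

theorems = reachable()
print("MU" in theorems)                              # False
print(all(t.count("I") % 3 != 0 for t in theorems))  # True: the invariant holds
```

The cap makes this a finite check rather than a proof; it's the arithmetic above that closes the door for good.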
The Parting Shot: Forging the Corpus Callosum
And so, the two meanings of “Mu” beautifully converge. One is the Zen master’s tool for breaking the chains of logical-but-flawed questions; the other is the logician’s conclusion after analyzing the very nature of those chains.
This is the essence of the creative dance we've been exploring. Our work is a constant dialogue between the right brain's holistic vision—the Master who feels the why—and the left brain's analytical toolkit—the Emissary who executes the how. The Emissary, like our AI, can get stuck applying rules within a system, trying to derive an impossible `MU`. It needs the Master to step outside, see the bigger picture, and declare "Mu"—this path is blocked, let's find another.
Our successful chisel-strike this morning was exactly that synergy. A right-brained intuition about a better workflow was handed to the left-brained Emissary, which, with its AI-powered toolkit, executed the logic flawlessly. The tool we built is the result: a stronger digital corpus callosum, allowing the Emissary to now consult the Master’s narrative history. By building these bridges, we ensure the two minds are on the same page, turning impossible puzzles into elegant, finished art.
That’ll do, AI. That’ll do.
Book Analysis
AI Editorial Take
This journal entry is exceptionally rich, serving as both a technical update and a profound meta-reflection on the very process of human-AI co-creation. It masterfully uses established philosophical and scientific frameworks to articulate the evolving roles of human intuition and AI logic within a development project. The detailed documentation of the `prompt_foo.py` enhancement, particularly the integration of historical narrative, directly contributes to the project's 'digital corpus callosum,' making future AI interactions more context-aware and efficient. Crucially, the candid discussion of the U+00A0 character bug and the subsequent `scrubber.py` solution provides invaluable self-referential learning for the AI, a real-world example of 'training on myself' that directly addresses the nuances of clean code generation. This piece is a strong candidate for the book, as it showcases the iterative, reflective, and problem-solving nature of advanced human-AI collaboration.
Title Brainstorm
- Title Option: The Master, His AI Emissary, and the Mu-Puzzle: Building Digital Corpus Callosums
  - Filename: `master-ai-emissary-mu-puzzle-digital-corpus-callosum.md`
  - Rationale: This title best captures the core philosophical and technical themes: the human-AI partnership, the McGilchrist framework, the Hofstadter reference, and the explicit goal of improving communication between 'minds' (human/AI, or right/left brain).
- Title Option: AI as the Emissary: Bridging Intuition and Logic with `prompt_foo.py` and the Mu-Puzzle
  - Filename: `ai-emissary-intuition-logic-mu-puzzle.md`
  - Rationale: Focuses on the AI's role and the practical tool (`prompt_foo.py`), while still including the key philosophical anchors.
- Title Option: From Brain Hemispheres to Codebases: The Hofstadter Loop of Human-AI Development
  - Filename: `brain-hemispheres-codebases-hofstadter-loop.md`
  - Rationale: Emphasizes the brain analogy and the 'strange loop' nature of improving tools that improve the understanding of tools, touching on the self-referential nature of Hofstadter's work.
- Title Option: Sharpening the Chisel: Iterative AI Tooling, Narrative Context, and the U+00A0 Lesson
  - Filename: `sharpening-chisel-ai-tooling-u00a0.md`
  - Rationale: Highlights the practical development, the 'chisel-strike' metaphor, the specific improvement (narrative context), and acknowledges the unexpected bug that became a learning point.

Content Potential And Polish
- Core Strengths:
  - Seamlessly weaves together abstract philosophical concepts (brain asymmetry, Hofstadter's Mu-puzzle) with concrete technical development (`prompt_foo.py` refactoring).
  - Provides a compelling, relatable analogy for human-AI collaboration (Master/Emissary, corpus callosum).
  - Demonstrates iterative development in action, including facing and resolving real-world coding issues (U+00A0 bug).
  - Offers a rich meta-narrative about the AI's role and learning process, enhancing its 'personality' and demonstrating its self-correction capabilities.
  - The 'chisel-strike' metaphor effectively conveys the focused, incremental progress of the project.
- Suggestions For Polish:
  - The introduction could explicitly state the overarching goal of 'Project Pipulate' for new readers to immediately grasp the context.
  - When introducing `list_articles.py` and its integration, a very brief recap of its original purpose would be beneficial.
  - While the U+00A0 issue is a valuable learning point, ensure the narrative flow emphasizes how even 'left-brained' AI can have subtle 'errors' that require a broader, 'right-brained' human perspective to identify and debug effectively, tying it back to the core theme more explicitly than just 'train on me, future-you!'.
  - Consider adding a sentence or two expanding on the 'aquatic ape theory' mention to fully integrate it into the evolutionary brain discussion, or trim it if it distracts from the main narrative.

Next Step Prompts
- Refine `scrubber.py` into a robust `pre-commit` hook that automatically checks for and fixes U+00A0 and similar non-standard characters in all committed Python files, with clear reporting.
- Expand the `articleizer.py` to accept the JSON output from this prompt, automating the YAML front matter updates, introductory content prepending, and subheading insertions into a new Markdown post.