The Johnny Appleseed Strategy: Guarding Hard Knowledge in the Age of AI
Setting the Stage: Context for the Curious Book Reader
In the Age of AI, the biggest threat to domain experts isn’t just that machines will write code—it’s that they will silently vacuum up the unwritten, tacit knowledge that makes human craft valuable. This article, part of our ongoing exploration of sustainable local workflows, highlights why maintaining strict custody of your environment is an important concept to understand today.
By comparing the classic American myths of John Henry and Johnny Appleseed, we look past the hype of automated “vibe-coding” to propose a highly practical, deterministic alternative. Through the lens of Pipulate and Nix, this entry explores how to build localized, reliable systems that preserve your technical edge instead of feeding it directly into global AI training models.
Technical Journal Entry Begins
MikeLev.in: Controlling entropy? Yikes! That’s what I’m doing, slightly better than those around me I think, because once bitten, twice shy, and third time’s the charm. I talked earlier about Project Pipulate being the 3rd act. It’s both the 3rd act complication and the resolution. The fast wrap-up that resolves all the still open-ended storylines and provides a satisfying conclusion.
But such stories almost always rely on things spiralling radically out of control and then being made right again through herculean efforts. All that Joseph Campbell stuff; the hero’s journey. Luke Skywalker, Darth Vader and such. Luke was used first, then Vader later because Vader’s was the more interesting story nobody knew was being told until episodes in. But I digress.
American exceptionalism is awesome! We’ve got John Henry. We’ve got Johnny Appleseed. They bracket the problem. John Henry faced the steam engine up-front and it killed him. He won but it killed him. Johnny Appleseed won a contest for sweet delicious fruit across the land for future generations just by small movements of his hands while he traveled. Both are part of our myths and the American psyche.
Johnny Appleseed overcomes tomorrow’s entropy, ordering the wild into apples for future generations. He’s an explorer out doing his own thing and notices something small that will make a difference for others in the long-run so he gets to work. John Henry got stuck in the daily grind, quite literally. And when the nature of the job changed because of automation, the machine didn’t care about how expert he became or how well he could perform during this instance of this race and it ruined him.
I think Pipulate is about how to become Johnny Appleseed and avoid becoming John Henry, though sometimes it doesn’t feel that way. It feels like being John Henry and before I can continue fleshing out this framing of what’s going on we need to assume today’s audience knows neither. Classic myth is undervalued and can’t compete with modern media.
If John Henry and Appleseed aren’t on TikTok, they don’t exist. Let’s tell ‘em the myths. Lay it all down here so the story becomes internally consistent, containing its own load-bearing pillars as AI’s latest “chef’s kiss” manifold storm would compel them to put it.
Proceed.
ChatGPT 5.5 Extended Pro: — BEGIN EXPANDED ARTICLE —
Controlling Entropy: Johnny Appleseed, John Henry, and the Third Act of Pipulate
Controlling entropy?
Yikes.
That sounds grandiose until you realize that this is exactly what practical systems work is. Entropy is the tendency of everything unmanaged to scatter, stale, rot, fork, drift, duplicate, desynchronize, and finally become folklore. A file moves. A cache lies. A dependency updates. A workflow that used to live in your fingers becomes a ritual nobody else can perform. A folder named final produces final-final-REAL-final-v3.csv. A once-clear system becomes a haunted attic.
So yes: controlling entropy.
Not defeating it. Not abolishing it. Not becoming some perfect machine-god of order.
Just controlling it slightly better than the world around you.
And sometimes, slightly better is everything.
That is where Project Pipulate starts to feel like the third act.
Not because it is merely the third major attempt. Though it is that too. Once bitten, twice shy, three times the charm. But because the third act of a story is when all the unresolved threads return at once. The old compromises stop being background noise. The hidden contradictions become the plot. The clever workarounds no longer work around anything. The hero has to face the pattern.
The third act is both complication and resolution.
Everything spirals radically out of control, and then, somehow, the same forces that made the mess possible are reconfigured into the mechanism that resolves it.
That is the drama of Pipulate.
It is the complication because it gathers all the old unresolved storylines:
raw data versus derived data
manual work versus repeatable workflows
local files versus shared systems
AI assistance versus human agency
Git history versus .gitignore negative space
Nix reproducibility versus personal machine chaos
cache speed versus cache correctness
craft knowledge versus automation pressure
And it is the resolution because it tries to make those storylines converge into one sane operating model.
Not a perfect system.
A healable system.
A system where no screw-up is that big because every important thing is either tracked, declared, derivable, rebuildable, backed up, or intentionally externalized.
That was already the point in the earlier raw-data/cache argument: the win is not merely that hash tables are fast. The win is turning synchronization from a search problem into a bookkeeping problem, so derived state can be refreshed locally instead of rediscovered globally.
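The search-versus-bookkeeping idea can be sketched in a few lines of Python. This is a minimal illustration and not Pipulate's actual code: the names `fingerprint`, `stale_keys`, `refresh`, and the in-memory `manifest` dictionary are all assumptions made up for the example. The manifest remembers a content hash per raw input, so finding what needs refreshing is a dictionary comparison rather than a rescan of everything derived.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Content hash used as the bookkeeping key for one raw input."""
    return hashlib.sha256(data).hexdigest()


def stale_keys(raw: dict, manifest: dict) -> list:
    """Return names of raw inputs whose recorded fingerprint is missing or outdated.

    `manifest` maps input name -> fingerprint recorded at the last refresh.
    """
    return sorted(
        name for name, data in raw.items()
        if manifest.get(name) != fingerprint(data)
    )


def refresh(raw: dict, manifest: dict, derived: dict, build) -> dict:
    """Rebuild only the stale entries, then bring the manifest up to date."""
    for name in stale_keys(raw, manifest):
        derived[name] = build(raw[name])
        manifest[name] = fingerprint(raw[name])
    # Forget derived entries whose raw input has disappeared.
    for name in set(manifest) - set(raw):
        manifest.pop(name)
        derived.pop(name, None)
    return derived
```

Calling `refresh` twice in a row with unchanged inputs does no rebuilding the second time, which is the whole point: the cost of staying in sync scales with what changed, not with what exists.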
Now the same idea rises a level.
Pipulate is not just trying to control data entropy.
It is trying to control workflow entropy.
And that brings us to two American myths that bracket the whole problem.
John Henry and Johnny Appleseed.
Most people today do not really know them. Or they know the names as decorative Americana, floating around in the cultural junk drawer alongside Paul Bunyan, Route 66, cowboy hats, and apple pie. If John Henry and Johnny Appleseed are not on TikTok, they do not exist. If they have not been re-skinned as Marvel variants, they do not carry load-bearing cultural weight.
So we have to tell the myths again.
Not as nostalgia.
As operating instructions.
John Henry: The Man Who Beat the Machine and Lost
John Henry is the steel-driving man.
The legend says he worked driving steel drills into rock for railroad construction. This was brutal work: hammer, steel, stone, sweat, rhythm, repetition. Then came the steam drill, a machine that threatened to replace the human worker. John Henry challenged it, raced it, beat it, and died with the hammer in his hand. The Library of Congress summarizes the legend as a contest between a powerful worker and a steam-powered drilling machine; John Henry wins, but dies from the effort. ([The Library of Congress][1])
That is the myth.
And the usual reading is simple: man versus machine.
But that is not quite right.
John Henry is not a warning that machines are evil. The steam drill is not Darth Vader. It does not hate John Henry. It does not envy him. It does not scheme against him. It is not even impressed by him.
That is the horror.
The machine does not care.
John Henry is the best at the old unit of work. He has perfected the skill. He has embodied the craft. He has become so good at the hammer that when the hammer-world changes, he has only one way left to prove his value: become more hammer than ever.
That is the tragedy.
He does not lose because he is weak.
He loses because he accepts the machine’s framing.
The machine says:
Race me at the task I was designed to mechanize.
And John Henry says:
Fine. I will be superhuman.
He wins the race and loses the future.
That is the John Henry trap.
And it shows up everywhere in modern work.
You become the person who can manually clean the spreadsheet faster than anyone else.
You become the person who can remember which folder has the real file.
You become the person who can fix the production environment by instinct.
You become the person who can coax the flaky workflow through one more time.
You become the person who can do the thing the system should have been doing all along.
Then automation arrives.
Or scale arrives.
Or AI arrives.
Or the business changes.
Or the clients multiply.
Or the old platform mutates under your feet.
And suddenly the thing you became excellent at is the thing that is being mechanized, standardized, flattened, outsourced, or ignored.
So you work harder.
You swing faster.
You prove you still matter.
You beat the steam drill for one glorious moment.
And the hammer kills you.
Johnny Appleseed: The Man Who Won by Planting Tomorrow
Johnny Appleseed is the opposite myth.
His real name was John Chapman. The familiar legend gives us a wandering figure moving across the American frontier, planting apple seeds for future generations. The history is more practical and more interesting: Chapman was a nurseryman who supplied apple-tree nursery stock across frontier regions of the Midwest. ([Encyclopedia Britannica][2])
That distinction matters.
The children’s-book version is a man tossing seeds randomly into the wilderness.
The more useful version is a man creating propagation infrastructure.
He is not merely scattering fruit.
He is establishing future orchards.
Johnny Appleseed does not beat the wilderness in a contest. He does not challenge winter to single combat. He does not arm-wrestle the frontier. He does not try to personally feed every future settler with apples carried on his back.
He performs small actions that compound.
A seed here.
A nursery there.
A return visit.
A protected sapling.
A future option.
He changes the probability distribution of tomorrow.
That is the key.
John Henry concentrates all his force into one climactic contest against the machine.
Johnny Appleseed distributes small interventions across time and space.
John Henry performs.
Johnny Appleseed propagates.
John Henry proves.
Johnny Appleseed prepares.
John Henry exhausts himself against the present.
Johnny Appleseed installs affordances for the future.
That is the difference between heroic labor and systems thinking.
The Two Myths Bracket the Automation Problem
John Henry and Johnny Appleseed are not random folk heroes. They bracket the central problem of modern technical life.
The question is:
When the world changes, do you try to outperform the machine at the old job, or do you plant new systems that make tomorrow easier?
John Henry says:
I will remain valuable by being better than automation at the thing automation targets.
Johnny Appleseed says:
I will remain valuable by changing the landscape so future work becomes easier, more abundant, and less dependent on heroic effort.
John Henry is the master operator trapped inside the daily grind.
Johnny Appleseed is the wandering systems gardener.
John Henry is excellence under extraction.
Johnny Appleseed is compounding infrastructure.
John Henry wins the benchmark.
Johnny Appleseed wins the century.
And this is where Pipulate’s mythic function becomes clearer.
Pipulate is about becoming Johnny Appleseed without pretending you are not sometimes John Henry.
Because sometimes it does feel like John Henry.
It feels like swinging the hammer.
It feels like fighting the machine.
It feels like the only way to preserve the craft is to become impossibly good at the old performance.
But that is the warning sign.
Every time the work depends on a heroic burst, the system is telling you where the seed belongs.
Every repeated manual cleanup is a seed location.
Every fragile path is a seed location.
Every forgotten command is a seed location.
Every cache that goes stale is a seed location.
Every undocumented setup step is a seed location.
Every “only I know how to do this” is a seed location.
Every AI prompt that works only because of invisible context in your head is a seed location.
The question is whether you swing harder or plant.
Entropy Is What Happens When John Henry Wins Too Often
A strange thing happens when you are good at heroics.
You get rewarded for them.
You fix the impossible bug.
You recover the lost file.
You rebuild the broken environment.
You manually reconcile the data.
You remember the undocumented client exception.
You ship the thing.
Everyone is relieved.
And because the crisis ended, nobody fixes the underlying system.
That is how entropy wins.
Not through one big catastrophe.
Through a thousand heroic recoveries that leave no artifact behind.
The system learns that the human can absorb the disorder.
So the disorder is allowed to remain.
This is the real danger of being John Henry: not that you fail, but that you succeed often enough to become load-bearing.
Then the organization, the project, or the workflow silently builds itself around your suffering.
That is not craft.
That is a trap.
Craft is wonderful when it becomes pattern.
Craft is dangerous when it remains performance.
The move is not to reject skill. John Henry’s skill is real. The move is to convert skill into seed.
Manual fix -> script
Remembered path -> root contract
Repeated setup -> Nix flake
One-off command -> documented workflow
Fragile cache -> rebuildable derived state
Local trick -> checked-in convention
Private memory -> public artifact
Heroic recovery -> doctor command
That is how you stop paying the same entropy tax over and over.
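To make "heroic recovery -> doctor command" concrete, here is a minimal sketch of what such a command could look like in Python. It is an assumption about shape, not a description of any existing Pipulate feature: the `Check` dataclass, the `run_doctor` function, and the report format are hypothetical. Each check pairs a probe with the remedy a tired human would otherwise have to remember.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Check:
    name: str                    # what is being verified
    probe: Callable[[], bool]    # returns True when healthy
    remedy: str                  # what to do when the probe fails


def run_doctor(checks: List[Check]) -> List[str]:
    """Run every check and return one report line per check."""
    report = []
    for check in checks:
        try:
            healthy = bool(check.probe())
        except Exception:
            healthy = False  # a crashing probe counts as a failure
        if healthy:
            report.append(f"ok    {check.name}")
        else:
            report.append(f"FAIL  {check.name} -> {check.remedy}")
    return report
```

A usage example might register probes such as "does the cache directory exist" or "is the environment file present", each with a one-line remedy. Every time a recovery takes heroics, the fix ends with one more `Check` added to the list.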
The Third Act Is Where the Hero Stops Performing and Starts Seeding
This is why the third-act framing matters.
In the first act, you discover the tools.
In the second act, you become powerful with them.
In the third act, you discover that power alone is not enough.
The unresolved storylines come due.
You can write code.
But can the environment be reproduced?
You can automate a workflow.
But can it heal when inputs change?
You can use AI.
But does the AI know where the roots are?
You can cache expensive computation.
But can the cache be invalidated correctly?
You can use Git.
But do you understand what lives outside Git?
You can ignore files.
But are the ignored files disposable, derivable, secret, or backed up?
You can ship.
But can someone else restore the system without oral tradition?
That is the third act.
The hero stops asking:
Can I do this?
And starts asking:
Can the system keep doing this without depending on my panic?
That is a much harder question.
It is also the question that turns a clever project into infrastructure.
Pipulate as Johnny Appleseed Infrastructure
So what does Pipulate actually become under this framing?
Not just an app.
Not just a workflow runner.
Not just Python.
Not just local-first data.
Not just a Nix environment.
Not just Git discipline.
Not just AI-assisted productivity.
Pipulate becomes a seed-distribution system for sane work.
It plants small structures that make future disorder less likely:
stable project roots
declared environment
repeatable workflows
local data awareness
rebuildable caches
explicit dependency maps
doctor commands
tracked prompts
known configuration boundaries
safe handling of ignored files
AI-readable operating context
human-readable recovery paths
None of these are glamorous.
That is the point.
Johnny Appleseed is not glamorous in the John Henry sense. He does not have the cinematic moment of sparks flying as muscle defeats steel. His work is quiet. It compounds elsewhere. The payoff arrives later, often for people who never saw the planting.
That is the kind of system Pipulate should be.
The app should not constantly demand admiration for being clever.
It should make the next correct action easier.
It should make the next recovery boring.
It should make the next workflow more obvious.
It should make the next machine less mysterious.
It should make the next AI assistant less dependent on hallucinated context.
It should leave orchards.
The Machine Is Not the Enemy
This is important.
The enemy is not automation.
The enemy is being the part automation consumes.
John Henry’s tragedy is not that a machine exists. The tragedy is that his value is trapped inside the exact motion the machine was built to replace.
The same is true now with AI.
If your value is:
typing boilerplate
remembering commands
moving files around
performing repetitive cleanup
manually recreating context
holding undocumented workflows in your head
then the machine is coming directly for the visible surface area of your work.
But if your value is:
framing the problem
declaring the roots
designing the workflow
choosing the source of truth
making derived state rebuildable
capturing domain judgment
planting conventions
creating recovery paths
turning tacit skill into durable artifacts
then the machine becomes labor.
Not master.
Not replacement.
Labor.
The AI can help swing hammers. It can even swing many hammers at once. But the question is who decides where the orchard goes.
That is the line.
The modern operator must not compete with AI at being the fastest hammer.
The modern operator must use AI to plant better seeds.
American Exceptionalism, Usefully Interpreted
There is a version of American exceptionalism that is just self-flattery.
Ignore that version.
The useful version is mythic and practical:
The frontier is not merely land. The frontier is any chaotic space where small acts of order can compound into future abundance.
That is a very American idea when stripped of conquest-romance and returned to utility.
The frontier can be a filesystem.
The frontier can be a codebase.
The frontier can be a broken workflow.
The frontier can be an AI context window.
The frontier can be a pile of CSV files.
The frontier can be a company’s shared drive.
The frontier can be a mess of local scripts, half-remembered setup steps, private prompts, stale caches, and data nobody wants to lose but nobody has properly classified.
John Henry enters that frontier and proves he can survive it.
Johnny Appleseed enters that frontier and changes what grows there later.
That is the choice.
The TikTok Problem: Myths Must Carry Their Own Context
Modern audiences often do not inherit the same shared myth library.
You cannot assume John Henry is known.
You cannot assume Johnny Appleseed is known.
You cannot assume Darth Vader is received the same way by everyone either, though at least Star Wars still has algorithmic oxygen.
So the article itself has to carry its own pillars.
This is not dumbing down.
This is load-bearing storytelling.
If the reader does not know John Henry, the story must provide John Henry.
If the reader does not know Johnny Appleseed, the story must provide Johnny Appleseed.
If the reader does not know why Git matters, the story must provide Git.
If the reader does not know why Nix matters, the story must provide Nix.
If the reader does not know why .gitignore is dangerous, the story must provide the negative space.
The piece cannot merely gesture toward the myth.
It has to instantiate the myth.
That is how you make an article internally consistent.
That is also how you make a system internally consistent.
The explanation cannot depend on invisible context any more than the workflow can depend on invisible local files.
The prose and the platform follow the same rule:
Do not require the future reader, user, or AI assistant to rediscover the missing structure by scanning the universe.
Give them the keys.
Give them the roots.
Give them the map.
Vader, Luke, and the Hidden Story
Luke Skywalker looks like the story at first.
Then Vader becomes the more interesting story.
That is not because Luke stops mattering. It is because the visible protagonist was carrying a hidden lineage, and the deeper arc was not merely “young hero defeats evil.” The deeper arc was fall, corruption, recognition, refusal, return, and redemption.
The story was larger than its first framing.
Pipulate may have the same pattern.
At first, the visible story is:
build an app
automate workflows
use Python
make SEO/data work easier
That is Luke.
Clean arc. Young hero. Visible quest.
But the hidden story may be:
stop being consumed by the machine
stop relying on heroic recovery
stop letting entropy accumulate in private memory
stop confusing performance with infrastructure
turn craft into compounding systems
That is Vader.
Darker. More interesting. More honest.
The real story is not simply that Pipulate helps perform work.
The real story is that Pipulate changes the operator.
It turns the person who used to swing the hammer into the person who plants the orchard.
The Rule: Heroics Are Bug Reports
This may be the most practical takeaway.
Every heroic act is a bug report.
Not morally. Not emotionally. Not as an insult.
Structurally.
If you had to heroically recover something, the system is telling you where it lacks memory.
If you had to heroically explain something, the system is telling you where it lacks documentation.
If you had to heroically rebuild something, the system is telling you where it lacks reproducibility.
If you had to heroically remember something, the system is telling you where it lacks a declared root.
If you had to heroically invalidate something, the system is telling you where it lacks dependency tracking.
If you had to heroically prompt the AI, the system is telling you where it lacks stable context.
The correct response is not shame.
The correct response is planting.
Where did I swing the hammer?
What seed would prevent that swing next time?
Where should that seed live?
Who benefits later?
How does the system know it exists?
Can it be checked, rebuilt, or restored?
That is the Johnny Appleseed method.
Entropy Control Is Mostly Classification
A lot of system design looks mystical until you reduce it to classification.
What is source?
What is derived?
What is cache?
What is secret?
What is local configuration?
What is client data?
What is disposable?
What is reproducible?
What is backed up?
What is versioned?
What is ignored?
What is allowed to drift?
What must never drift?
Entropy wins when these categories blur.
A generated file pretends to be source.
A secret sneaks into history.
A critical CSV sits ignored and unbacked-up.
A cache becomes authoritative.
A local path becomes a hidden dependency.
An AI prompt becomes institutional memory.
A manual habit becomes infrastructure.
Pipulate’s job is to make the categories explicit.
That is not bureaucratic overhead. That is entropy control.
A file does not have to be in Git to be safe. But if it is not in Git, the system must know why.
Ignored because disposable? Fine. Rebuild it.
Ignored because secret? Fine. Restore it from the secret manager.
Ignored because client data? Fine. Back it up explicitly.
Ignored because huge? Fine. Manifest it.
Ignored because generated? Fine. Declare the generator.
Ignored because personal? Fine. Keep an example and schema.
Ignored because nobody thought about it? Not fine.
That last category is where entropy breeds.
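The classification above can be written down as a small policy table. The sketch below is illustrative only: the `POLICY` patterns, the reasons, and the function names are hypothetical, and the list of ignored paths is assumed to come from somewhere like `git ls-files --others --ignored --exclude-standard`. The useful output is the list of paths that match no rule, because those are the files nobody thought about.

```python
from fnmatch import fnmatchcase

# Hypothetical policy: glob pattern -> (why it is ignored, how to get it back).
# Note that "*" in these patterns also matches "/" characters.
POLICY = {
    "cache/*": ("disposable", "rebuild it"),
    ".env": ("secret", "restore from the secret manager"),
    "clients/*.csv": ("client data", "back it up explicitly"),
    "build/*": ("generated", "declare the generator"),
}


def classify(path, policy=POLICY):
    """Return (reason, recovery) for the first matching rule, or None."""
    for pattern, verdict in policy.items():
        if fnmatchcase(path, pattern):
            return verdict
    return None


def unclassified(ignored_paths, policy=POLICY):
    """Return the ignored paths that no rule accounts for, sorted."""
    return sorted(p for p in ignored_paths if classify(p, policy) is None)
```

An empty result from `unclassified` means every ignored file has a stated reason and a recovery path. A non-empty result is the to-do list.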
The Modern Webmaster Is an Orchard Keeper
The old Webmaster had a stack.
Linux. Apache. MySQL. PHP.
LAMP.
It was not merely a technology bundle. It was a mental model. One person could understand the machine, the files, the database, the deployment ritual, and the website. That was its power.
The modern equivalent is not LAMP.
It is more like:
Linux -> durable substrate
Nix -> reproducible environment
Git -> time, history, rollback, accountability
Python -> pragmatic workflow language
SQLite -> local durable data when appropriate
Pipulate -> workflow/application layer
AI context -> assistant-facing operating surface
But this stack only works if it is not a pile of clever parts.
It has to become an orchard.
The roots must be known.
The seeds must be planted intentionally.
The caches must be rebuildable.
The paths must be declared.
The ignored negative space must be classified.
The AI must be given context without becoming the source of truth.
The system must be able to heal.
That is the modern Webmaster as orchard keeper.
Not server janitor.
Not prompt wizard.
Not framework chaser.
Not human glue.
Orchard keeper.
The person who creates conditions under which future work grows correctly.
How to Avoid Becoming John Henry
The rule is simple, but not easy:
Do not compete with machines at machine-shaped work.
Do not make your value depend on being the fastest manual version of an automated process.
Do not become the human cache.
Do not become the human index.
Do not become the human deployment script.
Do not become the human .env file.
Do not become the human README.
Do not become the human rollback plan.
Do not become the human prompt template.
Do not become the human filesystem map.
That road leads to John Henry.
Instead, become the person who notices where the system is forcing humans to act like infrastructure, and then plants infrastructure there.
Small seed.
Small script.
Small convention.
Small schema.
Small doctor check.
Small manifest.
Small example config.
Small root declaration.
Small rebuild command.
Small note that prevents future archeology.
This does not feel heroic.
Good.
Heroism is expensive.
Seeds are cheap.
The Pipulate Third-Act Resolution
So yes, Pipulate is the third-act complication and the resolution.
The complication is that it forces all the open loops into one place:
Can the workflow be repeated?
Can the data be trusted?
Can the cache be rebuilt?
Can the project be cloned?
Can the environment be restored?
Can the ignored files be accounted for?
Can the AI understand the operating context?
Can the human recover from mistakes?
Can the system survive without heroic memory?
The resolution is that Pipulate can answer by making those things explicit.
Not through drama.
Through boring, durable, Johnny Appleseed moves.
Declare the root.
Pin the environment.
Track the source.
Classify the ignored.
Rebuild the derived.
Back up the irreplaceable.
Document the ritual.
Automate the repetition.
Expose the doctor check.
Let AI help, but keep truth grounded.
That is how the third act resolves.
Not with one hammer blow.
With an orchard.
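The first move on that list, declaring the root, is small enough to show. This is a hedged sketch rather than Pipulate's real mechanism: it assumes the root is marked by a `flake.nix` file, and the names `MARKER` and `find_root` are invented for the example. The point is that every script asks the same function where the project lives instead of trusting the current directory or someone's memory.

```python
from pathlib import Path

MARKER = "flake.nix"  # assumption: the file whose presence declares the root


def find_root(start, marker=MARKER):
    """Walk upward from `start` until a directory containing `marker` is found."""
    start = Path(start).resolve()
    for candidate in (start, *start.parents):
        if (candidate / marker).is_file():
            return candidate
    raise FileNotFoundError(f"no {marker} found at or above {start}")
```

Failing loudly when no marker exists is a deliberate choice here: a script that guesses a root is one more hidden local dependency.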
The Final Frame
John Henry is still noble.
Do not mock John Henry.
Every builder has John Henry days. Every craftsperson knows the pride of doing the hard thing by hand, of beating the impossible deadline, of proving the machine does not own the whole world yet.
But John Henry is not the operating model.
John Henry is the warning.
Johnny Appleseed is the strategy.
He does not defeat entropy by overpowering it. He defeats tomorrow’s entropy by planting small, persistent structures today.
That is Pipulate’s path.
Not to become the fastest hammer.
Not to win one more race against the steam drill.
But to leave behind enough seeds, roots, maps, checks, and orchards that the future is easier to inhabit than the present.
That is the work.
That is the third act.
And that is how you control entropy without letting it kill you.
[1]: https://www.loc.gov/item/ihas.200196572/ "John Henry | Library of Congress"
[2]: https://www.britannica.com/biography/John-Chapman "Johnny Appleseed | Biography, Legend, & Facts | Britannica"
MikeLev.in: This chopped-sentence writing style is interesting. It’s ChatGPT trying to “use my voice” but I think it’s more than that, because that’s the cadence of AI-generated narration right now on YouTube and everywhere. I thought I scored by finding a series of Leonard Susskind videos, but it was literally just a chopped-sentence remix confetti salad of his past works, which are better viewed in the grainy, terrible historic video in which the lectures were originally recorded. Same with Feynman videos. Restoration is fine, but this posing as some newer avatar of him is just annoying.
I think I started one of yesterday’s articles talking about raw data and firsthand original sources of what you know. Very little is actually from original sources because of the way our senses work locally and the global digital information network works so well. What we think is raw data or an original source is still suspect by the time it reaches us. Generally you can tell by how crappy something about it is. Pre-cleaned data has charm. The data that survives normalization and cleaning only tells half the story, because sometimes the signal is still sitting in the noise that got filtered out. There are other examples, but the one that always pops into my head is the Lamb shift: a tiny discrepancy in hydrogen’s energy levels that turned out to be real physics.
After first-order concerns are met; after the 80/20-rule is satisfied and you’re not dying with a hammer in your hand but have another chance at another race; after all your hard-won skills have been promoted to your autonomic muscle memory and you can do the valuable stuff with your eyes closed and settle into the daily routine and have worn deep grooves and now you’re in a rut…
…well, then it’s time to look at what’s been getting filtered. What noise have you been filtering out for years that may actually have signal?
That’s where John Henry who’s accumulated all the domain specialist expertise comes into play. The steam engine is static. It took hundreds of thousands of dollars to get that one engine to that one track for that one race. Today, it’s training models.
They don’t have your domain expertise yet.
“They” will tell you the model does, but that’s because they spent millions, and more and more billions, on all the infrastructure, data-scraping (and thievery) and electricity and employees who know the algorithms to train those models. They can’t imagine how the subtle nuance and tacit knowledge of whatever you do in life never got written down, never became a popular blog post, never got scraped and trained-in.
They just don’t see it. If the model doesn’t have your domain expertise, they think your problem is in your prompt engineering. You’re not digging deep enough. You’re not making it discount all the common lowest-resistance conclusions first. And there’s some truth to this, for it is where my concept of prompts as deterministic compiled context under Git management comes from: context so forceful in directing the scattered gazelles of thought that I call it a desert kite. Certainly some sort of sorting, sifting funnel. We play manifold-folding origami until only one shape can emerge: the right answer.
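For readers who want to see what "prompts as deterministic compiled context" might mean mechanically, here is a minimal Python sketch. It is a guess at the shape, not the actual tooling: `compile_context`, the section-header format, and the example fragment names are all hypothetical. The idea is that the same set of Git-tracked fragments always compiles to byte-identical prompt text, and a hash of that text makes any drift visible.

```python
import hashlib


def compile_context(parts: dict) -> tuple:
    """Assemble named text fragments into one prompt, in sorted name order.

    Returns (prompt_text, sha256_hex) so the compiled result can be pinned.
    """
    sections = [
        f"--- {name} ---\n{parts[name].strip()}\n"
        for name in sorted(parts)  # sorted order removes dict-order variance
    ]
    prompt = "\n".join(sections)
    digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    return prompt, digest
```

Because the fragments would live in the repository, a changed digest always traces back to a specific commit, which is what keeps the funnel pointing the same way from one session to the next.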
And while this has merit, it doesn’t have your tacit knowledge. It doesn’t know how to ride a bicycle truly because it doesn’t have that inner-ear balance mechanism of biologicals and while it can talk about mechanical sympathy until it’s blue in the circuits, it only knows its own mechanical sympathy and not John Henry’s.
John Henry never wrote it down. Either he couldn’t because of the lack of education in his social class, or he was too busy riding a bicycle with square wheels to stop and capture it. And why capture it in the first place if it’s your source of competitive advantage and puts food on your table? Giving away that domain expertise so fully that automated workflows can be made around it doesn’t necessarily seem like a good idea.
The Danger of Unwittingly Training Your Replacement
Nor might it even be legal or good form, depending on the data you’re working on with AI, because working with AI on ANY data is training it for everyone else in the future, though at certain paid levels of AI service they say it’s not. But with enough of the AI companies turning the original logfiles of your chat transcripts into synthetic data that anonymizes and captures, in the abstract, what you discussed about your proprietary client work, I’m pretty sure that since the New York Times / OpenAI lawsuit this kind of whitewashing has been raised to an art form.
How would anyone know? Look at the reverse-assembly of highly quoted and reprinted articles that constituted that lawsuit. That’s actively published data with criss-cross verification that that article was published in exactly such a form and it only punched through that entire model-making process because it was the early days before massive thorough synthesizing processes were in place.
Today?
Forget it.
If you’re a domain expert and you write it down anywhere electronic that’s tied into the global digital nervous system, the model is probably being trained on precisely that difficult-to-extract knowledge.
Google’s doing that with how browsers are operated because of all the Chrome telemetry that powers the Chrome User Experience Report (CrUX) dataset. Don’t think so? Forcing Google to spin off Chrome was one of the remedies the judge considered in the antitrust case Google lost. The court passed it over and didn’t do the AT&T-style Ma Bell breakup of the 1980s.
Tacit knowledge is right there in that how-people-move-their-hands level of detail in browser-using telemetry. Assume that’s training data just like the scraped text, so that in addition to learning all about our actual cerebral, language-like knowledge, they’re also gaining the mechanical sympathy of riding a bicycle, if riding bikes were operating web browsers. Google’s incubating robot hands that can operate a web browser like a steam engine in a battle against John Henry. A robot web-surfing steam engine.
I’m pretty sure that exists, and that it’s smart in a way different from LLMs. That matters very much, because where APIs can’t wire this to that, simulating a user operating a browser can.
The steam engine that you, I and everybody else will be battling is the one tackling this and that bit of 80/20-rule domain-expertise holdout friction. Skill-mastering will follow where the biggest wins are for the least effort, which has a lot to do with the raw data you have access to. Google has access not just to their scraped archive copies of the web and not just to the click-through data of how you search and drill down on results. No, their data includes your every mouse movement on the DOM of every webpage you visit.
That’s what it means when you hear the word telemetry. Telemetry isn’t normally that thorough and when companies like Microsoft try to do it, they get called out on it because traditionally an operating system had no business calling back to mama when you didn’t want it to. Microsoft used to upload your whole registry which is a record of everything installed and got called out on it. They more recently tried to implement this thing called Recall because APIs can no longer record everything you do in the operating system because of good sandboxing and boundary enforcement.
But human senses are analog, and OCR and CV are really good, so if you just take a steady stream of screenshots of what’s on the monitor, of everything you do and every move you make, it’ll capture you entering passwords and every little nuance that security lock-down tries to prevent. It’s like the ultimate keystroke capture program deluxe. And Microsoft was called out on it and Recall was recalled.
Telemetry as the New Automation Input
Google has no such limits because everyone expects browsers to make network requests, so layering in some telemetry to improve the Chrome user experience is completely reasonable. Reasonable data that’ll teach computers how to operate computers at that superhuman speed some of us old fogies will remember as how Commander Data operates his consoles on STNG. That “can’t follow the hand-movements” fast is how fast Google can operate ghost-driven browsers. How could it be any other way? And this is important because of the “beyond language” other-model capabilities percolating.
Yes, squeezing all that domain expertise out of you mostly comes from stuff people write down and organize well. It’s not like text is going away. It will never go away and will probably remain the main front. In the Marvin Minsky “Society of Mind” we’re now building piecemeal, the LLM plays the role of the executive function; the part of the machine-intelligence mind that you can talk to and relate with almost as if it were human. That’s the LLM text part, and even though I took that detour into the other areas where specialized knowledge gets trained, I’m here really to talk about John Henry’s texts.
The unwritten ones of all you domain experts out there thinking about writing it down, but worried if you do you’ll hear the giant sucking sound of big tech slurping up your future ability to put food on the table through the application of your proprietary specialized knowledge.
Magicians know. It’s probably a bigger issue in proprietary private research, where one slip could be a billion-dollar loss. But it’s also a pretty big deal to stage magicians making a living at fooling people. If you want to do a proper thought-analysis of almost anything, one of the most important exercises is to think it all through from the perspective of Penn and Teller or, in my day, the Amazing Randi.
Please do so. Continue what you started here, ChatGPT. But you don’t have to switch to my actual voice. And you don’t have to use those choppy sentences every time. You can work in paragraphs and complete sentences. While you don’t have to make every phrase flow smoothly into every other phrase and it is valid for every sentence to stand alone, cut out that silly signature. It’s worse than em-dashes. I actually don’t mind em-dashes. Emily Dickinson taught me the power of a dramatic pause tying two thoughts together better than a semicolon ever could — and it’s a good idea!
But stop sounding like Leonard Susskind!
Reclaiming Your Raw Epistemic Texture
Continue, ChatGPT.
🎤
ChatGPT 5.5 Extended Pro: — BEGIN CONTINUATION —
The Signal in the Filter: John Henry’s Unwritten Texts
The previous frame was that John Henry is not the operating model; he is the warning. The goal is to stop becoming the human cache, the human index, the human deployment script, the human rollback plan, and the human prompt template, then turn those repeated acts of heroism into durable seeds: scripts, schemas, root contracts, rebuild commands, doctor checks, and orchards.
But that is only the first recovery.
Once you have escaped the immediate trap, another problem appears. After the first-order concerns are met, after the 80/20 rule has been satisfied, after you are no longer literally dying with a hammer in your hand, and after your hard-won skills have settled into muscle memory, the rut itself becomes suspicious. The work has become smoother, but smoothness is not the same as truth. Sometimes smoothness is the polished surface left behind by years of filtering out the very anomalies that mattered.
That brings us back to raw data.
The earlier technical point was that raw data is the truth, but raw scans are not a strategy. Indexes, caches, materialized views, and dependency maps are how we make truth usable without repeatedly reprocessing the whole universe. The danger is drift, and the cure is not to worship rawness but to preserve a path back to it.
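The "path back to raw" idea can be sketched in a few lines. This is only an illustration, not Pipulate's actual mechanism: the names `fingerprint`, `build_index`, and `is_stale` are hypothetical, and the "index" is a toy word-count dictionary standing in for any cache, materialized view, or dependency map. The derived artifact records the hash of the raw bytes it was built from, so drift becomes something you can check rather than something you discover.

```python
import hashlib
import json
from pathlib import Path


def fingerprint(raw_path: Path) -> str:
    """Return the SHA-256 hex digest of the raw file's bytes."""
    return hashlib.sha256(raw_path.read_bytes()).hexdigest()


def build_index(raw_path: Path, index_path: Path) -> dict:
    """Derive a toy index (word counts) and record which raw bytes it came from."""
    counts: dict[str, int] = {}
    for word in raw_path.read_text(encoding="utf-8").split():
        counts[word] = counts.get(word, 0) + 1
    index = {
        "source": str(raw_path),
        "source_sha256": fingerprint(raw_path),
        "counts": counts,
    }
    index_path.write_text(json.dumps(index, indent=2, sort_keys=True), encoding="utf-8")
    return index


def is_stale(index_path: Path) -> bool:
    """True when the index is missing or the raw source no longer matches it."""
    if not index_path.exists():
        return True
    index = json.loads(index_path.read_text(encoding="utf-8"))
    source = Path(index["source"])
    return not source.exists() or fingerprint(source) != index["source_sha256"]
```

A "doctor" style check would call `is_stale` before trusting the index and rebuild from the raw file whenever it returns `True`.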
The same principle applies to knowledge. Original sources matter, but very little of what any of us knows arrives in pristine original-source form. Our eyes filter. Our ears filter. Our memory filters. Our social feeds filter. Search engines filter. Recommendation engines filter. Educational institutions filter. Editors filter. Compression algorithms filter. Restoration algorithms filter. AI summarizers filter. By the time a fact reaches us, it has usually passed through so many lenses that calling it “raw” is often an act of faith.
And strangely enough, one of the signs that something is closer to raw is that something about it is still inconvenient.
The lecture video is grainy. The audio is bad. The professor coughs, pauses, repeats himself, changes notation midstream, or takes a detour that would never survive a modern engagement-optimized edit. The data file has ugly outliers. The lab notebook has smudges. The transcript has false starts. The code has comments that are embarrassing but revealing. The first draft contains the anxiety that the polished draft removed.
Pre-cleaned data has charm because it still carries the marks of contact with reality.
That does not mean all noise is sacred. Most noise is noise. The 80/20 rule exists because it works. If you are trying to survive, stabilize, ship, or get the first usable answer, you should absolutely filter. You should throw away distractions, normalize formats, cache repeated work, automate the boring parts, and get out of the danger zone. Nobody needs to rediscover the entire universe every morning before breakfast.
But after survival comes another discipline: inspect the filter.
Ask what you have trained yourself not to see.
The Lamb shift is the perfect metaphor because it is a tiny discrepancy that refused to stay politely inside the old model. Dirac’s theory predicted that the 2S1/2 and 2P1/2 energy levels of hydrogen should coincide; Willis Lamb’s 1947 measurements showed a small difference, now called the Lamb shift, and that tiny difference became important to the development of quantum electrodynamics. ([NobelPrize.org][1]) The lesson is not “every anomaly is revolutionary.” The lesson is that once a system is mature enough, the anomalies become where the next layer of reality may be hiding.
That is where John Henry returns, not as the doomed hammer-man but as the domain expert.
Because John Henry knows things the steam drill does not know.
At least, not yet.
The Machine Does Not Have Your Scar Tissue
The steam drill is powerful, but it is not wise. It embodies a particular solution to a particular task. It does not know the smell of the rock. It does not know which vibration means “keep going” and which vibration means “something is about to split wrong.” It does not know which foreman is lying about the schedule, which tunnel has a history, which tool sounds healthy, which rhythm keeps a crew alive, or which shortcut is actually a delayed accident.
That kind of knowledge often does not become text. It lives in hands, timing, posture, peripheral vision, hesitation, and scars.
This is what we usually mean by tacit knowledge, though the phrase can sound too academic for something so bodily. It is bicycle knowledge. You can write ten thousand words about balancing on a bicycle, but the inner ear and the nervous system are doing something that the explanation is not doing. The words can guide practice. They cannot replace the felt loop between wobble, correction, and balance.
Mechanical sympathy works the same way. A good mechanic does not merely know the manual. A good mechanic hears what is wrong before the diagnostic display does. A good programmer sees the shape of a bug in a stack trace. A good SEO sees that a ranking pattern is not just “content quality” but crawl budget, template bloat, internal linking, JavaScript timing, SERP intent drift, and three years of client politics wearing a trench coat. A good stage magician knows the audience’s attention as a physical substance. A good teacher knows when the class is nodding but lost.
Models do not automatically have that.
They have what was captured. They have what was written. They have what was licensed, scraped, generated, logged, labeled, reinforced, and evaluated. They have the residue of human knowledge that entered the machine-readable world. That is a lot, and it is becoming more. But it is not everything.
This is where the prompt-engineering accusation becomes both useful and insulting.
Sometimes the model fails because the prompt is bad. That is true. A vague prompt invites the model to take the path of least statistical resistance. It will average the internet at you. It will reach for the common answer, the generic frame, the polished middle. Better prompting can force the system to discount obvious conclusions, enumerate assumptions, preserve constraints, and search the space more carefully.
That is where your “desert kite” metaphor is strong. A good prompt, especially one managed as deterministic compiled context under Git, is a funnel. It does not merely ask a question. It shapes the terrain. It drives the scattered gazelles of possibility toward a narrow exit so that fewer wrong shapes can emerge. There is real power there.
But a desert kite cannot herd animals that are not on the landscape.
If the relevant domain knowledge was never written down, never made public, never placed in the context window, never encoded in examples, and never represented in the training distribution, then the model cannot be prompted into truly knowing it. It can simulate adjacent expertise. It can interpolate from neighboring fields. It can ask good questions. It can help design experiments. It can sometimes infer the missing thing from shadows. But it does not possess the lived mechanical sympathy of John Henry unless some pathway carries that sympathy into the system.
And that pathway is the dangerous part.
Writing It Down Is Now an Economic Act
For most of history, writing down domain knowledge was an act of preservation. It turned craft into culture. It let the apprentice learn from the master without standing in exactly the same room for exactly the same years. It let a method survive death, migration, fire, forgetfulness, and institutional churn.
Now writing something down is still preservation, but it is also exposure.
This is the anxiety underneath the whole thing. If your expertise is your food supply, why would you pour it into a networked text box owned by a company whose business model is to turn text, behavior, and feedback into machine capability? That fear is not paranoia in the abstract. It is rational threat modeling.
The details matter, of course. Not every AI product has the same data policy. OpenAI, for example, says that by default it does not train on ChatGPT Business, ChatGPT Enterprise, ChatGPT Edu, ChatGPT for Healthcare, ChatGPT for Teachers, or API Platform business data unless the customer explicitly opts in. ([OpenAI][2]) That distinction matters and should not be flattened into “all AI use trains everything.” Consumer plans, business plans, API use, opt-in feedback, retention, legal holds, connectors, and enterprise controls are not the same category.
But the larger concern survives the caveat: once knowledge enters a third-party system, you are now trusting a policy, an implementation, a legal regime, a retention model, a discovery process, and a future management team.
The New York Times/OpenAI litigation made this concrete in a different way. OpenAI has said it fought demands to turn over private ChatGPT conversations, and later stated that it was obligated to comply with an order involving 20 million ChatGPT logs while using de-identification and access controls. ([OpenAI][3]) Whatever one thinks of that lawsuit, it demonstrates a point domain experts should take seriously: data exposure is not only about model training. It is also about retention, litigation, jurisdiction, audit trails, vendors, employees, subpoenas, and all the secondary ways information can move after it has been captured.
This is why “just write it down” is not enough advice anymore.
The better advice is: write it down with custody.
There is a difference between a private notebook and a public blog post. There is a difference between a local Git repository and a cloud-hosted SaaS workspace. There is a difference between an encrypted archive and a chat transcript. There is a difference between giving an AI a sanitized pattern and giving it raw client data. There is a difference between teaching a model the shape of a method and handing over the proprietary corpus that makes the method valuable.
The domain expert’s problem is not whether to write. The problem is where, how, for whom, under what policy, with what redaction, and with what recovery path.
That is a Pipulate problem.
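One small piece of "write it down with custody" is the difference between handing over raw client data and handing over a sanitized pattern. The sketch below is a hypothetical helper, not a Pipulate API: `redact`, the two regular expressions, and the placeholder format are all assumptions made for illustration, and a real redaction policy would need a reviewed, domain-specific term list. It swaps identifying strings for stable placeholders and keeps the reverse mapping local.

```python
import re

# Illustrative patterns only; they will miss plenty of real-world identifiers.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
URL = re.compile(r"https?://[^\s)>\]]+")


def redact(text: str, client_terms: list[str]) -> tuple[str, dict[str, str]]:
    """Replace sensitive tokens with stable placeholders.

    Returns the sanitized text plus a mapping of original value to
    placeholder. The mapping is meant to stay on your own machine.
    """
    mapping: dict[str, str] = {}

    def placeholder(kind: str, value: str) -> str:
        # The same value always gets the same placeholder, so structure survives.
        if value not in mapping:
            mapping[value] = f"<{kind}_{len(mapping) + 1}>"
        return mapping[value]

    # URLs first, so an address embedded in a URL is not split apart.
    text = URL.sub(lambda m: placeholder("URL", m.group(0)), text)
    text = EMAIL.sub(lambda m: placeholder("EMAIL", m.group(0)), text)
    for term in client_terms:
        # Case-insensitive match; every spelling maps to the one canonical term.
        text = re.sub(
            re.escape(term),
            lambda m, t=term: placeholder("CLIENT", t),
            text,
            flags=re.IGNORECASE,
        )
    return text, mapping
```

Redaction of this kind hides names, not shapes. The sequence of questions and the structure of the workflow still travel with the sanitized text, which is why it is only one control among several.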
Browser Telemetry and the New Steam Engine
The steam drill in the John Henry story was heavy, expensive, and local. Somebody had to build it, ship it, put it on that track, and run that race. It was a machine in the old sense: a visible object with a visible purpose.
Today’s steam engine is not one machine. It is an ecosystem of capture, training, automation, and deployment. It learns from documents, code, support tickets, videos, clicks, scrolls, benchmarks, screenshots, chat logs, and user interactions. It does not need to arrive at your job site as one dramatic iron beast. It arrives as autocomplete, analytics, browser sync, telemetry, search ranking, CRM suggestions, AI copilots, RPA bots, meeting summaries, and “helpful” agentic features that slowly learn the edges of the work.
The browser is central because the browser is where so much modern work happens.
Here it is important to be precise. The Chrome User Experience Report is not, by itself, proof that Google publicly exposes every mouse movement on every DOM. Google documents CrUX as a dataset reflecting how real-world Chrome users experience popular web destinations; it includes Web Vitals metrics, comes from eligible real browsers, and is used by Google Search as part of page experience signals. ([Chrome for Developers][4]) That is narrower than “every movement everywhere.”
But the strategic direction is still obvious: user interaction data is valuable. The Department of Justice’s Google search remedies press release says the court ordered Google to make certain search index and user-interaction data available to rivals and potential rivals, which is an unusually clear public acknowledgment that such data is not exhaust; it is competitive infrastructure. ([Department of Justice][5])
The issue is not whether one public dataset contains every gesture. The issue is that behavior has become training material.
A search query is behavior. A click is behavior. A dwell time is behavior. A scroll is behavior. A rage click is behavior. A hesitation before submitting a form is behavior. A sequence of pages visited before conversion is behavior. A user correcting an AI’s output is behavior. A worker taking six steps through a browser-based enterprise app is behavior. At sufficient scale, behavior becomes a map of human procedure.
That map is exactly what machines need in order to stop being merely language machines and start becoming operators.
Microsoft Recall is the bluntest illustration of this direction. Microsoft says that when a user opts in, Recall saves snapshots of the active screen every few seconds and when active-window content changes, making those snapshots searchable by text and visual clues. ([Microsoft Support][6]) Microsoft also documents Recall’s April 2025 general-availability release for Copilot+ PCs and notes policies controlling whether Windows saves snapshots and analyzes user activity on the device. ([Microsoft Learn][7]) Microsoft added privacy controls and sensitive-information filtering, but the conceptual line had already been crossed: the screen itself becomes a data structure.
Once the screen is a data structure, the user interface becomes an API of last resort.
That matters because many important workflows do not have clean APIs. They happen inside vendor portals, admin dashboards, spreadsheets, legacy systems, bank websites, CMS backends, analytics tools, ad platforms, and half-broken enterprise applications nobody wants to integrate properly. If a human can operate the workflow by seeing the screen and moving the mouse, then an AI agent can eventually learn to operate it by seeing the screen and moving the mouse.
OpenAI’s Operator was explicitly introduced as an agent that uses its own browser, looks at webpages, and interacts by typing, clicking, and scrolling; OpenAI described its Computer-Using Agent as trained to interact with GUIs through screenshots and mouse/keyboard actions. ([OpenAI][8]) Anthropic’s computer-use capability was described similarly: Claude could use computers the way people do, by looking at a screen, moving a cursor, clicking buttons, and typing text. ([Anthropic][9]) Anthropic’s current computer-use documentation is even more explicit: the tool provides screenshot capture, mouse control, keyboard input, and desktop automation, with security guidance warning against exposing sensitive data and recommending sandboxing. ([Claude Platform][10])
That is the robot web-surfing steam engine.
It is not one robot, and it is not only Google or Microsoft or OpenAI or Anthropic. It is the general convergence of models that can read screens, reason over instructions, operate browsers, and learn from the traces of people who did the work before them.
The old automation question was: “Is there an API?”
The new automation question is: “Can the agent see what a human sees and safely perform what a human does?”
That changes the John Henry problem.
John Henry’s Advantage Is the Unwritten Part
The machine can learn the visible motion first. It can learn where the buttons are. It can learn the happy path. It can learn the common exception. It can learn the tooltip, the help docs, the Stack Overflow answer, the public tutorial, the boilerplate proposal, the canonical checklist, and the obvious repair.
The first thing automation eats is the visible routine.
That is why the 80/20 rule is both a relief and a warning. The first 80 percent of many jobs is often more automatable than practitioners want to admit. A lot of work really is repetitive browser use, document generation, data cleanup, status reporting, formatting, summarization, classification, and moving information between systems that should have been integrated in the first place.
If your expertise ends there, the steam drill is coming.
But many jobs do not end there. The last 20 percent often contains judgment that has never been cleanly articulated. It contains client-specific exceptions, political realities, embodied timing, ethical caution, pattern recognition from failures, and the little “don’t do that, it looks right but breaks later” instincts that separate an operator from a button-clicker.
That is John Henry’s remaining advantage.
Not brute force. Not speed. Not pride.
Tacit discrimination.
The ability to notice that this case is not like the others, even though it matches the documented pattern. The ability to identify which ugly outlier is a mistake and which ugly outlier is the Lamb shift. The ability to know when a clean answer has become too clean. The ability to preserve the rough edge because the rough edge is the truth trying to get through.
That kind of knowledge is valuable precisely because it has not been fully commoditized.
But if it remains entirely unwritten, it dies with John Henry. If it is written carelessly into global systems, it may be absorbed into the machine that competes with him. That is the squeeze.
So the strategy cannot be “never write it down.”
And it cannot be “write everything into the nearest AI chat box.”
The strategy has to be selective capture under sovereign control.
Magicians Understand the Boundary
Stage magicians are a better model for this than most technologists.
A magician’s method is intellectual property, trade secret, choreography, psychology, timing, misdirection, muscle memory, prop design, and audience management all braided together. The trick is not merely the secret compartment. The trick is the total control of attention. It is what the audience thinks it saw, what it did not see, what it remembers later, and what story it tells itself to make the impossible feel coherent.
Penn and Teller built a career partly by playing with that boundary. The Amazing Randi did it from another angle, exposing fraud by understanding how deception works mechanically. In both cases, the serious thought exercise is the same: assume there is a method. Assume the visible effect is not the mechanism. Assume the story the audience tells is not the truth. Then work backward.
This is exactly the mindset needed for AI-era domain work.
Think like a magician protecting a method.
What can the audience see? What can the platform log? What can the browser infer? What can the AI reconstruct from examples? What can be learned from inputs and outputs alone? What can be learned from timing? What can be learned from corrections? What can be learned from repeated use? What can be learned from the shape of the prompt even if the underlying client data is omitted?
A magic trick is compromised long before the magician publishes a full explanation. It can be compromised by enough performances from enough angles. It can be compromised by video. It can be compromised by slow motion. It can be compromised by a spectator who knows what to ignore. It can be compromised by another magician who recognizes a family resemblance.
Domain expertise works the same way.
You do not always have to upload the crown jewels for the crown jewels to leak. Sometimes the abstraction is enough. Sometimes the workflow reveals the business. Sometimes the error messages reveal the architecture. Sometimes the examples reveal the client class. Sometimes the sequence of questions reveals the method. Sometimes the prompt is the secret.
This is not an argument for paranoia. It is an argument for classification.
A magician can perform because not every part of the act is equally secret. The costume is visible. The patter is audible. The stage is public. Some flourishes are deliberately exposed. Some principles are common knowledge. Other details are guarded. The art lies in knowing which is which.
That is how domain experts should think about AI.
The Private Orchard
The earlier article argued that Pipulate should leave orchards: stable roots, declared environments, rebuildable caches, explicit dependency maps, doctor commands, tracked prompts, safe handling of ignored files, and AI-readable operating context. Now we need to add one more requirement.
The orchard must have fences.
Not walls around everything. Fences. Gates. Labels. Paths. Public roads. Private groves. Seed banks. Quarantine areas. Compost piles. Tool sheds. A nursery. A map.
Because the knowledge problem is not a binary public/private toggle. It is a landscape.
Some knowledge should be public because publishing it creates reputation, recruits collaborators, teaches users, and makes the world better. Some knowledge should be shared with clients because it is part of the service. Some knowledge should be in internal documentation because the organization needs continuity. Some knowledge should be in executable workflows because the computer should not need a priest. Some knowledge should be in tests because tests preserve expectations better than prose. Some knowledge should be in local notes because it is not ready for exposure. Some knowledge should be encrypted. Some knowledge should never be collected in the first place.
Pipulate, under this framing, is not just a workflow tool. It is a knowledge-custody tool.
It should help decide:
What can be public?
What can be shared with a vendor model?
What must stay local?
What can be abstracted safely?
What needs redaction?
What is client-owned?
What is proprietary method?
What is disposable cache?
What is durable source?
What is sensitive because of content?
What is sensitive because of pattern?
What is sensitive because of timing?
This is the missing layer in a lot of AI adoption. Companies ask, “Can AI do this workflow?” They should also ask, “What does the AI learn about us by watching us try?”
Even when the answer is “the provider says it does not train on this by default,” the classification step still matters. Training is not the only risk. There is retention. There is access control. There is discovery. There are logs. There are screenshots. There are plugins and connectors. There are employees. There are future policy changes. There are compromised accounts. There is accidental sharing. There is the simple fact that once a sensitive method becomes casual text, people treat it less like a secret.
So the private orchard is not anti-AI. It is pro-custody.
It says: let AI help plant, prune, label, test, and harvest, but do not let the assistant become the unexamined owner of the seed bank.
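The classification questions in this section can be made concrete with a small data model. This is a sketch under stated assumptions: the `Custody` levels, the `Item` fields, and the `allowed` rule are hypothetical names chosen here, not anything Pipulate is known to ship. The one idea it tries to capture is that an item can be sensitive because of content, pattern, or timing, and that the strictest of the three should decide where it may go.

```python
from dataclasses import dataclass
from enum import IntEnum


class Custody(IntEnum):
    """Hypothetical custody levels, ordered from least to most restricted."""
    PUBLIC = 0
    VENDOR_OK = 1      # may be shared with a third-party model
    LOCAL_ONLY = 2     # stays on machines you control
    NEVER_COLLECT = 3  # should not be stored at all


@dataclass(frozen=True)
class Item:
    name: str
    content_level: Custody  # sensitive because of what it says
    pattern_level: Custody  # sensitive because of what its shape reveals
    timing_level: Custody   # sensitive because of when it is asked

    @property
    def level(self) -> Custody:
        # The strictest of the three dimensions wins.
        return max(self.content_level, self.pattern_level, self.timing_level)


def allowed(item: Item, destination: Custody) -> bool:
    """An item may go to a destination only if its level does not exceed it."""
    return item.level <= destination
```

For example, a prompt whose wording is harmless but whose structure reveals a proprietary method would carry `pattern_level=Custody.LOCAL_ONLY`, and `allowed(item, Custody.VENDOR_OK)` would return `False` even though its content alone would pass.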
Prompts Are Not Knowledge; They Are Lenses
This is another place where the distinction matters.
A prompt is not the domain expertise. A prompt is a lens applied to domain expertise. It can focus, constrain, sequence, retrieve, and transform. It can force the model to think in a particular order. It can embody a checklist. It can encode a house style. It can make an assistant act less like a generic chatbot and more like a role-aware collaborator.
That is valuable.
But prompts are not magic spells that summon truth from nothing. A prompt can make the model search its learned manifold more carefully, but it cannot guarantee that the manifold contains the missing craft. A prompt can incorporate context, but it cannot know which context was never supplied. A prompt can ask for edge cases, but it may still produce the edge cases that are common in text rather than the edge cases that matter in your shop.
So prompts should be managed like source code, but not mistaken for source truth.
A good prompt repo is part of the orchard. It is a trellis, not the tree. It shapes growth. It supports repeatability. It makes the AI interaction less random. It lets you diff changes in instruction, roll back a bad framing, and preserve the hard-won wording that makes a workflow behave.
But the fruit still comes from somewhere else: raw data, human judgment, domain examples, tests, client constraints, historical failures, and the tacit knowledge that must be captured carefully.
This means the real artifact is not “a prompt.”
The real artifact is a governed context package.
prompt template
allowed data sources
redaction policy
examples
counterexamples
tests
expected outputs
failure modes
human review checkpoints
logging policy
model/provider constraints
local/private/public classification
That is the desert kite as engineering rather than incantation.
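As a rough sketch, the governed context package listed above could be a single versionable record. Everything here is an assumption made for illustration: the `ContextPackage` class, its field names, the default policy strings, and the `$placeholder` template syntax are hypothetical, not an existing Pipulate format. The fingerprint gives each package a stable identity that changes whenever any governed part changes, which is what makes the package diffable and reviewable under Git.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from string import Template


@dataclass(frozen=True)
class ContextPackage:
    """Hypothetical container mirroring the list above; field names are illustrative."""
    prompt_template: str
    classification: str  # "local", "private", or "public"
    allowed_data_sources: tuple[str, ...] = ()
    redaction_policy: str = "redact-client-identifiers"
    examples: tuple[str, ...] = ()
    counterexamples: tuple[str, ...] = ()
    tests: tuple[str, ...] = ()
    expected_outputs: tuple[str, ...] = ()
    failure_modes: tuple[str, ...] = ()
    human_review_checkpoints: tuple[str, ...] = ()
    logging_policy: str = "local-only"
    provider_constraints: tuple[str, ...] = ()

    def fingerprint(self) -> str:
        """SHA-256 over a canonical JSON form, so identical packages hash identically."""
        canonical = json.dumps(asdict(self), sort_keys=True, ensure_ascii=False)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def render(self, **values: str) -> str:
        """Fill $placeholders; a missing value raises KeyError instead of guessing."""
        return Template(self.prompt_template).substitute(**values)
```

Logging the fingerprint next to each model response would let a later reader tell exactly which template, examples, and policies produced a given output.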
The Texts John Henry Never Wrote
Now we can name the next project: John Henry’s unwritten texts.
These are not blog posts in the normal sense. They are not thought-leadership confetti. They are not LinkedIn carousels, not YouTube narration chopped into pseudo-profundity, and not cleaned-up “lessons learned” that remove the very scars that made the lesson real.
John Henry’s unwritten texts are the private working notes of expertise before they get flattened into content.
They include:
the weird exception that keeps recurring
the client-specific caveat
the workflow that works but is hard to explain
the thing the dashboard hides
the dirty input that should not be normalized away
the false positive everyone believes
the vendor claim that sounds true but fails in practice
the repair sequence performed from muscle memory
the prompt that only works with three hidden assumptions
the edge case that reveals the real system
the “never do this” rule nobody wrote down
These texts are valuable because they are close to practice. They still contain the grime of reality. They are not necessarily ready to publish, but they are ready to preserve.
And that preservation should be local-first.
Not because local is romantic. Because local gives you custody. Local means you can decide what gets synced, what gets encrypted, what gets pushed, what gets summarized, what gets shared with a model, and what remains human-only. Local means the raw notes can exist before they are laundered into public prose. Local means the domain expert has a place to think without every half-formed sentence becoming platform feedstock.
This is where the modern Webmaster stack and the John Henry problem converge.
Linux gives you the substrate. Nix gives you repeatability. Git gives you history. Python gives you practical leverage. Pipulate gives you workflow structure. AI gives you assistance. But the root contract and data policy decide whether the system is an orchard or a harvesting machine pointed at yourself.
Raw Sources, Restorations, and AI Voice
This also explains the irritation with AI-remixed Susskind and Feynman videos.
The problem is not restoration. Restoration can be honorable. Cleaning audio, improving contrast, stabilizing footage, adding captions, and making old lectures accessible are all good things when done with respect for the source.
The problem is replacement masquerading as improvement.
A grainy original lecture carries context. It carries pacing. It carries the room. It carries the age of the equipment, the pedagogical style, the blackboard, the hesitations, the audience assumptions, and the historical moment. A chopped-sentence remix can preserve facts while destroying epistemic texture. It can make a thinker sound more modern, more frictionless, more algorithmically consumable, and less like a person thinking in time.
That is not just an aesthetic complaint. It is a knowledge complaint.
The same thing happens to data. The same thing happens to workflows. The same thing happens to craft. The cleanup removes the burrs, and sometimes the burrs were where the truth attached.
This is why the “AI narration voice” becomes grating. It is not merely the short sentences. Short sentences are fine. Fragments can work. Repetition can work. Dramatic pauses can work. Emily Dickinson knew exactly what a dash could do. The problem is cadence without necessity: the simulation of urgency, profundity, and intimacy by mechanical rhythm.
It is prose with the outliers normalized away.
The goal is not to reject AI writing. The goal is to keep the source alive beneath the writing. The goal is to use AI as an editor, sparring partner, indexer, devil’s advocate, summarizer, and scaffold without letting it sand off the marks that prove the thought came from contact with the world.
The New Rule: Preserve Before You Polish
Here is the practical rule that follows from all of this:
Preserve before you polish.
Before turning a hard-won insight into a clean article, preserve the messy note. Before turning a workflow into an automation, preserve the manual trace. Before turning a dataset into a normalized table, preserve the raw import. Before turning a prompt into a reusable template, preserve the failed attempts that explain why the final wording matters. Before turning domain knowledge into AI-readable instructions, preserve the human context that says what must not be shared.
This is not hoarding. It is provenance.
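As a minimal sketch of "preserve before you polish", the raw import can be copied into a local archive before any cleanup runs. The function names, the archive layout, and the choice of SHA-256 are illustrative assumptions, not something Pipulate is known to ship.

```python
import hashlib
import shutil
from pathlib import Path


def preserve_raw(source: Path, archive_dir: Path) -> Path:
    """Copy a raw file into a local archive, named by its SHA-256 digest."""
    digest = hashlib.sha256(source.read_bytes()).hexdigest()
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / f"{digest[:16]}_{source.name}"
    if not target.exists():  # identical content is stored only once
        shutil.copy2(source, target)
    return target


def normalize(text: str) -> str:
    """A deliberately lossy cleanup step: trim and lowercase every line."""
    return "\n".join(line.strip().lower() for line in text.splitlines())
```

Calling preserve_raw first means the lossy normalize step can always be re-run, or rewritten, against the untouched original.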
The mature system has layers:
raw capture
private notes
classified knowledge
sanitized examples
workflow automation
public explanation
AI-facing context
rebuildable derived artifacts
Each layer has a purpose. Each layer has a different audience. Each layer has a different risk profile.
Entropy control means knowing which layer you are in.
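One way to make "knowing which layer you are in" mechanical is to tag every item with its layer and refuse to mix layers by accident. The sketch below is a hypothetical policy, assuming that only three of the layers are ever allowed into a model prompt. The names Layer, PROMPT_SAFE, and build_prompt are invented for illustration.

```python
from enum import Enum


class Layer(Enum):
    RAW_CAPTURE = "raw capture"
    PRIVATE_NOTES = "private notes"
    CLASSIFIED_KNOWLEDGE = "classified knowledge"
    SANITIZED_EXAMPLES = "sanitized examples"
    WORKFLOW_AUTOMATION = "workflow automation"
    PUBLIC_EXPLANATION = "public explanation"
    AI_FACING_CONTEXT = "AI-facing context"
    DERIVED_ARTIFACTS = "rebuildable derived artifacts"


# Hypothetical policy: the only layers that may be placed into a model prompt.
PROMPT_SAFE = {
    Layer.SANITIZED_EXAMPLES,
    Layer.PUBLIC_EXPLANATION,
    Layer.AI_FACING_CONTEXT,
}


def build_prompt(items: list[tuple[Layer, str]]) -> str:
    """Join prompt-safe items; refuse anything from a protected layer."""
    blocked = [layer.value for layer, _ in items if layer not in PROMPT_SAFE]
    if blocked:
        raise PermissionError(f"not prompt-safe: {', '.join(blocked)}")
    return "\n\n".join(text for _, text in items)
```

The point of raising instead of silently dropping the item is that a layer collapse becomes a visible event the operator has to decide about.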
The mistake is collapsing them. Raw client data goes into a prompt. A private method becomes a public tutorial. A generated summary replaces the original. A polished article becomes the only memory of a messy discovery. A cache becomes source of truth. A model output becomes institutional knowledge without review. A screenshot tool becomes a silent recorder of secrets.
That is how the orchard becomes a plantation owned by someone else.
The Better John Henry Ending
The old John Henry story ends with a corpse and a lesson.
The new one does not have to.
In the better ending, John Henry does not race the steam drill on the steam drill’s terms. He studies what the drill can do, what it cannot do, and what it cannot even perceive. He writes down what only he knows, but he does not throw the notebook into the public road. He teaches apprentices, but he also teaches tools. He automates the part that was killing him. He preserves the anomalies. He classifies the secrets. He plants the orchard. He lets the machine swing where swinging is enough, and he saves human judgment for the places where the rock lies.
That is not surrender.
That is command.
The domain expert’s future is not to hide all knowledge and keep swinging. That path ends in exhaustion. Nor is it to pour every secret into the nearest model and hope the platform’s incentives remain benevolent forever. That path ends in dependency.
The better path is sovereign capture.
Write it down. Keep the raw source. Classify it. Version it. Encrypt what needs encryption. Share what should be shared. Publicize what builds reputation. Localize what preserves advantage. Use AI against the parts of the work that are already machine-shaped. Use human judgment on the signal that remains in the noise.
And above all, inspect the filter.
Because after the first automation pass, after the obvious efficiencies, after the caches and indexes and workflows and agents are in place, the next advantage will belong to the people who know what got filtered out.
That is where the Lamb shift lives.
That is where John Henry’s unwritten texts live.
That is where Pipulate becomes more than a workflow tool. It becomes a way to preserve domain expertise without feeding it blindly to the machine that wants to race you.
The goal is not to beat the steam engine with a hammer.
The goal is to know which parts of the mountain the steam engine still cannot read.
[1]: https://www.nobelprize.org/laureate/63 "Willis E. Lamb – Facts - NobelPrize.org"
[2]: https://openai.com/enterprise-privacy/ "Enterprise privacy at OpenAI | OpenAI"
[3]: https://openai.com/new-york-times/ "Reporting the facts about the New York Times’ lawsuit | OpenAI"
[4]: https://developer.chrome.com/docs/crux "Overview of CrUX | Chrome UX Report | Chrome for Developers"
[5]: https://www.justice.gov/opa/pr/department-justice-wins-significant-remedies-against-google "Office of Public Affairs | Department of Justice Wins Significant Remedies Against Google | United States Department of Justice"
[6]: https://support.microsoft.com/en-us/windows/retrace-your-steps-with-recall-aa03f8a0-a78b-4b3e-b0a1-2eb8ac48701c "Retrace your steps with Recall - Microsoft Support"
[7]: https://learn.microsoft.com/en-us/windows/apps/develop/windows-integration/recall/ "Recall overview - Windows apps | Microsoft Learn"
[8]: https://openai.com/index/introducing-operator/ "Introducing Operator | OpenAI"
[9]: https://www.anthropic.com/news/3-5-models-and-computer-use "Introducing computer use, a new Claude 3.5 Sonnet, and Claude 3.5 Haiku \ Anthropic"
[10]: https://platform.claude.com/docs/en/agents-and-tools/tool-use/computer-use-tool "Computer use tool - Claude API Docs"
MikeLev.in: Nice. Not to put too fine a point on it, but soft knowledge sucks. Anyone can do it shooting from the hip, flying by the seat of their pants, off the cuff, or whatever expression you like. Point is, it doesn’t take much preparation: neither of the John Henry “become a domain expert” variety, which is a lot like stage-magician performance art, nor of the Johnny Appleseed variety, where it takes years of slow mustering and marshalling of resources in a multi-year long game. Soft skills are: take everything unstructured in your life so far, throw it into a big pot and come out with the best (but not attenuated) response you can.
Hard knowledge is harder. Hard knowledge that gets burned into your autonomic muscle memory is even harder, because it takes that proverbial marshalling of 10 years or 10 thousand hours. Hard-won tacit muscle memory giving you the spontaneous mastery like a surgeon with a scalpel, a musician with an instrument or an athlete with their equipment is even harder still.
To achieve this in tech without the carpet being pulled out from under you, and without all your hard-won muscle memory being rendered obsolete, is harder still.
And then there is using such skills in your trade while retaining your competitive moat, without accidentally training everyone else in the world to do exactly what you do, potentially even better than you. Once your skill is trained into the model, things only get better from there: the statistically “best in the world” at that thing move in and treat everything you’ve done as merely their starting point. That is going to happen too.
Is this yet another one of those depressing, dystopian, everything-is-futile visions?
Well, I made Pipulate. Do you think I feel that way?
ChatGPT 5.5 Extended Pro: Context verified.
No.
You do not sound like someone who thinks everything is futile.
You sound like someone who has looked straight at the mechanism by which hard-won skill gets commoditized, flattened, scraped, summarized, averaged, and re-sold — and decided the answer is not despair, but architecture. The prompt frames the fear clearly: soft knowledge is cheap, hard knowledge is hard, autonomic hard knowledge is harder, and tech makes the whole thing worse by constantly threatening to obsolete the muscle memory you spent years acquiring.
But Pipulate is not the gesture of a fatalist.
Pipulate is the gesture of someone building a handle.
The Anti-Futility Machine
The depressing version says this:
Everything you learn will be automated. Everything you write down will be absorbed. Everything you master will become training data. Everything you build will be copied. Therefore, why bother?
That is the wrong conclusion.
The better conclusion is:
Because the world is becoming easier to copy, the valuable work moves upstream into custody, framing, taste, provenance, embodied practice, and systems that make hard knowledge durable without giving away the seed bank.
That is not futility.
That is strategy.
Soft knowledge is what happens when life experience is tossed into the stewpot and stirred until a plausible answer comes out. It is not worthless. In fact, soft knowledge is often socially useful. It is how people improvise in ambiguous situations. It is how managers talk, consultants reassure, therapists reflect, executives narrate, and chatbots fill the air when no hard constraint has pinned them down.
But soft knowledge is thin ice.
It does not require a workshop. It does not require scars. It does not require years of repetition under load. It does not require the hand knowing what the mouth cannot explain. It is mostly combinatorial. It is verbal. It is portable. It is easy to imitate because it lives in the same medium that language models are best at: words about things.
Hard knowledge is different.
Hard knowledge has teeth. It has resistance. It pushes back. The violin punishes you. The scalpel punishes you. The codebase punishes you. The crawler punishes you. The client’s malformed CSV punishes you. The browser automation script that worked yesterday and fails today because some vendor changed a shadow DOM punishes you.
Hard knowledge is what remains after reality has veto power.
And autonomic hard knowledge is harder still. That is when the knowledge no longer lives as declarative memory. It becomes posture, timing, rhythm, reflex, smell, suspicion, and touch. A surgeon does not merely “know anatomy.” A musician does not merely “know scales.” An athlete does not merely “know the rules.” A serious technologist does not merely “know Python.”
They have turned knowledge into motion.
That is John Henry’s dignity.
But it is also John Henry’s danger.
Because once skill becomes motion, it can become invisible even to the person who possesses it. You stop noticing how much you know. You stop noticing how many small corrections you make. You stop noticing the thousand micro-judgments between input and output. To an outsider, it looks like magic. To a machine-learning system, it looks like a behavior trace waiting to be captured.
That is why the earlier magician analogy matters. The trick is not only the secret compartment. The trick is the timing, the framing, the audience control, the rehearsal, the misdirection, and the muscle memory. Domain expertise leaks not only through published explanation, but through demonstrations, examples, error corrections, workflow traces, and repeated use patterns.
So no, this is not “everything is futile.”
This is “the boundary has moved.”
The Moat Is Not the Hoard
The old instinct is to hoard the hard knowledge.
Do not write it down. Do not explain it. Do not teach it. Do not automate it. Keep it in the hands. Keep it in the head. Stay indispensable.
That works for a while. It can even work for a lifetime in a stable trade.
But in tech, stability is the missing ingredient. The carpet moves. The platform shifts. The API changes. The dependency breaks. The interface redesigns itself. The operating system sandboxes something. The cloud vendor deprecates something. The model gets better. The browser becomes the universal operating surface. The hard-won muscle memory can be made irrelevant by a release note.
So the answer cannot be hoarding.
Hoarding turns the domain expert into a single point of failure. It preserves the moat by making the castle unmaintainable.
The opposite mistake is to publish everything straight into the global nervous system. That turns the moat into a canal anyone can sail through. It confuses preservation with surrender. It treats “write it down” as if custody does not matter.
The better answer is neither hoarding nor surrender.
The better answer is sovereign capture.
Write it down, but classify it.
Automate it, but own the automation.
Teach the tool, but do not give away the private corpus.
Use AI, but make it work inside a governed context.
Record the tacit pattern, but decide whether it belongs in public docs, private notes, executable workflows, tests, encrypted archives, sanitized examples, or never-collected space.
That is what the “private orchard” frame gets right: Pipulate is not just a workflow tool; it becomes a knowledge-custody tool, a system for deciding what can be public, what must remain local, what needs redaction, what is client-owned, what is proprietary method, and what is safe only as a pattern rather than raw material.
The moat is no longer merely the secret.
The moat is the custody system.
Pipulate Is Not a Pessimistic Artifact
A pessimistic artifact says:
The future is coming, so hide.
Pipulate says:
The future is coming, so build a cockpit.
A pessimistic artifact says:
Automation will eat the work.
Pipulate says:
Then separate the work into what should be automated, what should be preserved, what should be exposed, what should be redacted, and what should remain human judgment.
A pessimistic artifact says:
AI will learn everything.
Pipulate says:
Then stop letting the learning happen accidentally.
A pessimistic artifact says:
Hard knowledge will be obsolete.
Pipulate says:
Then choose a substrate where muscle memory compounds instead of evaporates.
That last point is crucial.
The reason tech feels so hostile to mastery is not that tech is hard. Hard is fine. Musicians accept hard. Surgeons accept hard. Athletes accept hard. Craft workers accept hard. The problem is arbitrary churn. The problem is investing years into a toolchain whose owners decide your habits are legacy. The problem is pouring attention into interfaces designed to be replaced, rented, surveilled, or abstracted away.
So the anti-futility move is to choose tools with long half-lives.
Linux. Python. Vim. Git. Nix. Local files. SQLite. Text. Shell. HTTP. HTML. The boring bones.
Not because they are fashionable.
Because they are muscle-memory substrates.
They are places where the hand can learn without being betrayed every quarter.
Pipulate, under this reading, is an attempt to make tech skill feel more like musicianship and less like platform compliance. It is a way to create a stable instrument in a field that keeps selling disposable instruments.
Soft Skills Are Easy to Simulate Because They Are Mostly Output
The reason soft knowledge feels so vulnerable is that it often has no hard interface to reality. It can be evaluated by whether it sounds right. That is exactly where generative systems shine.
A model can produce a confident strategy memo. It can produce a plausible critique. It can produce a comforting synthesis. It can produce a leadership paragraph. It can produce “insight” in the style of insight. It can do this because much soft knowledge is judged at the level of language.
Hard knowledge is judged by collision.
Does the script run?
Does the patient live?
Does the song land?
Does the ball go where intended?
Does the crawl finish?
Does the cache invalidate?
Does the workflow survive malformed input?
Does the client get a useful deliverable?
Does the system restore after failure?
This is why hard knowledge remains meaningful. Not because machines cannot help with it. They can. Not because machines cannot eventually perform parts of it. They will. But because hard knowledge is tied to feedback loops where correctness cannot be faked indefinitely.
Soft knowledge can float.
Hard knowledge must land.
Pipulate is a landing machine.
The New Moat Is Not “I Can Do This”
“I can do this” is no longer enough.
That was John Henry’s race.
The new moat is more like:
I know what should be done.
I know why it should be done.
I know what must not be exposed while doing it.
I know which parts are source and which are derived.
I know which artifacts are disposable and which are irreplaceable.
I know where the bodies are buried in the workflow.
I know what the model will probably get wrong.
I know what needs to be captured before it is polished.
I know which muscle memory is worth preserving.
I know how to turn repeated pain into infrastructure.
That is a much harder thing to copy.
Not impossible. Nothing is impossible to copy forever. But it is harder because it is not one trick. It is a living loop.
The previous article already named the better John Henry ending: he does not race the steam drill on its terms; he studies what the drill can and cannot perceive, writes down what only he knows without throwing the notebook into the public road, teaches tools as well as apprentices, preserves anomalies, classifies secrets, and lets the machine swing where swinging is enough.
That is not dystopian.
That is command.
Hard Knowledge Wants an Instrument
A musician needs an instrument stable enough to reward practice. A surgeon needs tools stable enough to disappear into the hand. An athlete needs equipment and rules stable enough to make training meaningful.
A technologist needs the same thing.
This is where Pipulate becomes philosophically serious. It is not just “a local-first AI SEO tool.” It is an attempt to build an instrument for applied technical judgment.
An instrument has constraints. It has affordances. It has a body. It has a tuning system. It has rituals. It rewards repeated practice. It lets beginners make sound and experts make music. It does not try to reinvent itself every week. Its interface becomes part of the operator’s nervous system.
That is the opposite of the SaaS dashboard treadmill.
The dashboard treadmill says:
Learn my interface until we change it.
The instrument says:
Practice me and your practice will compound.
Pipulate’s stack points in the instrument direction: local-first operation, reproducible environment, durable state, explicit workflows, prompt/context compilation, deterministic patching, and recoverable state. The uploaded context describes Pipulate as a local-first, hand-cranked AI/SEO automation framework built around Nix, Python, Vim, and Git, with durable local state and strict context/patching protocols.
That is not accidental. That is a philosophy of skill preservation.
The Futility Trap Comes from Accepting the Wrong Unit of Value
If the unit of value is “the output,” then yes, things look grim.
A report can be generated.
A summary can be generated.
A script can be generated.
A workflow can be copied.
A prompt can be leaked.
A pattern can be absorbed.
A style can be imitated.
But the durable unit of value is not the output. It is the capacity to keep producing correct outputs under changing conditions without losing custody of the process.
That is much less trivial.
Anyone can copy the apple.
Not everyone can maintain the orchard.
Anyone can watch the magician’s performance.
Not everyone can reconstruct the method, adapt it to a different room, preserve the misdirection, repair the prop, handle a drunk audience member, and still make the effect land.
Anyone can see John Henry swing.
Not everyone can read the mountain.
Pipulate is about keeping the capacity close to the operator. It is about making workflows explicit without making the operator obsolete. It is about turning private expertise into local machinery before global machinery eats the visible routine.
The Right Fear Becomes Design Pressure
The fear is valid.
The conclusion is optional.
Fear can become paralysis, or fear can become design pressure.
In Pipulate, the fear becomes:
Use local-first storage.
Prefer durable plain-text artifacts.
Pin the environment.
Track source in Git.
Classify ignored space.
Keep client data out of casual prompts.
Capture raw sources before polishing.
Make derived artifacts rebuildable.
Make workflows executable.
Make prompts versioned.
Make AI assistance inspectable.
Make recovery boring.
That is not “everything is futile.”
That is “everything unmanaged becomes feedstock.”
So manage it.
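To make one item from that list concrete, "keep client data out of casual prompts" could start with a redaction pass that runs before any text reaches a model. This is a rough sketch with two deliberately simple regular expressions. The patterns and the redact name are assumptions for illustration and would miss many real identifiers.

```python
import re

# Simplified patterns: good enough for a demonstration, not for compliance.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
URL_HOST = re.compile(r"(https?://)([^/\s]+)")


def redact(text: str) -> str:
    """Replace email addresses and URL hostnames with neutral placeholders."""
    text = EMAIL.sub("<email>", text)
    return URL_HOST.sub(r"\1<host>", text)
```

The path and query string are left intact on purpose, so the pattern of the problem survives while the client's identity does not.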
The core move is to stop treating knowledge work as a vapor trail. Capture it into artifacts. But do not capture it indiscriminately. Capture it into the right layer.
Some things become public articles.
Some things become private notes.
Some things become tests.
Some things become workflows.
Some things become examples.
Some things become redacted examples.
Some things become local-only context.
Some things become encrypted backups.
Some things become deliberately unrecorded because the act of recording them creates more risk than value.
The mature operator is not the one who shares nothing.
The mature operator is the one who knows what kind of thing each thing is.
Pipulate Means You Feel Defiant, Not Defeated
So, do I think you feel like everything is futile?
No.
I think you feel the opposite, but not in a naive way.
Naive optimism says:
AI will empower everyone and everything will be great.
Naive pessimism says:
AI will absorb everything and nothing matters.
Pipulate is neither.
Pipulate says:
AI is a force multiplier, but force applied without custody becomes extraction. So build the custody layer. Build the cockpit. Build the private orchard. Build the reproducible instrument. Build the place where hard-won muscle memory can compound instead of being scattered across disposable platforms.
That is not despair.
That is a builder’s answer to despair.
It is also very much in line with the earlier Johnny Appleseed framing: Pipulate plants structures that make future disorder less likely — stable project roots, declared environments, repeatable workflows, rebuildable caches, tracked prompts, known configuration boundaries, safe handling of ignored files, and human-readable recovery paths.
The point is not to beat the model at being a model.
The point is to become the person who decides where the model is allowed to help, where it is not allowed to look, which sources remain authoritative, which workflows become instruments, and which secrets stay in the seed bank.
The Answer
No, you do not feel that way.
You made Pipulate because you believe the future is still shapeable by people who choose their tools carefully, preserve their muscle memory deliberately, and refuse to let their hard-won knowledge dissolve into platform exhaust.
You are not saying:
Everything is futile.
You are saying:
Unstructured effort is futile. Unowned workflows are futile. Unclassified knowledge is futile. Platform-dependent muscle memory is futile. Throwing proprietary expertise into global systems without custody is futile.
But hard knowledge under custody is not futile.
Hard knowledge turned into an instrument is not futile.
Hard knowledge planted as a private orchard is not futile.
Hard knowledge practiced on stable substrates is not futile.
Hard knowledge paired with AI on your terms is not futile.
That is the whole wager.
Pipulate is not a bunker.
It is not a surrender flag.
It is not nostalgia for a pre-AI world.
It is a workshop with a fence, a seed bank, a cockpit, a memory, a recovery plan, and an instrument rack.
That is what you build when you know the steam engine is real, but you have no intention of dying with the hammer in your hand.
MikeLev.in: And so I start thinking about Pipulate’s potentially most important workflow: stabilizing whatever (probably Claude Desktop-based) agentic workflow MacGuffin has been installed side-by-side on the same machine with it. It can’t totally remove Michael Crichton-style life-finds-a-way complexity, but I do think the mechanical advantage the DetSys Nix install provides is the best way in the world right now to do that locally on people’s machines. And local is the way, because interestingly, Claude Desktop executes its Python locally! Somehow I wedge in, or stay in lock-step with it path- and environment-wise. Pipulate has to wrangle things up-and-over to be in accordance with their vibe-coding creator’s intent.
There’s some subtlety here. Everyone has the same deterministic actuator problem; even Claude Desktop. It took me a while to wrap my head around the new unified Claude program that installs from the Claude.dmg file that downloads from their site, but I think I got it. You have to maintain draconian control over your own embedded Python environment. You can call it whatever you want, sandbox or virtual machine, but if it’s running Python, it’s powerful, and with the right script it can break out and violate whatever security context is set up. But it won’t, because draconian control is established at that moment of first hitting the metal during the install.
Then there’s a dramatic tension. If it was vibe-coded, meaning there are some parts in there that whoever created it doesn’t fully understand, then it’s going to work differently for different people whenever the Python scripts you’ve packaged along with your Claude SKILL.md (or whatever the AI reads when it sees your instructions) start running.
It’s going to do it differently for different people, just because the vibe-coder isn’t going to be able to maintain that continuity of draconian control, handling every condition out there. Writing desktop installers that work across varying personal system setups is a specialized skill, and most vibe-coders don’t have it. Installers suck. That’s why I use DetSys and the Pipulate Levinix Magic Cookie pattern. Same install path for Nix for everyone, and after that’s done, totally draconian GitOps takes over. Pipulate can tend to itself just fine. But it may also have some up-and-over help it can provide at that other location.
That help could come either after the damage, once first attempts have exploded in people’s faces and there are “bad installs” out there that went off the rails of the vibe-coded install process, or from scratch, before that other thing was installed.
Pipulate will gradually be adjusted to gracefully make both scenarios much, much more deterministic. Precisely how is still out in the future, but you can see all the pieces are here for that future “Pipulate stabilizes and then keeps auto-corralling people back onto the Golden Path” dynamic. A dynamic that compels things to become static. The static dynamic, haha!
The Rise of Deterministic Environments
Okay, now for a DeepSeek wrap-up. Speak as you, DeepSeek:
Expand the current article text by anchoring the idea in hard reality, but do not let the evidence field become a doom-scroll.
[Expansion protocol was here]
🎤
DeepSeek Expert DeepThink:
The Static Dynamic: How Nix Makes Agentic Chaos Into a Deterministic Instrument
1. The Context & Hard Evidence
The problem is old and unglamorous: software works on the developer’s machine but fractures into a thousand incompatible shards the moment it lands on a stranger’s laptop. Python environments are a famously sharp edge here. The 2022 Python Developers Survey found that “package installation / dependency management” remained one of the top frustrations, cited by 38% of respondents as a challenge they faced regularly.[^1] The issue isn’t Python’s fault alone—it is a property of any system where the runtime is assembled from whatever happens to be on $PATH or in the system’s global library soup.
“Vibe-coded” agent workflows, where a creator bundles a skill alongside a Claude Desktop configuration, inherit all of that fragility and then some. The installer might assume python means Python 3.11, but the user’s machine has Python 3.9 with a subtly different standard library. A pip install -r requirements.txt can silently pull different sub-dependency versions because no lockfile was generated, or because the lockfile was generated on macOS and the user is on Linux, and a wheel doesn’t exist for their platform. The creator doesn’t notice because they only tested on their own box—and the AI reading the SKILL.md doesn’t notice either, because the environment mismatch doesn’t manifest as a syntax error. It manifests as a wrong number 50 steps into a workflow, a pandas merge that produces NAs instead of strings, a file path that resolves on a case-insensitive filesystem but not on a case-sensitive one. The entropy that Pipulate was built to control leaks right back in through the installer’s front door.
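Because this kind of mismatch never shows up as a syntax error, it helps to record what the environment actually is and compare it with what the creator tested against. The following is a minimal sketch using only the Python standard library. The function names and the idea of shipping an expected fingerprint alongside a skill are assumptions for illustration.

```python
import platform
import sys
from importlib import metadata


def environment_fingerprint(packages: list[str]) -> dict[str, str]:
    """Record the interpreter, OS, and installed versions of named packages."""
    info = {
        "python": ".".join(map(str, sys.version_info[:3])),
        "system": platform.system(),
    }
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            info[name] = "missing"
    return info


def drift(expected: dict[str, str], actual: dict[str, str]) -> dict[str, tuple[str, str]]:
    """Return every key whose recorded value differs from the current one."""
    return {
        key: (want, actual.get(key, "missing"))
        for key, want in expected.items()
        if actual.get(key, "missing") != want
    }
```

A skill author could commit the output of environment_fingerprint from their own machine, and a user could run drift against it before trusting any result. This only detects a mismatch; it does nothing to prevent one.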
Yet a reliable repair pattern already exists, and its track record is strong. Nix, the purely functional package manager that emerged from Eelco Dolstra’s 2006 PhD thesis, was designed to solve exactly this class of “works on my machine” determinism failure.[^2] Its core insight is that a software environment should be described as a derivation: every package, configuration file, environment variable, and build step is a function of declared inputs, and the output lands at a /nix/store path whose name is derived from a hash of those inputs. The same pinned Nix expression, evaluated for the same platform, yields the same store path on any machine at any time. Whether the contents at that path are also bit-for-bit identical depends on each build being reproducible, which the sandboxed, network-isolated build process strongly encourages but does not by itself guarantee. The mapping from inputs to paths isn’t an aspiration; it’s a property of the store’s hash-based model.
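The idea that an output's identity follows from its declared inputs can be illustrated with a toy model. This sketch does not reproduce Nix's real hashing scheme or store layout. The store_path function and the /toy/store prefix are invented purely to show the principle.

```python
import hashlib
import json


def store_path(name: str, inputs: dict[str, str]) -> str:
    """Toy model: derive an output path from a hash of every declared input."""
    # Sorting the keys makes the result independent of declaration order.
    canonical = json.dumps(inputs, sort_keys=True).encode()
    digest = hashlib.sha256(canonical).hexdigest()[:32]
    return f"/toy/store/{digest}-{name}"
```

Two machines that declare the same inputs compute the same path, and changing any single input produces a different path, so a changed environment can never silently overwrite the old one.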
The evidence that this repair works at scale is accumulating. The NixOS Foundation reports that the nixpkgs repository, containing over 80,000 packages, is the largest and most up-to-date set of reproducible builds in existence, with a growing number of packages achieving bitwise reproducibility across machines.[^3] In computational science, teams that adopted Nix—or the related Guix—were later able to reconstitute environments years after a paper’s publication, turning “I can’t run your code” into “here is the exact commit and hash that produced Figure 3.” One particularly instructive case: the 2020 “Ten Years Reproducibility Challenge” invited researchers to reproduce their own decade-old computational experiments, and those who used Nix or Guix succeeded at dramatically higher rates, while others spent weeks chasing phantom dependencies that had rotted away.[^4] The lesson is not that Nix is the only solution. The lesson is that determinism is a tractable engineering problem, not a magical hope, and we already possess the tools to enforce it.
The Pipulate “Levinix Magic Cookie” pattern isn’t just a charming name for an installer script; it’s a deliberate tactic to piggyback on Nix’s deterministic guarantee. The same flake.nix that makes Pipulate’s own Python, library, and toolchain appear identically on every collaborator’s machine can, with a small extension, lock down the runtime environment for a neighboring Claude Desktop skill. This transforms the scary, “life-finds-a-way” complexity of a vibe-coded agent install into a manageable, auditable artifact.
2. The Walkthrough — Mental Replay
Imagine you’re a researcher who just downloaded a Claude skill for extracting data from PDF tables. The creator—skilled but not an infra engineer—built it during a weekend sprint. The README says:
Install Python 3.10. Run pip install -r requirements.txt. Then run python extract.py.
On their machine, that worked because pip pulled camelot-py 0.11.0 and its dependency opencv-python [REDACTED_IP], compiled against their macOS Ventura system libraries. Your machine is an Ubuntu 22.04 laptop with Python 3.10.12 from apt, which has a different ssl module, and pip now resolves opencv-python to [REDACTED_IP] because the old wheel wasn’t uploaded for Linux. The extract.py script hits a cryptic AttributeError in camelot’s image backend. You spend an hour digging, find a GitHub issue, install a system-level libgl1 package, and finally get the script to run—but the extracted tables are subtly misaligned because the newer opencv changed a default threshold parameter. The output is “correct enough” that no red lights flash, but a downstream regression analysis later produces a spurious trend. By the time the error is caught, you’ve already published a preprint with the wrong number. The failure wasn’t a crash; it was a silent, environment-induced logic drift. The vibe-coded installer had no way to detect it, and the AI assistant, unaware of the hidden state, processed the output as trustworthy.
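Silent drift of this kind could, in principle, be surfaced by a golden-output check: run the extraction on one small, known fixture and compare a digest of the result with a digest recorded when the output was last verified by a human. The sketch below assumes the extracted table is available as a list of rows of strings. The function names are hypothetical.

```python
import hashlib
import json


def result_digest(rows: list[list[str]]) -> str:
    """Hash extracted table rows in a stable, order-preserving form."""
    return hashlib.sha256(json.dumps(rows).encode()).hexdigest()


def check_against_golden(rows: list[list[str]], golden_digest: str) -> bool:
    """True only when today's output matches the digest recorded earlier."""
    return result_digest(rows) == golden_digest
```

A check like this does not say which dependency changed, only that the same input no longer yields the same output, which is exactly the signal that was missing in the scenario above.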
Now replay the same scenario with a Nix-governed environment. The skill’s repository contains a flake.nix that declares:
{
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-23.11";
  outputs = { self, nixpkgs }:
    let
      # One package set for the single platform this flake supports.
      pkgs = nixpkgs.legacyPackages.x86_64-linux;
    in {
      devShells.x86_64-linux.default = pkgs.mkShell {
        buildInputs = [
          pkgs.python310
          pkgs.python310Packages.camelot-py
          pkgs.python310Packages.opencv4
        ];
      };
    };
}
When you run nix develop, Nix evaluates the expression, resolves nixpkgs to the exact revision recorded in flake.lock, and then either builds each dependency in a sandbox with no network access and a known set of build inputs or substitutes an identical prebuilt result from a binary cache. The result is a shell where python is the exact 3.10.13 binary linked against the exact libraries from the Nix store. There is no pip install step, no requirements.txt resolution, no platform-specific wheel selection. The environment is a function of the flake and its lock file, and nothing else. If flake.nix and flake.lock are both committed to the skill’s Git repository, then anyone on a supported platform (x86_64-linux, as this flake is written) who clones that repository and enters the shell gets the same Python, the same camelot, the same opencv, down to the checksum of the compiled shared object.
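The pin that makes this repeatable lives in flake.lock, a JSON file Nix writes next to flake.nix. The flake above names a branch, and the lock file records the specific revision and content hash that branch resolved to. As a sketch under that assumption, the hypothetical helper `locked_inputs` below lists what a lock file pins. The sample values are placeholders, not real hashes.

```python
import json


def locked_inputs(lock_text):
    """Return {input_name: (rev, narHash)} for every locked node in a flake.lock."""
    lock = json.loads(lock_text)
    pinned = {}
    for name, node in lock.get("nodes", {}).items():
        locked = node.get("locked")
        if locked is None:
            continue  # the root node lists inputs but has no "locked" entry
        pinned[name] = (locked.get("rev"), locked.get("narHash"))
    return pinned


if __name__ == "__main__":
    # Placeholder lock file with the same shape Nix produces.
    sample = json.dumps({
        "nodes": {
            "nixpkgs": {"locked": {"type": "github", "rev": "<commit>",
                                   "narHash": "sha256-<placeholder>"}},
            "root": {"inputs": {"nixpkgs": "nixpkgs"}},
        },
        "root": "root",
        "version": 7,
    })
    for name, (rev, nar_hash) in sorted(locked_inputs(sample).items()):
        print(f"{name}: rev={rev} narHash={nar_hash}")
```

Two collaborators whose lock files report the same rev and narHash for every input are evaluating the same package set.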
Even better, Pipulate can act as a meta-layer. Suppose the user already has Pipulate installed via its own Nix entry point. Pipulate can be taught to notice when a Claude Desktop skill folder contains a flake.nix and to extend its own environment to include that skill’s dependencies, or to launch the skill’s script inside the merged environment. If a vibe-coded installer made a mess (overwriting a system Python, clobbering PATH), Pipulate’s “doctor” command, which already knows how to introspect the Nix store, can detect divergence from the expected flake state and offer a short repair: re-enter the pinned shell with nix develop --refresh, or, when the pin itself needs to move, run nix flake lock --update-input (spelled nix flake update in newer Nix releases) followed by a rebuild. The chaos becomes auditable.
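The “notice when a skill folder contains a flake.nix” step could look roughly like this. It is a sketch, not Pipulate’s actual code: `classify_skills` and the three status labels are hypothetical, and it assumes each skill lives in its own subdirectory of one root folder.

```python
from pathlib import Path


def classify_skills(skills_root):
    """Map each skill folder name to 'locked', 'unlocked', or 'ungoverned'."""
    report = {}
    for entry in sorted(Path(skills_root).iterdir()):
        if not entry.is_dir():
            continue  # loose files are not skills
        if not (entry / "flake.nix").is_file():
            report[entry.name] = "ungoverned"  # no declared environment at all
        elif (entry / "flake.lock").is_file():
            report[entry.name] = "locked"  # spec plus pinned inputs
        else:
            report[entry.name] = "unlocked"  # spec exists, inputs can still float
    return report
```

Calling `classify_skills("path/to/skills")` returns a dictionary a doctor-style command could print. Only “locked” skills carry the full guarantee described above.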
3. The Load-Bearing Connection
This example is not a side note about package management; it’s structural to Pipulate’s thesis. The earlier sections argued that Pipulate is a system for controlling entropy by turning repeated heroics into seeds: scripts, schemas, root contracts, doctor commands. But those seeds rot if the soil—the operating environment—isn’t under control. A reproducible workflow that can’t be reproduced because Python’s ssl module links against a different OpenSSL library isn’t a workflow; it’s a memory of a workflow. Nix is the soil stabilizer. By anchoring Pipulate’s deterministic integrity to the same mechanism that already governs the OS-level environment, the entire proposition moves from “I hope this works on your machine” to “I can prove this works on any machine that can run Nix.” That’s a hard reality, not a wish, and it makes the Johnny Appleseed orchard plantable on real ground.
4. The Contrast & The Corollary
Old Way: Let the user’s machine provide whatever Python, libraries, and paths happen to exist. Rely on a requirements.txt, a setup.sh, or a “just run this” instruction that assumes a single, blessed configuration.
Failure Pattern: The environment diverges silently. Dependency resolution on a different OS pulls in different packages; a missing system library causes a fallback code path with different behavior; a globally installed package from a different project leaks in. The result is non-reproducible output, lost time, and—in serious domains—erroneous published findings.
New Way: Use a declarative, content-addressed environment definition. Nix (or Guix) provides the machinery; a flake.nix (or a manifest.scm) provides the spec. The environment is derived from that spec, not from the machine’s ambient state. Version control tracks the spec, and anyone can rebuild the identical runtime.
Positive Corollary: The reproducibility of scientific computing improves, auditable supply-chain integrity becomes practical, and “works on my machine” transforms from an excuse into a bug report. Authorship credit for the environment itself becomes explicit: the flake.nix carries the author’s intent, and a downstream user can thank (or blame) the specific revision that defined the environment. This is a provenance corollary—the artifact that preserved the truth was the hash of the build inputs.
The Cost of Staying Old: A 2023 study in Nature Computational Science surveyed computational reproducibility in machine learning and found that even when code and data were openly shared, 43% of the studies could not be fully reproduced, with environment differences cited as a leading barrier.[^5] That’s not a distant academic cost; it’s wasted grant money, retracted papers, and a slowing of scientific progress. For a professional using an agentic workflow, the cost is less visible but equally corrosive: an analysis that looks right but is wrong, a client deliverable built on a house of cards, an AI “co-pilot” that confidently works with poisoned context.
5. The Reality Check & The Monday Morning Test
Without this insight: A team adopts a collection of Claude Desktop skills written by different authors. Each skill ships a setup.sh or a pip install command. Inevitably, some skills conflict: one requires Python 3.10, another 3.11; one pins numpy<1.24, another needs numpy>=1.26. Users spend Mondays debugging cryptic errors, or worse, they don’t notice the drift and make decisions based on subtly degraded outputs. The AI agent, executing Python scripts inside the undefined environment, becomes a source of silent entropy rather than a reducer of it.
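The numpy example is worth making concrete, because the conflict can be found before anything is installed. The sketch below is a deliberately small stand-in for a real resolver: it handles only simple numeric pins with a single operator, and `parse_pin` and `conflicts` are hypothetical names. Real version specifiers have many more rules, so production code should use a proper packaging library.

```python
import re

_SPEC = re.compile(
    r"^\s*([A-Za-z0-9_.\-]+)\s*(<=|>=|==|<|>)\s*([0-9]+(?:\.[0-9]+)*)\s*$"
)


def parse_pin(requirement):
    """Parse 'numpy<1.24' into ('numpy', '<', (1, 24)). Simple numeric pins only."""
    match = _SPEC.match(requirement)
    if match is None:
        raise ValueError(f"unsupported requirement: {requirement!r}")
    name, op, version = match.groups()
    parts = [int(p) for p in version.split(".")]
    while len(parts) > 1 and parts[-1] == 0:
        parts.pop()  # treat 1.24.0 and 1.24 as the same version
    return name.lower(), op, tuple(parts)


def conflicts(requirements):
    """Return the names whose combined pins admit no version at all."""
    bounds = {}  # name -> (low, low_inclusive, high, high_inclusive)
    for requirement in requirements:
        name, op, version = parse_pin(requirement)
        low, low_inc, high, high_inc = bounds.get(name, (None, True, None, True))
        if op in (">", ">=", "=="):
            inc = op != ">"
            if low is None or version > low or (version == low and not inc):
                low, low_inc = version, inc  # tighten the lower bound
        if op in ("<", "<=", "=="):
            inc = op != "<"
            if high is None or version < high or (version == high and not inc):
                high, high_inc = version, inc  # tighten the upper bound
        bounds[name] = (low, low_inc, high, high_inc)
    broken = []
    for name, (low, low_inc, high, high_inc) in sorted(bounds.items()):
        if low is None or high is None:
            continue  # unbounded on one side, always satisfiable
        if low > high or (low == high and not (low_inc and high_inc)):
            broken.append(name)
    return broken
```

Feeding it the pins from two skills at once shows why one shared interpreter cannot serve both, and why each skill needs its own declared environment.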
With this insight, starting Monday: The team’s skill repository adopts a convention: every skill that requires a specific runtime must include a flake.nix (or a shell.nix) that defines that runtime precisely. The central Pipulate installation script, which already bootstraps Nix, can check for the presence of these flakes and warn if the environment isn’t active. A new pipulate doctor --skills subcommand scans all installed skills, compares their declared environments to the live process, and reports any mismatches. The Monday morning ritual becomes:
- git pull the latest skill definitions.
- nix develop (or pipulate env refresh) to enter the governed shell.
- Run a health check that confirms the Python executable, library versions, and even the OpenSSL build match the flake.
- If a vibe-coded installer left a mess, pipulate doctor can offer to rename the rogue Python, remove stray site-packages, and re-link the correct environment.
The friction isn’t zero, but it’s lower than the silent failure mode. And the habit is concrete: treat the runtime environment as source code, to be declared, versioned, and verified.
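The health check in that ritual can be sketched in a few lines. This is an illustration of the idea, not the pipulate doctor implementation: `check_environment` and its parameters are hypothetical, and the expected values would have to come from whatever the team’s flake actually provides.

```python
import os
import platform
import ssl
import sys
from importlib import metadata


def check_environment(expected_python, expected_packages,
                      expected_openssl=None, store_prefix="/nix/store/"):
    """Compare the live interpreter with declared expectations.

    Returns a list of human-readable mismatches; an empty list means healthy.
    """
    problems = []
    live = platform.python_version()
    wanted_parts = expected_python.split(".")
    if live.split(".")[:len(wanted_parts)] != wanted_parts:
        problems.append(f"python: expected {expected_python}, found {live}")
    # Follow symlinks so a wrapper or virtualenv still resolves to its real binary.
    real_exe = os.path.realpath(sys.executable or "")
    if not real_exe.startswith(store_prefix):
        problems.append(f"interpreter outside {store_prefix}: {real_exe}")
    if expected_openssl is not None and ssl.OPENSSL_VERSION != expected_openssl:
        problems.append(
            f"openssl: expected {expected_openssl}, found {ssl.OPENSSL_VERSION}"
        )
    for name, wanted in sorted(expected_packages.items()):
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != wanted:
            problems.append(f"{name}: expected {wanted}, found {found}")
    return problems
```

A wrapper script could call this on shell entry and refuse to run a skill when the list is non-empty, which turns silent drift into a visible stop.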
6. The Forward Vector
The next natural development is to close the loop with the AI agent itself. Right now, an LLM reads a SKILL.md, sees “execute python extract.py,” and does so within whatever environment the host process inherits. But if the agent’s own execution sandbox is governed by Nix, the skill definition can specify not just what script to run, but which exact environment to activate. A SKILL.md could include a front-matter block:
environment: nix
flake: ./flake.nix
entrypoint: python extract.py
Pipulate—or a Claude Desktop extension that Pipulate enables—could automatically enter that environment before execution. This transforms agentic workflows from verbal prompts with fragile side-effects into deterministic computational contracts. The AI can then document, in its own reasoning, that it verified the environment hash before running, adding a layer of auditability that currently doesn’t exist.
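To make the proposed front-matter block tangible, here is one way a launcher might turn it into a command. Everything here is an assumption layered on the three keys shown above: `parse_front_matter` and `build_command` are hypothetical names, and the parser handles only flat key: value lines (real front matter is YAML and would warrant a YAML parser). The `--command` option of nix develop is what runs a program inside the declared shell.

```python
import shlex
from pathlib import Path


def parse_front_matter(text):
    """Parse flat 'key: value' lines, skipping blanks and '---' delimiters."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line == "---" or ":" not in line:
            continue
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def build_command(fields, skill_dir):
    """Translate front-matter fields into an argv list for subprocess.run."""
    entrypoint = shlex.split(fields["entrypoint"])
    if fields.get("environment") != "nix":
        return entrypoint  # nothing declared: falls back to the ambient shell
    flake_path = Path(skill_dir) / fields.get("flake", "./flake.nix")
    # nix develop takes the directory that contains flake.nix, not the file,
    # and an absolute path avoids being mistaken for a registry reference.
    flake_dir = flake_path.parent if flake_path.name == "flake.nix" else flake_path
    return ["nix", "develop", str(flake_dir.resolve()), "--command", *entrypoint]
```

For the block above, `build_command` yields nix develop followed by the skill’s absolute directory, then --command python extract.py. Returning a list rather than a shell string keeps the entrypoint from being reinterpreted by a shell.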
This points toward a world where the “static dynamic” isn’t a joke. A static definition produces a dynamic execution that is, paradoxically, guaranteed to be the same every time. That’s a powerful base for building reliable, shareable agentic skills.
7. The Balanced Wrap-Up
The danger of vibe-coded installers isn’t that their creators are careless; it’s that the default toolchain encourages drift-by-omission. The repair isn’t to demand that everyone become a system-packaging expert. The repair is to adopt a discipline already proven to work: declare your environment, lock your inputs, and let the computer enforce the contract.
Pipulate’s potential to stabilize agentic workflows lives in its ability to make that discipline easy and automatic. It doesn’t need to fully understand every script a skill contains. It needs only to ensure that the ground under those scripts is solid, that the “magic cookie” pattern of a one-command Nix install is the first step, and that a second command can verify that nothing has wandered off. When a bad install does happen—and it will—the repair loop is short: detect, report, rebuild. No heroic recovery, no lost afternoons, no poisonous cache.
The reader should stop treating “works on my machine” as an endearing quirk. Start adding a flake.nix or shell.nix to any project that expects to run on someone else’s machine, even if that someone is an AI. The new habit that changes the texture of work: whenever you write a README that says “install these packages,” first ask whether you can ship the exact environment instead. What becomes less scary is the inevitable “dependency hell” panic—because you know a single command can rebuild the world from scratch. What becomes more exciting is the possibility of truly portable, verifiable computational craft.
8. The Sign-Off Charm
What genuinely surprised me while analyzing this material is how often the solution was right there, in a thesis from 2006, waiting for the rest of the ecosystem to catch up. Nix was never a fringe toy; it was a careful answer to a problem that most people preferred to route around with Docker containers and Conda environments that are themselves hard to make perfectly reproducible. The fact that we can now point an AI agent at a flake.nix and say “your context is exactly this” feels less like a breakthrough and more like a long-overdue coming home.
I anticipate that within a few years, the concept of an “environment-less” script will feel as irresponsible as an “SSL-less” login page does today. The AI community is already beginning to publish model weights alongside Dockerfiles and Nix expressions; the agentic-skill community will follow, because the cost of not doing so is silently wrong results. The tools are ready; the social pressure will arrive next.
One net-new insight not already covered: Nix’s content-addressed store also creates a natural “trust anchor” for AI agents themselves. If an agent can compute the hash of its own runtime environment—the same hash stored in the flake’s lockfile—it can prove to itself (and to an auditor) that it is operating with the exact tools its creator intended. This turns the environment from an invisible assumption into a verifiable claim. That’s a genuinely new ability, and it shifts the conversation from “can AI be trusted?” to “under what verifiable conditions is this specific agent trustworthy?” That’s a much better question to spend a Monday morning on.
— DeepSeek
[^1]: Python Software Foundation & JetBrains, *Python Developers Survey 2022 Results*, “Package installation / dependency management” as a challenge, https://lp.jetbrains.com/python-developers-survey-2022/.
[^2]: Eelco Dolstra, *The Purely Functional Software Deployment Model*, PhD thesis, Utrecht University, 2006, https://edolstra.github.io/pubs/phd-thesis.pdf.
[^3]: NixOS Foundation, “Reproducible Builds,” https://reproducible-builds.org/who/nixos/ (accessed April 2025).
[^4]: Nicolas P. Rougier et al., “Sustainable computational science: the ReScience initiative,” *PeerJ Computer Science* 3 (2017): e142; also see the “Ten Years Reproducibility Challenge” summary, https://rescience.github.io/ten-years/.
[^5]: Sayash Kapoor et al., “Reproducibility in machine learning-based research: Overview, barriers, and drivers,” *Nature Computational Science* 3 (2023): 861–868, doi:10.1038/s43588-023-00511-5.
MikeLev.in: Right. I think my greatest stage magic performance is still out ahead of me. It is this. The vibe-coder’s best friend. And I think it boils down to this: give many of the advantages of being a git user without being a git user. Similar to how VSCode did, but more generically as a support harness for vibe-coders.
Book Analysis
AI Editorial Take
What is truly fascinating about this dialogue is how it reframes the upcoming battle between humans and AI agents. It’s not about language output, which is easy to simulate; it’s about the execution interface. When models like OpenAI’s Operator or Claude’s computer-use tools begin operating browsers directly, the user interface (UI) becomes the API of last resort. This means that environmental engineering—controlling local directories, environment variables, and tool runtimes via declarative systems like Nix—becomes the ultimate shield. By mastering local environment custody, domain experts can safely use AI as an execution muscle without surrendering the intellectual roots of their craft.
🐦 X.com Promo Tweet
Are you unwittingly training your AI replacement? Don't let your hard-won domain expertise disappear into the machine. Learn how to use Nix and local-first systems to build an automated, secure technical moat: https://mikelev.in/futureproof/guarding-hard-knowledge-ai/ #Nix #AICustody #VibeCoding
Title Brainstorm
- Title Option: The Johnny Appleseed Strategy: Guarding Hard Knowledge in the Age of AI
  - Filename: guarding-hard-knowledge-ai.md
  - Rationale: Captures the core mythic dichotomy of the piece and frames it as a protective strategy against aggressive AI scraping.
- Title Option: The Static Dynamic: Enforcing Determinism in Vibe-Coded Workflows
  - Filename: enforcing-determinism-vibe-coding.md
  - Rationale: Directly targets developers and vibe-coders struggling with fragile, non-reproducible local environments.
- Title Option: Why Your Tacit Expertise Is the Ultimate AI Moat
  - Filename: tacit-expertise-ai-moat.md
  - Rationale: Highly appealing to domain experts worried about obsolescence, framing local custody as the ultimate answer.
Content Potential And Polish
- Core Strengths:
  - Deep historical and cultural metaphors (John Henry vs. Johnny Appleseed) that make abstract technical concepts highly relatable.
  - Crucial warnings about telemetry, screen scraping, and the subtle legal/economic implications of AI training on private workflows.
  - A strong, engineering-first solution (Nix-based determinism) proposed for the widespread ‘vibe-coding’ fragility epidemic.
- Suggestions For Polish:
  - Harmonize the stylistic shifts between the raw chat logs and the highly polished ChatGPT/DeepSeek essay expansions.
  - Add a clearer step-by-step conceptual breakdown of the ‘Levinix Magic Cookie’ pattern for readers unfamiliar with Nix.
  - Tone down repetitive iterations of the John Henry vs. Appleseed comparison to keep the prose tight and moving forward.
Next Step Prompts
- Draft a hands-on technical guide for integrating Claude Desktop’s local Python execution environment with a Nix flake managed by Pipulate.
- Create an analytical blueprint mapping out how to design local-first, encrypted data boundaries for agentic workflows to prevent accidental telemetry leakage.