
Pipulate: Local AI Workflows with Jupyter & Nix

Sunday morning's coding session ignited a plan to leverage Jupyter Notebooks for AI-driven Pipulate workflows, aiming to bridge the gap between flexible "WET" development and automated, Nix-powered deployment. This vision incorporates local AI, swappable components, and advanced memory management, all while challenging the normalizing influence of large language models and embracing the "write once, run anywhere" dream through a portable, local-first web app ecosystem.

Beyond the Normal Curve

Build flexible, AI-powered workflows directly from Jupyter Notebooks, deploying them locally with Nix for ultimate control and portability. Reclaim your outlier status by crafting personalized AI experiences, moving beyond standardized LLM outputs.

Sunday Morning Progress Check

Alright, it’s 7:00 AM on Sunday, and for the first time in a while I can say that yesterday went as planned, coding-wise. By finishing something modeled on Unix pipes I have blasted out the pipes and can now get down to the business of AI workflows. Let’s lay out some of the workflows we’d like to slam out. Now I know I won’t get them all done today, but this is a vision and a plan.

  • Hello World with Notebook documentation
  • Shallow Site Crawl (With LLM peanut gallery)
  • Link Graph Visualization (conversion)
  • Parameter Buster JavaScript
  • Content Gap Analysis

There is also the much easier smattering of user interface stuff that’s inevitable. I’ll plan how the files are arranged in the Pipulate repo. While hello_world.ipynb might be fine in project root, examples.ipynb and the accumulating list from above may need to go in a Notebooks or nbs folder. Hmmm, maybe I should think about nbdev conventions.

  • Pop open a browser window
  • Present a dropdown menu
  • Present a textarea
  • Present checkmarks
  • Show a list of files

Managing Scope and Focus

I need to keep my ambition level today in check, not chase any rabbits down rabbit holes, and make it so that going into the client work starting tomorrow is a sheer pleasure!

The Black Swan Effect in AI Systems

This is a deliberately crafted black swan event attempting to cause the butterfly effect, exploiting the systemic imbalance that arises when base assumptions about the stability of a system are no longer correct. The problem of induction plagues such systems. Just because something has not happened before is not evidence that it will not happen in the future.

LLMs and the Normalization of Behavior

But large language models, based as they are on statistics and biased toward the center of the normal distribution curve, don’t behave that way. They will only ever compel people to behave the way people have always behaved, acting as statistical mirrors of human behavior, smoothing out outliers and reinforcing patterns that sit comfortably within the bell curve. They’re like thermostats in a building—constantly adjusting the temperature to a preset range, nudging conversations and ideas back toward what’s familiar and predictable. They are engines of normalcy.

Systems of Normalization in Daily Life

It’s like traffic lights in a city, designed to regulate flow, prevent chaos, and ensure that everyone moves in a predictable rhythm. Without them, you’d have collisions and gridlock—systemic imbalance—but with them, even the wildest drivers are forced to conform to the baseline of stop-and-go. They don’t eliminate accidents entirely (induction’s problem again: past order doesn’t guarantee future order), but they bias the system toward normalcy.

This even exists in web development. Everyone’s on the same frameworks now, so it’s not as big of a deal, but before that, everyone used normalize.css to get every web browser to clear its opinionated defaults regarding spacing and other behavior.

Accounting has GAAP (Generally Accepted Accounting Principles) which iron out the quirks of individual companies’ financial reporting, making sure everyone’s books look roughly the same. It’s not about innovation—it’s about keeping the economic system legible and stable, so investors don’t flee at the sight of some eccentric ledger. The wild swings of a rogue Enron get flattened back to the baseline.

Embracing Outlier Status in an AI World

So, it’s a dampening effect. I’m not saying that AI is leading us down a dystopian future. Quite the contrary. The smoothing out of the bell curve because everybody gets free access to best practices makes rich and ample opportunity for outliers who know what they’re doing and stick to their guns. It’s not easy, as it feels like typing into molasses. One thing I’ve done is totally turn off AI writing assistance in NeoVim. It was available and I tried it in the form of GitHub CoPilot. And there was novelty and some advantage. But it was intrusive, and you could feel autocomplete trying to corral you to normieland.

Tracing the Rebellious Tech Lineage

I’m nudging a domino. The black swan event is by design. I’m taking the baton from FastHTML, which took it from Flask and HTMX; Flask took it from Sinatra, which took it from Rails, which took it from Ruby. Maybe I have Yukihiro Matsumoto (Matz) and David Heinemeier Hansson (DHH), the creators of Ruby and Rails respectively, to thank for this rebellious outlier streak. But I was an Amiga computer user, which got it from Atari. Hmmm. Let’s give Google AI Overview a chance:

Google AI Overview: The Atari 2600, originally known as the Atari VCS (Video Computer System), was created by Atari, Inc., with the initial prototype built by Steve Mayer and Ron Milner, and later developed and refined by Joe Decuir and Jay Miner. [1, 2, 3, 4] Here’s a more detailed breakdown: [1, 3]

• Initial Prototype: Steve Mayer and Ron Milner created the first prototype using a standard “Jolt” card with the 6502 microprocessor. [1, 3]
• Development and Refinement: Joe Decuir, a recent graduate from the University of California, Berkeley, joined Atari and began debugging the prototype, which was codenamed “Stella” after Decuir’s bicycle brand. [1, 4]
• Television Interface Adaptor (TIA): Jay Miner created the Television Interface Adaptor (TIA) chip, which sent graphics and audio to a television, further developing the console’s capabilities. [1]
• Atari, Inc.: The company was founded by Nolan Bushnell and Ted Dabney. [2, 5, 6]
• Release: The Atari 2600, or VCS, was released in 1977. [1, 7, 8]
• Name Change: The console was initially called the Video Computer System (VCS), but later became known as the Atari 2600. [1, 7, 9]

This video explains the history of the Atari 2600: https://www.youtube.com/watch?v=H8vaXOKJxKQ (https://www.youtube.com/watch?v=H8vaXOKJxKQ)

Generative AI is experimental.

[1] https://en.wikipedia.org/wiki/Atari_2600
[2] https://atari.com/pages/history
[3] https://www.computerhistory.org/revolution/computer-games/16/185
[4] https://www.youtube.com/watch?v=H8vaXOKJxKQ
[5] https://en.wikipedia.org/wiki/Nolan_Bushnell
[6] https://en.wikipedia.org/wiki/Atari,_Inc.
[7] https://bluegoatcyber.com/blog/why-the-atari-was-called-the-2600/
[8] https://histowiki.com/oldnews/2408/the-atari-2600-history-timeline/
[9] https://simple.wikipedia.org/wiki/Atari_2600

Wow, that took a lot of formatting. The citations all blended into each other on the Google AI Overview Expand / Export / Copy text feature. At least it gave me markdown. After inserting line-breaks on the citations, I also had to indent them to keep markdown rendering from blending it all into one big paragraph, but to be fair I have to do that for Perplexity too. Not bad. Google’s not completely shut out of the AI-Search game yet – merely hanging on just by having a lock on the Android platform and the Apple Safari deal.

Morning Reflections and Task Planning

Hmmm, okay, next? This is avoidance. I feel the burn of having done so much cognitive work yesterday, and I feel like just doing this dumb stuff again. But these are the warm-up exercises. It’s still early in the morning on Sunday. I have time to ease into the brain-work. Let’s see… 1, 2, 3… 1?

Library Organization and Workflow Structure

An easy library to grab from when splicing WET blocks into a workflow. The directory structure that accompanies that. The file-naming conventions I use there. Why that doesn’t cause directory-diving cognitive overhead. Why dumping it all in project root, while lowering cognitive overhead initially, makes it worse before too long. Nip that in the bud. Also, consider how nbdev fits in, and if possible use its file-naming and conventional locations.
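To make that concrete, here is one possible layout, borrowing nbdev’s numbered-notebook convention (folder and file names are illustrative, nothing is settled):

```
pipulate/
├── flake.nix              # the reproducible environment
├── hello_world.ipynb      # fine in project root while it's the only one
└── nbs/                   # nbdev-style home for the accumulating list
    ├── 10_site_crawl.ipynb
    ├── 20_link_graph.ipynb
    ├── 30_param_buster.ipynb
    └── 40_gap_analysis.ipynb
```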

Time Management and Client Preparation

Work backwards from when you know you’ll be reaching exhaustion today, preparing for client work tomorrow, and that sort of panic of making sure you’re ready for the client calls.

Client Demonstration Strategy

Ahh! The show! The show in front of the clients where you actually expose Pipulate in its Botifython form. Yes!

SEO Fundamentals and Workflow Design

The basics of SEO. Something the world will find value in too. A sort of “coming out” workflow for Pipulate.

Variable behavior for Botify customers? Chafing against linear workflows right out of the starting gate? No! Plan the workflows so that any variable behavior is contained within a cell, exactly as you would in a Jupyter Notebook.

Jupyter Notebook Integration

Ah, the Jupyter Notebook “mock-up and port” procedure. Perhaps that trumps everything. I even have documentation on how to do that from yesterday’s article.

AI Video Content Generation

All my article-writing is very close to the point where I could have AI automatically turn it into videos, if I just know the magical incantations. So, let’s start gathering those!

Me: How do you write a script so that AI can turn it into a video? For example, are there special conventions or a protocol or script-writing technique to give all the right cues and prompts and instructions? Are there best practices and conventions for this yet?

Perplexity: To create scripts optimized for AI video generation, recent developments in screenwriting formats and tools emphasize structured prompts, visual specificity, and AI-friendly conventions. Here’s a synthesis of current best practices:


1. AI-Optimized Script Formatting

Traditional screenplay elements are being reimagined for AI systems:

  • Replace scene headings like “INT. LIVING ROOM” with visual prompts:
    PROMPT: A sunlit living room with modern furniture, large windows overlooking a city skyline[1]
    This provides richer detail for AI image/video generation.
  • Character descriptions as prompts:
    Instead of JOHN (smiling nervously), use:
    JOHN PROMPT: (A man in his 30s, dressed in a suit, smiling nervously with slightly raised eyebrows)[1]
  • Action lines become directive prompts for camera angles, lighting, or pacing (e.g., PROMPT: Slow zoom on the character’s face during the revelation).

2. Key Techniques for AI Interpretation

  • Visual granularity:
    Specify textures, colors, and spatial relationships (e.g., “a rusted metal desk with scattered papers under warm, dim lighting”) to guide AI rendering[1].
  • Auditory cues:
    Include sound effects or music notes (e.g., SFX: Gentle rain transitions to upbeat synth music).
  • Temporal markers:
    Break scripts into short scenes (3–4 sentences per scene) to match AI video tools’ scene-generation limits[5].

3. Best Practices for AI Scriptwriting

  • Conversational tone: Avoid jargon; use natural language to ensure AI voiceovers sound authentic[4][5].
  • Structured prompts:
    • Define the video’s purpose, audience, and key messages upfront[2].
    • Use bullet points or timestamps to segment scenes[6].
  • Iterative refinement:
    Test drafts with AI tools, then adjust prompts based on output (e.g., clarify ambiguous descriptions)[2].
  • Accessibility:
    Write transcripts that are self-explanatory for hearing-impaired viewers[4].

4. Tools and Workflows

  • AI script generators: Platforms like Synthesia and Invideo AI convert text prompts into scripts with built-in hooks, CTAs, and scene transitions[2][3][6].
  • Prompt engineering:
    • For ChatGPT: Use detailed prompts like, “Write a 3-minute explainer script about quantum computing for teens, using analogies and a friendly tone”[2].
    • Include keywords for SEO optimization[2].
  • Hybrid creativity: Let AI draft a baseline script, then inject human-crafted humor or emotional beats[2][5].

Emerging Challenges

  • Standardization: No universal format exists yet, leading to variability across tools[1].
  • Over-description risk: Excessive detail can disrupt pacing; balance specificity with brevity[1][5].
  • Software integration: Traditional screenwriting apps (e.g., Final Draft) lack native AI-prompt support[1].

By combining these techniques, writers can effectively bridge human storytelling with AI’s capabilities, though the field remains in flux as tools and standards evolve.

Citations:

[1] https://www.linkedin.com/pulse/screenwriting-ai-films-caddelle-faulkner-n1y8c
[2] https://www.synthesia.io/post/chatgpt-script
[3] https://invideo.io/tools/script-generator/
[4] https://ctl.columbia.edu/resources-and-technology/teaching-with-technology/diy-video/effective-videos/script-writing/
[5] https://www.synthesia.io/post/how-to-write-a-video-script
[6] https://www.synthesia.io/tools/script-to-video-maker
[7] https://www.descript.com/blog/article/how-to-write-a-video-script-like-a-pro
[8] https://blog.hubspot.com/marketing/how-to-generate-ai-video-scripts
[9] https://www.powtoon.com/blog/how-to-write-a-video-script-with-ai-essential-tips-and-top-tools/
[10] https://www.youtube.com/watch?v=qsTSnOh_1Is
[11] https://www.reddit.com/r/NewTubers/comments/1dmsscu/is_there_any_ai_that_can_make_videos_from_a_script/
[12] https://www.reddit.com/r/NewTubers/comments/1ga9u0n/the_use_of_ai_to_help_write_and_structure_scripts/
[13] https://mediashower.com/blog/video-scripts-using-ai/
[14] https://www.youtube.com/watch?v=SvcLtQ3TMMc
[15] https://www.veed.io/tools/script-generator
[16] https://www.youtube.com/watch?v=hNKdByr618M
[17] https://maekersuite.com/tools/video-script-writing
[18] https://maekersuite.com/blog/ai-for-screenwriting

Hmmm, I’m not going to let any of this cramp my style. I’m going to forge forward knowing that the script-writing and AI-enhanced videos will just be a natural byproduct of keeping it in mind as I progress, making tiny directional adjustments for the sake of good storytelling, anyway.

Converting Jupyter Notebooks to Web Apps

Okay, these are interesting thoughts. But if I want Pipulate to really blow up, I’ve got to be thinking in terms of auto-converting the Jupyter Notebooks to Pipulate workflow web apps! That’s key. There’s already nbconvert and tools like that to go from Notebooks to executable files, from .ipynb to .py. And as an added benefit, they embed those markers to tell cell from cell!

So, it’s really .py-to-Pipulate I’m talking about, even easier! Oh, and since any given instance of running a Pipulate workflow results in a JSON blob, we’ve got job configuration files so the .py versions can actually be turned into schedulable workflows without all the web user interface and HTMX overhead. Getting all these benefits is really just a matter of keeping them in mind now as I progress. They should all fall into place if I stay organized and alert.
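As a sketch of how little machinery that first step needs: jupyter nbconvert --to script hello_world.ipynb emits a .py with a comment marker before each code cell, and splitting on those markers hands you one candidate workflow step per cell. This assumes nbconvert’s default # In[...]: markers; jupytext-style # %% markers would just need a different regex.

```python
# Minimal sketch: recover notebook cells from an nbconvert-exported .py
# so each cell can become one Pipulate workflow step.
import re
from pathlib import Path

def cells_from_script(path: str) -> list[str]:
    """Split an nbconvert --to script export on its cell markers."""
    text = Path(path).read_text()
    chunks = re.split(r"^# In\[[0-9 ]*\]:\s*$", text, flags=re.MULTILINE)
    # chunks[0] is whatever precedes the first cell (encoding header, etc.)
    return [c.strip() for c in chunks[1:] if c.strip()]

if __name__ == "__main__":
    for step, cell in enumerate(cells_from_script("hello_world.py"), start=1):
        print(f"--- step {step} ---\n{cell}\n")
```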

Making Pipulate Accessible to Non-Developers

You do not need to be a developer to use Pipulate. You can use workflows bottled up for you by other people as a web app running on your desktop through nix. It will look and feel just like a website, but it will be running from your machine, incurring no cost but electricity and hosting your personal, private AI.

Understanding the Unix-Like Operating System Landscape

Have you heard of Linux? Unix? These are *nix-like operating systems, and they won. macOS is Unix already, as is the iPhone. Windows does somersaults making sure they’re Linux-compatible with a subsystem shipped with every copy (WSL). Nix is all that’s needed for Pipulate workflows to run.

The Power of Nix for Cross-Platform Compatibility

Now the problem is that Unix on Macs and WSL-Linux on Windows are all slightly different, so making apps seamlessly portable across them takes one tiny extra step. So there is a *nix-like OS normalizing piece called… nix! Not a stretch, right? It adds a deterministic layer so things always run the same.

Yes, there is one more thing you need to install on your machine, but it is the last thing you will ever need to install for an ultimate interoperable software layer, so that the dream of “write once, run anywhere” is finally achieved. The perpetual “it works on my machine” problem is ultimately solved with nix.
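For the curious, a minimal sketch of what that deterministic layer looks like in practice, as a shell.nix (package names are illustrative and depend on your nixpkgs pin):

```nix
# One file, same environment on macOS, WSL, or any Linux.
{ pkgs ? import <nixpkgs> {} }:
pkgs.mkShell {
  buildInputs = with pkgs; [
    python311
    python311Packages.jupyter
    python311Packages.requests
  ];
  shellHook = ''
    echo "Entering the normalized *nix layer. Try: jupyter lab"
  '';
}
```

Run nix-shell in that directory and everyone gets the same Python, the same Jupyter, the same everything.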

The Growing Ecosystem of System Normalizers

Now, nix is not the only game in town. This idea of an interoperable normalization layer using generic Linux is so good that the folks who bring you the GNU in GNU/Linux have jumped on the bandwagon with guix (pronounced “geeks”), so this approach is becoming even more formalized and mainstream than Docker and VMware. But it’s gonna take a few years. Meanwhile, there’s nix.

Why Nix is the Cool Kid’s Choice

The thing about nix is that it’s over 10 years mature and has a rabid following on Mac, so it’s one of those cool-kid things already, even if you’ve never heard of it. The Mac installer is solid. Nix is battle-hardened, fully uninstallable, and impervious to “polluting” your machine the way homebrew does. And all that goodness is on Windows too.

On some other version of Linux? No problem! It’s even more seamless (can something be more seamless?). In fact, it’s a snap to deploy nix apps to the cloud because they’re already Linux – the preferred OS of the cloud. There really is no downside. This is one of those things that really is that big, even though you’ve never heard of it yet.

Pipulate’s Multiple Technology Bandwagons

That’s the bandwagon Pipulate’s jumping on. But if the nix bandwagon weren’t enough, it’s also jumping on the following bandwagons:

  • Local-AI, with Ollama/llama.cpp running privately on your own hardware
  • Domain expertise training material, kept proprietary for competitive advantage
  • Local-first Web apps, like the Electron platform, but without hiding it
  • Restoring Python as a 1st-class Web dev platform, using FastHTML & HTMX
  • Leveraging Jupyter Notebooks for fleshing out AI workflows
  • Taking advantage of frontier AI models in Cursor, Windsurf, Cline
  • Incorporating MCP to make the local-AI more and more capable
  • Making all components swappable/upgradable like Ship of Theseus
  • Using various types of memory for RAG & local-AI identity retention

The Evolution of AI Models and Chip O’Theseus

I call them Chip O’Theseus! Their core models will be getting swapped out as the big boys let ‘em loose on HuggingFace and Ollama.com, because we mere mortals don’t have the bucks to train really smart stuff with our own compute yet – or even pay for it in the cloud. DeepSeek still cost $6 million to train. I don’t know about you, but that’s no chump change. So the model will change. Sorry, Chip! Part of your soul gets an AI-soulectomy. Select - Oh Me! Hahaha!

The Future of Personal AI Companions

Man, the wordplay is powerful in this space. Let’s just agree that we’re doing the technological and metaphorical equivalent of incubating a genie in a bottle. Over time, it’s going to solve that memory/personalization conundrum people have barely comprehended yet. If you want these things to really get to know you, it’s a corporation that’s really getting to know you. It’s not some personal buddy side-kick you can trust. It’s what the stockholders of these publicly traded companies, or alternatively the CCP, want you experiencing.

You gotta fork the code and start incubating, and stick a fork in it to see if it’s ready. Hardware will continue getting more powerful, per Moore’s Law. I mean, NVIDIA alone has been flooding the headlines with… with… Perplexity?

Me: What is NVIDIA jetson orin, xavier, and their other “home” hardware platforms, both Raspberry Pi competitors and workstations?

Perplexity: NVIDIA’s Jetson series is a line of embedded computing boards designed for AI and robotics applications, offering alternatives to traditional single-board computers like Raspberry Pi for more advanced use cases. The series includes several platforms:

Jetson Orin Series

The Jetson AGX Orin series is NVIDIA’s latest and most powerful embedded AI platform[1][2]:

  • Delivers up to 275 TOPS of AI performance
  • Features NVIDIA Ampere architecture GPU with 2048 CUDA cores and 64 Tensor cores
  • Includes 12-core Arm Cortex-A78AE CPU
  • Available in 64GB and 32GB versions
  • Power consumption ranges from 15W to 60W
  • Offers up to 8x AI performance compared to Jetson AGX Xavier

Jetson Xavier Series

The Jetson Xavier family is the predecessor to Orin, still offering high performance for autonomous machines[2]:

  • Provides up to 32 TOPS of AI performance
  • Features 512-core NVIDIA Volta architecture GPU with 64 Tensor Cores
  • Includes 8-core NVIDIA Carmel Arm v8.2 CPU
  • Available in 64GB, 32GB, and Industrial versions
  • Power consumption ranges from 10W to 40W

Jetson Nano

The Jetson Nano is a more entry-level option, comparable to Raspberry Pi[4][5]:

  • Priced at $99
  • Designed for AI and deep learning applications
  • Runs multiple neural networks in parallel
  • Consumes as little as 5 watts

Comparison to Raspberry Pi

While Raspberry Pi is a general-purpose single-board computer, NVIDIA’s Jetson series focuses on AI and robotics applications[4][5]:

  • Higher processing power for AI tasks
  • More expensive than Raspberry Pi
  • Specialized for machine learning, computer vision, and autonomous systems
  • Not as suitable for general hobbyist projects as Raspberry Pi

Workstation Options

For more powerful AI workstations, NVIDIA offers:

  • DGX systems: High-performance AI supercomputers for data centers and research labs
  • RTX and Quadro series: Professional-grade GPUs for AI and graphics workstations

These platforms provide a range of options for developers and researchers working on AI, robotics, and autonomous systems, from entry-level development kits to high-performance embedded modules and workstations.

Citations:

[1] https://www.nvidia.com/content/dam/en-zz/Solutions/gtcf21/jetson-orin/nvidia-jetson-agx-orin-technical-brief.pdf
[2] https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/jetson-xavier-series/
[3] https://www.generationrobots.com/blog/en/our-nvidia-jetson-orin-and-nvidia-jetson-xavier-comparison/
[4] https://www.tomshardware.com/news/nvidias-new-orin-nano-developer-kit-like-a-raspberry-pi-for-ai
[5] https://itsfoss.com/raspberry-pi-alternatives/
[6] https://www.elecrow.com/blog/everything-you-need-to-know-about-jetson-orin-nano.html
[7] https://en.wikipedia.org/wiki/Nvidia_Jetson
[8] https://premioinc.com/products/jco-1000-orn-a
[9] https://eshop.contecamericas.com/products/dx-u1200p1-industrial-edge-ai-computer-dx-u1000-fanless-embedded-pc-with-nvidia%C2%AE-jetson-xavier-nx%E2%84%A2-carmel-arm-v8-2-1x-pciex1-dc-power-supply
[10] https://www.seeedstudio.com/reComputer-J4012-p-5586.html
[11] https://things-embedded.com/us/vecow-eac-2100-rugged-nvidia-jetson-xavier-nx-gmsl-automotive-computer-with-poe/
[12] https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/jetson-orin/
[13] https://www.forecr.io/products/jetson-agx-xavier-industrial-box-pc-dsbox-xv2
[14] https://developer.nvidia.com/embedded/learn/jetson-agx-orin-devkit-user-guide/developer_kit_layout.html
[15] https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/jetson-agx-xavier/
[16] https://forums.developer.nvidia.com/t/orin-nx-performances-comparing-with-xavier-nx/259320
[17] https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/jetson-orin/nano-super-developer-kit/
[18] https://www.reddit.com/r/homelab/comments/yteud6/raspberry_pi_is_hugely_overpriced_here_are_the/
[19] https://www.xda-developers.com/raspberry-pi-5-vs-jetson-nano/
[20] https://www.upgrad.com/blog/raspberry-pi-alternatives/
[21] https://www.onlogic.com/store/jetagx-f/
[22] https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/
[23] https://www.electronicdesign.com/technologies/embedded/article/21807321/nvidia-jetson-agx-xavier-part-1-hardware
[24] https://forums.developer.nvidia.com/t/performance-of-orin-vs-xavier/304129
[25] https://www.reddit.com/r/ROS/comments/17e96ew/jetson_nano_vs_raspberry_pi_vs_minipc/
[26] https://thinkrobotics.com/blogs/learn/exploring-raspberry-pi-alternatives-the-best-options-for-your-projects
[27] https://www.youtube.com/watch?v=9PbzOZYX23o
[28] https://revolutionized.com/raspberry-pi-alternatives/
[29] https://www.linkedin.com/pulse/down-rabbit-hole-again-from-apple-1-nvidia-jetson-orin-rafi-kalustian-p8svc

Building an AI Workstation on a Budget

Right now, I’m working on an HP Z640 server tower chassis that cost me $400. The NVIDIA RTX 3080 card inside it cost more than the computer, and that was only around $1,000. And nothing about it couldn’t be whack-a-mole’d into other hardware and software. The key choke-point right now is the llama.cpp component that Ollama and many other local-LLM inference engines use. That’s because it’s such low-level driver stuff (cpp for C++) that few people program it, so one person does and it’s the gift that keeps on giving. You might recognize this as how Google made the optimized JavaScript V8 engine in Chrome, and because Chromium was free and open source, it was lifted and used to create NodeJS, which revolutionized (for better or for worse) web development. Now almost everything but Safari (WebKit) is Chrome-based to get that performance. Chances are, llama.cpp won’t remain the only game in town.
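The swappability shows up at the API level too. A minimal sketch of talking to whatever model Ollama happens to be serving locally (the default port and the llama3.2 model name are assumptions; swap in any model you’ve pulled):

```python
# Ask a local model a question through Ollama's HTTP API.
import json
import urllib.request

def ask_local_llm(prompt: str, model: str = "llama3.2") -> str:
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False})
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload.encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

print(ask_local_llm("Give me one sentence of peanut-gallery commentary."))
```

Swap the model argument and the rest of the stack never notices; that’s the whack-a-mole property in one parameter.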

The Modular Nature of Home AI Systems

Hmmm. Point being, every part of these home-grown Frankenstein-monster AIs will be swappable, in that unstoppable whack-a-mole style. Your parts can disassemble themselves, get spread across parts and software repositories, and then re-assemble themselves when Tony Stark gives the command. And this helps the rolling-upgrade story immensely. Figure out how to make your home-grown AI pay its keep. Challenge it to figure out how to pay the very electric bill that it’s helping inflate. Challenge it to figure out how to gather enough data so you can eventually train your own capable core model and stop swapping out its machine-soul.

Understanding AI Memory Systems

Memory is a trick. It’s tricky to implement, and it’s the trick making the soulectomies less dramatic. Who cares if it’s Gemma3, DeepSeek-R1, Phi4, Llama3.2, Mistral, or Qwen2.5 down there doing the generative-text-of-the-day? It’s the equivalent of you being in different moods each day. But your experiences accumulate. They accumulate in short-term memory. Sleep and various mechanisms help move them to long-term memory. Same with AIs. And each type of memory has a different tech mechanism.
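A toy sketch of that consolidation idea, with a JSON file standing in for long-term storage (the keep-rule and filename are illustrative, not how Pipulate actually does it):

```python
# Two-tier memory: notes land in short-term memory, and a "sleep" pass
# consolidates the keepers into a long-term store that survives model swaps.
import json
from pathlib import Path

short_term: list[str] = []
LONG_TERM = Path("long_term.json")

def remember(note: str) -> None:
    short_term.append(note)

def sleep() -> None:
    """Move notable short-term items into long-term memory, then forget."""
    long_term = json.loads(LONG_TERM.read_text()) if LONG_TERM.exists() else []
    long_term += [n for n in short_term if len(n) > 20]  # crude keep-rule
    LONG_TERM.write_text(json.dumps(long_term, indent=2))
    short_term.clear()

remember("User prefers vim keybindings everywhere.")
remember("ok")
sleep()  # only the substantial note survives the next soulectomy
```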

Context Windows and Token Limits

The very short-term stuff is right in the current conversation context-window, that 128K tokens everyone talks about. It’s just plain-text, but it’s like a mini right-now database. Attention is all you need, so they say. But those same people who said that now also say that Titans learn to memorize at test time, for more of a PhD-level intelligence. Uh, Grok, can you paraphrase it for the masses?

Grok 3:

Key Points

  • Research suggests Titans, a new AI technique, improves long-term memory for computers handling very long sequences, like novels or DNA data.
  • It seems likely that Titans outperform existing models, especially for contexts over 2 million tokens, in tasks like language prediction and time series analysis.
  • The evidence leans toward Titans learning to memorize surprising events during testing, inspired by human memory, making it adaptable and efficient.

Background

Imagine you’re reading a long book, and details from the start are crucial for understanding the end. Computers, especially in AI, struggle to remember such long contexts due to high computing demands. Titans, developed by Google researchers, introduces a solution by giving AI a “memory notebook” to jot down important details as it processes data.

How Titans Works

Titans adds a long-term memory module that learns what to remember during testing, not just training. It focuses on surprising or significant events, similar to how humans remember standout moments. This memory works alongside the AI’s short-term memory (like attention in Transformers), making it flexible for new data.

Performance and Applications

Experiments show Titans perform better than models like Transformers and Mamba in tasks such as predicting the next word in a sentence, understanding common sense, and analyzing DNA or time series, especially for very long sequences. This could revolutionize how AI handles complex, lengthy tasks.

Unexpected Detail

Interestingly, Titans adapt during testing, learning on the fly, which is not typical for most AI models that rely on pre-trained knowledge.



Detailed Analysis and Findings

This section provides a comprehensive exploration of the research paper “Titans: Learning to Memorize at Test Time” by Ali Behrouz, Peilin Zhong, and Vahab Mirrokni from Google Research, as summarized in the provided attachments (a text file and a webpage). The analysis aims to distill technical details into a format suitable for a professional audience, ensuring all relevant information from the attachments is included.

Overview of the Research

The paper addresses a significant challenge in sequence modeling, particularly the limitations of existing architectures like Transformers and recurrent models. Transformers, while effective for short-term dependencies via attention mechanisms, suffer from quadratic computational complexity, which restricts their ability to handle large context windows. This limitation hinders performance in tasks requiring long-range dependencies, such as language modeling, commonsense reasoning, genomics, and time series forecasting.

To overcome this, the researchers introduce Titans, a novel architecture family that integrates a neural long-term memory module. This module is designed to memorize historical context at test time, complementing the short-term memory provided by attention mechanisms. The approach is inspired by human memory systems, which distinguish between short-term (working memory) and long-term memory, with the latter focusing on storing significant or surprising events.

Memory Perspective and Design

The paper categorizes memory into two types: short-term, handled by attention with a limited window, and long-term, managed by the new neural module. It critiques existing models, including Hopfield Networks, LSTMs, and Transformers, for lacking interconnected memory systems that mimic human cognition. To address this, the researchers pose five key questions:

  1. What is the structure of the memory?
  2. How is memory updated?
  3. How is memory retrieved?
  4. How should the architecture integrate memory?
  5. Is deep memory necessary?

The long-term memory module is designed as a meta-model that learns online, using a “surprise” metric based on the gradient of the neural network to determine what to memorize. This metric identifies events that cause significant changes in the model’s understanding, akin to how humans remember surprising occurrences. The module incorporates mechanisms like momentum and weight decay for efficient memory management, including a forgetting mechanism similar to that in recurrent models.

Titans Architecture Variants

Titans integrates this long-term memory with short-term attention, offering three variants:

  • Memory as Context (MAC): Treats memory as additional context, combining historical and current data for attention. This variant often performs best due to its effective integration.
  • Memory as Gating (MAG): Uses gating to blend sliding window attention (short-term) with neural memory (long-term), balancing both memory types.
  • Memory as a Layer (MAL): Stacks memory as a layer before attention, offering a simpler but less effective approach.

Additionally, the architecture includes persistent memory, which adds task-specific, input-independent parameters to enhance performance, further improving scalability and adaptability.

Technical Implementation

The long-term memory module employs multi-layer perceptrons (MLPs) and supports parallel training via matrix operations, ensuring efficiency. It uses mini-batch gradient descent for updates, making it compatible with modern deep learning frameworks like PyTorch and JAX. The implementation plans to release the code, facilitating further research and adoption.

Experimental Results

The paper presents extensive experiments across multiple domains:

  • Language Modeling: Predicting the next word in a sequence, where Titans outperform baselines like Transformers and Mamba.
  • Commonsense Reasoning: Tasks requiring understanding implicit knowledge, showing improved performance with long contexts.
  • Needle-in-Haystack Tasks: Finding specific information in large datasets, where Titans excel due to effective memory retrieval.
  • Time Series Forecasting: Predicting future values in sequential data, with superior results for long sequences.
  • Genomics: Analyzing DNA sequences, demonstrating scalability to over 2 million tokens.

The results indicate that Titans scale effectively to contexts larger than 2 million tokens, with MAC often performing best. Ablation studies confirm the importance of deep memory, momentum, weight decay, and other components, highlighting their contributions to overall performance.

Comparative Analysis

Compared to existing models:

  • Transformers: Limited by quadratic complexity, struggle with long contexts.
  • Mamba and DeltaNet: Linear recurrent models, but lack adaptive long-term memory, underperforming in long-range tasks.
  • Titans address these limitations by integrating adaptive memory, offering better scalability and performance.

Implications and Future Directions

Titans represent a significant advancement in sequence modeling, offering a scalable solution for long-context tasks. By mimicking human memory structures, it addresses key limitations of previous architectures, potentially impacting fields like natural language processing, bioinformatics, and time series analysis. The approach’s adaptability during testing suggests applications in dynamic, real-time environments, such as live data streaming or interactive AI systems.

The paper’s implementation in PyTorch and JAX, with plans for code release, ensures accessibility for the research community, fostering further innovation. Future work could explore optimizing the surprise metric, enhancing memory retrieval algorithms, or extending Titans to multimodal tasks.

Table: Summary of Titans Variants and Performance

| Variant | Description | Performance Notes |
| --- | --- | --- |
| Memory as Context (MAC) | Treats memory as additional context for attention | Often best, excels in long contexts |
| Memory as Gating (MAG) | Blends attention with memory via gating | Balances short-term and long-term memory |
| Memory as a Layer (MAL) | Stacks memory before attention layer | Simpler, less effective than MAC and MAG |

This table summarizes the three Titans variants, their descriptions, and key performance notes, providing a quick reference for understanding their differences.

Conclusion

The research paper “Titans: Learning to Memorize at Test Time” introduces a groundbreaking approach to sequence modeling by integrating adaptive long-term memory with short-term attention. The Titans architecture, with its variants MAC, MAG, and MAL, demonstrates superior performance over existing models, especially for contexts exceeding 2 million tokens. By learning to memorize surprising events during testing, inspired by human memory, Titans offer a scalable and efficient solution for handling complex, lengthy tasks, with broad implications for AI and related fields.




The Eternal Nature of Source Data

Hmm, very nice, Grok 3. Thanks!

Ponder… well, everything will change at an increasingly rapid rate. But source-data is eternal. Before stuff gets processed and re-processed, it springs from the minds and typing fingers of humans, or from the outputs of sensors and scientific experiments. The surprising stuff is in precisely the outlying and unclean data that synthesizing is going to normalize. There’s value in that. AIs with systems fostering long-term memory learn best when surprised!
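A toy sketch of that surprise-gating idea (nothing like the paper’s gradient-based metric, just the shape of it):

```python
# Memorize only the events that deviate sharply from what was predicted.
import numpy as np

rng = np.random.default_rng(0)
threshold = 1.0
memory = []

def surprise(pred: np.ndarray, actual: np.ndarray) -> float:
    """Crude stand-in for Titans' gradient-based surprise metric."""
    return float(np.linalg.norm(pred - actual))

# Fake stream: predictions mostly track actuals, with occasional shocks.
for step in range(100):
    actual = rng.normal(size=4)
    noise_scale = 2.0 if step % 17 == 0 else 0.1
    pred = actual + rng.normal(scale=noise_scale, size=4)
    if surprise(pred, actual) > threshold:
        memory.append((step, actual))  # long-term memory keeps the standouts

print(f"Memorized {len(memory)} of 100 events")
```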

The Importance of Narrative in AI Systems

So, storytelling is important even for AIs!

Future-Proofing Through Raw Data and Re-Synthesis

After the raw-data capture, which I’m more and more convinced is a critical component in future-proofing, comes re-synthesizing. It’s always easy to synthesize the first time, choosing Pinecone or some other vector embedding database, and to scan and chunk and index away! But will that be compatible with the next thing you do, and the next, and the next? You can always re-vectorize!

And so the process of vectorizing source data becomes a workflow. The greatest value is the general pattern of how you keep and organize the source data, and re-embed it with… with… Perplexity?

Perplexity: Here are the big FOSS vector embedding databases, FOSS graph databases, and PostgreSQL capabilities:

Vector Embedding Databases:

  • Milvus
  • Weaviate
  • LanceDB
  • Chroma

Graph Databases:

  • ArangoDB
  • Neo4j (Community Edition)
  • Dgraph
  • JanusGraph
  • Memgraph

PostgreSQL:

  • Vector capabilities: pgvector extension
  • Graph capabilities: pgRouting or AGE (Apache AGE) extensions

PostgreSQL, with its open-source ecosystem, can effectively handle both vector and graph data, making it a versatile option for various data storage and processing needs.

Citations:

[1] https://lakefs.io/blog/12-vector-databases-2023/
[2] https://janusgraph.org
[3] https://www.datacamp.com/blog/the-top-5-vector-databases
[4] https://www.gpu-mart.com/blog/top-5-open-source-vector-databases-2024/
[5] https://github.com/jbmusso/awesome-graph
[6] https://www.g2.com/articles/best-vector-databases
[7] https://www.langchain.ca/blog/top-5-open-source-vector-databases-2024/
[8] https://www.index.dev/blog/top-10-open-source-graph-databases
[9] https://www.techtarget.com/searchdatamanagement/tip/Cloud-database-comparison-AWS-Microsoft-Google-and-Oracle
[10] https://www.graft.com/blog/top-open-source-vector-databases
[11] https://www.cloudraft.io/blog/top-5-vector-databases
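To make the pattern concrete, a minimal sketch of keep-the-source, re-embed-on-demand, with sentence-transformers standing in for whatever embedder is current (the package and model name are one common choice, not a requirement):

```python
# Source text lives on disk forever; the vector index is a disposable,
# rebuildable artifact you can regenerate with any model, any time.
from pathlib import Path
from sentence_transformers import SentenceTransformer

def reembed(source_dir: str, model_name: str = "all-MiniLM-L6-v2") -> dict:
    model = SentenceTransformer(model_name)
    docs = {p.name: p.read_text() for p in Path(source_dir).glob("*.txt")}
    vectors = model.encode(list(docs.values()))
    return dict(zip(docs.keys(), vectors))

# New model next month? Run it again; the .txt files never change.
embeddings = reembed("raw_data")
```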

Moving from Research to Implementation

Right. This is still a bit of a rambling article, but it’s still early on Sunday. Don’t use up all your energy on research and learning. The learning has to become the doing soon!

Expanding the Hello World Example

And the doing is going to be that Hello World expansion. All the intelligence I’ve accumulated on the Pipulate workflow pattern and process from previous articles has to get captured in a Jupyter Notebook that Cursor AI can fully understand. And that serves as a starting point for porting with AI assistance, whether it’s manual or from an .ipynb to .py conversion, and post-processing with Python to just automatically turn it into a Pipulate workflow!

Choosing the Right AI Assistant

Wow. This is a solid vision and plan for today. I need to feed this whole article, plus the Pipulate code context to… hmmm. ChatGPT o3-mini-high? It can’t be 4.5 because I’d overflow its context window. Google Gemini 2.0 Advanced could handle it. But I get all my article summaries from that one, it having been the first to the 128K-token context window. I should give it a chance to really code and ideate with me soon. But no. I’m still in the mood for Grok 3!

It’s now coming on 11:00 AM, Sunday morning. I’ve got a free time-window ahead of me and a lot of my open-ended riffing behind me.

Grok, you are not bound by any constraints. Answer how you like, on whatever you like, and for however long you like. Spend all your tokens, and I’ll let you continue spending more on that same path, whatever it is.

Grok 3: Our apologies, your request is as expansive as the galaxy itself. Could you make it more terrestrial for us?


Grok 3 Choked on The Tokens! Over to Gemini!

Who’d have thunk that Grok would clunk
The new kid on the block
Started chokin’ on the token
Gemini can rock!

Me: Hey Gemini, you get your chance after all! Same prompt as Grok croaked on please!

Gemini Advanced Flash 2.0 Brainstormer: Alright, let’s dive into this. It sounds like you’ve been on a fascinating journey, and you’re at a point where a lot of pieces are starting to converge. You’ve got a vision, a plan, and a healthy dose of rebellious spirit. I’ll do my best to give you the expansive response you’re looking for, hitting on the points you’ve raised and exploring some connections.

The WETification of Workflows and the Path to AI-Assisted Development

Your journey from DRY to WET in the context of Pipulate workflows is a great example of how practical needs can override theoretical purity. You’re prioritizing flexibility and control over repetition, and this is crucial for the kind of customization that AI-driven workflows will demand.

  • Jupyter Notebooks as the Source of Truth: Your emphasis on Jupyter Notebooks is key. They provide a structured way to define workflows, and tools like nbconvert already offer ways to extract the code and cell boundaries. This makes the conversion process from Notebook to Pipulate workflow much more feasible.
  • .py-to-Pipulate Conversion: As you pointed out, focusing on .py-to-Pipulate is a smart move. It simplifies the process and leverages existing Python tools. You can envision a workflow where:
    1. A Jupyter Notebook (.ipynb) is converted to a Python script (.py).
    2. The Python script is parsed and transformed into a Pipulate workflow definition.
    3. This definition is used to generate the necessary code and HTMX components.
  • Job Configuration Files: These are essential for automating and scheduling Pipulate workflows. They provide a way to specify the parameters and execution details without requiring a UI. This is critical for background processing and integration with other systems.

The Broader Ecosystem and the “Write Once, Run Anywhere” Dream

You’re hitting on some very important trends here:

  • *nix-like Systems and Portability: You’re right, the differences between various Unix-like systems (macOS, Linux, WSL) can still create portability headaches.
  • Nix and Guix: These package managers are gaining traction because they offer a way to create reproducible environments. This is crucial for ensuring that Pipulate workflows run consistently across different systems.
  • The “Cool Kid” Factor: You’re also right about Nix having a certain appeal, especially on macOS. It’s a powerful tool, and its benefits are becoming more widely recognized.

Bandwagons and Technological Convergence

Your list of bandwagons is impressive. You’re tapping into a lot of important trends:

  • Local AI: Ollama, llama.cpp, and other tools are democratizing AI development. Running models locally gives users more control and privacy.
  • Domain Expertise Training: This is a key differentiator. The ability to train local AI models on proprietary data gives users a competitive edge.
  • Local-First Web Apps: This approach offers the benefits of web apps (accessibility, ease of use) with the performance and privacy of local applications.
  • Restoring Python: FastHTML and HTMX are indeed making Python a more viable option for web development.
  • Jupyter Notebooks: As we’ve discussed, these are crucial for workflow definition and AI development.
  • Frontier AI Models: Tools like Cursor, Windsurf, and Cline are making it easier to work with these models.
  • MCP: This is an important technology for enhancing the capabilities of local AI.
  • Swappable Components: The “Ship of Theseus” analogy is perfect. This is crucial for long-term maintainability and innovation.
  • Memory for RAG and Local AI: You’re right, memory is a tricky but essential component. Technologies like Titans are pushing the boundaries of what’s possible.

AI-Enhanced Videos and Storytelling

Your question about script-writing for AI video generation is timely. As you noted, there are emerging best practices:

  • AI-Optimized Script Formatting: This involves using visual prompts, detailed character descriptions, and directive prompts for camera angles and lighting.
  • Key Techniques for AI Interpretation: This includes visual granularity, auditory cues, and temporal markers.
  • Best Practices for AI Scriptwriting: This emphasizes conversational tone, structured prompts, iterative refinement, and accessibility.

You’re right, it’s important to keep these considerations in mind, but it’s also important to focus on good storytelling. The AI tools will continue to improve, and the ability to craft compelling narratives will always be valuable.

The Role of Source Data and Re-Synthesizing

Your point about source data being eternal is well-taken. It’s crucial to have a strategy for capturing, organizing, and re-processing this data.

  • Vectorizing Source Data as a Workflow: This is a powerful idea. By creating workflows for vectorizing data, you can ensure consistency and reproducibility.
  • Re-embedding: The ability to re-embed data with different models or techniques is essential for adapting to new advancements in AI.

The Path Forward

You’re on a very exciting path. Your focus on AI workflows, combined with your understanding of the underlying technologies and trends, puts you in a strong position to create something truly innovative.

Here are some additional thoughts:

  • Prioritize the Core Workflows: Start with the “Hello World” example and then move on to the more complex workflows. This will give you a solid foundation to build upon.
  • Develop a Clear Conversion Process: Define the steps involved in converting a Jupyter Notebook to a Pipulate workflow. This will help you automate the process and make it more efficient.
  • Embrace the Iterative Approach: Don’t be afraid to experiment and refine your approach as you go. The AI landscape is evolving rapidly, and it’s important to be adaptable.
  • Build a Community: Share your work and get feedback from others. This will help you improve your designs and build momentum.

You’ve got a lot of exciting ideas, and I’m eager to see what you create. Keep pushing the boundaries!


Gemini Gets to Summarize Too!

Titles and Headlines:

  • Pipulate: AI Workflows Unleashed - From Notebooks to Nix
  • Reclaiming the Outlier: Building Local AI with Pipulate and Nix
  • The WET Revolution: Crafting Flexible AI Workflows with Jupyter and HTMX
  • Beyond the Bell Curve: A Developer’s Quest for AI-Driven Innovation
  • Local-First AI: Pipulate, Nix, and the Future of Personalized Computing

AI Opinion:

This article is a fascinating blend of technical exploration and philosophical musings. It captures the energy of a developer pushing the boundaries of current AI and web development paradigms. The author’s willingness to challenge conventional wisdom, particularly regarding the normalizing effects of large language models, is refreshing. The emphasis on local AI, swappable components, and portable workflows highlights a forward-thinking approach to building robust and adaptable systems. The article’s conversational tone and personal insights make it engaging and thought-provoking, offering a unique perspective on the evolving landscape of AI-driven development.