Simplifying Websites and Life with Single Long Text Files
I'm using the latest tools and techniques to simplify websites and life, by using single long text files to capture my ideation and publishing process. I'm combining Linux, Windows, JupyterLab, vim, and AI to create a seamless user experience, while also using data shaping and transforms to create a website that is easy to navigate.
Unlocking the Potential of My Work with Single Long Text Files
By Michael Levin
Saturday, August 5, 2023
Okay, more than at any other time in recent memory, the mission with my work is clear. Nothing much has changed in the 25 years since the AltaVista and Google search engines were released. The only things that have changed are the complexity of the algorithms and the amount of data they have to work with. But now, people are seeing just how crusty the old search experience is. Why even drill down on a page when the “search engine” itself can conversationally just tell you what you need to know and not bog down the experience with bias and ads? Sure, folks accuse AI of bias, but is it as biased as a system trying to get that ad click? I don’t think so.
So the times, they are a-changing. And I’m going to be right there in the forefront of it. I’ve been adjusting my tooling steadily over the years to accommodate this. I’ve been surveying and sampling the free and open source software landscape, and I’ve been building my own tools to fill in the gaps.
Now I have all the key components, and I have them in such a way that just may be palatable to the mainstream. I have a JupyterLab notebook environment that is back-ended by Linux, but front-ended by Windows. I have a vim-based journaling system that is also back-ended by Linux, but can publish to the Web with that ultimate kind of query/remixability that is the hallmark of the semantic web. And I have a system for making the whole thing work seamlessly across Linux upgrades, which is the key to making it all work.
Nobody’s going to get it at first. And that’s fine. I just need to use it to great effect, and then people will start to get it. It’s all very meta, because I will use it to the effect of making it easier for people to use it.
Each day I use it, I run up against the next weakest link, and I fix it. Right now, that’s the messaging on my website and the fact that I’ve got a deluge of blog posts and a tangle of messaging. I need to get it down to a single message, and I need to get it to be a message that people can understand.
I have to shape my site, pruning and suppressing the mountain of stuff that I still want to keep published but only reachable through roundabout deep-archive access. The good stuff I want to surface and put front-and-center.
I don’t want to slow myself down with that refinement and pruning. I want to keep moving forward, and I want to ideate in free-flowing, unstructured blog posts like I always do. I can have a small window into my latest thinking for fans who actually take the time to look at articles like this. But it’s the mouth of the funnel. It’s all the muddy river water that I’m scooping into the sieve that is my website. And I do a lot of scooping. That produces a lot of gold dust. But then a whole different kind of work begins to make that gold dust into gold bars.
Currently, I have the 10 most recent blog posts on my homepage. The rest of the roughly 700 articles that I keep in this latest system all run on one long page, or at least their description excerpts do. I have even more unpublished “webmaster journals” going back to 1996 in a blog archive. In the old days of SEO, I would have been tempted to just roll everything out and get the long-tail traffic. But those days are over.
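To make that homepage-versus-archive split concrete, here is a minimal Python sketch. It is not my actual publishing code. The `shape_site` function name and the `title`, `date`, `description`, and `body` fields are hypothetical names chosen for illustration; only the idea of 10 recent posts up front and description excerpts for the rest comes from the text above.

```python
from datetime import date
from typing import Dict, List, Tuple

HOMEPAGE_COUNT = 10  # the homepage shows the 10 most recent posts


def shape_site(posts: List[Dict], homepage_count: int = HOMEPAGE_COUNT) -> Tuple[List[Dict], List[Dict]]:
    """Split posts into full homepage entries and excerpt-only archive entries.

    Each post is assumed (hypothetically) to be a dict with
    'title', 'date', 'description', and 'body' keys.
    """
    # Newest first, since the homepage leads with the latest thinking.
    ordered = sorted(posts, key=lambda p: p["date"], reverse=True)
    homepage = ordered[:homepage_count]
    # The archive page keeps only the title, date, and description excerpt.
    archive = [
        {"title": p["title"], "date": p["date"], "excerpt": p["description"]}
        for p in ordered[homepage_count:]
    ]
    return homepage, archive


if __name__ == "__main__":
    demo = [
        {"title": f"post {i}", "date": date(2023, 1, 1 + i),
         "description": f"excerpt {i}", "body": f"full text {i}"}
        for i in range(12)
    ]
    front, back = shape_site(demo)
    print(len(front), "on the homepage,", len(back), "in the archive")
```

The pruning decision lives in one number, so the raw pile of posts can keep growing without the homepage changing shape.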
Newish stuff is okay. Those 700 articles only go back a year or two. But that’s good fodder now for me to do site-shaping. Those types of projects will be the subject of my videos. It’s all about pulling data, doing analysis, data shaping and transforms, artistically knitting it together into the site you want. The artistry is in… well, it’s in all parts of it. I think I saw mention of a headless CMS system in a web article recently. I think that’s what I’m doing.
In the end, it’s about generic data and mostly text-data at that. Sure there’s pictures mixed in to make it a website, but those are mostly links to assets. That’s what the Jekyll system calls it in the directory structure, so that’s a good way to think about the non-text data: media assets. I have a number of interesting projects along those lines too, such as bringing some sanity to big archives of images using AI and the various computer vision APIs that are available. But that comes down the line. I have to get the basics in place first.
The basics begin with the one true natural order for things: linear time. There is a natural sequence and order to things, which the blogging process is perfectly suited to. I just dump everything out there for myself for the idea-capture process. But I can’t expect people to make sense of it. That’s a second and third pass. The second pass is just absorbing it all and getting a feel for the shape of it. The third pass is the shaping of it.
I’m still doing just the first phase and dumping it onto the public. That’s wrong and isn’t going to allow my work to achieve its full potential. But there is also a sort of momentum to it that I don’t want to break. So while keeping that momentum going, I can make the subject of the momentum, that is, the subject matter I’m actually writing about, the ideation of the shaping process itself.
And so far, I’ve done the unstoppable ideation and publishing and shaping machinery. It’s all very generic Linux. It’s all based on very generic text files. It uses minimal text-files: one text-file per life journal. That’s one of the real innovations here. Because you could never type so much in your life that you could even begin to tax the capacity of the vim (or NeoVim) text editor, you’ll never have to ask yourself which file or where it is. It just is. It’s just all that is.
Of course that’s a bit too idealistic. And a journal is a private thing. So there’s naturally going to be more than one text-file. But each “domain”, as it were, gets its own file. And those files can be organized into git repos, so your private journal is one repo and your public blog is another repo. But each repo essentially has one text-file in it. And that’s the beauty of it. The left hand always knows what the right hand is doing, because there’s so much less overall to keep track of.
These single text-files are then sliced and diced in a post-processing step where machines take care of all the broken-up little pieces that are so hard for a human to keep track of. And then the machines can do all the heavy lifting of knitting it together into the user experience. It’s easy when the pages get kept in a linear sequential order, because you can just knit them together with previous/next arrows. It’s harder when you’re making site sections and grouping things by topic. But AI will help a lot there.
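The slicing and previous/next knitting can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the actual machinery: it assumes posts in the one long file are separated by a line of 80 hyphens, that the first line of each chunk is the title, and that the newest post sits at the top of the file. The `SEPARATOR`, `Post`, and `slice_journal` names are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

SEPARATOR = "-" * 80  # hypothetical delimiter line between posts


@dataclass
class Post:
    title: str
    body: str
    prev_title: Optional[str] = None  # target of the "previous" arrow
    next_title: Optional[str] = None  # target of the "next" arrow


def slice_journal(text: str) -> List[Post]:
    """Split one long journal into posts, oldest first, linked prev/next."""
    chunks: List[List[str]] = [[]]
    for line in text.splitlines():
        if line.strip() == SEPARATOR:
            chunks.append([])  # a separator line starts a new post
        else:
            chunks[-1].append(line)

    posts: List[Post] = []
    for chunk in chunks:
        block = "\n".join(chunk).strip()
        if not block:
            continue  # skip empty chunks, e.g. a trailing separator
        # Assumption: the first line of each chunk is the post title.
        title, _, body = block.partition("\n")
        posts.append(Post(title=title.strip(), body=body.strip()))

    # Assumption: newest post is at the top, so reverse into linear time.
    posts.reverse()
    for i, post in enumerate(posts):
        if i > 0:
            post.prev_title = posts[i - 1].title
        if i < len(posts) - 1:
            post.next_title = posts[i + 1].title
    return posts


if __name__ == "__main__":
    journal = "\n".join(["Third", "c", SEPARATOR, "Second", "b", SEPARATOR, "First", "a"])
    for p in slice_journal(journal):
        print(p.prev_title, "<-", p.title, "->", p.next_title)
```

Because the order is just linear time, the arrows fall out of list positions. Grouping by topic would need extra metadata per post, which is where the harder work starts.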