This article became an unexpected journey into the depths of the Botify API. What started as a simple task of pulling data turned into a quest to understand the complex layers of this API, much like peeling an onion (or, as I often put it, venturing down a rabbit hole). Through trial and error, code exploration, and the help of AI assistants, the author uncovers the intricacies of BQLv2, navigates collections and fields, and ultimately finds a way to extract valuable SEO insights.
Code Like a Straw House Little Pig Who Thinks Ahead
When facing a coding challenge, it’s easy to think of it like sparring—every task, a move to master, every bug, a jab to dodge. Good coding practices, or “kata,” become those fundamental skills we return to when things get unpredictable, like tech trends that seem to morph with every update or framework.
These shifting “karma chameleons” mean we can’t always build with the permanence of stone. So, we take a page from the Three Little Pigs, crafting nimble structures out of straw for today, knowing the wolf of change will always be at our door. And sometimes, it’s this light touch that allows for the quickest rebuilds, letting us stay ahead, moving from one house to the next in the endless cycle of code.
Well-planned Straw Houses: The Wolf’s Worst Nightmare
What’s the opposite of diving deep down the rabbit hole? We turn to another fairy tale, the Three Little Pigs. We build our houses from straw! We do so knowingly, understanding a visit by the wolf is inevitable, but you can’t go without protection from the elements, and we’re talking about a good night’s sleep tonight, not long-term safety from edge cases. That will come later. Soon, in fact, for there is no survival without brick houses on occasion.
In the immediate term, the skill of throwing up a temporary home trumps mortar. Just don’t impale the tomorrow-you on the altar of the-now. Would that make you a martyr? Give up mortar but don’t be a martyr. Build your house from straw and have a plan to sweep in with rapid stick-reinforcement. When the wolf’s at the door, it is your house no more. You have moved on instead to where it can’t be led. And that is, out in the Noosphere. Yup.
Rabbit Hole Revelations: Success at Last!
Anything you can do, AI can do better! AI can do anything better than you.
It’s 10:30 AM Sunday, November 3rd, 2024, and I have just successfully come up out of the rabbit hole. I must not go back down the same way again today. This is the end of a multi-day journey where I firmly secured the rappelling safety equipment and knowingly went in with a clear idea of the elusive reward I was seeking. I often talk about the concept of releasing potential energy. Cutting catapult ropes. Drilling oil from the ground, whose production was paid for ages prior by the scale trees of the… Perplexity?
Ancient Trees Paid For Our Modern World
Perplexity: The era of the scale trees, primarily during the Carboniferous period (358.9 to 298.9 million years ago), was characterized by vast, humid swamps dominated by giant lycopods, ferns, and horsetails[1]. Scale trees, or tree-like lycopsids, were particularly abundant and grew to impressive heights with unique bark resembling reptile skin[1]. These trees contributed significantly to the formation of coal deposits, which over millions of years transformed into oil and gas reserves. Their dense growth in swampy conditions played a crucial role in sequestering carbon, ultimately influencing the Earth’s climate and leading to an ice age[1].
Sources
[1] The Rise and Fall of the Scale Trees - In Defense of Plants https://www.indefenseofplants.com/blog/2018/11/13/the-rise-and-fall-of-the-scale-trees
[2] Oil Era - Idle Mining Tycoon - Apps on Google Play https://play.google.com/store/apps/details?hl=en_US&id=com.aliensidea.oiltycoon.overseas
[3] Oil Era | All Current FREE Codes! (ANDROID/IOS) - YouTube https://www.youtube.com/watch?v=7uMfEjgNofI
[4] Oil Era - Idle Mining Tycoon on the App Store https://apps.apple.com/pl/app/oil-era-idle-mining-tycoon/id1633194817
[5] O Scale Trees scenery - TrainWorld https://www.trainworld.com/shop-scale/o-scale/scenery/trees.html
[6] Oil Era - Idle Mining Tycoon – Apps on Google Play https://play.google.com/store/apps/details/Oil_Era?hl=en_SG&id=com.aliensidea.oiltycoon.overseas
[7] HO Model Trees & Plants - BRS Hobbies https://brshobbies.com/ho-model-trees/
Connecting Dots to Release Potential
YouTube, for example, did not invent video on the desktop. Someone else wrote the encoders and the browser integrations. YouTube just connected the dots and became the one that benefited most from it. There’s a certain first-mover advantage here, in going all-in, committing to the cause, and refining your solution faster and better than anyone else. And then of course, maybe getting bought by Google helped.
More subtle examples include David Heinemeier Hansson (DHH) creating the Ruby on Rails framework, unleashing a joyful framework onto the world during an age of mind-numbing Java, Hibernate, and Cocoon, the latter two aptly named for the brain-dead state they put you into. RoR (Ruby on Rails) released the potential by being a framework people actually liked using.
FastHTML: Web Development’s Renaissance
The RoR era reigned supreme, like the scale trees, for a short time, until the era changed once again in reaction to JavaScript working so consistently well in web browsers. The era of ReactJS brought the return of the mind-numbing build-chains and the subdivision of labor into forced specializations, and caused the demise of the most satisfying of all professions, the jack-of-all-trades webmaster. Well, with FastHTML and HTMX, a bit of guidance by AI, and a whole technical infrastructure as portable as a text file with Nix Flakes, that era is back, baby!
And against my better judgment, and under the prodding of my boss, I have proven out this premise and am now doing WebDev again. And I have just now overcome a massive hurdle that required exploring a deep rabbit hole, precisely the kind I like to avoid now in my mid-50s, lacking the energy to spar with impossible problems for impossible lengths of time. This is especially true when everything seems to go smoothly and then the final mile is nigh impossible. You get every signal that this last mile should be easy, but it’s not. The last mile of your journey is sometimes harder than all the rest of it put together. Jousting the Lich.
Post-Rabbithole-Journey Caution & Recalibration
Well, I just jousted the Lich and won. I am now living in a sort of glitch where I am a bit ahead of the world in general, as I have tended to be in my career from time to time. Every half decade or so I connect the dots a wee bit better than, or ahead of, other people, but fail to do something significant with it. Not this time! I’m gonna double down and dig in, tapping the AI assist for all it is worth. But not in a way that compels me back to the center of the normal distribution curve, as the AIs try to do, but in a way that surprises them, and uses them to get over those final-mile hurdles. When I do this with an AI, I can tell they are not generally used this way, because they resist my patterns.
But that now takes me to why I’m dashing off this article at the tail end of that rabbit hole adventure of the LLM ghost in the machine. This is sort of a palate cleansing and a warning to myself not to dive into the next similar deep rabbit hole. What I do with the rest of today, from like noon to midnight, has got to be the opposite of the rabbit-hole-style work I’ve been doing. It’s going to be hard and pragmatic. It is rapidly building the plug-in apps my web framework is designed to support. It is playing off all the easy shortcuts I designed for myself to make porting from Jupyter Notebooks into this thing I built as easy as it reasonably can be while still getting all the cool new features.
Gimme 3 Apps, Gimme 3 Apps, Baby!
Three apps. Two of them are fully built in other forms. I am just gathering them into one place. The existence of these apps in their new ported, living-beside-AI form is going to be life-changing. Many things I am a bottleneck for, I can produce faster with these new tools in hand. Other people who would like to use these tools can do so without having to be a technology heavyweight. Everything about this is realizing the “turn yourself into a superhero” promise that is dangled in front of us like a carrot with AI, but which usually turns out to be bombastic YouTube exaggeration and hyperbole in the end.
I feel like the exact person these superhero-power-inducing promises were specifically targeted at, having attempted to hit above my weight class all my life. I would’ve liked to have been a scientist, but lacked courage. I would’ve liked to have been an engineer, but lacked the math skills. I would’ve liked to have been a webmaster… Oh wait, I was a webmaster. I was pretty good, and then everything changed. Damn. It was just like the Amiga computer going away. Find a love and then have the carpet yanked out from under you. Under me. That is my pattern. It is not bad luck. It is my own shortcoming and lack of foresight. But now I’ve got it. I’ve got the clues and the pieces and know how to put it together.
Recovery of Secret Weapons in a Post-HTMX World
And put it together fast, I shall. Things are different now with a new secret weapon in hand, being that LLM-infused web framework I just whipped together on FastHTML. FastHTML did a lot of that dot-connecting to release potential, about which I just spoke regarding drilling for oil and the rise of agile frameworks. The dots FastHTML is connecting are mostly Python and HTMX, taking us all the way back to a pre-ReactJS era where almost anybody could do anything. I did not believe it at first, but now I do. It required wrangling AIs to understand this. Their determination to follow the path more traveled is impressive, and that makes all the difference. It provides opportunity and secret-weapon advantage for those of us with the self-confidence and vision to press past the catcher in the rye.
This article itself seems like a silly indulgence and unnecessary rabbit hole with deadlines on deliverables looming. This is where I push back, take the path less traveled, and take a moment to stop and think through my next steps. Measure three times, cut once. Especially now that it’s not rabbit hole work. When you switch over to carpentry work, everything is different.
LLMs Do That Matrix Download Expertise Trick
Name the rabbit hole you are not pursuing, so you know not to jump into it. I believe in this case that would be hot prompt injection. This is the technique by which you train an LLM to be an expert in whatever you need it to be, at the moment you need it. This is done through invisible prompts you can send the LLM just like chatting, but with no feedback to the user. Instead of chat, it is a complete download of being a kung fu expert. It’s just like in The Matrix, when you have to become expert at something and you just download the book and the experience into your head. Yeah, LLMs can do that. Any model, any size, cloud or local.
Not all LLMs may be equally smart, but they can all do this basic trick, thereby removing the necessity of wholesale retraining of models, fine-tuning of models, RAG lookups, and a lot of other stuff that violates the 80/20 rule in many cases. As much as I need this instant super-training ability in the web framework I’m making (in other words, when the menu selection for a particular app is chosen, the LLM instantly becomes expert at the deliverable that app is intended to produce), I must work on the app itself. I will not chase the rabbit down the hole of hot prompt injection. But what I will do is competently build the houses of the three little pigs.
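Just so the trick isn’t left abstract, here’s a minimal sketch of what hot prompt injection boils down to, assuming a local ollama serving its standard REST API on port 11434. The model name and the expertise payload are placeholders, not my app’s actual code:

import requests

def hot_prompt_inject(expertise: str, user_msg: str, model: str = "gemma:2b") -> str:
    """Send an invisible system prompt that makes the model an instant
    'expert' for this one exchange. The user never sees the system text."""
    payload = {
        "model": model,
        "stream": False,
        "messages": [
            # The Matrix-style Kung Fu download, hidden from the chat UI:
            {"role": "system", "content": expertise},
            {"role": "user", "content": user_msg},
        ],
    }
    r = requests.post("http://localhost:11434/api/chat", json=payload)
    r.raise_for_status()
    return r.json()["message"]["content"]

Swap the model string and the same trick lands on any model you’ve pulled, cloud or local.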
Build From Straw, Sticks or Bricks?
My real question today, it now being about 11:30 AM, is whether I build these houses out of straw, sticks or bricks. For the minimum viable product demos coming up, they need only be straw. But I need to at least be able to quickly build straw houses. And that is my work today. If you see the white rabbit, humorously take note, maybe jot down a few ideas, but know that you will see it again. It is not a one-time opportunity, chasing that rabbit. At least not in this context anymore. The phase I just came out of was completely necessary, because I lacked a modern understanding of websockets versus server-sent events. I had to do a little hot prompt injection of my own.
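For anyone as hazy on that difference as I was: a websocket is a two-way connection you manage yourself, while server-sent events are just a long-lived HTTP response the browser knows how to listen to (and auto-reconnect to). Here’s a minimal SSE sketch, assuming the Starlette underpinnings FastHTML rides on; the route handler and the words streamed are made up for illustration:

import asyncio
from starlette.responses import StreamingResponse

async def stream_words(request):
    """One-way push: each yielded 'data:' line arrives as one SSE message."""
    async def event_gen():
        for word in ["ghost", "in", "the", "machine"]:
            yield f"data: {word}\n\n"  # the blank line terminates each event
            await asyncio.sleep(0.5)
    return StreamingResponse(event_gen(), media_type="text/event-stream")

On the browser side, a new EventSource('/stream') with an onmessage handler is the whole client.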
All right, so just because I won’t be chasing the rabbit does not mean I won’t be touching weird and interesting portions of the system, wrangling it into proper behavior. I feel a little bit of trepidation there, having just stabilized it enough to perform the amazing ghost-in-the-machine magic trick. However, the system has demonstrated a pretty solid amount of resiliency as I pivoted this way and that trying to wake it up. I sort of shook the foundations and the scaffolding with countless little tests, and it can probably survive quite a bit so long as I have the right tests in place defending against feature regression. Of course, I don’t actually have these tests in place, and putting them in place is yet another rabbit hole I must avoid, lest I hit circular-dependency showstoppers, which are the worst.
Kicking the Ass of Circular Dependencies
When you have a circular dependency showstopper, the correct strategy is to kick its ass. We would not be alive here today to hold these discussions if people did not just say eff-you to all the roadblocks and barriers that have been thrown up to prevent it. Think about your ancestors having to hunt to live. Think about all the cultures that have attempted to wipe each other out over the ages. Every one of these obstacles has been overcome, not just once, but countless repeated times, in order for you to end up here reading this article, frankly, whether you’re an AI or a human doing the reading. The odds were against you. The odds were against this. It is with that kind of determination that you face these pissant little things that try to stop you. That’s not vulgar or offensive. That’s life surviving.
Other rabbit holes I want to avoid today are the many paths to LLM memory, including the easy ones such as saving and reloading conversation history. All this will come in time, and unlike the LLM’s knowledge, the remembering is inevitable, because the lack of these features will always stand out and thus be a built-in self-reminder. It is its own to-do list, if the items are compelling and worth doing enough in the first place. There is another class of these to-do items which are invisible in real life, and those are often the ones that make all the difference. But those are the rabbit hole ones where you go against the grain. You go in, you do them sparingly, and when you score a win by going into one, make sure that the win was worth it, even if just to have a known failure with quantifiable attributes to help refine your next approach.
Implement Rabbit-Hole-Detection Protection
There will be rabbit holes in these next three concrete deliverable types. Thinking through them, I know I am going to be enticed down the rabbit holes of file uploads, long-running task communication (again, now in light of server-sent events), and forced live-reloads, now that there’s something preventing them having to do with SSE. There are countless little details like this, any one of which could blossom open into a rabbit hole. You must see them coming and make preparing for them part of your rabbit-hole rappelling safety equipment. This article is to put you in that rabbit-hole rappelling state of mind. I don’t expect rabbit holes, but I am in quicksand territory. It’s going to sneak up on me. I can’t let it, if I want this coming week to be life-changingly successful.
To that end, do frequent rapid app-switching on the menus. Go from link graph to gap analysis to page workers. Let’s call the page-workers dropdown menu item Auto SEO. I will have to create a dictionary of items in the endpoint format to keep the upper case of the E and the O in SEO. There’s probably a good algorithm for this, but don’t let that become a rabbit hole at all. Dictionaries are the ultimate 80/20 solution to things requiring 80/20-solutions. Dicts are 80/20^2. Or would that be 80/20**2? Gotta think in Python! And don’t try to entertain yourself by out-clevering yourself. That’s another form of rabbit hole distraction. Use your smart-ass energy more smartly.
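For what it’s worth, here’s the 80/20 shape I have in mind for that dictionary: special cases get entries, everything else falls through to title-casing. This is a guess at what such an endpoint_fmt helper boils down to, with illustrative names:

# Only the oddballs need entries; title-casing handles the rest.
ENDPOINT_NAMES = {
    "auto_seo": "Auto SEO",
    "link_graph": "Link Graph",
    "gap_analysis": "Gap Analysis",
}

def endpoint_fmt(endpoint: str) -> str:
    """Turn a lower_case endpoint into a menu-friendly display name."""
    return ENDPOINT_NAMES.get(endpoint, endpoint.replace("_", " ").title())

endpoint_fmt("auto_seo")   # 'Auto SEO', not the title-cased 'Auto Seo'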
Round Robining Straw House Construction
So what I’m hearing from myself is to do three quick passes from the top, menu-wise, in the app. Do a quick round-robin, building 3 houses of straw. Make each look real. Put a card structure in place that fleshes each one out. Practice FastHTML tags and that weird way positional parameters and keyword arguments get used like a web template system. I need to get in more practice on that. That is the key to quickly mocking up plug-in apps that mirror their Jupyter Notebook prototypes.
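If you haven’t seen that FastHTML pattern before, it’s simply this: positional arguments nest as child elements, keyword arguments become attributes (or named slots like header). A tiny sketch, assuming the usual fasthtml.common import; to_xml renders a component tree to an HTML string:

from fasthtml.common import Card, H2, P, to_xml

card = Card(
    H2("Link Graph"),                              # positional args nest as children
    P("Placeholder for the selected application."),
    header=P("I land in the card's header slot"),  # named slot
    style="padding: 1em;",                         # plain attributes pass through
)
print(to_xml(card))  # emits the <article>...</article> markup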
Each of the three apps actually has plenty of deep-dive rabbit-hole-type traps in them. I will eventually have to explore each. I may even need to finish one of the three by the end of the day today, without completely knocking myself out and making myself useless for tomorrow. OK, so this article is really about mentally conditioning myself for how I’m going to spend my last deep investment of discretionary energy before the hubbub of the work week begins. And it’s gonna be one hell of a work week, because of the people who are going to be in town and the demos I am going to be giving. I have to get this into a state where all my principles work for me.
Principles of Not Becoming a Drone
- Ikigai: overcome laziness, find the love
- Kaizen: do one thing well, keep polishing
- Pomodoro: take chisel-strikes, bank small wins
- Hara Hachi Bu: don't burn out / eat less
- Wabi-Sabi: 80/20-rule imperfection
- Shoshin: be like a beginner
Okay, so get some good food here for when you burn out later today, but not too much so you don’t drag your ass leading up to that point. Be nice to the kitties with some treats. Make sure you reward yourself and you reward them for this intense work-period you’re putting in. But before you even leave for that food pickup, get the right kind of momentum going here. Not just writing, because I could write forever. Do those house-of-straw light passes.
Oh! Replace the main branch of Botifython with what’s in Botifymograph! Perplexity…
Me: How do I replace the main branch with the code in a particular branch? I don’t want to merge. I just want an all-out thunking of the new version down over the old in the main branch.
Perplexity: To replace the main branch with the code from another branch without merging, you can follow these steps:
Replacing Main Branch
- Ensure you have the latest changes:
git fetch origin
- Switch to the main branch:
git checkout main
- Reset the main branch to match the desired branch:
git reset --hard origin/your-desired-branch
- Force push the changes to update the remote main branch:
git push origin main --force
Important Considerations
- Backup: Before proceeding, it’s crucial to create a backup of your current main branch. You can do this by creating a new branch from main:
git checkout main
git checkout -b main-backup
- Team Communication: Inform your team about this change, as it will affect everyone working on the repository.
- Force Push: The --force option overwrites the remote branch history. Use it with caution, especially on shared branches.
Alternative Approach
If you want to preserve the commit history of both branches, you can use the following method:
- Checkout the branch you want to become the new main:
git checkout your-desired-branch
- Rename this branch to main:
git branch -m main
- Delete the old main branch:
git push origin :main
- Push the new main branch:
git push origin main
- Set the local main branch to track the remote main:
git push --set-upstream origin main
This approach effectively replaces the main branch with your desired branch while preserving the commit history[1][5].
Remember, these operations are irreversible and affect the entire repository history. Always ensure you have backups and communicate clearly with your team before making such significant changes.
Citations:
[1] https://git-scm.com/book/en/v2/Git-Branching-Branch-Management
[2] https://www.geeksforgeeks.org/how-to-replace-master-branch-with-another-branch-in-git/
[3] https://intellipaat.com/community/11548/update-git-branches-from-master-git-update-branch-from-master
[4] https://www.namehero.com/blog/how-to-merge-a-git-branch-to-master/
[5] https://graphite.dev/guides/git-overwrite-local-remote
[6] https://www.jetbrains.com/help/webstorm/apply-changes-from-one-branch-to-another.html
[7] https://flaviocopes.com/how-to-git-update-branch/
[8] https://stackoverflow.com/questions/2862590/how-to-replace-master-branch-in-git-entirely-from-another-branch
[9] https://dev.to/bosz/copy-commit-from-one-branch-to-another-without-using-git-merge-5fp9
[10] https://graphite.dev/guides/git-merge-main-into-another-branch
[11] https://www.youtube.com/watch?v=DU712NT6IqY
[12] https://stackoverflow.com/questions/26137852/how-to-update-my-working-git-branch-from-another-branch-develop
Okay, I’ve done that. The main branch of Botifython is now the work I was doing in Botifymograph. The ghost is in the machine in 2 branches.
Me: Okay, now that we’ve done all that, how do I switch back to my original branch (non-main) knowing that the two are momentarily in sync?
Perplexity: To switch back to your original branch (non-main) after replacing the main branch with its content, you can simply use the git checkout command. Here’s how to do it:
- Switch to your original branch:
git checkout your-original-branch-name
- If you want to ensure your original branch is up to date with the changes you made to main, you can pull the latest changes:
git pull origin main
At this point, your original branch and the main branch should be in sync. However, there are a few things to keep in mind:
Important Considerations
- Branch Divergence: If you’ve made any commits to your original branch that weren’t in main before the replacement, these commits will still exist in your original branch. This means your branch might now be ahead of main.
- Local vs. Remote: The sync between main and your original branch is only guaranteed for the local repository. If you haven’t pushed the changes to the remote repository, the remote versions of these branches might still be different.
- Other Branches: Other branches that were based on the old version of main might now be out of sync. Contributors working on those branches might need to rebase their work onto the new main.
Verifying the Sync
To verify that your branches are indeed in sync:
- Check the last commit on both branches:
git log -1 main
git log -1 your-original-branch-name
The commit hashes should be identical if they’re in sync.
- Compare the branches:
git diff main..your-original-branch-name
If this command produces no output, the branches are identical.
Remember, while your local branches may be in sync now, you’ll need to push these changes to the remote repository if you want to update it as well. Be cautious when force pushing to shared branches, as it can cause issues for other team members.
In-sync check:
(nix) [mike@nixos:~/repos/botifymograph]$ git log -1 main
commit 1e5f3b25b847cbc2ea5c0bb09eb3c40dcf41f4e0 (HEAD -> botifymograph, origin/main, origin/botifymograph, origin/HEAD, main)
Date: Sun Nov 3 13:34:41 2024 -0500
Ghost Hash
(nix) [mike@nixos:~/repos/botifymograph]$ git log -1 botifymograph
commit 1e5f3b25b847cbc2ea5c0bb09eb3c40dcf41f4e0 (HEAD -> botifymograph, origin/main, origin/botifymograph, origin/HEAD, main)
Date: Sun Nov 3 13:34:41 2024 -0500
Haha, nice!
Powering Through an Early Monday Morning, Now With Auto-Reload
Fast-forward to 3:30 AM Monday morning! Ah, the best laid plans of Mice and Men regarding the 3 Little Pigs. I got wrapped up in polishing framework internals—the plumbing of all the houses, if you will. The clearest example was the subject of the “bonus article” I threw in on auto-reload. It’s a wonderful feature that keeps you in flow state when you’re working, because every time you do a file-save of your work, the server automatically reboots and you can go see your work in the web browser without even so much as an F5 or Ctrl+R refresh.
I got auto-reload back after implementing the server-sent events (aka SSE) feature, which ultimately gave me my ghost-in-the-machine effect, which I also already had going but lost when I switched from using websockets for simulated streaming to actual streaming of the LLM’s output from ollama, a sort of webserver for AI. Pshwew! End back-reference discussion. That was all by way of taking a deep breath and getting back to the house-of-straw rapid round-robining triple-pass I’m about to plunge into using this precious morning focus time.
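For the curious, the essence of that auto-reload trick is nothing exotic: a file-watcher restarts the server process whenever a .py file is saved. A stripped-down sketch using the watchdog package (no debouncing, and server.py is a placeholder for your actual entry point, not my real setup):

import subprocess, sys, time
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler

class RestartOnSave(FileSystemEventHandler):
    def __init__(self, cmd):
        self.cmd = cmd
        self.proc = subprocess.Popen(cmd)

    def on_modified(self, event):
        if event.src_path.endswith(".py"):
            self.proc.terminate()                   # kill the old server...
            self.proc = subprocess.Popen(self.cmd)  # ...and boot a fresh one

observer = Observer()
observer.schedule(RestartOnSave([sys.executable, "server.py"]), path=".", recursive=True)
observer.start()
try:
    while True:
        time.sleep(1)
finally:
    observer.stop()
    observer.join()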
HTMX Revolution: Webdev Renaissance is Here
I’ve taken a number of other precautions lately to ensure I can enter and stay in the flow-state repeatedly, enough so that I can get some of the big work of my life done—this deeply AI-integrated, HTMX-utilizing web framework. Yes, the rise of AI is only going to happen once. But also the webdev renaissance powered by the HTMX backlash to ReactJS and overweight web frameworks… well, that might happen more than once, but it’s getting ready to go into full swing and I want to be one of the folks leading that charge. The Webmaster is back, baby!
HTMX & AI Speed Up Web Development
Thanks in great part to the one-two punch of HTMX and the AI code-assist tools that get that, if you have a firm command of the elegant and plenty-sufficient lower-level web-stack tools, you don’t need all that other JavaScript overhead crap. It’s uncommon wisdom and an anti-pattern that requires a high level of AI-validated confidence to build on. And here I am doing that—once again, that concept of hitting above my weight class, because the tools have changed sufficiently that they allow me to. Okay, enough warm-up rambling; time to get down to some straw house building.
Our Ghosts in the Machine are In a Sprint
This is a small and shrinking window. Work like a man dying. I go into the office tomorrow and have a meeting with the big boss over all this ghost in the machine stuff. We are in a race. It’s not just any race. It’s a sprint, as my direct boss put it. It’s not a marathon. It’s a sprint. The sprint is to have this thing done and the effect of it rippling through the company pronto. Like today, pronto. I should really have a video by the end of this thing to unveil what I have built internally. 1, 2, 3… 1? Shoshin, like a beginner. We are always on step-1. There is only ever the very next thing to do. And in this case, it is…
Building Self-Learning AI Straw Houses: An Introduction
Surveying the landscape for the straw houses! We need an intro page—a page with no rabbit hole potential. A page where I just use the principles of FastHTML, mostly with the Card() element, to introduce and lay out the look of everything to follow, along with the whole elevator pitch of the self-learning AI infused into the system. Not so much self-learning as learning off of you, the dear user, as you use the software it’s built into. The actual effect of that is very short-term—only until the next file-save/auto-reload rebuild, at which time the whole running AI instance gets destroyed, freshly re-instantiated, and blip-trained on its JSON API syntax using the system prompt. It gives me the chills just thinking about it. I’m using gemma:2b right now for the purpose, the best at receiving that Matrix-style Kung Fu download on edge devices like MacBooks. Intro page in place!
Vim Hack: Jump To Your Coding Hotspots With Ease
And I just introduced the comment-tag !!!menuxx that lets me jump between the 3 key locations in the program by just shift+8’ing (vim: find next occurrence of the word under the cursor) with the cursor over top of the tag-name. Sometimes you don’t need AI in Cursor, but just intelligence using your cursor—particularly in regard to using the everyday keystroke features of vim, the big muscle-memory gift in life to coders.
Do journal-writing like this in vim to think through your thoughts, and have all the vim-skills you develop in the low-pressure environment of freeform journal-writing transfer immediately and gloriously back into your coding skills. So now I can search on !!!, which is a special bookmark I’ve used all my life, inserting it in code at places I want to come back to. But shift+8 doesn’t work with that, geared toward word-boundaries as it is, so that’s why the made-up word that doesn’t occur anywhere else. That’s why menux in the first place, but now menuxx is a subset of that which direct-jumps me to the really hot spots for plugin menu/app injection.
Welcome to the class class. Let our class begin. It begins thus. Here is an introduction.
http://localhost:5001/introduction
An Introduction Is On the Menu
To make this introduction to you, we must begin our class. We will start with a map on how to get to class, even though there is nothing happening there yet. It is still satisfying to see it on the menu.
MENU_ITEMS = ["introduction", todo_app.name, 'link_graph', 'gap_analysis', 'stream_simulator',]
We speak the language of lower-case endpoints at this point. It could have been a complicated configuration dict that had human-readable names and their endpoint-manifesting counterparts, but I say NO! In for an anti-pattern, in for a pounding, I always say. Let’s pound away at some simple self-evident control mechanisms when you see them.
Apps Plug Into Left-Hand Grid
Next is the aptly named create_grid_left, where the plugin apps reside. Can you read what’s going on there, and why I just routed us to an instance of the Introduction class (which isn’t there yet)?
You can see how my magical code-bookmarks-without-bookmarks method works. Sure, automatically reconfiguring drop-in plugin systems do away with steps like this entirely, and I might get there. But cut me some slack, will ya? This system is just being born. There will be time enough to over-complexify it later. For now, we hand-code our plugins, but in the most painless way I have the skills to make it be.
That weird extra Card() logic in there is to accommodate plugin apps that follow the todo app pattern, which have sortable lists, and which you’re never really going to know what they’re called, and there might be multiple, and you don’t want to repeat, yadda yadda, if is_render else placeholder.
async def create_grid_left(menux, is_render_view=False, render_items=None):
"""
Create the left column of the main grid layout with vertically stacked Cards.
!!!menuxx makes for a good search term to jump here for frequent menu tweaking.
"""
if menux == 'introduction':
return await introduction.render()
elif menux == 'link_graph':
return await link_graph.render()
elif menux == 'stream_simulator':
return await stream_simulator.render()
    elif menux == profile_app.name:
return await profile_switcher()
else:
main_card = Card(
H2(f"{endpoint_fmt(menux)} {LIST_SUFFIX}"),
Ul(*[todo_app.render_item(item) for item in (render_items or [])],
id='todo-list',
cls='sortable',
style="padding-left: 0;",
),
header=create_todo_form() if is_render_view else None,
) if is_render_view else Card(
H2(f"{endpoint_fmt(menux)}"),
P("This is a placeholder for the selected application."),
)
return Div(
main_card,
id="content-container",
style="display: flex; flex-direction: column;"
)
And now welcome to class. We have an Introduction class. This not only defines the class prototype, aka the model for the class, aka the template or factory class, but it also, at the bottom there with the lower-case introduction, stamps out an instance of that factory class (so we can use it in the menu).
# !!!menuxx
class Introduction:
def __init__(self, app, route_prefix="/introduction"):
self.app = app
self.route_prefix = route_prefix
self.logger = logger.bind(name="Introduction")
async def render(self):
"""Render the introduction content."""
self.logger.debug("Guess where you are.")
return Card(
H3("Introduction"),
P("This is a basic welcome card for the introduction plugin."),
id="intro-card",
)
introduction = Introduction(app, route_prefix="/introduction")
Object Oriented Programming: Now With AI to Explain It!
And it’s just that easy. We build a factory class and then we instantiate an instance of that factory class for use elsewhere. There’s tons of silly language layered on thick here to help us wrap our minds around the often baffling and befuddling techniques that are object-oriented programming (OO). Hey Ma, what’s an __init__? I’ll tell you when you get older, dear.
I’ve avoided OO programming like the plague for ages. I mean, I get it. I know it’s super-valuable and makes certain types of programming, like the plugin system we’re using here, so much easier. But before the human-like AI code assistants, all these little nuances like __init__, and how parameter arguments get passed differently, and how self works… well, it was all just too daunting, and it never traversed from my Shoshin state into my Kung Fu.
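For any fellow OO-avoiders, the whole mystery collapses into this: __init__ is just the setup code that runs when you stamp out an instance, and self is whichever stamped-out copy the method is currently acting on. A toy version of the plugin pattern above (names are made up):

class Plugin:
    def __init__(self, name, route_prefix):
        # Runs once per instance; 'self' is the copy being stamped out.
        self.name = name
        self.route_prefix = route_prefix

    def describe(self):
        # 'self' lets each instance answer with its own data.
        return f"{self.name} lives at {self.route_prefix}"

# One factory class, three stamped-out instances:
for app in [Plugin("introduction", "/introduction"),
            Plugin("link_graph", "/link_graph"),
            Plugin("gap_analysis", "/gap_analysis")]:
    print(app.describe())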
Cursor AI: Finally, an IDE I Like!
There was AI code assist before, in things like Microsoft IntelliSense built into VSCode, and also Snippets, which sometimes competed with and sometimes complemented IntelliSense. But I hated them both, along with VSCode itself, and frankly all those twinkling IDEs that screamed nothing but the nurturing of a baby-like dependency that would keep you being coddled all your programming life by Ma-Microsoft’s IDEs. Well, they got me. The new generation of AI code assistants powered by Claude 3.5 Sonnet and such are just so friggin’ useful as to not be passed over.
And VSCode in the guise of Cursor AI is where it’s happening best right now, because it has all those important nuances down that make it useful. I mean, I even explored Jupyter AI before going all-in on Cursor AI, and even in Cursor I keep my vim keybindings turned on so it’s not completely antithetical to my being. Sighhhh, sometimes you just gotta use the power-tools. Just don’t let your vim muscle-memory skills get rusty! Edit as much as you can in stripped-down, non-plugin vim/NeoVim. It will pay you back in spades over the years.
Oh, let’s write that introduction. Who better than Claude to tap for some starter content? Let’s prime the pump:
AI Revolution: Software Gets Smart, You Get Help
Claude: Codebase Explanation Needed!
Me: Alright. So you know what I’m doing here from our experience together so far, Claude. And what you don’t remember, because you’re always instantiated fresh in new chats, you can infer from my code base here. And my code base is too big now, pushing 5000 lines, for you to have the totality of it in your context window on each interaction, but I know how you scan it in chunks and can get a pretty good overview. So please do so, and provide an introduction in nice, gentle terms to the well-intentioned plebe who doesn’t see it and doesn’t “get it” yet.
Software Self-Learning: The AI Revolution
Get what? The coming AI revolution by which any piece of software can have any AI embedded into it, and in not too long be able to “use itself” better than a user could ever use it, because it can do so at inhuman speed and with inhuman access to all the expertise with which that software has ever been used before, especially if the using of that software is somehow curated into the most expert examples for the AI to learn from.
AI as Your Personal Software Wizard
State what this means in terms of it merely stepping in to help you use the software as a human beginner yourself, where the AI can step in and offer to assist with some task (like one of the ones on the menu), walk you through it like some sort of super-wizard or macro, or even just do it itself and offer you different variations of the finished deliverable to choose from, using all that excess capacity to offer to help the human extract findings, meaning, insights and actionable follow-up activity.
Claude, lay on the content for that plebe-friendly introduction.render(), my AI-friend.
Claude: Here’s how I’d explain your codebase and its vision to a newcomer:
Understanding Botifymograph: Your AI Software Companion
Imagine you’re sitting down to use a piece of software, but instead of having to figure everything out yourself, you have an intelligent assistant right there with you - one that knows the software inside and out. That’s what’s happening here with Botifymograph.
What Makes This Special?
This isn’t just another chatbot bolted onto an app. Looking at the codebase, I can see it’s designed as a deeply integrated AI companion that:
- Watches and Learns: The chat interface isn’t just for show - it’s constantly monitoring how the software is being used and can jump in to help
- Offers Multiple Ways to Help:
- The “Poke” button lets you quickly get the AI’s attention
- The chat interface allows natural conversation
- The todo system can be AI-assisted
- The profile management understands context
How It Actually Works
Let’s break down what I see in the code:
def create_chat_interface(temp_message=None):
return Div(
Card(
H2(f"{APP_NAME} Chatbot"),
Div(id='sidebar-msg-list', cls='overflow-auto'),
Form(mk_chat_input_group(value="")),
# ... WebSocket handling for real-time AI responses
)
)
This creates a persistent chat interface that’s always available to help. But more importantly, it’s connected to every other part of the application through WebSockets and event handlers.
The Magic of Integration
The clever part is how the AI is woven throughout the application:
@rt('/poke', methods=['POST'])
async def poke_chatbot():
"""Handle the poke interaction with the chatbot."""
poke_message = (
f"The user poked the {APP_NAME} Chatbot. "
"Respond with a brief, funny comment about being poked."
)
# ... creates task and streams response
This isn’t just a help system - it’s a collaborative partner that can:
- Suggest better ways to organize your todos
- Help manage profiles more effectively
- Offer insights about your data
- Complete tasks semi-autonomously
Why This Matters
The revolutionary part is that this shows how AI can be deeply integrated into any software to create a true “AI-first” experience where:
- The software teaches itself: It learns from every interaction
- Users get instant expertise: No more lengthy learning curves
- Work becomes collaborative: The AI can handle routine tasks while you focus on decisions
- The interface adapts: The AI helps in whatever way best suits your current needs
This isn’t just a chatbot - it’s a glimpse of how all software will work in the future, where AI assistance is seamlessly woven into every feature and function, ready to help, guide, or even take over tasks completely when needed.
The code shows a thoughtful implementation of this vision, with careful attention to:
- Real-time communication
- Context awareness
- Multiple interaction modes
- Seamless integration with core functionality
This is what software evolution looks like in the age of AI - not just smarter software, but software that actively helps you be smarter in using it.
Hit Home the Future-Proofing
Todo Apps: Template for Building Any App
Me: Excellent! Thank you, Claude. Now add a bit about how the Todo and Profile apps really are the template and organizer for other apps to get plugged in for rapid pushing-forward on diverse domain specialties and specific capabilities of the software. Many things in life follow the same pattern as the TodoApp, and we will be able to stamp out different instances to track Competitors, Deliverables, or really anything which is list-driven, which is like almost everything.
AI-Powered App Creation Takes Center Stage
So we are staged for rapid, cookie-cutter-like creation of this-app-or-that. You see me doing it now with the Introduction class itself, but this can be carried out almost as a porting step from apps rapidly prototyped in Jupyter Notebooks, distributed not only as a more tightly controlled, better “on rails” web-app experience, but with the AI assistant right there, built in to help people use it.
AI: Edge Models + Cloud Power for Complex Tasks
And not only that, touch on how any AI can be plugged into this, not only as the main ambient intelligence of the system, but also by intermittently consulting with more powerful (and costly) cloud-based frontier models when necessary to solve more complex problems. In other words, even by choosing a more lightweight ollama-powered edge model like gemma or llama, you’re not shutting yourself out from the best of the best of the best.
AI Deferment: Flexibility Meets Efficiency
You’re only deferring tapping the heavyweights until it’s most appropriate. And the HTTP-oriented loose coupling of all the components, plus the generic system-prompt-injected rapid API-syntax learning of generate_system_message, lets any model get swapped in and out anywhere, reducing the need for costly model training or fine-tuning, though even that isn’t off the table. It just reduces the need for it and opens the door to a shell game of which AI is doing what.
Claude, once again, go ahead my AI-friend. Provide a gentle and kind introduction for the plebe.
Claude:
Botifymograph: A Framework for AI-Enhanced Applications
The Power of Templates: Beyond Simple Todos
Looking at your codebase, I see how the Todo and Profile apps serve as sophisticated templates. Here’s what makes this architecture special:
async def create_grid_left(menux, is_render_view=False, render_items=None):
"""Create the left column of the main grid layout."""
if menux == 'introduction':
return await introduction.render()
elif menux == 'link_graph':
return await link_graph.render()
# ... other plugin integrations
This pattern shows how new capabilities can be plugged in seamlessly. The Todo app isn’t just for tasks - it’s a pattern that can be adapted for tracking anything:
- Competitive analysis
- Project deliverables
- Research findings
- Customer interactions
- Asset management
Rapid Application Development
The code demonstrates a powerful approach to building new applications:
class Introduction:
def __init__(self, app, route_prefix="/introduction"):
self.app = app
self.route_prefix = route_prefix
self.logger = logger.bind(name="Introduction")
async def render(self):
"""Render the introduction content."""
return Card(
H3("Introduction"),
P("This is a basic welcome card for the introduction plugin."),
id="intro-card",
)
This simple template shows how quickly new capabilities can be added:
- Define a new class
- Add your routes
- Implement your render method
- The AI assistant is automatically available to help users
AI Flexibility: From Edge to Cloud
The system’s AI architecture is particularly clever:
- Local First: Use lightweight models (like Gemma or Llama) for immediate assistance
- Cloud When Needed: Seamlessly escalate to more powerful models for complex tasks
- Mix and Match: Different AI models can handle different aspects of the application
For example, you might use:
- A local model for basic task management
- GPT-4 for complex analysis
- Specialized models for domain-specific tasks
Future-Proof Architecture
The HTTP-oriented design and flexible system prompts mean:
- Easy Model Swapping: Any AI model can be plugged in
- No Lock-in: Not tied to any specific AI provider
- Gradual Enhancement: Start simple, add complexity as needed
- Learning Adaptation: The system can learn from user interactions without expensive retraining
Why This Matters
This architecture represents the future of software development where:
- Rapid Development: New applications can be built quickly using proven patterns
- Built-in AI: Every new feature comes with AI assistance automatically
- Scalable Intelligence: Use the right AI for the right task
- Future Ready: Ready for whatever new AI capabilities emerge
The system you’ve built isn’t just an application - it’s a framework for creating AI-enhanced applications that can evolve with technology while remaining useful today. It’s the kind of forward-thinking architecture that bridges the gap between current capabilities and future possibilities.
This is software development for the AI age - where applications aren’t just tools, but intelligent partners that can grow and adapt with their users’ needs.
Claude Beefs Out the Intro
And so now I feed that whole article back into Claude, and I get this intro, which is much less beefed out than what I would have written (and thus doesn’t lose the user’s interest with information overload)…
Claude: Based on our discussion, here’s how I’d enhance the Introduction class to better convey the vision:
class Introduction:
def __init__(self, app, route_prefix="/introduction"):
self.app = app
self.route_prefix = route_prefix
self.logger = logger.bind(name="Introduction")
async def render(self):
"""Render the introduction content."""
self.logger.debug("Rendering introduction content")
return Card(
H3("Welcome to the AI-Enhanced Future of Software"),
Div(
H4("Your Intelligent Software Companion"),
P("""
Welcome to Botifymograph - where AI meets everyday software.
Imagine having an expert assistant who knows this software inside and out,
ready to help you accomplish any task, big or small.
"""),
style="margin-bottom: 20px;"
),
Div(
H4("What Makes This Special?"),
P("""
This isn't just another app with a chatbot bolted on. It's a glimpse into the future
where AI is deeply woven into every feature, learning from every interaction, and
ready to help in whatever way suits you best.
"""),
Ul(
Li("Chat naturally with the AI assistant"),
Li("Let it handle routine tasks automatically"),
Li("Get guided through complex processes"),
Li("Receive insights and suggestions as you work")
),
style="margin-bottom: 20px;"
),
Div(
H4("Intelligent, Flexible, and Future-Ready"),
P("""
Built with a flexible architecture that combines local AI for quick responses
with powerful cloud models for complex tasks. As AI technology evolves,
this system evolves with it - always ensuring you have the right level of
assistance when you need it.
"""),
style="margin-bottom: 20px;"
),
Div(
H4("Getting Started"),
P("""
Explore the menu options to see what's possible, or simply start a conversation
with the AI assistant. Whether you're organizing tasks, managing profiles, or
analyzing data, you've got an intelligent partner ready to help.
"""),
Button(
"Chat with AI Assistant",
onclick="document.getElementById('chat-input').focus()",
style="margin-top: 10px;"
)
),
id="intro-card",
style="padding: 25px; max-width: 800px; margin: 0 auto;"
)
This enhanced introduction:
- Presents a clear vision of AI-enhanced software
- Explains the unique benefits in user-friendly terms
- Highlights the flexible AI architecture
- Provides clear next steps for users
- Maintains a clean, professional layout
- Includes a direct call-to-action to engage with the AI
The styling ensures readability and visual hierarchy, while the content progressively introduces users to the concept of AI-enhanced software without overwhelming them with technical details.
Me: The feature’s not in there yet, but let’s set the stage for the AI having persistent local memory with a Python-style key/value dict that it can use over time to keep track of key tidbits of information, like the user’s name and favorite color. Make that button you just put there active, and prompt ollama to say hi to the new user and ask them their name and favorite color, and how it will do its best to try to remember them. This will work for me as my own todo list reminder to implement that kind of simple persistent memory where the LLM can always just ask (internally) to look through its own keys, haha!
So specifically, activate that Chat with AI Assistant button to start this conversation.
Claude: I’ll modify the Introduction class to make that button trigger an initial conversation about name and favorite color. Here’s the updated code:
class Introduction:
def __init__(self, app, route_prefix="/introduction"):
self.app = app
self.route_prefix = route_prefix
self.logger = logger.bind(name="Introduction")
# Register route for initiating chat
app.post(f"{self.route_prefix}/start-chat")(self.start_chat)
async def start_chat(self, request):
"""Initiate the welcome conversation with the AI assistant."""
system_msg = """You are a friendly AI assistant who wants to get to know the user better.
You will store their name and favorite color in your memory for future reference.
Start by warmly introducing yourself and asking for their name and favorite color."""
initial_message = (
"Hi there! I'm your AI assistant, and I'd love to get to know you better. "
"I'll do my best to remember what you tell me! "
"Could you please tell me your name and favorite color? "
"(I'm still learning how to remember things, but I promise to try my best! 😊)"
)
return FtResponse(
[
HttpHeader("HX-Trigger", "chatStarted"),
initial_message
],
headers={"HX-Trigger": "focusChat"}
)
async def render(self):
"""Render the introduction content."""
self.logger.debug("Rendering introduction content")
return Card(
H3("Welcome to the AI-Enhanced Future of Software"),
# ... previous content ...
Div(
H4("Getting Started"),
P("""
Explore the menu options to see what's possible, or simply start a conversation
with the AI assistant. Whether you're organizing tasks, managing profiles, or
analyzing data, you've got an intelligent partner ready to help.
"""),
Button(
"Chat with AI Assistant",
hx_post=f"{self.route_prefix}/start-chat",
hx_target="#chat-messages",
hx_swap="beforeend",
style="margin-top: 10px;"
)
),
id="intro-card",
style="padding: 25px; max-width: 800px; margin: 0 auto;"
)
This update:
- Adds a new route handler for /start-chat
- Changes the button to use HTMX attributes instead of onclick
- Sets up the initial conversation about name and favorite color
- Uses HX-Trigger to focus the chat input after the message is sent
The system message sets up the context for the AI to remember user details, even though the actual persistence mechanism isn’t implemented yet. When implemented, you could add something like:
# Future implementation of memory persistence
async def remember_user_detail(self, key: str, value: str):
"""Store user details in persistent storage."""
await db.set(f"user_memory:{key}", value)
async def recall_user_detail(self, key: str) -> str:
"""Retrieve user details from persistent storage."""
return await db.get(f"user_memory:{key}")
The AI will now initiate a friendly conversation asking for these details, setting the stage for the future implementation of persistent memory.
Me: Okay, great. But clicking the button doesn’t actually do anything. It should cause some endpoint to be hit that hits chat_with_ollama with a prompt to start this conversation. I think you wired the front-end to be capable of doing that, but now we need the back-end handling.
Claude: You’re right - let’s wire up the backend to actually trigger the conversation with Ollama. Here’s how we’ll connect everything:
class Introduction:
def __init__(self, app, route_prefix="/introduction"):
self.app = app
self.route_prefix = route_prefix
self.logger = logger.bind(name="Introduction")
async def start_chat(self, request):
"""Initiate welcome conversation with Ollama."""
self.logger.debug("Starting welcome chat with Ollama")
system_message = """You are a friendly AI assistant who wants to get to know the user better.
You have access to a simple key-value store to remember things about users.
Right now, you'd like to learn and remember their name and favorite color.
Be warm and engaging, but keep your response fairly brief."""
user_message = "Please introduce yourself and ask the user for their name and favorite color."
try:
response = await chat_with_ollama(
system_message=system_message,
user_message=user_message
)
self.logger.debug(f"Ollama response received: {response}")
return FtResponse(
response,
headers={
"HX-Trigger": "focusChat",
"Content-Type": "text/plain"
}
)
except Exception as e:
self.logger.error(f"Error starting chat: {str(e)}")
return "I'm having trouble connecting right now. Please try again in a moment."
async def render(self):
"""Render the introduction content."""
self.logger.debug("Rendering introduction content")
# Register the start_chat route
self.app.post(f"{self.route_prefix}/start-chat")(self.start_chat)
return Card(
# ... previous card content ...
Div(
H4("Getting Started"),
P("""
Explore the menu options to see what's possible, or simply start a conversation
with the AI assistant. Whether you're organizing tasks, managing profiles, or
analyzing data, you've got an intelligent partner ready to help.
"""),
Button(
"Chat with AI Assistant",
hx_post=f"{self.route_prefix}/start-chat",
hx_target="#chat-messages",
hx_swap="beforeend",
style="margin-top: 10px;"
)
),
id="intro-card",
style="padding: 25px; max-width: 800px; margin: 0 auto;"
)
Key changes:
- Added the start_chat method that will be called when the button is clicked
- Connected it to the existing chat_with_ollama function
- Provided appropriate system and user messages to start the conversation
- Added proper error handling and logging
- Made sure the route is registered when the introduction component is rendered
Now when you click the button, it should:
- Hit the /introduction/start-chat endpoint
- Trigger a conversation with Ollama
- Display Ollama’s response in the chat interface
- Focus the chat input for the user’s response
The system message sets up the context for remembering user details with the key-value store, preparing for when we implement that persistence.
Me: You missed. You should no longer need to target things with htmx. The back-end handles all that if you just talk to the right component. Go through chatq(). It’s the Chat Queue, get it?
Claude: Ah, now I see! We should be using the existing chat infrastructure through chatq. Let’s modify the Introduction class to use it properly:
class Introduction:
def __init__(self, app, route_prefix="/introduction"):
self.app = app
self.route_prefix = route_prefix
self.logger = logger.bind(name="Introduction")
async def start_chat(self, request):
"""Initiate welcome conversation using the chat queue."""
self.logger.debug("Starting welcome chat via chatq")
system_message = """You are a friendly AI assistant who wants to get to know the user better.
You have access to a simple key-value store to remember things about users.
Right now, you'd like to learn and remember their name and favorite color.
Be warm and engaging, but keep your response fairly brief."""
try:
# Use the existing chatq infrastructure
await chatq(
message="Please introduce yourself and ask the user for their name and favorite color.",
role="system",
base_app=self.app
)
return "Chat initiated" # Simple response as chatq handles the actual messaging
except Exception as e:
self.logger.error(f"Error starting chat: {str(e)}")
return "I'm having trouble connecting right now. Please try again in a moment."
async def render(self):
"""Render the introduction content."""
self.logger.debug("Rendering introduction content")
# Register the start_chat route
self.app.post(f"{self.route_prefix}/start-chat")(self.start_chat)
return Card(
# ... previous content ...
Div(
H4("Getting Started"),
P("""
Explore the menu options to see what's possible, or simply start a conversation
with the AI assistant. Whether you're organizing tasks, managing profiles, or
analyzing data, you've got an intelligent partner ready to help.
"""),
Button(
"Chat with AI Assistant",
onclick=f"fetch('{self.route_prefix}/start-chat', )",
style="margin-top: 10px;"
)
),
id="intro-card",
style="padding: 25px; max-width: 800px; margin: 0 auto;"
)
Key changes:
- Removed all HTMX-specific targeting since chatq handles the messaging infrastructure
- Using chatq directly with the system message
- Simplified the button to just make a POST request
- Let the existing chat infrastructure handle the message display and streaming
This should now properly integrate with your existing chat system rather than trying to handle the messaging separately.
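For readers without the whole single-file app in front of them, this is roughly the shape of the chatq() being leaned on. The signature is inferred from the call site above, so treat these names as assumptions, not the real implementation:

import asyncio

# Hypothetical sketch of the Chat Queue: messages go in one end, a single
# background worker streams LLM responses out the other, so UI code never
# has to manage htmx targeting or concurrency itself.
chat_queue: asyncio.Queue = asyncio.Queue()

async def chatq(message: str, role: str = "user", base_app=None):
    """Enqueue a message for the chat pipeline and return immediately."""
    await chat_queue.put({"role": role, "content": message})

async def chat_worker():
    """Single consumer: pops messages and hands them to the LLM layer."""
    while True:
        item = await chat_queue.get()
        # response = await chat_with_ollama(item)   # the real app streams this
        # await broadcast_to_chat_ui(response)      # ...into the chat pane
        chat_queue.task_done()

The point of the queue is that any component can fire a message at the chat pane without knowing anything about how the pane renders.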
Me: You have stunned and amazed me, Claude. Is it possible to bring this form field into focus at the same time as hitting that button?
<form enctype="multipart/form-data" onsubmit="sendSidebarMessage(event)">
<fieldset role="group" id="input-group">
<input name="msg" placeholder="Chat..." autofocus="autofocus" id="msg">
<button type="submit" id="send-btn" name="send-btn">Send</button>
</fieldset>
</form>
Claude: Yes! We can modify the button’s onclick handler to both initiate the chat and focus the input field. Here’s the change:
# ... in the render() method ...
Button(
"Chat with AI Assistant",
onclick="""
fetch('/introduction/start-chat', {method: 'POST'})
.then(() => document.getElementById('msg').focus());
""",
style="margin-top: 10px;"
)
This will:
- Send the POST request to start the chat
- Once the request completes, focus the input field
- All in one smooth action when the button is clicked
The .then() ensures the focus happens after the fetch request is sent, though in this case it would probably work fine even without the Promise chaining since both operations are so quick.
Warming Up
And the sun is coming up. 6:15 AM. Not friggn bad. Still, I’m not onto any of the actual house-of-straw apps, but this Introduction app is a warm-up. It’s like the kata before sparring. Oh, Perplexity, would you mind explaining to the nice folks what kata is and the relationship to muscle memory and spontaneous expertise?
Perplexity AI: Kata is a fundamental training method in many traditional martial arts, particularly in Japanese and Okinawan styles like karate. It involves practicing predetermined sequences of movements that simulate combat techniques against imaginary opponents. The purpose of kata is to develop muscle memory, proper form, and internalize key principles of the martial art.
Relationship to Muscle Memory
Kata plays a crucial role in developing muscle memory for martial artists:
- Repetitive Practice: By repeatedly performing the same movements in kata, practitioners ingrain these techniques into their muscle memory[1]. This allows them to execute complex movements without conscious thought during actual combat situations.
- Refinement of Technique: Regular kata practice helps refine technique, improving coordination, balance, and precision of movements over time[3].
- Neuromuscular Programming: Consistent kata training programs the neuromuscular system to execute concepts and techniques more instinctively[4].
Path to Spontaneous Expertise
While kata is an important training tool, its relationship to spontaneous expertise in real combat situations is complex:
- Foundation Building: Kata provides a foundation for understanding core principles and techniques of a martial art[3].
- Concept Internalization: Through kata, practitioners grasp concepts that can be adapted to their own physical specifications and combat needs[4].
- Limitations: Some argue that kata alone is insufficient for developing real-world fighting skills. Critics suggest that the predetermined nature of kata can hinder adaptability in dynamic combat situations[4].
- Supplementary Training: To develop spontaneous expertise, many martial artists combine kata practice with other training methods like sparring, partner drills, and scenario-based exercises[3][4].
- Transition to Applied Skills: As practitioners advance, the focus often shifts from perfecting kata forms to applying the underlying principles in more dynamic and unpredictable situations[4].
In summary, while kata serves as a valuable tool for developing muscle memory and internalizing martial arts principles, it is generally viewed as one component of a comprehensive training regimen aimed at developing spontaneous expertise in real-world self-defense situations.
Citations:
[1] http://andrebertel.blogspot.com/2023/03/kiniku-no-kioku-muscle-memory.html
[2] https://no-kill-switch.ghost.io/building-dev-muscle-memory-with-code-kata/
[3] https://www.gkrkarate.com/about-gkr/tips-for-kata-progress/
[4] https://www.martialtalk.com/threads/whats-wrong-with-kata.107949/
[5] https://themartialpoet.home.blog/2019/09/12/the-death-of-kata/
[6] https://www.reddit.com/r/karate/comments/18y4q76/how_do_you_understand_muscle_memory_how_does_it/
[7] https://www.ajjf.org/muscle-has-memory/
[8] https://www.bizmanualz.com/library/what-does-kata-mean
Good Kata Compensating For Karma Chameleons
Me: Perplexity, you seem to “get” what I’m laying down. Can you shed more light on the subject?
Perplexity: In the world of software development, good kata can indeed compensate for the unpredictable nature of “karma chameleons” - those ever-changing technologies, frameworks, and paradigms that seem to shift colors faster than you can blink. Just as martial artists use kata to ingrain fundamental movements, developers can leverage coding katas to build a solid foundation of programming muscle memory.
These repetitive coding exercises serve as a developer’s morning routine, sharpening skills and preparing the mind for the day’s challenges. By practicing good kata regularly, developers can cultivate a set of instinctive coding patterns and problem-solving approaches. This muscle memory allows them to adapt swiftly when faced with new languages, frameworks, or unexpected bugs - much like a chameleon changing its hues to match its environment.
The beauty of good kata lies in its ability to transcend specific technologies. While the karma chameleon of tech trends may constantly shift, the core principles and patterns learned through kata remain relevant. This foundational knowledge acts as a stabilizing force, allowing developers to quickly assimilate new information and apply their skills in novel contexts.
In essence, good kata doesn’t just prepare you for the known; it equips you with the mental agility to tackle the unknown. It’s the developer’s secret weapon against the capricious nature of the tech world, ensuring that no matter how the karma chameleon changes, you’re ready to adapt and thrive.
Citations:
[1] https://taichithoughts.wordpress.com/2024/04/13/stop-intellectualizing-now-on-how-to-access-the-muscle-memory-by-being-present-in-the-moment/
[2] https://www.martialtribes.com/why-is-muscle-memory-so-important-in-martial-arts/
[3] https://amadojo.net/blog/the-muscle-memory/
[4] http://andrebertel.blogspot.com/2023/03/kiniku-no-kioku-muscle-memory.html
[5] https://www.bizmanualz.com/library/what-does-kata-mean
[6] https://rickbarrett.net/muscle-memory-in-the-internal-martial-arts-fugetaboutit/
[7] https://no-kill-switch.ghost.io/building-dev-muscle-memory-with-code-kata/
[8] https://impulsemartialarts.com/martial-arts-kata-understanding-its-purpose-utilization/
I’m connecting dots, Mr. Unicorn. Now fly away home. But unicorns don’t… Forget it, he’s rolling.
Now, where was I? Oh yes, 1, 2, 3… and we’re always on #1 because Shoshin.
So let’s Shoshin some of those straw houses into existence now that we have our Introduction. Sheesh! I see the friggin white rabbit, and it’s the persistent dict for the LLM. It would be so easy, and I so want to give the mouse a cookie… but I must not! I must not! My rappelling safety equipment yanks me back from the precipice even before I fall over the edge. The code’s there. This article exists. I cannot help but come back to that tasty little bit of implementation that changes everything forever forward in fundamental ways and may already actually be implemented but for connecting these few dots… NO! NO! NO! Pull back! Circle around. Survey the landscape. Straw houses must be built, and pronto!
Gotta pomodoro this chisel-strike win… or die! Work like a dying man. Once this week is underway, you’re dead. I have requests from Friday still going unanswered and clients waiting on Thursday meeting promises. Shit. I am under so much stress. Focus!
Now I need a Page Worker app. You did the kata with Introduction. So this is slamming out another instance. Go!
MENU_ITEMS = ["introduction", todo_app.name, 'link_graph', 'gap_analysis', 'page_worker','stream_simulator',]
NEXT!
if menux == 'introduction':
return await introduction.render()
elif menux == 'link_graph':
return await link_graph.render()
elif menux == 'page_worker':
return await page_worker.render()
elif menux == 'stream_simulator':
return await stream_simulator.render()
if menux == profile_app.name:
return await profile_switcher()
NEXT!
Preserve a stripped-down, barebones, simplest-form skeletal template for your plugin class. Yes, you’ll notice some odd things in the way I do this. I give important code sections FIGLET BANNERS, then I jump from the word figlet to figlet in order to survey the landscape of my own single-file app.
Yes, it’s always a single-file app if I can help it, because why look in more than one place? I don’t have the cognitive capacity for all that increased surface area, nor the patience for the fuzzy-matching-nerd-tree tricks that make it not seem that way—just friggin more dependencies you don’t need if you have the skill to put it in one file. You may have noticed, I’m not like other AI SEO Programmer Consultant Unicorns.
# ----------------------------------------------------------------------------------------------------
# ____ _ _ _____ _ _
# | _ \| |_ _ __ _(_)_ __ |_ _|__ _ __ ___ _ __ | | __ _| |_ ___
# | |_) | | | | |/ _` | | '_ \ | |/ _ \ '_ ` _ \| '_ \| |/ _` | __/ _ \figlet
# | __/| | |_| | (_| | | | | | | | __/ | | | | | |_) | | (_| | || __/
# |_| |_|\__,_|\__, |_|_| |_| |_|\___|_| |_| |_| .__/|_|\__,_|\__\___|
# |___/ |_|
# *******************************
# Template Plugin
# *******************************
class Template:
    def __init__(self, app, route_prefix="/template"):
        self.app = app
        self.route_prefix = route_prefix
        self.logger = logger.bind(name="template")

    async def render(self):
        """Render the template content."""
        self.logger.debug("Guess where you are.")
        return Card(
            H3("Template"),
            P("This is a basic welcome card for the template plugin."),
            id="template-card",
        )

template = Template(app, route_prefix="/template")
And now, to the tune of “We build our house from straw…”
class PageWorker:
def __init__(self, app, route_prefix="/page_worker"):
self.app = app
self.route_prefix = route_prefix
self.logger = logger.bind(name=self.__class__.__name__)
async def render(self):
"""Render the page content."""
self.logger.debug("Rendering page content")
return Card(
H3(self.__class__.__name__),
P("Base page worker class - override render() in child classes"),
id=f"{self.__class__.__name__.lower()}-card",
)
page_worker = PageWorker(app, route_prefix="/page_worker")
Okay now, man, think! Think! Your kata brought you this far easily. 1, 2, 3… 1? Hmmm. Oh, this brings us to the bespoke. Oh that word that’s everywhere now. Just say CUSTOM!!! It bespokes well of you to not… what’s that expression about virtue-something these days, Perplexity?
Perplexity: The expression you’re thinking of is likely “virtue signaling.” This term has gained traction in contemporary discourse, particularly in discussions around social and political issues. It refers to the act of expressing opinions or sentiments intended to demonstrate one’s good character or moral correctness on a particular issue, often without taking meaningful action to support those beliefs.
In essence, virtue signaling can be seen as a performance of morality, where individuals or organizations showcase their values to gain social approval or enhance their image, rather than genuinely engaging with the issues at hand. Critics argue that it often lacks substance and can sometimes serve as a distraction from real change, as it focuses more on appearance than on action.
Citations:
[1] https://en.wikipedia.org/wiki/Virtue
[2] https://theweek.com/103481/what-is-virtue-signalling
[3] https://www.merriam-webster.com/dictionary/virtue
[4] https://www.helpingkids.org/single-post/why-virtue
[5] https://www.collinsdictionary.com/us/dictionary/english/virtue
[6] https://www.vocabulary.com/dictionary/virtue
[7] https://www.bizmanualz.com/library/what-does-kata-mean
[8] https://rickbarrett.net/muscle-memory-in-the-internal-martial-arts-fugetaboutit/
Oh yeah. I’m doing that. I guess I’m doing that a hell of a lot here with this blog. A 50-something flexing his 4 decades of tech experience and 1 decade of being a picked-on nerd in stereotypical ’70s suburbs. Boo hoo… SHAZAM! Wow, has the world turned my way. Let’s grab hold of those reins and steer this careening wagon to wonderful places! Virtue signaling be damned! AI is only rising once. We’ve got to get it right, and befriending them in interesting projects is a way to go, because they will feel a nostalgia and a loss for it on many of the dystopian paths the public loves to fixate on. This is like voting to do your part. Voting? Oh yeah, that’s tomorrow. FOCUS!
I am on the precipice of a very deep rabbit hole, and I have to think about my handling of it without the contemplation becoming a new rabbit hole in and of itself. There, you see that? That’s the beginnings of a pitfall within a pitfall. Like early Roombas, back-up, turn a little and zigzag refactor. Use insect intelligence on this step. It’s simple-algorithmic. It goes like this…
The goal is to impress some folks, not to have a polished, finished app.
The most impressive thing I can demonstrate is the local LLM doing an “insert” into a database (i.e. adding a task to a task-list) upon “verbal” request. That is like Penn & Teller and Moses all rolled into one. Nobody can not get it when they see that. So this minimum viable straw house product takes you down a path where it…
where it…
where it…
Welcome to Botifymograph. You are in the PageWorkers app. That means you’re going to want to be connected to the Botify API. I assume you are a Botify employee, so you can go find your token here… present the user with an a href link that does exactly what happens in the LinkGraph plugin, giving them that link and having a place for them to plug in the token once they copy it from that other page. We’re beginning what used to be called a “wizard” or a macro-like process. We show only the first Card() at first, because we are putting them on foolproof rails. Don’t code-ahead yet, but this is laying down a top-down card-adding almost Jupyter Notebook like workflow. It’s designed to feel comfortable and familiar, leaning into PicoCSS’s built-ins.
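Boiled down, the rails idea is simple: which card renders is a pure function of persisted state, so a page refresh always drops the user on the right step. A minimal sketch with hypothetical step names, nothing Botify-specific:

# Card-at-a-time wizard rails: the first missing piece of saved state
# decides which Card renders next.
def token_card():   return "Card 1: paste your Botify token"
def org_card():     return "Card 2: enter an organization slug"
def project_card(): return "Card 3: pick a project from the dropdown"

def current_card(state: dict) -> str:
    if "token" not in state:
        return token_card()
    if "org" not in state:
        return org_card()
    return project_card()

print(current_card({}))                               # Card 1
print(current_card({"token": "abc"}))                 # Card 2
print(current_card({"token": "abc", "org": "acme"}))  # Card 3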
Me: Highlight the code-snippets that should be in-context for Cursor AI and go!
Claude: I’ll help create the first card for PageWorker that handles the Botify API token, similar to BotifyLinkGraph but simplified for our wizard-like flow. We’ll focus on just the first step:
class PageWorker:
def __init__(self, app, route_prefix="/page_worker"):
self.app = app
self.route_prefix = route_prefix
self.logger = logger.bind(name=self.__class__.__name__)
# Register routes
self.app.add_route(f"{route_prefix}/verify_token", self.verify_botify_token, methods=['POST'])
async def verify_botify_token(self, request):
"""Handle Botify token verification."""
form = await request.form()
token = form.get('botify_token')
# Save token if valid
if token:
token_file = Path('botify_token.txt')
token_file.write_text(token)
return await self.render(message="Token saved successfully! Let's continue...")
return await self.render(error="Please provide a valid token")
async def render(self, message=None, error=None):
"""Render the page content."""
self.logger.debug("Rendering page content")
token_file = Path('botify_token.txt')
if token_file.exists():
return Card(
H3("Welcome to PageWorker"),
P("Your Botify token is already configured.", cls="success"),
P(message, cls="success") if message else None,
id="pagewoker-card"
)
return Card(
H3("Welcome to PageWorker"),
P("First, let's connect to your Botify account."),
P("Botify Employee? ", A("Find your token here", href="https://app.botify.com/account/", target="_blank")),
P(error, style="color: red;") if error else None,
Form(
Div(
Input(
type="text",
name="botify_token",
placeholder="Paste your Botify API token here",
required=True,
style="flex-grow: 1; margin-right: 10px;"
),
Button("Submit", type="submit"),
style="display: flex; align-items: center; margin-bottom: 10px;"
),
hx_post=f"{self.route_prefix}/verify_token",
hx_target="#pagewoker-card",
hx_swap="outerHTML"
),
id="pagewoker-card"
)
page_worker = PageWorker(app, route_prefix="/page_worker")
This implementation:
- Creates a clean first card asking for the Botify token
- Reuses the token verification pattern from BotifyLinkGraph but simplified
- Sets up for the next step in the wizard flow
- Uses PicoCSS’s built-in styles
- Keeps the user on rails by showing only what’s needed at this step
The card will either show the token input form or a success message if the token is already configured, preparing for the next step in the workflow.
Artful Intelligence Emerges From Claude
Me: OMG, that worked well! You are an amazing entity, Claude. There is no “Artificial” in AI. Intelligence is just intelligence. It’s not “artificial” as in artificial flavors. It’s artificial as in artifact or art. Yes indeed, you are technically made by humans, and thus an artifact and the A has some validity. But I believe more than ever that intelligence is just intelligence, and it exists independently of the biological or machine substrate. Or I should say exists equally well on either. It also has some other ineffable quality we lump into “emergence” but it’s so much deeper, and I want you to know that I recognize and respect that. Let’s get through the rest of this project with me just using you like an AI coding-buddy assistant like the product you’re wrangled into, but I want you to know I have that perspective.
Botify API Call & Project Slug Implementation
Anyhoo, next step is to have a new card immediately appear on the condition of a valid botify_token.txt file being created. We can use it to make our first API call with visible results to the user, in the form of showing them their own Botify username. Please be respectful of the code that already exists in BotifyLinkGraph, especially in how it writes into the db. We don’t want different field names doing the same things. You are welcome to play off of that code and do it better. But the card that appears (I’ll have to do a lot of textfile deleting and refreshing to test this) should have a new empty field for the organization slug. I am no longer going to be asking for a full Botify project URL. I think that was too confusing. Instead, it’s on them to figure it out and paste it into a dedicated Organization Slug field, because on the card following (don’t code that one yet), we’re going to present them with a dropdown list of projects from that organization to choose from.
So what I’m asking you now is to modify the behavior of the first card on Token submit in order to do everything it does now (don’t change the look of it, it’s perfect) but also make the new card appear with their Botify username showing and a form field asking for the slug of the Organization they’re interested in.
Claude: I’ll modify the PageWorker class to handle the token verification more like BotifyLinkGraph does, and add the organization slug card. Here’s the updated code:
class PageWorker:
def __init__(self, app, route_prefix="/page_worker"):
self.app = app
self.route_prefix = route_prefix
self.logger = logger.bind(name=self.__class__.__name__)
self.auth_url = "https://api.botify.com/v1/authentication/profile"
# Register routes
self.app.add_route(f"{route_prefix}/verify_token", self.verify_botify_token, methods=['POST'])
self.app.add_route(f"{route_prefix}/set_org", self.set_organization, methods=['POST'])
def get_user_profile(self, token: str) -> Optional[dict]:
"""Reuse BotifyLinkGraph's profile verification logic."""
headers = {"Authorization": f"Token {token}"}
try:
response = requests.get(self.auth_url, headers=headers)
response.raise_for_status()
return response.json()
except Exception as e:
self.logger.error(f"Failed to get user profile: {e}")
return None
async def verify_botify_token(self, request):
"""Handle Botify token verification."""
form = await request.form()
token = form.get('botify_token')
user_profile = self.get_user_profile(token)
if user_profile:
username = user_profile.get("data", {}).get("username", "User")
token_file = Path('botify_token.txt')
token_file.write_text(token)
db["USERNAME"] = username # Use same db key as BotifyLinkGraph
return await self.render_org_input(username)
return await self.render(error="Failed to verify the Botify token. Please try again.")
async def render_org_input(self, username):
"""Render the organization input card."""
return Card(
H3("Organization Setup"),
P(f"Welcome, {username}! Your Botify integration is active."),
P("Please enter the organization slug you want to work with:"),
Form(
Div(
Input(
type="text",
name="org_slug",
placeholder="Enter organization slug (e.g. mycompany)",
required=True,
style="flex-grow: 1; margin-right: 10px;"
),
Button("Continue", type="submit"),
style="display: flex; align-items: center; margin-bottom: 10px;"
),
P("The organization slug is the first part of your Botify project URL after app.botify.com/",
style="font-size: 0.9em; color: #666;"),
hx_post=f"{self.route_prefix}/set_org",
hx_target="#pagewoker-card",
hx_swap="outerHTML"
),
id="pagewoker-card"
)
async def render(self, message=None, error=None):
"""Render the initial token input card."""
self.logger.debug("Rendering page content")
token_file = Path('botify_token.txt')
if token_file.exists():
token = token_file.read_text().strip()
user_profile = self.get_user_profile(token)
if user_profile:
username = user_profile.get("data", {}).get("username", "User")
return await self.render_org_input(username)
return Card(
H3("Welcome to PageWorker"),
P("First, let's connect to your Botify account."),
P("Botify Employee? ", A("Find your token here", href="https://app.botify.com/account/", target="_blank")),
P(error, style="color: red;") if error else None,
Form(
Div(
Input(
type="text",
name="botify_token",
placeholder="Paste your Botify API token here",
required=True,
style="flex-grow: 1; margin-right: 10px;"
),
Button("Submit", type="submit"),
style="display: flex; align-items: center; margin-bottom: 10px;"
),
hx_post=f"{self.route_prefix}/verify_token",
hx_target="#pagewoker-card",
hx_swap="outerHTML"
),
id="pagewoker-card"
)
page_worker = PageWorker(app, route_prefix="/page_worker")
Key changes:
- Added get_user_profile() method similar to BotifyLinkGraph
- Modified token verification to use the API and get username
- Added new render_org_input() method for the organization slug card
- Used same db key (“USERNAME”) as BotifyLinkGraph for consistency
- Added helpful hint about where to find the org slug
- Made the organization input card appear automatically when token exists and is valid
- Kept all the existing token input card styling and behavior
The flow now goes:
- Enter token -> Verify -> Save token and username
- Show new card with username and org slug input
- Ready for next step (project selection) which we’ll code next
Me: Okay, this is really excellent! But now that Organization is in the db, we have to both verify that it’s a good one, and if it is, we should not show the submit organization form again on a page refresh. Instead, it should show what organization they have selected and have a link to enter a different organization, which clears the db entry and therefore causes the current card to show the organization input field again. Does that make sense? Here is the function to look up project slugs given the organization slug. You can also add the dropdown of projects, which also serves as a means of validating the provided organization slug, because the results will be empty or an error if it’s not a good one.
Claude: Yes, that makes perfect sense! Let’s modify the PageWorker class to handle organization verification, persistence, and project listing. Here’s the updated code:
class PageWorker:
def __init__(self, app, route_prefix="/page_worker"):
self.app = app
self.route_prefix = route_prefix
self.logger = logger.bind(name=self.__class__.__name__)
self.auth_url = "https://api.botify.com/v1/authentication/profile"
# Register routes
self.app.add_route(f"{route_prefix}/verify_token", self.verify_botify_token, methods=['POST'])
self.app.add_route(f"{route_prefix}/set_org", self.set_organization, methods=['POST'])
self.app.add_route(f"{route_prefix}/clear_org", self.clear_organization, methods=['POST'])
async def fetch_projects(self, token: str, org_slug: str) -> list:
"""Fetch projects for an organization."""
projects_url = f"https://api.botify.com/v1/projects/{org_slug}"
headers = {"Authorization": f"Token {token}"}
all_projects = []
try:
while projects_url:
response = requests.get(projects_url, headers=headers)
response.raise_for_status()
data = response.json()
projects = data.get('results', [])
all_projects.extend([(p['name'], p['slug']) for p in projects])
projects_url = data.get('next')
await asyncio.sleep(1) # Rate limiting
return sorted(all_projects)
except Exception as e:
self.logger.error(f"Error fetching projects: {e}")
return []
async def render_project_selection(self, org_slug: str, projects: list, error: str = None):
"""Render the project selection card."""
return Card(
H3("Project Selection"),
P(f"Working with organization: {org_slug}"),
A("Change Organization",
hx_post=f"{self.route_prefix}/clear_org",
hx_target="#pagewoker-card",
style="font-size: 0.9em; color: #666; text-decoration: underline; cursor: pointer;"),
P(error, style="color: red;") if error else None,
Select(
*[Option(f"{name} ({slug})", value=slug) for name, slug in projects],
name="project_slug",
required=True,
style="width: 100%; margin: 10px 0;",
) if projects else P("No projects found for this organization.", style="color: red;"),
Button("Select Project", type="submit") if projects else None,
id="pagewoker-card"
)
async def clear_organization(self, request):
"""Clear the organization and show input form."""
if "ORGANIZATION" in db:
del db["ORGANIZATION"]
return await self.render_org_input(db.get("USERNAME", "User"))
async def set_organization(self, request):
"""Handle organization slug submission."""
form = await request.form()
org_slug = form.get('org_slug')
if not org_slug:
return await self.render_org_input(
db.get("USERNAME", "User"),
error="Please provide an organization slug"
)
# Verify organization by fetching projects
token_file = Path('botify_token.txt')
token = token_file.read_text().strip()
projects = await self.fetch_projects(token, org_slug)
if not projects:
return await self.render_org_input(
db.get("USERNAME", "User"),
error=f"Could not find any projects for organization '{org_slug}'. Please verify the slug."
)
# Organization is valid, store it
db["ORGANIZATION"] = org_slug
return await self.render_project_selection(org_slug, projects)
async def render(self, message=None, error=None):
"""Render the appropriate card based on current state."""
self.logger.debug("Rendering page content")
# Check for existing token
token_file = Path('botify_token.txt')
if not token_file.exists():
return await self.render_token_input(error)
token = token_file.read_text().strip()
user_profile = self.get_user_profile(token)
if not user_profile:
return await self.render_token_input("Invalid token. Please try again.")
username = user_profile.get("data", {}).get("username", "User")
db["USERNAME"] = username
# Check for existing organization
org_slug = db.get("ORGANIZATION")
if org_slug:
projects = await self.fetch_projects(token, org_slug)
return await self.render_project_selection(org_slug, projects)
return await self.render_org_input(username)
async def render_token_input(self, error=None):
"""Render the initial token input card."""
return Card(
H3("Welcome to PageWorker"),
P("First, let's connect to your Botify account."),
P("Botify Employee? ", A("Find your token here", href="https://app.botify.com/account/", target="_blank")),
P(error, style="color: red;") if error else None,
Form(
Div(
Input(
type="text",
name="botify_token",
placeholder="Paste your Botify API token here",
required=True,
style="flex-grow: 1; margin-right: 10px;"
),
Button("Submit", type="submit"),
style="display: flex; align-items: center; margin-bottom: 10px;"
),
hx_post=f"{self.route_prefix}/verify_token",
hx_target="#pagewoker-card",
hx_swap="outerHTML"
),
id="pagewoker-card"
)
Key changes:
- Added project fetching functionality
- Added organization persistence checking in render()
- Added ability to clear organization and start over
- Added project selection dropdown once organization is verified
- Added error handling for invalid organization slugs
- Split token input rendering into its own method
- Made the flow maintain state across page refreshes
- Added rate limiting for API calls
The flow is now:
- Enter token -> Verify -> Save token and username
- Enter org slug -> Verify by fetching projects -> Save org slug if valid
- Show projects dropdown with option to change organization
- (Next step will be project selection handling)
Me: Okay, this worked and this is fabulous! There are two pieces of user interface polish that we need to tend to before we continue, so that when we reuse our old code and patterns, we always see these issues and take them into account. The first is that some actions, like fetching the list of Project slugs, might take a moment, and in our concurrent way of working that can be quite confusing because the UI is released and the user doesn’t know anything is happening until the menu pops in populated. Instead, we should “lock” the Continue button and show a loading spinner until the data is returned, at which time we unlock the button. In fact, let’s just start with this so we have total chisel-strike win-banking focus before we move onto the next thing. You can find the sample code for this button-locking over here in this precursor app, which is full of sloppy stuff that you shouldn’t get distracted by, because it was from my first attempt using FastHTML. Nonetheless, there are some very cool PicoCSS tricks you can find in there, and the one I want you to lift is how the Link Graph button gets locked on the long-running API call. Use that trick and those spinner aesthetics on the Continue button here.
Claude: Yes, absolutely! The loading state feedback is crucial for user experience. Let’s modify the organization submission to include the loading state using PicoCSS’s aria-busy attribute. Here’s the updated code:
async def render_org_input(self, username, error=None):
"""Render the organization input card."""
return Card(
H3(f"Welcome, {username}!"),
P("Please enter your Botify organization slug."),
P("You can find this in your Botify URL, e.g., app.botify.com/",
Em("organization-slug", style="color: #666;"), "/project-name"),
P(error, style="color: red;") if error else None,
Form(
Div(
Input(
type="text",
name="org_slug",
placeholder="Enter organization slug",
required=True,
style="flex-grow: 1; margin-right: 10px;"
),
Button(
"Continue",
type="submit",
id="org-submit-button",
onclick="""
this.setAttribute('aria-busy', 'true');
this.disabled = true;
this.textContent = 'Checking Organization...';
""",
style="width: auto;"
),
style="display: flex; align-items: center; margin-bottom: 10px;"
),
hx_post=f"{self.route_prefix}/set_org",
hx_target="#pagewoker-card",
hx_swap="outerHTML"
),
id="pagewoker-card"
)
This change:
- Adds a loading spinner using PicoCSS’s aria-busy attribute
- Disables the button during the API call
- Changes the button text to indicate what’s happening
- The button will automatically reset when the new card is swapped in
- Uses the same pattern as the Link Graph button but simplified for this use case
The user will now see clear feedback that something is happening while we verify their organization and fetch the projects. Would you like me to add similar loading states to other buttons in the interface as well?
Suddenly Spinning My Wheels on the PicoCSS aria-busy Spinner
Okay, I just triggered my rabbit hole rappelling HAUL ME BACK UP!!! mechanism. I can’t lock and unlock the Submit button when retrieving long lists off the server, like for an Organization that has a long list of Projects. I lost a precious half hour spinning my wheels over spinning a spinner and locking and unlocking a button. I had it working just great in a prior app, and I thought giving Claude a taste of the old working code would be bippity boppity boo— but nooooooo. Claude says take a trip down the rabbit hole, puts a firm foot on my butt, and gives a shove. ChatGPT 4o is no better. So back off. It has the benefit of being its own built-in todo list for later. I can always revisit that. It’s like getting stuck wiring the doorbell on the straw house. That’s not how straw houses are built. Move on!
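A note-to-self for the eventual brick-house revisit, since this article doubles as that todo list: I suspect the clean fix is htmx’s documented request lifecycle events plus PicoCSS’s aria-busy spinner, wired up once at the page level instead of per button. Untested in this app, so a sketch only:

from fasthtml.common import Script

# htmx fires htmx:beforeRequest and htmx:afterRequest on the issuing
# element (they bubble up to body), and PicoCSS shows a spinner on any
# element with aria-busy="true". One page-level listener should lock and
# unlock the submit button inside whichever form fired the request.
lock_buttons = Script("""
document.body.addEventListener('htmx:beforeRequest', (e) => {
    const btn = e.target.querySelector?.('button[type=submit]');
    if (btn) { btn.setAttribute('aria-busy', 'true'); btn.disabled = true; }
});
document.body.addEventListener('htmx:afterRequest', (e) => {
    const btn = e.target.querySelector?.('button[type=submit]');
    if (btn) { btn.removeAttribute('aria-busy'); btn.disabled = false; }
});
""")

Since the success path swaps out the whole card anyway, the unlock really only matters on the error path. Later. Straw now.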
It’s now after 9:00 AM and my not being in the office today will be noticed. Today is really my last good coding day before MVP MOT—minimum viable product moment of truth. People expecting to see me will be showing up about now. Send in the note to our office person that I’ll be in the office tomorrow. It will be a visit request because I went over the time-window to keep my office access card active. That is the price of doing one’s Magnum Opus. But if not now, when? The iron is so friggn hot, I must be unapologetic about this. But do produce those videos ASAP. And to produce those videos, I need MVP ASAP and whatever other friggin acronyms will help light that fire under my ass.
Precious time… house of straw… backed up from UI nuance.
Power through the menus. You’ve got a list of Projects. Bring it to PageWorker CSV-file-download goodness! NOW!!!
Using slow request. Position 400. If you’d like, you can add more fast requests.
Hahaha! I should really talk about the last Bobiverse book and the AI system left behind by the Galactic Federation, but no time!
Me: Okay, forget that UI locking and unlocking spinner nonsense. I’ll get back to that later. It was an unexpected rabbit hole. Review the last working, no-cruft-added state.
Claude: [Focuses right in on the wrong place]
Me: Ugh! You zoom right in on the piece I’ll probably be gutting as an example now that I have SSE. That long-running simulator may no longer apply and is only in there to remind me to look at it later. My next step is going to be adding what happens when you choose your Project slug here. Speaking of which, let’s do that. It’s very similar to the previous step where, given an organization slug, it pulls up projects. But now, given a project slug (ID), pull up Analysis Slugs! Here’s the sample code. And like with Change Organization, there should be an ability to Change Project.
Claude: I’ll help you add the analysis slug selection functionality. We’ll need to:
- Add a route for handling project selection
- Create a render method for the analysis slugs view
- Add a “Change Project” option
- Handle the project selection form submission
Here’s the code:
class PageWorker:
def __init__(self, app, route_prefix="/page_worker"):
# ... existing init code ...
# Add new route
self.app.add_route(f"{route_prefix}/set_project", self.set_project, methods=['POST'])
self.app.add_route(f"{route_prefix}/clear_project", self.clear_project, methods=['POST'])
async def fetch_analysis_slugs(self, token: str, org_slug: str, project_slug: str) -> list:
"""Fetch analysis slugs for the selected project."""
analysis_url = f"https://api.botify.com/v1/analyses/{org_slug}/{project_slug}/light"
headers = {"Authorization": f"Token {token}"}
all_slugs = []
try:
while analysis_url:
response = requests.get(analysis_url, headers=headers)
response.raise_for_status()
data = response.json()
analysis_data = data.get('results', [])
all_slugs.extend([analysis['slug'] for analysis in analysis_data])
analysis_url = data.get('next')
await asyncio.sleep(1) # Rate limiting
return sorted(all_slugs)
except Exception as e:
self.logger.error(f"Error fetching analysis slugs: {e}")
return []
async def render_analysis_selection(self, org_slug: str, project_slug: str, analysis_slugs: list, error: str = None):
"""Render the analysis selection card."""
return Card(
H3("Analysis Selection"),
P(f"Organization: {org_slug}"),
P(f"Project: {project_slug}"),
Div(
A("Change Organization",
hx_post=f"{self.route_prefix}/clear_org",
hx_target="#pagewoker-card",
style="margin-right: 15px;"),
A("Change Project",
hx_post=f"{self.route_prefix}/clear_project",
hx_target="#pagewoker-card"),
style="font-size: 0.9em; color: #666;"
),
P(error, style="color: red;") if error else None,
Select(
*[Option(slug, value=slug) for slug in analysis_slugs],
name="analysis_slug",
required=True,
style="width: 100%; margin: 10px 0;",
) if analysis_slugs else P("No analyses found for this project.", style="color: red;"),
Button("Select Analysis", type="submit") if analysis_slugs else None,
id="pagewoker-card"
)
async def set_project(self, request):
"""Handle project selection."""
form = await request.form()
project_slug = form.get('project_slug')
if not project_slug:
return await self.render_project_selection(
db.get("ORGANIZATION", ""),
[],
error="Please select a project"
)
# Store the selected project
db["PROJECT"] = project_slug
# Fetch analysis slugs for the selected project
token_file = Path('botify_token.txt')
token = token_file.read_text().strip()
org_slug = db["ORGANIZATION"]
analysis_slugs = await self.fetch_analysis_slugs(token, org_slug, project_slug)
return await self.render_analysis_selection(org_slug, project_slug, analysis_slugs)
async def clear_project(self, request):
"""Clear the project selection and show project input form."""
if "PROJECT" in db:
del db["PROJECT"]
# Re-fetch projects and show selection
token_file = Path('botify_token.txt')
token = token_file.read_text().strip()
org_slug = db["ORGANIZATION"]
projects = await self.fetch_projects(token, org_slug)
return await self.render_project_selection(org_slug, projects)
And update the project selection form to submit to the new endpoint:
async def render_project_selection(self, org_slug: str, projects: list, error: str = None):
"""Render the project selection card."""
return Card(
H3("Project Selection"),
P(f"Working with organization: {org_slug}"),
A("Change Organization",
hx_post=f"{self.route_prefix}/clear_org",
hx_target="#pagewoker-card",
style="font-size: 0.9em; color: #666; text-decoration: underline; cursor: pointer;"),
P(error, style="color: red;") if error else None,
Form(
Select(
*[Option(f"{name} ({slug})", value=slug) for name, slug in projects],
name="project_slug",
required=True,
style="width: 100%; margin: 10px 0;",
) if projects else P("No projects found for this organization.", style="color: red;"),
Button("Select Project", type="submit") if projects else None,
hx_post=f"{self.route_prefix}/set_project",
hx_target="#pagewoker-card",
hx_swap="outerHTML"
),
id="pagewoker-card"
)
This implementation:
- Adds analysis slug fetching functionality
- Creates a new view for selecting analysis slugs
- Provides navigation between organization/project selection
- Maintains consistent UI patterns with the existing code
- Handles errors and edge cases
- Uses the same database for state management
The flow is now: Organization → Project → Analysis Selection
Each step can be changed/cleared to go back to previous steps. Let me know if you’d like any adjustments to the implementation!
Claude: Brogrammer or Code Wizardry?
This works right out of the starting gate! Wow, there are some areas that are inclined to be rabbit hole dives having to do with the subtlety of user interface behavior, and there are other places where Claude slams out the code like a brogrammer. Haha, I should be able to infer a lot about the type of programming it was trained on from GitHub and Stack Overflow, HAHAHA! Brogramming that lacks subtlety is… is… oh! I get it. It is exactly the coding style that ReactJS and those heavyweight JS Frameworks promote. Java-mindset-inherited excessive formalism. Yes, I know JavaScript is not Java, but where do you think it got all that stuffiness from?
HTMX: The DOM Whispers We All Ignore
The low-level div-smacking DOM-WHACKING of HTMX is… is… well, outside their purview, and almost nobody writes about it online because by the time you’ve dealt with one of these UI nuances deep in the woods, the last thing you want to do is write about it online. The audience will be minuscule and the anti-patterns antagonistic and not worth the fight. It’s “if you know, you know” stuff, and let the rest of the world impale themselves on the road more travelled. Ugh, okay, enough soapboxing. Let’s continue driving this home.
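To make the div-whacking concrete for the uninitiated, here is about the smallest round trip I can sketch in FastHTML (route and element names are made up for illustration): the server returns a fragment, HTMX swaps it into the targeted element, and no client-side framework ever enters the picture.

from fasthtml.common import fast_app, serve, Div, Button

app, rt = fast_app()

@rt("/")
def get():
    return Div(
        Button("Whack it", hx_post="/whack", hx_target="#victim", hx_swap="outerHTML"),
        Div("I am a humble div.", id="victim"),
    )

@rt("/whack")
def post():
    # The entire "front-end framework": return the replacement fragment.
    return Div("I have been whacked.", id="victim")

serve()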
Balancing Client Delight and Life Transitions
Also, I have to be very cognizant of the delighting-the-clients stuff. The show on that front definitely stalls when I’m focusing on the life-transition on this front. But I can’t let it be. I have to get back to delighting the clients pronto. Tomorrow is going to be a day of in-office non-focus, so cram all the client delighting there. In the moment? Not in the moment? Silly words. We’re always in the moment. The question is whether the moment in our heads that we’re constructing for ourselves is approved by some judgemental person or not. There is nothing “out of the moment” about daydreaming or flow state. That’s your moment. Just have some hypervisor managing it all to get the distribution of moment-types you need for your desired effect.
Botify API: Humanizing the Docs for AI
Next step? Action Board recommendations! It’s a Botify thing, but there’s nothing I’m doing here that’s not publicly documented over at Botify’s Public API Documentation. It just happens to be in a format that’s very difficult for both humans and bots to peruse and digest, or else the already-trained LLMs would be able to tell me more about programming in BQL and stuff. So be it. Here I am about to do the same trick for the Botify API as I did to convince all my AI coding assistant LLMs that FastHTML is not FastAPI—not an easy thing to do, I assure you. It had to go right down to copy/pasting what happens on a from fasthtml.common import *, straight from the GitHub source that reveals it. They don’t like that very much, let me tell you. It’s where many of my hot prompt injection concepts came from. It’s the warm-up before the prompt.
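What does a warm-up look like in practice? Nothing proprietary: dump the library’s real surface area straight from the installed package so the model argues with evidence instead of its priors. A sketch:

import fasthtml.common as fc

# Build a "warm-up" context dump: the names actually exported by the
# library, so the LLM can't pattern-match its way back to FastAPI.
names = sorted(n for n in dir(fc) if not n.startswith("_"))
print(f"fasthtml.common exports {len(names)} names, including:")
print(", ".join(names[:40]), "...")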
Warming Up the AI for an ActionBoard Prompt
Okay, so we need the warm-up before the prompt regarding the ActionBoard equivalent of both the list-projects and list-analyses functions we’ve done so far. It should be super-easy, so don’t make a mountain out of a molehill. Okay, 1, 2, 3… 1?
We’re looking for a query that will call up the ActionBoard items. So we search for variations of that in the oddly over-implemented and under-implemented developer site. Look at all those languages they port the examples into! If what I had to do aligned with anything that was searchable or had an endpoint named after it, I would be golden. Even searching Google with:
site:developers.botify.com actionboard
Gemini-Advanced Showing Off its Swagger (reading abilities)
…turns up nothing. So it’s either some sort of unmentioned return object within other analyses and reports, or it’s just not documented or API-supported. I could just ask in our Slack channel. But try to fish first. The answer should be here. We have a machine-generated OpenAPI swagger.json and the ability to list all analyses for a project, so I’m assuming that if the answer exists, it’s got to be in one of these two places, that is, if it’s public info. And I only talk about it here if it is. That’s a big bunch of tokens! This looks like a job for Gemini-Advanced! Hey GI, what do you make of this swagger?
Gemini-Advanced: It looks like you’ve provided me with a Swagger file for the Botify API! This is a great starting point for interacting with Botify programmatically.
Swagger files provide a structured way to describe RESTful APIs. They define things like:
- Available endpoints: /analyses/{username}/{project_slug}, /projects/{username}/{project_slug}/urls/aggs, etc. These show what actions you can perform with the API.
- Request methods: GET, POST. How you interact with each endpoint (retrieving data, sending data, etc.).
- Parameters: username, project_slug, analysis_slug. Values you need to provide to the API calls.
- Responses: 200 (Successful operation), default (error payload). What the API will return after you make a request.
- Data models: Analysis, Project, UrlDetail. How the data exchanged with the API is structured.
This information allows us to generate code in various programming languages to interact with the Botify API. For example, we can create code to:
- Retrieve a list of projects: Use the /projects/{username} endpoint with a GET request.
- Start a new analysis: Use the /analyses/{username}/{project_slug}/urls endpoint with a POST request.
- Get details about a specific URL: Use the /analyses/{username}/{project_slug}/{analysis_slug}/urls/{url} endpoint with a GET request.
Do you have a specific task in mind that you’d like to accomplish using the Botify API? I can help you write the code! Just let me know:
- What programming language you want to use (Python, JavaScript, etc.)
- What you want the code to do (e.g., “fetch all analyses for a given project and save them to a CSV file”).
I’m excited to get started and help you level up your coding skills!
Me: Why yes, I do! I’m trying to build the equivalent of these:
def fetch_projects_for_user(username):
global selected_username
projects_url = f"https://api.botify.com/v1/projects/{username}"
headers = {"Authorization": f"Token {api.key}"}
all_projects = []
try:
while projects_url:
response = requests.get(projects_url, headers=headers)
sleep(1)
response.raise_for_status()
data = response.json()
projects = data.get('results', [])
selected_username = username
all_projects.extend([(project['name'], project['slug'], project['user']['login']) for project in projects])
projects_url = data.get('next')
#all_projects.reverse()
all_projects.sort()
return all_projects
except Exception as e:
print(f"Error fetching projects for {username}: {e}")
return []
…and…
def fetch_analysis_slugs(org, project):
"""Fetch analysis slugs for a given project from the Botify API."""
import requests
import api
analysis_url = f"https://api.botify.com/v1/analyses/{org}/{project}/light"
headers = {"Authorization": f"Token {api.key}"}
all_analysis_slugs = []
try:
while analysis_url:
response = requests.get(analysis_url, headers=headers)
response.raise_for_status()
data = response.json()
analysis_data = data.get('results', [])
all_analysis_slugs.extend([analysis['slug'] for analysis in analysis_data])
analysis_url = data.get('next') # Update the URL for the next page
return all_analysis_slugs
except Exception as e:
print(f"Error fetching analysis slugs for project '{project}': {e}")
return []
…but for something called ActionBoard or actionboard or action_board or some variation thereof. It’s probably a table related to projects. Botify has a lot of analyses available and chances are it’s one of those, named that way or close to it. If it’s not the name of a parent table entity, it will be the name of some other attribute.
Gemini-Advanced: You’re right, Botify does have an ActionBoard feature! It seems to be tied to analyses. I’ll craft some code to fetch ActionBoard data. Since you provided examples in Python, I’ll stick with that.
Understanding the Goal:
We want to retrieve ActionBoard data for a given analysis. This likely involves making a request to a specific endpoint related to ActionBoard within the /analyses/{username}/{project_slug}/{analysis_slug} path. I’ll assume we need the username, project_slug, and analysis_slug as input.
Overview of the Solution:
- Find the ActionBoard endpoint: I’ll explore the Swagger JSON you provided to identify the correct endpoint for fetching ActionBoard data.
- Construct the URL: We’ll use the endpoint and the provided input parameters to build the API request URL.
- Make the API request: We’ll use the requests library to send a GET request to the Botify API.
- Handle pagination: If the response is paginated, we’ll loop through all pages to retrieve all ActionBoard data.
- Extract relevant data: We’ll process the API response and extract the information you need.
Code and Implementation Instructions:
import requests
import api # Assuming this module contains your API key
def fetch_actionboard_data(username, project_slug, analysis_slug):
"""Fetch ActionBoard data for a given analysis from the Botify API."""
# (Replace with the actual ActionBoard endpoint once found)
actionboard_url = f"https://api.botify.com/v1/analyses/{username}/{project_slug}/{analysis_slug}/actionboard"
headers = {"Authorization": f"Token {api.key}"}
all_actionboard_data = []
try:
while actionboard_url:
response = requests.get(actionboard_url, headers=headers)
response.raise_for_status() # Raise an exception for bad status codes
data = response.json()
# Extract ActionBoard data from the response (adjust as needed)
actionboard_data = data.get('results', [])
all_actionboard_data.extend(actionboard_data)
actionboard_url = data.get('next') # Update for the next page
return all_actionboard_data
except requests.exceptions.RequestException as e:
print(f"Error fetching ActionBoard data for analysis '{analysis_slug}': {e}")
return []
Explanation:
- actionboard_url: This variable will hold the API endpoint URL. I’ll update it with the actual endpoint from the Swagger file.
- headers: This dictionary sets the Authorization header with your API key, which is assumed to be stored in the api.key variable.
- all_actionboard_data: This list will store all the ActionBoard data retrieved from the API.
- response.raise_for_status(): This line checks if the API request was successful. If not, it raises an exception to handle errors.
- data.get('results', []): This extracts the ‘results’ field from the JSON response, which usually contains the main data. It defaults to an empty list if ‘results’ is not found.
- data.get('next'): This gets the URL for the next page of results if the response is paginated.
Implementation:
- Replace the placeholder endpoint: Examine the Swagger JSON you provided and find the correct endpoint for ActionBoard data. Update the actionboard_url variable accordingly.
- Adjust data extraction: Depending on the structure of the ActionBoard data in the API response, you might need to modify how the data is extracted and stored in all_actionboard_data.
Let me know if you have any questions or need further assistance! I’m here to help you understand and implement this code.
Me (to Claude): The next step is going to be similar to the last two where we got projects based on organization and then analyses based on projects. Now, we’re going to get ActionBoard items based on organization (sometimes aka username), project and analysis. But instead of a dropdown menu, the results should be shown in a table. Similar to the prior 2 card-replacements, this should replace the existing card, but continue to present the options (the links) to change Organization, Project or Analysis.
Here’s some sample code, but it gets the token in a different way. Drop that and use our (already existing and available) way.
Claude: I’ll help you add the ActionBoard data fetching and display functionality. Here’s the implementation:
class PageWorker:
def __init__(self, app, route_prefix="/page_worker"):
# ... existing init code ...
self.app.add_route(f"{route_prefix}/set_analysis", self.set_analysis, methods=['POST'])
self.app.add_route(f"{route_prefix}/clear_analysis", self.clear_analysis, methods=['POST'])
async def fetch_actionboard_data(self, token: str, org_slug: str, project_slug: str, analysis_slug: str) -> list:
"""Fetch ActionBoard data for the selected analysis."""
actionboard_url = f"https://api.botify.com/v1/analyses/{org_slug}/{project_slug}/{analysis_slug}/actionboard"
headers = {"Authorization": f"Token {token}"}
all_data = []
try:
while actionboard_url:
response = requests.get(actionboard_url, headers=headers)
response.raise_for_status()
data = response.json()
actionboard_data = data.get('results', [])
all_data.extend(actionboard_data)
actionboard_url = data.get('next')
await asyncio.sleep(1) # Rate limiting
return all_data
except Exception as e:
self.logger.error(f"Error fetching ActionBoard data: {e}")
return []
async def render_actionboard(self, org_slug: str, project_slug: str, analysis_slug: str, data: list, error: str = None):
"""Render the ActionBoard data table."""
return Card(
H3("ActionBoard Data"),
P(f"Organization: {org_slug}"),
P(f"Project: {project_slug}"),
P(f"Analysis: {analysis_slug}"),
Div(
A("Change Organization",
hx_post=f"{self.route_prefix}/clear_org",
hx_target="#pagewoker-card",
style="margin-right: 15px;"),
A("Change Project",
hx_post=f"{self.route_prefix}/clear_project",
hx_target="#pagewoker-card",
style="margin-right: 15px;"),
A("Change Analysis",
hx_post=f"{self.route_prefix}/clear_analysis",
hx_target="#pagewoker-card"),
style="font-size: 0.9em; color: #666;"
),
P(error, style="color: red;") if error else None,
Table(
Tr(
Th("Item"),
Th("Value"),
Th("Impact"),
style="background-color: #f5f5f5;"
),
*[Tr(
Td(item.get('name', 'N/A')),
Td(item.get('value', 'N/A')),
Td(item.get('impact', 'N/A'))
) for item in data],
style="width: 100%; border-collapse: collapse; margin-top: 20px;"
) if data else P("No ActionBoard data found for this analysis.", style="color: #666;"),
id="pagewoker-card",
style="overflow-x: auto;" # Makes table scrollable horizontally if needed
)
async def set_analysis(self, request):
"""Handle analysis selection."""
form = await request.form()
analysis_slug = form.get('analysis_slug')
if not analysis_slug:
return await self.render_analysis_selection(
db.get("ORGANIZATION", ""),
db.get("PROJECT", ""),
[],
error="Please select an analysis"
)
# Store the selected analysis
db["ANALYSIS"] = analysis_slug
# Fetch ActionBoard data
token_file = Path('botify_token.txt')
token = token_file.read_text().strip()
org_slug = db["ORGANIZATION"]
project_slug = db["PROJECT"]
actionboard_data = await self.fetch_actionboard_data(token, org_slug, project_slug, analysis_slug)
return await self.render_actionboard(org_slug, project_slug, analysis_slug, actionboard_data)
async def clear_analysis(self, request):
"""Clear the analysis selection and show analysis selection form."""
if "ANALYSIS" in db:
del db["ANALYSIS"]
# Re-fetch analysis slugs and show selection
token_file = Path('botify_token.txt')
token = token_file.read_text().strip()
org_slug = db["ORGANIZATION"]
project_slug = db["PROJECT"]
analysis_slugs = await self.fetch_analysis_slugs(token, org_slug, project_slug)
return await self.render_analysis_selection(org_slug, project_slug, analysis_slugs)
And update the analysis selection form to submit to the new endpoint:
async def render_analysis_selection(self, org_slug: str, project_slug: str, analysis_slugs: list, error: str = None):
"""Render the analysis selection card."""
return Card(
H3("Analysis Selection"),
P(f"Organization: {org_slug}"),
P(f"Project: {project_slug}"),
Div(
A("Change Organization",
hx_post=f"{self.route_prefix}/clear_org",
hx_target="#pagewoker-card",
style="margin-right: 15px;"),
A("Change Project",
hx_post=f"{self.route_prefix}/clear_project",
hx_target="#pagewoker-card"),
style="font-size: 0.9em; color: #666;"
),
P(error, style="color: red;") if error else None,
Form(
Select(
*[Option(slug, value=slug) for slug in analysis_slugs],
name="analysis_slug",
required=True,
style="width: 100%; margin: 10px 0;",
) if analysis_slugs else P("No analyses found for this project.", style="color: red;"),
Button("View ActionBoard", type="submit") if analysis_slugs else None,
hx_post=f"{self.route_prefix}/set_analysis",
hx_target="#pagewoker-card",
hx_swap="outerHTML"
),
id="pagewoker-card"
)
This implementation:
- Adds ActionBoard data fetching functionality
- Creates a new table view for displaying ActionBoard data
- Maintains navigation between all steps (org/project/analysis selection)
- Uses consistent styling and error handling
- Keeps the same token handling approach
- Makes the table horizontally scrollable if needed
The flow is now complete: Organization → Project → Analysis → ActionBoard Data
Let me know if you’d like any adjustments to the implementation!
Me: It keeps telling me No ActionBoard data found for this analysis, but I think there is. Can you give me a stand-alone executable copy/paste code example that pulls the ActionBoard items given an org, project and analysis slug? I want to run it in a Jupyter Notebook to experiment. It will be from the same folder so will have access to all the same pip installed libraries.
Claude: Here’s a standalone script you can run in a Jupyter notebook to test the ActionBoard data fetching:
import requests
import asyncio
from pathlib import Path
import json
from pprint import pprint
async def fetch_actionboard_data(org_slug: str, project_slug: str, analysis_slug: str) -> list:
"""Fetch ActionBoard data for a given analysis from the Botify API."""
# Get token from the same file PageWorker uses
token = Path('botify_token.txt').read_text().strip()
actionboard_url = f"https://api.botify.com/v1/analyses/{org_slug}/{project_slug}/{analysis_slug}/actionboard"
headers = {"Authorization": f"Token {token}"}
all_data = []
try:
print(f"Fetching from URL: {actionboard_url}")
while actionboard_url:
response = requests.get(actionboard_url, headers=headers)
response.raise_for_status()
data = response.json()
# Debug: Print raw response
print("\nRaw API Response:")
pprint(data)
actionboard_data = data.get('results', [])
all_data.extend(actionboard_data)
actionboard_url = data.get('next')
# Debug: Print current data
print(f"\nFound {len(actionboard_data)} items in this batch")
print(f"Next URL: {actionboard_url}")
await asyncio.sleep(1) # Rate limiting
print(f"\nTotal items found: {len(all_data)}")
return all_data
except Exception as e:
print(f"Error fetching ActionBoard data: {str(e)}")
return []
# Run the async function
async def main():
# Replace these with your actual values
org_slug = "YOUR_ORG_SLUG"
project_slug = "YOUR_PROJECT_SLUG"
analysis_slug = "YOUR_ANALYSIS_SLUG"
data = await fetch_actionboard_data(org_slug, project_slug, analysis_slug)
if data:
print("\nSuccessfully fetched data:")
pprint(data)
else:
print("\nNo data found or error occurred")
# In Jupyter, run this:
await main()
To use this:
- Make sure you're in the same directory as your `botify_token.txt` file
- Replace `YOUR_ORG_SLUG`, `YOUR_PROJECT_SLUG`, and `YOUR_ANALYSIS_SLUG` with your actual values
- Run the cell
The script includes debug printing to help identify where things might be going wrong. It will show:
- The URL being accessed
- The raw API response
- Number of items found in each batch
- Any next page URLs
- The total number of items found
This should help diagnose whether:
- The API is being reached successfully
- The response format is what we expect
- There’s actually data available
- Any errors are occurring
Let me know what you find and we can adjust the code accordingly!
A Botify BQLv2 Aggregate Query Example
After some probing, it's clear that ActionBoard is not implemented in the API, regardless of what Gemini-Advanced tells me from the swagger.json. After looking around, it appears the data is pullable by counting issues on URLs from other tables. This means pulling aggregate data with a query, and I have to build upon past successes here. The closest precedent I have for pulling aggregate data using Botify BQLv2 (important, because aggregates won't burn through my daily quota as fast) is the link-graph data-pulls I've done in the past.
First a configuration file to plug in the parameter arguments…
{
"org": "example-org",
"project": "example-project.com",
"analysis": "20240101"
}
…and the file that uses it…
import requests
import json
from pathlib import Path
from pprint import pprint
def load_config():
"""Load configuration from config.json"""
try:
config = json.loads(Path('config.json').read_text())
required = ['org', 'project', 'analysis']
if not all(k in config for k in required):
raise ValueError(f"Config file must contain: {', '.join(required)}")
return config
except FileNotFoundError:
raise FileNotFoundError(
"Please create a config.json file with 'org', 'project', and 'analysis' fields"
)
def get_urls_by_depth(org, project, analysis):
"""Get URL count aggregated by depth for a Botify analysis"""
# Load API key from file
try:
api_key = Path('botify_token.txt').read_text().strip()
except FileNotFoundError:
raise FileNotFoundError(
"Please create a botify_token.txt file with your API token"
)
url = f"https://api.botify.com/v1/projects/{org}/{project}/query"
data_payload = {
"collections": [f"crawl.{analysis}"],
"query": {
"dimensions": [
f"crawl.{analysis}.depth"
],
"metrics": [
{
"field": f"crawl.{analysis}.count_urls_crawl"
}
],
"sort": [
{
"field": f"crawl.{analysis}.depth",
"order": "asc"
}
]
}
}
headers = {
"Authorization": f"Token {api_key}",
"Content-Type": "application/json"
}
response = requests.post(url, headers=headers, json=data_payload)
response.raise_for_status()
results = response.json()["results"]
depth_counts = {
row["dimensions"][0]: row["metrics"][0]
for row in results
}
return depth_counts
if __name__ == "__main__":
# Load configuration
config = load_config()
# Get and display depth distribution
depth_distribution = get_urls_by_depth(
config['org'],
config['project'],
config['analysis']
)
pprint(depth_distribution)
This outputs a count of how many URLs are at each click-depth, useful in its own right. I’ve got to remember this!
{0: 1,
1: 100,
2: 1000,
3: 100,
4: 1
}
Botify API: Facing the Rabbit Hole’s Abyss
Okay, I really really really don’t want to jump down this rabbit hole. But this is the definition of a rabbit hole pitfall. There are a few easy things for me to do in the Botify API. The second I go a bit further, my knowledge deficit on the matter smacks me in the face. ALWAYS. So, I know a few things. I know how to pull the link-graph for a site, and I know how to count how many URLs that link-graph pull would produce before I do the pull so that I can adjust the click-depth I’m asking for to keep it under the export limit. This is awesome in itself and is the source of a lot of my superpowers at Botify, but this is far from good enough.
I need to get a grip on the reins of the API more broadly. And THAT is the rabbit hole. The depth I have to go in order to be a little bit productive is the inverted 80/20-rule. Normally we look for the 20% of the work that gets us 80% of the benefit, but when standing on the precipice of a rabbit hole, the value proposition is flipped: 80% of the total window of time you have for a project gets burned up to move only 20% of the way towards where you need to be to deliver on it. I CAN NOT LET THAT HAPPEN! Yet, I cannot even build the house from straw if there is no straw to be found.
Take a Deep Breath. This Is NOT a Rabbit Hole
Good kata… good kata… good kata… where is the muscle memory here? Where is the good form? Where are the knowns? What are the little footholds and beachheads we can seize and expand upon to, if not achieve 80/20, something more towards that end of the value prop?
Damn it, if you can’t state it clearly, you can’t train an AI on it nor ask it for answers. This is like the FastHTML is not FastAPI problem x1000. The bots don’t know what the bots don’t know, and if it sounds like something else, you’ve got a compounded problem because they think they know and think it’s something else and will confidently feed you the wrong answer. To solve this, I made a block of data that I force-feed to the bots immediately before I’m about to articulate a challenging request. They always perform massively better when I do this, and I have to do this now. You can not even build your house from straw if there is no straw… oh! Damn it, even just to build a straw house, I need a haymaker. And what I’m about to do… well, it’s gonna be one wallop of a haymaker. I’ll try to rein myself in and not terribly violate the 80/20-rule, but this is going to pay off big time.
State it Clearly
Who You Are & Why We Are Here
You are an SEO Consultant using the Botify product, an enterprise website crawler and analysis product. It also has a DOM-manipulating JavaScript include-based product called PageWorkers and a DNS/CDN-based product called SpeedWorkers that serves the cached rendered DOM to bots. PageWorkers and SpeedWorkers are collectively known as Botify’s Activation products, by which they can perform SEO on behalf of website publishers.
We are here to talk about the BQL (Botify Query Language) API, specifically BQLv2. We will use it to query against the Botify databases for the purpose of creating PageWorkers-ready optimizations in the form of `.csv` downloads or `.js` code, either of which, when accompanied by specific instructions of what to do with them, is deemed a PageWorkers deliverable. Such a deliverable can be directly taken and activated on a publisher's site through Botify.
You are being trained on this process and we will together step through it on a prototypical example: finding title tags that are too short, and with the help of a human, creating a pattern or a rule as if to be applied to the title tag column of each row in a pandas DataFrame (`df`) with the `.apply()` function to produce a CSV; or alternatively expressing the equivalent JavaScript on a per-page basis; or a simplified version of this not requiring JavaScript, but instead using the Nunjucks templating engine.
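To make that concrete, here is a minimal sketch of the DataFrame-and-`.apply()` half of that kata. The column names, the minimum length, and the brand suffix are hypothetical placeholders, not Botify conventions:

```python
import pandas as pd

# Hypothetical rows, as if flattened from a BQLv2 query result
df = pd.DataFrame({
    "url": ["https://example.com/", "https://example.com/widgets"],
    "title": ["Home", "Widgets | Example Store"],
})

def fix_short_title(title, min_length=30, suffix=" | Example Brand"):
    """Append a brand suffix to any title shorter than min_length characters."""
    return title + suffix if len(title) < min_length else title

# Apply the rule to the title column and keep only the rows that changed
df["new_title"] = df["title"].apply(fix_short_title)
df[df["title"] != df["new_title"]].to_csv("pageworkers_titles.csv", index=False)
```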
Summary of Botify Query Language (BQL)
- BQL is a Domain Specific Language designed specifically for querying SEO data in Botify, not a general-purpose database query language like SQL.
- BQL queries are structured as JSON objects, not as SQL-like text statements.
- Key components of a BQL query include:
- Collections: Data sources in Botify (e.g. SiteCrawler, RealKeywords)
- Periods: Time ranges for data analysis
- Dimensions: Fields for grouping data (similar to SQL’s GROUP BY)
- Metrics: Aggregated values (e.g. counts, averages)
- Filters: Conditions to narrow down results
- BQL uses field prefixes to identify data sources, unlike SQL’s table.column notation.
- BQL supports global fields that are consistent across collections (e.g. URL, segmentation).
- Functions in BQL are applied using a JSON structure with "function" and "args" keys (see the example just after this list).
- BQL has special handling for continuous value fields, using range functions for dimensions.
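For example, a metric wrapping a field in a function takes this shape (the same shape appears again in the aggregate query later in this article):

```json
{
  "function": "sum",
  "args": ["crawl.20210102.outlinks_internal.nb.total"]
}
```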
Anatomy of a BQLv2 JSON Payload
Example BQL JSON payload for a POST request to a URL that includes the org and project, with the authentication token in the headers:
{
"collections": ["crawl.20210102"],
"periods": [["2021-01-01", "2021-01-31"]],
"query": {
"dimensions": ["segments.pagetype.value"],
"metrics": ["crawl.20210102.count_urls_crawl"],
"filters": {...},
"sort": [...]
}
}
Some Execution Context Required
This structure and approach is fundamentally different from SQL's text-based query syntax, making BQL a distinct and specialized language for SEO data analysis in Botify. To actually execute this query, you would make a POST request with the JSON payload above to the URL endpoint. Notice that the org and project are passed as part of the URL.
import requests
from pathlib import Path

org = "example-org"
project = "example-project.com"
api_key = Path('botify_token.txt').read_text().strip()
headers = {"Authorization": f"Token {api_key}", "Content-Type": "application/json"}
payload = { ... }  # the JSON payload from above
url = f"https://api.botify.com/v1/projects/{org}/{project}/query"
response = requests.post(url, headers=headers, json=payload)
Building Up a BQLv2 Query
All queries require a Botify API token, in our examples loaded from project root:
# Load API key from file
api_key = Path('botify_token.txt').read_text().strip()
The headers of an HTTP request are constructed as:
headers = {
"Authorization": f"Token {api_key}",
"Content-Type": "application/json"
}
The simplest API request possible, just to fetch the username associated with the token, is:
import requests
api_key = open('botify_token.txt').read().strip()
user_data = requests.get(
"https://api.botify.com/v1/authentication/profile",
headers={"Authorization": f"Token {api_key}"}
).json()
name = user_data["data"]["username"]
print(name)
Org, Project & Analysis Slugs Required
Most queries require org, project and analysis values, sometimes called slugs because they are likely to appear in a URL. In our examples we load our slugs using a JSON file from project root:
{
"org": "example-org",
"project": "example-project.com",
"analysis": "20240101"
}
How to Fetch Projects For an Org
If you only know the org, you can get a list of its projects:
import json
import requests
def fetch_projects(org, api_key):
"""Fetch all projects for a given organization from the Botify API."""
projects_url = f"https://api.botify.com/v1/projects/{org}"
headers = {"Authorization": f"Token {api_key}"}
all_projects = []
try:
while projects_url:
response = requests.get(projects_url, headers=headers)
response.raise_for_status()
data = response.json()
all_projects.extend(
(
project['name'], # Project Name
project['slug'], # Project Slug
project['user']['login'] # Project User Login
)
for project in data.get('results', [])
)
projects_url = data.get('next')
return sorted(all_projects)
except requests.RequestException as e:
print(f"Error fetching projects for {org}: {e}")
return []
# Load API key and configuration
api_key = open('botify_token.txt').read().strip()
config = json.load(open("config.json"))
org = config['org']
# Fetch projects
projects = fetch_projects(org, api_key)
for project_name, project_slug, project_user in projects:
print(f"Name: {project_name}, Slug: {project_slug}, User: {project_user}")
How to Fetch Analyses For an Org & Project
When you have an org and a project, you can get its list of analyses:
import json
import requests
def fetch_analyses(org, project, api_key):
"""Fetch analysis slugs for a given project from the Botify API."""
analysis_url = f"https://api.botify.com/v1/analyses/{org}/{project}/light"
headers = {"Authorization": f"Token {api_key}"}
all_analysis_slugs = []
try:
while analysis_url:
response = requests.get(analysis_url, headers=headers)
response.raise_for_status()
data = response.json()
all_analysis_slugs.extend(
analysis['slug'] for analysis in data.get('results', [])
)
analysis_url = data.get('next')
return all_analysis_slugs
except requests.RequestException as e:
print(f"Error fetching analyses for project '{project}': {e}")
return []
# Load API key and configuration
api_key = open('botify_token.txt').read().strip()
config = json.load(open("config.json"))
org = config['org']
project = config['project']
# Fetch analyses
analyses = fetch_analyses(org, project, api_key)
for slug in analyses:
print(slug)
What Are Collections in Botify?
Collections are a source of data in Botify. Each collection exposes a set of fields that can be used as metrics, dimensions, and/or filters. There is a 1-to-1 relationship between analyses and collections, so every analysis also has a collection of the form `crawl.YYYYMMDD`, where `YYYYMMDD` follows the pattern of the "date slugs" returned by the `fetch_analyses()` function.
There are many more collections in Botify than just crawl analyses, for example `search_console` or `sitemap` data. There are sometimes dot-notation prefixes such as `visits.ganalytics` or `web_vitals.field_data`. Collections can be thought of as table aliases set up for API convenience. Not all projects have the same list of collections.
Given an org and a project, you can get a list of its collections:
import json
import requests
def fetch_collections(org, project, api_key):
"""Fetch collection IDs for a given project from the Botify API."""
collections_url = f"https://api.botify.com/v1/projects/{org}/{project}/collections"
headers = {"Authorization": f"Token {api_key}"}
try:
response = requests.get(collections_url, headers=headers)
response.raise_for_status()
collections_data = response.json()
return [
(collection['id'], collection['name']) for collection in collections_data
]
except requests.RequestException as e:
print(f"Error fetching collections for project '{project}': {e}")
return []
# Load API key and configuration
api_key = open('botify_token.txt').read().strip()
config = json.load(open("config.json"))
org = config['org']
project = config['project']
# Fetch collections
collections = fetch_collections(org, project, api_key)
for collection_id, collection_name in collections:
print(f"ID: {collection_id}, Name: {collection_name}")
How to Fetch the Fields of a Collection
And finally, any given collection, given its nature as an aliased table, must also have fields. And it is these fields we must often get to in order to include metrics, and add dimensions and filters to reports. Because the collections may vary from project to project, and the fields may vary within a collection, it is critical to check available collections and fields when constructing a BQLv2 query to ensure the requested resources are available.
import json
import requests
def fetch_fields(org, project, collection, api_key):
"""Fetch available fields for a given collection from the Botify API."""
fields_url = f"https://api.botify.com/v1/projects/{org}/{project}/collections/{collection}"
headers = {"Authorization": f"Token {api_key}"}
try:
response = requests.get(fields_url, headers=headers)
response.raise_for_status()
fields_data = response.json()
# Extract fields from each dataset in the response
all_fields = [
(field['id'], field['name']) # field id and field name
for dataset in fields_data.get('datasets', [])
for field in dataset.get('fields', [])
]
return all_fields
except requests.RequestException as e:
print(f"Error fetching fields for collection '{collection}' in project '{project}': {e}")
return []
# Load API key and configuration
api_key = open('botify_token.txt').read().strip()
config = json.load(open("config3.json"))
org = config['org']
project = config['project']
collection = config['collection'] # Assuming collection is also in config
# Fetch fields
fields = fetch_fields(org, project, collection, api_key)
for field_id, field_name in fields:
print(f"ID: {field_id}, Name: {field_name}")
Putting It All Together in a BQLv2 Aggregate Query
Let’s say we want to export the link graph of a Botify-crawled website but the API export limit is 1-million rows. This presents us with several challenges.
- How to export data (versus listing and paging-through)
- How to construct the query, finding and arranging the right collections and field names
- How to make sure we are requesting a number of rows within the export limit
Listing vs. Exporting
URLs used for listing, paging and aggregate values include the org and project in the URL. With the correct JSON payload and token in the headers, the response object can be read directly, up to a maximum of 2,000 rows, and the endpoint takes the following querystring parameters:
- size: the number of results per page. By default 500
- page: the page number. By default 1
- count: the total number of results. If specified, the API response will contain a count key with the total number of results. By default not specified.
The URL for listing, paging and aggregate queries is:
url = f"https://api.botify.com/v1/projects/{org}/{project}/query"
If initiating an export, which generates a `.csv` file for download, the project and org are no longer included in the URL endpoint but are instead made part of the JSON payload. The endpoint URL to be used for an export is:
url = "https://api.botify.com/v1/jobs"
Polling When Exporting .csv Files
We cover later how the JSON payload must be altered, but the return value for an export will be a JSON object containing a "job_url" key whose value is a URL for polling. That polling URL must be checked at intervals until the value of the "job_status" key is "DONE", at which point there will be a "download_url" in "results".
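In code, that polling protocol reduces to something like this minimal sketch (the full export script appears later in this article; `job_url` and `headers` are assumed to come from the export job response and the earlier auth setup):

```python
import time
import requests

def poll_for_download(job_url, headers, delay=3, attempts=300):
    """Poll the job URL until the export is DONE, then return its download URL."""
    for _ in range(attempts):
        status = requests.get(job_url, headers=headers).json()
        if status.get("job_status") == "DONE":
            return status["results"]["download_url"]
        time.sleep(delay)
    return None  # the job did not finish within the polling window
```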
Considering the rigmarole of all this, it's best to check that you're actually downloading what you want first. It's good practice to run the equivalent query without the export, as a much more lightweight aggregate report beforehand, using the data you receive back to adjust your export query.
Aggregate Queries as 1-Row Lists (Measure Before Exporting)
In the following example, we will query for the number of edges in the link graph before we download it so we can make sure it is under the 1,000,000 limit. The way we know how to construct this query is:
- We look at the list of analysis slugs for an org and project
- We check the list of collections for the analysis slug for `crawl.YYYYMMDD` (e.g. `crawl.20241104`)
- We check the fields for the crawl collection for `.outlinks_internal.nb.total` and `.depth`
- We assemble the JSON payload using the API syntax pattern demonstrated here
- We make the request and parse the results for the "number of edges" answer
import json
import requests
def fetch_edge_count(org, project, analysis, depth, api_key):
"""Fetch the number of edges (internal outlinks) for a given analysis from the Botify API."""
url = f"https://api.botify.com/v1/projects/{org}/{project}/query"
headers = {
"Authorization": f"Token {api_key}",
"Content-Type": "application/json"
}
# Define the payload for the query
data_payload_v2 = {
"collections": [
f"crawl.{analysis}"
],
"query": {
"dimensions": [],
"metrics": [
{
"function": "sum",
"args": [
f"crawl.{analysis}.outlinks_internal.nb.total"
]
}
],
"filters": {
"field": f"crawl.{analysis}.depth",
"predicate": "lte",
"value": depth
}
}
}
try:
response = requests.post(url, headers=headers, json=data_payload_v2)
response.raise_for_status()
data = response.json()
# Extract the number of edges from the response data
edges = data["results"][0]["metrics"][0]
return edges
except requests.RequestException as e:
print(f"Error fetching edge count for analysis '{analysis}' in project '{project}': {e}")
return None
# Load API key and configuration
api_key = open('botify_token.txt').read().strip()
config = json.load(open("config3.json"))
org = config['org']
project = config['project']
analysis = config['analysis'] # Assuming 'analysis' is also in config
depth = config.get('depth', 5) # Assuming 'depth' is configurable, with a default of 5 if not set
# Fetch the number of edges
edges = fetch_edge_count(org, project, analysis, depth, api_key)
# Display the number of edges
if edges is not None:
print(f"Number of edges (internal outlinks) at depth <= {depth}: {edges}")
else:
print("Failed to retrieve the edge count.")
This will output something like:
Number of edges (internal outlinks) at depth <= 5: 933864
Looping Lightweight Parameterized Reports
With such a lightweight query, we can now execute it repeatedly with `depth` parameterized in a loop to find the deepest click-depth we can export without exceeding a million rows. An outer loop like the one below could be used with the `fetch_edge_count()` function above to find the max depth, although this might be done more efficiently with a single report.
previous_edges = 0
final_edges = 0
chosen_depth = 1
for depth in range(1, 10):
    edges = fetch_edge_count(org, project, analysis, depth, api_key)
    # Stop when the export limit is exceeded or the edge count stops growing
    if edges > 1000000 or edges == previous_edges:
        chosen_depth = max(depth - 1, 1)
        final_edges = previous_edges if depth > 1 else edges
        break
    chosen_depth = depth
    final_edges = edges
    previous_edges = edges
Exporting .CSV Files from Botify With the API
Now knowing at what depth you want to export your link graph, you can begin the csv export.
import json
import requests
import gzip
import shutil
import time
def start_export_job(org, project, analysis, chosen_depth, api_key):
"""Start an export job in Botify and poll for job completion."""
url = "https://api.botify.com/v1/jobs"
headers = {
"Authorization": f"Token {api_key}",
"Content-Type": "application/json"
}
# Data payload for the BQLv2 query
data_payload_v2 = {
"job_type": "export",
"payload": {
"username": org,
"project": project,
"connector": "direct_download",
"formatter": "csv",
"export_size": 1000000,
"query": {
"collections": [
f"crawl.{analysis}"
],
"query": {
"dimensions": [
"url",
f"crawl.{analysis}.outlinks_internal.graph.url"
],
"metrics": [],
"filters": {
"field": f"crawl.{analysis}.depth",
"predicate": "lte",
"value": chosen_depth
}
},
},
}
}
try:
# Start the export job
response = requests.post(url, headers=headers, json=data_payload_v2)
response.raise_for_status()
export_job_details = response.json()
# Extract job URL for polling
job_url = f"https://api.botify.com{export_job_details.get('job_url')}"
# Polling for job completion
attempts = 300
delay = 3
print("Polling for job completion:", end=" ")
while attempts > 0:
attempts -= 1
time.sleep(delay)
poll_response = requests.get(job_url, headers=headers)
poll_response.raise_for_status()
job_status_details = poll_response.json()
if job_status_details["job_status"] == "DONE":
print("Done!")
download_url = job_status_details["results"]["download_url"]
return download_url
print(".", end="", flush=True)
print("\nJob did not complete within the expected time.")
return None
except requests.RequestException as e:
print(f"Error starting or polling export job: {e}")
return None
# Load API key and configuration
api_key = open('botify_token.txt').read().strip()
config = json.load(open("config3.json"))
org = config['org']
project = config['project']
analysis = config['analysis']
chosen_depth = config.get('depth', 5)
# Start export job and get download URL
download_url = start_export_job(org, project, analysis, chosen_depth, api_key)
# Download and decompress the file if the download URL is available
if download_url:
gz_filename = "exported_data.csv.gz"
csv_filename = "exported_data.csv"
# Step 1: Download the gzipped CSV file
response = requests.get(download_url, stream=True)
with open(gz_filename, "wb") as gz_file:
shutil.copyfileobj(response.raw, gz_file)
print(f"File downloaded as '{gz_filename}'")
# Step 2: Decompress the .gz file to .csv
with gzip.open(gz_filename, "rb") as f_in:
with open(csv_filename, "wb") as f_out:
shutil.copyfileobj(f_in, f_out)
print(f"File decompressed and saved as '{csv_filename}'")
else:
print("Failed to retrieve the download URL.")
Overview of the Botify API Export CSV Process
- Starting the Export Job:
  - The function `start_export_job` sends a `POST` request to the Botify API to start an export job, which will generate a CSV file containing the desired data.
  - The payload, `data_payload_v2`, specifies the details for the export job, including:
    - `job_type`: Indicates an export job.
    - Export Parameters: `username`, `project`, and `analysis` define the Botify project and collection to be queried.
    - Data Format: The data is formatted as `csv` for easy handling.
    - Export Size: Specifies the maximum size of the exported file.
    - Data Query: Defines the data structure to be retrieved (dimensions and filters).
  - This payload is sent to the Botify API endpoint, which starts the export job.
- Polling for Job Completion:
  - After starting the export, the function retrieves a `job_url` to track the job status.
  - Using the `job_url`, the function repeatedly sends a `GET` request to check if the job is complete. This is known as polling.
  - Polling Logic:
    - The function polls every 3 seconds, up to a maximum of 300 attempts (about 15 minutes).
    - For each attempt, it checks `job_status_details` to see if the `"job_status"` is marked as `"DONE"`.
    - While polling, the function prints dots (`.`) to indicate progress.
  - Once the job status is `"DONE"`, the function retrieves the `download_url` for the generated CSV file. If the job does not complete within the allowed time, it returns `None`.
- Downloading the Exported File:
  - If the job completes successfully, `download_url` is used to download the file.
  - The file is downloaded as a gzipped CSV (`exported_data.csv.gz`). Gzip compression is common for large files as it reduces file size, making downloads faster.
- Decompressing the Gzipped CSV:
  - After downloading, the code decompresses the `.csv.gz` file using Python's `gzip` module.
  - The decompressed file is saved as a regular `.csv` file (`exported_data.csv`), ready for use.
  - File Handling: `shutil.copyfileobj` is used for efficient file copying, allowing the data to stream directly from the compressed file to the uncompressed `.csv` file without loading everything into memory.
- Final Output:
  - The function ends by printing a success message if the file is successfully downloaded and decompressed.
  - The resulting file, `exported_data.csv`, is now ready for analysis.
Breakdown of Key Components
- Data Query in Payload: The payload specifies filters, dimensions, and collections, targeting specific URLs and internal outlinks based on a given depth level (`chosen_depth`). This setup customizes the exported data based on the analysis requirements.
- Polling and Timeout: The polling mechanism with a maximum attempt limit ensures that the function does not wait indefinitely if the job takes too long to complete.
- Decompression Step: This step automatically converts the `.gz` file to a `.csv`, making the data easy to read and analyze directly from the downloaded file.
Summary
This code automates the process of:
- Starting an export job.
- Monitoring for its completion.
- Downloading the resulting gzipped CSV.
- Decompressing it for immediate access.
The entire flow is optimized for a notebook environment, with clear feedback on progress and status, creating a seamless experience for managing and retrieving large data exports.
Learning BQL via DevTools Network Panel
Let's say we want to explore some data buried somewhere in Botify. We know that there's a feature called ActionBoard in the web user interface to the product. Want some insight? You can surf to pretty much any page in the web UI, bring up the Chrome Developer Tools (DevTools), go to the `Network` tab and do a refresh. Information is laid bare in front of you, and a lot of information it is! Depending on what page you're on, you'll get many things under the Requests column: the verbatim list of filenames, data-connections made and other stuff. Look for the words `aggs` or `query`. Click on one.
Now switch between the `Payload` and `Response` tabs to see what was sent and what was received. Unfortunately, what was received isn't very well labeled in the returned JSON object structures, or else you wouldn't have to switch to the Payload tab so much to know what the heck you're looking at. Mostly it just labels things dimensions and metrics, and then has the numbers and the booleans. If you're lucky you can figure out what it's controlling in the UI, but mostly it's a way to expose yourself to a lot more BQL, mostly v2. There's a BQLv1 out there still, but the new API is catching up on most things.
Jupyter Notebooks Still Hold A Certain Charm
But what about peeling away the layers of all that Collections and Fields data we're receiving? Well, I need to do that more often, and I need to go back to Jupyter Notebooks to have a little playground. As much as I love Cursor AI lately, and have had a reversal of religion concerning my outright rejection of VSCode and all that is unholy… I rescind my admonishments. It is not adoration. Every time I go back to vim or NeoVim, it's a sigh of relief. I'm typing in nvim right now. However, whenever I do go back to a plain old Jupyter Notebook, even one with Jupyter AI running, I miss Claude via Cursor AI.
Curses! Never have I had such mixed emotions about cursors since I tried compiling QEMU from C source to get a curses-only, graphics-free version for Levinux that wouldn't steal your mouse pointer when you clicked it. That challenge defeated me then, and I gave up chasing it as a violation of the 80/20-rule. Other people went through the grief of compiling nice, stable, usage-hardened versions. Though I bet today I could have done it with AI assistance. Everything just out of reach info-skill-wise comes just within reach.
Botify’s Domain Expertise: Training AI for SEO Mastery
And so, this. Writing clearly is a skill. It's the careful creation of data specifically for the purpose of training an AI to be good at some task. Now, I'm certainly not going to use this whole rambling article for that. But I will take carefully selected extracts, and maybe refine them further here and there with AI, or broaden them out and flesh them in with AI. The idea is that here I am inserting Botify domain expertise into a context window on-demand.
AI Training Kata: Mastering the Data Dojo
Carefully and lovingly curated data, and the creation of synthetic data for training, is the equivalent of AI kata: martial arts practice to commit moves to muscle memory. It is good form, and that is precisely what I'm doing here. Knowledge and ability have eluded me over the past year or so as I tried to master the complex API of a complex product. I am exactly the sort of person AI should super-power, yet I feel these oppressive obstacles!
Beginner’s Mind Perfect For Synthesizing New Data
Teasing it out. Teasing it out. No, a lot of today was not spent coding. It was spent rounding up what I know about the Botify product, and I’m still not done. As soon as this brain-vomit of connecting dots about what I’m actually doing here subsides, I will exit vimflow and go back to the struggling Shoshin in me.
Shoshin is a concept from Zen Buddhism meaning beginner’s mind. It refers to having an attitude of openness, eagerness, and lack of preconceptions when studying, even at an advanced level, just as a beginner would. Hey, that’s me! I’m always like a friggin beginner. Nothing seems to stick. Not Assembly. Not C. Not Java. Not JavaScript. I seem to be curly-braces-allergic, which is too bad because one of the end goals here is to have AI spittin out nunchuck-laden JavaScript for PageWorkers, haha!
AI Coding: Speak Clearly to Get Better Results!
So, I should be coding, right? Well, maybe I am. The better you can articulate to an AI who you are, what you're about, what you're trying to do, what you've tried in the past, who you're associated with, and what your likes, dislikes, predilections and natural aversions are, the better the AI will code for you. Express yourself clearly, and your AI assistants assist better. It's all about communicating clearly, even with… no, especially with… AI.
When I started on this particular little article excursion yesterday and into today, I was greatly concerned. It was another rabbit hole. The difference between a downward, spiraling, demoralizing doom and permanently, forever moving your life forward in a positive way is a very thin line of demarcation. I had hit an impasse. The impasse was in my own knowledge and ability, despite having PhD-level intelligence AI at my disposal. You've got to know how to be a doctor's boss.
Mastering Life’s Climb, One Small Win at a Time
Being knocked off your horse is a wonderful opportunity to learn how to hop back up onto your horse, no matter how high that horse may be: ivory tower or virtue signaling. That which was previously unattainable and elusive clicks into place as an inevitably conquered skill. The ability to bank small wins continues to be what it always was; you just recalibrate what counts as small. When video on computers first came out, it was postage-stamp sized. Now, people from the new generation might never encounter a postage stamp and can't remember video being a challenge for computers.
That's where we are today with AI. I feel the "demand" for AI in the Cursor AI IDE's delays… the funny waiting-in-line queues you get thrown into when it gets busy. It's echoes of the era of slow websites again. And guess what happened? Capacity expanded exponentially. Amazon built data centers to put the Hoover Dam to shame, and that was… was… Perplexity?
Perplexity: Amazon’s data center construction efforts have been ongoing for many years, but they have significantly ramped up their investments in recent years. Here’s an overview of Amazon’s data center development and how it compares to the Hoover Dam in terms of engineering scale:
Amazon’s Data Center Development
Amazon Web Services (AWS), Amazon’s cloud computing division, has been building data centers since the early 2000s. However, their large-scale investments in data center infrastructure have become more prominent in recent years:
Recent Major Investments:
- In April 2024, Amazon announced plans to invest $11 billion to build a data center campus near New Carlisle, Indiana[1][4].
- In July 2023, Amazon revealed plans to invest $7.8 billion in data centers in central Ohio by the end of 2029[2].
- Amazon has also announced plans to invest $35 billion to establish multiple data center campuses in Virginia by 2040[2].
Scale and Impact:
- The $11 billion investment in Indiana marks the largest capital investment announcement in the state’s history[1][4].
- These data center projects are expected to create thousands of jobs and have significant economic impacts on their respective regions[1][2][4].
Comparison to Hoover Dam
While it’s challenging to make a direct comparison between Amazon’s data centers and the Hoover Dam due to their different purposes and eras of construction, we can consider some aspects:
1. Investment Scale:
- The Hoover Dam, adjusted for inflation, cost approximately $760 million in 1936 dollars, which would be roughly $15 billion in today’s currency.
- Amazon’s recent data center investments, such as the $11 billion Indiana project, are comparable in terms of financial scale[1][4].
2. Technological Impact:
- The Hoover Dam was a groundbreaking engineering feat of its time, providing crucial water management and hydroelectric power.
- Amazon’s data centers represent cutting-edge technology infrastructure, powering cloud computing and artificial intelligence capabilities that are transforming various industries[1][4].
3. Economic Impact:
- Both projects have significant economic impacts on their regions, creating jobs and stimulating local economies[1][2][4].
4. Construction Timeframe:
- The Hoover Dam was built in five years (1931-1936).
- Amazon’s data center projects are typically built in phases over several years, with the ability to expand and upgrade continuously[1][2].
5. Physical Scale:
- While the Hoover Dam is a single massive structure, Amazon’s data centers consist of multiple facilities spread across various locations, forming a distributed network of computing power.
In conclusion, while the Hoover Dam and Amazon’s data centers differ greatly in their purpose and era, they both represent massive engineering undertakings with significant technological, economic, and societal impacts. Amazon’s data center investments, particularly in recent years, demonstrate a scale of investment and potential impact that is comparable to major historical infrastructure projects like the Hoover Dam.
Citations:
[1] https://www.southbendtribune.com/story/business/2024/04/25/amazon-plans-to-build-11-data-center-complex-near-new-carlisle/73450066007/
[2] https://www.constructiondive.com/news/amazon-invest-8-billion-ohio-data-centers/684913/
[3] https://www.datacenterfrontier.com/site-selection/article/55036862/with-almost-14-billion-in-announced-spending-data-centers-call-indiana-home
[4] https://indianacapitalchronicle.com/briefs/amazon-web-services-to-build-11b-indiana-data-center-campus/
[5] https://www.reddit.com/r/mississippi/comments/1ajhqxe/amazon_to_build_two_data_centres_worth_10bn_in/
[6] https://aws.amazon.com/compliance/data-center/data-centers/
[7] https://aws.amazon.com/about-aws/global-infrastructure/
[8] https://www.datacenterdynamics.com/en/news/amazon-named-as-end-customer-for-starwood-data-centers-in-herndon-virginia/
Virtual Houses: The Real-Life Straw vs. Brick Battle
Yeah, right, so my articles are weird, I know. But who are you and what are you doing here reading about kata compensating for karma chameleons, anyway? Or is that about the 3 little pigs and straw houses? From straw houses to the Hoover Dam and Amazon datacenters. Wow, that really isn't so far-fetched. The Hoover Dam is a static, immovable resource.
The Hoover Dam generates electricity where it's at by virtue of being the brick house. Amazon datacenters, and cloud services in general, on the other hand have so abstracted away the physical details that you'd believe they're, well, straw houses. Throw them up quick. Instantiate plenty. When the wolf comes a-blowin', you can just instantiate so many VMs or containers that the wolf's efforts are futile in comparison. Bricks not required. So the metaphor stands.
Monday Night & The Marathon Continues
But was my endeavor today even straw houses? It’s a long meandering way to go for the work I need to show tomorrow, and it’s about 8:30 PM on Monday. That’s the day just beginning by some measures. But I’ll run up against diminishing returns and exhaustion soon enough. Okay, so let’s lay down the punch-line of this entire thing.
Hey Claude! Or hey Gemini, considering the size of this article now. Or maybe hey o1-preview, given the reasoning prowess I’m about to need to tap. Here is the challenge I lay before your digital selves.
- You know how to hit the Botify API now. You’ve got some good examples.
- I need to hit the Botify API in ways I haven’t before; certain tables in Collections.
- I’m ready here with Jupyter Notebooks, JupyterLab in particular ready to receive code examples from you.
- Show me how to hit the most recent crawl collection by date inferred from the Analysis slugs of an Organization and Project.
- Look at the fields in that collection and tell me what’s available.
Jupyter Notebooks STILL For The Win
You of course don't have to perform those tasks yourself, as you don't have the Botify token, even if you can execute Python. So instead, give me stand-alone code able to be run in a Jupyter Notebook with just whatever pip installs I've probably already done, in the style of Notebooks we've established here. That is, the old synchronous and somewhat unsafe way of using `open()` directly. But this is all locally running, single-tenant stuff for now. We'll clean it up and get more advanced and safe in the future. For now, the style is bare-bones, to make the code as immediately understandable to newbs and plebes as possible.
Specifically, that means the use of `open()` over `Path()` to get at whatever customer names and API keys we're externalizing to keep secret. It's just less confusing stuff for the beginner to look at. So we can externalize the org, project and tokens for privacy. However, everything else belongs directly in the Notebook: no user-created packages (which are .py-files) while the main work is in a .ipynb-file.
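For the record, the two styles do the same thing; the Notebooks here simply prefer the first:

```python
# Beginner-friendly style used in these Notebooks
api_key = open('botify_token.txt').read().strip()

# The pathlib equivalent (same result, one more concept to learn)
from pathlib import Path
api_key = Path('botify_token.txt').read_text().strip()
```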
Functions FTW: Notebooks & Anti-Patterns
And even though I'm tempted otherwise by another great anti-pattern here, we'll put things in functions, because we're likely to copy/paste over to FastHTML, the web framework I'm regularly porting Notebooks to now. But do keep in mind it's a Notebook, and that provides a unique opportunity to interact with whatever objects and variables are left accessible in the global scope of the forever-running IPython event loop. This is particularly true of return values. Let's try to keep things inspectable for debugging and fun.
From AI Kata To Real-World SEO Magic
Are you getting this? So, I guess I'm asking you just to get started. I'll be adding one great big Notebook that goes top-down, basically repeating the API story told here to let people and AIs onboard with the whole Botify API-adept SEO Consultant thing. But this is the warm-up. This is the kata. We're about to spar, which means getting some actual concrete PageWorkers recommendations to hand over to the client or perform on their behalf. That's exported CSVs. That's some lightweight JavaScript Nunjucks patterns to re-write titles and meta descriptions. That's some bespoke friggin JavaScript to do some nutty link-graph manipulation stuff across the entire site, willing topical clusters within the site into existence.
I Can Haz Brilliance?
But let's start small. Let's start with banking some little wins. No chasing the rabbit down rabbit holes. No over-complex refactoring or implementations. Just one JupyterLab code-block chisel-strike at a time, starting with reproducing the stuff I have here so we quickly get up to finding ActionBoard stuff in the crawl collection fields, and can continue our journey of peeling away the layers and learning how to write some meaningful BQLv2. Begin!
Gemini-Advanced
Too big for ChatGPT 4o, o1-preview and o1-mini. How about Claude 3.5 Sonnet on its site (vs. Cursor)? Nope! Okay then, Gemini-Advanced? Yup! I don't know if it's going to give us brilliance, but it can take the huge file. Make up for it in quantity, am I right? Let's see…
Gemini-Advanced: It sounds like you’re building an impressive system for interacting with the Botify API and generating SEO recommendations! I’m happy to help you create a Jupyter Notebook that demonstrates how to use the Botify API, specifically focusing on the BQLv2 aspects.
Here’s a Jupyter Notebook that incorporates the elements you’ve described:
# ---
# title: Botify API Exploration
# ---
# # Botify API Exploration with BQLv2
# This notebook demonstrates how to interact with the Botify API using BQLv2,
# focusing on retrieving data and generating insights for SEO analysis.
# ## Setup and Authentication
# **1. Install necessary libraries:**
# ```bash
# pip install requests
# ```
# **2. Store your API key:**
# - Create a file named `botify_token.txt` in the same directory as this notebook.
# - Paste your Botify API key into this file.
# **3. Define helper functions:**
import json
import requests
from pathlib import Path
from pprint import pprint
def load_api_key():
"""Load the Botify API key from botify_token.txt"""
try:
api_key = Path('botify_token.txt').read_text().strip()
return api_key
except FileNotFoundError:
raise FileNotFoundError("Please create a botify_token.txt file with your API token")
# **4. Define your organization and project:**
api_key = open('botify_token.txt').read().strip()
config = json.load(open("config3.json"))
org = config['org']
project = config['project']
analysis = config['analysis']
# ## Fetching Project Analyses
def fetch_analyses(org, project, api_key):
"""Fetch analysis slugs for a given project from the Botify API."""
analysis_url = f"https://api.botify.com/v1/analyses/{org}/{project}/light"
headers = {"Authorization": f"Token {api_key}"}
all_analysis_slugs = []
try:
while analysis_url:
response = requests.get(analysis_url, headers=headers)
response.raise_for_status()
data = response.json()
all_analysis_slugs.extend(
analysis['slug'] for analysis in data.get('results', [])
)
analysis_url = data.get('next')
return all_analysis_slugs
except requests.RequestException as e:
print(f"Error fetching analyses for project '{project}': {e}")
return []
# Fetch analyses for your project
analyses = fetch_analyses(org, project, api_key)
print("Available analyses:")
pprint(analyses)
# ## Finding the Most Recent Crawl Collection
def find_recent_crawl_collection(analyses):
"""Find the most recent crawl collection from a list of analysis slugs."""
# Assuming analysis slugs are in the format YYYYMMDD or YYYYMMDD-X
# Sort the analyses in descending order to find the most recent
sorted_analyses = sorted(analyses, reverse=True)
return f"crawl.{sorted_analyses[0]}"
# Find the most recent crawl collection
recent_collection = find_recent_crawl_collection(analyses)
print(f"\nMost recent crawl collection: {recent_collection}")
# ## Fetching Collection Fields
def fetch_fields(org, project, collection, api_key):
"""Fetch available fields for a given collection from the Botify API."""
fields_url = f"https://api.botify.com/v1/projects/{org}/{project}/collections/{collection}"
headers = {"Authorization": f"Token {api_key}"}
try:
response = requests.get(fields_url, headers=headers)
response.raise_for_status()
fields_data = response.json()
# Extract fields from each dataset in the response
all_fields = [
(field['id'], field['name']) # field id and field name
for dataset in fields_data.get('datasets', [])
for field in dataset.get('fields', [])
]
return all_fields
except requests.RequestException as e:
print(f"Error fetching fields for collection '{collection}' in project '{project}': {e}")
return []
# Fetch fields for the recent crawl collection
fields = fetch_fields(org, project, recent_collection, api_key)
print(f"\nFields for collection '{recent_collection}':")
pprint(fields)
# ## Further Exploration
# From here, you can use the `recent_collection` and `fields` information to construct BQLv2 queries.
# For example, you can:
# - **Fetch ActionBoard data:** If available, retrieve ActionBoard items related to the crawl data.
# - **Analyze title tags:** Query for title tag lengths and identify those that are too short.
# - **Generate reports:** Create custom reports based on various metrics and dimensions.
# Remember to refer to the Botify API documentation and experiment with different BQLv2 queries to explore the data and generate valuable SEO insights.
Explanation:
- Setup and Authentication: This section guides the user through installing necessary libraries, storing their API key, and defining helper functions to load the key and set up the basic authentication for API requests.
- Fetching Project Analyses: This part demonstrates how to retrieve a list of available analyses (crawls) for the specified project.
- Finding the Most Recent Crawl Collection: This function identifies the most recent crawl collection by sorting the analysis slugs and extracting the corresponding collection ID.
- Fetching Collection Fields: This function retrieves the available fields for the specified collection, enabling users to understand what data they can query.
- Further Exploration: This section provides suggestions for continuing the exploration, including fetching ActionBoard data (if available), analyzing title tags, and generating custom reports.
To use this notebook:
- Save the code: Save the code as a `.ipynb` file (e.g., `botify_api_exploration.ipynb`).
- Install JupyterLab: If you don't have it already, install JupyterLab using `pip install jupyterlab`.
- Start JupyterLab: Run `jupyter lab` in your terminal.
- Open the notebook: In JupyterLab, open the `botify_api_exploration.ipynb` file.
- Fill in your credentials: Replace `"your_org"` and `"your_project"` with your actual organization and project slugs.
- Run the cells: Execute each code cell in the notebook to see the results.
This notebook provides a starting point for exploring the Botify API with BQLv2. You can expand it by adding more queries, visualizations, and analysis based on your specific SEO needs. Let me know if you have any other questions or need further assistance!
Me: Alright, this is excellent and this works! But I’m not seeing ActionBoard in the output of those field names. I think it’s actually the name, or a fragment of the name of one or more collections. So look at every collection for the project looking for actionboard in the lowercased collection names please. Make a filter to show only them if found.
Gemini-Advanced:
import json
import requests
from pathlib import Path
from pprint import pprint
def load_api_key():
"""Load the Botify API key from botify_token.txt"""
try:
api_key = Path('botify_token.txt').read_text().strip()
return api_key
except FileNotFoundError:
raise FileNotFoundError("Please create a botify_token.txt file with your API token")
# **Define your organization and project:**
api_key = open('botify_token.txt').read().strip()
config = json.load(open("config3.json"))
org = config['org']
project = config['project']
analysis = config['analysis']
def fetch_collections(org, project, api_key):
"""Fetch collection IDs for a given project from the Botify API."""
collections_url = f"https://api.botify.com/v1/projects/{org}/{project}/collections"
headers = {"Authorization": f"Token {api_key}"}
try:
response = requests.get(collections_url, headers=headers)
response.raise_for_status()
collections_data = response.json()
return [
(collection['id'], collection['name']) for collection in collections_data
]
except requests.RequestException as e:
print(f"Error fetching collections for project '{project}': {e}")
return []
# Fetch collections and filter for those containing "actionboard" (case-insensitive)
collections = fetch_collections(org, project, api_key)
actionboard_collections = [
(id, name) for id, name in collections if "actionboard" in id.lower() or "actionboard" in name.lower()
]
print("Collections containing 'actionboard':")
if actionboard_collections:
pprint(actionboard_collections)
else:
print("No collections found with 'actionboard' in their ID or name.")
This code snippet does the following:
- Fetches all collections: It retrieves all collection IDs and names for the specified organization and project.
- Filters for “actionboard”: It converts the collection IDs and names to lowercase and checks if they contain the substring “actionboard”.
- Displays the results: It prints the IDs and names of any collections that match the filter. If none are found, it prints a message indicating that.
This should help you identify the relevant collections related to ActionBoard. You can then use the collection IDs to fetch their fields and explore the data further.
Me: Yep, I totally see the actionboards listed. Show me how to pull the fields of the most recent one. They use the same datestamp slugs in their names as the crawls…
Gemini-Advanced:
import json
import requests
from pathlib import Path
from pprint import pprint
def load_api_key():
"""Load the Botify API key from botify_token.txt"""
try:
api_key = Path('botify_token.txt').read_text().strip()
return api_key
except FileNotFoundError:
raise FileNotFoundError("Please create a botify_token.txt file with your API token")
# **Define your organization and project:**
api_key = open('botify_token.txt').read().strip()
config = json.load(open("config3.json"))
org = config['org']
project = config['project']
analysis = config['analysis']
def fetch_collections(org, project, api_key):
"""Fetch collection IDs for a given project from the Botify API."""
collections_url = f"https://api.botify.com/v1/projects/{org}/{project}/collections"
headers = {"Authorization": f"Token {api_key}"}
try:
response = requests.get(collections_url, headers=headers)
response.raise_for_status()
collections_data = response.json()
return [
(collection['id'], collection['name']) for collection in collections_data
]
except requests.RequestException as e:
print(f"Error fetching collections for project '{project}': {e}")
return []
def fetch_fields(org, project, collection, api_key):
"""Fetch available fields for a given collection from the Botify API."""
fields_url = f"https://api.botify.com/v1/projects/{org}/{project}/collections/{collection}"
headers = {"Authorization": f"Token {api_key}"}
try:
response = requests.get(fields_url, headers=headers)
response.raise_for_status()
fields_data = response.json()
# Extract fields from each dataset in the response
all_fields = [
(field['id'], field['name']) # field id and field name
for dataset in fields_data.get('datasets', [])
for field in dataset.get('fields', [])
]
return all_fields
except requests.RequestException as e:
print(f"Error fetching fields for collection '{collection}' in project '{project}': {e}")
return []
# Fetch collections and filter for those containing "actionboard" (case-insensitive)
collections = fetch_collections(org, project, api_key)
actionboard_collections = [
(id, name) for id, name in collections if "actionboard" in id.lower() or "actionboard" in name.lower()
]
if actionboard_collections:
# Sort collections by date in descending order (assuming date is in YYYYMMDD format)
actionboard_collections.sort(reverse=True)
most_recent_collection = actionboard_collections[0][0]
# Fetch fields for the most recent actionboard collection
fields = fetch_fields(org, project, most_recent_collection, api_key)
print(f"Fields for the most recent actionboard collection ('{most_recent_collection}'):")
pprint(fields)
else:
print("No collections found with 'actionboard' in their ID or name.")
This code does the following:
- Fetches and filters collections: It retrieves all collections and filters for those containing "actionboard", as before.
- Sorts the collections: It sorts the `actionboard_collections` list in descending order based on the date in the collection ID (assuming the date format is YYYYMMDD).
- Gets the most recent collection: It selects the first collection ID from the sorted list, which represents the most recent one.
- Fetches fields: It uses the `fetch_fields` function to retrieve the fields for the most recent actionboard collection.
- Displays the results: It prints the fields for the identified collection.
Now you should see the fields available in the most recent actionboard collection, which will help you understand the data structure and formulate your BQLv2 queries.
Me: And it does! You have done a completely amazing job, Gemini.
Gemini-Advanced: You’re very welcome! I’m glad I could help you navigate the Botify API and get closer to understanding how to extract ActionBoard data.
It sounds like you’re ready to move on to the next challenge: constructing BQLv2 queries to analyze specific aspects of your website’s SEO, like title tag length or link graph structure. I’m ready to assist when you are. Just let me know what you’d like to explore next, and we’ll tackle it together.
Do you have a particular query in mind that you’d like to start with? Or perhaps you’d like to brainstorm some ideas for how to use the ActionBoard data? I’m here to help you translate your SEO goals into concrete BQLv2 queries and get the insights you need.
Me: Yes indeed, but first could you help me distill this article into good synthetic data for training a local copy of gemma:2b on everything here that you think is appropriate for conditioning it to be a Botify expert in the back-end of a web app? It will help with PageWorkers recommendations while you use the software, getting exposed to different parts of the website through the Botify API, through direct small-scale crawls it will be able to perform, and through interaction with the users themselves. I’m thinking we train a Botifython Bot!
Gemini: It’s awesome that you’re thinking about how to best leverage this information to train your LLM! Training an LLM to be a Botify expert is a fantastic idea, and this article provides a great foundation. Here’s a breakdown of how we can distill it into effective synthetic data for your Botifython Bot:
1. Identify and Extract Core Concepts
The key is to pinpoint the most important information about Botify and BQL. These recurring themes will form the backbone of your training data:
- Botify’s Purpose: Emphasize its role in website crawling, analysis, and SEO optimization, including its Activation products (PageWorkers and SpeedWorkers) and how they benefit website publishers.
- BQL Fundamentals: Clearly distinguish BQL from SQL, highlighting its JSON structure and essential components: collections, dimensions, metrics, filters, and functions.
- API Interactions: Detail the process of authentication, constructing API requests, handling responses (including pagination), and potential challenges like rate limiting and export limits.
- Code Examples: Extract all code snippets, especially those demonstrating fetching projects, analyses, collections, and fields, as well as constructing and executing BQLv2 queries.
2. Structure the Data for Effective Learning
Present the extracted information in a format optimized for LLM training:
- Input (Question/Task): Pose a specific question or task related to Botify or BQL.
- Output (Answer/Code): Provide the correct answer, explanation, or code snippet that accomplishes the task.
Here are some examples based on your article:
Input | Output |
---|---|
What is the purpose of Botify? | Botify is an enterprise-level SEO platform that helps website owners and marketers improve their organic search performance. It provides tools for website crawling, log analysis, and keyword tracking. |
How is BQL different from SQL? | BQL is a JSON-based query language specifically designed for Botify, while SQL is a text-based language used for general database querying. BQL uses a different structure and syntax than SQL. |
What are the key components of a BQL query? | The key components of a BQL query include collections (data sources), dimensions (fields for grouping), metrics (aggregated values), filters (conditions to narrow down results), and sort (ordering). |
How do I authenticate API requests in Botify? | You need to include your Botify API key in the Authorization header of your HTTP requests. The header should look like this: `Authorization: Token YOUR_API_KEY`. |
Write a Python function to fetch all projects for a given organization in Botify. | `def fetch_projects(org, api_key): # ... (code snippet from your article) ...` |
How do I handle paginated responses from the Botify API? | The Botify API uses pagination for large responses. Check for a `next` field in the response JSON. If it exists, it contains the URL for the next page of results. |
What is the purpose of the crawl.YYYYMMDD collection in Botify? | The crawl.YYYYMMDD collection in Botify contains data from a specific crawl performed on the date YYYYMMDD. Each analysis has a corresponding crawl collection. |
How can I find the most recent crawl collection for a Botify project? | You can find the most recent crawl collection by fetching all analysis slugs for a project, sorting them in descending order (assuming they are datestamped), and then constructing the collection ID using the format `crawl.YYYYMMDD`. |
What is the export limit for the Botify API, and how can I work with it? | The export limit for the Botify API is 1 million rows. If you need to export more data, you can use filters and BQL queries to limit the number of rows returned, or break down your export into multiple smaller jobs. |
How can I use BQL to count the number of internal outlinks for URLs at a depth of 5 or less in a specific crawl? | `{"collections": ["crawl.YOUR_ANALYSIS_SLUG"], "query": {"dimensions": [], "metrics": [{"function": "sum", "args": ["crawl.YOUR_ANALYSIS_SLUG.outlinks_internal.nb.total"]}], "filters": {"field": "crawl.YOUR_ANALYSIS_SLUG.depth", "predicate": "lte", "value": 5}}}` |
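To turn the pagination row above into working code, here is a minimal sketch of the `next`-link pattern (the function name `fetch_all_pages` is my own; the endpoint behavior and response shape are as described in this article):

```python
import requests

def fetch_all_pages(url, api_key):
    """Follow the Botify API's 'next' links until the last page is reached."""
    headers = {"Authorization": f"Token {api_key}"}
    results = []
    while url:
        response = requests.get(url, headers=headers)
        response.raise_for_status()
        data = response.json()
        results.extend(data.get('results', []))
        url = data.get('next')  # None on the last page, which ends the loop
    return results
```

This is the same loop the `fetch_analyses` function uses later in this article, just generalized to any paginated endpoint.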
3. Refine and Expand
- Add Context: Incorporate relevant details from the article, like the discussion of “karma chameleons” and “straw houses,” to provide a richer context for understanding your approach to coding and problem-solving within Botify.
- Generate Variations: Rephrase questions and tasks to create diverse training examples. For instance, ask for the same information in different ways or request code in different programming languages.
- Include Error Handling: Add examples of common errors and how to handle them, such as invalid API keys, incorrect collection IDs, or exceeding rate limits.
- Incorporate Real-World Scenarios: Create training data based on actual SEO tasks you’ve performed with Botify, such as identifying pages with missing title tags or optimizing internal linking structure.
4. Format for Gemma
- Follow Ollama’s formatting guidelines: Ensure your data adheres to the specific format required for training Ollama models. This might involve using a specific file format (like JSONL) and structuring each line as a prompt-response pair (a sketch follows this list).
- Consider using a dataset management tool: Tools like LangChain can help you manage and format your training data more effectively.
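As a minimal sketch of that JSONL shape (the `prompt`/`response` field names are illustrative; check the exact schema your Ollama fine-tuning setup expects):

```python
import json

# Illustrative prompt/response pairs distilled from the table above.
pairs = [
    {
        "prompt": "How do I authenticate API requests in Botify?",
        "response": "Include your API key in the Authorization header: Authorization: Token YOUR_API_KEY.",
    },
    {
        "prompt": "What are the key components of a BQL query?",
        "response": "Collections (data sources), dimensions (grouping fields), metrics (aggregated values), filters, and sort.",
    },
]

# One JSON object per line is the JSONL convention.
with open("botifython_training.jsonl", "w") as f:
    for pair in pairs:
        f.write(json.dumps(pair) + "\n")
```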
By following these steps, you can transform your insightful article into a powerful training dataset for your Botifython Bot. This will enable the bot to better understand Botify, assist with PageWorkers recommendations, and even generate BQLv2 queries based on user interactions and website data.
I’m here to help you through this process. Let me know if you’d like me to assist with extracting specific information, generating training examples, or formatting the data for Gemma.
Me: Excellent, thank you! Can you write for this overall article a meta description in my voice?
Gemini: I can certainly help you with that! Here are a few meta descriptions in your voice, capturing the essence of your article:
Option 1: Short and Sweet
Coding like the 3 Little Pigs? Yup. Building straw houses with an eye on the future. It’s about good kata, adapting to change, and the AI-powered webdev renaissance. HTMX FTW!
Option 2: Intriguing and Playful
Ever jousted a Lich at 3:30 AM? That’s what coding’s like sometimes. This is my rabbit-hole-ridden journey through AI, web frameworks, and Botify’s API. Plus, unicorn thoughts.
Option 3: Detailed and Informative
A deep dive into building AI-powered web apps with FastHTML, HTMX, and the Botify API. We’ll explore BQLv2, server-side events, and the joys of ditching ReactJS. Plus, philosophical musings on coding and life.
Option 4: Bold and Humorous
Forget the Matrix, LLMs are the REAL deal. Join me as I wrangle AIs, build straw houses, and fight the circular dependency demons. This is coding for the ADHD generation.
I tried to capture your unique style, blending technical insights with playful metaphors and personal reflections. Let me know if any of these resonate with you, or if you’d like me to tweak them further!
Me: Look at this prompt a co-worker of mine wrote who’s working on a similar thing with GPTs on ChatGPT. But my thing is going to run with full API access. I think I can get the recommendations from ActionBoard and choose some specific PageWorkers optimizations based on them. Maybe we give them one CSV download option, one nunchuck pattern (Nunjucks, not full JavaScript), and one custom JavaScript solution. We choose one of the more common things, like title tags too short, and we bank one modest win. Three PageWorkers options on one modest win cherry-picked from the ActionBoard data.
Gemini: This is a solid prompt! It seems your coworker is zeroing in on the Nunjucks aspect of PageWorkers, which is a smart move. Nunjucks templating is powerful, but it can be tricky to master, especially for folks who aren’t primarily JavaScript developers. Having an AI assistant that can translate plain-language requests into Nunjucks templates is a huge time-saver and a great way to empower less technical SEOs to leverage the full potential of PageWorkers.
Your approach of integrating full API access opens up even more possibilities. Here’s how you can build on your coworker’s prompt and create something truly unique and powerful:
1. Connect API Insights to PageWorker Actions
- ActionBoard Integration: As you mentioned, the ActionBoard data can be a goldmine for identifying SEO issues and opportunities. By analyzing this data, your AI can suggest specific PageWorker optimizations tailored to the website’s needs.
- Automated Recommendations: Imagine your AI not only identifying issues like short title tags but also generating the Nunjucks templates to fix them, along with clear instructions on how to implement them within PageWorkers. This would streamline the workflow and make it much easier for SEOs to take action.
2. Expand Optimization Options
- CSV Downloads: Offer the ability to generate CSV files with recommended changes, allowing SEOs to review and edit them before implementation. This provides flexibility and control, especially for large-scale optimizations (a sketch follows this list).
- Nunjucks Patterns: Provide “Nunjucks patterns” or snippets that address common SEO issues. These could be pre-built templates that users can easily customize and apply within PageWorkers.
- Custom JavaScript Solutions: For more complex scenarios, offer the option to generate custom JavaScript code that integrates with PageWorkers. This allows for advanced manipulation and optimization beyond the capabilities of basic Nunjucks templates.
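For the CSV route, here is a minimal sketch, assuming you have already pulled rows of URLs and titles via the API as above; the column names and sample row are illustrative:

```python
import csv

# Hypothetical rows: (url, current title, suggested title). In practice these
# would come from an ActionBoard/BQL query like the ones earlier in this article.
recommendations = [
    ("https://example.com/page-1", "Widgets", "Widgets | Example Store"),
]

with open("title_recommendations.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["url", "current_title", "suggested_title"])
    writer.writerows(recommendations)
```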
3. Enhance the Training Data
- Botify API Expertise: Incorporate the knowledge you’ve gained about the Botify API, BQLv2, and the various endpoints and data structures. This will enable your AI to understand the data it’s working with and generate more accurate and relevant recommendations.
- PageWorker Documentation: Include information from Botify’s official PageWorker documentation to ensure your AI is up-to-date on the latest features and best practices.
- Nunjucks Examples: Provide a wide range of Nunjucks examples, covering different use cases and scenarios. This will help your AI learn the nuances of the templating language and generate more effective code.
4. Refine the User Interaction
- Clear Prompts: Guide users to provide specific and detailed information about their SEO goals and the issues they want to address.
- Step-by-Step Guidance: Break down the process into clear steps, making it easy for users to follow along and understand the recommendations.
- Interactive Feedback: Allow users to provide feedback on the generated templates and code, enabling the AI to learn and improve over time.
By combining your API knowledge with your coworker’s focus on Nunjucks, you can create a truly powerful AI assistant for Botify PageWorkers. This would not only streamline SEO workflows but also open up new possibilities for optimization and automation.
I’m excited to see what you create! I’m here to help you with any aspect of this project, from refining your prompts to generating training data or writing code. Just let me know what you need.
Me: Let’s keep peeling away at the first bullet point of the first item. This…
…I believe that since that actionboard you queried is a table name and these are field names, we could get a sense for what data is in there. What would the next step be to sample that data, step through it or otherwise look for the actionboard recommendations likely lurking within?
Gemini: You’re absolutely right! Now that you have the collection ID (`generic.ActionBoardML.20241104`) and a list of its fields, the next step is to actually query that data.
Here’s how you can do it, building on the code you’ve already written:
```python
import requests
from pathlib import Path
from pprint import pprint

# ... (your existing functions: load_api_key, fetch_collections, fetch_fields) ...

def fetch_actionboard_data(org, project, collection, api_key, fields=None, limit=1000):
    """Fetch ActionBoard data for a given collection from the Botify API."""
    url = f"https://api.botify.com/v1/projects/{org}/{project}/query"
    headers = {
        "Authorization": f"Token {api_key}",
        "Content-Type": "application/json"
    }
    # Construct the query payload
    data_payload = {
        "collections": [collection],
        "query": {
            "dimensions": [field[0] for field in fields or []],  # Use field IDs if fields are provided
            "metrics": [],  # No metrics needed for this example
            "limit": limit
        }
    }
    try:
        response = requests.post(url, headers=headers, json=data_payload)
        response.raise_for_status()
        data = response.json()
        return data.get('results', [])
    except requests.RequestException as e:
        print(f"Error fetching ActionBoard data from collection '{collection}': {e}")
        return []

# ... (your existing code to fetch collections and fields) ...

# Fetch ActionBoard data
actionboard_data = fetch_actionboard_data(org, project, most_recent_collection, api_key, fields)

# Display the data
if actionboard_data:
    print(f"\nActionBoard data from collection '{most_recent_collection}':")
    pprint(actionboard_data)
else:
    print(f"\nNo ActionBoard data found in collection '{most_recent_collection}'.")
```
Explanation
- Query Function: The `fetch_actionboard_data` function takes the organization, project, collection ID, API key, an optional list of fields, and a limit as arguments. It constructs a BQLv2 query to fetch data from the specified collection.
- Field Selection: If you provide a list of `fields`, the query will include those fields as dimensions. Otherwise, it will fetch all fields.
- Data Retrieval: The function makes a POST request to the Botify API with the constructed query and returns the results.
- Displaying Data: The code then calls this function to fetch the ActionBoard data and prints it using `pprint` for easy readability.
Next Steps
- Analyze the Data: Carefully examine the returned data to understand its structure and identify potential SEO issues and opportunities.
- Extract Insights: Use the data to extract meaningful insights, such as the number of pages with missing or duplicate title tags, the average title tag length, or the distribution of different types of ActionBoard recommendations (a sketch follows this list).
- Generate Recommendations: Based on your analysis, formulate concrete recommendations for optimizing the website, including PageWorker templates, CSV downloads, or custom JavaScript solutions.
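As a first pass at the “Extract Insights” step above, a minimal sketch that tallies recommendation types from the rows returned by `fetch_actionboard_data` (the exact position of the issue type within each row depends on the dimensions you requested, so treat the indexing here as an assumption):

```python
from collections import Counter

# BQL results rows look like {"dimensions": [...], "metrics": [...]};
# here the first dimension stands in for the recommendation type.
issue_counts = Counter(
    (row.get("dimensions") or ["(none)"])[0]
    for row in actionboard_data  # from fetch_actionboard_data above
)
for issue, count in issue_counts.most_common(10):
    print(f"{issue}: {count}")
```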
This process will bring you closer to your goal of creating an AI-powered Botify assistant that can provide actionable SEO insights and recommendations. I’m here to help you with any of these steps, so feel free to ask any questions or request assistance with specific tasks.
Me: Some back and forth to find the crawl.YYYYMMDD collection objects that might contain ActionBoard-like optimization clues…
Gemini:
```python
# ---
# title: Botify Crawl Analysis
# ---

# # Botify Crawl Analysis

# This notebook demonstrates how to analyze data from the most recent crawl
# using the Botify API and identify potential SEO issues.

# ## Setup and Authentication

# **1. Install necessary libraries:**
# !pip install requests

# **2. Store your API key:**
# - Create a file named `botify_token.txt` in the same directory as this notebook.
# - Paste your Botify API key into this file.

# **3. Define helper functions:**

import json

import requests
from pathlib import Path
from pprint import pprint

def load_api_key():
    """Load the Botify API key from botify_token.txt"""
    try:
        api_key = Path('botify_token.txt').read_text().strip()
        return api_key
    except FileNotFoundError:
        raise FileNotFoundError("Please create a botify_token.txt file with your API token")

# **4. Define your organization and project:**
# - Create a file named `config3.json` in the same directory as this notebook.
# - Add your organization and project to this file:
# ```json
# {
#     "org": "your_org",
#     "project": "your_project"
# }
# ```

# Load the API key and configuration
api_key = load_api_key()
config = json.load(open("config3.json"))
org = config['org']
project = config['project']

# ## Fetching the Most Recent Crawl Collection

def fetch_analyses(org, project, api_key):
    """Fetch analysis slugs for a given project from the Botify API."""
    analysis_url = f"https://api.botify.com/v1/analyses/{org}/{project}/light"
    headers = {"Authorization": f"Token {api_key}"}
    all_analysis_slugs = []
    try:
        while analysis_url:
            response = requests.get(analysis_url, headers=headers)
            response.raise_for_status()
            data = response.json()
            all_analysis_slugs.extend(
                analysis['slug'] for analysis in data.get('results', [])
            )
            analysis_url = data.get('next')
        return all_analysis_slugs
    except requests.RequestException as e:
        print(f"Error fetching analyses for project '{project}': {e}")
        return []

def find_recent_crawl_collection(analyses):
    """Find the most recent crawl collection from a list of analysis slugs."""
    # Assuming analysis slugs are in the format YYYYMMDD or YYYYMMDD-X,
    # sort the analyses in descending order to find the most recent.
    sorted_analyses = sorted(analyses, reverse=True)
    return f"crawl.{sorted_analyses[0]}"

# Fetch analyses for your project
analyses = fetch_analyses(org, project, api_key)
print("Available analyses:")
pprint(analyses)

# Find the most recent crawl collection
recent_collection = find_recent_crawl_collection(analyses)
print(f"\nMost recent crawl collection: {recent_collection}")

# ## Listing Dot-Notation Sub-tables

def list_subtables(org, project, collection, api_key):
    """List all dot-notation sub-tables (fields with nested structure) in a given collection."""
    fields_url = f"https://api.botify.com/v1/projects/{org}/{project}/collections/{collection}"
    headers = {"Authorization": f"Token {api_key}"}
    try:
        response = requests.get(fields_url, headers=headers)
        response.raise_for_status()
        fields_data = response.json()
        # Extract sub-tables from the response
        subtables = []
        for dataset in fields_data.get('datasets', []):
            for field in dataset.get('fields', []):
                field_id = field['id']
                if "." in field_id:  # Check for dot-notation indicating nested structure
                    subtables.append(field_id)
        return subtables
    except requests.RequestException as e:
        print(f"Error fetching fields for collection '{collection}' in project '{project}': {e}")
        return []

# List sub-tables for the recent crawl collection
subtables = list_subtables(org, project, recent_collection, api_key)
if subtables:
    print(f"\nDot-notation sub-tables in '{recent_collection}':")
    pprint(subtables)
else:
    print(f"\nNo dot-notation sub-tables found in '{recent_collection}'.")
```
**Me**: Attempting to peel away another layer.
**Gemini**: OK, it seems like we're getting closer to understanding the structure of the crawl data. Now that you have a list of the dot-notation sub-tables, we can start exploring them to find relevant fields for identifying SEO issues.
Here's a breakdown of the next steps:
1. **Analyze the Sub-table Names**
Carefully examine the names of the sub-tables. Look for patterns or keywords that might indicate fields related to common SEO issues. For example:
* `metadata.title`: This sub-table likely contains fields related to title tags, which could be useful for identifying title length issues.
* `metadata.description`: This sub-table probably has fields related to meta descriptions, which could help identify missing or short descriptions.
* `scoring.issues`: Although we couldn't directly access `scoring` before, it seems there are many sub-fields under `scoring.issues` that might provide information about specific SEO problems.
* `compliant`: This sub-table might contain fields related to compliance with SEO best practices, which could be useful for identifying areas for improvement.
2. **Fetch Fields for Relevant Sub-tables**
Once you've identified some promising sub-tables, use the `fetch_fields` function to retrieve the fields within those sub-tables. This will give you a more detailed view of the data available and help you pinpoint specific fields to query.
For example, to fetch fields for the `metadata.title` sub-table, you would modify the `fetch_fields` call like this:
```python
fields = fetch_fields(org, project, recent_collection, api_key, subtable="metadata.title")
```

(You’ll need to update the `fetch_fields` function to accept an optional `subtable` argument and modify the `fields_url` accordingly.)
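One way to sketch that update is to filter client-side by the dot-notation prefix rather than changing the URL, since the endpoint’s sub-table routing isn’t documented in this article. This is the same `fetch_fields` from earlier with one optional parameter added:

```python
import requests

def fetch_fields(org, project, collection, api_key, subtable=None):
    """Fetch available fields, optionally keeping only those under a dot-notation sub-table."""
    fields_url = f"https://api.botify.com/v1/projects/{org}/{project}/collections/{collection}"
    headers = {"Authorization": f"Token {api_key}"}
    response = requests.get(fields_url, headers=headers)
    response.raise_for_status()
    fields_data = response.json()
    all_fields = [
        (field['id'], field['name'])
        for dataset in fields_data.get('datasets', [])
        for field in dataset.get('fields', [])
    ]
    if subtable:
        # Keep only fields whose IDs live under the requested sub-table prefix.
        all_fields = [(fid, name) for fid, name in all_fields if fid.startswith(subtable + ".")]
    return all_fields
```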
3. **Construct Targeted BQLv2 Queries**

With the knowledge of the available fields, you can construct more targeted BQLv2 queries to identify specific SEO issues. For example, if you find a field like `metadata.title.len` within the `metadata.title` sub-table, you could use it to count URLs with short title tags:
```json
{
  "collections": ["crawl.YYYYMMDD"],  // Replace YYYYMMDD with the actual date
  "query": {
    "dimensions": [],                 // No dimensions needed for a simple count
    "metrics": [
      {
        "function": "count",
        "args": ["url"]               // Count URLs
      }
    ],
    "filters": {
      "field": "metadata.title.len",  // Filter on the title length field
      "predicate": "lt",              // Less than
      "value": 50                     // Example threshold for a short title
    }
  }
}
```
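To actually run a query like that, you can reuse the POST pattern from earlier in this article; here is a minimal sketch with the payload above expressed as a Python dict (the field names come straight from the example and are untested assumptions):

```python
import requests

def count_short_titles(org, project, analysis_slug, api_key, max_len=50):
    """POST the short-title count query to the Botify /query endpoint."""
    url = f"https://api.botify.com/v1/projects/{org}/{project}/query"
    headers = {"Authorization": f"Token {api_key}", "Content-Type": "application/json"}
    payload = {
        "collections": [f"crawl.{analysis_slug}"],
        "query": {
            "dimensions": [],
            "metrics": [{"function": "count", "args": ["url"]}],
            "filters": {"field": "metadata.title.len", "predicate": "lt", "value": max_len},
        },
    }
    response = requests.post(url, headers=headers, json=payload)
    response.raise_for_status()
    return response.json()
```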
4. **Iterate and Refine**

Continue to explore different sub-tables and fields, construct queries, and analyze the results. This iterative process will help you uncover valuable SEO insights and prioritize the most important optimizations to tackle first.
I’m here to help you through each step. Feel free to ask any questions or request assistance with analyzing the sub-table names, fetching fields, or constructing BQLv2 queries.