Mike Levin SEO

Future-proof your technology-skills with Linux, Python, vim & git... and me!

Planning My Free and Open Source SEO Software Tools

by Mike Levin SEO & Datamaster, 07/09/2014

Hello Again World! Today’s my second full day in my office as an independent consultant. Before you go on, this is a reminder that this is one of my daily journal posts. There’s a lot of raw unedited rambling to help me think better and organize my time better. This post is about self-navigating more than it is about making a particular point, though you will see (it is my hope) some of the earliest stages of the rise of something new and different: free and open source SEO software that can be so because it doesn’t do any of the blackhat dark arts stuff. It’s all going to be whitehat, focusing on the timeless parts of SEO that amount to thoughtfulness, good investigation and good design. It will take me a while before I actually have something polished out there ready to use, but the process will look something like this, which I’ll gradually do as a side-task while I’m focusing primarily on the client work.

I do have my first 2 clients, and am deeply engaged in their work. SEOs can indeed travel light, needing only their laptop, an Internet connection, a few logins and their wits to do their work, so I can get started fast, at least compared to the startup cost and time most businesses require. There are however still a few tools I’ll be needing: some I’ll reuse from my past, and some I’ll cobble together for my future. But before I dive into the tool build vs. buy discussion and… well, building or buying, I’m going to take a moment to talk about process, which exists separate from tools.

It’s hard not to focus too much and too early on tools, because they are a deeply connected part of all things in the human experience. Even language is a tool to help us clarify our thoughts, turning experiences into symbols and stories so that they can be communicated across society in-the-now or transmitted with books and archives to future generations. Tools are even inseparable from our discussion about separating tools from process, so this paragraph had to happen. We shape tools. Tools shape us. They are vicarious extensions of our bodies, and in time, we stop thinking about their use and just spontaneously use them expertly, without bothering the conscious mind. That’s how we type on keyboards, for example.

However, tools exist to accomplish or achieve some visualized end-goal, which this morning is to organize my morning’s thoughts so I can structure my work for the rest of the day, and perchance produce something that I can publish. I capture more ideas, videos and notes throughout the course of the day than I ever care to publish, for reasons that may include that I just don’t like the video enough to share, or that I don’t wish to share the ideas with the public just yet, or even that I don’t want to overwhelm my audience out there and dilute the impact of the more meaningful items I publish.

This sort of thinking out loud in the morning is necessary to process thoughts. There’s something about writing that just brings out pent-up thoughts. You start to make connections between things in your life that you never saw, because you never took the time to thoughtfully examine them. And when you write, you are almost being forced to think better or differently.

What you are doing here with writing, process-wise, is that you are being forced to “encode” your perhaps rather amorphous and ethereal thoughts into quite literal expression strings (writing!). Talking is the same way. You don’t always really know what you think about a thing until you talk about it with someone. Talking is encoding into sound the thoughts you are having, which sometimes must be “cleaned up and refined” in order to be expressed with the idea-encoding tool that is writing and language.

And when you write, it’s almost like you’re talking about things with a little virtual copy of yourself, whom, with some practice, you can make as critical or as non-judgemental as you want, based on your goals for that writing session. Such abstract notions might seem worlds apart from this independent consulting adventure that I’m embarking on, but they couldn’t be closer.

First off, idea-to-language encoding is a skill. There perhaps could be no greater skill, because with it, you can motivate the masses of other homo sapiens who subscribe to the same idea-to-language encoder/decoder (codec) as you do to do the things you want them to do. Hitler was a skilled idea-to-language encoder, and shows how the number of well-understood people that you can reach with the language directly affects the size and nature of system tasks you can start running.

Ah, systems! To appreciate what a system is, you have to look at any given little system in the context of the entire human experience. Quicken is a great system, but it exists in the context of a blip in time that requires bookkeeping for small businesses in the United States. The United States! Now there’s a system! But having started in 1776, it’s a 238-year experiment. A single person can live for 100 years, so the entire United States is only 2 long lifetimes back to back plus a 38-year-old. Good system so far. Civilization based around the domestication of food animals and agriculture is a system too.

These are all things that self-perpetuate and compel certain human behaviors. Some systems die out fast, providing all the components they need to destroy themselves, like using up some limited resource in their environment. Other systems burn on for ages and eons, in a sort of stability and balance that helps self-preservation. Burn too small, and the flame is extinguished. Burn too brightly and consume the forest. Burn just right, and the torch can be carried forward through the generations for centuries.

This article may be starting to become more of a distraction to my work than a structuring of it for today. And so the self-checking function of the system springs into action, triggered by glancing at the time (9:12 AM), double-checked by a quick scroll up-and-down the page for reading length. Would I read this? No, probably not (chump), but I’m not necessarily writing this with me re-reading or even you reading for the first time in mind. I’m writing this knowing that I’m about to face a pretty big process and tool issue, and I want to optimize how well I work through it.

And sooooo… I cannot and do not want to rely on one of my past creations, 360iTiger, for my client work, even though I created it and can make it do all sorts of incredible things at no cost to me. That’s because even though it is technically still open to the public, I won’t be able to do my customizations for my own use that I’m always doing, because it will shortly be on their servers and not (as it currently is) on my own Rackspace cloud account. And I won’t keep running my own instance simply out of good form, and the desire (and perhaps even need) to move on and do something new.

But something new can often require performing an enormous build before you start to get out of it more than what you put into it, and that pitfall alone could totally torpedo my attempt to go independent. I can’t go spending time building tools that I could better spend actually performing client work and delighting them with a lot of interaction, which I can see is an important criterion. It’s a lot like the field of graphic design that I trained for in school. It’s a show. Put on a good show while you do great work.

So, I have to focus on the show now, which is good because the show is also the story that’s being told, and it is also very tied to process, which is actually even more important than the tools at this point. I know I have to perform a site-crawl of my client’s sites, and of their competitors and of the search results. I know I have to pick apart what comes back from these crawls and turn them into lists.
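As a rough illustration of what "picking apart" a crawled page into a list could look like, here is a minimal sketch using only Python's standard library. It is not the tool described in this post; the names `LinkCollector` and `links_from_html` and the sample HTML are made up for the example, and a real crawl would also need fetching, politeness and error handling.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Turn relative links into absolute ones.
                self.links.append(urljoin(self.base_url, value))


def links_from_html(html, base_url):
    """Return the list of absolute link URLs found in an HTML string."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical page content standing in for one crawled page.
sample = '<p><a href="/about">About</a> <a href="http://example.com/blog">Blog</a></p>'
links = links_from_html(sample, "http://example.com/")
print(links)
```

The point is only that one crawled page becomes one plain Python list, which is the shape everything downstream wants.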

These lists may need further processing, like tagging items in the list. I may need to combine lists, deduping against one or more columns. The concept of algorithmic processing of these lists enters the discussion soon, and the sheer complexity of what you’re (I’m) trying to do rears its ugly head, and you start looking for either pre-existing canned tools that’ll do it all for you, or you start imagining building the product to do this yourself, and become one of those online billionaires providing it to everyone who needs it.
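To make the list chores above concrete, here is a small sketch of combining two lists, deduping against one or more columns, and tagging rows. The function names, the `tags` field and the sample keywords are all hypothetical, and normalizing keys by trimming and lowercasing is an assumption about what "duplicate" should mean.

```python
def combine_and_dedupe(lists, key_columns):
    """Merge lists of row dicts, keeping the first row seen for each key."""
    seen = set()
    merged = []
    for rows in lists:
        for row in rows:
            # Normalize the key columns so "Python SEO " matches "python seo".
            key = tuple(row.get(col, "").strip().lower() for col in key_columns)
            if key in seen:
                continue
            seen.add(key)
            merged.append(row)
    return merged


def tag_rows(rows, tag, predicate):
    """Return copies of rows, adding `tag` to the rows that match `predicate`."""
    tagged = []
    for row in rows:
        new_row = dict(row)
        tags = list(new_row.get("tags", []))
        if predicate(new_row) and tag not in tags:
            tags.append(tag)
        new_row["tags"] = tags
        tagged.append(new_row)
    return tagged


# Hypothetical keyword lists from a client site and a competitor site.
client = [
    {"keyword": "python seo", "source": "client"},
    {"keyword": "site crawl", "source": "client"},
]
competitor = [
    {"keyword": "Python SEO ", "source": "competitor"},
    {"keyword": "keyword list", "source": "competitor"},
]

merged = combine_and_dedupe([client, competitor], ["keyword"])
tagged = tag_rows(merged, "python", lambda r: "python" in r["keyword"].lower())
for row in tagged:
    print(row)
```

Even this toy version shows where the complexity creeps in: the moment you dedupe, you have to decide which copy wins and what counts as "the same".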

Every once in a while, I see a website or product pop onto the scene which is directly what I envisioned building in those situations. Joel Spolsky’s Trello site comes to mind. Eventually, everyone who’s dealing with lots of categorizing and sorting of things comes to the realization it’s always a linked list. Underlying so many systems is the management of lists, and the way the items in the lists link to items in other lists, with a set of rules about what these list items represent, link to, or otherwise contain, and how the new items are created, and new links established or old ones modified.

That goes a fair way towards describing the human brain in fact, and certainly the LISP programming language, and most common information management tasks or chores that we are called upon to do, certainly in the field of SEO. But Common LISP is the definition of making the wrong decision in build vs. buy, in that you would have to build everything up from nearly scratch only to have something that only you could understand. Bad choice for a free and open source suite of SEO software. Better to choose Python for its high-level list-centricity.

So, can I actually get the power of Python and actually easily manage lists? The Python part is easy, once the list data is actually on the computer that’s running Python. But then how about getting it out of Python into some format closer to being a client deliverable, like a spreadsheet? Well, these are the precise thoughts from which Tiger was born. It took a very particular approach towards keeping data natively in spreadsheets, and doing lookups using cell values as function input parameters (or arguments) and the column names as the functions to be used. In other words, column labels ARE function names and control their invocation, with Tiger. This struck an almost ideal balance between frequently repetitive work and occasional custom work.
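The "column labels are function names" idea can be sketched in a few lines. This is not Tiger's actual code, just an illustration of the dispatch pattern; the functions `upper` and `word_count`, the `FUNCTIONS` registry, and the choice to pass the first column's cell as the argument are all assumptions made for the example.

```python
def upper(value):
    """Example function: uppercase the input cell."""
    return value.upper()


def word_count(value):
    """Example function: count the words in the input cell."""
    return len(value.split())


# Registry mapping column labels to the functions they invoke.
FUNCTIONS = {"upper": upper, "word_count": word_count}


def fill_sheet(header, rows, input_column=0):
    """Fill empty cells by calling the function named by each column label.

    Assumes every row has as many cells as the header, and walks each row
    left-to-right, one row at a time.
    """
    filled = []
    for row in rows:
        new_row = list(row)
        arg = new_row[input_column]
        for index, label in enumerate(header):
            func = FUNCTIONS.get(label)
            # Only columns whose label names a known function are computed,
            # and only cells that are still empty are filled in.
            if func is not None and new_row[index] == "":
                new_row[index] = func(arg)
        filled.append(new_row)
    return filled


header = ["keyword", "upper", "word_count"]
rows = [["python seo", "", ""], ["free seo tools", "", ""]]
result = fill_sheet(header, rows)
for row in result:
    print(row)
```

Adding a new capability then means writing one function and typing its name into a column header, which is where the balance between repetitive and custom work comes from.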

The time-saving idea here, both from a developer perspective and from a user-training perspective, is that the spreadsheet user interface handled the heavy lifting of building a UI and UX around abstract concepts. Python did all the processing on an external server, processing the sheet row-by-row and cell-by-cell left-to-right and top-to-bottom until it was done. This had certain advantages (simple concept) and certain limitations, namely the chatty use of the gdata API and excessively tight coupling with the gdata API.

I plan on doing something like that again, and even though I’ll still be making heavy use of Google Spreadsheets again (inevitably—until there is some better way), I’m going to make a serious attempt to decouple the system in all possible ways so that Google Spreadsheet could have its functionality replaced by other things, such as simple text files as input and output. Then other UIs could be built around it. The connection to Google Spreadsheet will mostly be directing the input and output of worksheets within the tabbed spreadsheet, so that data can be managed directly in Google Spreadsheets, but it can be thought of as actual CSV files on a server’s hard drive somewhere running Python that can read and process those CSV files, and write out new ones that manifest as new worksheets (tabs) within that same spreadsheet file.
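One way the decoupling might look, as a sketch under stated assumptions: the processing step only sees rows, and reading and writing go through plain file-like objects holding CSV. Here `io.StringIO` stands in for a CSV file on disk (or a worksheet exported as CSV); the function names and the `length` column are hypothetical.

```python
import csv
import io


def read_rows(stream):
    """Read CSV text from any file-like object into a list of row dicts."""
    return list(csv.DictReader(stream))


def write_rows(stream, rows, fieldnames):
    """Write row dicts as CSV text to any file-like object."""
    writer = csv.DictWriter(stream, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


def process(rows):
    """Core logic: knows nothing about where rows came from or where they go."""
    return [dict(row, length=str(len(row["keyword"]))) for row in rows]


# StringIO stands in for an input CSV file or an exported worksheet.
source = io.StringIO("keyword\npython seo\nsite crawl\n")
sink = io.StringIO()

write_rows(sink, process(read_rows(source)), ["keyword", "length"])
output = sink.getvalue()
print(output)
```

Because `process` never touches a spreadsheet API, the same code could be fed by a text file today and by a worksheet adapter later, with the adapter being the only part that changes.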

Okay, so I just passed 10:00 AM and have identified another pitfall. As appealing as following this tool-development course may be—and it is taking on a beautiful love-worthy shape in my head now—it is still a time-sink distraction from the work that has to be done. So, the approach I really want to be taking is designing this software in my head and in my notes and on this daily work journal AS I DO THE WORK THE MANUAL WAY!!! I’m up to some pretty big list management tasks at this time, and I can certainly start taking baby-steps towards building the app I want to build WHILE actually doing the client work and getting familiar with the various nuances of the challenge (again).

I always want to feel like I’m doing something extraordinary, something that does more than merely put food on the table and help my clients and employers directly. I want to do something big: helping to educate the world’s knowledge-hungry people who are inhibited by their tools and by the popular notion that programming is something mysterious or difficult. Programming is just being that knowledge-hungry person with the willingness and self-determination to go that extra step and become acquainted with the skills of managing that knowledge and automating knowledge processes.

Okay, next step? 1, 2, 3… 1? A place to run my Python code that is easily reproducible and easily shared! This sounds like the beginning of my second GitHub project. I’ll share the code right from the start. And even if I don’t dive down the rabbit-holes of adjusting my Levinux distro to host this app right away, I’ll make my to-do checklist so I can work my way there over time. So in other words, use Levinux to write and run my new code. Write minimal new code to get the job done. This may mean finding easy ways to shuttle text files in and out of Levinux. Or maybe I’d be better served starting out on a virtual install of Ubuntu under VMware Fusion on my MacBook Air, just as a way of not struggling with the limitations of a bare-bones stripped-down system. Hmmmm.

With what I create, there will be some question of whether I’m trying to get SEOs to be more technical (turn them into Python programmers), or trying to get people trying to learn programming to get into SEO. Well, the real answer to this is I don’t care. These tools and lessons will be free and open source, and not how I make my money. It will be my main content for how I build my web presence, build a tribe, and in time, use it as a way of having a referral network—both for people to take on the clients I’m wrapping up the larger jumpstart engagements with, and as the actual clients using the SEO jumpstarting engagement services.

This is fast approaching 2 hours of writing, but it is worth it. This is the difference between thoughtfully approaching the work before you, versus the risk of sitting and looking at a blank screen and wondering what to do next.

It’s approaching 10:30, and I have to wrap this up quick. I don’t have scheduled calls today, but I want to be well underway, performing actual work, and getting into the zone. I want to be doing this BOTH for myself and for my clients, so it’s all about PROCESS now. But it’s time to get a few plates spinning. My 3 metaphors for the day (which apply now more than ever since I’m working for myself) are:

What’s most broken? Where do I get the biggest bang for the buck? What plates need to be spun?

And the answers are:

What’s most broken? I need to become more communicative with my clients on a daily basis. THEY NEED TO HEAR FROM ME DAILY WITH UPDATES!

Where do I get the biggest bang for the buck? Working my way through the processes associated with each deliverable with an eye towards what I need to buy and what I need to build. Buy quick. DON’T BUILD! JUST PLAN THE BUILD!

What plates need to be spun? Register PythonSEO.com… it really unifies two key concepts and search terms I need to tackle, and is good branding for the tools. And I actually need to jumpstart the process now without doing any of that software build. I’ve got a lot of site crawling to do. I think I’m just going to buy a license to Screaming Frog SEO Spider Tool & Crawler Software. Now THERE’S a product name rigged for search! It’s the cheapest, best shortcut I know to have something like my previous capabilities, one that will produce crawl data of high enough quality to make keyword lists from client sites and competitors.

I can see my path is leading fast back to doing API mashups with Google Spreadsheets, mostly just for its UI, using my own servers in the background running Python to process the lists. The plate-spinning item here is simply making sure I am working off a Linux platform that resembles a server enough to make my code portable and easily pushed to GitHub.