Finishing a Major Feature Expansion Against All Odds - Getting Over The Hump

by Mike Levin SEO & Datamaster, 06/26/2012

Okay, a few words of preface before this long, somewhat disjointed post. I program, but programming is not my main job. The project I maintain has reached a point of blessed stability, and is deployed across multiple servers, serving agency partners, clients, and our internal staff, and it is even accessible to the public with a subset of features. Needless to say, I approach changes with a measure-10-times, cut-once attitude, which is really a shame, because with programming you should be able to push things aggressively forward.

I have not fully split the code-base between experimental and stable, but it is on a distributed version control system (Mercurial), so I have something of a safety-net and the ultimate undo. With a more sophisticated dvcs scheme, I could split it into stable and experimental instances. But until I do, a single cloned instance to which I have resisted pushing is my experimental code-base, and it is where I do projects like the one documented below.

Another thing to note: because programming is not my job (I am actually an SEO director helping oversee client work and going on sales calls), my forward progress on projects like this can be a little… uh… disjointed. I have to re-calibrate my brain on these projects every few days, and sometimes I don’t even get past the re-calibration into real productive work.

So it is against that backdrop that I cobble together contiguous blocks of focus-time, for it is those very contiguous multi-hour time-blocks that ALLOW stuff to get done at all. One hour is hardly enough time to re-calibrate your brain to where you left off the day before, so a series of one-hour “openings” between meetings can do more harm than good for inspired programming.

And so THAT concept gets to the heart of why the journal entries below come off the way they do. It is the eternal struggle to take the next big architectural step in 360iTiger and add not just a single feature, but a whole class of features that I’ve been rolling my eyes and saying “no” to so far. This is the story of adding the ability to “deliver files” from webhead servers. “Webheads” describes a type of generally stateless, interchangeable server created from a master image, useful for web applications that scale horizontally behind a loadbalancer. Lots of little webheads is one way of scaling in the cloud. Things you would not find on a webhead include databases, as a db is the very essence of exactly the sort of statefulness and memory you don’t want on a webhead. Rather, webheads perform fire-and-forget functions, finishing in exactly the same state as they started, much like the Web itself was originally intended before application developers got hold of it and started needing pesky things like persistence and state.

And so the question is: how do you make such dumb webheads able to do things like produce XML sitemaps and let the user retrieve the file? You can’t run an FTP server, because the loadbalancer will round-robin you to a different server than the one the file was saved on. If you store the file temporarily and hand out links, you both need to use the machine instance’s IP (the same loadbalancing problem as FTP) and need some scheme to clean up later. What, do you give them like 5 minutes? The answer to this conundrum is email… The server just sends files to a known email address (no loadbalancer retrieval issues), and then IMMEDIATELY deletes them. *POOF* like it was never there. Now, webheads can handle file-generation with a cache-and-vomit scheme.
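To make that concrete, here is a minimal sketch of the email-and-delete step in Python. Every name here, from the sender address to the helper function, is illustrative rather than the actual Tiger code, and it assumes a local mail relay is listening:

    import os
    import smtplib
    from email.mime.multipart import MIMEMultipart
    from email.mime.text import MIMEText
    from email.mime.application import MIMEApplication

    def email_and_purge(filepath, recipient):
        # Build a multipart message with the generated file attached.
        msg = MIMEMultipart()
        msg['Subject'] = 'Your Tiger output'
        msg['From'] = 'tiger@example.com'  # hypothetical sender address
        msg['To'] = recipient
        msg.attach(MIMEText('Your requested file is attached.'))
        with open(filepath, 'rb') as f:
            part = MIMEApplication(f.read())
        part.add_header('Content-Disposition', 'attachment',
                        filename=os.path.basename(filepath))
        msg.attach(part)
        # Send through a local relay, then delete. *POOF* like it was never there.
        smtp = smtplib.SMTP('localhost')
        smtp.sendmail(msg['From'], [recipient], msg.as_string())
        smtp.quit()
        os.remove(filepath)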

And as always, there is a slew of tiny details that need to be worked out. Each server could host multiple 360iTiger bookmarklet sites, each site could have multiple users, and each user could be using multiple functions that are capable of outputting files. So, there is a namespace file-collision issue that needed to be worked out. There is also the timing of conditions for creating and deleting those file cache locations. And finally, there are the details of making the email efficient. Both a single file per function and multiple files per function should be supported; any text files should all be zipped up together, while picture files, which are typically already compressed, should be left as-is. All files generated for a function (and conditionally zipped) should be added to a multi-part file attachment and sent to the user.
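As a rough sketch of the collision fix (all names hypothetical, not the real Tiger scheme): nesting the cache directories by site, user, and function gives every combination its own namespace, so parallel users can never clobber each other’s files:

    import os

    def cache_location(cacheroot, site, user, function):
        # One directory per site/user/function combination; no two users share a path.
        path = os.path.join(cacheroot, site, user, function)
        if not os.path.isdir(path):
            os.makedirs(path)
        return path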

It took me a long time to get through all that, due to my measure 10-times / cut once mentality right now, combined with being an SEO director on client work and sales calls. But yesterday was the break-through. I have everything working, and only testing and release remain. Well, actually applying this new capability to the request that started this whole thing also remains. But it should be trivial now that I can just drop the files into a specific location, and the generalized system takes care of the rest.

Woot! Booyah! And every other outmoded expression.

----------------------------------------------------------------------
Mon Jun 25 11:01:31 EDT 2012

Okay, the main thing right now is getting this friggin’ file-handling work done ASAP. It’s dragged out way too long. I don’t even know if it’s going to be useful to Tameka anymore, but it is a major expansion of the system that will allow me to say “yes” to many more requests. It is useful even just in its capacity as session-memory that doesn’t all have to be held in the server’s memory. It can be offloaded to the file system.

That whole 1, 2, 3 step procedure seems to be working well for me. Go with that. What’s your 1? Well, when last you left this project, you turned off file deletion and confirmed that file appending was working exactly as expected. Therefore, we have a location to work on for our email step. So, we have a few things there, the first being simply sending an email at all once the program reaches the correct area. If there’s anything in the cache location, send an email AT ALL. Yep, that’s step 1.

1. Put in an email function and call it from the right location.
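In sketch form, step 1 amounts to a single truth test; send_email here is a placeholder for whatever the real mailer turns out to be:

    import os

    def maybe_email(cachedir, recipient):
        # If anything at all is in the cache location, send an email at all.
        if os.listdir(cachedir):
            send_email(recipient)  # hypothetical placeholder for the real mailer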

Wow, done. It’s really amazing how big the little steps can be. What is my next step? I’ve got a 1-hour meeting coming up. Can I at least think through the step following that? Deconstruction is a big part of the baby-step productivity problem. It’s so hard to take the next step due to not knowing exactly what it is, and the zeroing-in process is actually difficult.

Okay, we have identified that a userfiles location exists. We need to step through each folder inside of that location. We do not need to walk an entire directory structure, and can live in simple-loop-land. We don’t need to go recursive or anything like that here. So, find the Pythonic way of doing “for each subdirectory in this directory”.
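The Pythonic answer turns out to be os.listdir plus an isdir filter; a minimal sketch, with the userfiles path as a stand-in:

    import os

    userfiles = '/path/to/userfiles'  # stand-in for the real location
    for name in os.listdir(userfiles):
        subdir = os.path.join(userfiles, name)
        if os.path.isdir(subdir):  # simple-loop-land: no recursion needed
            print(subdir)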

2. Output a message for each filecache directory found in a loop… done!

3. Hmmmmmm. Now, we look inside each filecache location that we find, and step through each file in the folder… twice! Once for zipping purposes, and again for appending to an email. But don’t go all the way yet. You can do that later. Okay, step 3 is to list every file encountered, and its mimetype prefix, so we can make determinations of what to do with it… done!
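A sketch of that listing step, using the standard mimetypes module (cachedir is a stand-in for one filecache location):

    import os
    import mimetypes

    cachedir = '/path/to/one/filecache'  # stand-in
    for fname in os.listdir(cachedir):
        mimetype, _ = mimetypes.guess_type(fname)
        prefix = mimetype.split('/')[0] if mimetype else 'unknown'
        print(fname + ': ' + prefix)  # e.g. 'report.txt: text', 'chart.png: image'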

4. Okay, we don’t want to actually do any processing as we collect the names of “zippable” files. We actually just want to make an empty list (in all cases), and append the name of each zippable file (like text files) to that list when encountered. That way, we do a simple truth test on the list when we’re done. If it’s empty, there were no compressible files. If it’s not empty, we’ve got a list of files to collapse down (add to zip and delete).
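Sketched out, continuing with the cachedir stand-in from the previous sketch, and with is_zippable wrapping the mimetype-prefix check:

    import os
    import mimetypes

    def is_zippable(fname):
        # Text-like files compress well; images are typically already compressed.
        mimetype, _ = mimetypes.guess_type(fname)
        return bool(mimetype) and mimetype.split('/')[0] == 'text'

    zippable = []  # start with an empty list in all cases
    for fname in os.listdir(cachedir):
        if is_zippable(fname):
            zippable.append(fname)
    # Truth test: an empty list is falsy, so nothing gets compressed
    # unless at least one zippable file was found.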

Okay, for this I had to make the row value global. I know global variables are bad in a world of ways, but you just can’t get away from them for certain types of applications, and it’s perfectly fine for any Tiger function to know what row is being processed without it having to be handed around as a parameter. This looks like a good time for a commit… and it is tested and working! Row names make a convenient filename key to ensure filename uniqueness on a per-row basis.
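The arrangement, sketched with invented names rather than the real Tiger identifiers:

    row = None  # module-level; the main loop sets this before each Tiger function runs

    def rowfilename(basename):
        # Prefix output files with the current row so files from
        # different rows can never collide in the shared cache.
        return '%s_%s' % (row, basename)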

Next step? I’ve got a list of files that should be compressed. If the list has anything, create a zip file. Wow! It’s being created. Okay, next step? Delete zipped text files as you go!
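Sketched with the standard zipfile module, continuing with the zippable list and cachedir stand-in from the sketches above:

    import os
    import zipfile

    if zippable:  # only create an archive if there is something to compress
        zippath = os.path.join(cachedir, 'textfiles.zip')
        archive = zipfile.ZipFile(zippath, 'w', zipfile.ZIP_DEFLATED)
        for fname in zippable:
            fullpath = os.path.join(cachedir, fname)
            archive.write(fullpath, fname)  # store under its bare name
            os.remove(fullpath)  # delete each text file as soon as it is zipped
        archive.close()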

Okay, the files are being deleted. Now, hit this thing home, finally! Make one more pass through the files that remain in the folder, and add them to yet another list, with full pathnames.
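In sketch form that final pass is tiny: whatever survived the zip-and-delete step (pictures plus the zip itself) becomes the attachment list:

    import os

    attachments = [os.path.join(cachedir, fname)
                   for fname in os.listdir(cachedir)]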

Wow, I think it’s friggin done! Tiger can now email the output from functions. I will have to hit this home for Tameka’s application tomorrow.

----------------------------------------------------------------------
Fri Jun 22 11:19:48 EDT 2012

Okay, so yesterday I actually got a tiny little bit done on the Tameka project. Just get something happening that fully demonstrates the new capabilities. You keep getting derailed and distracted, and weeks are going by. It’s bad to have client deliverables rely on major feature expansion. I have to set expectations more correctly. So long as I am working in my capacity as a director, deliverable dates are just “good ideas,” not to be relied upon.

Okay, so yesterday’s journal entry was actually really useful. Break EVERYTHING down into a 1, 2, 3 step procedure. That structures your thinking. You have to think sequentially, and you have to break the steps down to really small bits. Sooooo…

1. Argh! One? One? Okay, it’s time to do the create-or-append trick. How does create-or-append work in Python? This should be easy to test. From my research, it looks like it’s a one-liner:

file = open(filename, 'a')  # 'a' appends, creating the file if it does not exist

Okay, it really is that simple. There’s something odd with doing an ls from that directory in a terminal: you can’t see the file until you change out of the current directory and then back into it. But once I can see the file, its contents always check out correct. The pattern is dirt simple. If you want something that appends into a file on every row it hits, you simply do this:

def appendandemail():
    filename = filecache() + "test.txt"
    file = open(filename, 'a')  # create-or-append
    file.write('~')             # one tilde appended per row processed
    file.close()

It couldn’t be easier! And so much for step one. That’s going to be the repeating pattern every time I need to do this sort of thing, but that’s okay. The native way to do this in Python is just as good an abbreviation and abstraction as if I made up my own interface. So just go with it. Now for step 2.

2. Hmmmmm. Everything’s in place except for the emailing bit. So, what’s happening… oh, first, edit out the file-write and make sure the file from the last Tiger run stays around… it doesn’t. AH! I have TWO places where I’m clearing out the cache as a precaution against full caches when Tiger stops running from errors and such. Okay, fair enough.

----------------------------------------------------------------------
Thu Jun 21 15:08:49 EDT 2012

I don’t know exactly what it is, but I really have to get my mind into the game again. I need to make it feel REAL, REAL, REAL. I know how to do it. I just need to execute on it. I have the grand plan, and I look at my plan every day, since I made my one-page plan into my start-page. Somehow, still, I don’t have the clarity each morning and day to dive directly into the right work. Momentum isn’t occurring. I need momentum! How can I start building momentum?

I need the Tiger Playbook. Just actually start using it regularly. Do everything one might need to do with Tiger, with a special emphasis on checking off the to-do items for Danielle, Justin, and other stakeholders. Oh, and there’s that piece of work for Tameka that I nearly forgot about. If I just hit that home, I will have dramatically expanded the system. What would be the 1, 2, 3 step procedure for getting that done now?

1. Call the function that uses the filecache function.
2. Think through how to append into that location. The filecache location is going to have to return a directory location inside a variable (see the sketch after this list).
3. Turn off deleting the cache location so I can work on the append thing.
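A sketch of what such a filecache function might look like (everything except the filecache name itself is an assumption); it returns a path with a trailing separator so callers can write filecache() + "test.txt", as in the appendandemail snippet further up:

    import os

    CACHE_ROOT = '/tmp/filecache'  # hypothetical root for cached output

    def filecache():
        # Return (creating it if necessary) the cache directory for this run.
        if not os.path.isdir(CACHE_ROOT):
            os.makedirs(CACHE_ROOT)
        return CACHE_ROOT + os.sep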

Okay, got all that done. How about another 1, 2, 3. The power of Python compels you!

1. Create a full pathname to a text file, including the file’s name.

----------------------------------------------------------------------
Tue Jun 12 10:57:59 EDT 2012

Is it possible to do some quick, pure programming and turn over this Tameka project and fundamental system capability expansion once and for all? Okay, let’s try.

1. Load the code… check!
2. Regain your thoughts as to where you left off… hard to do.
3. Jump in head-first and write a function that appends to a text file across function invocations. Rename testfilecache2 to appendandemail… much more descriptive.

Okay, it is this simple deconstruction and reconstruction of a few invocations that makes work suddenly approachable again. It’s all about state. The human brain has just as much trouble with state as computers do. The human mind is more functional than stateful, especially as some time passes and the state you were in is forgotten.

Ugh! It’s 4:00 PM. Just scattered meetings here and there totally destroy productivity on the programming side. Not only do they themselves break the rhythm of the zone or the flow, or whatever you want to call it, but they make little cracks that let distractions seep in. Because in the minutes leading up to a meeting, you don’t want to be diving deep and zoning out, and in the moments after the meeting, it’s easy to think: “this is a good time to look at Twitter and email” or whatever.

The hardest thing in the world is getting a couple of hours of productivity in when you actually have the chance, because you just have to slam and bang your consciousness around, as if it were really that malleable—like silly putty.