The Perfect Storm For Node Computing Is Upon Us
by Mike Levin SEO & Datamaster, 03/12/2012
I have to move faster in order to position myself in the path of the coming tornado. I often know what’s about to be big and develop small home-grown versions: blogging before Blogger, social media before Facebook, and all sorts of CRM tie-ins with business systems before that became trendy.
Well, the new thing I’ve had a tough time articulating I now know to be NODE computing, or generic computing. It represents a pretty big shift for me from being a software guy to more of a hardware guy… or “Node” guy. So it’s time to go on the record with these predictions and rush out my first alpha version of a portable, persistent virtual machine, to start demonstrating how cool node computing is, and maybe this time plant my flag in the subject matter.
Node computing is not cloud computing. It’s not Raspberry Pi micro-servers. It’s not nomadic QEMU virtual machines (VMs). But it is treating all of these as under one umbrella—being able to take advantage of and thrive on whatever is best for your app, with your code flowing seamlessly from place to place, surviving over the years and surviving through disruptions.
The cloud, micro-servers and VMs are all just different form-factors of a generic code execution platform, and in most other ways they are very interchangeable. You can log into any of them, and it looks the same. You can write code on any of them, and it runs the same. If you didn’t know beforehand what you were logging into, you couldn’t tell at a glance.
I speak not of game console or phone app development, where things can’t really be generic for high-performance and proprietary commercial reasons. Instead, it’s the homogenization of computing that began with porting Unix to all kinds of hardware in the 1970s, which the US government accelerated by commissioning the TCP/IP protocol in that same decade, which in turn ushered in the Internet in the 80’s, caused the Web explosion via Linux/Apache through the 90’s, and is today the generic information technology “plumbing” inside everything… except Windows and game consoles.
Therefore, so long as the type of programming you’re doing is for the Internet (and not graphics on the local machine), a strategy resides here for making yourself incredibly tech savvy, highly valuable in the marketplace, and more-or-less obsolescence-proof. What you’re NOT becoming is an iPhone or Android app developer, or a Wii, Playstation or XBox developer. There is great value in being those types of developers too, but your knowledge and know-how gets tied to a particular product, and is therefore less timeless.
The trade-offs here should be obvious. When you are programming for a special piece of hardware with special capabilities (particularly graphics and audio), you can do some kick-ass impressive stuff. You’re essentially “hitting the hardware” or “hitting the metal,” although it’s really all done through custom APIs these days in order to preserve a modicum of portability to future versions of the product.
If you’re programming for the Web, then you typically consider your target the Web browser, meaning the DOM, HTML5, CSS3 and JavaScript. In my mind, this is almost as specialized and full of caveats as game development. It’s just a little bit better. While all the above-mentioned technologies, gratefully, are finally getting standardized, it’s still a huge distraction. For example, you might think you’re 90% done with a project when you’re really only 10% done, because you hadn’t yet considered the user interface. You dedicate a large portion of your career and mental energies to these strange artifacts, which cuts back your ability to just be a casual programmer on the side.
Thankfully, there is another approach. And that is programming for the Internet in general, and not the Web in particular. Once that mountain of specialized, volatile DOM/HTML/CSS/JS knowledge that is “web programming” is cut out, and you focus instead on pure data—or at least on data that gets expressed as easily transportable XML or JSON—you can suddenly start working some magic in very short order.
This “other type” of Internet programming (notice: not “web” programming) allows you to glue together other people’s systems—connecting dots that nobody thought to connect before—without getting bogged down by all this specialized platform knowledge. In some cases, you don’t even have to do user interface work. You just hijack some other product’s UI, and just wedge yourself in there, such as I do with Google Spreadsheets, as you will soon see.
And so with pure Internet programming, you can bring bits to the picture—the specialized intelligence that no one else can provide—without having to be an expert in a hundred other things. You only need to be able to express yourself well in a language that likewise strips out the nonsense and lets you get down to business: Python.
However, you still need a place to run your code: a place from which to issue commands and listen for responses. You can think of it as programming agents, or minions, or the sorcerer’s apprentice animating brooms. But because you are free from all the constraints of graphics and UI work, this environment can be so utterly generic that you can think of it almost as a universal computer.
Of course it’s not a true universal computer, as that would be a minimal Turing machine with unlimited storage. But for practical application in the day-to-day world, it’s as close to a universal, timeless computer as we can get. Why? One of the “Nodes” of which I speak is an open source computer emulator that has been ported to all major platforms: QEMU. It runs with a double-click from any of their desktops, without so much as an install. And because the machine is persistent, it’s your latest work no matter what computer you sit down at and pull it up on.
Further, QEMU can emulate other CPU architectures, so even as computing trends change, QEMU can shift and change with them, always keeping your code from yesteryear up and running, or aiding in porting it to the latest, snazziest hardware. And QEMU is only one leg of the 3-legged Node Computing platform I propose, the other legs being instances in the managed cloud (e.g. Rackspace and Amazon EC2) and real (albeit teensy-tiny) hardware. Hence, the “Nodes”.
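To make the portable-VM leg concrete, here is a sketch of what booting such a node looks like. The image name `node.qcow2` is hypothetical (a small disk holding Linux, Python, vim and hg), and the flags are typical rather than prescriptive; the command is guarded so it only actually boots where QEMU and the image exist:

```shell
# Sketch: one command boots the same persistent disk image on any host OS.
# "node.qcow2" is a hypothetical image; flags shown are typical, not prescriptive.
if command -v qemu-system-x86_64 >/dev/null 2>&1 && [ -f node.qcow2 ]; then
    # 256MB RAM, user-mode networking (no host config), serial console only
    qemu-system-x86_64 -m 256 -hda node.qcow2 -net nic -net user -nographic
else
    echo "QEMU or node.qcow2 not present; command shown for illustration"
fi
```

The same invocation works on the Windows, Mac and Linux builds of QEMU, which is exactly what makes the VM a “node” rather than an installation.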
With your code running on a sufficiently generic platform, you can easily “flow” it onto equivalent computing nodes anywhere in the world, or right in your living room. There’s lots of talk of private clouds where you can reproduce what Rackspace is doing on your own virtual server farm. But that still requires a rather costly server that runs hot and would take up too much space and electric bill at home. But treat the Raspberry Pi and its descendants as tiny blade servers… well, you’ve got a dev machine and a production machine for only $70!
The advantage of running your own servers, no matter how tiny they are, is a deeper and more fundamental understanding of what you’re doing. You get a better feel for resource allocation, security, scaling, backups, and a whole set of things that managed virtual hosting blissfully insulates you from. The freedom that managed cloud hosting provides is good, but it’s even better after you’ve had to deal with a lot of these things yourself. Now that all the software you need to run a business has become free, about the only thing big companies have to offer you these days is managed hosting, and you should know how to be free of even that.
So, in a scenario where there’s little difference between a keychain USB stick virtual machine, a Raspberry Pi, or an instance in the Rackspace or Amazon EC2 cloud, the technologies that become important are “core Unix” (we will have a large discussion later about what’s core), mastery over an editor like vim, mastery over a language like Python, and mastery over a distributed version control system like git or Mercurial (a.k.a. hg).
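The distributed-VCS leg of that toolset is what lets a project “flow” between nodes with its entire history intact. A sketch using git (the Mercurial equivalents are nearly identical: hg init, hg add, hg commit, hg log):

```shell
# Sketch: with a distributed VCS, a project's full history travels with it.
dir=$(mktemp -d)                                  # throwaway demo directory
cd "$dir"
git init -q .
echo "print('hello, node computing')" > app.py
git add app.py
git -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "first portable commit"
git log --oneline                                 # the history lives right here...
# ...and "git clone" (or "hg clone") copies it, whole, onto any other node.
```

No central server is required; every node holds the complete repository, which is exactly the survive-through-disruptions property node computing calls for.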
With that, you’ll have a small set of timeless tools that you can gradually assert mastery over, and be as comfortable with as your primary spoken language. I propose that for node computing, Linux / Python / vim / hg is like the LAMP stack (Linux, Apache, MySQL, Perl/PHP/Python) for the next generation. But you don’t want hogs like Apache and MySQL as required elements in your thin stack, especially now that Oracle owns MySQL and nginx is stealing Apache’s thunder.
There’s really no reason to have a fat web server or SQL server in your stack anymore. A web server is built into Python’s standard library, so you can code one up in a few lines of your own Python and do away with maybe a million lines of someone else’s. And a large RDBMS should not be assumed so much either, with so many alternatives, especially the now ubiquitous SQLite, which is usually enough for small jobs anyway. Then there are API-based database services, NoSQL, and a hundred other things that make today different from the 90’s/2000’s heyday of LAMP. The only assumptions anymore are the L and the P.
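Here is what “a web server in a few lines” looks like, a minimal sketch in today’s Python 3 syntax (the 2012-era module was BaseHTTPServer; http.server is its successor). The demo binds a free port, fetches its own page once, and shuts down so it runs to completion:

```python
# A complete web server in a handful of lines, using only the standard library.
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class HelloHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"Hello from a few lines of Python"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):          # keep the demo quiet
        pass

# Port 0 lets the OS pick a free port; serve in the background,
# fetch our own page once to prove it works, then shut down.
server = HTTPServer(("127.0.0.1", 0), HelloHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:%d/" % server.server_address[1]
print(urllib.request.urlopen(url).read().decode())
server.shutdown()
```

In real use you would bind a fixed port and call serve_forever() in the foreground; the point is that Apache’s job, for a small node, fits in one readable file.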
Speaking of efficiency, just forget about the dozens of other frameworks and runtime engines that are jockeying for an assumed position in your stack. Each one is a dependency waiting to break—and that includes Java. I’ve found that I can make do with about 60MB—and that includes all of Linux, the virtual machine code for three platforms, vim, Python and Mercurial. The fewer moving parts, the less that can break or be hacked, and the cheaper and easier it is to scale when you create new Nodes.
Each node should almost be thought of as an “embedded system”. These are the little computers that get embedded into everything like WiFi routers and feature phones, but are hardwired just to perform the tasks required by the device. Embedded systems need to be tiny and efficient, and Linux is already considered quite large for such applications. But by sticking to the embedded-system mentality, we tend to make smarter decisions.
I guess it only makes sense to summarize the principles that I feel go into Node computing.
1. Code is written for a very baseline computer so that it is highly portable, with very little use of the native graphics capabilities of the hardware even when available. Programming is usually intended for publishing and consuming Internet services.
2. The “software stack” should be as short as reasonably possible while still providing a powerful, abstracted programming environment. A language like Python provides this, but it can sit on top of Busybox instead of GNU in order to reduce the footprint.
3. The goal of the above-two principles is to always provide as obsolescence-proof a code execution environment as possible, which can easily live on the cloud, micro-servers and USB sticks. Projects should easily flow between hosting technologies as appropriate to the application or user.