The Best USB Drive Virtual Linux
by Mike Levin SEO & Datamaster, 07/29/2010
Note: This effort has come a long way since I wrote this article. Check out the virtual USB stick Linux called Levinux.
So this is the real deal: a USB flash-drive virtual machine that will readily boot under either Windows or Mac, running directly from the removable device without an install. Due to the decisions that went into its creation, it’s got a ton of advantages. Let’s look at a few.
First, it’s well-documented. I pieced together a ton of tiny little pieces, each with its own hurdle, and meticulously documented each one. You can either trace my footsteps to reproduce my process from scratch, step-by-step, tweaking my decisions as you go to have 100% control over your particular mix, or you can download my virtual shankserver when I post it.
Next, this virtual computer is reasonably free and open source, so we minimize concerns over licensing. Hand-in-hand with open source is the fact that the QEMU executable has been compiled for multiple platforms, which is why your virtual machine will boot on both PCs and Macs, but we already discussed that at length.
Something that is often thought about, and just as often given up on, with portable virtual machines is the ability to run in place on your USB drive. Being able to do so means you don’t have to go through any “import virtual machine” procedure that only serves to make redundant copies of your VM, slow down your use of the thing, tie it to particular host hardware, and create issues of which VM has your latest work. In short, you get persistence, and your one working copy can get better over time, no matter where you go. Any software you install, or development work you perform, stays with the VM as you go from host machine to host machine. Even so, I recommend always copying off “virgin state” virtual machines and making occasional backups so you can “go back” when you need to, but virtualization is a convenient backup method with any approach. The REAL advantage of this approach is that your single master VM can get forever better over the years, or over your entire life.
Related to, but not identical to, the ability to boot from a variety of host operating systems is the fact that we chose the least-proprietary format for our virtual hard disk file (raw), so there is a very good chance you could directly boot your virtual machine on other software in the future, such as VMware, Virtual PC, VirtualBox, Xen or one of the many others. I use the .raw extension, but this is essentially the same as .img, meaning it’s a raw dump of the disk image with no special formatting. I use “raw” to avoid confusion with other file formats that share the “img” extension. It takes a lot of space on the real physical hard drive, but maintains the highest level of compatibility and ability to be directly read and converted. QEMU’s normal file format is qcow, which would result in a smaller file that grows as software is installed within Linux. That would have been nice, but it ultimately goes against my philosophy. Remember, each time you convert your virtual disk image you’re also forking your code. Now there are 2 copies, where there was previously only one. In some unseen future, you may want to choose virtualization software other than QEMU to get your VM booted, and it’s best if your image is natively and directly readable, so raw it is.
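To make “raw” a little more concrete: a raw image is nothing more than a file whose length equals the size of the virtual disk, where byte N of the file is byte N of the disk. The Python sketch below only illustrates that idea and is not necessarily how the image in this article was made. The helper names `create_raw_image` and `read_sector`, the temporary directory, and the 512-byte sector size are assumptions chosen for the example.

```python
import os
import tempfile

MB = 1024 * 1024


def create_raw_image(path, size_bytes):
    """Create an empty raw disk image: a plain file of exactly size_bytes."""
    with open(path, "wb") as f:
        # Extending the file with truncate() yields zero-filled content.
        # Many filesystems store this sparsely; FAT (typical on USB
        # sticks) has no sparse files and allocates the space for real.
        f.truncate(size_bytes)
    return os.path.getsize(path)


def read_sector(path, sector, sector_size=512):
    """Read one sector straight out of the image; there is no header to skip."""
    with open(path, "rb") as f:
        f.seek(sector * sector_size)
        return f.read(sector_size)


if __name__ == "__main__":
    # Hypothetical location; the article's own file is called harddisk.raw.
    image = os.path.join(tempfile.mkdtemp(), "harddisk.raw")
    size = create_raw_image(image, 500 * MB)
    print(size // MB, "MB image,", len(read_sector(image, 0)), "byte first sector")
```

Because there is no container format wrapped around the data, any tool that understands disks can read the file at plain byte offsets, which is the compatibility argument made above. A qcow file, by contrast, carries its own header and allocation tables, so it has to be interpreted or converted first.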
What’s next in the advantages of the approach we took? Oh yeah! We chose Linux (over, say, FreeBSD), and so we are faced with the dilemma of how to create a bare-bones, stripped-down version of Linux without wasting all of our time on philosophical debates over precisely what a Linux “base system” is. It’s not just the kernel. And it’s not some particular selection of files that some centralized committee agrees on as the stuff that goes into the directory structure: things like a compiler, the basic Linux commands, a text editor, etc. Linus Torvalds, who maintains the Linux kernel, aside, everything else is up for debate in the world of Linux, which is why there are so many distributions. And there’s no real “base” (though there is in the BSD world). Thankfully, someone else has gone through this philosophical debate, and we can just piggyback on their work. That decision was made when we chose Debian Linux as our distribution, with its nifty little debootstrap tool that does all the work of deciding precisely what that minimal bare-bones base system is for Linux. We just fire off a command!
Did I mention Debian? Well, in addition to all the advantages that inherently come with that, like almost never having to compile software manually to install it, and having all your software dependencies and conflicts automatically worked out for you thanks to the aptitude / apt-get package manager system, you also get the additional support that Debian receives by being the underlying distribution behind the easily installed and massively popular Ubuntu, which equates to lots of PC hardware support. Also, the popular LiveCD survival disk, Knoppix, is based on Debian, which means that we have the ideal popular, ready-made, feature-laden LiveCD boot disk to do all our fun stuff like debootstrap from. Umm, did I mention that Google’s ChromeOS also sits on top of Ubuntu, and thus Debian? That means yet more driver support, long-term viability, and getting to know the inner workings of a whole new breed of impending consumer electronic devices that are a lot less like traditional computers, but are ideal candidates for our shankserver hardware, which will be the real subject of this website.
Other advantages? Okay, since we’re using a bare-bones minimal Linux install, you have an ideal “virgin” system to adapt to other uses. It’s a good basis for spinning your own JeOS boxes (just enough operating system). Precisely how small is our operating system, all told? Well, it’s a bit hard to say from the outside, given that we chose the raw file format over qcow, so we will have to log in and look. Just type:
And this is before we even cleaned up the system. Sounds difficult? Oh yeah, we’re on Debian! Simply type:
Going from an 83% used hard drive to a 70% used hard drive by typing one command… not bad. Oh, you say there’s only 109M available for all our software and data? Well, that leads us to the final advantage of this set-up. As a 500MB file, it (plus all the Mac & PC QEMU support files) will readily fit onto a 650MB CD-ROM, or even onto a ridiculously cheap 1GB USB stick, with almost half the space left over. But wait, we can’t fit it onto a 512MB stick? No, because after formatting, there’s only about 486MB of space left over, a wee bit too small for our image plus support files. So why not format larger, and get a wee bit more space for apps? I guess you can, and if I were to do this again, I might tweak up the size so that it still just fits on a CD-ROM, but when you start making backups and stuff, you will be thankful for the slightly smaller size.
These are the types of places where your own procedure may differ from mine. You may want to make your harddisk.raw 1.5GB to fit on a 2GB USB stick, or go with the qcow format so it stays as small as possible, and only inflates like a balloon as you fill it with data.
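If you do pick your own size, the fit check is simple arithmetic. Here is a rough Python sketch of it. The 500MB image, the 650MB CD-ROM and the 486MB usable on a formatted 512MB stick come from the numbers above; the 20MB figure for the QEMU support files is purely a placeholder assumption, and `headroom_mb` is a made-up helper name. It also treats “MB” loosely, without distinguishing decimal from binary megabytes.

```python
def headroom_mb(capacity_mb, image_mb, support_mb=0):
    """Space left on the medium after the image and the QEMU support files."""
    return capacity_mb - image_mb - support_mb


IMAGE_MB = 500    # size of harddisk.raw in this article
SUPPORT_MB = 20   # ASSUMPTION: placeholder for the Mac & PC QEMU files

# Usable capacities taken from the article text.
media = {
    "650MB CD-ROM": 650,
    "512MB stick (486MB usable after formatting)": 486,
}

if __name__ == "__main__":
    for name, capacity in media.items():
        left = headroom_mb(capacity, IMAGE_MB, SUPPORT_MB)
        verdict = "fits" if left >= 0 else "does not fit"
        print(f"{name}: {verdict} ({left:+d} MB)")
```

Swap in the usable capacity of your own stick and your own image size to see how much room is left for backups before committing to a format.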
But all in all, I think I have designed the best USB drive virtual Linux given the state of today’s technology. Sure, there are compromises, like relying on somewhat flaky-seeming websites to make our compiled QEMU executables for us, but the 80/20 rule is satisfied. I will improve it over time, tweaking it this way and that. This keychain virtual server will always be a side project on shankserver, because it’s the perfect learning exercise for working with real hardware. I almost think of this whole process as being like a Jedi learning to build a lightsaber.