Battling Bloat, Revisiting My QEMU Linux System

by Mike Levin SEO & Datamaster, 04/28/2011

Note: Since writing this article, I’ve released Levinux Beta, a tiny virtual version of Linux that boots with a double-click from a Windows, Mac or Linux desktop, running from a USB drive or Dropbox without any install.

It’s been a long time since my last post, but that’s because I’ve been busy USING all these systems I’ve been creating. An eerily portable QEMU keychain version of Linux has been part of my daily workflow for months now, and I’ve taken to keeping the file on Dropbox. The result is a stunning ability to be productive anywhere, anytime. The only downside is the file size.

I’ve also been using my SheevaPlugs to great effect. One even served as an emergency backup server when the big rack-mounted one went down, and nobody even knew the difference! I’ve distributed a few SheevaPlugs to co-workers, further distributing workloads and safeguarding my code-base through automated pulls against the main code repository. System administration is very similar, whether it’s a QEMU file or ARM-based Plug Computer. The vision is gradually coming into place, and it’s time to take the next step.

My next step is to learn things much more deeply, and establish even more control over this hardware and the files that make it go. Even though I’ve had what I consider dramatic success, I got there using the Debian apt-get shortcut, which spared me the need to compile anything. This has been a pleasurable experience, and I really love Debian. However, my next step involves building a system from scratch so I can understand things from the bottom up, and Debian is no longer a prerequisite. What I want is to build a super-barebones system, which at some given point can be upgraded to being Debian-like by adding dpkg.

But I have a huge knowledge deficit for these next steps, and will be taking baby steps with a series of blog posts that break down the issues. As usual, I will be thinking out loud in these posts, occasionally advocating Linux and open source, and occasionally diving into hard-nosed tutorials. I appreciate that this makes the blog a bit hard to navigate when you just want to get to the meat (I have tried navigating it myself), and can only say that this is what helps me work through the issues for myself. Given time, I will try to organize it better.

I have been incredibly happy with the shankserver sites so far, often encountering my own pages when googling these topics. Now I have the itch to learn *nix systems much more deeply. I used to make 800KB AmigaDOS boot disks that seemed to do more than 800MB Linux installations. I understood the reason for each file I put on the floppy disk, the merits of replacement packages, and what compression was occurring.

On Linux, I can’t seem to get a base install under 300MB. I know Damn Small Linux does it in 50MB, and that’s largely due to its using the 2.4 kernel rather than the 2.6. But I’m not interested in all the Linux desktop fanfare. I’m much more attracted to embedded systems and Linux appliances, where there is not one byte more code than there needs to be. Not only is that better for embedded systems, but it’s better for the cloud when you instantiate thousands of instances, and better for just plain old all-around learning, understanding, and by extension, security.

Further, these systems don’t really need graphics output at all, or much device support beyond what’s necessary for the shell, memory and networking. Modern kernels carry tons of overhead to support this hardware or that, and are easily the largest part of a Linux installation, especially when you also count all the module files that don’t get compiled into the kernel, but sit there in your /lib folder in case you ever need them. My goal is to get a QEMU portable application hosting environment down to 30MB or so. I think this challenge will make me learn a lot.
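
To see how much of an install the kernel and its modules really account for, a rough measuring stick helps. What follows is only a minimal sketch in Python, not something I used on the system described above. It assumes the modules live under /lib/modules and the kernel image under /boot, which is the usual layout but may differ on your system, and the function names tree_size_bytes and size_report are made up for illustration. It adds up apparent file sizes, so the numbers will not match actual disk usage exactly.

```python
import os


def tree_size_bytes(root):
    """Add up the sizes of all regular files under root.

    Symlinks are skipped so that linked files are not counted twice.
    These are apparent sizes, not blocks actually used on disk.
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if not os.path.islink(path):
                    total += os.path.getsize(path)
            except OSError:
                # Unreadable or vanished file: skip it rather than abort.
                pass
    return total


def size_report(paths):
    """Return {path: size in MB} for each path that is an existing directory."""
    return {
        p: tree_size_bytes(p) / (1024 * 1024)
        for p in paths
        if os.path.isdir(p)
    }


if __name__ == "__main__":
    # Assumed locations: adjust for your own system.
    for path, mb in size_report(["/lib/modules", "/boot"]).items():
        print(f"{path}: {mb:.1f} MB")
```

Running it before and after trimming modules gives a quick sense of how far a system is from a target like 30MB.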