Mike Levin SEO

Future-proof your technology-skills with Linux, Python, vim & git... and me!

Jump on the giant Internet clean-up bandwagon while waiting for tectonic plates to shift

by Mike Levin SEO & Datamaster, 09/11/2012

I’m back to work after a very long time off, between a one-week vacation and grieving the passing of my mother. My mind has maybe been less than fully into the game of late, and that has to stop because things are in flux and people will be looking to me to discern what’s going on and how best to position ourselves and our clients for advantage. That, and I just love this stuff. It will be a nice place to escape to for a while.

Currently, I feel like I’m rudderless and adrift on the sea. The world is probably about to change again tomorrow with an Apple announcement. I’m loving my Google Nexus 7. I clearly have a very hardware-centric view of the world, because I love me some hardware, and am keenly aware of what happens in your life as you go through an upgrade cycle. Out with the old, in with the new—and THAT’S the key moment where the world can change on you. Are you moving from iOS to Android? What’s the default search set to? Is there a new search model trying to weasel its way in?

These are things on which I need to think and write in a very freeform fashion, just to see what I think. It’s not the kind of writing that merits being a “point of view” document for my employer. It’s just sort of regurgitating my thoughts onto the page, for the benefit of—or at the expense of—you, the unfortunate reader.

It’s very liberating writing just for yourself, knowing that others can read along if they like… but if not, screw ‘em. I don’t make ad revenue, and I’m not writing content for a particular market segment or demographic. This article is just me indulging in that as part of my professional calibration process. I have to discern where we are today, where we are going, and make thousands of little micro-adjustments to my day-to-day work and my career. I also need to get my easy elevator pitch spiel down about how things are changing, and what we have to do to adapt, keep pace, and lead the way.

I am un-influenced by corporate interests, except in those cases where it’s genuinely earned. In fact, I’m paying to publish—using a real web host and a custom domain and managing my own WordPress. By putting my stream of consciousness out publicly, I’m building a little personal brand (nothing will be more valuable in the coming years), and applying the notions of commitment and consistency.

This works for me. I do minimal editing, and focus on ferreting out what’s really on my mind and what’s really important—often right under the surface waiting to come out with a little prodding. This is BEFORE any of the more formal writing I have to do for work, where editing is required, and the work-in-progress self-narrative churn I do here isn’t appropriate. But this leads to that, in a process I think of as list-reduction.

In life, there’s not enough time to do everything top-notch or to know everything deeply. So, you take in lots and lots of input automatically and lightly scan it for significance. When you notice things of importance in the stream, you take note and essentially tag it, resulting in a shorter sub-list that you can naturally filter on with the tag. That’s a big part of where the world is going too. This monitoring the stream and rapid-tagging is somewhat taking the place of Google’s older crawl-and-index model.
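
Here is roughly what I mean by list-reduction, as a minimal Python sketch. The stream items, the tag names, and the keyword rules are all made up for illustration; a real stream would be a feed or a log, and real tagging would be done by people or by something smarter than substring matching.

```python
from collections import defaultdict

def reduce_stream(items, rules):
    """Scan each item once and file it under every tag whose keywords match.

    `rules` maps a tag name to a list of lowercase keywords (hypothetical).
    Items that match nothing are simply not tagged; they stay in the archive.
    """
    tagged = defaultdict(list)
    for item in items:
        text = item.lower()
        for tag, keywords in rules.items():
            if any(keyword in text for keyword in keywords):
                tagged[tag].append(item)
    return dict(tagged)

# A made-up input stream and a made-up set of tagging rules.
stream = [
    "Apple announcement expected tomorrow",
    "Lunch menu for Tuesday",
    "Google tweaks search results layout",
    "New Nexus 7 review roundup",
]
rules = {
    "search": ["search", "google"],
    "hardware": ["apple", "nexus"],
}

shortlist = reduce_stream(stream, rules)
print(shortlist["search"])    # the shorter sub-list you filter on by tag
print(shortlist["hardware"])
```

The lunch menu never gets a tag, so it never shows up in any shortlist, but it is still sitting in the original stream if you ever need to go back for it.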

So what sticks, and what just gets washed away and lost forever in the stream archive? Well, how does it work with human brains? We process symbols, because the actual data of life is too complex and overwhelming. Also, we create associations between these symbols because things assume their identity and importance in how they relate to other things. Similar memories get grouped and consolidated. Experiences that are more emotionally charged have a better chance to persist.

Meanwhile, bland memories can just sort of vanish—never making the journey over from short-term memory to long. It probably has something to do with survival, which is why awful smells that signal do-not-eat poison are so memorable. But it’s all in the input stream. That’s why I call list-reduction fundamental to life.

I actually used this stream-monitoring and tagging process in making HitTail. The referrer stream of a tracking pixel is the constant input, and a series of very aggressive algorithmic filters—some with human oversight and some not—distill this referrer list down into just the new keywords that will have an impact in new content on your site. But list-reduction is everywhere. I recently saw the principle applied by Joel Spolsky’s crew in the Trello app. I think crowd-sourced tagging of the stream is gradually taking the place of the crawl model.
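
To make the referrer idea concrete, here is a toy Python sketch. It is not HitTail’s actual filtering, which is far more aggressive; it only shows the first reduction step, assuming the search engine passes the query in a `q` parameter of the referrer URL. The function names, the sample referrers, and the known-keyword set are hypothetical.

```python
from urllib.parse import urlparse, parse_qs

def keyword_from_referrer(referrer):
    """Return the normalized search phrase from a referrer URL, or None.

    Assumes the query lives in a `q` parameter, as in the sample URLs below.
    """
    values = parse_qs(urlparse(referrer).query).get("q")
    if not values:
        return None
    # Lowercase and collapse whitespace so near-duplicates compare equal.
    return " ".join(values[0].lower().split())

def new_keywords(referrers, known_keywords):
    """Reduce a referrer stream to phrases not already seen, in arrival order."""
    seen = set(known_keywords)
    fresh = []
    for referrer in referrers:
        keyword = keyword_from_referrer(referrer)
        if keyword and keyword not in seen:
            seen.add(keyword)
            fresh.append(keyword)
    return fresh

# Made-up referrer stream from a tracking pixel.
referrers = [
    "http://www.google.com/search?q=long+tail+keywords",
    "http://www.google.com/search?q=Long+Tail+Keywords",
    "http://example.com/some/page",
    "http://www.bing.com/search?q=referrer+tracking+pixel",
]
fresh = new_keywords(referrers, known_keywords={"seo tools"})
print(fresh)
```

Four referrers go in, two new phrases come out. Every later filter just keeps shortening that list.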

This way of thinking and behaving accepts the input-driven realities of life. You have to keep monitoring, monitoring, monitoring, with a light touch to capture via tagging. Otherwise, the little things that make all the difference are going to slip by. Such things that “slip by” are always still there in the stream archive, and can be the fail-over search back-fill if tagged stuff isn’t surfaced. And that sort of speaks to the role that the old crawl-and-index approach will continue to fill: the non-monetized and non-hot-topic long tail.

We see Google playing around with these factors in the search results—more so than in a whole decade. Of the fifteen-or-so years that I’ve been doing search engine optimization, I’ve only called myself a professional SEO for maybe 7 of those years. I resisted allowing this derivative of web development to define who I was until I went to work as a vice president of a public relations firm, heading up their SEO group. Before that, I was a Webmaster making a percentage of company gross revenues. I considered SEO one of the many things I did, and a means to an end—not the end in itself. I resisted it, because I am a builder. I like to build things.

But since I came to New York, I really have been promoting myself as an SEO—with a somewhat bad taste in my mouth, because it always somehow implied something a little seedy to me. SEOs have the reputation for being scammy. The fact that they’re always sending out spam offering their services doesn’t help. But this became the most valuable thing I had to offer professionally there for quite a few years, and so I gave in and called myself an SEO.

And so things are changing more than in the past decade, and I feel myself being a bit sorry I let myself get identified as an SEO. We always must remember that Google itself is nothing but a word-of-mouth success. Before the days of social networks and tweeting, Google ascended to popularity the old-fashioned way—sending links around in email and in-person chats around the water cooler. Things change fast. Remember Alta Vista? Remember MySpace? Well, Google doesn’t really have very much customer lock-in, because few people ever paid to use Google, and are not really customers. They are just users, and only the advertisers are customers.

No matter how mega-awesome a brand like Google has become, without real customer lock-in, such good-will and equity can dissolve overnight. Google is a castle built on clouds. And the cloud consisted very much of editorial search results content that was of a high quality in part due to their PageRank algorithm, and in part due to nothing better being around. And here we are fifteen years later, and PageRank is getting a bit crusty because it is excessively manipulated, and all the good new data is building up in social systems, with Likes, Tweets and +1’s.

And so now, we see the new world gradually taking shape. New content—and even old content—is constantly being given a chance to get votes and be buoyed by being floated in streams. There are so many of these streams now, it’s ridiculous. There’s the old Twitter stream and the Facebook news feeds, and the Google+ stream, and the homepage of every publisher that puts new articles on the homepage and in their RSS feeds, and the same with blogs and vote-up sites like Reddit and Hacker News. Streams are piling up upon streams. You hardly need to crawl anymore. Just let the crowd sit on the stream and tag like crazy as a new sort of information-age wanker pastime.

This tagging and distillation of the stream is gradually taking the place of Google’s old PageRank algorithms. With PageRank, a good piece of content might have accumulated a few thousand links over the years from sites throughout the Internet. Over time, cruft sets in. Half those links go bad, and the other half are the result of link manipulation. And such questionable links can only be made by content creators and publishers who are greatly unaccountable, because there is no author ID system built into the fabric of the Internet. So, manipulation is hard to spot and link-rot is fast to accumulate.

In other words, the Link Graph has an expiration date on it, especially when compared to the Social Graph alternative, that has built-in accountability by its very nature. People casting votes for things are permanently associating their identity with those pages. Abuse that gets a profile banned could cut off years of precious accumulated influence, and therefore a much higher quality dynamic is created with built-in accountability.

Quality content can’t be given a chance to rise to the surface without some sort of initial promotion. So, it’s becoming a “float and vote” dynamic. Float out some content, and see if it gets voted-up. In this scenario, the old-fashioned Top-10 search results of yesteryear become increasingly relegated to content-discovery overflow—the bottom-of-the-barrel scrapings when neither an advertisement nor socially super-charged content is surfaced on your search.

It’s not that invisible-hand PageRank is suddenly obsolete. It’s that over time, the quality that will be accumulating up in social systems will dwarf the link-graph by several orders of magnitude. How many webmasters are out there building links between sites anymore in a genuine un-manipulative fashion? Okay, how many people are hitting the Like button on pages because they genuinely like the content? Ten-to-one? A hundred-to-one? No, I think more like a thousand-to-one.

But all this wonderful social data is accumulating up primarily in Facebook, which is not a general search engine. It is also accumulating up in Twitter, which is also not a general search engine. The biggest irony in the advancement of search… a shift as big and serious as Google’s displacement of AltaVista… is that Google doesn’t own Twitter or Facebook. The great social warehouses of the world have become silos.

Google tried compensating with Google+, and the way it’s doing it is very clever: annoying everyone with a Gmail address to “upgrade” it into a Google+ profile. You don’t even need to actively participate in Google+ in order to be participating. Seth Godin for example didn’t do a single thing in Google+ except upgrade his Gmail address, and already he’s in 90K+ circles. And you can bet many of those circles are keyworded with names like Marketing, superimposing a sort of tagging system onto people.

Now these people in Google+ who have themselves become the center of Hubs thanks to the Circles system are probably becoming very influential. How do you think a +1 from Seth Godin will impact the search rankings of a marketing-related site versus the +1 of some Joe-anyone? Yep, social counters are not just a numbers game. It’s imminently going to become extremely qualitative, just as links from influential sites work in the PageRank scenario.
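
Nobody outside Google knows how such votes are actually weighted, so take the following Python sketch as pure illustration. It assumes each voter’s +1 counts for 1 plus the base-10 log of how many circles they are in. That formula, the voter names, and the circle counts are all invented for the example.

```python
import math

def vote_weight(circle_count):
    """Hypothetical weight of one vote: 1 plus the log of the voter's reach."""
    return 1 + math.log10(1 + circle_count)

def page_score(voters, circles):
    """Sum the weights of the distinct voters who hit +1 on a page.

    `circles` maps a voter name to the number of circles they are in;
    voters missing from the map count as having zero reach.
    """
    return sum(vote_weight(circles.get(voter, 0)) for voter in set(voters))

# Invented reach figures: one big influencer and two nobodies.
circles = {"influencer": 90000, "joe_a": 0, "joe_b": 0}

score_one_big_vote = page_score(["influencer"], circles)
score_two_small_votes = page_score(["joe_a", "joe_b"], circles)
print(score_one_big_vote > score_two_small_votes)
```

Under this made-up weighting, a single +1 from the influencer outscores two +1s from unknown accounts, which is the qualitative effect I am describing. The raw count says two beats one; the weighted count says otherwise.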

Back when the Web was difficult to crawl, and crowd-sourcing was not practical, Google’s web crawling approach was brilliant and difficult to replicate, and even harder to improve upon. A few voices dissented like Stephen Wolfram (of Wolfram Alpha), arguing that better information-sources should be utilized and results better formatted for consumption. Other dissenters like Vivisimo thought the results could be grouped real-time into relationship trees. But none of these voices prevailed in light of precisely how successful Google was with its approach.

But today, you can crawl about a billion pages in a week with fairly few resources. The specialness of Google’s secret sauce is significantly less special than it was ten years ago. What’s left is brand loyalty and habits. Searchers know where to go to look for things, and people trying to be found know where they have to go to passively market their wares to searchers (AdWords). It’s a good system that works, but it is not fortified against disruption by any stretch of the imagination. It is very vulnerable to game-changers—just as the pre-Google world was vulnerable to disruption by Google.

Okay, so what you’ve got is a company making about $40-billion a year putting ads in front of searchers, whose search data is becoming increasingly crusty, because marketers have had 15 years to figure out how to game the system—and its attempts to switch over to better relevancy signals based on difficult-to-scam crowd-sourcing are slow to achieve critical mass, because Google is the Johnny-come-lately to social.

Meanwhile, those with the best social signals are not using them for generic search in any meaningful way. Microsoft Bing does have deals with Facebook and Twitter allowing it access to some of this data, but for whatever reasons, it has not taken off. I would presume it is because Bing is not actually Facebook or Twitter. It just has deals and will always be limited in how cleverly and extensively it can use that data by the details of the deal. It certainly appears that way.

And so Google waits for critical mass to accumulate with Google+ while it tries to breathe life into and protect the validity of its old PageRank system. The more ads it can run on high-value searches, the more money it can make off those searches, and the more it can sweep the issues of its editorial search problems off of page-1 of results and under the carpet. And meanwhile, it waits for the world to change—trying to knit itself into the fabric of the Internet a little better, turning users into customers with Nexus 7 and maybe the next crop of Motorola RAZRs.

That leaves us in a sort of limbo. What do we all do while we’re waiting for things to either pan-out or not for Google? Well, we have a tiny bit of safe harbor in the fact that the Internet’s system of Web addresses is unlikely to change, and many of the signals for relevancy point to URLs. So just keep focusing on your website hierarchy and long-term evergreen URLs. Make them the best possible pages they could ever be on their themes, and promote the hell out of them.

If on the other hand all these social signals start pointing to content hosted on the social sites themselves (Facebook & Twitter), then we will be in a very different world indeed, and all bets are off. In fact, this sort of Bizarro alternate Internet does indeed exist within Facebook, with Facebook pages and Apps. But somehow I do not think the content hosted WITHIN social sites is going to be the winner.

So, what now? Well, jump on the giant Internet clean-up bandwagon. In the grand scheme of things, the social counters have just started counting. These counts are so far non-transferable. But also don’t get too hung up on the counts themselves, as it is going to soon become a qualitative game as well, controlled by precisely WHO hit the +1 button. So, turn your site into the best possible target for the crowdsource voting signals of Like, Tweet and +1.