Mike Levin SEO


Ajax Datagrid in HitTail Browses Massive Data

by Mike Levin SEO & Datamaster, 07/07/2006

So, if you’re ever looking for an example of an Ajax datagrid that’s connected to massive amounts of live data and also allows column-based sorting, check out the one built into HitTail. Log in as Connors for a demo, and go to the Search Hits tab. Click around. Currently, it’s up to about 5 million records in one table, and performance continues to be as remarkably snappy as when there were only a few thousand records.

We don’t plan on deleting any of the HitTail data until the size of the database has a noticeable impact on performance. In fact, we get such great performance with massive quantities of data that we’re quite comfortable doing a timed reload of the data to simulate server push while we wait for a reasonable server-push mechanism to reach the Ajax libraries (if one ever does).
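To make the timed-reload idea concrete, here is a minimal sketch of polling on a fixed interval. It is written in Python for illustration only; in the browser the same loop would be a JavaScript timer. The names `poll`, `fetch`, and `on_change` are hypothetical and are not HitTail's code.

```python
import time

_NOTHING_YET = object()  # sentinel so the first fetch always counts as a change


def poll(fetch, on_change, interval_seconds=5.0, rounds=3, sleep=time.sleep):
    """Call fetch() on a fixed schedule and report each result that differs from the last.

    fetch and on_change are caller-supplied callables (hypothetical names).
    rounds bounds the loop so the sketch terminates; a real page would keep going.
    """
    last = _NOTHING_YET
    for i in range(rounds):
        current = fetch()
        if current != last:
            on_change(current)  # redraw the grid only when the data changed
            last = current
        if i < rounds - 1:
            sleep(interval_seconds)  # wait before the next reload
```

Because each reload is an ordinary stateless request, the server needs no open connection per viewer, which is the trade the paragraph above describes.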

The data is real-time, so if you paged forward and paged back and there was new data there, you would see it. There are no cached datasets on the database, webserver or client. It’s real-time data, all the time. With the rapid pace at which people are discovering and signing up for the HitTail beta, I’m sure we’ll be at a billion records under the Search Hits tab in no time. Eventually, we’ll have to start deleting records, but not until we see an impact on performance.

In addition to snappy performance with massive amounts of data, you can sort on any of the columns and still get high-performance data paging. This is a special little piece of magic that no one will appreciate, except for folks who have actually attempted to implement cache-less, cursor-less, state-less Ajax datagrids. In such cases, SQL is your enemy. Usually, you can only sort by a column with a unique constraint, and usually the primary key at that. And usually, you are almost bullied into using an API layer that supports states, sessions and cursors, thus rendering the whole thing useless for massive scaling.
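This post does not spell out the technique HitTail uses. As an illustration of the problem only, here is one well-known stateless approach, keyset (or "seek") paging, which sorts on a non-unique column by adding the primary key as a tiebreaker. It is a sketch under assumptions: the `hits` table, its columns, and the function names are invented for the example, and SQLite stands in for the real database.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory table standing in for a hits table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE hits (id INTEGER PRIMARY KEY, account TEXT, term TEXT)")
    terms = ["ajax", "grid", "ajax", "sql", "grid", "ajax", "paging"]
    conn.executemany(
        "INSERT INTO hits (account, term) VALUES (?, ?)",
        [("connors", t) for t in terms],
    )
    # Composite index in the same column order as the filter and the sort.
    conn.execute("CREATE INDEX hits_account_term_id ON hits (account, term, id)")
    return conn


def next_page(conn, account, after=None, page_size=3):
    """Return the rows that follow the (term, id) pair in `after`.

    `after` is the last row the client saw; None means start at the first page.
    Nothing is remembered between calls: no cursor, session, or cached result set.
    """
    if after is None:
        query = (
            "SELECT term, id FROM hits WHERE account = ? "
            "ORDER BY term, id LIMIT ?"
        )
        params = (account, page_size)
    else:
        term, row_id = after
        # The id tiebreaker makes the position unique even when terms repeat.
        query = (
            "SELECT term, id FROM hits WHERE account = ? "
            "AND (term > ? OR (term = ? AND id > ?)) "
            "ORDER BY term, id LIMIT ?"
        )
        params = (account, term, term, row_id, page_size)
    return conn.execute(query, params).fetchall()
```

The client sends back the last row it saw, for example `next_page(conn, "connors", after=page[-1])`, so any server in a cluster can answer the request.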

Based on all my googling and how little information I found on how to do this, I think I hit a blind spot in Web-developer land. Scott Mitchell of 4GuysFromRolla fame got closest to the issue in his ASP.NET DataGrid book, but didn’t offer a workable solution. .NET attempts to deal with the problem with disconnected recordsets that shift the burden to the client software. PHP and ASP Classic have ADOdb, which creates cached recordsets on the webserver. Databases themselves have alternative APIs that create sessions, cursors and persistent connections, with the accompanying evil temporary tables.

No solution offered massive scaling on minimal resources. So, we baked our own, which is completely counter-intuitive, and completely rocks. We feel like we turned a single webserver/database combo into a server farm based on the performance we’re seeing. Wait until we add clustering. HitTail is truly ready to scale to meet the worldwide demand. Our technique works well with one server and works phenomenally with clustering, because we have also dispensed with pesky session IDs.

You may not see all this wonderful mojo in the seemingly simple datagrid when you log into HitTail as Connors. But go in there and click around, knowing that the table you’re connecting to is paging through 5 million total records (filtered to just the few that belong to Connors). When you hit “Last”, you’re jumping 5 million records to the end. When you hit “First”, you’re jumping 5 million records to the beginning. It is one of the first and best examples of using Ajax and radically different SQL cursoring techniques to make massive quantities of data easily browsed, manipulated and tagged on the Web.
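The post does not say how the First and Last buttons are implemented. One cursor-free way to get the same effect is to flip the sort direction and take a single page from that end, rather than stepping through every row in between. The sketch below assumes an invented `hits` schema and function names, and uses SQLite in place of the real database.

```python
import sqlite3


def build_sample_hits():
    """Create a small in-memory hits table shared by two accounts (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE hits (id INTEGER PRIMARY KEY, account TEXT, term TEXT)")
    conn.executemany(
        "INSERT INTO hits (account, term) VALUES (?, ?)",
        [
            ("connors", "ajax"),
            ("other", "zebra"),
            ("connors", "sql"),
            ("connors", "grid"),
            ("other", "alpha"),
            ("connors", "paging"),
        ],
    )
    conn.execute("CREATE INDEX hits_account_term_id ON hits (account, term, id)")
    return conn


def edge_page(conn, account, which="first", page_size=2):
    """Fetch the first or last page of one account's rows, ordered by term then id."""
    if which not in ("first", "last"):
        raise ValueError("which must be 'first' or 'last'")
    # direction is one of two fixed strings chosen here, never user input.
    direction = "ASC" if which == "first" else "DESC"
    query = (
        "SELECT term, id FROM hits WHERE account = ? "
        "ORDER BY term " + direction + ", id " + direction + " LIMIT ?"
    )
    rows = conn.execute(query, (account, page_size)).fetchall()
    if which == "last":
        rows.reverse()  # put the final rows back into ascending display order
    return rows
```

Note that "last" here means the final `page_size` rows for the account, which may not line up with page boundaries counted from the start; a grid that needs aligned pages would have to handle that separately.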