Thursday, July 22, 2010

With head in the clouds

I write this post inspired by excellent article of Jeremiah Peschka (blog) who wrote about the future of databases. I think that this is a good topic to write something I hope meaningful about.

Jeremiah made several good observations about the nature of current database designs and technologies and what it seems to be an upcoming technology of future – the name-value stores.

I am sometimes faced with opinions of sort “Dude, these relational databases are sooo gone!”. Are they? Let’s see, what good “no sql”, cloud database should have:

Obviously trendy key-value collection. This is implemented usually using hash table, where key is transformed to certain numeric value, and this value is used to index table that holds actual data values. To add a value, you have to calculate hash of the key, resolve conflict and insert the value into appropriate index of the table. The other option is to use tree data structure, where nodes are balanced according to the key value, and each node contains actual data. To access data, you have to traverse tree based on the value of the key. When you insert a value, you have to rebalance tree (if you use balanced tree) to maintain predictable access time to the data.

In some cases you want to retrieve collection of objects within a specific order. This can be done in two ways – either you sort data each time the collection is requested, or you maintain doubly linked list of keys that enables you to move between nodes in specific order. Ah these old good *prev and *next pointers. The first option gives you overhead each time you retrieve the data, but doesn’t necessarily significantly consume memory. The second option requires additional memory and management for adding or removing values. If you change key value, the list has to be updated so the elements are returned in proper order.

Believe it or not, even clouds touch the ground sometimes. Even the cloud data have to be persisted sometimes. How do you do it? You can use either proprietary binary file structure for performance or xml, that’s pretty much it. When you write the data, you have to ensure that some other process is not writing it as well – concurrency control. You have to ensure that when data are written, other processes can read consistent view of data. There’s another thing – you don’t always write all data you have in memory. You should be able to persist only changes. To identify changed values on disk and update or insert or delete them accordingly. If you want to keep performance, these disk data structures have to be optimized for searching – here we have indexes.

All these points above are key factors of a good “no sql” database. All these points are also implemented in SQL Server. You have balanced trees for indexes, hash tables for plan cache, key indexed tables for row offsets on page level, linked lists of pages in indexes, sorting, data persistence technology optimized for concurrency and speed. You can, if you want to access data through OR mapper so you deal with collections, attributes etc.

So, what are the differences? The main and most important difference is the replication. SQL Server insists on storing changed data to disk first and then replicates it, the cloud data stores do the opposite – they replicate data to neighbouring nodes of the network before persisting it. This change of approach was made possible by development of technology, growing speeds of networks and capacities of memory installed on servers. It may be cheaper and faster to throw a few gigabytes over the fast network to the other side of the world than to store them in local SAN. The idea relies on the statistics – it is very unlikely that all of a sudden servers located in several places in the world will fail at the same time. SQL Server approach stems from old times, where servers were mostly standalone machines which had to ensure that no matter what, data will be available after disaster recovery.

The cloud applications often use traditional databases as their back ends in various locations for the reasons I described above. Traditional databases offer transactional and safe way of persisting data, what has to happen at some point of time.

I hope that this article will shed some light on internal implementation of cloud databases, especially the fact that all data structures and algorithms were invented may years ago, sometimes 50 years ago. It’s the power of technology and money what made cloud data stores possible, nothing else. There is nothing new under the sun.

5 comments:

  1. Quite a few NoSQL databases use JSON(open standard) which is actually very easy to work with.
    I can't really believe that NoSQL databases replicate data before it gets persisted locally. The replication is an asyn process and can be done once data is written locally. That's where the eventual consistency comes from I believe.

    ReplyDelete
  2. Oh yes, but asynchronous replication means that you don't wait for the physical write to send the data. Besides, I would find it awkward if the node where the change originated would be treated in a different way than nodes which are targets of replication.

    ReplyDelete
  3. There are different replication strategies thus in most cases the local node and replicas are treated differently. What's wrong with it? You might get inconsistent writes? Well, that's one of the tradeoffs and side effect of CAP theorem.

    ReplyDelete
  4. Well I never that it is bad or good :). I just wanted to point out that there is no magic, no groundbreaking technology in the clouds.

    I see that you reactivated your blog - nice :)

    ReplyDelete
  5. Took me time to read all the comments, but I really enjoyed the article. It proved to be Very helpful to me and I am sure to all the commenters here! Its always nice when you can not only be informed, but also entertained! Im sure you had fun writing this article.
    Valuable information and excellent design you got here! I would like to thank you for sharing your thoughts and time into the stuff you post!!
    http://sqlservermasters.com/

    ReplyDelete