I came across Kyle Kingsbury’s talk on network partitions while doing some unrelated research into monitoring systems. But first,
What are Network Partitions?
Say you have a globally distributed game: one datacenter in the US, another in the EU. To consolidate the user base, we have a single accounts database. A simplistic approach would be to throw the inventory system into this database too, if we wanted to deal with in-game purchases.
So, one day a ship drops an anchor and cuts several of the underwater fibers that make landfall on one side of the Atlantic. The internet starts to route around the faulty link, perhaps sending some packets across Asia instead as congestion hits the few remaining Atlantic lines. All of this takes only minutes, but a few minutes are enough to effectively split our EU cluster from the US cluster.
Why is it Important (or Why This Keeps Me Awake at Night)?
What happens to people buying game items during this time? What happens to people who pay for credits? I’m spending way too much time thinking about this – but what’s the alternative? You either catch it here, or create an auditing and logging system that customer support can use to manually resolve conflicts. The industry tends to employ the latter – but since I won’t have a CS department of any note, an ounce of prevention goes a long way.
Now, without further ado, here’s the talk.
Building a resilient distributed system is hard. Building one that avoids data loss and just works is even harder. I’ll end it here for now. Step one is knowing there is a problem. I’ll go into more detail later on solutions to this problem for specific cases.
Background Theory and Knowledge
There’s talk about CAP / AP, but the talk assumes you already know it. For those without a computer science background, it stands for Consistency, Availability, and Partition tolerance. The theorem basically says that a distributed system can guarantee two of the three, but never all of them at once – and since partitions can’t be prevented, when one happens you’re really choosing between Consistency and Availability.
Each side of a partition will always think it’s the whole world. One typical way to resolve that impasse is via a quorum: the partition holding a majority of all nodes is considered the live master. The minority partition tends to lose Availability for all the clients on its side, as it no longer accepts requests.
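The quorum rule above boils down to a one-line majority check. A minimal sketch (the function name and the 5-node example are my own, not from the talk):

```python
def has_quorum(partition_size: int, cluster_size: int) -> bool:
    """A partition may keep accepting writes only if it holds a
    strict majority of the full cluster's nodes."""
    return partition_size > cluster_size // 2

# A 5-node cluster split 3/2 by a partition:
print(has_quorum(3, 5))  # True  - majority side stays live
print(has_quorum(2, 5))  # False - minority side refuses requests
print(has_quorum(2, 4))  # False - an even split leaves NO live side
```

The strict inequality is the important part: with an even cluster size, a clean 2/2 split means neither side has a majority, which is why quorum-based systems are usually deployed with an odd number of nodes.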
Interspersed between all this is the idea of Consistency – the idea that if you read right after you write, you expect the data to be there. Because the speed of light is finite and network latency between regions will always be nonzero, eventual consistency tends to be the norm instead of hard consistency. Writing to a record in the US, we only expect it to show up in the EU after some indeterminate time (say 1-2 mins?). Enforcing hard consistency would require a lot of gossip/transactional back-and-forth between each node, which makes the system brittle in the face of network partitions.
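The read-after-write gap is easy to demonstrate with a toy model. This is a sketch of my own (the `Replica` class and the US/EU names are illustrative, not any real database’s API): local writes apply immediately, while replication to the peer sits in a queue until some later sync.

```python
from collections import deque

class Replica:
    """A toy replica: writes apply locally at once, but are shipped
    to the peer asynchronously via an explicit queue."""
    def __init__(self):
        self.data = {}
        self.inbox = deque()

    def write(self, key, value, peer):
        self.data[key] = value            # local write is immediate
        peer.inbox.append((key, value))   # replication is deferred

    def sync(self):
        """Apply all replication messages that have arrived so far."""
        while self.inbox:
            key, value = self.inbox.popleft()
            self.data[key] = value

us, eu = Replica(), Replica()
us.write("gold", 100, peer=eu)
print(eu.data.get("gold"))  # None - EU hasn't applied the write yet
eu.sync()                   # replication catches up "eventually"
print(eu.data.get("gold"))  # 100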
PostgreSQL: One of the standard relational databases. It’s not really distributed, in the sense that all data is located on a single node. The talk makes a good point that, by virtue of the client being on a separate node, it still suffers from the CAP theorem.
Redis: Similar to a database (but not quite) in that it’s designed to externally store data used by an application. The access semantics are different from your traditional database. Its main claim to fame is being a convenient way for multiple processes to share data with one another. It tends to be used a lot in monitoring systems as temporary storage and a buffer for high-velocity data.
MongoDB: Another popular NoSQL database. It doesn’t have a set schema (e.g. columns in a SQL database), and it comes with some built-in mechanisms on the client side (write concerns) to specify exactly how tolerant of CAP you want each data write to be.
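The idea behind a tunable write concern can be sketched without any driver at all. This is not MongoDB’s actual API – just an illustrative model of the knob it exposes: the client demands acknowledgments from `w` replicas before declaring a write durable.

```python
def write_succeeds(ack_flags, w):
    """ack_flags: which replicas acknowledged the write.
    w: how many acks the client demands before calling the write done.
    Raising w trades latency/availability for durability."""
    return sum(ack_flags) >= w

# Primary and one secondary acked; a third replica is lagging:
acks = [True, True, False]
print(write_succeeds(acks, w=1))  # True  - ack from any one node is enough
print(write_succeeds(acks, w=2))  # True  - a majority of 3 responded
print(write_succeeds(acks, w=3))  # False - waiting on every node fails
```

With `w=1` a partition barely slows you down but you can silently lose writes; with `w` set to all nodes, a single lagging or partitioned replica blocks every write. Majority is the usual compromise, which circles back to the quorum idea above.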
Riak: A different take on NoSQL, based on the Dynamo ring design. Unlike MongoDB, it’s designed to have no central point. Where in the CAP triangle the system lives is also tunable, although at a system-wide level. It also employs CRDTs (conflict-free replicated data types), which let the application resolve conflicting writes.
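The simplest CRDT is the grow-only counter, and it shows why conflict resolution can be automatic. A minimal sketch (this is the textbook G-counter, not Riak’s actual implementation): each node increments only its own slot, and merging takes the per-node maximum, so concurrent updates from both sides of a partition converge without any coordination.

```python
class GCounter:
    """Grow-only counter CRDT: per-node counts, merge by element-wise max."""
    def __init__(self):
        self.counts = {}

    def increment(self, node, amount=1):
        self.counts[node] = self.counts.get(node, 0) + amount

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        """Merging is commutative, associative, and idempotent,
        so replicas can exchange state in any order, any number of times."""
        for node, n in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), n)

# Both sides of a partition count purchases independently...
us, eu = GCounter(), GCounter()
us.increment("us-1", 3)
eu.increment("eu-1", 2)
# ...then merge once the link heals; neither side's updates are lost.
us.merge(eu)
print(us.value())  # 5
```

This only works because increments never conflict with each other – for richer data (like the inventory from earlier), Riak pairs CRDTs with sibling values so the application can decide what a “merge” means.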