In its verification of Dgraph versions 1.0.2 through 1.0.6, Jepsen found 23 issues, including multiple deadlocks and crashes in the cluster, duplicate upserted records, snapshot isolation violations, records with missing fields, and, in some cases, the loss of all but one inserted record.
Dgraph is an open source, fast, distributed graph database which uses Raft for per-shard replication and a custom transactional protocol, based on Omid, Reloaded, for snapshot-isolated cross-shard transactions.
Dgraph has a custom transaction system to provide transactional isolation across different Raft groups. Storage nodes, called Alpha, are controlled by a supervisory system, called Zero. Zero nodes form a single Raft cluster, which organizes Alpha nodes into shards called groups. Each group runs an independent Raft cluster.
Jepsen is a framework to analyze distributed systems under stress and verify that the safety properties of a distributed system hold up, given concurrency, non-determinism, and partial failure. It is an effort to improve the safety of distributed databases, queues, consensus systems, and more.
To verify Dgraph's safety properties, a suite of Jepsen tests was designed around a five-node cluster with a replication factor of three. Alpha nodes were organized into two groups: one with three replicas and one with two. Every node ran an instance of both Zero and Alpha.
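The group layout described above follows from simple arithmetic: five Alpha nodes with a replication factor of three yield one full group and one partial group. The sketch below illustrates that arithmetic only. The function `assign_groups`, the node names `n1` to `n5`, and the fill-in-order policy are assumptions made for illustration; they are not taken from Dgraph's code or from the Jepsen report.

```python
from typing import Dict, List


def assign_groups(nodes: List[str], replicas: int) -> Dict[int, List[str]]:
    """Place nodes into numbered groups, filling each group up to
    `replicas` members before opening the next one.

    Hypothetical helper: the fill-in-order policy is an assumption,
    not a description of how Zero actually schedules Alpha nodes.
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    groups: Dict[int, List[str]] = {}
    for index, node in enumerate(nodes):
        group_id = index // replicas + 1  # groups are numbered from 1
        groups.setdefault(group_id, []).append(node)
    return groups


# Hypothetical node names for a five-node test cluster.
nodes = ["n1", "n2", "n3", "n4", "n5"]
layout = assign_groups(nodes, replicas=3)

for group_id, members in layout.items():
    print(f"group {group_id}: {len(members)} replicas -> {members}")
```

Under these assumptions the first group holds three replicas and the second holds two, matching the test cluster described in the article. Each such group would then run its own Raft cluster.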
Many operations were tested, out of which some are listed here:
Here are some of the issues found by the test:
The identified safety issues were mostly associated with process crashes, restarts, and predicate migration. Of the 23 issues, four remain unresolved, including the corruption of data in healthy clusters.
This analysis was funded by Dgraph, and Jepsen has published the full report on its official website.