Collaborative and Offline Editing Using CRDTs

Introduction

In today's fast-paced, highly connected world, seamless collaboration and offline access have become essential features of productivity tools. Decipad, a notebook-like application tailored for quantitative modeling and analysis, uses Conflict-Free Replicated Data Types (CRDTs) under the hood to provide both. This article describes how Decipad uses CRDTs to enable offline editing and seamless collaboration.

The Challenge: Offline Editing and Collaboration

Traditional collaboration tools often rely on constant online connectivity to synchronize changes made by different users. However, this model poses challenges when users need to work offline or face unstable internet connections. To address this, Decipad sought to create a solution that allowed users to edit their notebooks offline and collaborate with others without the fear of conflicts arising from simultaneous edits.

Introducing CRDTs

Conflict-Free Replicated Data Types (CRDTs) emerged as a breakthrough in distributed systems to tackle the challenge of collaborative editing in a decentralized environment. CRDTs enable multiple users to edit a shared document concurrently, with the assurance that conflicts will not arise during synchronization.

CRDTs are data structures designed to be replicated across multiple nodes in a distributed system while maintaining strong eventual consistency: any two replicas that have received the same set of updates are in the same state. Operation-based (op-based) CRDTs achieve this by capturing each change as an individual operation and replicating it to every other replica. These operations can be inserts, deletes, updates, or any other transformation that the data structure supports.

The key features of operation-based CRDTs include:

  1. Commutativity: Concurrent operations must commute, meaning that replicas that apply them in different orders still end up in the same state. This is crucial for ensuring consistency across replicas, because the network gives no guarantee about the order in which concurrent operations arrive.
  2. Exactly-Once or Idempotent Application: Each operation must take effect exactly once on each replica. Either the delivery layer guarantees exactly-once delivery, or each operation carries a unique identifier so that a replica can detect and ignore duplicates. (Associativity and idempotence of a merge function are requirements of state-based CRDTs, which exchange whole states rather than individual operations.)
  3. Causality Preservation: Operations must be applied in causal order. An operation that depends on an earlier one, such as deleting a character that another operation inserted, is applied only after the operation it depends on. Operations that are not causally related may be applied in any order.
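
To make these properties concrete, here is a minimal sketch of an op-based counter, one of the simplest CRDTs. It is not taken from Decipad's code; the `Op` and `Counter` names are hypothetical and chosen for illustration. Addition commutes, and each operation carries a unique `(replica, seq)` identifier so that duplicates can be ignored.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Op:
    """One replicated operation: a unique id plus a signed increment."""
    replica: str
    seq: int
    delta: int

    @property
    def op_id(self):
        return (self.replica, self.seq)


@dataclass
class Counter:
    """Op-based counter replica (hypothetical illustration)."""
    name: str
    value: int = 0
    seen: set = field(default_factory=set)
    _seq: int = 0

    def local(self, delta):
        """Apply a local change and return the op to broadcast."""
        self._seq += 1
        op = Op(self.name, self._seq, delta)
        self.apply(op)
        return op

    def apply(self, op):
        """Apply an op once; a repeated delivery of the same op is ignored."""
        if op.op_id in self.seen:
            return
        self.seen.add(op.op_id)
        self.value += op.delta  # addition commutes, so arrival order is irrelevant


a, b = Counter("a"), Counter("b")
ops = [a.local(+5), b.local(-2), a.local(+1)]

# Deliver every op to both replicas: duplicated for a, reversed for b.
for replica, order in ((a, ops + ops), (b, list(reversed(ops)))):
    for op in order:
        replica.apply(op)

print(a.value, b.value)  # 4 4
```

A counter has no causal dependencies between its operations, so the third property does not show up here; it matters for structures such as text, where a delete must follow the insert it refers to.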

Text CRDTs and Replicated Growable Array (RGA)

A text CRDT is a specific type of CRDT designed for maintaining concurrent text editing across distributed systems. The Replicated Growable Array (RGA) is a prominent text CRDT that achieves this by combining the concepts of operation-based CRDTs with a data structure suitable for text manipulation.

RGA - How It Works

  1. Identifier Assignment: In RGA, each character in the text is assigned a unique identifier made of a logical timestamp (a counter) and the unique identifier of the node that inserted it.
  2. Insertion Operation: When a user inserts a character, the operation records the new character's identifier together with the identifier of the character immediately to its left. When several replicas concurrently insert after the same character, the identifiers are compared so that every replica places the new characters in the same order (in RGA, the insert with the larger timestamp comes first). This is what makes concurrent insertions commute.
  3. Deletion Operation: Deletions are handled by marking a character as a "tombstone" rather than physically removing it. The tombstone keeps the character's identifier available as an anchor for concurrent insertions that refer to it, and it makes concurrent deletions of the same character harmless.
  4. Convergence: Replicas exchange operations and apply them in causal order. Because the identifiers define a single total order over the characters, all replicas that have received the same operations converge to the same text.
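
The steps above can be sketched in a few dozen lines. This is a simplified, list-based illustration of the RGA idea, not a production implementation and not Decipad's code; the `RGA` and `Elem` classes and the tuple layout of the operations are assumptions made for the example. It also assumes operations are delivered in causal order, so the character an operation refers to is always already present.

```python
from dataclasses import dataclass


@dataclass
class Elem:
    id: tuple              # (counter, replica): unique and totally ordered
    char: str
    deleted: bool = False  # tombstone flag


class RGA:
    """Simplified RGA replica; assumes causal delivery of operations."""

    def __init__(self, replica):
        self.replica = replica
        self.counter = 0   # Lamport-style logical clock
        self.elems = []    # all elements in document order, tombstones included

    def _index_of(self, elem_id):
        for i, e in enumerate(self.elems):
            if e.id == elem_id:
                return i
        raise KeyError(elem_id)

    def _visible(self):
        return [e for e in self.elems if not e.deleted]

    def local_insert(self, pos, char):
        """Insert char at visible index pos and return the op to broadcast."""
        visible = self._visible()
        left = visible[pos - 1].id if pos > 0 else None
        self.counter += 1
        op = ("ins", (self.counter, self.replica), left, char)
        self.apply(op)
        return op

    def local_delete(self, pos):
        """Tombstone the character at visible index pos and return the op."""
        op = ("del", self._visible()[pos].id)
        self.apply(op)
        return op

    def apply(self, op):
        if op[0] == "del":
            self.elems[self._index_of(op[1])].deleted = True
            return
        _, new_id, left, char = op
        if any(e.id == new_id for e in self.elems):
            return  # duplicate delivery
        self.counter = max(self.counter, new_id[0])
        i = 0 if left is None else self._index_of(left) + 1
        # Skip elements with a larger id: concurrent inserts that win the
        # tie-break, together with anything inserted after them.
        while i < len(self.elems) and self.elems[i].id > new_id:
            i += 1
        self.elems.insert(i, Elem(new_id, char))

    def text(self):
        return "".join(e.char for e in self._visible())


a, b = RGA("a"), RGA("b")
for op in [a.local_insert(0, "H"), a.local_insert(1, "i")]:
    b.apply(op)                   # both replicas now read "Hi"

op_a = a.local_insert(2, "!")     # a reads "Hi!"
op_b = b.local_insert(2, "?")     # b reads "Hi?", concurrently
op_d = b.local_delete(0)          # b reads "i?"

a.apply(op_b)
a.apply(op_d)
b.apply(op_a)
print(a.text(), b.text())         # i?! i?!
```

Both concurrent inserts share the same logical timestamp, so the replica identifier breaks the tie, and both replicas place "?" before "!". Real implementations avoid the linear scans used here by indexing elements by identifier.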

Benefits of RGA and Text CRDTs

  1. Offline Editing: Text CRDTs like RGA allow users to edit text even when offline. Edits are captured as operations and can be applied when reconnected.
  2. Concurrency Handling: RGA inherently supports concurrent edits, enabling collaborative editing in real-time across distributed systems.
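  3. Fault Tolerance: Replicas do not depend on a central coordinator, so a node can fail or go offline without blocking the others. Op-based CRDTs do require that every operation eventually reaches every replica, so lost messages must be retransmitted; once they arrive, replicas converge correctly even after long delays.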
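
In practice, much of this robustness comes from the delivery layer that sits in front of the CRDT. The following sketch shows one way such a layer could tolerate reordered and duplicated messages: it buffers an operation until everything it depends on has been delivered. The `CausalInbox` class, the string operation ids, and the explicit dependency sets are hypothetical and chosen for illustration.

```python
class CausalInbox:
    """Buffers incoming ops until their causal dependencies are delivered."""

    def __init__(self, deliver):
        self.deliver = deliver   # callback that applies a payload to the CRDT
        self.delivered = set()   # ids of ops already handed to the CRDT
        self.pending = []        # (op_id, deps, payload) waiting for deps

    def receive(self, op_id, deps, payload):
        already_queued = any(p[0] == op_id for p in self.pending)
        if op_id in self.delivered or already_queued:
            return  # duplicate message (for example, a retransmission)
        self.pending.append((op_id, frozenset(deps), payload))
        self._drain()

    def _drain(self):
        # Keep sweeping the buffer until no further op becomes deliverable.
        progress = True
        while progress:
            progress = False
            for item in list(self.pending):
                op_id, deps, payload = item
                if deps <= self.delivered:
                    self.pending.remove(item)
                    self.delivered.add(op_id)
                    self.deliver(payload)
                    progress = True


log = []
inbox = CausalInbox(log.append)
inbox.receive("op2", {"op1"}, "insert 'b' after 'a'")  # arrives early: buffered
inbox.receive("op3", {"op2"}, "delete 'b'")            # buffered as well
inbox.receive("op1", set(), "insert 'a'")              # unblocks op2, then op3
inbox.receive("op1", set(), "insert 'a'")              # duplicate: ignored
print(log)
```

The sketch does not handle messages that are lost for good. As long as the sender keeps retransmitting operations that were not acknowledged, the buffer eventually drains and the replica catches up.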

Decipad's Approach with CRDTs

Decipad chose the op-based CRDT approach due to its suitability for offline editing scenarios. Here's how Decipad leverages CRDTs to enable offline and collaborative editing:

  1. Data Structure: Decipad represents the document as a distributed DOM-like data structure using an op-based CRDT that composes arrays, maps and text elements. For the text elements, each character or element in the document is assigned a unique identifier to track changes.
  2. Offline Editing: When a user edits the document offline, Decipad records their changes as local operations and persists them in the browser's IndexedDB database. These local changes are applied to the document's CRDT representation while maintaining causality.
  3. Collaborative Editing: When multiple users edit the same document, Decipad ensures that each user's changes are merged correctly upon synchronization. The CRDT's inherent properties guarantee that concurrent edits made by different users do not conflict.
  4. Conflict Resolution: A CRDT always converges, but the merged result does not always match what each user intended. In the rare cases where concurrent changes cannot be reconciled satisfactorily by the CRDT algorithm alone, Decipad employs additional conflict resolution strategies designed to preserve the user's intent and minimize data loss.
  5. Synchronization: When a user reconnects to the internet, Decipad synchronizes their local changes with the remote copy of the document. The CRDT algorithm ensures that all changes are integrated consistently, irrespective of the order in which concurrent changes arrive.
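
The offline and synchronization steps can be pictured as a durable "outbox" of operations. The sketch below is an illustration only, not Decipad's implementation: it is written in Python and uses an in-memory SQLite database as a stand-in for the browser's IndexedDB, and the `Outbox` class, the `send` callback and the shape of the operations are all hypothetical.

```python
import json
import sqlite3


class Outbox:
    """Durable queue of local operations awaiting upload (hypothetical)."""

    def __init__(self, path=":memory:"):
        # SQLite stands in for IndexedDB in this sketch.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS pending "
            "(seq INTEGER PRIMARY KEY AUTOINCREMENT, op TEXT NOT NULL)"
        )

    def record(self, op):
        """Persist an op that was already applied to the local CRDT."""
        with self.db:  # commits on success
            self.db.execute(
                "INSERT INTO pending (op) VALUES (?)", (json.dumps(op),)
            )

    def flush(self, send):
        """Send pending ops in order; delete each only after send() succeeds."""
        rows = self.db.execute(
            "SELECT seq, op FROM pending ORDER BY seq"
        ).fetchall()
        sent = 0
        for seq, raw in rows:
            if not send(json.loads(raw)):
                break  # still offline: keep the rest for the next attempt
            with self.db:
                self.db.execute("DELETE FROM pending WHERE seq = ?", (seq,))
            sent += 1
        return sent


outbox = Outbox()
outbox.record({"id": ["a", 1], "type": "insert", "char": "x"})
outbox.record({"id": ["a", 2], "type": "insert", "char": "y"})

server = []


def offline(op):
    return False  # simulates a failed upload


def online(op):
    server.append(op)
    return True


first_attempt = outbox.flush(offline)   # nothing is sent, nothing is lost
second_attempt = outbox.flush(online)   # both ops are uploaded in order
print(first_attempt, second_attempt, len(server))  # 0 2 2
```

Deleting an operation only after it has been sent means a crash between the two steps can cause the same operation to be sent twice, which is one reason operations carry unique identifiers that let the receiver ignore duplicates.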

Benefits of Decipad's Approach

  1. Offline Accessibility: Users can work on their documents even without an active internet connection, eliminating productivity roadblocks due to connectivity issues.
  2. Conflict-Free Collaboration: Because CRDT merges are automatic and deterministic, users can work together on the same document without worrying about merge conflicts.
  3. Real-Time Updates: As changes are synchronized, users can see edits from collaborators in near real-time, fostering a sense of active collaboration even when physically apart.
  4. Intuitive Experience: Decipad maintains the familiar interface of a traditional notebook application, making the transition to the new offline and collaborative model effortless for users.

Conclusion

Decipad's use of Conflict-Free Replicated Data Types (CRDTs) has ushered in a new era of productivity and collaboration. By allowing offline editing and seamless, conflict-free collaboration, Decipad empowers users to work more efficiently, whether they're in an environment with spotty internet connectivity or working closely with distributed teams.
