How to implement a Sync Engine for the Web
I implemented a sync engine for the web: reactive local storage on the client, Web Workers for background sync, a socket connection, a CRDT, and server storage. This is how it all works.

Sandro Maglione
Users want apps to be fast, privacy-friendly, multi-device, and to work offline. Developers want to reduce complexity while still providing a delightful user experience.
A sync engine is the key that unlocks both requirements ⚡️
I implemented a sync engine for the web, end-to-end from client(s) to server. This article is an overview of how a sync engine for the web works and how you can implement one for yourself.
By the end of the article, you will understand why and how all the components of the diagram below work together to create a superior user and developer experience.
Store data on the client
Most of the complexity of client code is caused by network requests:
- Handling all possible errors (missing connection, encoding/decoding requests, error responses)
- Managing loading states
- Building HTTP requests with the correct token, headers, and parameters
- Handling asynchronous requests inside the UI
It's also where the user experience starts to degrade:
- Long waiting times
- Unclear error messages when something bad happens
- No offline support
All these issues disappear when the client writes and reads data locally:
- Fast (even synchronous)
- Privacy-friendly
- Offline by default
- Persistent
Let's start from this simple idea: the client (UI) always writes and reads locally.
Local-only: the best developer experience
When the data is stored locally, it's possible to implement an "observable" that automatically re-renders the UI when data changes.
A live query provides data to a component and re-renders it when the requested data changes.
This reduces the responsibility of the UI to mutating data, just like a simple useState in React.
Some local storage options currently available on the web are:
- IndexedDB (Dexie.js with liveQuery)
- Postgres (PGlite's live queries)
- SQLite (using WASM and something like wa-sqlite)
With live queries there is no need for any store-based state management library (Redux, Jotai, Zustand).
You also don't need TanStack Query, since all the data is stored locally.
import { useLiveQuery } from "dexie-react-hooks";
import { db } from "./db";

export function FriendList() {
  // Automatically re-renders when data changes ⚡️
  const friends = useLiveQuery(() =>
    db.friends.where("age").between(50, 75).toArray()
  );

  return (
    <>
      <h2>Friends</h2>
      <ul>
        {friends?.map((friend) => (
          <li key={friend.id}>
            {friend.name}, {friend.age}
          </li>
        ))}
      </ul>
    </>
  );
}
Syncing data between clients
Storing data locally has many advantages, but one fundamental drawback: the data is trapped inside the user's device.
- No long-term persistence
- No collaboration with other users
- No way to share data between multiple devices
The core requirement therefore becomes:
How to keep all the advantages of local data storage, while also allowing collaboration and multi-device support?
This is where you introduce a Sync Engine:
A sync engine synchronizes the data between clients while allowing each client to read and write locally.
The aim is to keep all the advantages of client-only, while also sharing data between clients:
- Each client only cares about its own local data
- The sync engine makes sure the local data is in sync between all clients
From the perspective of the UI code, nothing changes between working client-only and working with a sync engine.
The UI keeps mutating data and using live queries for listening to local data changes: fast, persistent, offline.
A sync engine works "in the background" to update the local data to include changes from other clients.
Web Worker for syncing
Web Workers are ideal for keeping syncing independent of the UI:
A Web Worker allows running scripts in background threads on the web.
A syncing Web Worker performs two roles:
- Push: Listen for changes on the local storage (using live queries) and push those changes to the server
- Pull: Listen for updates from the server and commit those inside local storage
Pulling updates from the server is achieved by creating a web socket connection, so that the server can send changes as soon as they become available.
WebSocket makes it possible to open a two-way interactive communication session between the user's browser and a server.
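As an illustration of these two roles, here is a minimal sketch of a syncing Web Worker, assuming Dexie on the client and a hypothetical wss://example.com/sync endpoint (the table names and message shapes are placeholders, not part of any specific library):

// sync.worker.ts — runs in a background thread, separate from the UI
import { liveQuery } from "dexie";
import { db } from "./db";

// Two-way connection with the server
const socket = new WebSocket("wss://example.com/sync");

// Push: observe local changes (live query) and send them to the server
liveQuery(() => db.table("pendingChanges").toArray()).subscribe((changes) => {
  if (socket.readyState === WebSocket.OPEN && changes.length > 0) {
    socket.send(JSON.stringify({ type: "push", changes }));
  }
});

// Pull: commit updates coming from the server into local storage
socket.addEventListener("message", async (event) => {
  const update = JSON.parse(event.data);
  await db.table("snapshots").put(update);
});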
Syncing updates on the server
The server is responsible for merging changes from multiple independent clients such that the final state is consistent between all clients (eventual consistency).
This objective is achieved using a CRDT (Conflict-free Replicated Data Type).
A CRDT has the following properties:
- Clients can update their local data independently, concurrently and without coordinating with other clients
- An algorithm (part of the data type) automatically resolves any inconsistencies
- Although local states may differ at any particular point in time, they are guaranteed to eventually converge
Libraries like Loro or Yjs implement CRDTs in TypeScript. Both allow exporting a CRDT in a binary format (Uint8Array) that can be sent over the network for syncing.
import { LoroDoc } from "loro-crdt";

/** Client 1️⃣ */
const doc1 = new LoroDoc();
doc1.getText("text").insert(0, "Hello world!");
const bytes: Uint8Array = doc1.export({ mode: "update" });

/**
 * 👆
 * Send `bytes` to another device (`Uint8Array`)
 * 👇
 */

/** Client 2️⃣ */
const doc2 = new LoroDoc();
doc2.getText("text").insert(0, "Hi!");

// CRDT algorithm to merge the changes
doc2.import(bytes);
I suggest using a library for resolving changes between clients. Implementing your own CRDT, or using another strategy like event sourcing, is where I found the most complexity.
In my implementation I used Loro. The rest of the article is based on the Loro API (LoroDoc).
Handle multiple formats
LoroDoc is the central data structure we use when applying mutations (insert/update/delete).
For persistence, a LoroDoc can be converted to a Uint8Array (by calling export). Since a Uint8Array is serializable (as number[]), it can be stored in IndexedDB and also sent to the server.
On the UI we provide plain JSON values. This can be done by converting a LoroDoc to JSON using toJSON().
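To make these conversions concrete, here is a rough sketch (the container name and values are made up for illustration):

import { LoroDoc } from "loro-crdt";

// Mutate the central data structure
const doc = new LoroDoc();
doc.getMap("todos").set("todo-1", { title: "Write article", done: false });

// Persistence: LoroDoc -> Uint8Array -> number[] (serializable)
const bytes: Uint8Array = doc.export({ mode: "snapshot" });
const serializable: number[] = Array.from(bytes);
// `serializable` can be stored in IndexedDB or sent to the server

// UI: LoroDoc -> plain JSON
const json = doc.toJSON();
// e.g. { todos: { "todo-1": { title: "Write article", done: false } } }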
The server stores the Loro CRDT in bytes (number[]). When it receives updates from a client, it can merge them by converting number[] to Uint8Array, Uint8Array to LoroDoc, and then importing the new changes into the current value from storage.
By merging changes with LoroDoc's import, we use the CRDT algorithm internal to LoroDoc to make sure changes are consistent.
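A minimal sketch of that server-side merge, assuming the stored value is a number[] (the function and its arguments are hypothetical, only import/export are Loro API):

import { LoroDoc } from "loro-crdt";

export function mergeClientUpdate(
  stored: number[], // current CRDT read from remote storage
  incoming: number[] // update received from a client
): number[] {
  const doc = new LoroDoc();
  // Load the current state from storage (skip if nothing was stored yet)
  if (stored.length > 0) {
    doc.import(new Uint8Array(stored));
  }
  // Let the CRDT algorithm merge the client's changes
  doc.import(new Uint8Array(incoming));
  // Export the merged state and write it back to storage
  return Array.from(doc.export({ mode: "snapshot" }));
}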
Live updates on the server
The server needs a similar push/pull mechanism as the client:
- Pull: An API receives changes from a client, resolves the state using a CRDT (import), and writes it into storage (export)
- Push: The socket connection listens for storage changes and wires them to each connected client
Just like in the client, the server also needs a "live query" mechanism that listens for changes in storage, to then send them live to other clients through the socket connection.
Listening for live updates on a database like Postgres requires setting up Data Replication, which uses a Write-ahead Log (WAL):
WAL is a file that stores all changes made to the database, such as inserts, updates and deletes in sequence.
This requires creating a Publication (CREATE PUBLICATION), a Subscription (CREATE SUBSCRIPTION), and a Replication Slot.
A simpler solution would be storing everything inside a file and using the Node.js watch API. "Storage" can be anything as long as it supports "live" updates.
Stream changes to the client
As we saw previously, the socket connection interacts with a Web Worker on the client.
The Web Worker keeps running in the background, waiting for updates.
The Web Worker receives a Stream of updates from the server. When it receives a new update, it stores it in the client's local storage.
Since the changes come from a trusted server, the client can assume that the data is valid and therefore replace everything with the new value.
Offline mode
When a connection with the server cannot be established, the client can still update its local data.
We store local changes in a separate location inside storage to mark them as "waiting for sync" (e.g. another table inside IndexedDB).
Local changes are applied to the current client, but still unverified. The client can keep making changes even when offline.
When the device comes back online, all the local changes are synced to the server, which then responds with a validated snapshot.
At this point, the client can remove all local changes and instead rely on the data from the server.
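A rough sketch of this offline flow, assuming Dexie tables named snapshots and pendingChanges and a hypothetical /api/sync endpoint (none of these names come from a specific library):

import Dexie from "dexie";

// "snapshots" holds validated server data, "pendingChanges" holds local changes waiting for sync
const db = new Dexie("app");
db.version(1).stores({ snapshots: "id", pendingChanges: "++id" });

// Always write locally first, even when offline
export async function applyLocalChange(update: number[]) {
  await db.table("pendingChanges").add({ update, createdAt: Date.now() });
}

// Back online: push pending changes, then keep only the validated server snapshot
export async function syncPendingChanges() {
  const pending = await db.table("pendingChanges").toArray();
  const response = await fetch("/api/sync", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(pending),
  });
  const { id, snapshot } = await response.json(); // validated snapshot from the server
  await db.transaction("rw", db.table("snapshots"), db.table("pendingChanges"), async () => {
    await db.table("snapshots").put({ id, snapshot });
    await db.table("pendingChanges").clear();
  });
}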
Bootstrap
On initial load, the client can make a single API request to pull the latest changes from the server.
This initial loading process is called bootstrapping.
Bootstrapping is necessary to bring the client up to date with changes that happened since the last time the user opened the app.
Bootstrapping is only needed on the initial load. After that, the socket connection takes care of streaming live changes.
For bootstrapping, a single GET request to an HTTP endpoint is enough.
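A sketch of that request, assuming a hypothetical GET /api/bootstrap endpoint that returns the latest snapshot as number[]:

import { db } from "./db";

export async function bootstrap(workspaceId: string) {
  // Single HTTP request on initial load to catch up with the server
  const response = await fetch(`/api/bootstrap?workspaceId=${workspaceId}`);
  const { id, snapshot } = await response.json(); // snapshot: number[]

  // Store the validated snapshot locally; live queries re-render the UI from here
  await db.table("snapshots").put({ id, snapshot });
}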
Manual sync requests
The same bootstrap endpoint can be used to implement manual syncing requests (triggered by the user).
In my implementation, I create a new Web Worker on the initial load that performs the initial bootstrap.
This worker is separate from the Web Worker that handles the socket connection.
The user can then click on a button to trigger a manual sync. Clicking the button sends a message to the worker to perform another sync (HTTP request).
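A sketch of how this could be wired, assuming a hypothetical bootstrap.worker.ts file and message shape (Vite's module worker syntax):

// main thread: spawn the bootstrap worker (separate from the socket worker)
declare const workspaceId: string; // provided by the app (current workspace)

const bootstrapWorker = new Worker(
  new URL("./bootstrap.worker.ts", import.meta.url),
  { type: "module" }
);

// Initial bootstrap on load
bootstrapWorker.postMessage({ type: "sync", workspaceId });

// Manual sync: call this from the button's click handler
export const onSyncClick = () =>
  bootstrapWorker.postMessage({ type: "sync", workspaceId });

// bootstrap.worker.ts: perform another HTTP sync whenever a message arrives
import { bootstrap } from "./bootstrap"; // the function from the previous sketch

self.onmessage = async (event: MessageEvent<{ type: string; workspaceId: string }>) => {
  if (event.data.type === "sync") {
    await bootstrap(event.data.workspaceId); // same request as the initial bootstrap
  }
};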
These are all the components for a sync engine on the web. It all starts from a simple requirement:
The client (UI) always writes and reads locally.
The UI reads and writes with Local Storage, and doesn't care about other parts of the architecture.
A Web Worker acts in the background by creating a socket connection with the server. It sends and receives updates that are stored inside Local Storage.
On the server, state is resolved using a CRDT and stored inside Remote Storage.
Sync engine implementation
The details of the sync engine implementation depend on your tech stack and requirements. Some general guidelines:
- If you are using a frontend framework, it works best for it to be client-only. Frameworks like next are not ideal because they include both client and server, while local-first apps are generally client-only
- You can use any CRDT library you prefer, or even no CRDT at all (you may consider event sourcing instead)
- You are not required to use any specific local or remote storage option
That's why the article focuses on the architecture of a sync engine, and not any specific implementation. You can choose libraries and technologies that best fit your project.
Here are the technologies I used for my own implementation:
- TanStack Router (Client/frontend framework based on React)
- Loro (CRDT)
- IndexedDB with Dexie (Client storage)
- effect (Backend server and socket connection)
TanStack Router runs on Vite, which supports Web Workers.
Other requirements
- Migrations: by using Loro and storing data as bytes, in practice no data migration should be required. Loro will merge changes regardless of their schema
- Authentication: In my implementation I organized the app in workspaces. You can join any workspace as long as you get access to its unique UUID. Below is the schema of the table stored on the server
export class ServerTableMetadata extends Schema.Class<ServerTableMetadata>(
  "ServerTableMetadata"
)({
  serverId: Schema.UUID, // Unique id generated on the server
  clientId: Schema.UUID, // Used to deduplicate snapshots on the client
  workspaceId: Schema.UUID, // Auth with workspace unique id (generated by each client)
  ownerId: Schema.UUID, // Identify the user who made each change
  table: Table, // Different syncing for each table, to avoid huge payloads
  snapshot: Schema.Uint8Array, // CRDT encoded in bytes
}) {}
- Encryption: Encryption requires adding a layer inside the syncing web worker that encrypts the data before wiring it to the server. The same should happen when the server sends data back to the client (end-to-end encryption)
- Querying: Since data is stored as bytes, making actual SQL queries on the server is not possible. However, this is by design. The server is meant for syncing, not as an API. Querying should instead be performed locally on the client by converting bytes to JSON
Not all apps benefit from this architecture. Local-first works best for apps that deal with user private data, with no centralized security logic (e.g. not ideal for banks) and no shared global feeds (e.g. not ideal for social networks).