Your vector similarity search is humming along at 50ms latency. Then at 2 AM, the node holding your product embeddings dies. Suddenly, your recommendation engine returns empty results, and customers see a blank "You might also like" section. Revenue drops by the minute.
In our previous post, we covered consistent hashing for distributing data across shards. But distribution alone doesn't protect you from node failures. That's where replication enters, and it's more nuanced than "just copy everything."
The Problem Replication Solves
A single node holding your vector index is a single point of failure. Hardware fails. Disks corrupt. Network partitions happen. The question isn't if your node will fail, but when.
Replication creates copies of your data across multiple nodes. When one dies, others continue serving requests. Simple concept, messy implementation.
The tradeoff you'll wrestle with: consistency vs. availability vs. latency. You can't maximize all three. Every replication strategy picks its poison.
Core Concept 1: Primary-Replica Architecture
What it is: One node (primary) accepts all writes. Other nodes (replicas) copy data from the primary and serve read requests.
The Problem It Solves: You need redundancy, but allowing writes everywhere creates conflicts. Who wins when two nodes update the same vector simultaneously? Primary-replica sidesteps this by funneling all writes through one authoritative source.
Analogy: Think of a production database migration gone wrong. The primary is your source-of-truth database; replicas are read-only copies you can query without hammering the primary during peak load.
Simple Example:
import time
from collections import deque
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class VectorRecord:
    id: str
    embedding: list[float]
    timestamp: float = field(default_factory=time.time)

class ReplicaNode:
    def __init__(self, node_id: str, is_primary: bool = False):
        self.node_id = node_id
        self.is_primary = is_primary
        self.vectors: dict[str, VectorRecord] = {}
        self.write_log: deque = deque(maxlen=10000)  # Replication log
        self.last_applied_offset: int = 0

    def write(self, record: VectorRecord) -> bool:
        """Only primary accepts writes directly."""
        if not self.is_primary:
            raise PermissionError(f"Node {self.node_id} is replica. Writes rejected.")
        self.vectors[record.id] = record
        self.write_log.append(record)
        print(f"[PRIMARY {self.node_id}] Wrote vector {record.id}")
        return True

    def read(self, vector_id: str) -> Optional[VectorRecord]:
        """Any node can serve reads."""
        return self.vectors.get(vector_id)

    def get_replication_log(self, from_offset: int) -> list[VectorRecord]:
        """Returns log entries for replicas to fetch."""
        return list(self.write_log)[from_offset:]

# Setup: 1 primary, 2 replicas
primary = ReplicaNode("node-0", is_primary=True)
replica_1 = ReplicaNode("node-1", is_primary=False)
replica_2 = ReplicaNode("node-2", is_primary=False)

# Write to primary
primary.write(VectorRecord(id="product-001", embedding=[0.1, 0.2, 0.3, 0.4]))
primary.write(VectorRecord(id="product-002", embedding=[0.5, 0.6, 0.7, 0.8]))

# Replica attempting write fails
try:
    replica_1.write(VectorRecord(id="product-003", embedding=[0.9, 0.8, 0.7, 0.6]))
except PermissionError as e:
    print(f"[ERROR] {e}")
Output:
[PRIMARY node-0] Wrote vector product-001
[PRIMARY node-0] Wrote vector product-002
[ERROR] Node node-1 is replica. Writes rejected.
Where It Breaks: All writes funnel through one node. If your write volume exceeds what a single node can handle, you're stuck. Sharding helps here, but then you need replication per shard, and complexity compounds fast.
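To see how the two compose, here is a minimal sketch of routing each write to a per-shard primary. Everything here is illustrative: the class name, node names, and the simple hash-mod shard assignment stand in for the consistent-hashing ring from the previous post.

```python
import hashlib

class ShardedPrimaryRouter:
    """Routes each write to the primary of the shard that owns the key.

    Sketch only: shard assignment is hash-mod, not a consistent-hashing
    ring, so adding a shard would reshuffle most keys.
    """

    def __init__(self, shard_primaries: list[str]):
        self.shard_primaries = shard_primaries  # one primary node per shard

    def primary_for(self, vector_id: str) -> str:
        digest = hashlib.md5(vector_id.encode()).hexdigest()
        shard = int(digest, 16) % len(self.shard_primaries)
        return self.shard_primaries[shard]

router = ShardedPrimaryRouter(["shard0-primary", "shard1-primary", "shard2-primary"])
print(router.primary_for("product-001"))  # deterministic per key
```

Each shard's primary then replicates to that shard's replicas exactly as above, so a 3-shard, 3-copy cluster is already 9 nodes to operate.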
Core Concept 2: Synchronous vs. Asynchronous Replication
What it is: Synchronous replication waits for replicas to confirm before acknowledging writes. Asynchronous confirms immediately and replicates in the background.
The Problem It Solves: You're choosing between data safety and speed. Lose a primary before async replication completes? Those writes are gone.
Analogy: Itโs like the difference between a database transaction that commits only after the backup confirms (sync), versus fire-and-forget logging that might lose the last few entries during a crash (async).
Synchronous Replication Example:
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError

class SyncReplicatedNode:
    def __init__(self, node_id: str):
        self.node_id = node_id
        self.vectors: dict[str, list[float]] = {}
        self.replicas: list['SyncReplicatedNode'] = []

    def add_replica(self, replica: 'SyncReplicatedNode'):
        self.replicas.append(replica)

    def write_sync(self, vector_id: str, embedding: list[float], timeout: float = 2.0) -> bool:
        """
        Synchronous write: blocks until ALL replicas confirm.
        Tradeoff: latency equals slowest replica + network round-trip.
        """
        # Write locally first
        self.vectors[vector_id] = embedding
        if not self.replicas:
            return True

        # Replicate to all replicas, wait for confirmation
        def replicate_to(replica: 'SyncReplicatedNode') -> bool:
            try:
                # Simulate network latency (real systems: gRPC/HTTP call)
                time.sleep(0.05)  # 50ms network latency
                replica.vectors[vector_id] = embedding
                return True
            except Exception:
                return False

        start = time.time()
        with ThreadPoolExecutor(max_workers=len(self.replicas)) as executor:
            futures = [executor.submit(replicate_to, r) for r in self.replicas]
            try:
                results = [f.result(timeout=timeout) for f in futures]
            except TimeoutError:
                print(f"[WARN] Replication timeout after {timeout}s")
                return False

        elapsed_ms = (time.time() - start) * 1000
        success = all(results)
        print(f"[SYNC] Wrote {vector_id} to {len(self.replicas)} replicas in {elapsed_ms:.1f}ms")
        return success

# Setup
primary = SyncReplicatedNode("primary")
replica_a = SyncReplicatedNode("replica-a")
replica_b = SyncReplicatedNode("replica-b")
primary.add_replica(replica_a)
primary.add_replica(replica_b)

# Synchronous write - blocks until both replicas confirm
primary.write_sync("vec-001", [0.1, 0.2, 0.3, 0.4])
primary.write_sync("vec-002", [0.5, 0.6, 0.7, 0.8])

print(f"\nPrimary vectors: {list(primary.vectors.keys())}")
print(f"Replica-A vectors: {list(replica_a.vectors.keys())}")
print(f"Replica-B vectors: {list(replica_b.vectors.keys())}")
Output:
[SYNC] Wrote vec-001 to 2 replicas in 52.3ms
[SYNC] Wrote vec-002 to 2 replicas in 51.8ms
Primary vectors: ['vec-001', 'vec-002']
Replica-A vectors: ['vec-001', 'vec-002']
Replica-B vectors: ['vec-001', 'vec-002']
Asynchronous Replication Example:
import threading
import queue
import time

class AsyncReplicatedNode:
    def __init__(self, node_id: str):
        self.node_id = node_id
        self.vectors: dict[str, list[float]] = {}
        self.replicas: list['AsyncReplicatedNode'] = []
        self.replication_queue: queue.Queue = queue.Queue()
        self._start_replication_worker()

    def _start_replication_worker(self):
        """Background thread that processes the replication queue."""
        def worker():
            while True:
                try:
                    vector_id, embedding = self.replication_queue.get(timeout=1.0)
                    for replica in self.replicas:
                        time.sleep(0.05)  # Simulate network latency
                        replica.vectors[vector_id] = embedding
                    print(f"[ASYNC WORKER] Replicated {vector_id} to {len(self.replicas)} replicas")
                    self.replication_queue.task_done()
                except queue.Empty:
                    continue

        thread = threading.Thread(target=worker, daemon=True)
        thread.start()

    def add_replica(self, replica: 'AsyncReplicatedNode'):
        self.replicas.append(replica)

    def write_async(self, vector_id: str, embedding: list[float]) -> bool:
        """
        Async write: returns immediately after local write.
        Replication happens in background. DANGER: data loss possible!
        """
        start = time.time()
        self.vectors[vector_id] = embedding
        self.replication_queue.put((vector_id, embedding))
        elapsed_ms = (time.time() - start) * 1000
        print(f"[ASYNC] Wrote {vector_id} locally in {elapsed_ms:.2f}ms (replication queued)")
        return True

# Setup
primary = AsyncReplicatedNode("primary")
replica_a = AsyncReplicatedNode("replica-a")
replica_b = AsyncReplicatedNode("replica-b")
primary.add_replica(replica_a)
primary.add_replica(replica_b)

# Async writes return immediately
primary.write_async("vec-001", [0.1, 0.2, 0.3, 0.4])
primary.write_async("vec-002", [0.5, 0.6, 0.7, 0.8])
primary.write_async("vec-003", [0.9, 0.8, 0.7, 0.6])

# Check immediately - replicas may not have data yet!
print("\n[IMMEDIATE CHECK]")
print(f"Primary vectors: {list(primary.vectors.keys())}")
print(f"Replica-A vectors: {list(replica_a.vectors.keys())}")  # Likely empty!

# Wait for replication to complete
time.sleep(0.5)
print("\n[AFTER 500ms]")
print(f"Replica-A vectors: {list(replica_a.vectors.keys())}")
print(f"Replica-B vectors: {list(replica_b.vectors.keys())}")
Output:
[ASYNC] Wrote vec-001 locally in 0.02ms (replication queued)
[ASYNC] Wrote vec-002 locally in 0.01ms (replication queued)
[ASYNC] Wrote vec-003 locally in 0.01ms (replication queued)
[IMMEDIATE CHECK]
Primary vectors: ['vec-001', 'vec-002', 'vec-003']
Replica-A vectors: []
[ASYNC WORKER] Replicated vec-001 to 2 replicas
[ASYNC WORKER] Replicated vec-002 to 2 replicas
[ASYNC WORKER] Replicated vec-003 to 2 replicas
[AFTER 500ms]
Replica-A vectors: ['vec-001', 'vec-002', 'vec-003']
Replica-B vectors: ['vec-001', 'vec-002', 'vec-003']
Where It Breaks:
- Sync: One slow replica tanks your entire write latency. A replica in another region with 200ms RTT means every write takes 200ms minimum. We've seen teams add a third replica for redundancy, only to watch p99 latency triple because that replica was in a different availability zone.
- Async: Read-after-write inconsistency. User updates their profile embedding, immediately searches, gets stale results because the replica hasn't caught up. They refresh, and it works fine. Intermittent bugs like this are maddening to debug.
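A common client-side mitigation for the async case is read-your-writes: remember the version of your last write and fall back to the primary when a replica is behind. A minimal sketch, with plain dicts standing in for real nodes (the class and record shape are illustrative, not from the code above):

```python
class VersionAwareClient:
    """Reads from a replica only if it has caught up to this client's
    last write. Nodes are plain dicts: vector_id -> (embedding, version)."""

    def __init__(self, primary: dict, replica: dict):
        self.primary = primary
        self.replica = replica
        self.last_written_version = 0

    def write(self, vector_id: str, embedding: list[float], version: int):
        self.primary[vector_id] = (embedding, version)
        self.last_written_version = version
        # (async replication to self.replica happens elsewhere)

    def read(self, vector_id: str):
        entry = self.replica.get(vector_id)
        if entry and entry[1] >= self.last_written_version:
            return entry  # replica has caught up; cheap local read
        return self.primary.get(vector_id)  # fall back to primary

primary, replica = {}, {}
client = VersionAwareClient(primary, replica)
client.write("vec-1", [0.1, 0.2], version=1)
print(client.read("vec-1"))  # ([0.1, 0.2], 1) -- replica empty, served from primary
```

The cost is extra read load on the primary right after writes, which is usually a fair trade for killing the "refresh and it works" bug class.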
Core Concept 3: Quorum-Based Replication
What it is: Instead of waiting for all replicas (sync) or none (async), wait for a majority. With 3 replicas, wait for 2 confirmations.
The Problem It Solves: Pure sync is too slow. Pure async loses data. Quorum finds the middle ground: tolerate some replica failures without blocking writes.
The Math: For N replicas with quorum size Q:
- Write quorum (W): Number of replicas that must confirm a write
- Read quorum (R): Number of replicas to query for reads
- Consistency guarantee: If W + R > N, you're guaranteed to read the latest write
import time
import random
from concurrent.futures import ThreadPoolExecutor, as_completed
from dataclasses import dataclass

@dataclass
class QuorumConfig:
    total_replicas: int
    write_quorum: int
    read_quorum: int

    def __post_init__(self):
        # Validate quorum overlap for consistency
        if self.write_quorum + self.read_quorum <= self.total_replicas:
            print(f"[WARN] W({self.write_quorum}) + R({self.read_quorum}) <= N({self.total_replicas})")
            print("       Reads may return stale data!")

class QuorumReplicatedStore:
    def __init__(self, config: QuorumConfig):
        self.config = config
        self.replicas: list[dict[str, tuple[list[float], int]]] = [
            {} for _ in range(config.total_replicas)
        ]
        self.version_counter = 0

    def _simulate_replica_latency(self, replica_idx: int) -> float:
        """Simulate variable network latency per replica."""
        base_latency = 0.02  # 20ms base
        jitter = random.uniform(0, 0.08)  # 0-80ms jitter
        # Replica 2 is in a far region - always slow
        if replica_idx == 2:
            base_latency = 0.15  # 150ms for distant replica
        return base_latency + jitter

    def write(self, vector_id: str, embedding: list[float]) -> bool:
        """
        Quorum write: succeeds when write_quorum replicas confirm.
        """
        self.version_counter += 1
        version = self.version_counter

        def write_to_replica(idx: int) -> tuple[int, bool, float]:
            latency = self._simulate_replica_latency(idx)
            time.sleep(latency)
            self.replicas[idx][vector_id] = (embedding, version)
            return (idx, True, latency * 1000)

        start = time.time()
        successful_writes = 0
        executor = ThreadPoolExecutor(max_workers=self.config.total_replicas)
        futures = [executor.submit(write_to_replica, i)
                   for i in range(self.config.total_replicas)]
        for future in as_completed(futures):
            idx, success, latency_ms = future.result()
            if success:
                successful_writes += 1
                print(f"  Replica {idx} confirmed in {latency_ms:.1f}ms")
            # Quorum reached! Don't wait for slow replicas
            if successful_writes >= self.config.write_quorum:
                elapsed_ms = (time.time() - start) * 1000
                print(f"[QUORUM] Write succeeded with {successful_writes}/{self.config.total_replicas} "
                      f"replicas in {elapsed_ms:.1f}ms")
                # Let stragglers finish in the background instead of blocking
                # (a `with` block here would wait for them on exit)
                executor.shutdown(wait=False)
                return True
        executor.shutdown(wait=True)
        return False

    def read(self, vector_id: str) -> tuple[list[float], int] | None:
        """
        Quorum read: query read_quorum replicas, return highest version.
        """
        def read_from_replica(idx: int):
            latency = self._simulate_replica_latency(idx)
            time.sleep(latency)
            return (idx, self.replicas[idx].get(vector_id))

        results = []
        executor = ThreadPoolExecutor(max_workers=self.config.total_replicas)
        futures = [executor.submit(read_from_replica, i)
                   for i in range(self.config.total_replicas)]
        for future in as_completed(futures):
            idx, data = future.result()
            if data:
                results.append(data)
            if len(results) >= self.config.read_quorum:
                executor.shutdown(wait=False)
                # Return highest version
                return max(results, key=lambda x: x[1])
        executor.shutdown(wait=True)
        return None

# N=3, W=2, R=2 -> Consistent (2+2 > 3)
config = QuorumConfig(total_replicas=3, write_quorum=2, read_quorum=2)
store = QuorumReplicatedStore(config)

print("Writing vector-001...")
store.write("vector-001", [0.1, 0.2, 0.3, 0.4])
print("\nWriting vector-002...")
store.write("vector-002", [0.5, 0.6, 0.7, 0.8])
Output:
Writing vector-001...
Replica 0 confirmed in 45.2ms
Replica 1 confirmed in 67.8ms
[QUORUM] Write succeeded with 2/3 replicas in 68.1ms
Writing vector-002...
Replica 1 confirmed in 38.4ms
Replica 0 confirmed in 52.1ms
[QUORUM] Write succeeded with 2/3 replicas in 52.4ms
Notice: Replica 2 (the slow one at 150ms+) never blocked the write. Quorum lets fast replicas drive performance while slow ones catch up eventually.
Where It Breaks: Quorum assumes replicas are roughly equal. If you have 5 replicas but 3 are in one datacenter and 2 in another, a datacenter outage might leave you unable to reach quorum, even though the surviving nodes are technically "up."
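One way to catch this before it bites is to check the placement up front: for every single-datacenter outage, do the survivors still reach the write quorum? A small sketch (the placement format is made up for illustration):

```python
def quorum_survives_dc_loss(placement: dict[str, int], write_quorum: int) -> bool:
    """placement maps datacenter name -> replica count.

    Returns True only if losing ANY single datacenter still leaves
    enough replicas to reach the write quorum.
    """
    total = sum(placement.values())
    for dc, count in placement.items():
        if total - count < write_quorum:
            return False  # losing this datacenter blocks all writes
    return True

# 5 replicas, W=3: 3 in dc-east, 2 in dc-west -> losing dc-east kills quorum
print(quorum_survives_dc_loss({"dc-east": 3, "dc-west": 2}, write_quorum=3))  # False
# Spread 2/2/1 across three datacenters instead -> any single loss survives
print(quorum_survives_dc_loss({"dc-a": 2, "dc-b": 2, "dc-c": 1}, write_quorum=3))  # True
```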
Core Concept 4: Failover and Leader Election
What it is: When the primary dies, replicas must elect a new leader to accept writes.
The Problem It Solves: Without automatic failover, a primary failure means manual intervention at 3 AM. With failover, the system self-heals in seconds.
import time
import random
import threading
from enum import Enum
from typing import Optional

class NodeState(Enum):
    LEADER = "leader"
    FOLLOWER = "follower"
    CANDIDATE = "candidate"
    DEAD = "dead"

class FailoverNode:
    def __init__(self, node_id: str, peers: Optional[list['FailoverNode']] = None):
        self.node_id = node_id
        self.state = NodeState.FOLLOWER
        self.peers = peers or []
        self.current_term = 0
        self.voted_for: Optional[str] = None
        self.last_heartbeat = time.time()
        self.election_timeout = random.uniform(0.15, 0.3)  # 150-300ms
        self._lock = threading.Lock()

    def receive_heartbeat(self, from_leader: str, term: int):
        """Follower receives heartbeat from leader."""
        with self._lock:
            if term >= self.current_term:
                self.current_term = term
                self.state = NodeState.FOLLOWER
                self.last_heartbeat = time.time()
                self.voted_for = None

    def request_vote(self, candidate_id: str, term: int) -> bool:
        """Candidate requests vote from this node."""
        with self._lock:
            if term > self.current_term:
                self.current_term = term
                self.voted_for = None
            if self.voted_for is None or self.voted_for == candidate_id:
                self.voted_for = candidate_id
                return True
            return False

    def start_election(self):
        """Initiate leader election."""
        with self._lock:
            self.state = NodeState.CANDIDATE
            self.current_term += 1
            self.voted_for = self.node_id
            term = self.current_term
        print(f"[{self.node_id}] Starting election for term {term}")

        votes = 1  # Vote for self
        for peer in self.peers:
            if peer.state != NodeState.DEAD:
                if peer.request_vote(self.node_id, term):
                    votes += 1
                    print(f"[{self.node_id}] Received vote from {peer.node_id}")

        # Need majority
        if votes > (len(self.peers) + 1) / 2:
            with self._lock:
                self.state = NodeState.LEADER
            print(f"[{self.node_id}] Elected leader with {votes} votes!")
            return True
        print(f"[{self.node_id}] Election failed. Got {votes} votes, needed {(len(self.peers) + 1) // 2 + 1}")
        return False

    def check_leader_timeout(self) -> bool:
        """Check if leader heartbeat timed out."""
        if self.state == NodeState.LEADER or self.state == NodeState.DEAD:
            return False
        return (time.time() - self.last_heartbeat) > self.election_timeout

    def kill(self):
        """Simulate node failure."""
        self.state = NodeState.DEAD
        print(f"[{self.node_id}] NODE DIED")

# Setup cluster
node_a = FailoverNode("node-A")
node_b = FailoverNode("node-B")
node_c = FailoverNode("node-C")

# Connect peers
node_a.peers = [node_b, node_c]
node_b.peers = [node_a, node_c]
node_c.peers = [node_a, node_b]

# Initial election - node_a becomes leader
print("=== Initial Election ===")
node_a.state = NodeState.LEADER
node_a.current_term = 1
print("[node-A] Started as leader (term 1)")

# Simulate heartbeats
for peer in node_a.peers:
    peer.receive_heartbeat("node-A", 1)

print("\nCluster state:")
for node in [node_a, node_b, node_c]:
    print(f"  {node.node_id}: {node.state.value}")

# Kill the leader!
print("\n=== Leader Failure ===")
node_a.kill()

# Wait past the maximum election timeout (0.3s) so the timeout
# check below is guaranteed to fire
time.sleep(0.35)
print("\n=== Follower Detects Timeout ===")

# node_b detects leader is dead and starts election
if node_b.check_leader_timeout():
    print("[node-B] Leader timeout detected!")
    node_b.start_election()

print("\nNew cluster state:")
for node in [node_a, node_b, node_c]:
    print(f"  {node.node_id}: {node.state.value}")
Output:
=== Initial Election ===
[node-A] Started as leader (term 1)
Cluster state:
node-A: leader
node-B: follower
node-C: follower
=== Leader Failure ===
[node-A] NODE DIED
=== Follower Detects Timeout ===
[node-B] Leader timeout detected!
[node-B] Starting election for term 2
[node-B] Received vote from node-C
[node-B] Elected leader with 2 votes!
New cluster state:
node-A: dead
node-B: leader
node-C: follower
Where It Breaks: Split-brain. If network partitions the cluster into two groups, each group might elect its own leader. Now you have two primaries accepting conflicting writes. This is why production systems use odd numbers of replicas and carefully designed network topologies.
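The defense against split-brain is the majority requirement itself: a partition can only elect a leader with a strict majority of the original cluster, so at most one side can ever win. A quick sketch of the arithmetic:

```python
def can_elect_leader(partition_size: int, cluster_size: int) -> bool:
    """A partition can elect a leader only with a strict majority of the
    ORIGINAL cluster -- votes from unreachable nodes don't count."""
    return partition_size > cluster_size / 2

# 4-node cluster split 2/2: neither side has a majority -> no leader at all
print(can_elect_leader(2, 4))  # False (for both halves)
# 5-node cluster split 3/2: exactly one side can elect -> no split-brain
print(can_elect_leader(3, 5), can_elect_leader(2, 5))  # True False
```

This is the arithmetic behind the odd-replica advice: an even cluster can split into two tied halves where neither side elects anyone, while an odd cluster always leaves one side with a majority.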
Practical Walkthrough: Building a Replicated Vector Store
Let's combine everything into a minimal but functional replicated vector store:
import time
import threading
from typing import Optional
from dataclasses import dataclass
from concurrent.futures import ThreadPoolExecutor, as_completed

@dataclass
class Vector:
    id: str
    embedding: list[float]
    version: int

class ReplicatedVectorStore:
    """
    A simple replicated vector store demonstrating:
    - Primary-replica architecture
    - Configurable sync/async replication
    - Basic failover support
    """
    def __init__(self, node_id: str, is_primary: bool = False):
        self.node_id = node_id
        self.is_primary = is_primary
        self.vectors: dict[str, Vector] = {}
        self.replicas: list['ReplicatedVectorStore'] = []
        self.version = 0
        self._lock = threading.Lock()
        # Replication settings
        self.replication_mode = "sync"  # "sync" or "async"

    def add_replica(self, replica: 'ReplicatedVectorStore'):
        self.replicas.append(replica)

    def put(self, vector_id: str, embedding: list[float]) -> dict:
        """Store a vector with replication."""
        if not self.is_primary:
            return {"error": "Write rejected: not primary", "redirect_to": "primary"}

        with self._lock:
            self.version += 1
            vec = Vector(id=vector_id, embedding=embedding, version=self.version)
            self.vectors[vector_id] = vec

        # Replicate based on mode
        start = time.time()
        if self.replication_mode == "sync":
            success_count = self._replicate_sync(vec)
            elapsed = (time.time() - start) * 1000
            return {
                "status": "ok",
                "version": vec.version,
                "replicated_to": success_count,
                "latency_ms": round(elapsed, 2),
            }
        else:
            self._replicate_async(vec)
            elapsed = (time.time() - start) * 1000
            return {
                "status": "ok",
                "version": vec.version,
                "replication": "pending",
                "latency_ms": round(elapsed, 2),
            }

    def _replicate_sync(self, vec: Vector) -> int:
        """Synchronous replication - wait for all replicas."""
        if not self.replicas:
            return 0  # a freshly promoted primary may have no replicas yet

        def send_to_replica(replica):
            time.sleep(0.03)  # Simulate 30ms network
            with replica._lock:
                replica.vectors[vec.id] = vec
            return True

        success = 0
        with ThreadPoolExecutor(max_workers=len(self.replicas)) as executor:
            futures = {executor.submit(send_to_replica, r): r for r in self.replicas}
            for future in as_completed(futures):
                if future.result():
                    success += 1
        return success

    def _replicate_async(self, vec: Vector):
        """Async replication - fire and forget."""
        def background_replicate():
            for replica in self.replicas:
                time.sleep(0.03)
                with replica._lock:
                    replica.vectors[vec.id] = vec

        thread = threading.Thread(target=background_replicate, daemon=True)
        thread.start()

    def get(self, vector_id: str) -> Optional[Vector]:
        """Retrieve a vector."""
        return self.vectors.get(vector_id)

    def search(self, query_embedding: list[float], top_k: int = 5) -> list[tuple[str, float]]:
        """
        Simple brute-force similarity search.
        In production, you'd use HNSW or IVF indexes.
        """
        def cosine_sim(a: list[float], b: list[float]) -> float:
            dot = sum(x * y for x, y in zip(a, b))
            norm_a = sum(x * x for x in a) ** 0.5
            norm_b = sum(x * x for x in b) ** 0.5
            return dot / (norm_a * norm_b + 1e-10)

        scores = []
        for vec in self.vectors.values():
            sim = cosine_sim(query_embedding, vec.embedding)
            scores.append((vec.id, sim))
        scores.sort(key=lambda x: x[1], reverse=True)
        return scores[:top_k]

    def promote_to_primary(self):
        """Emergency promotion during failover."""
        self.is_primary = True
        print(f"[{self.node_id}] PROMOTED TO PRIMARY")

# Demo: Setup and test
print("=== Setting up replicated cluster ===")
primary = ReplicatedVectorStore("primary", is_primary=True)
replica1 = ReplicatedVectorStore("replica-1")
replica2 = ReplicatedVectorStore("replica-2")
primary.add_replica(replica1)
primary.add_replica(replica2)

# Test sync replication
print("\n=== Synchronous Writes ===")
primary.replication_mode = "sync"
result = primary.put("doc-001", [0.1, 0.2, 0.3, 0.4])
print(f"Write result: {result}")
result = primary.put("doc-002", [0.5, 0.6, 0.7, 0.8])
print(f"Write result: {result}")

# Verify replication
print(f"\nPrimary has: {list(primary.vectors.keys())}")
print(f"Replica-1 has: {list(replica1.vectors.keys())}")
print(f"Replica-2 has: {list(replica2.vectors.keys())}")

# Test async replication
print("\n=== Async Writes ===")
primary.replication_mode = "async"
result = primary.put("doc-003", [0.9, 0.8, 0.7, 0.6])
print(f"Write result: {result}")

# Check immediately - async may not be done
print(f"Replica-1 has (immediate): {list(replica1.vectors.keys())}")
time.sleep(0.1)
print(f"Replica-1 has (after 100ms): {list(replica1.vectors.keys())}")

# Test search (any node can serve reads)
print("\n=== Search on Replica ===")
query = [0.15, 0.25, 0.35, 0.45]
results = replica1.search(query, top_k=3)
print(f"Top results: {results}")

# Simulate failover
print("\n=== Failover Scenario ===")
print("Primary dies! Promoting replica-1...")
primary.is_primary = False  # Simulated death
replica1.promote_to_primary()

# New primary accepts writes
result = replica1.put("doc-004", [0.2, 0.3, 0.4, 0.5])
print(f"New primary write: {result}")
Advanced Tips & Production Reality
Replication lag monitoring is non-negotiable. We've seen systems where async replicas fell hours behind during traffic spikes. Reads were serving data from yesterday. Add metrics: replication_lag_seconds, replication_queue_depth.
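The lag metric itself can be simple: compare the primary's newest write timestamp with the newest write the replica has applied. A sketch (the function name mirrors the metric above; where those timestamps come from is assumed plumbing):

```python
import time

def replication_lag_seconds(primary_last_write_ts: float,
                            replica_last_applied_ts: float) -> float:
    """Seconds of writes the replica has not yet applied.

    Clamped at zero: a replica can't be 'ahead' of the primary, so a
    negative value just means clock skew between the two timestamps.
    """
    return max(0.0, primary_last_write_ts - replica_last_applied_ts)

now = time.time()
lag = replication_lag_seconds(now, now - 4.2)
print(f"replication_lag_seconds: {lag:.1f}")  # ~4.2 -- alert if this trends up
```

replication_queue_depth is even simpler in the async design above: it's just the Queue's qsize() sampled periodically.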
Geographic replication multiplies complexity. Cross-region replication means 100-300ms per round-trip. Sync replication becomes unusable. Most teams use async with conflict resolution, but conflict resolution for vector embeddings is... interesting. Which embedding "wins" when the same document is re-embedded differently in two regions?
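The usual fallback is last-write-wins on a timestamp, which at least keeps regions convergent even though it can't say which embedding is semantically newer. A minimal sketch (the record shape is illustrative):

```python
def resolve_conflict(record_a: tuple[list[float], float],
                     record_b: tuple[list[float], float]) -> list[float]:
    """Last-write-wins: each record is (embedding, wall-clock timestamp).

    Convergent but crude -- with cross-region clock skew, the "winner"
    may not actually be the later re-embedding.
    """
    emb_a, ts_a = record_a
    emb_b, ts_b = record_b
    return emb_a if ts_a >= ts_b else emb_b

us_east = ([0.1, 0.2], 1700000000.0)  # document embedded in us-east
eu_west = ([0.3, 0.4], 1700000005.0)  # same document, re-embedded 5s later
print(resolve_conflict(us_east, eu_west))  # [0.3, 0.4] -- later write wins
```

Both regions applying this rule to the same pair of records pick the same winner, which is the whole point: convergence, not correctness.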
The learned-the-hard-way moment: We once had a 3-node cluster with quorum writes (W=2). Deployed a bad config that made one replica reject all writes. The system still "worked" because 2 nodes hit quorum. A week later, that node came back online with stale data, got elected leader during a restart, and served week-old embeddings until someone noticed search quality tanked.
Actionable Takeaways
- Start with async replication unless you have strict consistency requirements; sync kills latency
- Use quorum (W=2, R=2, N=3) as your default; it balances consistency and availability
- Monitor replication lag obsessively; stale reads are silent failures
- Test failover regularly; don't discover it's broken at 3 AM
- Use an odd number of replicas so leader elections can't deadlock on a tied vote
- Geographic replication requires async; accept eventual consistency or accept latency
| Replication Mode | Latency | Consistency | Data Safety |
|---|---|---|---|
| Synchronous | High (slowest replica) | Strong | High |
| Asynchronous | Low (local write only) | Eventual | Risk of loss |
| Quorum (W=2, N=3) | Medium (2nd fastest) | Tunable | Good |
Conclusion
Replication isn't just "copy data to more nodes." It's a series of tradeoffs between consistency, availability, and latency. Synchronous gives you safety but kills performance. Asynchronous gives you speed but risks data loss. Quorum finds middle ground but adds complexity.
The right choice depends on your pain tolerance. Can you afford stale reads? Can you afford slow writes? Can you afford 3 AM pages when failover breaks?
In the next post, we'll tackle distributed indexing: how to make similarity search fast across a replicated, sharded cluster. That's where things get genuinely hard.