Replication Algorithm

Selective replication stores content efficiently and redundantly across decentralized Webhash Storage Nodes. The algorithm distributes each piece of content according to storage availability, node reliability, and geographic location, and adds a random selection step so that no small group of nodes monopolizes storage and every eligible node can participate fairly.

Content Upload & Metadata Registration

  1. A user uploads content to Webhash Storage Nodes through an API.

  2. Content is hashed into a CID (Content Identifier).

  3. A smart contract registers the CID and initiates Selective Replication.
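As an illustration of step 2, here is a minimal sketch of deriving a CID from raw bytes. It assumes a CIDv1 with the `raw` codec and a SHA-256 multihash, encoded in lowercase base32. The function name `compute_cid_v1_raw` is hypothetical, and the source does not say which CID version or codec Webhash uses. Large files are normally chunked into a DAG before hashing, which this sketch does not do.

```python
import base64
import hashlib


def compute_cid_v1_raw(content: bytes) -> str:
    """Return a CIDv1 string (raw codec, sha2-256, base32) for `content`.

    Assumption: single-block content; no chunking or DAG construction.
    """
    digest = hashlib.sha256(content).digest()
    # Multihash: 0x12 is the sha2-256 code, 0x20 is the digest length (32 bytes).
    multihash = bytes([0x12, 0x20]) + digest
    # CIDv1 binary form: version byte 0x01, codec byte 0x55 (raw), then the multihash.
    cid_bytes = bytes([0x01, 0x55]) + multihash
    # Multibase prefix "b" marks lowercase base32 without padding.
    encoded = base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")
    return "b" + encoded


if __name__ == "__main__":
    print(compute_cid_v1_raw(b"hello webhash"))
```

The same bytes always produce the same CID, which is what lets the smart contract in step 3 register content by its identifier alone.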

Initial Node Selection (Storage Discovery)

The algorithm identifies potential storage nodes using three selection filters:

1. Storage Availability Filter

  • Queries active nodes via the on-chain registry.

  • Keeps only nodes whose free storage meets a minimum threshold (e.g., 10 GB available per node).

2. Reputation & Uptime Filter

  • Nodes must have at least 98% uptime (tracked on-chain).

  • Nodes with high response times or frequent downtime are deprioritized.

3. Fair Distribution Using VRF

  • A VRF (Verifiable Random Function) selects nodes randomly from the filtered set.

  • Because the VRF output can be verified, no party can bias the selection, which prevents storage from concentrating on a few nodes.
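The three filters can be sketched as follows. This is an illustration under stated assumptions, not the Webhash implementation: the `Node` fields, the constants, and both function names are hypothetical, and a SHA-256 ranking over a seed stands in for the VRF. In a real deployment the seed would be a VRF output whose proof has already been verified. Deprioritizing by response time is omitted for brevity.

```python
import hashlib
from dataclasses import dataclass

MIN_FREE_GB = 10.0   # example threshold from the text
MIN_UPTIME = 0.98    # 98% uptime requirement


@dataclass(frozen=True)
class Node:
    node_id: str
    free_gb: float
    uptime: float  # fraction between 0.0 and 1.0


def eligible_nodes(nodes):
    """Apply the storage availability and uptime filters."""
    return [n for n in nodes if n.free_gb >= MIN_FREE_GB and n.uptime >= MIN_UPTIME]


def select_nodes(nodes, seed: bytes, count: int):
    """Pick `count` eligible nodes in an order derived from `seed`.

    Stand-in for a VRF: each node is ranked by SHA-256(seed || node_id),
    so anyone holding the seed can recompute and check the selection.
    """
    candidates = eligible_nodes(nodes)
    ranked = sorted(
        candidates,
        key=lambda n: hashlib.sha256(seed + n.node_id.encode("utf-8")).digest(),
    )
    return ranked[:count]


if __name__ == "__main__":
    registry = [
        Node("n1", 50, 0.99),
        Node("n2", 5, 0.999),   # rejected: too little free storage
        Node("n3", 20, 0.97),   # rejected: uptime below 98%
        Node("n4", 10, 0.98),
        Node("n5", 100, 1.0),
    ]
    print([n.node_id for n in select_nodes(registry, b"example-seed", 2)])
```

Ranking by a seeded hash keeps the selection reproducible, so a verifier can confirm that the chosen nodes follow from the seed rather than from the selector's preference.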

Geo-Distributed Replication

  1. Nodes selected for storage are categorized based on geolocation (continent, country, region).

  2. The CID is replicated across 10–20% of all active nodes.

  3. The Redundancy Factor (RF) requires each CID to be stored on at least three continents.

  4. Replicas are load-balanced across regions so that no single region holds a disproportionate share.
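The placement rules above can be sketched as a two-pass selection: first cover the minimum number of continents, then fill the remaining replica slots in ranked order. The function names, the `(node_id, continent)` tuple shape, and the floor of three replicas for very small networks are assumptions made for this example.

```python
def replica_count(total_active: int, percent: int = 10) -> int:
    """Number of replicas for a CID, given the 10-20% rule.

    Uses integer ceiling division to avoid floating-point rounding.
    Assumption: never fewer than 3 replicas, so 3 continents stay reachable.
    """
    if not 10 <= percent <= 20:
        raise ValueError("percent must be between 10 and 20")
    return max(3, -(-total_active * percent // 100))


def place_replicas(ranked_nodes, target: int, min_continents: int = 3):
    """Choose replicas from `ranked_nodes`, a list of (node_id, continent)
    tuples already ordered by the selection step."""
    target = max(target, min_continents)
    chosen, seen = [], set()
    # Pass 1: take the first node from each new continent until coverage is met.
    for node_id, continent in ranked_nodes:
        if continent not in seen and len(seen) < min_continents:
            seen.add(continent)
            chosen.append((node_id, continent))
    if len(seen) < min_continents:
        raise ValueError("not enough continents among candidate nodes")
    # Pass 2: fill the remaining slots in ranked order.
    for entry in ranked_nodes:
        if len(chosen) >= target:
            break
        if entry not in chosen:
            chosen.append(entry)
    return chosen


if __name__ == "__main__":
    ranked = [("a", "EU"), ("b", "EU"), ("c", "NA"), ("d", "EU"), ("e", "AS")]
    print(place_replicas(ranked, target=4))
```

Covering continents first means the redundancy requirement holds even when most of the highest-ranked nodes sit in one region.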

On-Chain Storage Verification

  1. Each node pins the content and generates a Merkle proof over it.

  2. A zk-SNARK proof is submitted on-chain to verify storage without exposing data.

  3. The CID is linked to storage nodes in a smart contract.

  4. Only nodes confirming pinning are eligible for rewards.
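To make step 1 concrete, the sketch below builds a Merkle tree over content chunks and verifies an inclusion proof for one chunk. It covers only the Merkle part; wrapping the proof in a zk-SNARK (step 2) needs a dedicated proving system and is not shown. The function names and the duplicate-last-node rule for odd levels are assumptions, and a production scheme would also add domain separation between leaf and internal hashes.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root_and_proof(chunks, index: int):
    """Return (root, proof) for chunks[index].

    `proof` is a list of (sibling_hash, sibling_is_left) pairs, leaf to root.
    """
    if not chunks:
        raise ValueError("at least one chunk is required")
    level = [_h(c) for c in chunks]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # simplification: duplicate the last node
        sibling = index ^ 1          # neighbor within the same pair
        proof.append((level[sibling], sibling < index))
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify_proof(chunk: bytes, proof, root: bytes) -> bool:
    """Recompute the path from `chunk` to the root and compare."""
    node = _h(chunk)
    for sibling, sibling_is_left in proof:
        node = _h(sibling + node) if sibling_is_left else _h(node + sibling)
    return node == root


if __name__ == "__main__":
    data = [b"chunk-0", b"chunk-1", b"chunk-2"]
    root, proof = merkle_root_and_proof(data, 1)
    print(verify_proof(data[1], proof, root))
```

A verifier that knows only the root can check that a node holds a given chunk, because the proof contains one hash per tree level rather than the content itself.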

Dynamic Content Rebalancing

  • Every 24 hours, nodes must submit storage proofs to confirm they still store the content.

  • If a node fails verification, the CID is migrated to a new node using Selective Replication.

  • Nodes that frequently drop content incur penalties to their reputation score.
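The rebalancing cycle can be sketched as a single pass over a CID's replicas. Everything named here is hypothetical: the `Replica` record, the `rebalance` function, the fixed penalty of 5 points, and the list of pre-selected spare nodes. For simplicity the sketch penalizes every missed proof, whereas the text applies penalties to nodes that drop content frequently.

```python
from dataclasses import dataclass

PROOF_INTERVAL_SECONDS = 24 * 60 * 60  # proofs are due every 24 hours
REPUTATION_PENALTY = 5                 # hypothetical penalty per missed proof


@dataclass
class Replica:
    node_id: str
    last_proof_at: float  # Unix timestamp of the last accepted storage proof


def rebalance(replicas, reputation, spare_nodes, now: float):
    """Return the replica list after one verification cycle.

    Nodes whose last proof is older than the interval are dropped and
    penalized in `reputation`; each is replaced by the next entry in
    `spare_nodes`, which is assumed to come from the selection step.
    """
    kept = []
    for replica in replicas:
        if now - replica.last_proof_at <= PROOF_INTERVAL_SECONDS:
            kept.append(replica)
            continue
        reputation[replica.node_id] = reputation.get(replica.node_id, 0) - REPUTATION_PENALTY
        if spare_nodes:
            # Assumption: the new node's initial pinning proof is accepted at `now`.
            kept.append(Replica(spare_nodes.pop(0), now))
    return kept


if __name__ == "__main__":
    now = 100_000.0
    current = [Replica("a", now - 3600), Replica("b", 0.0)]
    scores = {"a": 100, "b": 100}
    print([r.node_id for r in rebalance(current, scores, ["c"], now)], scores)
```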

Technologies & Methods Used for Selective Replication

| Technology | Function in Selective Replication |
| --- | --- |
| IPFS | Decentralized storage, content addressing via CIDs |
| Ethereum Smart Contracts | On-chain content registry & storage verification |
| VRF (Verifiable Random Function) | Random but fair node selection |
| DHT (Distributed Hash Table) | Content location and retrieval |
| Libp2p PubSub | Real-time node communication |
| zk-SNARKs | Privacy-preserving storage proofs |
| Geolocation APIs | Geo-distributed content placement |
| Smart Contract-based Reputation System | Ensures high-availability nodes are prioritized |
| Automated Node Penalty & Migration | Ensures redundant storage |

Benefits of Selective Replication in Webhash

  • No Single Point of Failure – Content remains available even if some nodes go offline.

  • Efficient Storage Utilization – Nodes only store necessary data, optimizing resources.

  • Faster Content Retrieval – Geo-distributed nodes reduce latency.

  • Incentivized Participation – Nodes are rewarded for reliable content storage.

  • Resistant to Censorship & Downtime – Even if some nodes are removed, content remains accessible.
