Replication Algorithm
Selective replication stores content efficiently and redundantly across decentralized Webhash Storage Nodes. The algorithm distributes content according to storage availability, node reliability, geographic distribution, and randomness, which prevents any node from monopolizing storage and keeps participation fair.
Content Upload & Metadata Registration
A user uploads content to Webhash Storage Nodes via an API.
Content is hashed into a CID (Content Identifier).
A smart contract registers the CID and initiates Selective Replication.
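The upload steps above can be sketched as follows. This is a minimal illustration, not the Webhash implementation: `raw_cid_v1` builds a CIDv1 for a single raw block hashed with SHA-256, whereas the CID that IPFS reports for a real file depends on chunking and codec settings. `CidRegistry` is a hypothetical in-memory stand-in for the smart contract.

```python
import base64
import hashlib


def raw_cid_v1(content: bytes) -> str:
    """Build a CIDv1 string for content treated as one raw block."""
    digest = hashlib.sha256(content).digest()
    # Multihash: 0x12 = sha2-256 code, 0x20 = digest length (32 bytes).
    multihash = bytes([0x12, 0x20]) + digest
    # CIDv1 prefix: 0x01 = version 1, 0x55 = raw codec.
    cid_bytes = bytes([0x01, 0x55]) + multihash
    # Multibase base32 (lowercase, unpadded) is marked by a leading "b".
    encoded = base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")
    return "b" + encoded


class CidRegistry:
    """Hypothetical in-memory stand-in for the on-chain CID registry."""

    def __init__(self) -> None:
        self._uploader_by_cid: dict[str, str] = {}

    def register(self, cid: str, uploader: str) -> bool:
        """Record a CID once; return False if it is already registered."""
        if cid in self._uploader_by_cid:
            return False
        self._uploader_by_cid[cid] = uploader
        return True


registry = CidRegistry()
cid = raw_cid_v1(b"hello webhash")
registry.register(cid, uploader="user-1")
```

Because the CID is derived only from the content, uploading the same bytes twice yields the same identifier, and the second `register` call returns `False`.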
Initial Node Selection (Storage Discovery)
The algorithm identifies potential storage nodes using three selection filters:
1. Storage Availability Filter
Queries the on-chain registry for active nodes.
Keeps only nodes whose free storage meets a minimum threshold (e.g., 10 GB available per node).
2. Reputation & Uptime Filter
Nodes must have at least 98% uptime (tracked on-chain).
Nodes with high response times or frequent downtime are deprioritized.
3. Fair Distribution Using VRF
A VRF (Verifiable Random Function) selects nodes randomly from the filtered set.
Random selection prevents storage from concentrating on a small group of nodes.
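The three filters can be sketched as below. The `Node` fields and function names are hypothetical, and the random step is only a stand-in for a VRF: it ranks nodes by a SHA-256 hash of a shared seed and the node ID, which is reproducible by anyone who knows the seed but carries no cryptographic proof the way a real VRF output does. Deprioritizing slow nodes is left out to keep the sketch short.

```python
import hashlib
from dataclasses import dataclass

MIN_FREE_GB = 10.0      # example threshold from the text
MIN_UPTIME_PCT = 98.0   # uptime requirement from the text


@dataclass(frozen=True)
class Node:
    node_id: str
    free_gb: float
    uptime_pct: float


def eligible(nodes: list[Node]) -> list[Node]:
    """Apply the storage availability and uptime filters."""
    return [
        n for n in nodes
        if n.free_gb >= MIN_FREE_GB and n.uptime_pct >= MIN_UPTIME_PCT
    ]


def select_nodes(nodes: list[Node], seed: bytes, count: int) -> list[Node]:
    """Pick `count` eligible nodes in a seed-dependent pseudo-random order.

    Assumption: hashing seed + node ID stands in for a VRF output.
    """
    def score(node: Node) -> bytes:
        return hashlib.sha256(seed + node.node_id.encode("utf-8")).digest()

    return sorted(eligible(nodes), key=score)[:count]


nodes = [
    Node("node-a", free_gb=50, uptime_pct=99.5),
    Node("node-b", free_gb=5, uptime_pct=99.9),    # too little free storage
    Node("node-c", free_gb=20, uptime_pct=97.0),   # uptime below 98%
    Node("node-d", free_gb=12, uptime_pct=98.0),
]
chosen = select_nodes(nodes, seed=b"example-seed", count=2)
```

Changing the seed changes the ranking, so no node can count on being selected for every CID.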
Geo-Distributed Replication
Nodes selected for storage are categorized based on geolocation (continent, country, region).
The CID is replicated across 10-20% of total active nodes.
A Redundancy Factor (RF) requires each CID to be stored on nodes in at least 3 continents.
Content distribution is load-balanced across the globe.
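As a worked example of the rules above: with 200 active nodes, replicating to 10-20% of them means between 20 and 40 replicas per CID. The sketch below computes that target and spreads replicas across continents in round-robin order. The function names and the `(node_id, continent)` input shape are assumptions made for illustration, and the candidates are assumed to arrive already filtered and randomly ordered.

```python
from collections import defaultdict
from itertools import zip_longest

MIN_CONTINENTS = 3  # redundancy requirement from the text


def replica_target(active_nodes: int, percent: int = 10) -> int:
    """Number of replicas for a CID, given a replication percentage."""
    if not 10 <= percent <= 20:
        raise ValueError("percent must be between 10 and 20")
    target = (active_nodes * percent + 99) // 100  # integer ceiling
    return max(MIN_CONTINENTS, target)


def place_replicas(candidates: list[tuple[str, str]], count: int) -> list[str]:
    """Choose `count` node IDs, cycling through continents in turn."""
    if count < MIN_CONTINENTS:
        raise ValueError("count is too small to cover 3 continents")
    by_continent: dict[str, list[str]] = defaultdict(list)
    for node_id, continent in candidates:
        by_continent[continent].append(node_id)
    if len(by_continent) < MIN_CONTINENTS:
        raise ValueError("candidates span fewer than 3 continents")
    # Each round takes one node per continent, so the first round alone
    # already covers every continent that has a candidate.
    rounds = zip_longest(*by_continent.values())
    ordered = [n for rnd in rounds for n in rnd if n is not None]
    return ordered[:count]


candidates = [
    ("n1", "EU"), ("n2", "EU"), ("n3", "AS"),
    ("n4", "NA"), ("n5", "EU"), ("n6", "AS"),
]
replicas = place_replicas(candidates, count=4)
```

Round-robin placement is one simple way to satisfy the continent requirement. A production system would presumably also weigh country, region, and current load.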
On-Chain Storage Verification
Each node pins the content and generates a Merkle proof.
A zk-SNARK proof is submitted on-chain to verify storage without exposing data.
The CID is linked to storage nodes in a smart contract.
Only nodes confirming pinning are eligible for rewards.
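To illustrate the Merkle proof step, the sketch below builds a Merkle tree over content chunks and verifies that one chunk belongs to it. It assumes SHA-256 and pairs the last node with itself on odd-sized levels, which is one common convention and not necessarily the one Webhash uses. The zk-SNARK layer and the on-chain submission are not modeled.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_levels(chunks: list[bytes]) -> list[list[bytes]]:
    """Return every tree level, from leaf hashes up to the root."""
    if not chunks:
        raise ValueError("at least one chunk is required")
    level = [_h(c) for c in chunks]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]  # odd level: pair last node with itself
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def merkle_root(chunks: list[bytes]) -> bytes:
    return merkle_levels(chunks)[-1][0]


def merkle_proof(chunks: list[bytes], index: int) -> list[tuple[bytes, bool]]:
    """Sibling hashes from leaf to root as (hash, sibling_is_right) pairs."""
    proof = []
    for level in merkle_levels(chunks)[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index  # node was paired with itself
        proof.append((level[sibling], index % 2 == 0))
        index //= 2
    return proof


def verify(chunk: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    """Recompute the root from a chunk and its proof."""
    node = _h(chunk)
    for sibling, sibling_is_right in proof:
        node = _h(node + sibling) if sibling_is_right else _h(sibling + node)
    return node == root


chunks = [b"chunk-0", b"chunk-1", b"chunk-2"]
root = merkle_root(chunks)
ok = verify(chunks[2], merkle_proof(chunks, 2), root)
```

A verifier that knows only the root can check a single chunk with a proof whose length grows logarithmically with the number of chunks, which keeps verification cheap.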
Dynamic Content Rebalancing
Every 24 hours, nodes must submit storage proofs to confirm they still store the content.
If a node fails verification, the CID is migrated to a new node using Selective Replication.
Nodes that frequently drop content incur penalty deductions from their reputation score.
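The rebalancing cycle can be sketched as a single pass over the nodes holding a CID. All names here are hypothetical, the penalty size is made up, and the sketch applies a penalty on every missed proof, a simplification of the "frequently drop content" rule. `pick_replacement` stands in for a fresh run of Selective Replication.

```python
from typing import Callable

PROOF_INTERVAL_SECONDS = 24 * 60 * 60  # proofs are due every 24 hours
REPUTATION_PENALTY = 5                 # hypothetical penalty size


def rebalance(
    holders: list[str],
    last_proof_at: dict[str, float],
    now: float,
    reputation: dict[str, int],
    pick_replacement: Callable[[set[str]], str],
) -> list[str]:
    """Return the updated holder list for one CID.

    Nodes without a proof in the last 24 hours are dropped and penalized,
    and each one is replaced by a node chosen by `pick_replacement`.
    """
    kept: list[str] = []
    for node_id in holders:
        last = last_proof_at.get(node_id)
        if last is not None and now - last <= PROOF_INTERVAL_SECONDS:
            kept.append(node_id)
        else:
            reputation[node_id] = reputation.get(node_id, 0) - REPUTATION_PENALTY
    for _ in range(len(holders) - len(kept)):
        # Exclude current and failed holders so a failed node is not re-picked.
        kept.append(pick_replacement(set(holders) | set(kept)))
    return kept


reputation = {"n1": 100, "n2": 100}
spare_nodes = ["n3", "n4"]
new_holders = rebalance(
    holders=["n1", "n2"],
    last_proof_at={"n1": 50_000.0, "n2": 0.0},  # n2's proof is older than 24 h
    now=100_000.0,
    reputation=reputation,
    pick_replacement=lambda exclude: next(
        n for n in spare_nodes if n not in exclude
    ),
)
```

In this run `n2` misses its proof window, so it loses 5 reputation points and `n3` takes over its replica, while `n1` is unaffected.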
Technologies & Methods Used for Selective Replication
| Technology | Function in Selective Replication |
| --- | --- |
| IPFS | Decentralized storage, content addressing via CIDs |
| Ethereum Smart Contracts | On-chain content registry & storage verification |
| VRF (Verifiable Random Function) | Random but fair node selection |
| DHT (Distributed Hash Table) | Content location and retrieval |
| Libp2p PubSub | Real-time node communication |
| zk-SNARKs | Privacy-preserving storage proofs |
| Geolocation APIs | Geo-distributed content placement |
| Smart Contract-based Reputation System | Ensures high-availability nodes are prioritized |
| Automated Node Penalty & Migration | Ensures redundant storage |
Benefits of Selective Replication in Webhash
No Single Point of Failure – Content remains available even if some nodes go offline.
Efficient Storage Utilization – Nodes store only the content assigned to them, which conserves storage resources.
Faster Content Retrieval – Geo-distributed nodes reduce latency.
Incentivized Participation – Nodes are rewarded for reliable content storage.
Resistant to Censorship & Downtime – Even if some nodes are removed, content remains accessible.