Decentralized Storage

Last edit: October 10, 2023

Unlike a centralized server operated by a single company or organization, decentralized storage systems consist of a peer-to-peer network of user-operators who each hold a portion of the overall data, creating a resilient file storage and sharing system. These systems can be part of a blockchain-based application or any peer-to-peer network.

Ethereum itself can be used as a decentralized storage system, and it already serves as one for the code of every smart contract. However, Ethereum was not designed to store large amounts of data. The chain is steadily growing, but at the time of writing, the Ethereum chain is around 500 GB to 1 TB (depending on the client), and every node on the network needs to be able to store all of the data. If the chain were to expand to hold large amounts of data (say 5 TB), it wouldn't be feasible for all nodes to continue to run. Also, deploying this much data to Mainnet would be prohibitively expensive due to gas fees.
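To make the gas cost concrete, here is a rough back-of-the-envelope sketch. It assumes that writing one new 32-byte storage word costs about 20,000 gas and that the gas price is 20 gwei; both figures are illustrative assumptions, not values from this page, and real costs vary with network conditions and with how the data is written.

```python
WORD_BYTES = 32            # the EVM stores data in 32-byte words
GAS_PER_NEW_WORD = 20_000  # assumed approximate gas to write one new storage word
GWEI_PER_ETH = 10**9


def storage_cost_eth(num_bytes: int, gas_price_gwei: float) -> float:
    """Estimate the ETH cost of writing num_bytes into contract storage."""
    words = -(-num_bytes // WORD_BYTES)  # ceiling division
    gas = words * GAS_PER_NEW_WORD
    return gas * gas_price_gwei / GWEI_PER_ETH


one_megabyte = 1024 * 1024
cost = storage_cost_eth(one_megabyte, gas_price_gwei=20)  # assumed gas price
print(f"Storing 1 MB costs roughly {cost:.2f} ETH under these assumptions")
```

Under these assumptions a single megabyte already costs on the order of 13 ETH, which is why bulk data is usually kept off-chain.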

Due to these constraints, we need a different chain or methodology to store large amounts of data in a decentralized way.

When looking at decentralized storage (dStorage) options, there are a few things a user must keep in mind.

  • Persistence mechanism / incentive structure
  • Data retention enforcement
  • Decentrality
  • Consensus

Persistence mechanism / incentive structure

Blockchain-based

For a piece of data to persist forever, we need to use a persistence mechanism. For example, on Ethereum, the persistence mechanism is that the whole chain needs to be accounted for when running a node. New pieces of data get tacked onto the end of the chain, and it continues to grow - requiring every node to replicate all the embedded data.
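The replication cost described above can be illustrated with a toy hash-linked chain. This is a minimal sketch, not how Ethereum actually encodes blocks: `ToyChain` and `block_hash` are hypothetical names, and each "node" is just an in-memory object holding a full copy of the data.

```python
import hashlib

GENESIS = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(prev_hash: str, data: bytes) -> str:
    """Hash a block's data together with the previous block's hash."""
    return hashlib.sha256(prev_hash.encode() + data).hexdigest()


class ToyChain:
    """One node's full copy of an append-only, hash-linked chain."""

    def __init__(self) -> None:
        self.blocks = []  # list of (prev_hash, data, hash) tuples

    def append(self, data: bytes) -> str:
        prev = self.blocks[-1][2] if self.blocks else GENESIS
        digest = block_hash(prev, data)
        self.blocks.append((prev, data, digest))
        return digest

    def size_bytes(self) -> int:
        return sum(len(data) for _, data, _ in self.blocks)

    def verify(self) -> bool:
        """Re-check every link; this needs every block's data to be present."""
        prev = GENESIS
        for stored_prev, data, digest in self.blocks:
            if stored_prev != prev or block_hash(prev, data) != digest:
                return False
            prev = digest
        return True


# Three nodes each replicate every piece of data added to the chain.
nodes = [ToyChain() for _ in range(3)]
for payload in (b"first", b"second", b"third"):
    for node in nodes:
        node.append(payload)

total_stored = sum(node.size_bytes() for node in nodes)
print(total_stored)  # network-wide storage grows with data size times node count
```

Because every node keeps every block, total storage across the network is the chain size multiplied by the number of nodes.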

This is known as blockchain-based persistence.

The issue with blockchain-based persistence is that the chain could get far too big to maintain and to store all the data feasibly (e.g. many sources estimate the Internet to require over 40 zettabytes of storage capacity).

The blockchain must also have some type of incentive structure. For blockchain-based persistence, the incentive is a payment to validators: when data is added to the chain, the validators are paid for adding it.

Platforms with blockchain-based persistence:

Contract-based

Contract-based persistence starts from the intuition that data cannot be replicated by every node and stored forever, and must instead be maintained through contract agreements. These are agreements made with multiple nodes that have promised to hold a piece of data for a period of time. The agreements must be funded again or renewed whenever they run out to keep the data persisted.
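As a rough illustration of the agreements described above, the sketch below models a time-limited storage deal that lapses unless it is renewed. `StorageDeal`, `active_replicas`, and the use of plain integers for time are hypothetical choices for this example and do not correspond to any particular platform's API.

```python
from dataclasses import dataclass


@dataclass
class StorageDeal:
    """An agreement by one provider to hold one piece of data until expires_at."""

    data_id: str
    provider: str
    expires_at: int  # arbitrary time unit, e.g. a block height

    def is_active(self, now: int) -> bool:
        return now < self.expires_at

    def renew(self, now: int, extra_duration: int) -> None:
        # Extend from the current expiry, or from now if the deal already lapsed.
        self.expires_at = max(self.expires_at, now) + extra_duration


def active_replicas(deals, data_id: str, now: int) -> int:
    """Count how many providers are still obliged to hold the data."""
    return sum(1 for d in deals if d.data_id == data_id and d.is_active(now))


deals = [
    StorageDeal("file-1", "provider-a", expires_at=100),
    StorageDeal("file-1", "provider-b", expires_at=100),
    StorageDeal("file-1", "provider-c", expires_at=150),
]

before = active_replicas(deals, "file-1", now=120)  # two deals have lapsed
deals[0].renew(now=120, extra_duration=100)         # pay to extend one of them
after = active_replicas(deals, "file-1", now=120)
print(before, after)
```

If no one renews the deals, the number of providers obliged to hold the data eventually drops to zero, which is the key difference from blockchain-based persistence.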

In most cases, instead of storing all data on-chain, only a hash that identifies where the data is located is stored on-chain. This way, the entire chain doesn't need to scale to keep all of the data.
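The idea of keeping only a hash on-chain can be sketched as follows. The two containers are stand-ins invented for this example: `off_chain_store` plays the role of a storage network and `on_chain_records` plays the role of contract storage. SHA-256 is used for simplicity; real platforms may use other hash or content-identifier formats.

```python
import hashlib

off_chain_store = {}   # stand-in for a decentralized storage network
on_chain_records = []  # stand-in for data recorded on a blockchain


def store(data: bytes) -> str:
    """Keep the payload off-chain and record only its hash on-chain."""
    digest = hashlib.sha256(data).hexdigest()
    off_chain_store[digest] = data
    on_chain_records.append(digest)
    return digest


def retrieve(digest: str) -> bytes:
    """Fetch the payload and check it against the on-chain hash."""
    data = off_chain_store[digest]
    if hashlib.sha256(data).hexdigest() != digest:
        raise ValueError("data does not match the on-chain hash")
    return data


payload = b"x" * 1_000_000           # 1 MB of data kept off-chain
digest = store(payload)
on_chain_bytes = len(bytes.fromhex(digest))
print(on_chain_bytes)                # only 32 bytes are recorded on-chain
```

The on-chain footprint stays at 32 bytes regardless of payload size, and anyone fetching the data can verify it against the recorded hash.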

Platforms with contract-based persistence:

Additional considerations

IPFS is a distributed system for storing and accessing files, websites, applications, and data. It doesn't have a built-in incentive scheme, but can instead be used with any of the contract-based incentive solutions above for longer-term persistence. Another way to persist data on IPFS is to work with a pinning service, which will "pin" your data for you. You can even run your own IPFS node and contribute to the network to persist your own and/or others' data for free!

Swarm is a decentralized data storage and distribution technology with a storage incentive system and a storage rent price oracle.

Data retention

To retain data, a system needs some mechanism that verifies nodes are still storing it.

Challenge mechanism

One of the most popular ways to make sure data is retained is to issue a cryptographic challenge to the nodes to check that they still have the data. A simple example is Arweave's proof-of-access: the network issues a challenge to the nodes to see whether they have the data at both the most recent block and a random block in the past. If a node can't come up with the answer, it is penalized.
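The general shape of such a challenge can be sketched with a simple nonce-based check. This is a deliberately simplified illustration and not Arweave's proof-of-access or any platform's real protocol: `respond` and `challenge` are hypothetical names, and the verifier here holds its own copy of the data, which production systems avoid by using more compact proofs.

```python
import hashlib
import secrets


def respond(data: bytes, nonce: bytes) -> str:
    """What a storage node computes: a hash over the fresh nonce and the data."""
    return hashlib.sha256(nonce + data).hexdigest()


def challenge(reference_data: bytes, node_respond) -> bool:
    """Issue a random challenge and check the node's answer."""
    nonce = secrets.token_bytes(16)  # fresh nonce, so old answers can't be replayed
    expected = respond(reference_data, nonce)
    return node_respond(nonce) == expected


data = b"some stored file contents"


# An honest node still has the data and can answer any challenge.
def honest_node(nonce: bytes) -> str:
    return respond(data, nonce)


# A dishonest node has discarded the data and can only guess.
def dishonest_node(nonce: bytes) -> str:
    return respond(b"", nonce)


honest_ok = challenge(data, honest_node)
dishonest_ok = challenge(data, dishonest_node)
print(honest_ok, dishonest_ok)
```

Because the nonce changes on every challenge, a node cannot precompute an answer and then delete the data; a failed check is the point at which a real network would apply a penalty.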

Types of dStorage with a challenge mechanism:

  • 0Chain
  • Skynet
  • Arweave
  • Filecoin
  • Crust Network
  • 4EVERLAND

Decentrality

There aren't great tools for measuring how decentralized a platform is, but in general, you'll want to use tools that don't require some form of KYC, as evidence that they are not centralized.

Decentralized tools without KYC:

  • 0Chain (implementing a non-KYC edition)
  • Skynet
  • Arweave
  • Filecoin
  • IPFS
  • Ethereum
  • Crust Network
  • 4EVERLAND

Consensus

Most of these tools have their own version of a consensus mechanism, but generally they are based on either proof-of-work (PoW) or proof-of-stake (PoS).

Proof-of-work based:

  • Skynet
  • Arweave

Proof-of-stake based:

  • Ethereum
  • Filecoin
  • 0Chain
  • Crust Network

IPFS - InterPlanetary File System is a decentralized storage and file referencing system for Ethereum.

Storj DCS - Secure, private, and S3-compatible decentralized cloud object storage for developers.

Skynet - Skynet is a decentralized PoW chain dedicated to a decentralized web.

Filecoin - Filecoin was created by the same team behind IPFS. It is an incentive layer on top of the IPFS ideals.

Arweave - Arweave is a dStorage platform for storing data.

0Chain - 0Chain is a proof-of-stake dStorage platform with sharding and blobbers.

Crust Network - Crust is a dStorage platform on top of IPFS.

Swarm - A distributed storage platform and content distribution service for the Ethereum web3 stack.

OrbitDB - A decentralized peer-to-peer database on top of IPFS.

Aleph.im - Decentralized cloud project (database, file storage, computing and DID). A unique blend of offchain and onchain peer-to-peer technology. IPFS and multi-chain compatibility.

Ceramic - User-controlled IPFS database storage for data-rich and engaging applications.

Filebase - S3-compatible decentralized storage and geo-redundant IPFS pinning service. All files uploaded to IPFS through Filebase are automatically pinned to the Filebase infrastructure with 3x replication across the globe.

4EVERLAND - A Web 3.0 cloud computing platform that integrates storage, compute and networking core capabilities, is S3 compatible and provides synchronous data storage on decentralized storage networks such as IPFS and Arweave.

Kaleido - A blockchain-as-a-service platform with click-button IPFS nodes.
