Key Takeaways
- Sharding attempts to improve the throughput of decentralized networks and the scaling potential of many blockchain protocols.
- As development continues and sharding is enabled, Ethereum 2.0 could be on the way to solving the blockchain trilemma for the first time in history.
- Sharding seems like an ideal solution to the scaling problem, but there are currently several issues in the path to its success, like operational complexities and latency.
Blockchain has captured the minds and funds of people all over the world. The last decade has been a monumental success in terms of innovation, but there are still several issues holding the technology back from mainstream adoption. Payments are a part of our everyday lives, and while traditional payment networks run on complex infrastructure manned by expert teams, hardly any of that is shown to the end-user.
Scalability Issues
Modern centralized payment networks like Visa and Mastercard can handle around 2,000 transactions per second, a far cry from the five transactions per second Bitcoin manages. Scalability is the most significant factor holding decentralized networks back, but solving the problem isn’t as straightforward as it sounds.
The cryptocurrency industry has an almost religious obsession with decentralization, and creating distributed systems that match the efficiency of centralized ones isn’t just challenging; it’s near impossible. Unlike centralized models, blockchain networks are often developed by open-source communities with less incentive to improve the technology.
While enterprise financial systems need to be high-performance, slow transaction speeds aren’t the only concern. Network congestion can lead to exorbitant transaction fees, as happened on Ethereum during the CryptoKitties craze in 2017, making the network impractical for most use cases.
Scaling is a problem that the industry has been wrestling with since the creation of Bitcoin. From block-size increases and Segregated Witness (SegWit) to the Lightning Network and other Layer-2 solutions, there have been many attempts to improve network throughput over the last decade. However, few have been even remotely successful, and those that were usually came at the expense of other system characteristics, such as the degree of decentralization.
The blockchain trilemma represents the challenge in creating a scalable, secure, and decentralized network without sacrificing one of these three attributes. Last year, amidst a global pandemic and widespread economic downturn, Bitcoin was touching new all-time highs, and the total value of ETH locked into decentralized finance platforms rose by over 7,000%. From governments and central banks to hedge funds and retail investors, people have started paying attention to cryptocurrencies, but blockchain may not be ready.
Almost all DeFi applications currently run atop the Ethereum network. As more people jump aboard, blockchain’s ability to scale will be the industry’s primary bottleneck in the years to come. Ethereum 2.0 has been under development for years, and while the chain has officially launched, core features like its sharded structure have yet to be implemented. However, as development continues and sharding is enabled, Ethereum 2.0 could be on the way to solving the blockchain trilemma for the first time in history.
Scaling in Shards
Improve Network’s Security
Layer-2 solutions are still a necessary part of making blockchain networks usable at scale. These protocols usually involve a second network layer where computation is off-loaded, creating more headroom without making any changes to the base chain. Additionally, elements of the original chain can be used while constructing the second layer.
This helps improve the network’s security, and by implementing a mixed approach of both Layer-1 and Layer-2 scaling solutions, developers hope to improve the blockchain’s performance without impacting its security or decentralization. Ethereum already supports various Layer-2 solutions, such as Raiden, Plasma, and rollups, which relieve computational bottlenecks on the base layer.
Validate Individual Transactions Faster
Plasma implements ‘child chains’, which can individually handle up to 15 transactions per second, while Raiden is more of an off-chain solution for opening up payment channels that participants can fund. The two protocols go hand-in-hand, with Plasma handling smart contracts and triggering payment channels on Raiden.
On the other hand, sharding divides nodes into groups, so each node doesn’t have to validate the whole chain. Initially, this technique was created to partition large databases into more manageable chunks horizontally.
Sharding Ethereum Explained
This is easier to understand in terms of a spreadsheet, where splitting the table vertically would make each part meaningless without the other. Horizontal splitting, however, would keep each part relevant without the other while also offering the complete picture when pieced back together. When this concept is extrapolated to a blockchain, the chain’s state is fragmented into chunks called shards.
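As a rough illustration of the horizontal split described above, the sketch below partitions a small table of account rows into shards by hashing the account key. The `shard_for` and `partition_rows` helpers, the row layout, and the shard count of 4 are assumptions made for this example; they are not taken from any Ethereum specification.

```python
import hashlib

NUM_SHARDS = 4  # illustrative value, not a protocol constant


def shard_for(account: str, num_shards: int = NUM_SHARDS) -> int:
    """Map an account key to a shard index using a stable hash."""
    digest = hashlib.sha256(account.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def partition_rows(rows, num_shards: int = NUM_SHARDS):
    """Split rows horizontally: every row stays whole but lives in one shard."""
    shards = [[] for _ in range(num_shards)]
    for row in rows:
        shards[shard_for(row["account"], num_shards)].append(row)
    return shards


rows = [
    {"account": "alice", "balance": 50},
    {"account": "bob", "balance": 20},
    {"account": "carol", "balance": 75},
    {"account": "dave", "balance": 5},
]
shards = partition_rows(rows)

# Piecing the shards back together recovers the complete table.
reassembled = sorted((r for s in shards for r in s), key=lambda r: r["account"])
```

Each row remains meaningful on its own inside its shard, and concatenating the shards gives back the full table, which is the property the spreadsheet analogy relies on.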
This allows shards to validate individual transactions faster, while other Ethereum nodes verify the shard chain as a whole. Most operations related to cryptocurrency transactions occur in sequence, with one action taking place after another has ended. In fact, each step depends on the successful completion of the previous step. As the network grows in size and number of users, sequential processing becomes impractically inefficient.
To combat this problem, parallel processing is a much more viable alternative. By breaking the blockchain into multiple parts and processing them in parallel, sharding should radically improve how much computation the network can handle at once. Even a handful of shards would dramatically improve Ethereum’s scaling potential, but with nearly 7,000 nodes at its disposal, the outcome could be much more remarkable.
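The idea can be sketched in a few lines, assuming each shard owns a disjoint set of accounts. Transfers inside a shard are still applied one after another, but separate shards are handed to separate workers. The `apply_transactions` function and the sample data are hypothetical, and Python threads are used here only to show the structure, not to demonstrate a real speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def apply_transactions(balances: dict, transactions: list) -> dict:
    """Apply one shard's transfers in order; each step depends on the previous one."""
    state = dict(balances)
    for sender, recipient, amount in transactions:
        if state.get(sender, 0) >= amount:  # skip transfers the sender cannot afford
            state[sender] -= amount
            state[recipient] = state.get(recipient, 0) + amount
    return state


# Each shard owns a disjoint set of accounts and its own transaction list.
shard_states = [
    {"alice": 10, "bob": 0},
    {"carol": 8, "dave": 2},
]
shard_txs = [
    [("alice", "bob", 4), ("bob", "alice", 1)],
    [("carol", "dave", 3), ("dave", "carol", 10)],  # second transfer is rejected
]

# Shards do not share accounts, so they can be processed side by side.
with ThreadPoolExecutor(max_workers=len(shard_states)) as pool:
    new_states = list(pool.map(apply_transactions, shard_states, shard_txs))
```

Because no account appears in more than one shard, the result does not depend on which worker finishes first.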
Earlier plans called for dividing Ethereum 2.0 into 1,024 shards, which could theoretically expand its throughput a thousandfold. Unlike centralized systems, there is no central server to keep track of the network’s state. This needs to be done by the nodes on the chain, and reducing the workload on each node allows Ethereum to scale without nodes requiring expensive hardware to participate.
One of the most crucial parts of implementing sharding is Ethereum’s move from a Proof-of-Work (PoW) consensus mechanism to Proof-of-Stake (PoS). Bitcoin uses PoW, which incentivizes miners to compete to solve a mathematical problem to validate transactions for a chance to win the block reward (currently 6.25 BTC).
Bitcoin Mining and Energy Consumption
The problem with this consensus algorithm is its colossal power demand: mining the chain requires a large amount of electricity to perform the necessary calculations. Reports estimate that Bitcoin consumes as much electricity as a small country, and while a broad shift to renewable energy sources is a slightly more sustainable option, the escalating computational requirements mean its energy demand will only keep growing.
Instead of having miners purchase dedicated, expensive equipment to partake in the mining process, Proof-of-Stake allows network participants to validate transactions in proportion to their stake in the network. Participants lock up or ‘stake’ tokens, and based on the number of tokens staked, validators are given a chance to validate the next block. This uses significantly less energy while maintaining the network’s security and levels of decentralization.
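A minimal sketch of stake-weighted selection follows, assuming the chance of being picked is simply proportional to the tokens staked. This is a simplification for illustration and not the selection algorithm Ethereum 2.0 actually uses; the validator names, stake amounts, and the `pick_validator` helper are all hypothetical.

```python
import random


def pick_validator(stakes: dict, rng: random.Random) -> str:
    """Pick one validator with probability proportional to its stake."""
    validators = list(stakes)
    weights = [stakes[v] for v in validators]
    return rng.choices(validators, weights=weights, k=1)[0]


stakes = {"validator_a": 32, "validator_b": 64, "validator_c": 0}
rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Count how often each validator is chosen over many rounds.
counts = {v: 0 for v in stakes}
for _ in range(10_000):
    counts[pick_validator(stakes, rng)] += 1
```

Over many rounds, the validator with twice the stake is chosen roughly twice as often, and a participant with nothing staked is never chosen.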
On Ethereum 2.0, the overall state of the network is called its ‘global state.’ This state is broken down into shards, with each shard possessing its own state. Together, the global state, shards, and the shards’ states form a Merkle tree, in which each node is derived by hashing the nodes one level below it, so the global state root commits to every shard’s state.
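To make the Merkle structure concrete, here is a small sketch that treats each shard’s state as a leaf and hashes pairs of nodes upward until a single root remains. In a Merkle tree, every parent hash is computed from its children. The use of SHA-256, the rule of duplicating the last node on odd-sized levels, and the byte strings standing in for shard states are assumptions for illustration only.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Hash helper used for both leaves and internal nodes."""
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list) -> bytes:
    """Compute a Merkle root by hashing pairs of nodes level by level."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


shard_states = [b"shard-0-state", b"shard-1-state", b"shard-2-state", b"shard-3-state"]
global_root = merkle_root(shard_states)

# Changing any single shard's state changes the global root.
changed = list(shard_states)
changed[2] = b"shard-2-state-updated"
changed_root = merkle_root(changed)
```

The root acts as a compact fingerprint of all shard states, which is why comparing roots is enough to detect that some shard has changed.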
When sharding is eventually activated on Ethereum 2.0, its state will be split into shards, with each unique account belonging to a particular shard. As explained by Ethereum creator Vitalik Buterin at Devcon, it will be as if Ethereum were split into thousands of islands, where each island can exist independently while interacting and sharing resources with other islands.
Ethereum 2.0 plans on executing this through two interaction levels. In the first interaction level, each shard has its own transaction group. Each group carries a transaction group header, which contains the shard ID, the state of the shard before transactions were placed inside it (pre-state root), the state after transactions have been added (post-state root), a ‘receipt root’ to acknowledge the addition of transactions, and a group of validators randomly chosen to verify the shard chain’s data.
Each transaction broadcasts the shard ID it belongs to, with the transactions occurring between two accounts in that shard. By specifying the pre-state and post-state roots, this interaction level also reveals state transitions. The second interaction level acts as a simple blockchain that accepts transaction groups instead of individual transactions, with groups accepted only if their pre-state root matches the shard root from the global state.
Additionally, validators must verify every signature in the transaction group before the group is accepted. Once the group is accepted into a block, the shard’s root in the global state is updated to the group’s post-state root.
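The acceptance rule for the second interaction level can be sketched as follows, using the header fields named above. The `GroupHeader` class, the `try_accept` function, and the string roots are hypothetical, and signature verification is reduced to a boolean flag so the example stays self-contained.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupHeader:
    shard_id: int
    pre_state_root: str
    post_state_root: str
    receipt_root: str
    signatures_valid: bool  # stand-in for real signature verification


def try_accept(global_state: dict, header: GroupHeader) -> bool:
    """Accept a group only if it builds on the shard root the global state holds."""
    if global_state.get(header.shard_id) != header.pre_state_root:
        return False  # group was built on a stale or unknown shard state
    if not header.signatures_valid:
        return False
    # Accepting the group advances the shard's root in the global state.
    global_state[header.shard_id] = header.post_state_root
    return True


global_state = {0: "root-a", 1: "root-x"}
good = GroupHeader(0, "root-a", "root-b", "receipts-1", True)
stale = GroupHeader(0, "root-a", "root-c", "receipts-2", True)

accepted_good = try_accept(global_state, good)    # pre-state root matches
accepted_stale = try_accept(global_state, stale)  # shard 0 has already moved on
```

The second group is rejected because its pre-state root no longer matches what the global state records for that shard, which is how the pre-state check prevents two conflicting groups from both being applied.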
Complex Communication in Sharding
Sharding is a brilliant concept that could change how our financial systems operate, but it isn’t particularly useful unless the individual shards can communicate with each other. To make the system efficient, shards must interact effectively while reducing communication bottlenecks and expenses.
To do this, shards communicate only when necessary, but the biggest challenge for developers is enabling cross-shard communication and scaling while maintaining the same level of security. Ethereum 2.0 uses a ‘receipt paradigm’ to accomplish this, employing a distributed shared memory to store receipts on the beacon chain.
Other shards can still see the receipts inside the Ethereum 2.0 beacon chain, but they cannot modify them due to blockchain networks’ immutable nature. This allows shards to mutually benefit without affecting the network’s ability to achieve finality. On the surface, sharding seems like an ideal solution to the scaling problem, but there are currently several issues in the path to its success, like operational complexities and latency.
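A toy model of the receipt idea is sketched below: the sending shard debits the sender and publishes a receipt to a shared, append-only store, and the receiving shard later reads that receipt and credits the recipient exactly once. The `Beacon`, `Shard`, and `Receipt` classes are assumptions made for this illustration and do not reflect the actual beacon chain data structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: a stored receipt cannot be modified
class Receipt:
    receipt_id: int
    from_shard: int
    to_shard: int
    recipient: str
    amount: int


class Beacon:
    """Append-only receipt store standing in for the beacon chain."""

    def __init__(self):
        self._receipts = []

    def publish(self, from_shard, to_shard, recipient, amount) -> Receipt:
        receipt = Receipt(len(self._receipts), from_shard, to_shard, recipient, amount)
        self._receipts.append(receipt)
        return receipt

    def receipts_for(self, shard_id):
        return tuple(r for r in self._receipts if r.to_shard == shard_id)


class Shard:
    def __init__(self, shard_id, balances):
        self.shard_id = shard_id
        self.balances = dict(balances)
        self.claimed = set()  # receipt ids already credited on this shard

    def send(self, beacon, sender, to_shard, recipient, amount):
        if self.balances.get(sender, 0) < amount:
            raise ValueError("insufficient balance")
        self.balances[sender] -= amount
        return beacon.publish(self.shard_id, to_shard, recipient, amount)

    def claim_all(self, beacon):
        for r in beacon.receipts_for(self.shard_id):
            if r.receipt_id not in self.claimed:
                self.claimed.add(r.receipt_id)
                self.balances[r.recipient] = self.balances.get(r.recipient, 0) + r.amount


beacon = Beacon()
shard_a = Shard(0, {"alice": 10})
shard_b = Shard(1, {"bob": 0})
shard_a.send(beacon, "alice", to_shard=1, recipient="bob", amount=4)
shard_b.claim_all(beacon)
shard_b.claim_all(beacon)  # a second pass credits nothing new
```

The two shards never call each other directly; the only shared piece is the read-only receipt log, which mirrors the point that shards communicate only when necessary.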
To deal with operational complexities, Buterin announced two proposals to develop a fully-sharded network with a minimal consensus-layer framework. The proposals should provide enough support for developing intricately designed smart contract frameworks while enabling cross-shard communication using Layer-2 abstraction for transfers and code execution on the blockchain.
Sending a token from one shard to another involves many steps, each of which introduces latency. One proposed way to tackle this is the ‘Fast Cross-Shard Transfers Via Optimistic Receipt Root’ solution. Mouthful aside, it allows the network to temporarily change an account’s state until the transaction is verified. This enables shards to ensure finality without needing to factor in communication latency.
Once the transfer is verified, the transaction becomes permanent if valid and is reversed if not. However, this does raise the possibility of large mining pools taking over the system and centralizing its operations, mainly because most shards will operate at only a fraction of the entire chain’s hash rate.
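The temporary state change and its later confirmation or reversal can be sketched like this. It is only a simplified model of the behaviour described above, not the design of the actual proposal; the `OptimisticAccount` class and the transfer identifiers are hypothetical.

```python
class OptimisticAccount:
    """Balance that can be credited tentatively before a transfer is verified."""

    def __init__(self, balance: int):
        self.confirmed = balance
        self.pending = {}  # transfer id -> tentative amount

    @property
    def optimistic_balance(self) -> int:
        """Confirmed funds plus every credit still awaiting verification."""
        return self.confirmed + sum(self.pending.values())

    def credit_tentatively(self, transfer_id: str, amount: int) -> None:
        self.pending[transfer_id] = amount

    def resolve(self, transfer_id: str, valid: bool) -> None:
        amount = self.pending.pop(transfer_id)
        if valid:
            self.confirmed += amount  # the transfer becomes permanent
        # If invalid, dropping the pending entry reverses the tentative change.


bob = OptimisticAccount(balance=5)
bob.credit_tentatively("t1", 3)
bob.credit_tentatively("t2", 7)
balance_before_verification = bob.optimistic_balance

bob.resolve("t1", valid=True)   # verified: kept
bob.resolve("t2", valid=False)  # failed verification: rolled back
```

Keeping tentative credits separate from the confirmed balance makes the rollback trivial, at the cost of tracking which funds are still unverified.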
The Last Stage of Ethereum 2.0’s Launch
The Ethereum 2.0 Phase 0 launch took place late last year, and validators have since collectively sent over 3.5 million ETH into its staking deposit contract. Still, Phase 0 only marks the launch of Ethereum 2.0’s Proof-of-Stake implementation, and it’s only in Phase 1 that the network will activate its sharding mechanism.
Phase 1 should take place this year, partitioning the blockchain into 64 parallel shard chains that are in sync with each other. After this phase, the network should be capable of simultaneously processing transactions from 64 blocks while considerably reducing the load on the main beacon chain.
The intermediary Phase 1.5 involves integrating the original Proof-of-Work chain with the new beacon chain, creating a new Proof-of-Stake chain. This will allow Eth1 to exist as one of the 64 shard chains complete with its entire history of transactions without running the power-hungry PoW algorithm.
Phase 2 will involve the fine-tuning of accounts, transactions, and smart contract execution, and while this is the last stage of Ethereum 2.0’s launch, it could be the most important one. Implementing sharding is exceptionally complicated, and in the long run, the network’s phased launch is probably for the best.
Conclusion
Blockchain has been the recipient of both acclaim and criticism over the last decade for its future potential and current limitations, and its lack of a genuinely robust scaling solution has pushed many projects to more centralized mechanisms. As the chain’s development pushes forward, along with Layer-2 solutions like Plasma and Raiden, sharding could enable unprecedented throughput levels on the network, lending credence to the claims of blockchain proponents over all these years.