Skip to main content

Shards

Sharding is a mature concept originating in database design. It involves splitting and distributing one logical data set across multiple databases that share nothing and can be deployed across multiple servers. Simply put, sharding allows horizontal scalability - splitting data to distinct, independent pieces that can be processed in parallel. This is a key concept in the world's transition from data to big data. When data sets are becoming too big to be handled by traditional means, there's no other way to scale other than to break them down into smaller pieces.

The Sharding mechanism in the TON blockchain, allows a large number of transactions to be processed. The TON Blockchain consists of one masterchain and up to 232 workchains. Each workchain is a separate chain with its rules. Each workchain can further split into 260 shardchains, or sub-shards, containing a fraction of the workchain's state. Currently, only one workchain is operating on TON - Basechain. The main idea of sharding in TON is that when account A sends a message to account B and account C sends a message to account D both of these operations can be performed asynchronously.

By default in the Basechain (workchain=0) there is only one shard with a shard number 0x8000000000000000 (or 1000000000000000000000000000000000000000000000000000000000000000 in binary representation). The masterchain (workchain=-1) always has one and only one shard.

Masterchain

The Masterchain is the primary chain that stores the network configuration and the final state of all workchains. You can understand this as the masterchain being the core directory, a single source of truth for all the shards in the ecosystem.

It carries fundamental protocol information, including current settings, a list of active validators and their stakes, active workchains, and associated shardchains. Most importantly, it maintains a record of the latest block hashes for all workchains and shardchains, enforcing consensus across the network.

Workchain

The Masterchain splits into individual chains called Workchains. Workchains are customized blockchains tailored to certain transactions or use cases, running parallel within the TON network.

Uniqueness

In designing the TON blockchain, two key decisions were made that make it unique among other blockchains that utilize sharding.

First, TON provides for dynamic segmentation of the blockchain depending on the network load. When the number of transactions increases to a critical level, the blockchain is automatically split into two separate shardchains. If the load on one of the parts continues to grow, it is split in half again, and this process continues as needed. If the number of transactions decreases, the shards can merge again. This adaptive model allows for the creation of as many shards as needed at a given time.

The second solution that distinguishes TON is the principle of a non-fixed number of shards. Unlike systems like Ethereum 2.0, which supports a fixed number of shards (64 shards), TON allows adding more and more shards depending on the needs of the network, with a theoretical limit of 260 shards per worker chain. This number is so large as to be virtually limitless, and allows over 100 million shards to be provided to every person on Earth and still have a supply left over. This approach is the only way to meet dynamic scaling requirements that are difficult to predict in advance.

Splitting

In TON Blockchain, a sequence of transactions of a single account (e.g. Tx1 -> Tx2 -> Tx3 -> ...) is called an account transaction chain or AccountChain. This emphasizes that it is a sequence of transactions associated with a single account. Several such AccountChains combined within a single shard form a ShardChain. The ShardChain (hereinafter referred to as the shard) is responsible for storing and processing all transactions within a shard, where each transaction chain is defined by a specific account group.

These groups of accounts are denoted by a common binary prefix, which serves as a criterion for clustering them in the same shard. This prefix appears in shard id which presented by a 64-bit integer and has the following structure: <binary prefix>100000.... For example, shard with id 1011100000... contains all accounts started with prefix 1011.

When number of transactions in some shard grows, this shard splits into two shards. New shards obtain the following ids: <parent prefix>01000... and <parent prefix>11000... and become responsible for accounts starting with <parent prefix>0 and <parent prefix>1 correspondingly. The seqnos of blocks in shards go continuously starting from the parent last seqno plus 1. After split, shards go independently and may have different seqnos.

A simple example:

Masterchain block contains information about shards in its header. After the block of shard appears in masterchain header it may be considered as finished (it cannot be rolled back).

An actual example:

  • Masterchain block seqno=34607821 has 2 shards: (0,4000000000000000,40485798) and (0,c000000000000000,40485843) (https://toncenter.com/api/v2/shards?seqno=34607821).
  • Shard shard=4000000000000000 was splitted into shard=2000000000000000 and shard=6000000000000000, and the masterchain block seqno=34607822 obtains 3 shards: (0,c000000000000000,40485844), (0,2000000000000000,40485799) and (0,6000000000000000,40485799). Note that both new shards have the same seqnos but different shard IDs (https://toncenter.com/api/v2/shards?seqno=34607822).
  • New shards go independently and after 100 masterchain blocks (in masterchain block seqno=34607921) one shard has the last block (0,2000000000000000,40485901) and another one has (0,6000000000000000,40485897) (https://toncenter.com/api/v2/shards?seqno=34607921).

Merging

If the load on shards goes down they can merge back:

  • Two shards can merge if they have a common parent and, therefore, their shard ids are <parent prefix>010... and <parent prefix110.... Merged shard will have shard id <parent prefix>10... (for example 10010... + 10110... = 1010...). The first block of a merged shard will have seqno=max(seqno1, seqno2) + 1.

A simple example:

An actual example:

See Also