# TL-B Language

TL-B (Type Language - Binary) serves to describe the type system, constructors and existing functions. For example, we
can use TL-B schemes to build binary structures associated with TON Blockchain. Special TL-B parsers can read schemes to
deserialize binary data into different objects. TL-B describes data schemes for `Cell`

objects. If you not familiar
with `Cells`

, please read Cell & Bag of Cells(BOC) article.

## Overview

We refer to any set of TL-B constructs as TL-B documents. A TL-B document usually consists of declarations of types (
i.e. their constructors) and functional combinators. The declaration of each combinator ends with a semicolon (`;`

).

Here is an example of a possible combinator declaration:

## Constructors

The left-hand side of each equation describes the way to define, or serialize, a value of the type indicated on the right-hand side. Such a description begins with the name of a constructor.

Constructors are used to specify the type of combinator, including the state at serialization. For example, constructors
can also be used when you want to specify an `op`

(operation code) in query to a smart contract in TON.

`// ....`

transfer#5fcc3d14 <...> = InternalMsgBody;

// ....

- constructor name:
`transfer`

- constructor prefix code:
`#5fcc3d14`

Notice, every constructor name immediately followed by an optional constructor tag, such as `#_`

or `$10`

, which
describes the bitstring used to encode (serialize) the constructor in question.

`message#3f5476ca value:# = CoolMessage;`

bool_true$0 = Bool;

bool_false$1 = Bool;

The left-hand side of each equation describes the way to define, or serialize, a value of the type indicated on the
right-hand side. Such a description begins with the name of a constructor, such as `message`

or `bool_true`

, immediately
followed by an optional constructor tag, such as `#3f5476ca`

or `$0`

, which describes the bits used to encode (
serialize)
the constructor in question.

constructor | serialization |
---|---|

`some#3f5476ca` | 32-bit uint serialize from hex value |

`some#5fe` | 12-bit uint serialize from hex value |

`some$0101` | serialize `0101` raw bits |

`some` or `some#` | serialize `crc32(equation) \| 0x80000000` |

`some#_` or `some$_` or `_` | serialize nothing |

Constructor names (`some`

in this example) are used as variables in codegen. For example:

`bool_true$1 = Bool;`

bool_false$0 = Bool;

Type `Bool`

has two tags `0`

and `1`

. Codegen pseudocode might look like:

class Bool:

tags = [1, 0]

tags_names = ['bool_true', 'bool_false']

If you don't want to define any name for current constructor, just pass `_`

, e.g. `_ a:(## 32) = 32Int;`

Constructor tags may be given in either binary (after a dollar sign) or hexadecimal notation (after a hash sign). If a
tag is not
explicitly provided, the TL-B parser must compute a default 32-bit constructor tag by hashing with CRC32 algorithm
the text of the “equation” with `| 0x80000000`

defining this constructor in a certain fashion. Therefore, empty tags
must be explicitly provided by `#_`

or `$_`

.

This tag willies used to guess current type of bitstring in deserialization process. E.g. we have 1 bit bitstring `0`

,
if we tell TLB to parse this bitstring in type of `Bool`

it will parse it as `Bool.bool_false`

.

Let's say we have more complex examples:

`tag_a$10 val:(## 32) = A;`

tag_b$00 val(## 64) = A;

If we parse `1000000000000000000000000000000001`

(1 and 33 zeroes and 1) in TLB type `A`

- firstly we need to get first
two bits to define tag. In this example `10`

is two first bits and they represent `tag_a`

. So now we know that next 32
bits are `val`

variable, `1`

in our example. Some "parsed" pseudocode variables may look like:

`A.tag = 'tag_a'`

A.tag_bits = '10'

A.val = 1

All constructor names must be distinct and constructor tags for the same type must constitute a prefix code (otherwise the deserialization would not be unique); i.e. no tag can be a prefix of any other in same type.

Maximum number of constructors per one type: `64`

Maximum bits for tag: `63`

**Binary example:**

`example_a$10 = A;`

example_b$01 = A;

example_c$11 = A;

example_d$00 = A;

Codegen pseudocode might look like:

class A:

tags = [2, 1, 3, 0]

tags_names = ['example_a', 'example_b', 'example_c', 'example_d']

**Hex tag example:**

`example_a#0 = A;`

example_b#1 = A;

example_c#f = A;

Codegen pseudocode might look like:

class A:

tags = [0, 1, 15]

tags_names = ['example_a', 'example_b', 'example_c']

If you use `hex`

tag, keep in mind that it will be serialized as 4 bits for each hex symbol. Maximum value is 63-bit
unsigned integer. This means:

`a#32 a:(## 32) = AMultiTagInt;`

b#1111 a:(## 32) = AMultiTagInt;

c#5FE a:(## 32) = AMultiTagInt;

d#3F5476CA a:(## 32) = AMultiTagInt;

constructor | serialization |
---|---|

`a#32` | 8-bit uint serialize from hex value |

`b#1111` | 16-bit uint serialize from hex value |

`c#5FE` | 12-bit uint serialize from hex value |

`d#3F5476CA` | 32-bit uint serialize from hex value |

Also hex values allowed both in upper and lower case.

#### More about hex tags

In addition to the classic hex tag definition, a hexadecimal number can be followed by the underscore character. This means that the tag is equal to the specified hexadecimal number without the least significant bit. For example there is a scheme:

`vm_stk_int#0201_ value:int257 = VmStackValue;`

And the tag is not actually equal to `0x0201`

. To compute it we need to remove LSb from the binary representation of `0x0201`

:

`0000001000000001 -> 000000100000000`

So the tag equals to the 15-bit binary number `0b000000100000000`

.

## Field definitions

The constructor and its optional tag are followed by field definitions. Each field definition is of the
form `ident:type-expr`

, where ident is an identifier with the name of the field (replaced by an underscore for
anonymous fields), and type-expr is the field’s type. The type provided here is a type expression, which may include
simple types, parametrized types with suitable parameters or complex expressions.

**In sum ap all fields defined in type must not be greater than Cell (`1023` bits and `4` refs)**

### Simple types

`_ a:# = Type;`

-`Type.a`

here is 32-bit integer`_ a:(## 64) = Type;`

-`Type.a`

here is 64-bit integer`_ a:Owner = NFT;`

-`NFT.a`

here is`Owner`

type`_ a:^Owner = NFT;`

-`NFT.a`

here is cell ref to`Owner`

type means`Owner`

is stored in next cell reference.

### Anonymous fields

`_ _:# = A;`

- first field is anonymous 32-bit integer

### Extend cell with references

`_ a:(##32) ^[ b:(##32) c:(## 32) d:(## 32)] = A;`

- If for some reason we want to separate some fields to another
cell we can use
`^[ ... ]`

syntax. In this example`A.a`

/`A.b`

/`A.c`

/`A.d`

are 32-bit unsigned integers, but`A.a`

is stored in first cell, and`A.b`

/`A.c`

/`A.d`

are stored in next cell (1 ref)

`_ ^[ a:(## 32) ^[ b:(## 32) ^[ c:(## 32) ] ] ] = A;`

- Chain of references are also allowed. In this example each of
variables (
`a`

,`b`

,`c`

) are stored in separated cells

### Parametrized types

Suppose we have `IntWithObj`

type:

`_ {X:Type} a:# b:X = IntWithObj X;`

Now we can use it in other types:

`_ a:(IntWithObj uint32) = IntWithUint32;`

### Complex expressions

- Conditional fields (only for
`Nat`

) (`E?T`

means if expression`E`

is True than field has type`T`

)In`_ a:(## 1) b:a?(## 32) = Example;`

`Example`

type variable`b`

serialized only if`a`

is`1`

Multiply expression for tuples creation (

`x * T`

means create tuple of length`x`

of type`T`

):`a$_ a:(## 32) = A;`

b$_ b:(2 * A) = B;`_ (## 1) = Bit;`

_ 2bits:(2 * Bit) = 2Bits;Bit selection (only for

`Nat`

) (`E . B`

means take bit`B`

of`Nat`

`E`

)`_ a:(## 2) b:(a . 1)?(## 32) = Example;`

In

`Example`

type variable`b`

serialized only if second bit`a`

is`1`

- Other
`Nat`

operators also allowed (look`Allowed contraints`

)

Note: you can combine several complex expressions:

`_ a:(## 1) b:(## 1) c:(## 2) d:(a?(b?((c . 1)?(## 64)))) = A;`

## Built-in types

`#`

-`Nat`

32 bits unsigned integer`## x`

-`Nat`

with`x`

bits`#< x`

-`Nat`

less than`x`

bit unsigned integer stored as`lenBits(x - 1)`

bits, up to 31 bits`#<= x`

-`Nat`

less or equal than`x`

bit unsigned integer stored as`lenBits(x)`

bits, up to 32 bits`Any`

/`Cell`

- rest of cell bits&refs`Int`

- 257 bits`UInt`

- 256 bits`Bits`

- 1023 bits`uint1`

-`uint256`

- 1 - 256 bits`int1`

-`int257`

- 1 - 257 bits`bits1`

-`bits1023`

- 1 - 1023 bits`uint X`

/`int X`

/`bits X`

- same as`uintX`

but you can use parametrized`X`

in this types

## Constraints

`_ flags:(## 10) { flags <= 100 } = Flag;`

`Nat`

fields allowed in constraints. In this example `{ flags <= 100 }`

constraint means that `flags`

variable is less
or
equal `100`

.

Allowed contraints: `E`

| `E = E`

| `E <= E`

| `E < E`

| `E >= E`

| `E > E`

| `E + E`

| `E * E`

| `E ? E`

## Implicit fields

Some fields may be implicit. Their definitions are surrounded by curly
brackets(`{`

, `}`

), which indicate that the field is not actually present in the serialization, but that its value must
be deduced from other data (usually the parameters of the type being serialized). Example:

`nothing$0 {X:Type} = Maybe X;`

just$1 {X:Type} value:X = Maybe X;

`_ {x:#} a:(## 32) { ~x = a + 1 } = Example;`

## Parametrized types

Variables — i.e. the (identifiers of the) previously
defined fields of types `#`

(natural numbers) or `Type`

(type of types) — may be used as parameters for the parametrized
types. The serialization process recursively serializes each field according to its type and the serialization of a
value ultimately consists of the concatenation of bits representing the constructor (i.e. the constructor tag) and
the field values.

### Natural numbers (`Nat`

)

`_ {x:#} my_val:(## x) = A x;`

Means that `A`

is parametrized by `x`

`Nat`

. In deserialization process we will fetch `x`

-bit unsigned integer E.g.:

`_ value:(A 32) = My32UintValue;`

Means than in deserialization process of `My32UintValue`

type we will fetch 32-bit unsigned integer (because of `32`

parameter to `A`

type)

### Types

`_ {X:Type} my_val:(## 32) next_val:X = A X;`

Means that `A`

is parametrized by `X`

type. In deserialization process we will fetch 32-bit unsigned integer and than
parse
bits&refs of type `X`

.

Usage example of such parametrized type can be:

`_ bit:(## 1) = Bit;`

_ 32intwbit:(A Bit) = 32IntWithBit;

In this example we pass type `Bit`

to `A`

as parameter.

If you don't want to define type, but want to deserialize by this scheme you may use `Any`

word:

`_ my_val:(A Any) = Example;`

Means that if we deserialize `Example`

type we will fetch 32-bit integer and then rest of cell (bits&refs) to `my_val`

.

You can create complex types with several parameters:

`_ {X:Type} {Y:Type} my_val:(## 32) next_val:X next_next_val:Y = A X Y;`

_ bit:(## 1) = Bit;

_ a_with_two_bits:(A Bit Bit) = AWithTwoBits;

Also you can use partial apply on such parametrized types:

`_ {X:Type} {Y:Type} v1:X v2:Y = A X Y;`

_ bit:(## 1) = Bit;

_ {X:Type} bits:(A Bit X) = BitA X;

Or even parametrized types itself:

`_ {X:Type} v1:X = A X;`

_ {X:Type} d1:X = B X;

_ {X:Type} bits:(A (B X)) = AB X;

### NAT fields usage for parametrized types

You can use fields defined previously like parameters to types. Serialization will be determinate in runtime.

Simple example:

`_ a:(## 8) b:(## a) = A;`

This means that we store size of `b`

field inside of `a`

field. So when we want to serialize type `A`

we need to load 8
bit unsigned integer of `a`

field and then use this number to determinate size of `b`

field.

This strategy works for parametrized types as well:

`_ {input:#} c:(## input) = B input;`

_ a:(## 8) c_in_b:(B a) = A;

### Expression in parametrized types

`_ {x:#} value:(## x) = Example (x * 2);`

_ _:(Example 4) = 2BitInteger;

In this example `Example.value`

type is determinate in runtime.

In `2BitInteger`

definition we set value `Example 4`

type. To determinate this type we use `Example (x * 2)`

definition and calculate `x`

by formula (`y = 2`

, `z = 4`

):

`static inline bool mul_r1(int& x, int y, int z) {`

return y && !(z % y) && (x = z / y) >= 0;

}

We can also use add operator:

`_ {x:#} value:(## x) = ExampleSum (x + 3);`

_ _:(ExampleSum 4) = 1BitInteger;

In `1BitInteger`

definition we set value `ExampleSum 4`

type. To determinate this type we use `ExampleSum (x + 3)`

definition and calculate `x`

by formula (`y = 3`

, `z = 4`

):

`static inline bool add_r1(int& x, int y, int z) {`

return z >= y && (x = z - y) >= 0;

}

## Negate operator (`~`

)

Some occurrences of “variables” (i.e. already-defined fields) are prefixed by a tilde(`~`

). This indicates that the
variable’s occurrence is used in the opposite way to the default behavior: on the left-hand side of the equation, it
means that the variable will be deduced (computed) based on this occurrence, instead of substituting its previously
computed value; in the right-hand side, conversely, it means that the variable will not be deduced from the type being
serialized, but rather that it will be computed during the deserialization process. In other words, a tilde transforms
an “input argument” into an “output argument” or vice versa.

Simple example for negate operator is definition of new variable base on another variable:

`_ a:(## 32) { b:# } { ~b = a + 100 } = B_Calc_Example;`

After definition, you can use new variable for passing it to `Nat`

types:

`_ a:(## 8) { b:# } { ~b = a + 10 }`

example_dynamic_var:(## b) = B_Calc_Example;

The size of `example_dynamic_var`

will be computed in runtime, when we load `a`

variable and use it value for
determination of `example_dynamic_var`

size.

Or to other types:

`_ {X:Type} a:^X = PutToRef X;`

_ a:(## 32) { b:# } { ~b = a + 100 }

my_ref: (PutToRef b) = B_Calc_Example;

Also you can define variables with negate operator in add or multiply complex expressions:

`_ a:(## 32) { b:# } { ~b + 100 = a } = B_Calc_Example;`

`_ a:(## 32) { b:# } { ~b * 5 = a } = B_Calc_Example;`

### Negate operator (`~`

) in type definition

`_ {m:#} n:(## m) = Define ~n m;`

_ {n_from_define:#} defined_val:(Define ~n_from_define 8) real_value:(## n_from_define) = Example;

Assume we have class `Define ~n m`

which takes `m`

and compute `n`

loading it from `m`

bit unsigned integer.

In `Example`

type we store variable computed by `Define`

type into `n_from_define`

, also we know that it's `8`

bit
unsigned integer, because we apply `Define`

type with `Define ~n_from_define 8`

. Now we can use `n_from_define`

variable
in other types to determinate serialization process.

This technic lead to more complex type definitions (such as Unions, Hashmaps).

`unary_zero$0 = Unary ~0;`

unary_succ$1 {n:#} x:(Unary ~n) = Unary ~(n + 1);

_ u:(Unary Any) = UnaryChain;

This is example has good explanation in TL-B Types
article. The main idea here is that `UnaryChain`

will recursively deserialize until reach of `unary_zero$0`

(because we
know last element of `Unary X`

type by definition `unary_zero$0 = Unary ~0;`

and `X`

is calculated in runtime
due `Unary ~(n + 1)`

definition).

Note: `x:(Unary ~n)`

means that `n`

is defined in process of serialization of `Unary`

class.

## Special types

Currently, TVM allow types of cells:

- Ordinary
- PrunnedBranch
- Library
- MerkleProof
- MerkleUpdate

By default, all cells are `Ordinary`

. And all cells described in tlb are `Ordinary`

.

To allow load of special types in constructor you need to add `!`

before constructor.

Example:

`!merkle_update#02 {X:Type} old_hash:bits256 new_hash:bits256`

old:^X new:^X = MERKLE_UPDATE X;

!merkle_proof#03 {X:Type} virtual_hash:bits256 depth:uint16 virtual_root:^X = MERKLE_PROOF X;

This technic allow codegen code to mark `SPECIAL`

cells when you want to print structure, also it allow to correctly
validate structures with special cells.

## Several instances for one type without constructor uniqueness tag check

It's allowed to create several instances of one type depending only on type parameters. In this way of definition constructor tag unique check will not be applied.

Example:

`_ = A 1;`

a$01 = A 2;

b$01 = A 3;

_ test:# = A 4;

Means that actual tag for deserialization will be determinate by `A`

type parameter:

`# class for type `A``

class A(TLBComplex):

class Tag(Enum):

a = 0

b = 1

cons1 = 2

cons4 = 3

cons_len = [2, 2, 0, 0]

cons_tag = [1, 1, 0, 0]

m_: int = None

def __init__(self, m: int):

self.m_ = m

def get_tag(self, cs: CellSlice) -> Optional["A.Tag"]:

tag = self.m_

if tag == 1:

return A.Tag.cons1

if tag == 2:

return A.Tag.a

if tag == 3:

return A.Tag.b

if tag == 4:

return A.Tag.cons4

return None

Same works with several parameters:

`_ = A 1 1;`

a$01 = A 2 1;

b$01 = A 3 3;

_ test:# = A 4 2;

Please, keep in mind that when you add parametrized type definition, tags between predefined type definition (`a`

and `b`

in our example) and
parametrized type definition (`c`

in our example) must be unique:

*Not valid example:*

`a$01 = A 2 1;`

b$11 = A 3 3;

c$11 {X:#} {Y:#} = A X Y;

*Valid example:*

`a$01 = A 2 1;`

b$01 = A 3 3;

c$11 {X:#} {Y:#} = A X Y;

## Comments

Comments are the same as in C++

`/* `

This is

a comment

*/

// This is one line comment

## IDE Support

The intellij-ton plugin supports Fift, FunC and also TL-B.

The TL-B grammar is described in
the TlbParser.bnf file.

## Useful sources

Documentation provided by Disintar team.