TL-B Language
TL-B (Type Language - Binary) serves to describe the type system, constructors and existing functions. For example, we
can use TL-B schemes to build binary structures associated with TON Blockchain. Special TL-B parsers can read schemes to
deserialize binary data into different objects. TL-B describes data schemes for Cell objects. If you are not familiar
with Cells, please read the Cell & Bag of Cells (BOC) article.
Overview
We refer to any set of TL-B constructs as a TL-B document. A TL-B document usually consists of declarations of types
(i.e. their constructors) and functional combinators. The declaration of each combinator ends with a semicolon (;).
Here is an example of a possible combinator declaration:
Constructors
The left-hand side of each equation describes the way to define, or serialize, a value of the type indicated on the right-hand side. Such a description begins with the name of a constructor.
Constructors are used to specify the type of a combinator, including its state at serialization. For example,
constructors can be used when you want to specify an op (operation code) in a query to a smart contract in TON.
// ....
transfer#5fcc3d14 <...> = InternalMsgBody;
// ....
- constructor name: transfer
- constructor prefix code: #5fcc3d14
Note that every constructor name is immediately followed by an optional constructor tag, such as #_ or $10, which
describes the bitstring used to encode (serialize) the constructor in question.
message#3f5476ca value:# = CoolMessage;
bool_true$0 = Bool;
bool_false$1 = Bool;
The left-hand side of each equation describes the way to define, or serialize, a value of the type indicated on the
right-hand side. Such a description begins with the name of a constructor, such as message or bool_true, immediately
followed by an optional constructor tag, such as #3f5476ca or $0, which describes the bits used to encode (serialize)
the constructor in question.
constructor | serialization |
---|---|
some#3f5476ca | serialized as a 32-bit uint from the hex value |
some#5fe | serialized as a 12-bit uint from the hex value |
some$0101 | serialized as the raw bits 0101 |
some or some# | serialized as crc32(equation) \| 0x80000000 |
some#_ or some$_ or _ | nothing is serialized |
Constructor names (some in this example) are used as variables in codegen. For example:
bool_true$1 = Bool;
bool_false$0 = Bool;
Type Bool has two tags, 0 and 1. Codegen pseudocode might look like:
class Bool:
    tags = [1, 0]
    tags_names = ['bool_true', 'bool_false']
If you don't want to define a name for the current constructor, just pass _, e.g. _ a:(## 32) = 32Int;
Constructor tags may be given in either binary (after a dollar sign) or hexadecimal notation (after a hash sign). If a
tag is not explicitly provided, the TL-B parser must compute a default 32-bit constructor tag: it hashes the text of
the “equation” defining this constructor, normalized in a certain fashion, with the CRC32 algorithm and then applies
| 0x80000000 to the result. Therefore, empty tags must be explicitly provided by #_ or $_.
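The default-tag rule can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `default_constructor_tag` is hypothetical, and the input is assumed to be the equation text already normalized the way a TL-B parser expects (the normalization itself is not reproduced here).

```python
import zlib


def default_constructor_tag(normalized_equation: str) -> int:
    """Return CRC32 of the text with the highest of the 32 bits forced to 1.

    Assumption: `normalized_equation` is already in the canonical form that
    a TL-B parser hashes; this sketch does not perform that normalization.
    """
    crc = zlib.crc32(normalized_equation.encode("utf-8")) & 0xFFFFFFFF
    return crc | 0x80000000


# Hypothetical equation text, used only to show the call.
tag = default_constructor_tag("some = Type")
print(f"#{tag:08x}")
```

Because of the `| 0x80000000` step, a computed tag always has its most significant bit set.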
This tag will be used to determine the current type of a bitstring during deserialization. E.g. if we have the 1-bit
bitstring 0 and tell TL-B to parse it as type Bool, it will be parsed as Bool.bool_false.
Let's say we have more complex examples:
tag_a$10 val:(## 32) = A;
tag_b$00 val:(## 64) = A;
If we parse 1000000000000000000000000000000001 (1, then 32 zeroes, then 1) as TL-B type A, we first need to read the
first two bits to determine the tag. In this example the first two bits are 10, and they represent tag_a. So now we
know that the next 32 bits are the val variable, 1 in our example. Some "parsed" pseudocode variables may look like:
A.tag = 'tag_a'
A.tag_bits = '10'
A.val = 1
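The same walk-through can be written as a small Python sketch. The table `CONSTRUCTORS_A` and the function `parse_a` are hypothetical names made up for this illustration; the bitstring is modelled as a string of '0' and '1' characters rather than a real Cell.

```python
# Constructors of type A from the example above:
# tag bits -> (constructor name, width of the `val` field in bits)
CONSTRUCTORS_A = {
    "10": ("tag_a", 32),
    "00": ("tag_b", 64),
}


def parse_a(bits: str) -> dict:
    """Pick the constructor whose tag is a prefix of `bits`, then read `val`."""
    for tag_bits, (name, width) in CONSTRUCTORS_A.items():
        if bits.startswith(tag_bits):
            body = bits[len(tag_bits):len(tag_bits) + width]
            if len(body) < width:
                raise ValueError("not enough bits for val")
            return {"tag": name, "tag_bits": tag_bits, "val": int(body, 2)}
    raise ValueError("no constructor tag matches")


parsed = parse_a("10" + "0" * 31 + "1")
print(parsed)  # {'tag': 'tag_a', 'tag_bits': '10', 'val': 1}
```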
All constructor names must be distinct, and the constructor tags of the same type must constitute a prefix code (otherwise the deserialization would not be unique); i.e. no tag can be a prefix of any other tag in the same type.
Maximum number of constructors per one type: 64
Maximum bits for tag: 63
example_a$10 = A;
example_b$01 = A;
example_c$11 = A;
example_d$00 = A;
Codegen pseudocode might look like:
class A:
    tags = [2, 1, 3, 0]
    tags_names = ['example_a', 'example_b', 'example_c', 'example_d']
example_a#0 = A;
example_b#1 = A;
example_c#f = A;
Codegen pseudocode might look like:
class A:
    tags = [0, 1, 15]
    tags_names = ['example_a', 'example_b', 'example_c']
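The prefix-code requirement for the tags of one type can be checked mechanically. The helper below is a hypothetical sketch, not part of any TL-B tool; tags are given as strings of '0' and '1' characters.

```python
def is_prefix_code(tags: list[str]) -> bool:
    """Return True if no tag is a prefix of (or equal to) another tag."""
    ordered = sorted(tags)
    # After lexicographic sorting, a tag that is a prefix of some other tag
    # is always immediately followed by a tag that starts with it.
    return all(not b.startswith(a) for a, b in zip(ordered, ordered[1:]))


print(is_prefix_code(["10", "01", "11", "00"]))  # True
print(is_prefix_code(["1", "10"]))               # False
```

The first call uses the tags of example_a to example_d; the second shows a rejected pair, since a parser that reads 1 could not tell whether the tag ends there or continues as 10.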
If you use a hex tag, keep in mind that it is serialized as 4 bits for each hex symbol. The maximum value is a 63-bit
unsigned integer. This means:
a#32 a:(## 32) = AMultiTagInt;
b#1111 a:(## 32) = AMultiTagInt;
c#5FE a:(## 32) = AMultiTagInt;
d#3F5476CA a:(## 32) = AMultiTagInt;
constructor | serialization |
---|---|
a#32 | 8-bit uint serialize from hex value |
b#1111 | 16-bit uint serialize from hex value |
c#5FE | 12-bit uint serialize from hex value |
d#3F5476CA | 32-bit uint serialize from hex value |
Hex values are allowed in both upper and lower case.
More about hex tags
In addition to the classic hex tag definition, a hexadecimal number can be followed by the underscore character. This means that the tag is equal to the specified hexadecimal number without the least significant bit. For example there is a scheme:
vm_stk_int#0201_ value:int257 = VmStackValue;
And the tag is not actually equal to 0x0201. To compute it we need to remove the LSb from the binary representation of 0x0201:
0000001000000001 -> 000000100000000
So the tag is equal to the 15-bit binary number 0b000000100000000.
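As an illustration, the rules for explicit tags described in this section (4 bits per hex symbol, case-insensitive hex, and the trailing underscore) can be sketched as a hypothetical helper `tag_to_bits`. It follows the description given here and does not handle implicit tags, which are computed with CRC32.

```python
def tag_to_bits(tag: str) -> str:
    """Convert an explicit tag such as '#5FE', '#0201_' or '$0101' to bits."""
    if tag.startswith("$"):
        return tag[1:].rstrip("_")  # binary tag; '$_' is the empty tag
    if not tag.startswith("#"):
        raise ValueError("a tag starts with '#' or '$'")
    body = tag[1:]
    drop_last_bit = body.endswith("_")
    digits = body.rstrip("_")
    # Each hex symbol contributes exactly 4 bits; case does not matter.
    bits = "".join(f"{int(d, 16):04b}" for d in digits)
    if drop_last_bit and bits:
        bits = bits[:-1]  # underscore rule as described in this section
    return bits


print(tag_to_bits("#32"))     # 00110010
print(tag_to_bits("#5FE"))    # 010111111110
print(tag_to_bits("#0201_"))  # 000000100000000
```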
Field definitions
The constructor and its optional tag are followed by field definitions. Each field definition is of the
form ident:type-expr, where ident is an identifier with the name of the field (replaced by an underscore for
anonymous fields), and type-expr is the field’s type. The type provided here is a type expression, which may include
simple types, parametrized types with suitable parameters or complex expressions.
Note: the fields stored in a single cell must fit within the Cell limits (1023 bits and 4 refs).
Simple types
- _ a:# = Type; - Type.a here is a 32-bit integer
- _ a:(## 64) = Type; - Type.a here is a 64-bit integer
- _ a:Owner = NFT; - NFT.a here is of the Owner type
- _ a:^Owner = NFT; - NFT.a here is a cell ref to the Owner type, which means that Owner is stored in the next cell reference.
Anonymous fields
- _ _:# = A; - the first field is an anonymous 32-bit integer
Extend cell with references
- _ a:(## 32) ^[ b:(## 32) c:(## 32) d:(## 32) ] = A; - If for some reason we want to move some fields to another cell, we can use the ^[ ... ] syntax. In this example A.a / A.b / A.c / A.d are 32-bit unsigned integers, but A.a is stored in the first cell, and A.b / A.c / A.d are stored in the next cell (1 ref)
- _ ^[ a:(## 32) ^[ b:(## 32) ^[ c:(## 32) ] ] ] = A; - Chains of references are also allowed. In this example each of the variables (a, b, c) is stored in a separate cell
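To make the first layout concrete, here is a minimal Python model of it. The `Cell` class below is a hypothetical stand-in written for this sketch (it is not the API of any TON library); it only tracks a bit string and a list of references and enforces the 1023-bit and 4-ref limits.

```python
from dataclasses import dataclass, field

MAX_BITS, MAX_REFS = 1023, 4  # Cell limits


@dataclass
class Cell:
    bits: str = ""
    refs: list["Cell"] = field(default_factory=list)

    def store_uint(self, value: int, width: int) -> "Cell":
        if not 0 <= value < (1 << width):
            raise ValueError("value does not fit into the given width")
        if len(self.bits) + width > MAX_BITS:
            raise ValueError("cell bit limit exceeded")
        self.bits += format(value, f"0{width}b")
        return self

    def store_ref(self, cell: "Cell") -> "Cell":
        if len(self.refs) >= MAX_REFS:
            raise ValueError("cell ref limit exceeded")
        self.refs.append(cell)
        return self


def build_a(a: int, b: int, c: int, d: int) -> Cell:
    """_ a:(## 32) ^[ b:(## 32) c:(## 32) d:(## 32) ] = A;"""
    inner = Cell().store_uint(b, 32).store_uint(c, 32).store_uint(d, 32)
    return Cell().store_uint(a, 32).store_ref(inner)


root = build_a(1, 2, 3, 4)
print(len(root.bits), len(root.refs), len(root.refs[0].bits))  # 32 1 96
```

The root cell holds only a (32 bits), while b, c and d share the single referenced cell (96 bits).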
Parametrized types
Suppose we have the IntWithObj type:
_ {X:Type} a:# b:X = IntWithObj X;
Now we can use it in other types:
_ a:(IntWithObj uint32) = IntWithUint32;
Complex expressions
- Conditional fields (only for Nat) (E?T means that if expression E is true, then the field has type T)
  _ a:(## 1) b:a?(## 32) = Example;
  In the Example type, variable b is serialized only if a is 1
- Multiply expression for tuple creation (x * T means create a tuple of length x of type T):
  a$_ a:(## 32) = A;
  b$_ b:(2 * A) = B;
  _ (## 1) = Bit;
  _ 2bits:(2 * Bit) = 2Bits;
- Bit selection (only for Nat) (E . B means take bit B of Nat E)
  _ a:(## 2) b:(a . 1)?(## 32) = Example;
  In the Example type, variable b is serialized only if the second bit of a is 1
- Other Nat operators are also allowed (see Allowed constraints)
Note: you can combine several complex expressions:
_ a:(## 1) b:(## 1) c:(## 2) d:(a?(b?((c . 1)?(## 64)))) = A;
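A conditional field combined with bit selection can be sketched as follows for the scheme `_ a:(## 2) b:(a . 1)?(## 32) = Example;`. The `BitReader` class and `parse_example` function are hypothetical names for this illustration. The sketch assumes that bit B is counted from the least significant bit starting at 0, so `a . 1` is computed as `(a >> 1) & 1`.

```python
class BitReader:
    """Reads unsigned integers from a string of '0'/'1' characters."""

    def __init__(self, bits: str):
        self.bits = bits
        self.pos = 0

    def read_uint(self, width: int) -> int:
        chunk = self.bits[self.pos:self.pos + width]
        if len(chunk) < width:
            raise ValueError("not enough bits")
        self.pos += width
        return int(chunk, 2)


def parse_example(bits: str) -> dict:
    """_ a:(## 2) b:(a . 1)?(## 32) = Example;"""
    reader = BitReader(bits)
    a = reader.read_uint(2)
    # Assumption: `a . 1` selects bit 1 counting from the least significant bit.
    b = reader.read_uint(32) if (a >> 1) & 1 else None
    return {"a": a, "b": b}


print(parse_example("10" + format(7, "032b")))  # {'a': 2, 'b': 7}
print(parse_example("01"))                      # {'a': 1, 'b': None}
```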
Built-in types
- # - Nat, 32-bit unsigned integer
- ## x - Nat with x bits
- #< x - Nat less than x, unsigned integer stored as lenBits(x - 1) bits, up to 31 bits
- #<= x - Nat less than or equal to x, unsigned integer stored as lenBits(x) bits, up to 32 bits
- Any / Cell - rest of the cell bits & refs
- Int - 257 bits
- UInt - 256 bits
- Bits - 1023 bits
- uint1 - uint256 - 1 - 256 bits
- int1 - int257 - 1 - 257 bits
- bits1 - bits1023 - 1 - 1023 bits
- uint X / int X / bits X - same as uintX, but you can use a parametrized X in these types
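The storage widths of `#< x` and `#<= x` can be illustrated with a short sketch. It assumes that lenBits(n) is the number of bits needed to write n in binary, which Python exposes as `int.bit_length()`; the function names are hypothetical.

```python
def len_bits(n: int) -> int:
    """Number of bits needed to represent n (assumed meaning of lenBits)."""
    return n.bit_length()


def width_less(x: int) -> int:
    """Storage width of `#< x`: lenBits(x - 1) bits."""
    return len_bits(x - 1)


def width_less_or_equal(x: int) -> int:
    """Storage width of `#<= x`: lenBits(x) bits."""
    return len_bits(x)


print(width_less(4))           # 2, values 0..3 fit into 2 bits
print(width_less_or_equal(4))  # 3, value 4 itself needs 3 bits
```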
Constraints
_ flags:(## 10) { flags <= 100 } = Flag;
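A parser generated from this scheme would be expected to reject values that violate the constraint. A minimal sketch, with the hypothetical function `parse_flag` and a string of '0'/'1' characters as input:

```python
def parse_flag(bits: str) -> int:
    """_ flags:(## 10) { flags <= 100 } = Flag;"""
    if len(bits) < 10:
        raise ValueError("not enough bits for flags")
    flags = int(bits[:10], 2)
    if not flags <= 100:
        raise ValueError("constraint flags <= 100 is violated")
    return flags


print(parse_flag(format(100, "010b")))  # 100
```

A 10-bit field could hold values up to 1023, so the constraint narrows the set of valid values without changing the serialized width.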