Auto-packing to/from cells/slices/builders
A short demo of how it looks:
struct Point {
x: int8;
y: int8;
}
var value: Point = { x: 10, y: 20 };
// makes a cell containing "0A14"
var c = value.toCell();
// back to { x: 10, y: 20 }
var p = Point.fromCell(c);
Key features of auto-serialization
- Supports all types: unions, tensors, nullables, generics, atomics, ...
- Allows you to specify serialization prefixes (particularly, opcodes)
- Allows you to manage cell references and when to load them
- Lets you control error codes and other behavior
- Unpacks data from a cell or a slice, with or without mutating the slice
- Packs data to a cell or a builder
- Warns if data potentially exceeds 1023 bits
- More efficient than manual serialization
List of supported types and how they are serialized
A small reminder: Tolk has intN types (int8, uint64, etc.). Of course, they can be nested, like nullable int32? or a tensor (uint5, int128).
At the TVM level they are just integers and can hold any value at runtime: overflow is only detected at serialization.
For example, if you assign 256 to uint8, asm command "8 STU" will fail with code 5 (integer out of range).
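The overflow rule can be modeled outside Tolk. Below is a minimal Python sketch (not Tolk code) of what N STI / N STU do to a bit string. The helper names store_int, store_uint, to_hex and IntegerOutOfRange are hypothetical; the only facts taken from the text are the range check failing with code 5 and the "0A14" result for Point.

```python
class IntegerOutOfRange(Exception):
    """Models TVM exception code 5 (integer out of range)."""
    code = 5


def store_uint(bits: str, value: int, n: int) -> str:
    """Model of `N STU`: append `value` as an n-bit unsigned integer."""
    if not 0 <= value < (1 << n):
        raise IntegerOutOfRange(f"{value} does not fit into uint{n}")
    return bits + format(value, f"0{n}b")


def store_int(bits: str, value: int, n: int) -> str:
    """Model of `N STI`: append `value` as an n-bit two's complement integer."""
    if not -(1 << (n - 1)) <= value < (1 << (n - 1)):
        raise IntegerOutOfRange(f"{value} does not fit into int{n}")
    return bits + format(value & ((1 << n) - 1), f"0{n}b")


def to_hex(bits: str) -> str:
    """Render a bit string whose length is a multiple of 4 as uppercase hex."""
    assert bits and len(bits) % 4 == 0
    return format(int(bits, 2), f"0{len(bits) // 4}X")


# struct Point { x: int8; y: int8 } with { x: 10, y: 20 }
point_bits = store_int(store_int("", 10, 8), 20, 8)
print(to_hex(point_bits))  # 0A14
```

In this model, the value 256 is a perfectly valid Python integer until store_uint(..., 256, 8) is called, which mirrors how a uint8 variable only fails at the moment it is serialized.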
Type | TL-B equivalent | Serialization notes |
---|---|---|
int8, uint55, etc. | same as TL-B | N STI / N STU |
coins | TL-B varint16 | STGRAMS |
bytes8, bits123, etc. | just N bits | runtime check + STSLICE (1) |
address | MsgAddress (internal/external/none) | STSLICE (2) |
bool | one bit | 1 STI |
cell | untyped reference, TL-B ^Cell | STREF |
cell? | maybe reference, TL-B (Maybe ^Cell) | STOPTREF |
Cell<T> | typed reference, TL-B ^T | STREF |
Cell<T>? | maybe typed reference, TL-B (Maybe ^T) | STOPTREF |
RemainingBitsAndRefs | rest of slice | STSLICE |
builder | only for writing, not for reading | STBR |
slice | only for writing, not for reading | STSLICE |
T? | TL-B (Maybe T) | 1 STI + IF ... |
T1 \| T2 | TL-B (Either T1 T2) | 1 STI + IF ... + ELSE ... (3) |
T1 \| T2 \| ... | TL-B multiple constructors | IF ... + ELSE IF ... + ELSE ... (4) |
(T1, T2) | TL-B (Pair T1 T2) = one by one | pack T1 + pack T2 |
(T1, T2, ...) | nested pairs = one by one | pack T1 + pack T2 + ... |
SomeStruct | fields one by one | like a tensor |
- (1) By analogy with intN, there are bytesN types. Such a value is just a slice under the hood: the type shows how to serialize this slice. By default, before STSLICE, the compiler inserts runtime checks (get bits/refs count + compare with N + compare with 0). These checks ensure that serialized binary data will be correct, but they cost gas. However, if you guarantee that a slice is valid (for example, it comes from trusted sources), pass the option skipBitsNFieldsValidation to disable runtime checks.
- (2) In TVM, all addresses are also plain slices. Type address indicates that it's a slice containing some valid address (internal/external/none). It's packed with STSLICE (no runtime checks) and loaded with LDMSGADDR.
  Don't confuse address none with null! None is a valid address (two zero bits), whereas address? is maybe address (bit "0" OR bit "1" + address).
- (3) TL-B Either is expressed with a union T1 | T2. For example, int32 | int64 is packed as ("0" + 32-bit int OR "1" + 64-bit int). However, if T1 and T2 are both structures with manual serialization prefixes, those prefixes are used instead of a 0/1 bit.
- (4) To (un)pack a union, say, Msg1 | Msg2 | Msg3, we need serialization prefixes. For structures, you can specify them manually (or the compiler will generate them right here). For primitives, like int32 | int64 | int128 | int256, the compiler generates a prefix tree (00/01/10/11 in this case). See "Auto-generating prefixes for unions" below.
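To make notes (2) and (3) concrete, here is a small Python model (not Tolk code) of the Maybe and Either layouts. All function names are hypothetical and range checks are omitted; only the bit layouts come from the notes above.

```python
ADDRESS_NONE = "00"  # address none: a valid address made of two zero bits


def pack_int(value: int, n: int) -> str:
    """n-bit two's complement bit string (range check omitted for brevity)."""
    return format(value & ((1 << n) - 1), f"0{n}b")


def pack_maybe_int8(value) -> str:
    """int8? : bit '0' for null, bit '1' followed by the 8-bit value otherwise."""
    return "0" if value is None else "1" + pack_int(value, 8)


def pack_maybe_address(address_bits) -> str:
    """address? : bit '0' for null, bit '1' followed by the address bits."""
    return "0" if address_bits is None else "1" + address_bits


def pack_either_int32_int64(value: int, is_int64: bool) -> str:
    """int32 | int64 : '0' + 32-bit int OR '1' + 64-bit int."""
    return "1" + pack_int(value, 64) if is_int64 else "0" + pack_int(value, 32)


print(pack_maybe_address(None))          # 0   (null)
print(pack_maybe_address(ADDRESS_NONE))  # 100 (present, and the address is none)
```

The last two lines show why address none and null are different things: stored in an address? field, they produce different bits.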
Some examples of valid types
struct A {
f1: int8; // just int8
f2: int8?; // maybe int8
f3: address; // internal/external/none
f4: bool; // TRUE (-1) serialized as bit '1'
f5: B; // embed fields of struct B
f6: B?; // maybe B
f7: coins; // used for money amounts
r1: cell; // always-existing untyped ref
r2: Cell<B>; // typed ref
r3: Cell<int32>?; // optional ref that stores int32
u1: int32 | int64; // Either
u2: B | C; // also Either
u3: B | C | D; // manual or autogenerated prefixes
u4: bits4 | bits8?; // autogenerated prefix tree
// even this works
e: Point | Cell<Point>;
// rest of slice
rest: RemainingBitsAndRefs;
}
Serialization prefixes and opcodes
When declaring a struct, there is a special syntax to provide pack prefixes.
Typically, you'll use 32-bit prefixes for message opcodes, or arbitrary prefixes in case you'd like to express TL-B multiple constructors.
struct (0x7362d09c) TransferNotification {
queryId: uint64;
...
}
Prefixes can be of any width; they are not restricted to 32 bits.
- 0x000F: 16-bit prefix
- 0x0F: 8-bit prefix
- 0b010: 3-bit prefix
- 0b00001111: 8-bit prefix
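The width rule behind this list can be sketched in Python (a model, not compiler code): each hex digit contributes 4 bits and each binary digit contributes 1 bit, so leading zeros count. The function name prefix_bits is hypothetical, and the sketch assumes lowercase 0x / 0b markers.

```python
def prefix_bits(literal: str) -> tuple[int, int]:
    """Return (value, width in bits) for a prefix literal like '0x0F' or '0b010'."""
    body = literal[2:]
    if literal.startswith("0x"):
        return int(body, 16), 4 * len(body)  # 4 bits per hex digit
    if literal.startswith("0b"):
        return int(body, 2), len(body)       # 1 bit per binary digit
    raise ValueError(f"unsupported prefix literal: {literal}")


for lit in ("0x000F", "0x0F", "0b010", "0b00001111", "0x7362d09c"):
    value, width = prefix_bits(lit)
    print(f"{lit}: value {value}, {width}-bit prefix")
```

Note that 0x0F and 0b00001111 describe the same 8-bit prefix, while 0x000F has the same numeric value but a different width, so it is a different prefix.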
Declaring messages with 32-bit opcodes does not differ from declaring any other structs. Say, you have the following TL-B scheme:
asset_simple$001 workchain:int8 ptr:bits32 = Asset;
asset_booking$1000 order_id:uint64 = Asset;
...
You can express the same with structures and union types:
struct (0b001) AssetSimple {
workchain: int8;
ptr: bits32;
}
struct (0b1000) AssetBooking {
orderId: uint64;
}
type Asset = AssetSimple | AssetBooking | ...;
struct ProvideAssetMessage {
...
asset: Asset;
}
When deserializing, Asset will follow manually provided prefixes:
// msg.asset is parsed as '001' + int8 + bits32 OR ...
val msg = ProvideAssetMessage.fromSlice(s);
// now, msg.asset is just a union
// you can match it
match (msg.asset) {
AssetSimple => { // smart cast
msg.asset.workchain
msg.asset.ptr
}
...
}
// or test with `is` operator
if (msg.asset is AssetBooking) {
...
}
// or do any other things with a union:
// prefixes play their role only in deserialization process
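How the variant is chosen on deserialization can be modeled in a few lines of Python (not Tolk code). The sketch covers only the two constructors shown above, keeps field values as raw bit strings, and does no underflow check. The names ASSET, UnpackError and unpack_asset are hypothetical; the code 63 is the default of throwIfOpcodeDoesNotMatch from the UnpackOptions table.

```python
# Constructors of the Asset union: name -> (prefix bits, [(field, width in bits)])
ASSET = {
    "AssetSimple": ("001", [("workchain", 8), ("ptr", 32)]),
    "AssetBooking": ("1000", [("orderId", 64)]),
}
OPCODE_MISMATCH = 63  # default value of throwIfOpcodeDoesNotMatch


class UnpackError(Exception):
    def __init__(self, code: int):
        super().__init__(f"exception code {code}")
        self.code = code


def unpack_asset(bits: str):
    """Pick the variant whose prefix matches, then read its fields in order."""
    for name, (prefix, fields) in ASSET.items():
        if bits.startswith(prefix):
            pos, out = len(prefix), {}
            for field, width in fields:
                out[field] = bits[pos:pos + width]  # raw bits, left undecoded
                pos += width
            return name, out
    raise UnpackError(OPCODE_MISMATCH)  # none of the prefixes matched


name, fields = unpack_asset("001" + "00000001" + "0" * 32)
print(name, fields["workchain"])  # AssetSimple 00000001
```

This works without ambiguity because neither '001' nor '1000' is a prefix of the other.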
When serializing, everything also works as expected:
val out: ProvideAssetMessage = {
...,
asset: AssetSimple { // will be serialized as
workchain: ..., // '001' + int8 + bits32
ptr: ...
}
}
Note that if a struct has a manual pack prefix, it does not matter whether this struct is inside any union or not.
struct (0x1234) MyData {
...
}
MyData.fromSlice(s) // expected to be "1234..." (hex)
data.toCell() // "1234..."
That's why, when you declare outgoing messages with 32-bit opcodes and just serialize them, those opcodes are included in binary data.
What can NOT be serialized
- int can't be serialized, it does not define binary width; use int32, uint64, etc.
- slice, for the same reason; use address or bitsN
- tuples, not implemented
- A | B (and A|B|C|... in general) if A has manual serialization prefix, B not (because it seems like a bug in your code)
- int32 | A (and primitives|...|structs in general) if A has manual serialization prefix (because it's not definite what prefixes to use for primitives)
Example of invalid:
struct (0xFF) A {}
struct B {} // forgot prefix
fun invalidDemo(obj: A | B) {
// (it's better to fire an error than to generate '0'+'FF'+dataA OR '1'+dataB)
obj.toCell(); // error: A has prefix, B not
}
Error messages if serialization unavailable
If you use unsupported types by mistake, the Tolk compiler will fire a meaningful error. Example:
struct ExtraData {
owner: address;
lastTime: int;
}
struct Storage {
...
more: Cell<ExtraData>;
}
Storage.fromSlice("");
fires an error:
auto-serialization via fromSlice() is not available for type `Storage`
because field `Storage.more` of type `Cell<ExtraData>` can't be serialized
because type `ExtraData` can't be serialized
because field `ExtraData.lastTime` of type `int` can't be serialized
because type `int` is not serializable, it doesn't define binary width
hint: replace `int` with `int32` / `uint64` / `coins` / etc.
Controlling cell references. Typed cells
Tolk gives you full control over how your data is placed in cells and how cells reference each other. When you declare fields in a struct, there is no compiler magic of reordering fields, making any implicit references, etc. It follows that whenever you need to place data in a ref, you do it manually. You also manually control when the contents of that ref are loaded.
There are two types of references: typed and untyped.
struct NftCollectionStorage {
ownerAddress: address;
nextItemIndex: uint64;
content: cell; // untyped
nftItemCode: cell; // untyped
royaltyParams: Cell<RoyaltyParams>; // typed
}
struct RoyaltyParams {
numerator: uint16;
denominator: uint16;
royaltyAddress: address;
}
When you call NftCollectionStorage.fromSlice (or fromCell), the process is as follows:
- read address (slice.loadAddress)
- read uint64 (slice.loadUint 64)
- read three refs (slice.loadRef); do not unpack them: we just have pointers to cells
Note that royaltyParams is Cell<T>, not T itself. You can NOT access numerator, etc. To load those fields, you should manually unpack that ref:
// error: field does not exist in type `Cell<RoyaltyParams>`
st.royaltyParams.numerator
// that's the way
val rp = st.royaltyParams.load(); // Cell<T> -> T
rp.numerator
// alternatively
val rp = RoyaltyParams.fromCell(st.royaltyParams);
And vice versa: when composing such a struct, you should assign a typed cell to a field:
val st: NftCollectionStorage = {
...
// error
royaltyParams: RoyaltyParams{ ... }
// correct
royaltyParams: RoyaltyParams{ ... }.toCell()
}
You've probably guessed that T.toCell() actually makes Cell<T>. That's true:
val c = p.toCell(); // Point to Cell<Point>
val p2 = c.load(); // Cell<Point> to Point
With typed cells, you can express snake data:
struct Snake {
data: bits1023;
next: Cell<Snake>?;
}
So, typed cells are a powerful mechanism to express the contents of referenced cells.
Note that Cell<address> or even Cell<int32 | int64> is also okay, you are not restricted to structures.
When it comes to untyped cells (just cell), they also denote references, but they don't describe their inner contents and don't have the .load() method.
An untyped cell is just some cell, like code/data of a contract or an untyped NFT content.
Remaining data after reading
Suppose you have struct Point (x int8, y int8), and read from a slice with contents "0102FF". Byte "01" for x, byte "02" for y, and the remaining "FF" — is it correct?
By default, this is incorrect: functions fromCell and fromSlice ensure that the slice ends after reading.
In this case, exception 9 ("cell underflow") is thrown.
But you can override this behavior with an option:
Point.fromSlice(s, {
assertEndAfterReading: false
})
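The effect of this option can be modeled in Python (not Tolk code). The sketch reads struct Point (x int8, y int8) from a hex string; the names TvmError and point_from_slice are hypothetical, and the exception code 9 is the one named above.

```python
class TvmError(Exception):
    def __init__(self, code: int):
        super().__init__(f"exception code {code}")
        self.code = code


def point_from_slice(hex_data: str, assert_end_after_reading: bool = True) -> dict:
    """Model of Point.fromSlice for struct Point { x: int8; y: int8 }."""
    data = bytes.fromhex(hex_data)
    if len(data) < 2:
        raise TvmError(9)  # not enough data for both fields
    x = int.from_bytes(data[0:1], "big", signed=True)
    y = int.from_bytes(data[1:2], "big", signed=True)
    if assert_end_after_reading and len(data) > 2:
        raise TvmError(9)  # leftover data after the last field
    return {"x": x, "y": y}


print(point_from_slice("0102"))                                    # {'x': 1, 'y': 2}
print(point_from_slice("0102FF", assert_end_after_reading=False))  # {'x': 1, 'y': 2}
```

With the default setting, the input "0102FF" raises TvmError with code 9 in this model, because the trailing "FF" is not described by the struct.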
UnpackOptions and PackOptions
They allow you to control the behavior of fromCell, toCell, and similar functions:
MyMsg.fromSlice(s, {
throwIfOpcodeDoesNotMatch: 0xFFFF
})
Serialization functions actually have an optional second parameter:
fun T.fromSlice(rawSlice: slice, options: UnpackOptions = {}): T;
When you don't pass it, default options are used. But you can override some of the options.
For deserialization (fromCell and similar), there are now two available options:
Field of UnpackOptions | Description |
---|---|
assertEndAfterReading | after finished reading all fields from a cell/slice, call slice.assertEnd to ensure no remaining data left; it's the default behavior, it ensures that you've fully described data you're reading with a struct; for struct Point, input "0102" is ok, "0102FF" will throw excno 9; default: true |
throwIfOpcodeDoesNotMatch | this excNo is thrown if opcode doesn't match, e.g. for struct (0x01) A given input "88..."; similarly, for a union type, this is thrown when none of the opcodes match; default: 63 |
For serialization (toCell and similar), there is now one option:
Field of PackOptions | Description |
---|---|
skipBitsNFieldsValidation | when a struct has a field of type bits128 and similar (it's a slice under the hood), by default, compiler inserts runtime checks (get bits/refs count + compare with 128 + compare with 0); these checks ensure that serialized binary data will be correct, but they cost gas; however, if you guarantee that a slice is valid (for example, it comes from trusted sources), set this option to true to disable runtime checks; note: int32 and other are always validated for overflow without any extra gas, so this flag controls only rarely used bytesN / bitsN types; default: false |
Full list of serialization functions
Each of them can be controlled by PackOptions described above.
T.toCell(): convert anything to a cell. Example:
contract.setData(storage.toCell());
Internally, a builder is created, all fields are serialized one by one, and a builder is flushed (beginCell() + serialize fields + endCell()).
builder.storeAny<T>(v): similar to builder.storeUint() and others, but allows storing structures. Example:
var b = beginCell()
.storeUint(0, 32) // (value, bit width); 0 is a placeholder value
.storeAny(msgBody) // T=MyMsg here
.endCell();
Full list of deserialization functions
Each of them can be controlled by UnpackOptions described above.
T.fromCell(c): parse anything from a cell. Example:
var st = MyStorage.fromCell(contract.getData());
Internally, a cell is unpacked to a slice, and that slice is parsed (c.beginParse() + read from slice).
T.fromSlice(s): parse anything from a slice. Example:
var msg = CounterIncrement.fromSlice(cs);
All fields are read from a slice immediately. If a slice is corrupted, an exception is thrown (most likely, exception code 9 "cell underflow"). Note that the passed slice is NOT mutated: its internal pointer is NOT shifted. If you need to mutate it, like cs.loadInt(), consider calling cs.loadAny<CounterIncrement>().
slice.loadAny<T>: parse anything from a slice, shifting its internal pointer. Similar to slice.loadUint() and others, but allows loading structures. Example:
var st: MyStorage = cs.loadAny(); // or cs.loadAny<MyStorage>()
cs.loadAny<int32>(); // = cs.loadInt(32)
Similar to MyStorage.fromSlice(cs), but called as a method and mutates the slice. Note: options.assertEndAfterReading is ignored by this function, because it's actually intended to read data from the middle.
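The difference between fromSlice (pointer not shifted) and loadAny (pointer shifted) can be illustrated with a toy Python model of a slice (not Tolk code). The class Slice and both function names are hypothetical; only the mutate / don't-mutate behavior is taken from the descriptions above.

```python
class Slice:
    """A bit string plus a read pointer (a simplified model of a TVM slice)."""

    def __init__(self, bits: str, pos: int = 0):
        self.bits = bits
        self.pos = pos

    def remaining(self) -> int:
        return len(self.bits) - self.pos

    def load_int32(self) -> int:
        """Models cs.loadAny<int32>(): read 32 bits and shift the pointer."""
        if self.remaining() < 32:
            raise RuntimeError("cell underflow (exception code 9)")
        raw = int(self.bits[self.pos:self.pos + 32], 2)
        self.pos += 32
        return raw - (1 << 32) if raw >= (1 << 31) else raw


def int32_from_slice(s: Slice, assert_end_after_reading: bool = True) -> int:
    """Models fromSlice: parse from a copy so the caller's pointer stays put."""
    copy = Slice(s.bits, s.pos)
    value = copy.load_int32()
    if assert_end_after_reading and copy.remaining() != 0:
        raise RuntimeError("cell underflow (exception code 9)")
    return value


cs = Slice(format(7, "032b") + format(9, "032b"))
print(int32_from_slice(cs, assert_end_after_reading=False), cs.pos)  # 7 0
print(cs.load_int32(), cs.pos)                                       # 7 32
print(cs.load_int32(), cs.pos)                                       # 9 64
```

Reading from the middle is exactly why the loading variant has no end-of-slice check in this model: after the first load_int32, 32 bits are still left on purpose.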
slice.skipAny<T>: skip anything in a slice, shifting its internal pointer. Similar to slice.skipBits() and others, but allows skipping structures. Example:
struct TwoInts { a: int32; b: int32; }
cs.skipAny<TwoInts>(); // skips 64 bits
cs.skipAny<int32>(); // = cs.skipBits(32)
Special type RemainingBitsAndRefs
It's a built-in type to get "all the rest" slice tail on reading. Example:
struct JettonMessage {
// ... some fields
forwardPayload: RemainingBitsAndRefs;
}
When you deserialize JettonMessage, forwardPayload contains everything left after reading fields above. Essentially, it's an alias to a slice which is handled specially while unpacking:
type RemainingBitsAndRefs = slice;
Auto-generating prefixes for unions
We've mentioned multiple times that T1 | T2 is encoded as TL-B Either: bit '0' + T1 OR bit '1' + T2. But what about wider unions? Say,
struct WithUnion {
f: int8 | int16 | int32;
}
In this case, the compiler auto-generates a prefix tree. This field will be packed as: '00' + int8 OR '01' + int16 OR '10' + int32. On deserialization, the same format is expected (prefix '11' will throw an exception).
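One way to picture the generated prefixes is the Python sketch below (not compiler code). It assumes fixed-width prefixes of ceil(log2(n)) bits assigned in declaration order, which matches the examples in this document (0/1 for two variants, 00/01/10 for three, 00/01/10/11 for four); the compiler's actual algorithm may differ in other cases, for example with nullable variants. The function name auto_prefixes is hypothetical.

```python
def auto_prefixes(variants: list[str]) -> dict[str, str]:
    """Assign a fixed-width binary prefix to each union variant, in order."""
    if len(variants) < 2:
        raise ValueError("a union needs at least two variants")
    width = (len(variants) - 1).bit_length()  # ceil(log2(n)) for n >= 2
    return {v: format(i, f"0{width}b") for i, v in enumerate(variants)}


print(auto_prefixes(["int8", "int16", "int32"]))
# {'int8': '00', 'int16': '01', 'int32': '10'}
```

With three variants, the pattern '11' is left unassigned, which is why meeting it on deserialization is an error.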
Same for structs without a manually specified prefix:
struct A { ... } // 0x... prefixes not specified
struct B { ... }
struct C { ... }
struct WithUnion {
// simple either ('0' + A OR '1' + B)
e: A | B;
// auto-generated prefix tree
f: A | B | C;
// even this works, why not
g: A | int32 | C | bits128;
}
When declaring a struct, you can manually specify serialization prefix (32-bit prefixes for messages are called opcodes, but prefixes in general can be of any length):
struct (0x01) WithPrefixLen8 { ... }
struct (0x00FF) WithPrefixLen16 { ... }
struct (0b1100) WithPrefixLen4 { ... }
struct WithUnion {
// manual prefixes will be used, not 0/1
e: WithPrefixLen8 | WithPrefixLen16;
// also, no auto-generation
f: WithPrefixLen8 | WithPrefixLen16 | WithPrefixLen4;
}
If you specify prefixes manually, they will be used in (de)serialization. Moreover: since a prefix exists for a struct, when deserializing a struct itself (not inside a union), a prefix is expected to be contained in binary data:
// s should be "00FF..."
WithPrefixLen16.fromSlice(s)
// c will be "00FF..."
WithPrefixLen16{...}.toCell()
So, the rules are quite simple:
- if you specify prefixes manually, they will be used (no matter within a union or not)
- if you don't specify any prefixes, the compiler auto-generates a prefix tree
- if you specify a prefix for A, but forgot the prefix for B, A | B can't be serialized
- either (0/1) is just a prefix tree for two cases
How can you specify a serialization prefix for a non-struct? Currently, there is no way to write something like Prefixed<int32, 0b0011>.
But you can just create a struct with a single field:
struct (0b0011) MyPrefixedInt {
value: int32;
}