Tolk vs FunC: in detail

A huge list is below. Will anyone have enough patience to read it up to the end?..

There is a compact version

Traditional comments

FunC	Tolk
`;; comment`	`// comment`
`{- multiline comment -}`	`/* multiline comment */`

2+2 is 4, not an identifier. Identifiers can only be alpha-numeric

In FunC, almost any character can be a part of the identifier. For example, 2+2 (without a space) is an identifier. You can even declare a variable with such a name.

In Tolk, spaces are not mandatory. 2+2 is 4, as expected. 3+~x is 3 + (~ x), and so on.

FunC	Tolk
`return 2+2;  ;; undefined function 2+2`	`return 2+2;  // 4`

More precisely, an identifier must start with [a-zA-Z$_] and may continue with [a-zA-Z0-9$_]. Note that ?, :, and other such symbols are not valid, so found? and op::increase are invalid identifiers.

Note that cell, slice, etc. are valid identifiers: var cell = ... or even var cell: cell = ... is okay (just as number is a valid identifier in TypeScript).

You can surround an identifier with backticks, and then it can contain any symbols (similar to Kotlin and some other languages). The intended use is to allow keywords as identifiers, for example in code generated from a scheme.

FunC	Tolk
`const op::increase = 0x1234;`	`const OP_INCREASE = 0x1234`
`int 2+2 = 5;  ;; even 2%&!2 is valid`	``var `2+2` = 5;  // don't do this :)``

Impure by default, compiler won't drop user function calls

FunC has an impure function specifier. When absent, a function is treated as pure. If its result is unused, its call is deleted by the compiler.

Though this behavior is documented, it is very unexpected to newcomers. For instance, calls to functions that don't return anything (and only throw an exception on mismatch, for example) are silently deleted. The situation is made worse by FunC not validating the function body, which allows impure operations inside pure functions.

In Tolk, all functions are impure by default. You can mark a function as pure with the @pure annotation, and then impure operations are forbidden in its body (exceptions, globals modification, calling non-pure functions, etc.).
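As a rough illustration of the difference, here is a minimal sketch. The function names and the error code 401 are hypothetical, chosen only for this example; the syntax follows the snippets shown elsewhere on this page.

```tolk
// Hypothetical validator: returns nothing and throws on mismatch.
// It is impure by default, so a call with an unused result is kept.
fun checkSender(senderId: int, ownerId: int) {
    assert(senderId == ownerId, 401);
}

// Marked pure: per the rule above, the body must not throw explicitly,
// modify globals, or call non-pure functions.
@pure
fun square(x: int): int {
    return x * x;
}

fun onMessage(senderId: int, ownerId: int): int {
    checkSender(senderId, ownerId);  // not dropped, although nothing is returned
    return square(3);
}
```

In FunC, the equivalent of checkSender would need the impure specifier to survive compilation.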

New functions syntax: fun keyword, @ attributes, types on the right (like in TypeScript, Kotlin, Python, etc.)

FunC	Tolk
`cell parse_data(slice cs) { }`	`fun parse_data(cs: slice): cell { }`
`(cell, int) load_storage() { }`	`fun load_storage(): (cell, int) { }`
`() main() { ... }`	`fun main() { ... }`

Types of variables — also to the right:

FunC	Tolk
`slice cs = ...;`	`var cs: slice = ...;`
`(cell c, int n) = parse_data(cs);`	`var (c: cell, n: int) = parse_data(cs);`
`global int stake_at;`	`global stake_at: int`

Modifiers inline and others — with annotations:

FunC	Tolk
`int f(cell s) inline {`	`@inline fun f(s: cell): int {`
`() load_data() impure inline_ref {`	`@inline_ref fun load_data() {`

forall — this way:

FunC	Tolk
`forall X -> tuple cons(X head, tuple tail)`	`fun cons<X>(head: X, tail: tuple): tuple`

asm implementation — like in FunC, but being properly aligned, it looks nicer:

@pure
fun third<X>(t: tuple): X
    asm "THIRD"

@pure
fun iDictDeleteGet(dict: cell, keyLen: int, index: int): (cell, slice, int)
    asm(index dict keyLen) "DICTIDELGET NULLSWAPIFNOT"

@pure
fun mulDivFloor(x: int, y: int, z: int): int
    builtin

There is also a @deprecated attribute. It does not affect compilation and exists for humans and the IDE.

get instead of method_id

In FunC, method_id (without arguments) declared a get method. In Tolk, you use a straightforward syntax:

FunC	Tolk
`int seqno() method_id { ... }`	`get fun seqno(): int { ... }`

For method_id(xxx) (uncommon in practice, but valid), there is an attribute:

FunC	Tolk
`() after_code_upgrade(cont old_code) impure method_id(1666)`	`@method_id(1666) fun afterCodeUpgrade(oldCode: continuation)`

It's essential to declare types of parameters (though optional for locals)

// not allowed
fun do_smth(c, n)
// types are mandatory
fun do_smth(c: cell, n: int)

While parameter types are mandatory, the return type is not (it's often obvious or verbose). If omitted, it's auto-inferred:

fun x() { ... }  // auto infer from return statements

For local variables, types are also optional:

var i = 10;                      // ok, int
var b = beginCell();             // ok, builder
var (i, b) = (10, beginCell());  // ok, two variables, int and builder

// types can be specified manually, of course:
var b: builder = beginCell();
var (i: int, b: builder) = (10, beginCell());

Defaults for parameters are supported:

fun increment(x: int, by: int = 1) {
    return x + by
}

Variables are not allowed to be redeclared in the same scope

var a = 10;
...
var a = 20;  // error, correct is just `a = 20`
if (1) {
    var a = 30;  // it's okay, it's another scope
}

As a consequence, partial reassignment is not allowed:

var a = 10;
...
var (a, b) = (20, 30);  // error, redeclaration of a

Note that this is not a problem for loadUint() and similar methods. In FunC, they returned a modified object, so a pattern like var (cs, int value) = cs.load_int(32) was quite common. In Tolk, such methods mutate the object: var value = cs.loadInt(32), so redeclaration is unlikely to be needed.

A parameter cannot be redeclared in the function body either:

fun send(msg: cell) {
    var msg = ...;  // error, redeclaration of msg

    // solution 1: introduce a new variable
    var msgWrapped = ...;
    // solution 2: use `redef`, though not recommended
    var msg redef = ...;
}
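To show why mutating methods make redeclaration unnecessary, here is a small sketch. The function parseHeader and the field widths are assumptions made for illustration; loadUint is the method mentioned above.

```tolk
// Hypothetical parser: reads two header fields from a message body.
// Each loadUint call advances msgBody, so no `var (cs, value) = ...` is needed.
fun parseHeader(msgBody: slice): (int, int) {
    val op = msgBody.loadUint(32);       // msgBody moves forward by 32 bits
    val queryId = msgBody.loadUint(64);  // and then by 64 more
    return (op, queryId);
}
```

Since msgBody is a parameter, it is a local copy, so the caller's slice is unaffected.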

String postfixes removed, compile-time functions introduced

Tolk removes the old FunC-style string postfixes ("..."c, etc.) in favor of a more transparent and more flexible approach.

FunC	Tolk
`"..."c`	`stringCrc32("...")`
—	`stringCrc16("...")`
`"..."H`	`stringSha256("...")`
`"..."h`	`stringSha256_32("...")`
`"..."a`	`address("...")`
`"..."s`	`stringHexToSlice("...")`
`"..."u`	`stringToBase256("...")`

These functions:

compile-time only
for constant strings only
can be used in constant initialization

// type will be `address`
const BASIC_ADDR = address("EQDKbjIcfM6ezt8KjKJJLshZJJSqX7XOA4ff-W72r5gqPrHF")

// return type will be `int`
fun minihashDemo() {
    return stringSha256_32("transfer(slice, int)");
}

The naming highlights that these functions originate from string postfixes and operate on string values. Remember that at runtime there are no strings, only slices.

Trailing comma support

Tolk now supports trailing commas in the following contexts:

tensors
tuples
function calls
function parameters

var items = (
    totalSupply,
    verifiedCode,
    validatorsList,
);

Note that (5) is not a tensor. It's just the integer 5 in parentheses. With a trailing comma (5,) it's still (5).

Optional semicolon for the last statement in a block

In Tolk, you can omit the semicolon after the final statement in a block. While semicolons are still required between statements, the trailing semicolon on the last statement is now optional.

fun f(...) {
	doSomething();
	return result   // <-- valid without semicolon
}

// or
if (smth) {
	return 1
} else {
	return 2
}

Function ton("...") for human-readable amounts of Toncoins

FunC	Tolk
`int cost = 50000000;`	`val cost = ton("0.05");`
`const ONE_TON = 1000000000;`	`const ONE_TON = ton("1")`

The function ton() only accepts constant values (e.g., ton(some_var) is invalid). Its type is coins (not int!), although it's still a regular int from the TVM point of view. Arithmetic over coins degrades to int (for example, cost << 1 is valid, and so is cost + ton("0.02")).

Changes in the type system

FunC's type system is based on Hindley-Milner. This is a common approach for functional languages, where types are inferred from usage through unification.

In Tolk v0.7, the type system was rewritten from scratch. To add booleans, fixed-width integers, nullability, structures, and generics, we must have a static type system (like TypeScript or Rust), because Hindley-Milner would clash with structure methods, struggle with proper generics, and become entirely impractical for union types (despite claims that it was designed for union types).

We have the following types:

int, bool, cell, slice, builder, untyped tuple
typed tuple [T1, T2, ...]
tensor (T1, T2, ...)
callables (TArgs) -> TResult
nullable types T?, compile-time null safety
union types T1 | T2 | ..., handled with pattern matching
coins and function ton("0.05")
int32, uint64, and other fixed-width integers (just int at TVM)
bytesN and bitsN, similar to intN (backed by slices at TVM)
address (internal/external/none, still a slice at TVM)
void (more canonical to be named unit, but void is more reliable)
self, to make chainable methods, described below; actually it's not a type, it can only occur instead of return type of a function
never (an always-throwing function returns never, for example; an impossible type is also never)
structures and generics

The type system obeys the following rules:

variable types can be specified manually or are inferred from declarations, and never change after being declared
function parameters must be strictly typed
function return types, if unspecified, are inferred from return statements, similar to TypeScript; in case of recursion (direct or indirect), the return type must be explicitly declared somewhere
generic functions are supported
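The recursion rule can be illustrated with a short sketch. Both function names are hypothetical and exist only for this example.

```tolk
// The return type is written out because the function calls itself.
fun factorial(n: int): int {
    if (n <= 1) {
        return 1;
    }
    return n * factorial(n - 1);
}

// No recursion here, so the return type can be omitted
// and is inferred from the return statement.
fun factorialOfFive() {
    return factorial(5);
}
```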

Clear and readable error messages on type mismatch

In FunC, due to Hindley-Milner, type mismatch errors are very hard to understand:

error: previous function return type (int, int)
cannot be unified with implicit end-of-block return type (int, ()):
cannot unify type () with int

In Tolk, they are human-readable:

1) can not assign `(int, slice)` to variable of type `(int, int)`
2) can not call method for `builder` with object of type `int`
3) can not use `builder` as a boolean condition
4) missing `return`
...

bool type, casting boolVar as int

Under the hood, bool is still -1 and 0 at TVM level, but from the type system's perspective, bool and int are now different.

Comparison operators == / >= /... return bool. Logical operators && || return bool. Constants true and false have the bool type. Lots of stdlib functions now return bool, not int (having -1 and 0 at runtime):

var valid = isSignatureValid(...);    // bool
var end = cs.isEnd();                 // bool

Operator !x supports both int and bool. Condition of if and similar accepts both int (!= 0) and bool. Logical && and || accept both bool and int, preserving compatibility with constructs like a && b where a and b are integers (!= 0).

Arithmetic operators are restricted to integers, only bitwise and logical allowed for bools:

valid && end;    // ok
valid & end;     // ok, bitwise & | ^ also work if both are bools
if (!end)        // ok

if (~end)        // error, use !end
valid + end;     // error
8 & valid;       // error, int & bool not allowed

Note that the logical operators && and || (missing in FunC) always use the IF/ELSE asm representation. In the future, for optimization, they could be automatically replaced by & | when it's safe (example: a > 0 && a < 10). To manually optimize gas consumption, you can still use & | (allowed for bools), but remember that they are not short-circuit.

bool can be cast to int via as operator:

var i = boolValue as int;  // -1 / 0

There are no runtime transformations. bool is guaranteed to be -1/0 at TVM level, so this is type-only casting. But generally, if you need such a cast, probably you're doing something wrong (unless you're doing a tricky bitwise optimization).

Generic functions and instantiations like f<int>(...)

Tolk introduces properly made generic functions. Their syntax resembles that of mainstream languages:

fun replaceNulls<T1, T2>(tensor: (T1?, T2?), v1IfNull: T1, v2IfNull: T2): (T1, T2) {
    var (a, b) = tensor;
    return (a == null ? v1IfNull : a, b == null ? v2IfNull : b);
}

A generic parameter T may be something complex.

fun duplicate<T>(value: T): (T, T) {
    var copy: T = value;
    return (value, copy);
}

duplicate(1);         // duplicate<int>
duplicate([1, cs]);   // duplicate<[int, slice]>
duplicate((1, 2));    // duplicate<(int, int)>

Or even functions, it also works:

fun callAnyFn<TObj, TResult>(f: TObj -> TResult, arg: TObj) {
    return f(arg);
}

fun callAnyFn2<TObj, TCallback>(f: TCallback, arg: TObj) {
    return f(arg);
}

Note that while generic T is mostly detected from arguments, there are less obvious corner cases where T does not depend on arguments:

fun tupleLast<T>(t: tuple): T
    asm "LAST"

var last = tupleLast(t);    // error, can not deduce T

To make this valid, T should be provided externally:

var last: int = tupleLast(t);       // ok, T=int
var last = tupleLast<int>(t);       // ok, T=int
var last = tupleLast(t) as int;     // ok, T=int

someF(tupleLast(t));       // ok, T=(parameter's declared type)
return tupleLast(t);       // ok if function specifies return type

Also note that T for asm functions must occupy exactly one stack slot, since otherwise the asm body is unable to handle it properly. For a user-defined function, T can be of any shape.

#include → import. Strict imports

FunC	Tolk
`#include "another.fc";`	`import "another"`

In Tolk, you can not use a symbol from a.tolk without importing this file. In other words, import what you use.

All stdlib functions are available out of the box. Downloading stdlib and #include "stdlib.fc" is unnecessary. See below about embedded stdlib.

There is still a global scope of naming. If f is declared in two different files, it's an error. We import a whole file with no per-file visibility, and the export keyword is not supported now but probably will be in the future.

#pragma → compiler options

In FunC, experimental features like allow-post-modification were turned on by a pragma in .fc files (leading to problems when some files contain it and some don't). Indeed, it's not a pragma for a file, it's a compilation option.

In Tolk, all pragmas were removed. allow-post-modification and compute-asm-ltr were merged into Tolk sources (as if they were always on in FunC). Instead of pragmas, there is now an ability to pass experimental options.

As of now, there is one experimental option, remove-unused-functions, which excludes unused symbols from the Fift output.

#pragma version xxx was replaced by tolk xxx (no >=, just a strict version). It's good practice to annotate the compiler version you are using. If it doesn't match, Tolk will show a warning.

tolk 0.12

Late symbols resolving. AST representation

In FunC, like in C, you cannot access a function declared below:

int b() { a(); }   ;; error
int a() { ... }    ;; since it's declared below

To avoid an error, a programmer should first create a forward declaration. The reason is that symbol resolution is performed right during parsing.

The Tolk compiler separates these two steps: first it parses, and then it resolves symbols. Hence, the snippet above would not be an error.

It sounds simple, but internally it's a huge job. To make this possible, I've introduced an intermediate AST representation, which was completely missing in FunC. That's an essential point for future modifications and for performing semantic code analysis.

null keyword

Creating null values and checking variables for null looks very pretty now.

FunC	Tolk
`a = null()`	`a = null`
`if (null?(a))`	`if (a == null)`
`if (~ null?(b))`	`if (b != null)`
`if (~ cell_null?(c))`	`if (c != null)`

throw and assert keywords

Tolk dramatically simplifies working with exceptions.

Whereas FunC has throw(), throw_if(), throw_arg_if(), and the same for unless, Tolk has only two primitives: throw and assert.

FunC	Tolk
`throw(excNo)`	`throw excNo`
`throw_arg(arg, excNo)`	`throw (excNo, arg)`
`throw_unless(excNo, condition)`	`assert(condition, excNo)`
`throw_if(excNo, condition)`	`assert(!condition, excNo)`

Note that !condition is possible since logical NOT is available, see below.

There is a long (verbose) syntax of assert(condition, excNo):

assert(condition) throw excNo;
// with a possibility to include arg to throw

Also, Tolk swaps catch arguments: it's catch (excNo, arg), both optional (since arg is most likely empty).

FunC	Tolk
`try { } catch (_, _) { }`	`try { } catch { }`
`try { } catch (_, excNo) { }`	`try { } catch(excNo) { }`
`try { } catch (arg, excNo) { }`	`try { } catch(excNo, arg) { }`
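Putting these pieces together, a minimal sketch might look like this. The constant ERR_DIVISION_BY_ZERO and both function names are hypothetical; only assert, try, and catch come from the tables above.

```tolk
// Hypothetical error code, used only in this example.
const ERR_DIVISION_BY_ZERO = 400

fun divideChecked(a: int, b: int): int {
    assert(b != 0, ERR_DIVISION_BY_ZERO);  // short form of assert
    return a / b;
}

fun divideOrZero(a: int, b: int): int {
    var result = 0;
    try {
        result = divideChecked(a, b);
    } catch (excNo) {
        // excNo holds ERR_DIVISION_BY_ZERO when b == 0; arg is omitted
        result = 0;
    }
    return result;
}
```

The catch clause names only excNo here, since the second argument is optional.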

do ... until → do ... while

FunC	Tolk
`do { ... } until (~ condition);`	`do { ... } while (condition);`
`do { ... } until (condition);`	`do { ... } while (!condition);`

Note that !condition is possible since logical NOT is available, see below.
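For a complete loop in the new form, consider this sketch. The function countDigits is hypothetical, and the input is assumed to be non-negative.

```tolk
// Counts decimal digits of a non-negative integer (assumed input range).
fun countDigits(n: int): int {
    var count = 0;
    do {
        n /= 10;
        count += 1;
    } while (n != 0);   // in FunC this would be: until (n == 0)
    return count;
}
```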

Operator precedence became identical to C++ / JavaScript

In FunC, code such as if (slices_equal() & status == 1) is parsed as if ((slices_equal() & status) == 1). This has been a source of various errors in real-world contracts.

In Tolk, & has a lower priority, which is identical to C++ and JavaScript.

Moreover, Tolk fires errors on potentially wrong operators' usage to eliminate such errors:

if (flags & 0xFF != 0)

will lead to a compilation error (similar to gcc/clang):

& has lower precedence than ==, probably this code won't work as you expected.  Use parenthesis: either (... & ...) to evaluate it first, or (... == ...) to suppress this error.

Hence, you should rewrite the code:

// either to evaluate it first (our case)
if ((flags & 0xFF) != 0)
// or to emphasize the behavior (not our case here)
if (flags & (0xFF != 0))

I've also added a diagnostic for a common mistake in bitshift operators: a << 8 + 1 is equivalent to a << 9, probably unexpected.

int result = a << 8 + low_mask;

error: << has lower precedence than +, probably this code won't work as you expected.  Use parenthesis: either (... << ...) to evaluate it first, or (... + ...) to suppress this error.
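Both diagnostics are resolved by stating the intended grouping explicitly. The following sketch uses hypothetical function names and assumes the first grouping is the intended one in each case.

```tolk
// Shift first, then add: the parentheses make the intent explicit.
fun packWord(high: int, lowMask: int): int {
    return (high << 8) + lowMask;
}

// Mask first, then compare: the result is a bool.
fun hasLowBits(flags: int): bool {
    return (flags & 0xFF) != 0;
}
```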

Operators ~% ^% /% ~/= ^/= ~%= ^%= ~>>= ^>>= no longer exist.

Immutable variables, declared via val

Like in Kotlin: var for mutable, val for immutable, optionally followed by a type. FunC has no analogue of val.

val flags = msgBody.loadMessageFlags();
flags &= 1;         // error, modifying an immutable variable

val cs: slice = c.beginParse();
cs.loadInt(32);     // error, since loadInt() mutates an object
cs.preloadInt(32);  // ok, it's a read-only method

Parameters of a function are mutable, but since they are copied by value, the caller's arguments aren't changed. This is exactly like in FunC, just to clarify.

fun some(x: int) {
    x += 1;
}

val origX = 0;
some(origX);      // origX remains 0

fun processOpIncrease(msgBody: slice) {
    val flags = msgBody.loadInt(32);
    ...
}

processOpIncrease(msgBody);  // by value, not modified

In Tolk, a function can declare mutate parameters. It's a generalization of FunC ~ tilde functions, read below.

Deprecated command-line options removed

Command-line flags -A, -P, and others were removed. Default behavior

/path/to/tolk {inputFile}

is more than enough. Use -v to print version and exit. Use -h for all available command-line flags.

Only one input file can be passed, others should be import'ed.

stdlib functions renamed to verbose clear names, camelCase style

All names in the standard library were reconsidered. Now, functions are called using longer but clear names.

FunC	Tolk
`cur_lt()`	`blockchain.logicalTime()`
`car(l)`	`listGetHead(l)`
`get_balance().pair_first()`	`contract.getOriginalBalance()`
`raw_reserve(count)`	`reserveToncoinsOnBalance(count)`
`dict~idict_add?(...)`	`dict.iDictSetIfNotExists(...)`
`dict~udict::delete_get_max()`	`dict.uDictDeleteLastAndGet()`
`t~tpush(triple(x, y, z))`	`t.push([x, y, z])`
`s.slice_bits()`	`s.remainingBitsCount()`
`~dump(x)`	`debug.print(x)`
`...`	`...`

A former "stdlib.fc" was split into multiple files: common.tolk, tvm-dicts.tolk, and others.

Continue here: Tolk vs FunC: standard library.

stdlib is now embedded, not downloaded from GitHub

FunC	Tolk
Download stdlib.fc from GitHub, save it into your project, add `#include "stdlib.fc";`, then use standard functions	Use standard functions

In Tolk, stdlib is a part of the distribution. The standard library is inseparable, since keeping the triple of language, compiler, and stdlib together is the only correct way to maintain the release cycle.

It works as follows. The Tolk compiler knows how to locate the standard library. If a user has installed an apt package, stdlib sources were also downloaded and exist on the hard disk, so the compiler locates them by system paths. If a user uses a WASM wrapper, they are provided by tolk-js. And so on.

The standard library is split into multiple files: common.tolk (most common functions), gas-payments.tolk (calculating gas fees), tvm-dicts.tolk, and others. Functions from common.tolk are always available (the compiler implicitly imports it). Other files must be imported explicitly:

import "@stdlib/tvm-dicts"   // ".tolk" optional

...
var dict = createEmptyDict();
dict.iDictSet(...);

Mind the rule import what you use, it's applied to @stdlib/... files also (with the only exception of common.tolk).

JetBrains IDE plugin automatically discovers stdlib folder and inserts necessary imports as you type.

Logical operators && ||, logical not !

In FunC, there are only bitwise operators ~ & | ^. Developers making their first steps, thinking "okay, no logical operators, I'll use bitwise ones in the same manner", often make mistakes, since the operators behave completely differently:

`a & b`	`a && b`
sometimes, identical:
`0 & X = 0`	`0 && X = 0`
`-1 & -1 = -1`	`-1 && -1 = -1`
but generally, not:
`1 & 2 = 0`	`1 && 2 = -1 (true)`

`~ found`	`!found`
sometimes, identical:
`true (-1) → false (0)`	`-1 → 0`
`false (0) → true (-1)`	`0 → -1`
but generally, not:
`1 → -2`	`1 → 0 (false)`

`condition & f()`	`condition && f()`
`f()` is called always	`f()` is called only if `condition`

`condition | f()`	`condition || f()`
`f()` is called always	`f()` is called only if `condition` is false

Tolk supports logical operators. They behave exactly as you are used to (right column). For now, && and || sometimes produce suboptimal Fift code, but in the future the Tolk compiler will become smarter in this case. The overhead is negligible, so just use them like in other languages.

FunC	Tolk
`if (~ found?)`	`if (!found)`
`if (~ found?) { if (cs~load_int(32) == 0) { ... } }`	`if (!found && cs.loadInt(32) == 0) { ... }`
`ifnot (cell_null?(signatures))`	`if (signatures != null)`
`elseifnot (eq_checksum)`	`else if (!eqChecksum)`

Keywords ifnot and elseifnot were removed, since now we have logical not (for optimization, Tolk compiler generates IFNOTJMP, btw). The elseif keyword was replaced by the traditional else if.

Remember that a boolean true, transformed as int, is -1, not 1. It's a TVM representation.

Indexed access tensorVar.0 and tupleVar.0

Use tensorVar.{i} to access i-th component of a tensor. Modifying it will change the tensor.

var t = (5, someSlice, someBuilder);   // 3 stack slots
t.0                     // 5
t.0 = 10;               // t is now (10, ...)
t.0 += 1;               // t is now (11, ...)
increment(mutate t.0);  // t is now (12, ...)
t.0.increment();        // t is now (13, ...)

t.1         // slice
t.100500    // compilation error

Use tupleVar.{i} to access i-th element of a tuple (does INDEX under the hood). Modifying it will change the tuple (does SETINDEX under the hood).

var t = [5, someSlice, someBuilder];   // 1 tuple on a stack with 3 items
t.0                     // "0 INDEX", reads 5
t.0 = 10;               // "0 SETINDEX", t is now [10, ...]
t.0 += 1;               // also works: "0 INDEX" to read 10, "0 SETINDEX" to write 11
increment(mutate t.0);  // also, the same way
t.0.increment();        // also, the same way

t.1         // "1 INDEX", it's slice
t.100500    // compilation error

It also works for untyped tuples, though the compiler can't guarantee index correctness.

var t = createEmptyTuple();
t.tuplePush(5);
t.0                     // will read 5
t.0 = 10                // t will be [10]
t.100500                // will fail at runtime

It works for nesting var.{i}.{j}. It works for nested tensor, nested tuples, tuples nested into tensors. It works for mutate. It works for globals.

t.1.2 = 10;    // "1 INDEX" + "2 SETINDEX" + "1 SETINDEX"
t.1.2 += 10;   // "1 INDEX" + "2 INDEX" + sum + "2 SETINDEX" + "1 SETINDEX"

globalTuple.1.2 += 10;  // "GETGLOB" + ... + "SETGLOB"

Type address

In TVM, all binary data is just a slice. The same goes for addresses: while TL-B describes the entity MsgAddress (internal/external/none/var address), and the TVM assembler has instructions to load/validate addresses, at the low level it's still just a slice. That's why in FunC's standard library loadAddress returned slice, and storeAddress accepted slice.

Tolk introduces the dedicated address type. It's still a TVM slice at runtime (internal/external/none), but it differs from an abstract slice from the type system point of view:

Integrated with auto-serialization: compiler knows how to pack/unpack it (LDMSGADDR and STSLICE)
Comparable: operators == and != work on addresses:

if (senderAddress == msg.owner)

Introspectable: address.isNone(), address.isInternal(), address.isExternal(), address.getWorkchain() and address.getWorkchainAndHash() (valid for internal addresses)

Passing a slice instead leads to an error:

var a: slice = s.loadAddress();  // error, can not assign `address` to `slice`

Embedding a const address into a contract

Use the built-in address() function, which accepts a standard address. In FunC, there was a postfix "..."a that returned a slice.

address("EQCRDM9h4k3UJdOePPuyX40mCgA4vxge5Dc5vjBR8djbEKC5")
address("0:527964d55cfa6eb731f4bfc07e9d025098097ef8505519e853986279bd8400d8")

Casting slice to address and vice versa

If you have a raw slice, which is actually an address, you can cast it via as operator. In practice, this can occur if you've composed an address with a builder, having manually written its binary representation:

var b = beginCell()
       .storeUint(0b01, 2)   // addr_extern prefix, 2 bits
       ...;
var s = b.endCell().beginParse();
return s as address;   // `slice` as `address`

The reverse cast is also valid: someAddr as slice (why you would need it is an open question, though).

Different types of addresses

According to the standard, there are different types of addresses. The most frequently used is a standard address, which is just an address of a smart contract, like EQ.... But there are also external and none addresses. In the binary TL-B representation:

10 (internal prefix) + 0 (anycast, always 0) + workchain (8 bits) + hash (256 bits) — that's EQ...: it's 267 bits
01 (external prefix) + len (9 bits) + len bits — external addresses
00 (none prefix) — address none, 2 bits

The address type can hold any of these. Most often, it's an internal address. But the type system does not restrict it exactly: this can't be done without heavy runtime checks.

So, if address comes from an untrusted input, you should probably validate it:

val newOwner = msg.nextOwnerAddress;
assert(newOwner.isInternal()) throw 403;
assert(newOwner.getWorkchain() == BASECHAIN) throw 403;

All in all, if you don't trust inputs, you should validate everything: numbers, payload, and addresses in particular. But if an input comes from a trusted source (your own contract storage, for example), you can of course rely on its contents. The compiler does not insert hidden instructions. Just remember that address in general can hold all valid types.

Type aliases type NewName = <existing type>

Tolk supports type aliases, similar to TypeScript and Rust. An alias creates a new name for an existing type but remains interchangeable with it.

type UserId = int32
type MaybeOwnerHash = bytes32?

fun calcHash(id: UserId): MaybeOwnerHash { ... }

var id: UserId = 1;       // ok
var num: int = id;        // ok
var h = calcHash(id);
if (h != null) {
    h as slice;           // bytes32 as slice
}

Nullable types T?, null safety, smart casts, operator !

Tolk has nullable types: int?, cell?, and T? in general (even for tensors). Non-nullable types, such as int and cell, can never hold null values.

The compiler enforces null safety: you cannot use nullable types without first checking for null. Fortunately, these checks integrate smoothly and organically into the code thanks to smart casts. Smart casts are purely a compile-time feature — they do not consume gas or extra stack space.

var value = x > 0 ? 1 : null;  // int?

value + 5;               // error
s.storeInt(value);       // error

if (value != null) {
    value + 5;           // ok, smart cast
    s.storeInt(value);   // ok, smart cast
}

Remember that when a variable's type is not specified, it's auto-inferred from the assignment and never changes:

var i = 0;
i = null;       // error, can't assign `null` to `int`
i = maybeInt;   // error, can't assign `int?` to `int`

The following code will not work. You must explicitly declare the variable as nullable:

// incorrect
var i = null;
if (...) {
    i = 0;     // error
}

// correct
var i: int? = null;
// or
var i = null as int?;

Smart casts (similar to TypeScript and Kotlin) make it easier to deal with nullable types, allowing code like this:

if (lastCell != null) {
    // here lastCell is `cell`, not `cell?`
}

if (lastCell == null || prevCell == null) {
    return;
}
// both lastCell and prevCell are `cell`

var x: int? = ...;
if (x == null) {
    x = random();
}
// here x is `int`

while (lastCell != null) {
    lastCell = lastCell.beginParse().loadMaybeRef();
}
// here lastCell is 100% null

// t: (int, int)?
t.0                // error
t!.0               // ok
if (t != null) {
    t.0            // ok
}

Note that smart casts don't work for globals; they only work for local vars.

Tolk has the ! operator (non-null assertion, compile-time only), like ! in TypeScript and !! in Kotlin. If you are certain that a variable is not null, this operator allows you to skip the compiler's check.

fun doSmth(c: cell);

fun analyzeStorage(nCells: int, lastCell: cell?) {
    if (nCells) {           // then lastCell 100% not null
        doSmth(lastCell!);  // use ! for this fact
    }
}

In practice, you'll use this operator working with low-level dicts API. Tolk will have a high-level map<K,V> in the future. For now, working with dicts will require the ! operator.

// it returns either (slice, true) or (null, false)
@pure
fun dict.iDictGet(self, keyLen: int, key: int): (slice?, bool)
    asm(key self keyLen) "DICTIGET" "NULLSWAPIFNOT"

var (cs, exists) = myDict.iDictGet(...);
// if exists is true, cs is not null
if (exists) {
    cs!.loadInt(32);
}

You can also declare always-throwing functions that return never:

fun alwaysThrows(): never {
    throw 123;
}

fun f(x: int) {
    if (x > 0) {
        return x;
    }
    alwaysThrows();
    // no `return` statement needed
}

The never type implicitly occurs when a condition can never happen:

var v = 0;
// prints a warning
if (v == null) {
    // v is `never`
    v + 10;   // error, can not apply `+` `never` and `int`
}
// v is `int` again

If you encounter never in compilation errors, there is most likely a warning in the preceding code.

Non-atomic nullables are also allowed: (int, int)?, (int?, int?)?, or even ()?. In that case, a special value-presence stack slot is implicitly added. It holds 0 if the value is null, and not 0 (currently, -1) if it is not null:

// t: (int, int)?
t = (1, 2);    // 1 2 -1
t = (3, 4);    // 3 4 -1
t = null;      // null null 0

// t: ()?
t = ();         // -1
t = null;       // 0

All in all, nullability is a significant step forward for type safety and reliability. Nullable types help eliminate null-related runtime errors by enforcing correct handling of optional values.

Union types T1 | T2 | ..., operators match, is, !is

Union types allow a variable to hold multiple possible types, similar to TypeScript.

fun whatFor(a: bits8 | bits256): slice | UserId { ... }

var result = whatFor(...);  // slice | UserId

Nullable types T? are now formally T | null. A narrower union is assignable to a wider one: for instance, B | C can be passed/assigned to A | B | C | D.
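As a sketch of passing a narrower union where a wider one is expected (function names are hypothetical, and the match syntax follows the examples further down this page):

```tolk
// Accepts a wide union and reports which alternative was passed.
fun describe(v: int | slice | cell | builder): int {
    return match (v) {
        int => 1,
        slice => 2,
        cell => 3,
        builder => 4,
    }
}

// `int | slice` is a subset of the parameter's union, so the call is accepted.
fun describeNarrow(narrow: int | slice): int {
    return describe(narrow);
}
```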

The only way to work with unions from code is pattern matching:

match (result) {
    slice  => { /* result is smart-casted to slice  */ }
    UserId => { /* result is smart-casted to UserId */ }
}

Example:

match (result) {
    slice => {
        return result.loadInt(32);
    }
    UserId => {
        if (result < 0) {
            throw 123;
        }
        return loadUser(result).parentId;
    }
}

match must cover all union cases (should be exhaustive). It can also be used as an expression:

type Pair2 = (int, int)
type Pair3 = (int, int, int)

fun getLast(tensor: Pair2 | Pair3) {
    return match (tensor) {
        Pair2 => tensor.1,
        Pair3 => tensor.2,
    }
}

Syntax details:

commas are optional after arms in braces { ... } but required after expression arms
a trailing comma is allowed
a semicolon is not required after match used as a statement
in a match expression, an arm may terminate (throw or return); its type is then considered never:

return match (msg) {
    ...
    CounterReset => throw 403,  // forbidden
}

Variable declaration inside match is allowed:

match (val v = getPair2Or3()) {
    Pair2 => {
        // use v.0 and v.1
    }
    Pair3 => {
        // use v.0, v.1, and v.2
    }
}

How are union types represented on the stack, at the TVM level?

Internally, at the TVM level, they are stored as tagged unions, like enums in Rust:

each type is assigned a unique type ID, which is stored alongside the value
the union occupies N + 1 stack slots, where N is the maximum size of any type in the union
a nullable type T? is just a union with null (type ID = 0); int? and other atomics still use 1 stack slot

var v: int | slice;    // 2 stack slots: value and typeID
                       // - int:   (100, 0xF831)
                       // - slice: (CS{...}, 0x29BC)
match (v) {
    int =>     // IF TOP == 0xF831 { ... }
        // v.slot1 contains int, can be used in arithmetics
    slice =>   // ELSE { IF TOP == 0x29BC { ... } }
        // v.slot1 contains slice, can be used to loadInt()
}

fun complex(v: int | slice | (int, int)) {
    // Stack representation:
    // - int:        (null, 100, 0xF831)
    // - slice:      (null, CS{...}, 0x29BC)
    // - (int, int): (200, 300, 0xA119)
}

complex(v);   // passes (null, v.slot1, v.typeid)
complex(5);   // passes (null, 5, 0xF831)
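
To make the "N + 1 slots" rule concrete, here is a small Python model of the layout shown above. It is a sketch under stated assumptions, not the compiler's implementation: the helper names union_width and pack_union are hypothetical, the type IDs are simply copied from the comments above, and narrower variants are assumed to be left-padded with nulls, as in the complex example.

```python
from typing import Any, List, Sequence


def union_width(variant_widths: Sequence[int]) -> int:
    """N + 1 slots: the widest variant plus one slot for the type ID."""
    return max(variant_widths) + 1


def pack_union(value_slots: Sequence[Any], type_id: int,
               variant_widths: Sequence[int]) -> List[Any]:
    """Left-pad a variant with nulls to the common width, then append its type ID."""
    n = max(variant_widths)
    if len(value_slots) > n:
        raise ValueError("value is wider than the widest variant")
    padding = [None] * (n - len(value_slots))
    return padding + list(value_slots) + [type_id]


# int | slice | (int, int): the variants occupy 1, 1 and 2 slots
WIDTHS = [1, 1, 2]
# Type IDs taken from the comments above; treat them as illustrative values.
INT_ID, SLICE_ID, PAIR_ID = 0xF831, 0x29BC, 0xA119

print(union_width(WIDTHS))                      # 3
print(pack_union([5], INT_ID, WIDTHS))          # null, 5, type ID of int
print(pack_union([200, 300], PAIR_ID, WIDTHS))  # 200, 300, type ID of (int, int)
```

The model ignores the optimization mentioned above for atomic nullables like int?, which still use one slot.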

Besides match, you can test a union type with the is operator (or its negation, !is). Smart casts work as expected:

fun f(v: cell | slice | builder) {
    if (v is cell) {
        v.cellHash();
    } else {
        // v is `slice | builder`
        if (v !is slice) { return }
        // v is `slice`
        v.sliceHash();
    }
    // v is `cell | slice`
    if (v is int) {
        // v is `never`
        // a warning is also printed, condition is always false
    }
}
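
One way to reason about smart casts is to treat a union as a set of type names. The Python sketch below is only a mental model with hypothetical helper names (narrow_is, assignable); it is not how the compiler is implemented. It shows how an is test splits a union into the "then" and "else" parts, why an impossible test leaves never (the empty set), and why a narrower union fits a wider one.

```python
from typing import FrozenSet, Tuple

TypeSet = FrozenSet[str]


def narrow_is(current: TypeSet, tested: str) -> Tuple[TypeSet, TypeSet]:
    """Return (types if `v is tested` holds, types if it does not)."""
    then_types = current & {tested}
    else_types = current - {tested}
    return then_types, else_types


def assignable(src: TypeSet, dst: TypeSet) -> bool:
    """B | C can be passed where A | B | C | D is expected."""
    return src <= dst


v = frozenset({"cell", "slice", "builder"})
then_t, else_t = narrow_is(v, "cell")
print(sorted(then_t))   # ['cell']
print(sorted(else_t))   # ['builder', 'slice']

# Testing for a type that is not in the union leaves nothing: this is `never`.
impossible, unchanged = narrow_is(frozenset({"cell", "slice"}), "int")
print(impossible == frozenset())   # True

print(assignable(frozenset({"B", "C"}), frozenset({"A", "B", "C", "D"})))  # True
```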

Pattern matching for expressions (switch-like behavior)

match can also be used for constant expressions, similar to switch:

val nextValue = match (curValue) {
    1 => 0,
    0 => 1,
    else => -1
};

Rules:

only constant expressions are allowed on the left-hand side (1, SOME_CONST, 2 + 3)
branches can contain return and throw
else is required for expression form but optional for statement form:

// statement form
match (curValue) {
    1 => { nextValue = 0 }
    0 => { nextValue = 1 }
    -1 => throw NEGATIVE_NOT_ALLOWED
}

// expression form, else branch required
val nextValue = match (curValue) {
    ...
    else => <expression>
}

Structures

Looks like TypeScript — but works in TVM!

struct Point {
    x: int
    y: int
}

fun calcMaxCoord(p: Point) {
    return p.x > p.y ? p.x : p.y;
}

// declared like a JS object
var p: Point = { x: 10, y: 20 };
calcMaxCoord(p);

// called like a JS object
calcMaxCoord({ x: 10, y: 20 });

// works with shorthand syntax
fun createPoint(x: int, y: int): Point {
    return { x, y }
}

a struct is just a named tensor
Point is identical to (int, int) at the TVM level
field access p.x works like accessing tensor elements t.0, for reading and writing

This means no bytecode overhead — you can replace unreadable tensors with clean, structured types.

Fields can be separated by newlines (recommended) or by ; / , (both are valid, like in TypeScript).

When creating an object, you can specify StructName { ... } or simply { ... } if the type is clear from the context (e.g., return type or assignment):

var s: StoredInfo = { counterValue, ... };
var s: (int, StoredInfo) = (0, { counterValue, ... });

// also valid
var s = StoredInfo { counterValue, ... };

Default values for fields are supported:

struct DefDemo {
    f1: int = 0
    f2: int? = null
    f3: (int, coins) = (0, ton("0.05"))
}

var d: DefDemo = {};         // ok
var d: DefDemo = { f2: 5 };  // ok

Structs can have methods as extension functions, read below.

Generic structs and aliases

They exist only at the type level (no runtime cost).

struct Container<T> {
    isAllowed: bool
    element: T?
}

struct Nothing

type Wrapper<T> = Nothing | Container<T>

Example usage:

fun checkElement<T>(c: Container<T>) {
    return c.element != null;
}

var c: Container<int32> = { isAllowed: false, element: null };

var v: Wrapper<int> = Nothing {};
var v: Wrapper<int32> = Container { isAllowed: true, element: 0 };

Since it's a generic, you should specify type arguments when using it:

fun getItem(c: Container)        // error, specify type arguments
fun getItem(c: Container<int>)   // ok
fun getItem<T>(c: Container<T>)  // ok

var c: Container = { ... }       // error, specify type arguments
var c: Container<int> = { ... }  // ok

When you declare a generic function, the compiler can automatically infer type arguments for a call:

fun doSmth<T>(value: Container<T>) { ... }

doSmth({ item: 123 });         // T = int
doSmth({ item: cellOrNull });  // T = cell?

Demo: Response<TResult, TError>:

struct Ok<TResult> { result: TResult }
struct Err<TError> { err: TError }

type Response<R, E> = Ok<R> | Err<E>

fun tryLoadMore(slice: slice): Response<cell, int32> {
    return ...
        ? Ok { result: ... }
        : Err { err: ErrorCodes.NO_MORE_REFS }
}

match (val r = tryLoadMore(inMsg)) {
    Ok => { r.result }
    Err => { r.err }
}

Methods: for any types, including structures

Methods are declared as extension functions, similar to Kotlin. A method can accept the first self parameter (then it's an instance method) or not accept it (then it's a static method).

fun Point.getX(self) {
    return self.x
}

fun Point.create(x: int, y: int): Point {
    return { x, y }
}

Methods can be created for any type, including aliases, unions, and built-in types:

fun int.isZero(self) {
    return self == 0
}

type MyMessage = CounterIncrement | ...

fun MyMessage.parse(self) { ... }
// this is identical to
// fun (CounterIncrement | ...).parse(self)

Methods work perfectly with asm, since self is just a regular variable:

@pure
fun tuple.size(self): int
    asm "TLEN"

By default, self is immutable. It means that you can't modify it or call mutating methods. To make self mutable, you should explicitly declare mutate self:

fun Point.assignX(mutate self, x: int) {
    self.x = x;   // without mutate, an error "modifying immutable object"
}

fun builder.storeInt32(mutate self, v: int32): self {
    return self.storeInt(v, 32);
}

Methods for generic structs are created seamlessly. Note that no extra <T> is required: while parsing the receiver type, the compiler treats unknown symbols as generic arguments.

struct Container<T> {
    item: T
}

// compiler treats T (unknown symbol) as a generic parameter
fun Container<T>.getItem(self) {
    return self.item;
}

// and this is a specialization for integer containers
fun Container<int>.getItem(self) {
    ...
}

Another example:

struct Pair<T1, T2> {
    first: T1
    second: T2
}

// both <T1,T2>, <A,B>, etc. work: any unknown symbols
fun Pair<A, B>.create(f: A, s: B): Pair<A, B> {
    return {
        first: f,
        second: s,
    }
}

Similarly, any unknown symbol (typically, T) can be used to make a method accepting anything:

// any receiver
fun T.copy(self): T {
    return self;
}

// any nullable receiver
fun T?.isNull(self): bool {
    return self == null;
}

When you call someObj.method(), multiple methods with the same name may exist and theoretically be acceptable:

fun T.copy(self) { ... }
fun int.copy(self) { ... }

someVar.copy();   // ???

So, the compiler searches for the most precise method, in the following order:

1. Search for an exact receiver type, like int.copy (most practical cases finish here).
2. Search for non-generic receivers that are acceptable (like int32.copy / int?.copy).
3. Search for generic receivers except T (like Container<T>.copy).
4. Search for T receivers (T.copy).

fun int.copy(self) { ... }
fun T.copy(self) { ... }

6.copy()              // int.copy (rule 1)
(6 as int32).copy()   // int.copy (rule 2)
(6 as int32?).copy()  // T.copy with T=int? (rule 4)

type MyMessage = CounterIncrement | CounterReset
fun MyMessage.check(self) { ... }
fun CounterIncrement.check(self) { ... }

MyMessage{...}.check()         // first (rule 1)
CounterIncrement{...}.check()  // second (rule 1)
CounterReset{...}.check()      // first (rule 2)

In case of ambiguity, an error is printed:

fun int?.doSmth(self) { ... }
fun int64.doSmth(self) { ... }

var v: int32;
v.doSmth();   // error: no exact match, but two possible acceptable receivers
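
The lookup order can be sketched as a small Python function. This is a simplified model under explicit assumptions, not the compiler's algorithm: types are plain strings, the set acceptable (pairs of "argument type, receiver type" that the type checker would allow) is supplied by the caller, and only the generic forms T and Name<T> are recognized. The name resolve_receiver is hypothetical.

```python
from typing import Iterable, Set, Tuple


def resolve_receiver(arg_type: str, receivers: Iterable[str],
                     acceptable: Set[Tuple[str, str]]) -> str:
    """Pick the receiver whose method would be called for `arg_type`."""
    receivers = list(receivers)

    # Rule 1: exact receiver type
    if arg_type in receivers:
        return arg_type

    # Rule 2: non-generic receivers that accept the argument type
    candidates = [r for r in receivers
                  if r != "T" and not r.endswith("<T>")
                  and (arg_type, r) in acceptable]
    if len(candidates) > 1:
        raise TypeError(f"ambiguous call, acceptable receivers: {candidates}")
    if candidates:
        return candidates[0]

    # Rule 3: generic receivers except plain T, e.g. Container<T>
    for r in receivers:
        if r.endswith("<T>") and arg_type.startswith(r[:-2]):
            return r

    # Rule 4: the catch-all T receiver
    if "T" in receivers:
        return "T"
    raise LookupError(f"no method found for {arg_type}")


acceptable = {("int32", "int")}
print(resolve_receiver("int", ["int", "T"], acceptable))      # int  (rule 1)
print(resolve_receiver("int32", ["int", "T"], acceptable))    # int  (rule 2)
print(resolve_receiver("int32?", ["int", "T"], acceptable))   # T    (rule 4)
```

With receivers int? and int64 both acceptable for int32, the same function raises an error at rule 2, mirroring the ambiguity example above.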

You can assign a generic function to a variable, but you should explicitly specify types:

fun genericFn<T>(v: T) { ... }
fun Container<T>.getItem(self) { ... }

var callable1 = genericFn<slice>;
var callable2 = Container<int32>.getItem;
callable2(someContainer32);   // pass it as self

Auto-detect and inline functions

Tolk can inline functions at the compiler level without using PROCINLINE as defined by Fift.

fun Point.create(x: int, y: int): Point {
    return {x, y}
}

fun Point.getX(self) {
    return self.x
}

fun sum(a: int, b: int) {
    return a + b;
}

fun main() {
    var p = Point.create(10, 20);
    return sum(p.getX(), p.y);
}

is compiled to:

main PROC:<{
  30 PUSHINT
}>

The compiler automatically detects which functions to inline. The @inline attribute forces inlining (subject to the restrictions listed below), @noinline forces a function to stay in the dict, and @inline_ref keeps it as an "inline ref" (perfect for "unlikely" execution paths).

Compiler inlining is efficient in terms of stack manipulations. It works with arguments of any stack width. It works with any functions and methods, except those that are recursive or contain return statements in the middle. It works with mutate and self.

You should not worry that "simple getters" like fun Point.getX(self) { return self.x } will require stack reordering. You can extract small functions; they are zero-cost. You don't need to think about @inline or Fift, because no inlining is deferred to Fift; the compiler handles everything in advance.

How does auto-inline work?

In short:

simple small functions are inlined always
if a function is called only once, it's inlined

Some details.

For every function, the compiler calculates a weight (a heuristic, AST-based metric) and the usage count.

If weight < THRESHOLD, it's inlined always.
If usages == 1, it's inlined always.
Otherwise, there is some empirical formula.

Nothing prevents you from forcing @inline on big functions as well, for example, if you know that all usages are on hot paths. Conversely, you may sometimes want to prevent inlining (with @inline_ref, for example) even if a function is called only once, because it's on an unlikely path. In any case, if you are keen on optimization, consider covering your contract with gas benchmarks and experimenting with inlining, branch reordering, etc.

What can NOT be auto-inlined?

A function will NOT be inlined (even marked as @inline) if:

It has return in the middle. If a function has multiple return points, the compiler currently can't inline it. In Tolk v1.x, it will be partially resolved for some use cases.
It appears in a recursive call chain (f -> g -> f). This is uncommon in practice.
It's used as a non-call. For example, you take a reference to it like val callback = f.
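
The rules from this and the previous subsection can be summarized as a decision function. The Python sketch below is a hedged model, not the compiler's code: the function name inline_decision, its parameters, and the THRESHOLD value are all hypothetical, and the "empirical formula" is deliberately left unmodeled (the function returns None for that case). The @noinline and @inline_ref attributes are not modeled either.

```python
from typing import Optional

# Hypothetical value: the real threshold is internal to the compiler.
THRESHOLD = 30


def inline_decision(weight: int, usages: int, *,
                    has_middle_return: bool = False,
                    in_recursive_chain: bool = False,
                    used_as_value: bool = False,
                    forced_inline: bool = False,
                    threshold: int = THRESHOLD) -> Optional[bool]:
    """Return True/False when the documented rules decide, None otherwise."""
    # Disqualifiers win even over an explicit @inline.
    if has_middle_return or in_recursive_chain or used_as_value:
        return False
    if forced_inline:          # @inline
        return True
    if weight < threshold:     # simple small functions
        return True
    if usages == 1:            # called only once
        return True
    return None                # left to the compiler's empirical formula


print(inline_decision(weight=5, usages=10))                        # True
print(inline_decision(weight=500, usages=1))                       # True
print(inline_decision(weight=5, usages=1, has_middle_return=True)) # False
print(inline_decision(weight=500, usages=4))                       # None
```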

No tilde ~ methods, mutate keyword instead

While FunC has both .methods() and ~methods(), Tolk has only the dot, and the only way to call a method is .method(). Basically, Tolk just works as expected:

b.storeUint(x, 32);   // modifies a builder, can be chainable
s.loadUint(32);       // modifies a slice, returns integer

Continue reading on a separate page: Mutability in Tolk.

Auto-packing to/from cells/builders/slices

Given any struct, you can unpack it from a cell or pack an object into a cell:

struct Point {
    x: int8
    y: int8
}

var value: Point = { x: 10, y: 20 }

// makes a cell containing "0A14"
var c = value.toCell();
// back to { x: 10, y: 20 }
var p = Point.fromCell(c);
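
To see where "0A14" comes from, the byte layout can be reproduced in Python. This is only an illustration with hypothetical helper names (pack_point, unpack_point); it assumes the two int8 fields are written in declaration order as 8-bit signed big-endian integers, which matches the cell content shown above.

```python
def pack_point(x: int, y: int) -> bytes:
    """Serialize two int8 fields in declaration order, 8 bits each."""
    return x.to_bytes(1, "big", signed=True) + y.to_bytes(1, "big", signed=True)


def unpack_point(data: bytes) -> dict:
    """Read the two int8 fields back in the same order."""
    if len(data) != 2:
        raise ValueError("expected exactly 16 bits")
    return {
        "x": int.from_bytes(data[0:1], "big", signed=True),
        "y": int.from_bytes(data[1:2], "big", signed=True),
    }


packed = pack_point(10, 20)
print(packed.hex().upper())    # 0A14
print(unpack_point(packed))    # {'x': 10, 'y': 20}
```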

Continue reading on a separate page: Auto-packing to/from cells.

Universal createMessage: avoid manual cells composition

No more manual beginCell().storeUint(...).storeRef(...) boilerplate. Just describe the message in a literal, and let the compiler do the rest.

val reply = createMessage({
    bounce: false,
    value: ton("0.05"),
    dest: senderAddress,
    body: RequestedInfo { ... }
});
reply.send(SEND_MODE_REGULAR);

Continue reading on a separate page: Universal createMessage.

Modern onInternalMessage

In Tolk, you don't need to manually parse msg_cell to retrieve sender_address or fwd_fee. Everything is pretty straightforward:

fun onInternalMessage(in: InMessage) {
    in.senderAddress
    in.originalForwardFee
    in.valueCoins   // typically called "msg value"

    in.|   // the IDE suggests available fields
}

While the old-fashioned approach of accepting 4 parameters, like recv_internal, still works, the pattern above is preferred. It's also significantly more efficient: the fields of InMessage are now directly mapped to new TVM-11 instructions.

Recommended pattern:

Define each message as a struct (typically with a 32-bit opcode).
Define a union of all allowed messages.
Use val msg = lazy MyUnion.fromSlice(in.body).
Match on msg, handling each branch—and possibly an else.

Avoid the legacy approach of manually extracting fwd_fee and other fields at the start of the function. There's no need for it anymore—access them on demand via in.smth.

type AllowedMessageToMinter =
    | MintNewJettons
    | BurnNotificationForMinter
    | RequestWalletAddress

fun onInternalMessage(in: InMessage) {
    val msg = lazy AllowedMessageToMinter.fromSlice(in.body);

    match (msg) {
        BurnNotificationForMinter => {
            var storage = lazy MinterStorage.load();
            ...
            storage.save();    
            ...
        }
        RequestWalletAddress => ...
        MintNewJettons => ...
        else => {
            // for example:
            // ignore empty messages, "wrong opcode" for others
            assert (in.body.isEmpty()) throw 0xFFFF
        }
    }
}

A separate onBouncedMessage

In FunC, you parsed msg_cell, read 4-bit flags, and tested for flags & 1 to check whether a message is bounced.

In Tolk, if you want to handle bounces, you create a separate entry point:

fun onBouncedMessage(in: InMessageBounced) {
}

It's automatically called by the compiler, similarly to this:

fun onInternalMessage(in: InMessage) {
    // the compiler inserts this automatically:
    if (MSG_IS_BOUNCED) { onBouncedMessage(...); return; }

    ... // your code
}

If you don't declare onBouncedMessage, all bounces will be just filtered out:

fun onInternalMessage(in: InMessage) {
    // the compiler inserts this automatically:
    if (MSG_IS_BOUNCED) { return; }

    ... // your code
}
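
The dispatch that the compiler inserts can be pictured with a short Python model. It is a sketch only: the function dispatch and its callback parameters are hypothetical, and the bounced check reuses the flags & 1 test mentioned above for FunC.

```python
from typing import Callable, Optional


def dispatch(flags: int,
             on_internal: Callable[[], str],
             on_bounced: Optional[Callable[[], str]] = None) -> Optional[str]:
    """Route a message the way the auto-inserted prologue does."""
    if flags & 1:                      # the message is bounced
        if on_bounced is not None:
            return on_bounced()        # a separate entry point is declared
        return None                    # no onBouncedMessage: filtered out
    return on_internal()               # regular message: your code runs


print(dispatch(0b0000, lambda: "internal", lambda: "bounced"))  # internal
print(dispatch(0b0001, lambda: "internal", lambda: "bounced"))  # bounced
print(dispatch(0b0001, lambda: "internal"))                     # None
```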

Remember about 256 bits on bounces

Currently, TON Blockchain works in such a way that when a message is bounced, only the first 256 bits of its body are present, starting with 0xFFFFFFFF (the "bounced prefix"). That's why you must manually control which fields can be read (those that fit into the remaining 224 bits) and which cannot.

fun onBouncedMessage(in: InMessageBounced) {
    in.bouncedBody    // 256 bits

    // typically, you'll do
    in.bouncedBody.skipBouncedPrefix();   // skips 0xFFFFFFFF
    // handle rest of body, probably with lazy match
}
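
The bit budget is easy to check by hand: 256 bits minus the 32-bit prefix leaves 224 bits. The Python sketch below walks over a message layout and reports which leading fields still fit. The helper name readable_fields and the example layout (field names and widths) are hypothetical and serve only to illustrate the arithmetic.

```python
from typing import List, Sequence, Tuple

BOUNCED_BODY_BITS = 256
BOUNCED_PREFIX_BITS = 32   # 0xFFFFFFFF


def readable_fields(fields: Sequence[Tuple[str, int]]) -> List[str]:
    """Return the names of leading fields that fit entirely after the prefix."""
    remaining = BOUNCED_BODY_BITS - BOUNCED_PREFIX_BITS  # 224 bits
    readable = []
    for name, bits in fields:
        if bits > remaining:
            break  # this field and everything after it is truncated
        readable.append(name)
        remaining -= bits
    return readable


# Hypothetical message layout: (field name, width in bits)
layout = [("opcode", 32), ("queryId", 64), ("amount", 124), ("owner", 267)]
print(readable_fields(layout))   # ['opcode', 'queryId', 'amount']
```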

Where to go next?

Explore the Tolk vs FunC benchmarks. These are real-world contracts (Jetton, NFT, Wallet, etc.) migrated from FunC — same logic, but written in a cleaner, more expressive style.

Try a FunC-to-Tolk converter. It's a great starting point for incremental migration.

Run npm create ton@latest and just start experimenting.
